* Conditionally include backtraces in ROCPROFSYS_THROW based on verbosity
Modify ROCPROFSYS_THROW to include backtraces only when:
- debug mode is enabled, OR
- the verbose level is >= 2, OR
- running in a CI environment
* Fix formatting errors
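
The backtrace condition for ROCPROFSYS_THROW can be sketched roughly as below. This is an illustration only: `throw_settings` and `include_backtrace` are hypothetical names, and the real macro reads these values from the project's own configuration.

```cpp
// Hypothetical settings snapshot; the real project reads these from its own
// configuration, and the field names here are illustrative only.
struct throw_settings
{
    bool debug   = false;  // debug mode enabled
    int  verbose = 0;      // verbosity level
    bool ci      = false;  // running in a CI environment
};

// Returns true when a thrown error message should carry a backtrace.
inline bool
include_backtrace(const throw_settings& settings)
{
    return settings.debug || settings.verbose >= 2 || settings.ci;
}
```

With default settings (no debug, verbose 0, not CI) the sketch reports that no backtrace is attached, which keeps ordinary error output short.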
* Replace `cmake-format` with `gersemi`
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Remove .cmake-format.yaml
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Update workflow to use gersemi
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Update CONTRIBUTING.md
* Update helper scripts
* Don't include `*/external/*` in workflows
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Updating clang-format to v18
- Updates the pre-commit-config
- Formats source files according to the utility
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Update format source workflow
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Update CONTRIBUTING
* Update comment in .clang-format
* Update CONTRIBUTING.md
* Update helper script
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
With AMD-SMI in ROCm 7.0, vcn_activity and jpeg_activity are not reported when the XCP (partition) stats, vcn_busy and jpeg_busy, are available. This causes activity tracking to fail. The fix is to read the busy values when the activity values are not supported.
For issue: SWDEV-536439
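
A minimal sketch of the fallback described above, assuming a hypothetical `engine_sample` struct in which `std::nullopt` stands in for "not supported". It does not use the AMD-SMI API; it only shows the selection order (activity first, busy otherwise).

```cpp
#include <optional>
#include <vector>

// Hypothetical per-device sample; std::nullopt stands in for "not supported".
struct engine_sample
{
    std::optional<std::vector<int>> activity;  // e.g. vcn_activity
    std::optional<std::vector<int>> busy;      // e.g. XCP vcn_busy
};

// Prefer the activity values; fall back to the busy values when the
// activity values are not reported. Returns an empty vector if neither exists.
inline std::vector<int>
select_engine_values(const engine_sample& sample)
{
    if(sample.activity) return *sample.activity;
    if(sample.busy) return *sample.busy;
    return {};
}
```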
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* SWDEV-535445: rocprof-sys-avail shows jpeg_activity even when unsupported
* Added vcn tracking
* jpeg and vcn description now includes supported gpus
* Add getter methods per device to check vcn and jpeg support
Add logic to check if vcn activity and vcn busy values are supported for each device.
Add logic to check if jpeg activity and jpeg busy values are supported for each device.
Co-authored-by: Sajina P Kandy <sputhala@amd.com>
* Add getter methods per device to check vcn and jpeg support (#228)
* Formatting
* Variable fix
* List of supported GPUs is now ordered
* Removed the ability to see which gpu supports jpeg and vcn activity to reduce clutter
* Formatting
* Testing for busy support
* jpeg and vcn only show if supported
* Removed commented code
* Formatting
* Applied amd_smi cpp/hpp fixes
* Added break condition for xcp loop
* Modified loops for efficiency
* Removed unnecessary macro
* Removed unnecessary includes
---------
Co-authored-by: Sajina Kandy <sputhala@amd.com>
Co-authored-by: Sajina PK <Sajina.PuthalathKandy@amd.com>
- Add support for RCCL API tracing through rocprofiler-sdk.
- Refactored the comm_data code to use the SDK RCCL_API callbacks.
- Add a runtime version check for SDK to gate callback enablement, rather than just the compile-time check.
- Fixed: SAMPLING_TIMEOUT was not being handled correctly in add_test.
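
The difference between the compile-time check and the added runtime check for the RCCL callbacks can be sketched as follows. The version encoding and the function names are assumptions made for illustration; they are not the rocprofiler-sdk API.

```cpp
#include <cstdint>

// Hypothetical encoding of major.minor.patch as one integer.
constexpr std::uint32_t
encode_version(std::uint32_t major, std::uint32_t minor, std::uint32_t patch)
{
    return major * 10000 + minor * 100 + patch;
}

// compiled_version: version of the headers the tool was built against.
// runtime_version:  version reported by the library that was actually loaded.
// Callbacks are enabled only if both satisfy the required version.
inline bool
callbacks_enabled(std::uint32_t compiled_version, std::uint32_t runtime_version,
                  std::uint32_t required_version)
{
    return compiled_version >= required_version &&
           runtime_version >= required_version;
}
```

The point of the second comparison is that a binary built against newer headers may still be run against an older installed library.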
* SWDEV-507117: Unify OMP Target Offload Events into a Single Perfetto Timeline Row
* Fixed warning and format
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* SWDEV-533856: Handle dynamic event for HIP api for perfetto
* Refactor: Generalize function using template
* Format Source
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
- The path to the merge script is not found unless the user explicitly sources "share/rocprofiler-systems/setup-env.sh" to set up PATH.
- Instead, derive the path when the application loads and use it when executing the helper script.
- Rename script to rocprof-sys-merge-output.sh.
- Change install folder to <prefix>/libexec/rocprofiler-systems based on dev-ops feedback.
- Updated PATH variable in the modulefile and source script.
- For SWDEV-528101
- Addresses concern that device metric tracks are still shown in Perfetto trace file even when only -H is specified to rocprof-sys-sample (and vice versa).
- Update sampling call-stack docs.
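
For the merge-script change above, deriving the script location at load time might look like the sketch below. It assumes the loaded library lives in `<prefix>/lib`, which is an assumption about the install layout, and `merge_script_path` is a hypothetical helper name. How the library path is obtained is left out.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Given the path of the loaded library (assumed to be <prefix>/lib/<library>),
// return the expected location of the merge helper script.
inline fs::path
merge_script_path(const fs::path& loaded_library)
{
    // <prefix>/lib/<library> -> <prefix>
    const fs::path prefix = loaded_library.parent_path().parent_path();
    return prefix / "libexec" / "rocprofiler-systems" /
           "rocprof-sys-merge-output.sh";
}
```

Resolving the path this way avoids relying on PATH being set up by setup-env.sh or the modulefile.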
Fixes a bug where all the `ROCPROFSYS_AMD_SMI_METRICS` values were being recorded by default.
Fixes a bug where the 'all' and 'none' values caused an exception when specified for `ROCPROFSYS_AMD_SMI_METRICS`.
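
One way to handle the 'all' and 'none' values of the metrics setting without throwing is sketched below. `parse_metrics` is a hypothetical helper, not the project's parser, and silently ignoring unknown names is an assumption. Whitespace around tokens is not trimmed.

```cpp
#include <set>
#include <sstream>
#include <string>

// Parse a comma-separated metrics value against the set of known metric names.
// "all" selects every known metric, "none" selects nothing, and unknown names
// are ignored instead of raising an exception.
inline std::set<std::string>
parse_metrics(const std::string& value, const std::set<std::string>& known)
{
    if(value == "all") return known;
    if(value == "none") return {};

    std::set<std::string> selected;
    std::istringstream    stream{ value };
    std::string           token;
    while(std::getline(stream, token, ','))
    {
        if(known.count(token) > 0) selected.insert(token);
    }
    return selected;
}
```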
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
- Remove tooling initialization from rocprofiler_configure:
when rocprofiler_configure is called from __hip_module_ctor
(which in turn is called as a global constructor when loading shared
libraries or before main in a hip program), initializing tooling
in it can cause problems because it is too early to do some of the tasks
that it involves (e.g. opening shared libraries, creating threads).
Instead, we rely on rocprofsys_main to initialize tooling later.
- Skip rocprofiler_configure if ROCPROFSYS_PRELOAD is not set since
preload is required for tooling (such as perfetto, which is used by
the rocprofiler callbacks) to be initialized.
- Revert RCCL initialization changes: These are no longer needed since rocprofsys_init_tooling_hidden will not
be called from rocprofiler_configure
- Force rocprofiler_configure in rocprofsys_init_tooling_hidden if it hasn't been
called through __hip_module_ctor global constructor
- Fix for rocjpeg sample cmake due to changes in the rocJPEG project
- Fix for rocprofiler-sdk version check - change the format
- Edits to docs for jpeg and vcn activity support - mention that these values may not be supported on all ASICs.
- Check AMDSMI header version to fix compilation failure with v2.0 header change
- Fix ROCM-SMI references in documentation and tests
- Check AMDSMI library version at runtime and output in logs
- Fix a possible exception occurring when an in-flight sample is outstanding while the component is shutting down.
- Register a cleanup function in tim::manager instance to write out data in
counter storages
- The counter_storage::write() calls in tool_fini happen after the storage is destroyed
which is too late for the write to happen.
- Adjust traits for counter_data_tracker
- Add MIN, MAX, VAR, STDDEV columns
- Remove DEPTH, UNITS, %SELF columns
- Update "add_validation_test" to test for the existence of output file(s).
- Added step to test perfetto output for `transpose-rocprofiler-sampling`
and `transpose-rocprofiler-binary-rewrite`
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* Add check to skip counter_storage::write() if internal storage field is destroyed.
* Output warning message if counter data is not available when trying to write out to Timemory
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
- Add rocJPEG API tracing support using domain `rocjpeg_api` in ROCPROFSYS_ROCM_DOMAINS.
- Modify existing `videodecode` and `jpegdecode` ctests to verify API tracing
- Print Perfetto values for easy debugging in verbose mode
- Convert CMake error to a warning and skip building the "decode" examples if requirements are not found
- Add JPEG activity track in perfetto trace
- Add JPEG decode tests to the examples
- Change existing videodecode test to include JPEG testing
- Rename videodecode test file to decode to include jpeg tests too
- Fix a bug in the test which checks for total activity of 0
- Disable rocDecode and rocJPEG samples from the github image files
- VA API tracing using Timemory gotcha wrappers.
- rocDecode API tracing integration using callback to ROCPROFILER_CALLBACK_TRACING_ROCDECODE_API
- Updated videodecode ctest to validate rocDecode APIs in perfetto trace.
* Integrating amd-smi into rocprofiler-systems due to rocm-smi deprecation.
* No functionality changes to users other than naming conventions.
* New tracks are available in Perfetto: the GPU busy percentage metrics now split gfx busy into separate gfx, umc, and mm engine measurements.
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
- Added a script that automatically merges multiprocess output into one file when multiple per-process proto files are written to the output folder
- Execute the merge multiprocess script from the rank 0 process
- Added the scripts folder path to env path, via setup-env.sh
- Installed merge_multiprocess_output.sh to /share/rocprofiler-systems/bin dir
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
Enable VCN activity tracing on different instances from the GPU metrics fetched using rsmi_dev_gpu_metrics_info_get in the ROCm-SMI library.
The tracing can be controlled with ROCPROFSYS_ROCM_SMI_METRICS by setting the value to vcn_activity. Currently this configuration accepts the following values: busy, temp, power, mem_usage, vcn_activity.
By default, all five values are enabled.
Signed-off-by: Sajina P Kandy <Sajina.PuthalathKandy@amd.com>
Co-authored-by: Sajina Kandy <sputhala-amd@amd.com>
- Renames the CMake option "ROCPROFSYS_USE_HIP" to "ROCPROFSYS_USE_ROCM"
- Remove the "ROCPROFSYS_USE_ROCM_SMI" option. It is controlled with the "ROCPROFSYS_USE_ROCM" option instead.
- Runtime configuration can still toggle ROCPROFSYS_USE_ROCM_SMI to disable the sampling.
- Rename ROCPROFSYS_HIP_VERSION macro to ROCPROFSYS_ROCM_VERSION and remove blocks for `ROCPROFSYS_ROCM_VERSION < 60000`
- Remove ROCPROFSYS_USE_ROCTRACER and ROCPROFSYS_USE_ROCPROFILER
- Update test cases
- Update docker files and workflows to install cmake 3.21, which is required for the rocprofiler-sdk findPackage script.
- Removed rocm-6.2 from workflows due to a rocprofiler-sdk API change.
- Porting from https://github.com/ROCm/omnitrace/pull/411
- Improve OMPT support
- Add OpenMP target example to testing
- Update Timemory submodule to use ROCm/Timemory rather than NERSC/Timemory
- Update `actions/upload-artifacts` to v4
- Standardize the `cmake_minimum_required` to 3.18.4 across workflows, project, and examples
- Updated Ubuntu 20.04 workflows
- Fix for proto files not being viewable in Perfetto UI
- Ported from https://github.com/ROCm/omnitrace/pull/411
- Update Workflows
- Use V47 trace_processor_shell for certain OS releases.
- RedHat 8, SUSE 15.5, and Ubuntu 20.04 are no longer compatible with the latest trace_processor_shell.
- Incompatible version of GLIBC.
- Remove notes about Perfetto workaround in documentation.
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
The Omnitrace program is being renamed.
Full name: "ROCm Systems Profiler"
Package name: "rocprofiler-systems"
Binary / Library names: "rocprof-sys-*"
---------
Co-authored-by: Xuan Chen <xuchen@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>