26 Commits

Author SHA1 Message Date
Kian Cossettini 698ac6b8bc [rocprofiler-systems] Add build option for "examples" to specify gfx-arch (#2626)
## Motivation
 - Added `check_rocminfo` function that returns true if the provided regex was found, false otherwise. Can also use `GET_OUTPUT` to get the raw output filtered with or without a regex.
 - Moved `rocprofiler_systems_get_gfx_archs()` to `MacroUtilities.cmake` 
 - Added `rocprofiler_systems_lookup_gfx()`, which detects whether a given `gfx` is from the `instinct`, `radeon` or `apu` family.
 - Added `ROCPROFSYS_GFX_TARGETS` as a build argument. Used to specify the offloading architectures that GPU examples should compile for. If empty, defaults to whatever your system has.
 - GPU examples now check if the given `gfx` targets (from `ROCPROFSYS_GFX_TARGETS`) are supported.
 - OMPVV offload tests now only compile if `amdflang` version is `>= 20`
 - Improve link time by reducing the number of GFX targets that binaries need to support.
   - RCCL is now passed a `GPU_TARGETS` var specifying the architectures to build/link against.
2026-01-20 12:13:21 -05:00
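
The `check_rocminfo` behaviour described in commit 698ac6b8bc (return true when a regex is found, or hand back the raw output with or without filtering) can be sketched outside CMake as follows. This is only an illustration of the idea in Python, not the CMake implementation: the function names `check_output`, `get_output` and `gfx_archs`, and the sample `rocminfo` text, are assumptions made for the example.

```python
import re

# Hypothetical excerpt of `rocminfo` output, used only for illustration.
SAMPLE_ROCMINFO = """\
  Name:                    gfx90a
  Marketing Name:          AMD Instinct MI210
  Name:                    amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack-
"""


def check_output(text, pattern):
    """Return True if the regex matches anywhere in the text."""
    return re.search(pattern, text) is not None


def get_output(text, pattern=None):
    """Return the raw lines, optionally keeping only those that match."""
    lines = text.splitlines()
    if pattern is None:
        return lines
    return [line for line in lines if re.search(pattern, line)]


def gfx_archs(text):
    """Collect the distinct gfx identifiers in order of first appearance."""
    seen = []
    for arch in re.findall(r"gfx[0-9a-f]+", text):
        if arch not in seen:
            seen.append(arch)
    return seen
```

A build script could use a check like this to decide whether the targets requested through `ROCPROFSYS_GFX_TARGETS` are present on the machine before compiling the GPU examples.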
Milan Radosavljevic 318d13870f [rocprofiler-systems] Update logging to use spdlog library (#2428)
## Motivation

- Structured logging with proper log levels (TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL)
- Better performance through compile-time formatting
- Consistent formatting using fmt library
- Runtime log level control via arguments and environment variables
- Easier maintenance and debugging capabilities

## Technical Details

- Added spdlog as a submodule and integrated it into CMake build system
- Created new `rocprofiler-systems-logger` library wrapping spdlog functionality
- Replaced custom logging macros (`ROCPROFSYS_VERBOSE`, `ROCPROFSYS_DEBUG`, `ROCPROFSYS_FATAL`, `ROCPROFSYS_REQUIRE`, `ROCPROFSYS_CI_THROW`, etc.) with spdlog equivalents (`LOG_DEBUG`, `LOG_WARNING`, `LOG_CRITICAL`, etc.)
- Implemented log level control through command-line arguments and environment variables
- Converted assertion macros to proper error handling with exceptions and std::abort()
2026-01-14 15:27:51 -05:00
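
The runtime log level control mentioned in commit 318d13870f (via arguments and environment variables) might be modelled as below. This is a minimal sketch in Python rather than the spdlog-based C++ code. The variable name `EXAMPLE_LOG_LEVEL`, the default of `INFO`, and the rule that a command-line value wins over the environment are all assumptions made for illustration; only the six level names come from the commit message.

```python
import os

# Levels named in the commit message, ordered from most to least verbose.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def resolve_level(cli_value=None, env=None, env_var="EXAMPLE_LOG_LEVEL", default="INFO"):
    """Pick the active level: CLI value first, then the environment, then the default.

    The precedence and the variable name are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    for candidate in (cli_value, env.get(env_var)):
        if candidate and candidate.upper() in LEVELS:
            return candidate.upper()
    return default


def should_log(message_level, active_level):
    """A message is emitted when its level is at or above the active level."""
    return LEVELS.index(message_level) >= LEVELS.index(active_level)
```

Invalid values fall through to the next source here instead of raising, which keeps a mistyped environment variable from aborting a profiling run; a stricter tool could reject them instead.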
marantic-amd bb83791b17 Remove redundant ROCPROFSYS_TRACE_CACHED variable from the code (#2434) 2025-12-25 13:36:04 +01:00
marantic-amd ba1380a75d Put cached perfetto traces as default one (#2138)
* Put cached perfetto traces as default one

* Improve cached data and perfetto traces in order to be more aligned with E2E tests

* Addressing PR comments and findings

* Force early instrumentation bundle instantiation

* Sync-up instrumented containers with thread growth data

* Revert ompvv number of host threads to default 8

* Fixed counter track namings for amd-smi

* AIPROFSYST-34 [rocprof-sys] Update documentation describing newly introduced changes to default tracing mechanism
2025-12-22 12:47:35 +01:00
Milan Radosavljevic ee7305e795 [rocprof-sys] Add test cleanup fixtures for binary-rewrite and runtime-instrument tests (#2012)
- Added `binary-rewrite-cleanup` and `runtime-instrument-cleanup` tests that remove instrumented binaries and output directories using `cmake -E rm -rf`
- Implemented CMake test fixtures (`FIXTURES_SETUP` and `FIXTURES_CLEANUP`) to establish proper test ordering:
  - `binary-rewrite` sets up the `binary-rewrite-fixture`
  - `binary-rewrite-run` and validation tests require this fixture
  - `binary-rewrite-cleanup` performs cleanup for this fixture
  - Same pattern applied for `runtime-instrument`
- Extended `ROCPROFILER_SYSTEMS_ADD_PYTHON_TEST` to accept `FIXTURES_REQUIRED` parameter
- Updated validation tests to require appropriate cleanup fixtures based on test name pattern matching
- Added fixture requirements to Python code-coverage tests
2025-11-28 18:51:54 -05:00
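
The ordering that commit ee7305e795 sets up with `FIXTURES_SETUP`, `FIXTURES_REQUIRED` and `FIXTURES_CLEANUP` can be pictured with a small Python model. It is a sketch of the ordering rule only (setup tests first, tests that require the fixture next, cleanup tests last), not of CTest itself; the function name `fixture_order` and the dictionary layout are assumptions made for the example.

```python
def fixture_order(tests):
    """Order tests so that setup < required < cleanup holds for every fixture.

    `tests` maps a test name to a dict with optional 'setup', 'required'
    and 'cleanup' lists of fixture names (a hypothetical layout).
    """
    setups, users, cleanups = {}, {}, {}
    for name, props in tests.items():
        for key, table in (("setup", setups), ("required", users), ("cleanup", cleanups)):
            for fixture in props.get(key, []):
                table.setdefault(fixture, []).append(name)

    # Build the "must run after" sets implied by each fixture.
    deps = {name: set() for name in tests}
    for fixture in set(setups) | set(users) | set(cleanups):
        before = setups.get(fixture, [])
        during = users.get(fixture, [])
        for name in during:
            deps[name].update(before)
        for name in cleanups.get(fixture, []):
            deps[name].update(before, during)

    # Repeatedly pick the first test whose prerequisites have all run.
    order = []
    while len(order) < len(tests):
        ready = [n for n in tests if n not in order and deps[n] <= set(order)]
        if not ready:
            raise ValueError("cyclic fixture dependencies")
        order.append(ready[0])
    return order


tests = {
    "binary-rewrite-cleanup": {"cleanup": ["binary-rewrite-fixture"]},
    "binary-rewrite-run": {"required": ["binary-rewrite-fixture"]},
    "binary-rewrite": {"setup": ["binary-rewrite-fixture"]},
}
print(fixture_order(tests))
```

Even though the cleanup test is declared first in this example, it is scheduled last, which is the property the fixtures are meant to guarantee for the instrumented binaries and output directories.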
Sajina PK 09b8342e22 [Rocprofiler-systems] : Add XGMI and PCIe metrics to the profiling data (#1628)
* Add XGMI and PCIe metrics to the profiling data

Add support for AMD XGMI (GPU-to-GPU interconnect) and PCIe
metrics:
  * XGMI link width in bits
  * XGMI link speed in GT/s
  * Per-link read bandwidth (KB)
  * Per-link write bandwidth (KB)

- Add new categories for PCIe metrics:
  * PCIe link width
  * PCIe link speed in GT/s
  * Accumulated bandwidth (MB)
  * Instantaneous bandwidth (MB/s)

* Fix VCN/JPEG insert logic

* Modify the gpu_metrics struct to accommodate XCP structure

* Add ctest automation for gpu interconnect metrics

* Refactor to move gpu_metrics struct and serialization to another file

* Possible fix for timeout in CI

Fix redundant skip check in ctest
Add xgmi and pcie option in rocprof-sys-avail.

* Change2: Address review comments

Change ctest sampling to avoid timeout
Change variable name and code structuring

* Add option in ctest to run rocprof-sys-run without rewrite

Run transferbench with rocprof-sys-run without sampling

* Change3: Fix sample insert bug and address review comments

xgmi and pci support check
renaming variables
additional hip_api validation in rocpd

* Reduce the load from the TransferBench sample

The CI builds were timing out when flushing a big temporary file to the
DB: (2720824.23 KB / 2720.82 MB / 2.72 GB)...
2025-11-14 19:42:33 -05:00
David Galiffi 540eda3865 [rocprof-sys] Forward ctest labels from the execution test to the validation test. (#1697)
* Forward ctest labels from the execution test to the validation test.

* Adjust test validation parameters for amd_smi samples

The actual number of samples will vary depending on the GPU. This test
is just to validate the presence of the samples
2025-11-13 21:49:07 -05:00
Milan Radosavljevic 833c250c27 Add clean up fixture for trace cache temporary files (#1836)
* Add clean up fixture for trace cache tmp files

* Switch to bash instead of cmake running command
2025-11-13 21:01:04 -05:00
Aleksandar Djordjevic f39a60ac25 [rocprofiler-systems] Apply new CMake formatting for the latest gersemi version (#1778)
* Fix cmake formatting

* Updated rev. in `.pre-commit-config.yaml`

* Pin the gersemi used in CI to v0.23.1, matching the pre-commit

---------

Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-10 13:08:44 -05:00
Kian Cossettini 2a080641a1 [rocprofiler-systems] Consolidate CTests to tests/ folder (#1461)
* Consolidate CTests to tests/ folder

* Remove comment

* Separate source code and test code for thread-limit into appropriate folders

* Remove sleeper.cpp and instead use linux sleep cmd

* Merge python-console tests into python-tests
2025-11-03 11:03:35 -05:00
David Galiffi 3d7a5eec0e Setup rocprofsys_root environment variable (#1561)
* Setup `rocprofsys_root` environment variable

* Update `CHANGELOGS`

* Fixed formatting

* Add rocpd output and validation to python tests

* Refactoring environment setup
2025-10-28 13:06:07 -04:00
David Galiffi 32f9fa6ca5 Enable some simple ROCpd testing (#834)
* Add rocpd testing and output validation

Added for transpose, video-decode, jpeg-decode, roctx, and openmp-target
Add JSON check to pre-commit-config

Co-authored-by: Marjan Antic <Marjan.Antic@amd.com>

* Remove redundant environment variable

* Fix spelling typo

* Fix typo in error message

* Fix memory_allocation query

* Incorporate feedback from review. Handle case where there are multiple matching "name_prefix" tables.

* Fix environment settings in `rocprof-sys-testing.cmake`

Accidentally removed in previous refactoring.

* Formatting python file

---------

Co-authored-by: Marjan Antic <Marjan.Antic@amd.com>
2025-10-20 17:40:10 -04:00
ajanicijamd 02883c3d8d Fixed openmp-vv tests (#1203)
* LD_LIBRARY_PATH was being overridden so tool's libraries could not be found.
2025-10-03 21:33:02 -04:00
ajanicijamd e2fc692ee0 Allow events to be grouped by HIP stream ID (#274)
- Correlate memory_copy and kernel_dispatch events with their HIP stream_id and add stream_id as an annotation in Perfetto.
- By default, group memory_copy and kernel_dispatch events in Perfetto output by their stream_id.
- Add option, with the configuration setting ROCPROFSYS_ROCM_GROUP_BY_QUEUE, to group by HSA queue instead.

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 4b4a846b58]
2025-07-23 21:28:26 -04:00
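
The grouping rule from commit e2fc692ee0 (group `memory_copy` and `kernel_dispatch` events by HIP stream ID by default, or by HSA queue when `ROCPROFSYS_ROCM_GROUP_BY_QUEUE` is set) can be illustrated with a short Python sketch. The event dictionaries, their field names `stream_id`, `queue_id` and `name`, and the function `group_events` are hypothetical; the real tool works on rocprofiler-sdk records and Perfetto tracks.

```python
from collections import defaultdict


def group_events(events, by_queue=False):
    """Group event names into tracks keyed by stream ID (default) or queue ID."""
    key = "queue_id" if by_queue else "stream_id"
    tracks = defaultdict(list)
    for event in events:
        tracks[event[key]].append(event["name"])
    return dict(tracks)


# Hypothetical events: two streams that happen to share one HSA queue.
events = [
    {"name": "kernel_dispatch", "stream_id": 1, "queue_id": 0},
    {"name": "memory_copy", "stream_id": 2, "queue_id": 0},
    {"name": "kernel_dispatch", "stream_id": 1, "queue_id": 0},
]
print(group_events(events))
print(group_events(events, by_queue=True))
```

In this made-up data the stream view separates the two streams into their own tracks, while the queue view merges everything onto one track, which shows why the two groupings can look quite different for the same run.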
David Galiffi 8fcf3a50b0 Use gersemi for CMake formatting (#257)
* Replace `cmake-format` with `gersemi`

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Remove .cmake-format.yaml

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update workflow to use gersemi

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING.md

* Update helper scripts

* Don't include `*/external/*` in workflows

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 122623a929]
2025-06-22 10:44:33 -04:00
David Galiffi c7c3c3f97e Use rocprofiler-sdk for RCCL-API tracing (#126)
- Add support for RCCL API tracing through rocprofiler-sdk.
- Refactored the comm_data code to use the SDK RCCL_API callbacks.
- Add a runtime version check for SDK to gate callback enablement, rather than just the compile-time check.
- Fixed: SAMPLING_TIMEOUT was not being handled correctly in add_test.

[ROCm/rocprofiler-systems commit: af77d93f75]
2025-06-06 11:36:17 -04:00
anujshuk-amd ef90bab236 Update transpose-rocprofiler-* tests (#210)
- Updating counters collected and tested for on Navi-based machines
- Add CMake function to query GPU architectures
- Update decode tests to use new functions

[ROCm/rocprofiler-systems commit: 4c24975626]
2025-05-22 14:04:33 -04:00
Aleksandar Djordjevic 395bf369fc [CMake] Fix GPU detection function
* Fix cmake GPU detection

* formatting cmake

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 39f676aa71]
2025-04-28 05:07:00 +02:00
David Galiffi 6a960a1edb Added copyright information to requested files (#167)
For SWDEV-526556

[ROCm/rocprofiler-systems commit: b25b6cec92]
2025-04-15 18:39:53 -04:00
David Galiffi bd0eeb9555 Reapply "Upgrade ROCm-SMI to AMD SMI (#86)" (#147)
* Reapply "Upgrade ROCm-SMI to AMD SMI (#86)"

This reverts commit 9fcea73122.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 85bbea4954]
2025-03-25 17:31:27 -04:00
Sajina PK 3ca3d63d5c Fix for excluding JPEG and VCN activity test. (#135)
JPEG activity recording is currently only supported on the MI300 series.
VCN activity is also supported on MI100, but there is a bug currently being fixed by FW.

- Currently only testing the Activity verification tests for MI300
- Also moves the Jpeg image copying code to after the package is found.

[ROCm/rocprofiler-systems commit: e605e5d33f]
2025-03-11 14:12:28 -04:00
Sohaib Nadeem 95a07edf0b Fix hardware counter summary files not being generated after profiling (#124)
- Register a cleanup function in tim::manager instance to write out data in
counter storages

- The counter_storage::write() calls in tool_fini happen after the storage is destroyed
which is too late for the write to happen.

- Adjust traits for counter_data_tracker

- Add MIN, MAX, VAR, STDDEV columns
- Remove DEPTH, UNITS, %SELF columns

- Update "add_validation_test" to test for the existence of output file(s).
- Added step to test perfetto output for `transpose-rocprofiler-sampling`
and `transpose-rocprofiler-binary-rewrite`

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 42922ec851]
2025-03-05 16:05:18 -05:00
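
Commit 95a07edf0b adds MIN, MAX, VAR and STDDEV columns to the hardware counter summary. The sketch below shows one way such columns could be computed from a list of counter samples in Python. The function name `summarize` is hypothetical, and the use of the population variance (dividing by N) is an assumption; the commit message does not say whether the tool uses the population or the sample form.

```python
import statistics


def summarize(samples):
    """Compute the summary columns for one counter from its samples.

    Uses population variance and standard deviation (an assumption).
    """
    return {
        "MIN": min(samples),
        "MAX": max(samples),
        "VAR": statistics.pvariance(samples),
        "STDDEV": statistics.pstdev(samples),
    }


print(summarize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

Swapping in `statistics.variance` and `statistics.stdev` would give the sample (N - 1) form instead.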
David Galiffi 9fcea73122 Revert "Upgrade ROCm-SMI to AMD SMI (#86)" (#100)
This reverts commit 8c5db3f1d8.

[ROCm/rocprofiler-systems commit: b3eee295dd]
2025-02-07 11:45:26 -05:00
cfallows-amd 8c5db3f1d8 Upgrade ROCm-SMI to AMD SMI (#86)
* Integrating amd-smi into rocprofiler-systems due to rocm-smi deprecation.
* No functionality changes to users other than naming conventions.
* New tracks available in Perfetto: the GPU busy percentage metric now splits gfx busy into separate gfx, umc, and mm engine measurements.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 0c32dfd6bc]
2025-01-30 21:32:07 -05:00
David Galiffi b29cfac106 Update to use rocprofiler-sdk (#55)
- Renames the CMake option "ROCPROFSYS_USE_HIP" to "ROCPROFSYS_USE_ROCM"
- Remove the "ROCPROFSYS_USE_ROCM_SMI" option. It is controlled with the "ROCPROFSYS_USE_ROCM" option instead.
   - Runtime configuration can still toggle ROCPROFSYS_USE_ROCM_SMI to disable the sampling.
- Rename ROCPROFSYS_HIP_VERSION macro to ROCPROFSYS_ROCM_VERSION and remove blocks for `ROCPROFSYS_ROCM_VERSION < 60000`
- Remove ROCPROFSYS_USE_ROCTRACER and ROCPROFSYS_USE_ROCPROFILER
- Update test cases
- Update docker files and workflows to install cmake 3.21, which is required for the rocprofiler-sdk findPackage script.
- Removed rocm-6.2 from workflows due to a rocprofiler-sdk API change. 

[ROCm/rocprofiler-systems commit: 88aa2d3cbe]
2024-12-13 18:48:39 -05:00
David Galiffi 489eda995d Rename Omnitrace to ROCm Systems Profiler (#4)
The Omnitrace program is being renamed. 

Full name: "ROCm Systems Profiler"
Package name: "rocprofiler-systems"
Binary / Library names: "rocprof-sys-*"

---------
Co-authored-by: Xuan Chen <xuchen@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: d07bf508a9]
2024-10-15 11:20:40 -04:00