24 Commits

Author SHA1 Message Date
Milan Radosavljevic b533f56197 Add automatic PyTorch library discovery for Python applications (#2623)
* Add automatic PyTorch library discovery for Python applications (#2623)
2026-01-20 08:42:49 +01:00
Milan Radosavljevic 318d13870f [rocprofiler-systems] Update logging to use spdlog library (#2428)
## Motivation

- Structured logging with proper log levels (TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL)
- Better performance through compile-time formatting
- Consistent formatting using fmt library
- Runtime log level control via arguments and environment variables
- Easier maintenance and debugging capabilities

## Technical Details

- Added spdlog as a submodule and integrated it into CMake build system
- Created new `rocprofiler-systems-logger` library wrapping spdlog functionality
- Replaced custom logging macros (`ROCPROFSYS_VERBOSE`, `ROCPROFSYS_DEBUG`, `ROCPROFSYS_FATAL`, `ROCPROFSYS_REQUIRE`, `ROCPROFSYS_CI_THROW`, etc.) with spdlog equivalents (`LOG_DEBUG`, `LOG_WARNING`, `LOG_CRITICAL`, etc.)
- Implemented log level control through command-line arguments and environment variables
- Converted assertion macros to proper error handling with exceptions and std::abort()
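
The runtime level control described above can be sketched as follows. This is a minimal illustration in Python, not the project's C++/spdlog code: the variable name `EXAMPLE_LOG_LEVEL`, the function `resolve_log_level`, and the precedence (command-line value first, then environment, then default) are all assumptions made for the example.

```python
import logging
import os

# Hypothetical variable name, used only for this illustration.
LEVEL_ENV_VAR = "EXAMPLE_LOG_LEVEL"

# Python's logging module has no TRACE level, so a custom value below DEBUG
# is registered to mirror the six levels listed in the commit message.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")

LEVELS = {
    "TRACE": TRACE,
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(cli_value=None, environ=None, default="INFO"):
    """Pick a level: command-line value first, then environment, then default.

    The precedence order is an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    name = cli_value or environ.get(LEVEL_ENV_VAR) or default
    try:
        return LEVELS[name.upper()]
    except KeyError:
        raise ValueError(f"unknown log level: {name!r}") from None


logger = logging.getLogger("example")
logger.setLevel(resolve_log_level())
```

Keeping the resolution in one function means every entry point applies the same rule, whichever source supplied the level.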
2026-01-14 15:27:51 -05:00
Sajina PK b3f59a37e4 [Rocprofiler-system]: Fix GPU event enumeration for rocprof-sys-avail and CLI option for parsing GPU HW Counters (#2476)
## Motivation

The `rocprof-sys-avail -H -c GPU` command returns blank output, although it is expected to display a list of available GPU hardware counters.
The `rocprof-sys-sample` and `rocprof-sys-run` binaries are missing the `--gpu-events` option for specifying GPU counter events during profiling.

## Technical Details

The initialize_event_info() function had a logic bug where it only called set_agents() if the agent_manager was empty, but the actual issue was that the gpu_agents and cpu_agents vectors were empty even when agents were discovered.
Fixed the conditional logic to properly call set_agents() when gpu_agents and cpu_agents are empty, regardless of the agent_manager state.
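
The difference between the old and new conditions can be illustrated with a small sketch. It is written in Python for brevity and only models the description above; `EventInfoState` and both predicate functions are hypothetical stand-ins, not the actual C++ code.

```python
from dataclasses import dataclass, field


@dataclass
class EventInfoState:
    """Hypothetical stand-in for the state inspected by initialize_event_info()."""

    agent_manager: list = field(default_factory=list)
    gpu_agents: list = field(default_factory=list)
    cpu_agents: list = field(default_factory=list)


def needs_set_agents_before(state):
    # Old condition as described: only an empty agent_manager triggers set_agents().
    return not state.agent_manager


def needs_set_agents_after(state):
    # Fixed condition as described: empty agent vectors trigger set_agents(),
    # regardless of what the agent_manager holds.
    return not state.gpu_agents and not state.cpu_agents


# Agents were discovered by the manager, but the vectors were never filled.
state = EventInfoState(agent_manager=["agent0"])
print(needs_set_agents_before(state), needs_set_agents_after(state))
```

With the manager populated and both vectors empty, the old predicate skips `set_agents()` while the fixed one calls it, which matches the blank-output symptom described in the motivation.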

Added the `--gpu-events (-G)` option which sets the `ROCPROFSYS_ROCM_EVENTS` environment variable to the specified values.
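
A minimal sketch of how such an option can be forwarded through the environment, using Python's `argparse` for illustration. Only the option names and `ROCPROFSYS_ROCM_EVENTS` come from the commit message; the program name, the helper functions, the placeholder event names, and the comma-joined value format are assumptions.

```python
import argparse


def build_parser():
    # "example-launcher" is a hypothetical program name.
    parser = argparse.ArgumentParser(prog="example-launcher")
    parser.add_argument(
        "-G",
        "--gpu-events",
        nargs="+",
        default=[],
        metavar="EVENT",
        help="GPU counter events to collect",
    )
    return parser


def apply_gpu_events(args, env):
    """Copy the requested events into the environment passed to the profiled process."""
    if args.gpu_events:
        # Assumption: multiple events are joined with commas.
        env["ROCPROFSYS_ROCM_EVENTS"] = ",".join(args.gpu_events)
    return env


args = build_parser().parse_args(["-G", "EVENT_A", "EVENT_B"])
env = apply_gpu_events(args, {})
```

Leaving the variable unset when the option is absent keeps any value the user already exported.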

Fixes an issue where an unsupported GPU/APU architecture is being skipped gracefully; more details about this issue are in the comment below.
2026-01-09 11:59:45 -05:00
marantic-amd bb83791b17 Remove redundant ROCPROFSYS_TRACE_CACHED variable from the code (#2434) 2025-12-25 13:36:04 +01:00
marantic-amd ba1380a75d Put cached perfetto traces as default one (#2138)
* Put cached perfetto traces as default one

* Improve cached data and perfetto traces in order to be more aligned with E2E tests

* Addressing PR comments and findings

* Force early instrumentation bundle instantiation

* Sync-up instrumented containers with thread growth data

* Revert ompvv number of host threads to default 8

* Fixed counter track namings for amd-smi

* AIPROFSYST-34 [rocprof-sys] Update documentation describing newly introduced changes to default tracing mechanism
2025-12-22 12:47:35 +01:00
habajpai-amd 30161885e2 refactor: centralize update_env across binaries with unit test added … (#2029)
* refactor: centralize update_env across binaries with unit test added for testing

* removed unused includes suggested by clangd and small cleanup

* use centralized update_env in argparse as well

* review comments incorporated

* move update_env tests closer to common library

* fix: missing common:: prefix in rocprof-sys-sample

* cmake formatting
2025-12-04 19:24:27 +05:30
marantic-amd 3b11e01716 Perfetto traces from cached data (#1704)
## Motivation

The idea is to unify how and where we store our traces. The current implementation uses `trace_cache` for rocpd traces, but perfetto is inlined inside each module. This change gives us a single point in the code where we collect data, process it, and store it in the desired format. This means that we can declutter the code further and have a single point of responsibility and a single point of failure.

## Technical Details

A new `processor` (perfetto_post_processing.cpp) is added to the `trace_cache`; its purpose is to use the cached data to populate perfetto tracks. The cache manager owns the instance of this processor and manages its lifetime.
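
The ownership described above, where a manager holds the cache and replays it through a processor, might look roughly like this. The sketch is in Python and every class, method, and record layout in it is hypothetical; it does not reflect the actual `trace_cache` interfaces or the Perfetto SDK.

```python
class TrackProcessor:
    """Hypothetical processor: groups cached samples into named tracks."""

    def __init__(self):
        self.tracks = {}

    def handle(self, record):
        track, timestamp, value = record
        self.tracks.setdefault(track, []).append((timestamp, value))


class CacheManager:
    """Hypothetical manager: owns the cached records and the processors that read them."""

    def __init__(self):
        self._records = []
        self._processors = []

    def register(self, processor):
        # The manager keeps the processor alive for as long as it exists itself.
        self._processors.append(processor)
        return processor

    def store(self, track, timestamp, value):
        self._records.append((track, timestamp, value))

    def post_process(self):
        # Single place where cached data is turned into output tracks.
        for record in self._records:
            for processor in self._processors:
                processor.handle(record)


manager = CacheManager()
processor = manager.register(TrackProcessor())
manager.store("gpu", 1, 10)
manager.store("gpu", 2, 20)
manager.store("cpu", 1, 5)
manager.post_process()
```

Because modules only call `store`, adding another output format in this sketch means registering another processor rather than touching each module.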
2025-12-01 09:59:16 -05:00
habajpai-amd b09834e784 refactor: duplicated path helpers into common/path.hpp (#1249)
* refactor: duplicated path helpers into common/path.hpp

* update rocprof-sys-instrument to use shared path utility

* Add path::realpath(std::string[, std::string*]) helper function in common/path.hpp for binaries

* common: centralize remove_env implementation in environment.hpp

* remove unused includes from rocprof-sys binaries and argparse

* changing set to unordered_set wherever sorting is not required and additional cleanup

* review comment incorporated

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* copilot review for remove_env incorporated

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-20 09:55:00 +05:30
Kian Cossettini edfda63701 Remove OMPT category and fix certain preprocessor checks (#1165)
* Part 1: Remove OMPT Category
* Part 2: Properly remove backend choices
* Part 3: Ensure preprocessor checks if user defined var to OFF
2025-10-02 21:08:18 -04:00
habajpai-amd 74fc268a32 Add libomptarget discovery to prevent OpenMP/HIP segfaults (#1043)
This PR fixes a segmentation fault seen when running rocprof-sys-sample with multi-process OpenMP/HIP applications.
The crash was caused by missing libomptarget.so on the runtime loader path or incorrect LD_PRELOAD settings.

Fixes SWDEV-552804

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-10-01 09:51:26 -04:00
David Galiffi 4d959460e1 Add ROCPROFSYS_PATH variable to environment (#1103)
* Add ROCPROFSYS_ROOT to the env for sample

* Add env for causal

* Add env for instrument

* Check for null and address memory leak

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-09-24 13:52:34 -04:00
Kian Cossettini 07a7b9b845 Use rocprofiler-SDK for OMPT tracing (#702)
Switch to using SDK for OMPT tracing and remove older OMPT code path
2025-08-26 16:54:01 -04:00
David Galiffi 8fcf3a50b0 Use gersemi for CMake formatting (#257)
* Replace `cmake-format` with `gersemi`

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Remove .cmake-format.yaml

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update workflow to use gersemi

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING.md

* Update helper scripts

* Don't include `*/external/*` in workflows

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 122623a929]
2025-06-22 10:44:33 -04:00
David Galiffi c7c3c3f97e Use rocprofiler-sdk for RCCL-API tracing (#126)
- Add support for RCCL API tracing through rocprofiler-sdk.
- Refactored the comm_data code to use the SDK RCCL_API callbacks.
- Add a runtime version check for SDK to gate callback enablement, rather than just the compile-time check.
- Fixed: SAMPLING_TIMEOUT was not being handled correctly in add_test.
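
The idea of gating a feature on both a compile-time flag and a runtime version check can be sketched as below. This is an illustration in Python under stated assumptions: the function names and the dotted version format are hypothetical, and no real rocprofiler-sdk version threshold is implied.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.2.0' into a comparable tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def callbacks_enabled(compiled_with_support, runtime_version, minimum_version):
    # Both conditions must hold: support was compiled in, and the library
    # actually loaded at run time is new enough.
    return compiled_with_support and (
        parse_version(runtime_version) >= parse_version(minimum_version)
    )
```

Comparing integer tuples rather than strings avoids ordering mistakes such as treating "1.10.0" as older than "1.9.0".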

[ROCm/rocprofiler-systems commit: af77d93f75]
2025-06-06 11:36:17 -04:00
David Galiffi 6fe19b681a Fix path to post-processing merge script (#187)
- Path to merge script not found unless user explicitly sources "share/rocprofiler-systems/setup-env.sh" to setup PATHs.
- Instead, let's derive the path when the application loads and use it when executing the helper script
- Rename script to rocprof-sys-merge-output.sh.
- Change install folder to <prefix>/libexec/rocprofiler-systems based on dev-ops feedback.
- Updated PATH variable in the modulefile and source script.
- For SWDEV-528101
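
Deriving the helper script location from the running executable, instead of relying on PATH, can be sketched like this. The script name and the `libexec/rocprofiler-systems` folder come from the list above; the assumption that the executable lives in `<prefix>/bin`, and the function name, are made up for the illustration, which is in Python rather than the project's C++.

```python
from pathlib import Path

SCRIPT_NAME = "rocprof-sys-merge-output.sh"


def merge_script_path(executable):
    """Locate the helper script relative to the running executable.

    Assumption: the executable is installed at <prefix>/bin/<name>.
    """
    prefix = Path(executable).resolve().parent.parent
    return prefix / "libexec" / "rocprofiler-systems" / SCRIPT_NAME
```

A caller could pass `sys.argv[0]` or an equivalent absolute path; the result no longer depends on whether the user sourced a setup script.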

[ROCm/rocprofiler-systems commit: adc66956b0]
2025-05-02 16:52:54 -04:00
Luca Bruni 579596dbba Appropriately filter data based on -D and -H options (#163)
- Addresses concern that device metric tracks are still shown in Perfetto trace file even when only -H is specified to rocprof-sys-sample (and vice versa).
- Update sampling call-stack docs.

[ROCm/rocprofiler-systems commit: 8ae6651357]
2025-04-30 09:50:51 -04:00
David Galiffi bd0eeb9555 Reapply "Upgrade ROCm-SMI to AMD SMI (#86)" (#147)
* Reapply "Upgrade ROCm-SMI to AMD SMI (#86)"

This reverts commit 9fcea73122.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 85bbea4954]
2025-03-25 17:31:27 -04:00
David Galiffi 2c9d92be33 Remove remaining roctracer references (#82)
[ROCm/rocprofiler-systems commit: e437200e9e]
2025-02-07 23:27:58 -05:00
David Galiffi 9fcea73122 Revert "Upgrade ROCm-SMI to AMD SMI (#86)" (#100)
This reverts commit 8c5db3f1d8.

[ROCm/rocprofiler-systems commit: b3eee295dd]
2025-02-07 11:45:26 -05:00
cfallows-amd 8c5db3f1d8 Upgrade ROCm-SMI to AMD SMI (#86)
* Integrating amd-smi into rocprofiler-systems due to rocm-smi deprecation.
* No functionality changes to users other than naming conventions.
* New tracks available in perfetto: the gpu busy percentage metric now splits gfx busy into separate gfx, umc, and mm engine measurements.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 0c32dfd6bc]
2025-01-30 21:32:07 -05:00
Peter Park 3f9a3861ac Update copyright year to 2025 (#83)
[ROCm/rocprofiler-systems commit: 0a15d355e0]
2025-01-29 16:53:16 -05:00
Maarten Arnst 0447cfdc58 Update to KOKKOS_TOOLS_LIBS env var (#69)
[ROCm/rocprofiler-systems commit: 043a8010a9]
2025-01-29 16:53:15 -05:00
David Galiffi b29cfac106 Update to use rocprofiler-sdk (#55)
- Renames the CMake option "ROCPROFSYS_USE_HIP" to "ROCPROFSYS_USE_ROCM"
- Remove the "ROCPROFSYS_USE_ROCM_SMI" option. It is controlled with the "ROCPROFSYS_USE_ROCM" option instead.
   - Runtime configuration can still toggle ROCPROFSYS_USE_ROCM_SMI to disable the sampling.
- Rename ROCPROFSYS_HIP_VERSION macro to ROCPROFSYS_ROCM_VERSION and remove blocks for `ROCPROFSYS_ROCM_VERSION < 60000`
- Remove ROCPROFSYS_USE_ROCTRACER and ROCPROFSYS_USE_ROCPROFILER
- Update test cases
- Update docker files and workflows to install cmake 3.21, which is required for the rocprofiler-sdk findPackage script.
- Removed rocm-6.2 from workflows due to a rocprofiler-sdk API change. 

[ROCm/rocprofiler-systems commit: 88aa2d3cbe]
2024-12-13 18:48:39 -05:00
David Galiffi 489eda995d Rename Omnitrace to ROCm Systems Profiler (#4)
The Omnitrace program is being renamed. 

Full name: "ROCm Systems Profiler"
Package name: "rocprofiler-systems"
Binary / Library names: "rocprof-sys-*"

---------
Co-authored-by: Xuan Chen <xuchen@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: d07bf508a9]
2024-10-15 11:20:40 -04:00