Commit Graph

245 Commits

Author SHA1 Message Date
David Galiffi 4b0fb2cdf5 Rename "corr_id" to "stack_id" in Perfetto annotations to match new n… (#1618)
* Rename "corr_id" to "stack_id" in Perfetto annotations to match new naming in schema.

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* correlation_id.ancestor was not added until ROCPROFILER_VERSION 1.0

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-11-04 14:20:11 -05:00
Kian Cossettini 883caf2719 [rocprofiler-systems] Overhaul skip condition of implicit_task and add ROCPD validation test (#1589)
- Add rocpd validation check and fix implicit_task check
- SWDEV-562896
2025-10-31 09:59:23 -04:00
marantic-amd 08d259c24c Fix the issue when sampling JAX with rocpd (#1552) 2025-10-27 09:59:51 -04:00
Milan Radosavljevic 8806be162c Change how cache manager handles child process trace cache for rocpd (#1033)
* Change how cache manager handles child process trace cache

* Sampling and backtrace metrics to cache

* Apply cmake formatting

* Fix parsing of metadata json

* Code clean up

* Fix build nlohmann json from source

* Fix storage parsed finished callback

* Revert sampling for child process

* Change cache file name generating

* Fix thread start stop

* Fix process start end timestamp

* Applied suggestions from code review

* Try with late start of flushing task thread

* Change dockerfiles for ci

* Revert changes on github workflows

* Remove json_fwd.hpp include

* fix dump

* Build nlohmann/json by default

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update location of build artifacts for nlohmann/json

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Revert use_output_suffix

* Remove unused logs

* Fix cache store inside counter due to structure change

* Remove decode tests from debian ci

* Fix issue where all databases have the same UUID (#1499)

Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com>

* Removing the cpack and install steps to save space

* Revert "Remove decode tests from debian ci"

This reverts commit ddabf6dd142dcf438e6b8997b8abe86f2c868468.

* Revert "Removing the cpack and install steps to save space"

This reverts commit 973da3a1ba99d99d529af5269d30e177092f9bfa.

* Add prepare-runner job as dependency to clean up the space

* Fix formatting

* Free up even more space

* Remove verbose for workflows

* remove hw_counters from ext_data

* move space clean up inside container

* try to remove external folder to free up space

* Check space

* Refactor Cleanup to its own step

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Aleksandar Djordjevic <aleksandar.djordjevic@amd.com>
Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com>
2025-10-24 11:47:15 -04:00
Milan Radosavljevic 48fdcebf62 Add caching of category region for rocpd (#1420)
* Add caching of category region

Fix vaapi traces

Remove region_with_name

* Applied suggestions from code review
2025-10-20 16:05:14 -04:00
Milan Radosavljevic 00faa48ac2 Add flushing of perfetto buffer (#1417)
- Add flushing of perfetto buffer
- Add `ROCPROFSYS_PERFETTO_FLUSH_PERIOD_MS` config setting.
- Update CHANGELOG.sh
- Resolves SWDEV-518817

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-10-17 09:30:29 -04:00
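As a rough illustration of what a periodic flush setting does (commit 00faa48ac2 above), the sketch below flushes an in-memory buffer once a configured period has elapsed. Only the name `ROCPROFSYS_PERFETTO_FLUSH_PERIOD_MS` comes from the commit. The `PeriodicFlusher` class, the 1000 ms default, and the list-based buffer are hypothetical and are neither the project's implementation nor the Perfetto API.

```python
import os


class PeriodicFlusher:
    """Hypothetical buffer that flushes once a configured period has elapsed."""

    def __init__(self, period_ms, sink):
        self.period_ms = period_ms
        self.sink = sink  # list that receives flushed batches
        self.buffer = []
        self.last_flush_ms = 0

    def record(self, event, now_ms):
        self.buffer.append(event)
        # Flush only when the period has elapsed since the last flush.
        if now_ms - self.last_flush_ms >= self.period_ms:
            self.flush(now_ms)

    def flush(self, now_ms):
        if self.buffer:
            self.sink.append(list(self.buffer))
            self.buffer.clear()
        self.last_flush_ms = now_ms


def flush_period_from_env(env=None, default_ms=1000):
    """Read the flush period; the default value here is an assumption."""
    env = os.environ if env is None else env
    return int(env.get("ROCPROFSYS_PERFETTO_FLUSH_PERIOD_MS", default_ms))
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real tool would more likely flush from a background thread or timer.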
marantic-amd f2ccc96cfd Add missing counter events handling for ROCPD (#1305)
* Add missing counter events handling for ROCPD

* Update projects/rocprofiler-systems/source/lib/rocprof-sys/library/rocprofiler-sdk/counters.cpp

* Update projects/rocprofiler-systems/source/lib/rocprof-sys/library/rocprofiler-sdk/counters.cpp

* Fixed formatting

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: Marjan Antic <Marjan.Antic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-10-15 23:15:19 -04:00
David Galiffi 182a750c08 Fix for thread limit tests (#323)
* Fix for thread limit tests, which are failing because they exceed the number of threads allowed.

Signed-off-by: Anuj-Kumar Shukla <AnujKumar.Shukla@amd.com>

* Update CMakeLists.txt

* Stopping thread creation after max thread limit

* Addressed review comments

* Update projects/rocprofiler-systems/tests/source/CMakeLists.txt

---------

Signed-off-by: Anuj-Kumar Shukla <AnujKumar.Shukla@amd.com>
Co-authored-by: anujshuk-amd <anujshuk@amd.com>
2025-10-09 19:07:14 -04:00
Kian Cossettini 0c53a12a88 [rocprofiler-systems] [ROCpd] Add OMPT callbacks to ROCpd (#1016)
* Add OMPT to ROCpd

* Use correct category

* Added wrapper functions for future control

* Formatting

* Fix naming

* Comment change

* Remove ompt_get_cb_args

* Switched to using region_sample for OMPT

* Remove relic function

* Remove get_use_rocpd that was used in this pr (one still remains)

* Rename ompt_get_args_string and reuse in tool_tracing_callback_stop

* Make lock init and destroy cb instant

* [Prototype] ROCPD Name fix

* [Prototype] ROCPD Name fix P1

* [Prototype] ROCPD Name fix P2

* ROCPD Name fix

* Var name changes

* Rewrite cb overwrite to single function

* [Important] Use parallel_data as key for parallel callback map

* Fix workflow failure

* Make cpp USE_ROCM consistent with hpp and use default constructor if USE_ROCM = 0

* Add missing ROCPROFILER_VERSION check

* Improve readability

* Make ompt storage maps thread local

* Part 1: Variable name fix, memory cleanup, and fixed asserts

* Part 2: Add comments

* Part 3: Add CI_THROW

* Part 4: Formatting

* Part 5: Move #include to cpp
2025-10-07 19:01:25 -04:00
Kian Cossettini edfda63701 Remove OMPT category and fix certain preprocessor checks (#1165)
* Part 1: Remove OMPT Category
* Part 2: Properly remove backend choices
* Part 3: Ensure preprocessor checks handle a user-defined variable set to OFF
2025-10-02 21:08:18 -04:00
habajpai-amd 74fc268a32 Add libomptarget discovery to prevent OpenMP/HIP segfaults (#1043)
This PR fixes a segmentation fault seen when running rocprof-sys-sample with multi-process OpenMP/HIP applications.
The crash was caused by missing libomptarget.so on the runtime loader path or incorrect LD_PRELOAD settings.

Fixes SWDEV-552804

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-10-01 09:51:26 -04:00
David Galiffi 4d959460e1 Add ROCPROFSYS_PATH variable to environment (#1103)
* Add ROCPROFSYS_ROOT to the env for sample

* Add env for causal

* Add env for instrument

* Check for null and address memory leak

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-09-24 13:52:34 -04:00
Kian Cossettini 7eb606a582 Make lock init and destroy cb events instant (#1074)
Removed name changes for `ROCPROFILER_OMPT_ID_lock_init` and `ROCPROFILER_OMPT_ID_lock_destroy`.
Made both of these callbacks instant.
2025-09-24 07:41:47 -04:00
David Galiffi a57fd50865 Update the rocprof-sys-rt library (#786)
Derived from Dyninst_RT 13.0.0
2025-09-03 09:19:43 -04:00
Sajina PK 2da209da7f SWDEV-536287 - Detect SELinux mode and log error if enabled (#819)
* Detect SELinux mode and fail-fast

* Detect SELinux status by reading /sys/fs/selinux/enforce during initialization.
* Fix the verbose mode for HIP Stream events

* Add more information in the logs
Add information to the user about how to change the setting
2025-09-03 09:16:36 -04:00
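The SELinux check described in commit 2da209da7f above can be sketched as follows. The path `/sys/fs/selinux/enforce` comes from the commit message. The function names, the treatment of a missing or unreadable file as "disabled", and the wording of the hint are assumptions made for illustration, not the project's actual code or log text.

```python
from pathlib import Path

SELINUX_ENFORCE_PATH = Path("/sys/fs/selinux/enforce")


def selinux_mode(enforce_path=SELINUX_ENFORCE_PATH):
    """Return 'enforcing', 'permissive', or 'disabled' (hypothetical helper)."""
    try:
        value = Path(enforce_path).read_text().strip()
    except OSError:
        # Assumption: a missing or unreadable file means SELinux is not active.
        return "disabled"
    # The kernel exposes "1" for enforcing and "0" for permissive.
    return "enforcing" if value == "1" else "permissive"


def check_selinux(enforce_path=SELINUX_ENFORCE_PATH):
    """Return an error message if SELinux is enforcing, else None."""
    if selinux_mode(enforce_path) == "enforcing":
        return ("SELinux is in enforcing mode; consider `setenforce 0` "
                "(requires root) to switch to permissive mode.")
    return None
```

Returning a message instead of raising lets the caller decide whether to log the problem or stop early.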
Kian Cossettini 07a7b9b845 Use rocprofiler-SDK for OMPT tracing (#702)
Switch to using SDK for OMPT tracing and remove older OMPT code path
2025-08-26 16:54:01 -04:00
Milan Radosavljevic df7b9d559f Fix collecting of stream IDs for rocpd (#751) 2025-08-26 16:17:42 -04:00
Milan Radosavljevic 96a46962ad Change amd_smi and cpu_freq modules to use trace cache for rocpd (#690)
* Move amd-smi to use caching mechanism

* Add VCN and JPEG activity to rocpd

* Switch cpu_freq to use caching mechanism

* Different approach with xcp activity & applied suggestions from code review

* Applied suggestions from code review

* Fix shadowing

* Applied suggestions from code review
2025-08-26 14:00:04 -04:00
David Galiffi 847580dd9e Update minimum_cmake_required to match version used in CI (#679)
- Update minimum_cmake_required to match version used in CI
  - We should match the minimum version that we test against

- Ensure ".S" files are treated as assembly.
2025-08-21 15:56:47 -04:00
systems-assistant[bot] 1f86010ca2 ROCpd support [Part 2] (#109)
* Rocpd part 2, caching

* Fix shadowed variables

* backward compatibility

* Fixed designated initializers

* Fix timemory include

* Remove benchmark & Fix build issues for rhel

* Add missing bracket

* Fix shadowing and pedantic

* Fix pedantic pt2

* Fix duplicated SDK calls

* Add decay in get_size_impl

* Rename sample cache to trace cache

* Add cache storage supported types

* Resolving track naming in sampling module

* fix sampling of flushing thread

* fix sampling of flushing thread 2

* throw exception upon store while buffer storage is not running

* Prevent fork crashing

* Fix rebase issue

* Applied suggestions from code review

* Change flushing thread to use PTL

* Fix agent creation order

* Fix stream id ci throw

* Remove force setup of rocprofiler-sdk

* Code cleanup

* Change initialization for agent

* Add missing namespace

* Fix the mismatch within the tool_agent->device_id

* Switch from using handle to use agent type index

* Fix pmc info comparator in metadata registry

---------

Co-authored-by: Aleksandar <aleksandar.djordjevic@amd.com>
Co-authored-by: Milan Radosavljevic <milan.radosavljevic@amd.com>
Co-authored-by: Marjan Antic <marantic@amd.com>
2025-08-19 22:01:04 -04:00
habajpai-amd 15fb4943e2 Fix the openmp-target ctest (#300)
- openmp-target: add runtime rpath for libomptarget and update tests
- Handle events not associated with a HIP Stream
  - Kernels from OpenMP target offload are not associated with a HIP stream. Fix handling when the callback record's stream_id is 0

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: c424dac261]
2025-07-31 10:41:27 -04:00
Aleksandar Djordjevic 166babf234 ROCpd support [Part 1] (#279)
- Add rocpd support for
 - cpu_frequency
 - amd_smi
 - sampling


[ROCm/rocprofiler-systems commit: 26ae543012]
2025-07-28 11:33:52 -04:00
ajanicijamd e2fc692ee0 Allow events to be grouped by HIP stream ID (#274)
- Correlate memory_copy and kernel_dispatch events with their HIP stream_id and add stream_id as an annotation in Perfetto.
- By default, group memory_copy and kernel_dispatch events in Perfetto output by their stream_id.
- Add option, with the configuration setting ROCPROFSYS_ROCM_GROUP_BY_QUEUE, to group by HSA queue instead.

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 4b4a846b58]
2025-07-23 21:28:26 -04:00
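The grouping behaviour described in commit e2fc692ee0 above (group by HIP stream by default, by HSA queue when `ROCPROFSYS_ROCM_GROUP_BY_QUEUE` is set) might look roughly like this. The event dictionaries, the `queue_id` field name, and the accepted truthy spellings are assumptions for illustration only.

```python
import os
from collections import defaultdict


def group_by_queue_enabled(env=None):
    """Read the setting; the accepted spellings are an assumption."""
    env = os.environ if env is None else env
    value = env.get("ROCPROFSYS_ROCM_GROUP_BY_QUEUE", "false")
    return value.strip().lower() in ("1", "true", "on", "yes")


def group_events(events, by_queue=False):
    """Group event names into tracks keyed by stream_id or queue_id."""
    key = "queue_id" if by_queue else "stream_id"
    tracks = defaultdict(list)
    for event in events:
        tracks[event[key]].append(event["name"])
    return dict(tracks)
```

For example, `group_events(events, by_queue=group_by_queue_enabled())` would pick the grouping key from the environment.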
Sajina PK d4da72bf2d Update gotcha submodule from timemory (#277)
* Update gotcha submodule from timemory

* Fix build failure and add copilot suggestions

* Fix formatting errors

[ROCm/rocprofiler-systems commit: d26486ad83]
2025-07-14 21:12:10 -04:00
darren-amd a5ca94ab9c Fix ROCtx event ranges in trace output (#278)
* Fix marker api traces

* Remove space

* Formatting change

* Small change

* Update Changelog

* Add period to changelog

* Update source/lib/rocprof-sys/library/rocprofiler-sdk.cpp

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

* Fix roctx tests

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: c996c23a13]
2025-07-14 19:31:14 -04:00
Sajina PK 329183b112 Conditionally include backtraces in ROCPROFSYS_THROW based on verbosity (#272)
* Conditionally include backtraces in ROCPROFSYS_THROW based on verbosity

Modify ROCPROFSYS_THROW to only include backtraces when:
  debug mode is enabled, OR
  verbose level is >= 2, OR
  running in CI environment

* Fix formatting errors

[ROCm/rocprofiler-systems commit: b0ff07b4fe]
2025-07-07 14:14:02 -04:00
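The three conditions listed in commit 329183b112 above reduce to a single predicate. The sketch below restates that rule in Python. The real `ROCPROFSYS_THROW` is a C++ macro, so the function names and the way the backtrace is passed in are purely illustrative.

```python
def should_include_backtrace(debug, verbose, in_ci):
    """True when debug mode is on, verbosity is >= 2, or running in CI."""
    return bool(debug) or verbose >= 2 or bool(in_ci)


def format_error(message, backtrace, *, debug=False, verbose=0, in_ci=False):
    """Append the backtrace lines only when the predicate allows it."""
    if should_include_backtrace(debug, verbose, in_ci):
        return message + "\n" + "\n".join(backtrace)
    return message
```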
David Galiffi 8fcf3a50b0 Use gersemi for CMake formatting (#257)
* Replace `cmake-format` with `gersemi`

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Remove .cmake-format.yaml

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update workflow to use gersemi

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING.md

* Update helper scripts

* Don't include `*/external/*` in workflows

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 122623a929]
2025-06-22 10:44:33 -04:00
David Galiffi 0403aaa97f Use clang-format-18 for source formatting (#256)
* Updating clang-format to v18

- Updates the pre-commit-config
- Formats source files according to the utility

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update format source workflow

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING

* Update comment in .clang-format

* Update CONTRIBUTING.md

* Update helper script

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 1e13b590e7]
2025-06-22 08:48:08 -04:00
Sajina PK 6a8cef771e Show VCN and JPEG busy values where VCN/JPEG activity is not supported. (#232)
With AMD SMI in ROCm 7.0, vcn_activity and jpeg_activity are not reported when the XCP (partition) stats vcn_busy and jpeg_busy are available, which causes activity tracking to fail. The fix is to read the busy values when activity values are not supported.

For issue: SWDEV-536439

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: e3741f678b]
2025-06-19 16:23:30 -04:00
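The fallback described in commit 6a8cef771e above can be pictured as follows. The metric names `vcn_activity`, `jpeg_activity`, `vcn_busy`, and `jpeg_busy` come from the commit message. Representing a device's metrics as a dictionary, and using a missing key or `None` to mean "not supported", are assumptions of this sketch and do not reflect the AMD SMI API.

```python
def media_engine_value(metrics, engine):
    """Return the activity value for 'vcn' or 'jpeg', else the busy value.

    `metrics` is a hypothetical dict; a missing key or None means the
    value is not supported on this device.
    """
    activity = metrics.get(f"{engine}_activity")
    if activity is not None:
        return activity
    # Activity is unsupported: fall back to the XCP (partition) busy value.
    return metrics.get(f"{engine}_busy")
```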
David Galiffi 133834335d Unhandled enum in switch statement (#247)
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 244c193a57]
2025-06-17 09:57:27 -04:00
Kian Cossettini 348afae1a8 Improve rocprof-sys-avail to report VCN and JPEG metrics on supported devices (#226)
* SWDEV-535445: rocprof-sys-avail shows jpeg_activity even when unsupported

* Added vcn tracking

* jpeg and vcn description now includes supported gpus

* Add getter methods per device to check vcn and jpeg support

Add logic to check if vcn activity and vcn busy values are supported for each device.
Add logic to check if jpeg activity and jpeg busy values are supported for each device.

Co-authored-by: Sajina P Kandy <sputhala@amd.com>

* Add getter methods per device to check vcn and jpeg support (#228)

* Formatting

* Variable fix

* List of supported GPUs is now ordered

* Removed the ability to see which gpu supports jpeg and vcn activity to reduce clutter

* Formatting

* Testing for busy support

* jpeg and vcn only show if supported

* Removed commented code

* Formatting

* Applied amd_smi cpp/hpp fixes

* Added break condition for xcp loop

* Modified loops for efficiency

* Removed unnecessary macro

* Removed unnecessary includes

---------

Co-authored-by: Sajina Kandy <sputhala@amd.com>
Co-authored-by: Sajina PK <Sajina.PuthalathKandy@amd.com>

[ROCm/rocprofiler-systems commit: 0380cf58ba]
2025-06-09 16:14:53 -04:00
David Galiffi c7c3c3f97e Use rocprofiler-sdk for RCCL-API tracing (#126)
- Add support for RCCL API tracing through rocprofiler-sdk.
- Refactored the comm_data code to use the SDK RCCL_API callbacks.
- Add a runtime version check for SDK to gate callback enablement, rather than just the compile-time check.
- Fixed: SAMPLING_TIMEOUT was not being handled correctly in add_test.

[ROCm/rocprofiler-systems commit: af77d93f75]
2025-06-06 11:36:17 -04:00
habajpai-amd f718bd907c SWDEV-507117: Unify OMP Target Offload Events into a Single Perfetto … (#230)
* SWDEV-507117: Unify OMP Target Offload Events into a Single Perfetto Timeline Row

* Fixed warning and format

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: c5507e3740]
2025-06-06 11:52:30 +05:30
Sajina PK 916aac1e92 Enable MPI tracing for Fortran (#185)
- Move the MPI gotcha functionality from Timemory to the repo.
- Add the PMPI Fortran MPI functions to the existing mpi gotcha handle.

[ROCm/rocprofiler-systems commit: 4fcd8cc78d]
2025-06-04 18:06:18 -04:00
habajpai-amd 6e4ced65b8 SWDEV-533856: Handle dynamic event for HIP api for perfetto (#225)
* SWDEV-533856: Handle dynamic event for HIP api for perfetto

* Refactor: Generalize function using template

* Format Source

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: abecaa8bf8]
2025-06-04 15:11:26 +05:30
David Galiffi a4208bbd94 Fix compilation failure in amd-smi 26.0.0 (#223)
- The parameter "year" was removed from amdsmi_version_t.
- For SWDEV-535858, SWDEV-535870

[ROCm/rocprofiler-systems commit: 650827c5ea]
2025-06-02 18:22:13 -04:00
habajpai-amd 75a335b245 Add corr_id for HIP Runtime API in Perfetto (#218)
for SWDEV-533883

---------

Co-authored-by: Aleksandar Djordjevic <aleksandar.djordjevic@amd.com>

[ROCm/rocprofiler-systems commit: 39090bfc54]
2025-05-29 16:00:31 -04:00
Pranjal Swarup 3fa24a0012 ROCPROFSYS_AMD_SMI_METRICS visibility (#208)
* Removed advanced category from ROCPROFSYS_AMD_SMI_METRICS to have this property visible with rocprof-sys-avail.

[ROCm/rocprofiler-systems commit: 4c7560c78c]
2025-05-15 13:31:40 -04:00
Sajina PK f14ca86a74 Add new VA-API methods to the gotcha wrappers (#203)
Adds new VA-API functions to the gotcha wrapper to support a new feature in rocJPEG.

[ROCm/rocprofiler-systems commit: 90ad264447]
2025-05-13 08:05:55 -04:00
David Galiffi 6fe19b681a Fix path to post-processing merge script (#187)
- Path to merge script not found unless user explicitly sources "share/rocprofiler-systems/setup-env.sh" to setup PATHs.
- Instead, let's derive the path when the application loads and use it when executing the helper script
- Rename script to rocprof-sys-merge-output.sh.
- Change install folder to <prefix>/libexec/rocprofiler-systems based on dev-ops feedback.
- Updated PATH variable in the modulefile and source script.
- For SWDEV-528101

[ROCm/rocprofiler-systems commit: adc66956b0]
2025-05-02 16:52:54 -04:00
anujshuk-amd 31dc2414e3 Reverting PR-154 Changes since VCN data not seen on Perfetto file (#191)
[ROCm/rocprofiler-systems commit: ff109912c2]
2025-05-02 16:19:43 -04:00
David Galiffi 490ff33d25 Conditionally include ROCPROFILER_BUFFER_TRACING_PAGE_MIGRATION (#193)
- Include only if ROCPROFILER_SDK_VERSION < 1.0.0, as it is being removed
- For SWDEV-530639

[ROCm/rocprofiler-systems commit: 0f16d45445]
2025-05-02 15:05:27 -04:00
Sajina PK 8c424f2074 Fix to overlapping VCN and JPEG tracks in perfetto (#192)
- Fix overlapping VCN and JPEG activity values in Perfetto output.
- Modify the storage of the activity values to be more efficient.

[ROCm/rocprofiler-systems commit: 99a411fe52]
2025-05-01 19:40:49 -04:00
Luca Bruni 579596dbba Appropriately filter data based on -D and -H options (#163)
- Addresses concern that device metric tracks are still shown in Perfetto trace file even when only -H is specified to rocprof-sys-sample (and vice versa).
- Update sampling call-stack docs.

[ROCm/rocprofiler-systems commit: 8ae6651357]
2025-04-30 09:50:51 -04:00
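A minimal sketch of the filtering in commit 579596dbba above, assuming (as the commit text implies) that `-H` selects host metrics and `-D` selects device metrics. The track dictionaries and the `kind` field are hypothetical, and the case where neither option is given is not modeled here.

```python
def filter_tracks(tracks, host=False, device=False):
    """Keep only the tracks whose kind was explicitly requested."""
    wanted = set()
    if host:
        wanted.add("host")
    if device:
        wanted.add("device")
    return [t["name"] for t in tracks if t["kind"] in wanted]
```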
anujshuk-amd 35b8748c20 Fix ROCPROFSYS_AMD_SMI_METRICS parsing (#178)
Fixes a bug where all the `ROCPROFSYS_AMD_SMI_METRICS` values were being recorded by default.
Fixes a bug where the 'all' and 'none' values raised an exception when specified for `ROCPROFSYS_AMD_SMI_METRICS`.

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 8d48048bd3]
2025-04-28 09:22:20 -04:00
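To illustrate the kind of parsing that commit 35b8748c20 above fixes, the sketch below accepts the keywords 'all' and 'none' alongside a comma-separated list. The metric names in `KNOWN_METRICS` are placeholders, and treating an empty value as "record nothing" is an assumption; neither is taken from the project.

```python
KNOWN_METRICS = ("busy", "temp", "power", "mem_usage")  # placeholder names


def parse_metrics(value, known=KNOWN_METRICS):
    """Parse a ROCPROFSYS_AMD_SMI_METRICS-style string into a metric list."""
    tokens = [t.strip().lower() for t in value.split(",") if t.strip()]
    if tokens == ["all"]:
        return list(known)
    if not tokens or tokens == ["none"]:
        return []
    unknown = [t for t in tokens if t not in known]
    if unknown:
        raise ValueError(f"unknown metric(s): {', '.join(unknown)}")
    return tokens
```

Handling the two keywords before validating names is what keeps 'all' and 'none' from being rejected as unknown metrics.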
Sohaib Nadeem 160edff37f Initialization fixes (#154)
- Remove tooling initialization from rocprofiler_configure:
when rocprofiler_configure is called from __hip_module_ctor
(which in turn is called as a global constructor when loading shared
libraries or before main in a hip program), initializing tooling
in it can cause problems because it is too early to do some of the tasks
that it involves (e.g. opening shared libraries, creating threads).
Instead, we rely on rocprofsys_main to initialize tooling later.

- Skip rocprofiler_configure if ROCPROFSYS_PRELOAD is not set since
preload is required for tooling (such as perfetto, which is used by
the rocprofiler callbacks) to be initialized.

- Revert RCCL initialization changes: These are no longer needed since rocprofsys_init_tooling_hidden will not
be called from rocprofiler_configure

- Force rocprofiler_configure in rocprofsys_init_tooling_hidden if it hasn't been
called through __hip_module_ctor global constructor

[ROCm/rocprofiler-systems commit: 0e535daa93]
2025-04-21 17:04:24 -04:00
David Galiffi bb4ed0b3ba Add rocm-6.4 to workflows (#165)
* Add rocm-6.4 to workflows

* Update containers.yml

* Update cpack.yml

* Update cpack.yml

* Disable OpenMP Target Examples on GitHub Runners

* Fix build warnings.

Switch statements with unhandled enums.

* Enable testing on 6.3 and 6.4

* Ubuntu 24 workflow. Build both ROCm 6.3 and 6.4

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 169c9a0d49]
2025-04-18 09:32:26 -04:00
David Galiffi 83cc60f4b7 Fix rocprofiler-sdk includes (#169)
For compatibility with recent rocprofiler-sdk change.

[ROCm/rocprofiler-systems commit: 2680ccc3a7]
2025-04-16 21:18:06 -04:00
anujshuk-amd 36f7de25a2 Change the default value of ROCPROFSYS_SAMPLING_CPUS to "none" (#164)
[ROCm/rocprofiler-systems commit: 807a622b04]
2025-04-11 17:09:26 -04:00
Sajina PK 04fb7e4fe7 RocJpeg cmake and document fixes (#157)
- Fix for rocjpeg sample cmake due to changes in the rocJPEG project
- Fix for rocprofiler-sdk version check - change the format
- Edits to docs for jpeg and vcn activity support - mention that these values may not be supported on all ASICs.

[ROCm/rocprofiler-systems commit: fad3a0d341]
2025-04-09 16:20:02 -04:00