2070 Commits

Author SHA1 Message Date
Kian Cossettini edfda63701 Remove OMPT category and fix certain preprocessor checks (#1165)
* Part 1: Remove OMPT Category
* Part 2: Properly remove backend choices
* Part 3: Ensure preprocessor checks if user defined var to OFF
2025-10-02 21:08:18 -04:00
David Galiffi c0f8627e7f Update CI Docker files (#1202)
- Add `nlohmann-json-dev` (or equivalent) to CI Docker images for RHEL, SUSE, and Ubuntu.
- Add `gmock-dev` and `gtest-dev` (or equivalent) to CI Docker images for RHEL, SUSE, and Ubuntu.
- Add `--set solver classic` to conda config to resolve an issue setting up the conda environment.
- Fix Perfetto package installation on the Ubuntu Noble image.
- Add a check and log an error if pip installation fails.

---------

Co-authored-by: jbonnell-amd <jason.bonnell@amd.com>
2025-10-02 21:06:01 -04:00
cfreeamd fb8ab442b6 rocr: Don't assert in hsa_shut_down when no agents (#1115)
* rocr: Don't assert in hsa_shut_down when no agents

Instead, print error message and return an error. Prior to
this patch, the assertion would occur when hsa_shut_down() is
called more than once.

* rocr: Reorder Unload ASAN clean-up on shut down
2025-10-02 17:20:53 -07:00
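
The change above replaces an assertion with an error message and an error return when hsa_shut_down() is called more often than the runtime was initialized. A minimal Python sketch of that pattern follows; the `Runtime` class and the status strings are hypothetical stand-ins for illustration and are not the ROCR implementation.

```python
class Runtime:
    """Hypothetical reference-counted runtime (illustration only, not ROCR)."""

    def __init__(self):
        self.ref_count = 0  # number of outstanding init() calls

    def init(self):
        self.ref_count += 1
        return "SUCCESS"

    def shut_down(self):
        # Report an error instead of asserting when there is nothing to shut down.
        if self.ref_count == 0:
            print("error: shut_down() called while the runtime is not initialized")
            return "ERROR_NOT_INITIALIZED"
        self.ref_count -= 1
        return "SUCCESS"
```

With this shape, a second `shut_down()` after a single `init()` returns an error status that the caller can check, rather than aborting the process.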
cfreeamd 402aa7e253 rocr: Support batching in InterceptQueue store (#1194)
* rocr: Support batching in InterceptQueue store

* Fix comment, loop bounds
2025-10-02 10:37:40 -07:00
cfreeamd 55feeefcff Revert "rocr: Remove QueueProxy (#700)" (#1167)
This reverts commit c34c9826c3,
which was causing test failures.
2025-10-01 18:24:43 -07:00
habajpai-amd 74fc268a32 Add libomptarget discovery to prevent OpenMP/HIP segfaults (#1043)
This PR fixes a segmentation fault seen when running rocprof-sys-sample with multi-process OpenMP/HIP applications.
The crash was caused by missing libomptarget.so on the runtime loader path or incorrect LD_PRELOAD settings.

Fixes SWDEV-552804

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-10-01 09:51:26 -04:00
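
The commit above attributes the crash to libomptarget.so missing from the runtime loader path. One way such a discovery step might look is sketched below in Python; the function name `find_library` and the idea of scanning a colon-separated search path are assumptions for illustration, not the logic used in the PR.

```python
import os


def find_library(name, search_path):
    """Return the first existing file called `name` in a path list, else None.

    `search_path` uses the platform separator (':' on Linux), in the style of
    LD_LIBRARY_PATH. This is a hypothetical helper, not rocprof-sys code.
    """
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A caller could run `find_library("libomptarget.so", os.environ.get("LD_LIBRARY_PATH", ""))` and report a clear error when the result is `None` instead of letting the application crash later.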
Marius Brehler 026a4e82a3 Rollup of build changes needed for compat with TheRock. (#1086)
* Rollup of build changes needed for compat with TheRock.
* When built for a non-default ROCM location, the HIP headers can't be found by a few targets.
* Uses pkg_check for DRM libraries like ROCR-Runtime does (which avoids accidental fallback to system versions).
* Robust fix for nolink targets
* nolink targets essentially exist for include directories
* all nolink targets are automatically added to rocprofiler-sdk-headers with a $<BUILD_INTERFACE:...> generator expression
* Re-add previously used mechanism to find drm libs

---------

Co-authored-by: Marius Brehler <marius.brehler@amd.com>
Co-authored-by: Stella Laurenzo <stellaraccident@gmail.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-09-30 18:39:10 -04:00
Jin Jung c6d44b47d4 Fix VulkanTest::CreateMappedStorage _WIN64 segfault (#1173)
* Fix VulkanTest::CreateMappedStorage _WIN64 segfault

* Fix Indentation
2025-09-30 14:52:05 -07:00
ywang103-amd eeeaa06159 attach/detach: change workload of unit test to accommodate SDK's current limitation (#1169)
* add double mode of workload dynamic_share with on remove sleeping and
set ROCP_TOOL_ATTACH=1 for running workload

* add comment in dynamic_shared.hip to explain how to use argv

* refactor the attach/detach profiling time in unit tests
2025-09-30 13:16:43 -07:00
abchoudh-amd f45c8d5f6b Bugfixes for test failure (#1106)
- Bugfixes
- Update test instructions using docker
2025-09-30 15:48:41 -04:00
Jason Bonnell 953fd60e9b rocprofiler GHCR Rename (#1112)
- Rename the GHCR packages for rocprofiler Docker images to reduce the number of packages that will be released on the repository
- Changed package name to only include the OS instead of OS+Version - version moved to the tag instead.
- Updated Dockerfile.*.ci files to specify target ROCm version from tarball in name.
2025-09-30 15:15:12 -04:00
Venkateshwar Reddy Kandula c441a87a00 [rocprofiler-sdk][RCCL] RCCL New API changes for RCCL_API_TRACE_VERSION_PATCH = 2 (#985)
- Address build issue with RCCL sync with NCCL commit: ROCm/rccl@08a7be2
- Patch Version Bump-up PR: ROCm/rccl#1916
2025-09-30 12:42:42 -04:00
systems-assistant[bot] d1ee1f0cba Upgrade binutils version from 2.42 to newer 2.44 version (#113)
* Upgrade binutils version from 2.42 to newer 2.44

---------

Co-authored-by: Marjan Antic <marantic@amd.com>
Co-authored-by: Sajina Kandy <sputhala@amd.com>
2025-09-29 14:50:33 -04:00
itrowbri 956daca743 [Docs][rocprofv3]Add Consecutive Kernels Parameter Description to Docs (#1111)
* Add consecutive kernels parameter description

* remove space

* Updated docs and CHANGELOG
2025-09-29 11:21:13 -05:00
Ajay GunaShekar 81775169cc SWDEV-1 - hipcc args: --rocm-path to --hip-path in tests (#998) 2025-09-26 15:35:20 -07:00
Rahul Manocha 538f1c3b74 SWDEV-556205 - fix segfault in hiprtc (#1058)
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-09-26 09:06:37 -07:00
Gerardo Hernandez e45c56c0f8 SWDEV-1 - if hipconfig process invocation by cmake fails, produce a readable error and abort
* SWDEV-1 - if platform auto-detection via hipconfig fails, provide a meaningful error and do not try to parse the output
* SWDEV-1 - if getting HIP_VERSION via hipconfig fails, provide a meaningful error and do not try to parse the output
2025-09-26 14:50:57 +01:00
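
The commit above makes the build stop with a readable error when invoking hipconfig fails, rather than trying to parse its output. The actual change lives in CMake; the Python sketch below only illustrates the same check-before-parse idea, and the helper name `query_tool` is hypothetical.

```python
import subprocess


def query_tool(cmd):
    """Run `cmd` and return its stripped stdout.

    Raises RuntimeError with a readable message if the process cannot be
    started or exits with a non-zero code, so the output is never parsed
    after a failure. Hypothetical helper for illustration only.
    """
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except OSError as exc:
        raise RuntimeError(f"could not run {cmd[0]!r}: {exc}") from exc
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]!r} exited with code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout.strip()
```

The point of the design is that the parsing step only ever sees output from a successful invocation; every failure path ends in a message that names the command and the reason.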
German Andryeyev bb1295bcdf SWDEV-547108 - Fix compilation errors under Windows (#1085)
Also correct AQL print under Windows
2025-09-26 09:42:50 -04:00
Rahul Manocha 2bc561d404 SWDEV-557057 - fix for datatype for hipMemcpy3DBatchAsync (#1114)
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-09-25 13:53:23 -07:00
Sourabh U Betigeri b24f922a24 SWDEV-552620 - Adds a new graph benchmark test for different topologies (#1073) 2025-09-25 09:50:10 -07:00
David Yat Sin cd48105282 rocr: Fix ext-fine-grain flag on host memory (#1067)
Fix for extended-fine-grain flag not set in thunk when
allocating host memory.
2025-09-25 11:10:43 -04:00
Godavarthy Surya, Anusha fb72d7f851 SWDEV-524746 - Part-II Add multi device support for hip graph. Updated kernel arg manager for each device (#813)
- Updated kernel arg manager to support allocating kernel args on multiple devices for single graph.
- Updated AQL path to capture on the device where graph node is added.

Co-authored-by: Anusha GodavarthySurya <Anusha.GodavarthySurya@amd.com>
2025-09-25 20:38:18 +05:30
MachineTom 4a31affb76 Users/taosang/SWDEV-510994 - Refactor atomics header and tests (#902)
* SWDEV-550626 - Refactor atomics header and tests

1. Introduce __HIP_ATOMIC_BACKWARD_COMPAT.
By default we define __HIP_ATOMIC_BACKWARD_COMPAT=1 to
let HIP atomic functions maintain the old assumptions. If
users want to adopt the new behavior, that is, by default
assume no-fine-grained no-remote-memory, then they can
define __HIP_ATOMIC_BACKWARD_COMPAT=0 and get the new
behavior.

2. Use __HIP_ATOMIC_BACKWARD_COMPAT_MEMORY to replace the
original __HIP_FINE_GRAINED_MEMORY in the atomic header,
and apply __HIP_FINE_GRAINED_MEMORY to all
atomicXXX_system() functions to prevent failures on memory
allocated by hipHostMalloc().

3. Replace HIP_TEST_FINE_GRAINED_MEMORY with
HIP_TEST_ATOMIC_BACKWARD_COMPAT_MEMORY in hip-tests.

4. Fix negative test errors.
    Fix managed memory test error on memory order.
    Some other minor changes.
    As a result, all originally disabled tests are enabled.

5. Add more atomics tests in some cases.

6. Reduce test time in each case.
    Reduce iteration number to 1 for tests that cost too much time.

8. Put common code into hip_test_common.hh
2025-09-25 10:58:59 -04:00
systems-assistant[bot] becb4646bd SWDEV-546346 - [catch2][dtest] Tests for hipStreamSetAttribute and hipStreamGetAttribute (#524)
* SWDEV-546346 - [catch2][dtest] Tests for hipStreamSetAttribute and hipStreamGetAttribute

* SWDEV-546346 - Modified Kernel, added info statement

---------

Co-authored-by: Rambabu Swargam <rambabu.swargam@amd.com>
Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
2025-09-25 15:29:26 +05:30
vedithal-amd 5f12d9b789 Fix instructions to build standalone binary (#1116) 2025-09-24 16:31:08 -04:00
David Galiffi 4d959460e1 Add ROCPROFSYS_PATH variable to environment (#1103)
* Add ROCPROFSYS_ROOT to the env for sample

* Add env for causal

* Add env for instrument

* Check for null and address memory leak

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-09-24 13:52:34 -04:00
solaiys 8912930840 [rocm-core] Adding a tool for ROCM Deployment Health Check (#958)
* Adding a tool for ROCM Deployment Health Check

rdhc.py - This simple tool will check for the rocm
installation and its readiness on the current system and its working status.
Check the README file for more info.

Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
2025-09-24 22:43:42 +05:30
Istvan Kiss 83fb0c8c47 SWDEV-541514 - Docs update 2025-09-15 (#993)
Co-authored-by: Julia Jiang <56359287+jujiang-del@users.noreply.github.com>
2025-09-24 09:57:00 -07:00
Dmitrii 0575606e49 chore: [rdc] Add copyright notice (#1098) 2025-09-24 09:07:20 -07:00
vedithal-amd bd7a1de879 Remove rocprofv1/v2 in favour of rocprofiler-sdk (#673)
* Set default rocprof interface as rocprofiler-sdk

* Remove rocprofv1 and rocprofv2 interfaces

* Remove deprecation notice for rocprof v1/v2/v3 interfaces
  * Make rocprofiler-sdk the default interface and make rocprofv3 interface opt-in using ROCPROF=rocprofv3

* Add deprecation notice for rocprofv3
2025-09-24 10:37:01 -04:00
vedithal-amd 7df02745eb [rocprofiler-compute] Update gfx12 counter definitions (#1003)
* Update gfx12 counter definitions

* Add counter definitions for Navi 4

* Made changes per https://github.com/ROCm/rocm-systems/pull/238 and
  double-checked the register specification

* bugfix
2025-09-24 10:32:21 -04:00
vedithal-amd f5505b5989 Use ROCM_PATH for sdk library path (#1097) 2025-09-24 10:31:20 -04:00
marandje a90f28cd5c SWDEV-555178 - Fix and enable Unit_hipMemVmm_Uncached (#1090) 2025-09-24 15:51:22 +02:00
jamessiddeley-amd 05315c5bb2 fixed function argument dir in apply_filters (#1100) 2025-09-24 09:50:58 -04:00
marandje 778f2f05bf SWDEV-555296 - Fix and enable Unit_hipEventIpc (#992) 2025-09-24 15:48:53 +02:00
Kian Cossettini 7eb606a582 Make lock init and destroy cb events instant (#1074)
Removed names changes for `ROCPROFILER_OMPT_ID_lock_init` and `ROCPROFILER_OMPT_ID_lock_destroy`. 
Made both of these callbacks instant.
2025-09-24 07:41:47 -04:00
Ioannis Assiouras c53bdb9643 SWDEV-556866 - Added missing include of rocrctx.hpp in rocurilocator (#1094) 2025-09-24 06:44:02 +01:00
Benjamin Welton 9743ff0c74 Improve error message for invalid extra counters YAML format (#219)
When users provide an incorrectly formatted YAML file to the -E/--extra-counters
option in rocprofv3, they now receive a clear error message showing:

- What went wrong (invalid YAML format)
- The correct rocprofiler-sdk YAML structure with example
- The actual content that failed to parse

This addresses confusion where users might use the legacy ROCProfiler YAML
format instead of the new rocprofiler-sdk schema format.
2025-09-23 22:23:57 -05:00
SaleelK d0e622e978 hip-tests: Fix hipPerfBufferCopySpeed (#946)
* Fix formatting and buffer size
2025-09-23 23:11:20 -04:00
itrowbri abd6029603 [rocprofv3] MultiKernelDispatch ATT support (#774)
* Initial consecutive kernel WIP

* Updated logic after discussion, create context only when needed, change set of captured ids to dispatch_id_t type

* Updated to fix concurrency issues and revert kernel_iterations

* Add captured id in first lock capture

* Updated code to use wlock, added comments, removed some unnecessary atomics

* Cleaned up, need to add test

* Add test to check that generated stats csv file is not empty

* Updated test to check if vector-ops kernels are being used

* Fix phase bug

* Updated for comments

* Flattened ATT logic a bit

* Fix incorrect if-statement

* Fix merge conflict
2025-09-23 20:19:27 -05:00
SaleelK 34b9184686 clr: Fix memory corruption for memset nodes (#1068)
* Detect graph capture and use graph kernelarg memory for FillBuffer pattern
2025-09-23 17:17:33 -07:00
Giovanni Lenzi Baraldi aece11079c SWDEV-553006: Fix slow lookup of debug symbols (#821)
* SWDEV-553006: Fix slow lookup of debug symbols

* Refactor

* Better docs

* Update projects/rocprofiler-sdk/source/include/rocprofiler-sdk/cxx/codeobj/code_printing.hpp
2025-09-24 01:54:53 +02:00
vedithal-amd 4962f237c2 Fix workload path (#1096) 2025-09-23 17:47:31 -04:00
xuchen-amd c3054c00b1 [rocprofiler-compute] Type annotation patch for analysis_db.py (#981) 2025-09-23 17:05:37 -04:00
systems-assistant[bot] 872f0aed0c Live attach/detach and its unit tests (#53) 2025-09-23 13:17:08 -04:00
Jonathan R. Madsen 9278770b89 [rocprofiler-sdk] ROCpd GOTCHA Fix (#720)
* Update GOTCHA submodule

- public API for gotcha_init
- switch repo to ROCm/gotcha

* rocpd interop GOTCHA updates

- fix issues wrapping dlopen/dlsym
2025-09-23 10:45:56 -05:00
xuchen-amd 68cd123b0f [rocprofiler-compute][TUI] improve for cross-platform uses (#1007) 2025-09-23 10:59:29 -04:00
Ioannis Assiouras 97fc90c58f SWDEV-556250 - Added synchronization before validating the result in Unit_hipStreamLegacy* tests (#1062) 2025-09-23 14:14:51 +01:00
Kian Cossettini b2a026f134 Increase timeout for openmp-vv ctests (#1083)
- Set `SAMPLING_TIMEOUT` and `REWRITE_TIMEOUT` to 300 seconds for `openmp-vv` ctests.
2025-09-23 07:45:56 -04:00
systems-assistant[bot] 1e9d8abbf6 [rocpd] Convert to perfetto does not display scratch_memory correctly - SWDEV-542550 (#168)
Add scratch memory to pftrace generated with rocpd

----

Co-authored-by: Marko Crnobrnja <Marko.Crnobrnja@amd.com>
Co-authored-by: Aleksei Tumakaev <atumakae@amd.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
2025-09-23 09:55:30 +02:00