76333 Commits

Author SHA1 Message Date
Yiltan 6290db319c [GDA/BNXT] Implemented CQE Collapsing (#279) 2025-10-23 14:53:44 -04:00
Pengda Xie a4bbd73dc6 SWDEV-556684 - Remove HSAIL support (#1183) 2025-10-23 11:21:49 -07:00
Kian Cossettini db949445c3 [rocprofiler-systems] Overhaul OpenMP-VV Test compilation (#1389)
* Reworked Compilation

* Formatting

* Change compile log name

* Optimize Code

* Remove gfx940 and gfx941
2025-10-23 13:58:11 -04:00
Aurelien Bouteiller bdb30e2984 Tests/syncall (#291)
* SyncAll test case would run Sync

* Despecialized name for argument reader

* Rename sync-test to team-sync-test as it uses teams

* Another stab at probing NUM_GPUS

[ROCm/rocshmem commit: 054bc33dc4]
2025-10-23 13:40:41 -04:00
Aurelien Bouteiller 054bc33dc4 Tests/syncall (#291)
* SyncAll test case would run Sync

* Despecialized name for argument reader

* Rename sync-test to team-sync-test as it uses teams

* Another stab at probing NUM_GPUS
2025-10-23 13:40:41 -04:00
Venkateshwar Reddy Kandula 8c89ed8ab1 [rocprofiler-sdk][CI] Use rock infra for rocprofiler-sdk build docs jobs (#1518)
* Initial changes to move build docs job to rock infra

* misc. fix

* clean up code.
2025-10-23 11:17:13 -05:00
Kapil S. Pawar be249ae356 Fix segmentation fault related to ext-profiler plugin (#1986)
[ROCm/rccl commit: 912d53caba]
2025-10-23 09:26:35 -05:00
Kapil S. Pawar 912d53caba Fix segmentation fault related to ext-profiler plugin (#1986) 2025-10-23 09:26:35 -05:00
Venkateshwar Reddy Kandula 40f9f15ece use rhel 8.10 amdgpu kernel driver for rhel 8.8 (#1490) 2025-10-23 09:00:10 -05:00
Charis Poag Jones 933fdc3c7e [SWDEV-558141] Fix rocm-smi --setsclk [0...n] & other clocks in partitioned configurations (#1493)
Changes:
  - Fix `rocm-smi --setsclk [0 .. n]` for multiple devices to continue on fail when
    in a partitioned configuration (ex. in DPX/QPX/CPX/etc).
  - Partitioned configurations or devices which do not support changing
    sclk/mclk/pcie clks will now continue on failure. Will report a "not
    supported" or other (rocm-smi) error codes for these devices.
  - Updates impact other clock settings such as `--setmclk` and
    `--setpcie`.

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-10-23 08:56:41 -05:00
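The continue-on-failure behavior described in the rocm-smi commit above can be illustrated with a minimal sketch. This is not rocm-smi's code: `set_clock_all`, `ClockNotSupported`, and the `set_clock` callback are hypothetical names used only to show the pattern of recording a per-device status instead of aborting on the first device that rejects the request.

```python
from typing import Callable, Dict, Iterable


class ClockNotSupported(Exception):
    """Hypothetical error for a device that cannot change the requested clock."""


def set_clock_all(devices: Iterable[int],
                  set_clock: Callable[[int], None]) -> Dict[int, str]:
    """Apply set_clock to every device, continuing past failures.

    Returns a per-device status string rather than stopping at the first error.
    """
    results: Dict[int, str] = {}
    for dev in devices:
        try:
            set_clock(dev)
            results[dev] = "ok"
        except ClockNotSupported:
            # e.g. a partition that does not expose clock controls
            results[dev] = "not supported"
        except Exception as exc:
            # any other error is reported, and the loop moves on
            results[dev] = f"error: {exc}"
    return results
```

A caller would pass the list of device indices and a function that sets the clock on one device, then print the returned status for each device.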
Edgar Gabriel 3eadf8cc62 fix Win_flush prototype in function table (#289)
the bug was exposed when trying to compile a backend with HDP flush
support.

[ROCm/rocshmem commit: e2c6bb8bd4]
2025-10-23 08:43:41 -05:00
Edgar Gabriel e2c6bb8bd4 fix Win_flush prototype in function table (#289)
the bug was exposed when trying to compile a backend with HDP flush
support.
2025-10-23 08:43:41 -05:00
vedithal-amd 2a37cbf2ca Bump VERSION and add CHANGELOG for ROCm 7.1.1 release (#1447) 2025-10-23 09:34:18 -04:00
ywang103-amd ee805d1014 remove option of json as rocprofv3's intermediate file to avoid test failures of outdated code (#1474) 2025-10-23 09:33:54 -04:00
Gopesh Bhardwaj 30bcf123a8 build fix for linker error (#1376) 2025-10-23 17:35:51 +05:30
Ioannis Assiouras 6d6b136374 SWDEV-559166 - Fix data races in GetSubmissionBatch, CaptureAndSet and SetQueueStatus (#1441) 2025-10-23 12:18:31 +01:00
amd-srinivas1 e99bd0c783 SWDEV-546345-[catch2][dtest]-Added tests for hipMemcpy3DPeer Apis(Memory Management) (#897)
* SWDEV-546345-Added tests for hipMemcpy3DPeer apis

* SWDEV-546345-Removed nested SECTIONS.

* SWDEV-546345-Optimized the code.

* SWDEV-546345-Addressed Review comments

* SWDEV-546345-Added image check support

---------

Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
2025-10-23 14:40:13 +05:30
ywang103-amd 9b562c0e58 pc sampling multi kernel (#1382)
* initial commit

* add csv support extraction for non kernel selection mode

* add --kernel-trace for rocprofiler-sdk mode

* make non kernel selective mode runnable

* make kernel selection work with -k

* remove upper case of arg hint

* update documentation

* display same kernel name at only one place and merge instruction id with same obj id as well as offset

* remove kernel name's display for single kernel selection

* change log added

---------

Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2025-10-23 01:26:08 -04:00
Gerardo Hernandez a128884078 SWDEV-541351 - query engine clock frequency via amdsmi to avoid clock tests being flaky (#1186) 2025-10-23 06:09:51 +01:00
Aryan Salmanpour dc2f000c69 CMake cleanup (#197)
[ROCm/rocjpeg commit: d2ae241911]
2025-10-22 20:34:33 -07:00
Aryan Salmanpour d2ae241911 CMake cleanup (#197) 2025-10-22 20:34:33 -07:00
Jimbo 37f2be9140 SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister (#962)
* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister

* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister

* SWDEV-554174 Added hipHostRegisterIoMemory flag in test cases

* SWDEV-554174 : Did formatting corrections

* SWDEV-554608 - set HSA_AMD_MEMORY_POOL_UNCACHED_FLAG if IoMemory is set

* SWDEV-554608 - set HSA_AMD_MEMORY_POOL_UNCACHED_FLAG if IoMemory is set

* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister

---------

Co-authored-by: Anavena Venkatesh <Anavena.Venkatesh@amd.com>
Co-authored-by: Rambabu Swargam <rambabu.swargam@amd.com>
2025-10-22 20:25:59 -04:00
Kanangot Balakrishnan, Bindhiya 09a97f02ed [SWDEV-542718] Correct socket_affinity (#760)
* [SWDEV-542718] Correct socket_affinity

Updated Socket affinity to show bitmask and expanded cpu list.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Update per-device local_cpulist for socket_affinity

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Added amdsmi_get_cpu_affinity_from_local_cpulist API.
Updated the wrapper.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Revert "Added amdsmi_get_cpu_affinity_from_local_cpulist API."

This reverts commit 9a2ef934b1787f8aa09d3e4efe02f897b4295215.

* Moved the changes to C API.
In case of SOCKET_SCOPE, use local_cpulist first.
If it is unavailable or not readable, fallback to
numa.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Addressed review comments

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-10-22 16:20:41 -05:00
Kanangot Balakrishnan, Bindhiya 3924171d74 [SWDEV-542718] Correct socket_affinity (#760)
* [SWDEV-542718] Correct socket_affinity

Updated Socket affinity to show bitmask and expanded cpu list.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Update per-device local_cpulist for socket_affinity

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Added amdsmi_get_cpu_affinity_from_local_cpulist API.
Updated the wrapper.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Revert "Added amdsmi_get_cpu_affinity_from_local_cpulist API."

This reverts commit 9a2ef934b1787f8aa09d3e4efe02f897b4295215.

* Moved the changes to C API.
In case of SOCKET_SCOPE, use local_cpulist first.
If it is unavailable or not readable, fallback to
numa.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Addressed review comments

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 09a97f02ed]
2025-10-22 16:20:41 -05:00
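The socket_affinity commits above mention reporting both a bitmask and an expanded CPU list. A minimal sketch of that conversion follows, assuming the input uses the Linux sysfs range format (for example `0-3,8`). The function names are hypothetical and are not part of the amdsmi API.

```python
from __future__ import annotations


def parse_cpulist(cpulist: str) -> list[int]:
    """Expand a sysfs-style CPU list such as "0-3,8" into sorted CPU indices."""
    cpus: set[int] = set()
    for part in cpulist.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpus_to_bitmask(cpus: list[int]) -> int:
    """Build an integer bitmask with bit N set for each CPU index N."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask
```

For instance, `hex(cpus_to_bitmask(parse_cpulist("0-3,8")))` gives the mask form of the same set of CPUs that `parse_cpulist` returns as a list.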
Mark Meserve 79076c4ad5 attach: Cleanup docs from initial commit (#1302)
- Remove unimplemented older API functions
- Remove mentions of reattach API
- Remove details on implementing a process attachment library
  - This will return later as a theory of operation
2025-10-22 16:16:49 -05:00
Edgar Gabriel d37af80d7e add support for GPUs using wavefront size of 32 (#285)
* add gfx1100 support

Add support for Radeon 7900 GPUs (RX and PRO), and 7800 PRO.

I was contemplating to add gfx1101 and gfx1102 GPUs as well, but those are the lower end models that are more unlikely to be used for compute intensive jobs. In addition, I do not have access to them to test the support.

* update WF_SIZe for different options

Radeon systems use a WarpSize of 32, unlike current Instinct systems,
which use a warp size of 64. For the device side, a gfx specific ifdef
is sufficient. For the host side, we need to query the device
properties.

* adjust functional tests to wf_size of 32

* update unit tests to handle wf_size of 32

* address reviewer comments

[ROCm/rocshmem commit: d0c2845031]
2025-10-22 16:04:58 -05:00
Edgar Gabriel d0c2845031 add support for GPUs using wavefront size of 32 (#285)
* add gfx1100 support

Add support for Radeon 7900 GPUs (RX and PRO), and 7800 PRO.

I was contemplating to add gfx1101 and gfx1102 GPUs as well, but those are the lower end models that are more unlikely to be used for compute intensive jobs. In addition, I do not have access to them to test the support.

* update WF_SIZe for different options

Radeon systems use a WarpSize of 32, unlike current Instinct systems,
which use a warp size of 64. For the device side, a gfx specific ifdef
is sufficient. For the host side, we need to query the device
properties.

* adjust functional tests to wf_size of 32

* update unit tests to handle wf_size of 32

* address reviewer comments
2025-10-22 16:04:58 -05:00
Todd tiantuo Li bc7898c687 SWDEV-556751 - skip Unit_hipEventRecord (#1239) 2025-10-22 13:49:22 -07:00
Joseph Macaranas 4ba8c94aab [External CI] Add references to rocm-systems super repo (#1935)
- In order to trigger downstream jobs to verify projects that consume rccl, references to those repos are required.

[ROCm/rccl commit: c2e71e83d1]
2025-10-22 16:07:05 -04:00
Joseph Macaranas c2e71e83d1 [External CI] Add references to rocm-systems super repo (#1935)
- In order to trigger downstream jobs to verify projects that consume rccl, references to those repos are required.
2025-10-22 16:07:05 -04:00
xuchen-amd 578589d363 [rocprofiler-compute] metrics generator (#1199) 2025-10-22 15:17:43 -04:00
Avinash Kethineedi b771a26916 Add ROCSHMEM_CTX_INVALID for invalid context handling (#287)
* Add `ROCSHMEM_CTX_INVALID` for invalid context handling
  - Define `ROCSHMEM_CTX_INVALID` as {nullptr, nullptr}
  - Add == and != operators to rocshmem_ctx_t
  - Use `ROCSHMEM_CTX_INVALID` on failed context creation
  - Skip ctx destroy if context is invalid

* Update docs for context create and destroy APIs usage and behavior

[ROCm/rocshmem commit: 955c22aeed]
2025-10-22 12:00:56 -05:00
Avinash Kethineedi 955c22aeed Add ROCSHMEM_CTX_INVALID for invalid context handling (#287)
* Add `ROCSHMEM_CTX_INVALID` for invalid context handling
  - Define `ROCSHMEM_CTX_INVALID` as {nullptr, nullptr}
  - Add == and != operators to rocshmem_ctx_t
  - Use `ROCSHMEM_CTX_INVALID` on failed context creation
  - Skip ctx destroy if context is invalid

* Update docs for context create and destroy APIs usage and behavior
2025-10-22 12:00:56 -05:00
David Galiffi e453705d9b Check if test exists before adding validation (#1478)
* Check if test exists before adding validation

* Adjust validation parameters for rocpd_string

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-10-22 12:19:38 -04:00
Swati Rawat 3808f7ea76 rocpd documentation improvements (#1498) 2025-10-22 11:59:22 -04:00
pcritchl-amd 63a991a8b9 SWDEV-543498 - Some compute Ubertrace profiles are missing queue timing data (#1146) 2025-10-22 08:56:33 -07:00
Jatin Chaudhary ee93c9ddab SWDEV-545100 - add two SPIRV targets (#1037) 2025-10-22 11:39:45 -04:00
solaiys eab103d4ed [RDHC] Update rocm-core package scripts to include rdhc script (#1482)
* Add rdhc script in to rocm-core package
  * Create the rdhc symlink within the package itself.
  * rdhc tool support is not enabled for windows.

  * [RDHC] Check if the required pip pkgs are present and warn.
     rdhc checks whether the required pip packages are present;
     if not, it warns the user and exits gracefully.

Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
2025-10-22 19:54:40 +05:30
Aravind Ravikumar a7a1647926 Adding reservation time for salloc in CI (#1992)
Co-authored-by: arravikum <arravikum@amd.com>

[ROCm/rccl commit: 506c2e9878]
2025-10-22 10:00:01 -04:00
Aravind Ravikumar 506c2e9878 Adding reservation time for salloc in CI (#1992)
Co-authored-by: arravikum <arravikum@amd.com>
2025-10-22 10:00:01 -04:00
marandje aa4dee57b5 SWDEV-555295 - Fix and enable Unit_hipFreeAsync_Negative_Parameters (#991) 2025-10-22 15:57:54 +02:00
Vladimir Indic 920b33c0b9 PCS: Temporarily Masking Trap Handler Latency (#1109)
- Temporarily, masking out the trap handler latency, by detecting
untagged error samples.
- Disabling checks for the number of invalid samples.
2025-10-22 14:18:08 +02:00
ehsanhosseinzadehKhaligh f5b45c549d Updating npkit_trace_generator.py to check npkit directory (#1891)
* create dir regardless of default or user-provided path if it doesn't exist
* Fix npkit_dump_dir on npkit_trace_generator.py

---------

Co-authored-by: BertanDogancay <bertan.dogancay@gmail.com>

[ROCm/rccl commit: aec4f0a659]
2025-10-22 02:51:16 -05:00
ehsanhosseinzadehKhaligh aec4f0a659 Updating npkit_trace_generator.py to check npkit directory (#1891)
* create dir regardless of default or user-provided path if it doesn't exist
* Fix npkit_dump_dir on npkit_trace_generator.py

---------

Co-authored-by: BertanDogancay <bertan.dogancay@gmail.com>
2025-10-22 02:51:16 -05:00
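The npkit_trace_generator.py commits above describe creating the dump directory whether the path is the default or user-provided. A minimal sketch of that idea follows. The helper name `ensure_dump_dir` and the default path `npkit_dump` are assumptions for illustration, not the script's actual names.

```python
import os
from typing import Optional


def ensure_dump_dir(user_path: Optional[str] = None,
                    default_path: str = "npkit_dump") -> str:
    """Return the dump directory, creating it if it does not exist.

    The same creation step runs for a user-provided path and for the default.
    """
    path = user_path or default_path
    # exist_ok=True makes the call safe when the directory is already there
    os.makedirs(path, exist_ok=True)
    return path
```

Calling the helper twice with the same path is harmless, because `os.makedirs` with `exist_ok=True` does not raise when the directory already exists.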
Ammar ELWazir 9cf8a5e0b5 [ROCProfiler-SDK] Remove Python library dependency from Python bindings (#1451) 2025-10-21 22:09:36 -05:00
arvindcheru 285061f05b Enhance ROCM-Core for Windows (#1467)
* Enhance Code for support for Windows cpp build
* Updated ROCM-Core README build steps
* File copyright Headers Updated
2025-10-21 23:04:23 -04:00
Venkateshwar Reddy Kandula 4f590499c6 [rocprofiler-sdk] Fix rocm-release compatibility latest (#1479)
* Update rocprofiler-sdk-rocm_release_compatibility.yml

* apply Copilot

* addr comments

* remove 6.2 requirements. 6.2 now can use normal Install requirements step
2025-10-21 21:45:18 -05:00
Kian Cossettini f0a41b65f7 [rocprofiler-systems] Add Fortran main detection to rocprof-sys-instrument to avoid instrumenting around C "main" wrapper (#1322)
* Add check for Fortran main

* Comment change

* MAIN__ -> Fortran main

* Cray Compiler comment change

* Add changelog and troubleshooting comments

* Improve CHANGELOG.md message

* Change CHANGELOG msg to be in 7.2.0

* Apply review change #1

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Apply review change #2

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Apply review change #3

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

---------

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-10-21 16:41:29 -04:00
Afzal Patel 1efa14da5b add roctracer and rocm-core include directories (#1970)
[ROCm/rccl commit: 724680f87c]
2025-10-21 13:53:57 -04:00
Afzal Patel 724680f87c add roctracer and rocm-core include directories (#1970) 2025-10-21 13:53:57 -04:00