Commit Graph

75382 Commits

Author SHA1 Message Date
Edgar Gabriel 3ce10dc688 fix allreduce tester (#385)
- use the reduce_psync buffers for synchronization in allreduce, not the
  barrier_psync.
- execute a wwg barrier after the allreduce operation. After internal
  discussion it was determined that it is required for correctness.

[ROCm/rocshmem commit: 6f512e92a5]
2026-01-16 08:10:25 -06:00
Edgar Gabriel 6f512e92a5 fix allreduce tester (#385)
- use the reduce_psync buffers for synchronization in allreduce, not the
  barrier_psync.
- execute a wwg barrier after the allreduce operation. After internal
  discussion it was determined that it is required for correctness.
2026-01-16 08:10:25 -06:00
Fábio Mestre 7794ac9ac6 [hip-tests] Fix Float16 accuracy tests (#2178)
Tests were relying on floats for calculating ulp values when validating the output. This is not correct given that the calculations are done using Float16. The fix is to update the test framework to use fp16 ulp instead.
2026-01-16 13:25:11 +00:00
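Illustration for the Float16 entry above: a minimal sketch of why the choice of ulp matters, assuming IEEE 754 binary16 (10 fraction bits, minimum normal exponent -14) and binary32 (23 fraction bits, minimum normal exponent -126). The helper names `ulp_at`, `ulp_fp16`, and `ulp_fp32` are hypothetical and are not taken from the hip-tests framework.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Spacing between adjacent representable values near x for a binary format
// with `fraction_bits` explicit fraction bits and minimum normal exponent
// `min_exp`. Values beyond the format's maximum are not handled here.
inline double ulp_at(double x, int fraction_bits, int min_exp) {
  if (x == 0.0) return std::ldexp(1.0, min_exp - fraction_bits);
  int e = std::ilogb(x);         // floor(log2(|x|))
  if (e < min_exp) e = min_exp;  // subnormal range: spacing stays constant
  return std::ldexp(1.0, e - fraction_bits);
}

inline double ulp_fp16(double x) { return ulp_at(x, 10, -14); }
inline double ulp_fp32(double x) { return ulp_at(x, 23, -126); }

// Hypothetical self-check: near 1.0 an fp16 ulp is 2^13 times larger than
// an fp32 ulp, so a tolerance of N float ulps is far too strict for fp16.
inline void self_check() {
  assert(ulp_fp16(1.0) == std::ldexp(1.0, -10));
  assert(ulp_fp16(1.0) / ulp_fp32(1.0) == 8192.0);
}

}  // namespace sketch
```

A tolerance expressed as a number of ulps therefore has to use the spacing of the type the computation actually ran in.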
spolifroni-amd e0d00500d0 bump rocm-docs-core version (#691) 2026-01-15 20:13:09 -08:00
Kian Cossettini 9f014db6a4 [rocprofiler-systems] Update install path for examples (#2625)
* Update install path for examples to `share/rocprofiler-systems/examples`

----

Co-authored-by: Kian Cossettini <Kian.Cossettini@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2026-01-15 21:51:16 -05:00
German Andryeyev e438308541 rocr/libhskamt: Add wsl build in thunk 2026-01-15 17:29:50 -05:00
Omri Mor 93493e3e46 ionic: fix byteswap functions (added in #345), missed in #368 (#388)
[ROCm/rocshmem commit: 885e41ec62]
2026-01-15 14:19:19 -08:00
Omri Mor 885e41ec62 ionic: fix byteswap functions (added in #345), missed in #368 (#388) 2026-01-15 14:19:19 -08:00
German Andryeyev 5c5b9729ff Add 'projects/rocr-runtime/libhsakmt/include/hsakmt/drm/' from commit '8c47e25315e70f9c8cdd57a5790d3e080938c969'
git-subtree-dir: projects/rocr-runtime/libhsakmt/include/hsakmt/drm
git-subtree-mainline: 5319163521
git-subtree-split: 8c47e25315
2026-01-15 16:06:07 -05:00
Omri Mor 3260759dfd Replace byteswap interface to align with C++23 std::byteswap (#368)
* byteswap<T> returns by value
* replace hand-rolled implementations with Clang __builtin_bswap<N> intrinsics
* new high-level interface endian::to_be, endian::from_be, etc. to indicate conversion direction

[ROCm/rocshmem commit: cf8b72a047]
2026-01-15 13:03:01 -08:00
Omri Mor cf8b72a047 Replace byteswap interface to align with C++23 std::byteswap (#368)
* byteswap<T> returns by value
* replace hand-rolled implementations with Clang __builtin_bswap<N> intrinsics
* new high-level interface endian::to_be, endian::from_be, etc. to indicate conversion direction
2026-01-15 13:03:01 -08:00
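Illustration for the byteswap entries above: a minimal sketch of a by-value `byteswap<T>` built on the `__builtin_bswap16/32/64` intrinsics, plus direction-named helpers. It assumes Clang or GCC and the predefined `__BYTE_ORDER__` macro; the `sketch` namespace and the exact signatures are assumptions, not the rocSHMEM code.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

namespace sketch {

// Returns the byte-reversed value by value, in the style of C++23 std::byteswap.
template <typename T>
constexpr T byteswap(T v) noexcept {
  static_assert(std::is_integral_v<T>, "byteswap requires an integral type");
  if constexpr (sizeof(T) == 1) {
    return v;
  } else if constexpr (sizeof(T) == 2) {
    return static_cast<T>(__builtin_bswap16(static_cast<std::uint16_t>(v)));
  } else if constexpr (sizeof(T) == 4) {
    return static_cast<T>(__builtin_bswap32(static_cast<std::uint32_t>(v)));
  } else {
    static_assert(sizeof(T) == 8, "unsupported integer width");
    return static_cast<T>(__builtin_bswap64(static_cast<std::uint64_t>(v)));
  }
}

namespace endian {

constexpr bool host_is_little = (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__);

// Host order -> big-endian. The name states the conversion direction.
template <typename T>
constexpr T to_be(T v) noexcept { return host_is_little ? byteswap(v) : v; }

// Big-endian -> host order. Same operation, opposite intent.
template <typename T>
constexpr T from_be(T v) noexcept { return host_is_little ? byteswap(v) : v; }

}  // namespace endian

// Hypothetical self-check of the round trip.
inline void self_check() {
  const std::uint32_t host = 0x11223344u;
  assert(endian::from_be(endian::to_be(host)) == host);
}

}  // namespace sketch
```

Returning by value lets the call compose inside expressions, and the `to_be`/`from_be` names document intent even though both perform the same swap.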
German Andryeyev 5319163521 Add 'projects/rocr-runtime/libhsakmt/include/impl/' from commit 'c34ec1e52fcb52da248c00207ebe646197ea9d3e'
git-subtree-dir: projects/rocr-runtime/libhsakmt/include/impl
git-subtree-mainline: 55f7d39fa5
git-subtree-split: c34ec1e52f
2026-01-15 15:54:37 -05:00
German Andryeyev 55f7d39fa5 Add 'projects/rocr-runtime/libhsakmt/src/dxg/' from commit '029690f0a4f62fefefbb67305a066a72e99f8c0b'
git-subtree-dir: projects/rocr-runtime/libhsakmt/src/dxg
git-subtree-mainline: 8760fb4976
git-subtree-split: 029690f0a4
2026-01-15 15:51:21 -05:00
Mark Meserve 8760fb4976 attach: Formalize ROCAttach API (#1653)
* attach: Formalize ROCAttach API

- Make ROCAttach public with public headers
- Change detach to take a PID
  - attach and detach are now reentrant
- Cleanup of states and signal handling in ptrace session
- Fixes mixed up definition of ROCPROF_ATTACH_TOOL_LIBRARY
  - ROCPROF_ATTACH_TOOL_LIBRARY now always means the tool library loaded by the attachment target
  - ROCPROF_ATTACH_LIBRARY refers to the library used to perform attachment
- Add direct call of rocprof-attach
- Fix python library call of rocprof-attach
  - Function now named attach(), changed from main()

* attach: rocprof-compute ROCAttach updates

- Update to new library names
- Correct usage of C lib detach

* attach: add test for rocattach

- Disable ASan, TSan, and UBSan for the new parallel-attach test
- Lower log level for LSan tests, existing behavior from other tests

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2026-01-15 14:32:14 -06:00
dsclear-amd 2482bff0b7 Excludes (more) docs-only changes from .azuredevops/rocm_ci_caller.yml. (#2615)
Motivation

We wish to avoid triggering full Jenkins runs for docs-only PRs, as this takes up testing resources and slows development time. rocm_ci_caller.yml already excludes some docs-only changes, but this can be improved to exclude them along more paths.
Technical Details

The checks that rocm_ci_caller.yml uses to determine whether a changed file in a PR warrants a Jenkins run have been extended to exclude more paths and more file suffixes.
JIRA ID

AIROCDOC-78, AIROCDOC-424
Test Plan

    1. Created a test branch users/dsclear/shorten_workflows_test_root with the changes in this PR, branched from develop.
    2. Branched users/dsclear/shorten_workflows_test_bin_3 and users/dsclear/shorten_workflows_test_text_3 from users/dsclear/shorten_workflows_test_root.
    3. Modified users/dsclear/shorten_workflows_test_bin_3 to add two .h files, and submitted a PR into users/dsclear/shorten_workflows_test_root (Test PR, do not merge. Test PR to test Jenkins CI/CD modifications. #2613).
    4. Modified users/dsclear/shorten_workflows_test_text_3 to add a new .txt file, and submitted a PR into users/dsclear/shorten_workflows_test_root (Test PR, do not merge. Test PR to test Jenkins CI/CD modifications (docs only). #2614).

Test Result

The test PR in step 3 caused rocm_ci_caller.yml to attempt to trigger Jenkins, as this is a 'non-docs' change.
The test PR in step 4 had the attempt to trigger Jenkins skipped, as this is a 'docs-only' change.
2026-01-15 14:54:20 -05:00
Mario Limonciello 838b3dccf1 Adjust amdgpu version output for amd-smi (#2563)
* Fix the amdgpu version string comparison

The intention was to avoid showing the string when it carries no
information.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

* Display the kernel version in amd-smi output

This is an interesting debugging point, especially in the case of
not having a DKMS package installed.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Moving os_kernel_version to static --driver

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

---------

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2026-01-15 11:11:58 -08:00
yugang-amd fe60c39256 Bump rocm-docs-core to 1.31.2 (#2627)
* Update requirements.in

* Update requirements.txt
2026-01-15 13:18:30 -05:00
yugang-amd bcd9119dbc Bump rocm-docs-core to 1.31.2 (#387)
[ROCm/rocshmem commit: 491739c9b4]
2026-01-15 13:17:51 -05:00
yugang-amd 491739c9b4 Bump rocm-docs-core to 1.31.2 (#387) 2026-01-15 13:17:51 -05:00
Bindhiya Kanangot Balakrishnan aa16cca39a [SWDEV-549108] Increase gpu_metrics API execution test threshold (#2617)
Increased threshold from 2100 µs to 3100 µs to accommodate
gpu_metric read time variation across Navi systems.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2026-01-15 11:20:17 -06:00
Matthias Gehre 1883f736ad Fix double-free crash when librocm_smi64.so and libamd_smi.so are loaded together (#2531)
Problem:
When TheRock-based PyTorch package is installed along with amdsmi, importing
torch causes a double-free crash on exit (GitHub issue ROCm/TheRock#2269).

Root cause:
Both librocm_smi64.so and libamd_smi.so export the C++ static member
'amd::smi::Device::devInfoTypesStrings'. When libraries are loaded with
RTLD_GLOBAL, the dynamic linker resolves libamd_smi.so's reference to this
symbol to the one in librocm_smi64.so. This causes:
1. librocm_smi64.so registers its destructor for devInfoTypesStrings
2. libamd_smi.so also registers a destructor, but for the SAME address
3. On exit, both destructors run on the same object -> double-free

Fix:
Change devInfoTypesStrings from a class static member to a file-local static
variable. This ensures the symbol has internal linkage and is not exported,
preventing the symbol collision.

Changes:
- rocm_smi_device.h: Remove static member declaration
- rocm_smi_device.cc: Change from 'Device::devInfoTypesStrings' to file-local
  'static const std::map<...> devInfoTypesStrings'
- rocm_smi.cc: Remove the global alias to the (now removed) class member

Tested on gfx1151. `import torch` crashed on exit before the fix, and doesn't crash after the fix.
2026-01-15 08:43:47 -08:00
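Illustration for the double-free entry above: a minimal sketch of the "file-local static" pattern the fix describes. The enum, the map contents, and the accessor `dev_info_name` are hypothetical stand-ins, not the rocm_smi sources; only the linkage change mirrors the commit message.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {

enum class DevInfoType { kPerfLevel, kOverdriveLevel };  // hypothetical values

// Before (pattern described in the commit, shown as a comment only):
//   class Device { static const std::map<...> devInfoTypesStrings; };
// A class static data member has external linkage, so two shared libraries
// that both define it can be bound to a single object under RTLD_GLOBAL,
// and each library then registers a destructor for that same object.

// After: a file-local static has internal linkage and is not exported, so
// each library keeps, and destroys, only its own copy.
static const std::map<DevInfoType, std::string> devInfoTypesStrings = {
    {DevInfoType::kPerfLevel, "perf_level"},            // hypothetical strings
    {DevInfoType::kOverdriveLevel, "overdrive_level"},
};

// Code outside this file reaches the table only through a function.
std::string dev_info_name(DevInfoType type) {
  const auto it = devInfoTypesStrings.find(type);
  return it != devInfoTypesStrings.end() ? it->second : "unknown";
}

// Hypothetical self-check.
inline void self_check() {
  assert(dev_info_name(DevInfoType::kPerfLevel) == "perf_level");
}

}  // namespace sketch
```

On Linux, running `nm -D --defined-only` on the built library is one way to confirm that the symbol no longer appears among the exports.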
Filip Jankovic 29cd25df66 Add hipDeviceAttributeExpertSchedMode (#2435)
* Add hipDeviceAttributeExpertSchedMode

---------

Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>

* Update hipDeviceAttributeExpertSchedMode unit test

* Move check to ROCr from thunk interface

* Revert unrelated whitespace changes

* Revert version bump

---------

Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>
2026-01-15 08:41:39 -08:00
Milan Radosavljevic 940488ed58 [rocprofiler-systems] Fix naming and description of process_page category (#2606) 2026-01-15 16:10:50 +01:00
Jeff Jiang b2752f68cf * rocDecode: Added two typo fixes. (#690) 2026-01-14 21:49:53 -05:00
Milan Radosavljevic 318d13870f [rocprofiler-systems] Update logging to use spdlog library (#2428)
## Motivation

- Structured logging with proper log levels (TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL)
- Better performance through compile-time formatting
- Consistent formatting using fmt library
- Runtime log level control via arguments and environment variables
- Easier maintenance and debugging capabilities

## Technical Details

- Added spdlog as a submodule and integrated it into CMake build system
- Created new `rocprofiler-systems-logger` library wrapping spdlog functionality
- Replaced custom logging macros (`ROCPROFSYS_VERBOSE`, `ROCPROFSYS_DEBUG`, `ROCPROFSYS_FATAL`, `ROCPROFSYS_REQUIRE`, `ROCPROFSYS_CI_THROW`, etc.) with spdlog equivalents (`LOG_DEBUG`, `LOG_WARNING`, `LOG_CRITICAL`, etc.)
- Implemented log level control through command-line arguments and environment variables
- Converted assertion macros to proper error handling with exceptions and std::abort()
2026-01-14 15:27:51 -05:00
Joseph Narlo 499127c0b9 [SWDEV-553434] No direct way to get the BASEBOARD temperature info (#2502)
* [SWDEV-553434] No direct way to get the BASEBOARD temperature info. Need to iterate all gpus

Signed-off-by: amd-josnarlo <josnarlo.amd.com>

---------

Signed-off-by: amd-josnarlo <josnarlo.amd.com>
Co-authored-by: amd-josnarlo <josnarlo.amd.com>
2026-01-14 13:52:58 -06:00
David Yat Sin a3b445118d SWDEV-519413 - Ignore ROCr shutdown events (#1616)
ROCr now reports a shutdown event, but this is not a fatal error. Ignore
this event.
2026-01-14 11:28:03 -08:00
xuchen-amd 71b9ea6ba0 [rocprofiler-compute] improve config management system (#2359) 2026-01-14 13:20:27 -05:00
Luca Bruni d7ff927690 [clr] Fix device printf pointer advancement issue with string format specifiers (#1313) 2026-01-14 13:05:25 -05:00
habajpai-amd bad8d915c3 Fix: Add visibility hidden to devInfoTypesStrings to prevent symbol interposition (#2575) 2026-01-14 09:48:49 -08:00
Gopesh Bhardwaj b18db05091 [rocprofiler-sdk] Fixing docs build (#2608) 2026-01-14 10:13:17 -05:00
pghoshamd d2a1fc945e SWDEV-569319 Fix dangling reference warning (#2509)
* SWDEV-569319 Fix dangling reference warning

* fix nullptr warning

* use emplace

* return regular pointer
2026-01-13 15:39:03 -06:00
hongkzha-amd 9dc2488b6b rocrtst: Add test cases for interrupt disabled mode (#2385)
Add explicit test cases to verify ROCr functionality with interrupts
disabled (HSA_ENABLE_INTERRUPT=0). This ensures compatibility with
virtio, dtif, and WSL configurations which require interrupt-disabled
mode.

Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
2026-01-13 12:10:11 -06:00
hongkzha-amd b3c4e94e70 rocr: Improve memory protection and WSL compatibility (#2274)
* rocr: Add ProtectMemory API and use it in RemoveAccess
Replace munmap + mmap with mprotect when removing memory access.
This improves performance by 5-10x, ensures atomicity (no race
condition window), and prepares for WSL/DXG compatibility fixes.

Suggested-by: David Yat Sin <David.YatSin@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>

* rocr: Skip CPU mapping operations on WSL
On WSL, CPU cannot access GPU VRAM due to platform restrictions.
CPU access would fault-in system RAM instead, causing data corruption
and memory leaks. Return HSA_STATUS_ERROR to fail fast rather than
silently creating broken mappings. GPU-to-GPU mappings remain functional.

Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>

* rocr: reduce ifdef linux
v2: Fix IsDXG check logic

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>

---------
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com>
2026-01-13 12:08:20 -06:00
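Illustration for the memory-protection entry above: a minimal sketch of removing access with `mprotect` rather than unmapping and remapping. It assumes a POSIX system with anonymous mappings; the function names are hypothetical and this is not the ROCr `ProtectMemory` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

namespace sketch {

// Maps `bytes` of private, zero-filled, read/write memory; nullptr on failure.
inline void* map_pages(std::size_t bytes) {
  void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Removes all access in a single call. The address range stays reserved, so
// no other thread can map something into it in between, which is the window
// that a munmap followed by mmap leaves open.
inline bool remove_access(void* addr, std::size_t bytes) {
  return mprotect(addr, bytes, PROT_NONE) == 0;
}

inline bool restore_access(void* addr, std::size_t bytes) {
  return mprotect(addr, bytes, PROT_READ | PROT_WRITE) == 0;
}

// Hypothetical self-check: contents survive a protect/unprotect cycle.
inline void self_check() {
  const std::size_t n = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  char* p = static_cast<char*>(map_pages(n));
  assert(p != nullptr);
  p[0] = 42;
  assert(remove_access(p, n));
  assert(restore_access(p, n));
  assert(p[0] == 42);
  assert(munmap(p, n) == 0);
}

}  // namespace sketch
```

`addr` must be page-aligned for `mprotect`; anything returned by `mmap` already satisfies that.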
Geo Min dfdb64572c [TheRock CI] Adding working single node tests (#2142)
* Adding working single node tests

* Revert to old docker sha

* adding back no perf tests

---------

Co-authored-by: Aravind Ravikumar <arravikum@amd.com>

[ROCm/rccl commit: 4b295c9893]
2026-01-13 08:35:58 -08:00
Geo Min 4b295c9893 [TheRock CI] Adding working single node tests (#2142)
* Adding working single node tests

* Revert to old docker sha

* adding back no perf tests

---------

Co-authored-by: Aravind Ravikumar <arravikum@amd.com>
2026-01-13 08:35:58 -08:00
Jan Stephan 2e8c863341 Use doxysphinx includes for enums, macros and global types (#2273)
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2026-01-13 17:33:49 +01:00
Jan Stephan 88584f3c0d Fix wrong call to executable (#2290)
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2026-01-13 17:31:10 +01:00
Adam Pryor 9425a2f687 [SWDEV-569427] Fix segfault calling bad page info (#2547) 2026-01-13 09:44:49 -06:00
AidanBeltonS 607d66e87c Add messages to static asserts to prevent warnings (#1011) 2026-01-13 14:02:36 +00:00
Fábio Mestre 09a01ee11c Replace usages of __ockl_clz with builtins (#2234) 2026-01-13 11:15:46 +01:00
Fábio Mestre 61325db1c8 Fix AMD_LOG_LEVEL_SIZE env variable (#2463)
AMD_LOG_LEVEL_SIZE is being used in a global variable.
This always uses the default value of 2048 because the
HIP runtime doesn't have the opportunity to load
environment variables at the point where global variables
are initialized.

The solution is to use AMD_LOG_LEVEL_SIZE inside
truncate_log_file() function.
2026-01-13 09:57:49 +00:00
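Illustration for the AMD_LOG_LEVEL_SIZE entry above: a minimal sketch of why a value captured in a global initializer can miss a setting that is loaded later, and why reading it at the point of use avoids that. The `Flags` struct, `load_flags_from_env`, and the variable name `SKETCH_LOG_SIZE` are hypothetical; this models the ordering problem only and is not the HIP runtime code.

```cpp
#include <cassert>
#include <cstdlib>

namespace sketch {

struct Flags {
  long log_size = 2048;  // default used until the environment is loaded
};

inline Flags& flags() {
  static Flags instance;
  return instance;
}

// Stands in for the runtime's later "load environment variables" step.
inline void load_flags_from_env() {
  const char* raw = std::getenv("SKETCH_LOG_SIZE");  // hypothetical variable
  if (raw == nullptr) return;
  char* end = nullptr;
  const long value = std::strtol(raw, &end, 10);
  if (end != raw && *end == '\0' && value > 0) flags().log_size = value;
}

// Fragile: initialized during static initialization, before
// load_flags_from_env() has had a chance to run, so it holds the default.
const long g_log_size_snapshot = flags().log_size;

inline bool exceeds_snapshot(long size) { return size > g_log_size_snapshot; }

// Robust: reads the flag when the decision is made.
inline bool exceeds_limit(long size) { return size > flags().log_size; }

// Hypothetical self-check: the snapshot always holds the default.
inline void self_check() { assert(g_log_size_snapshot == 2048); }

}  // namespace sketch
```

With `SKETCH_LOG_SIZE=4096` loaded after startup, `exceeds_limit(3000)` is false while `exceeds_snapshot(3000)` still compares against 2048.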
Jan Stephan 35a5274b84 CSS: Don't reference images that aren't generated by Doxygen (#2295)
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2026-01-13 10:11:57 +01:00
David Galiffi 2daec0e4d0 Revert 63713f01e0 (#2585)
## Motivation

Remove Fortran example due to Palamida scan violation.

## Technical Details

Revert 63713f01e0.
New test to be added later.

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2026-01-12 23:44:26 -05:00
randyh62 21b6021848 Restore Lane masks bit shift content (#2411)
Co-authored-by: Christophe Paquot <35546540+chrispaquot@users.noreply.github.com>
2026-01-12 19:01:19 -05:00
dsclear-amd d5f490fa2f Sets heavy GitHub CI workflows to not trigger on text documentation-only changes. (#2417)
Sets heavy GitHub CI workflows to not trigger on docs-only changes.

Specifically, sets azure-ci-dispatcher.yml and therock-ci.yml, as well as many rocprofiler workflows, to not trigger when the change consists entirely of docs-only files.
2026-01-12 18:31:30 -05:00
Jason Bonnell 95a31b10cd Fix aqlprofile-continuous_integration.yml workflow (#2582)
* Fix typo in matrix definition for aqlprofile-continuous_integration.yml

* Update ROCM_VERSION to 7.1.1

* Minor changes to core-rpm step

* Add working-directory to test steps

* Revert changes

* Add set -v to rpm test step

* Remove Python venv line from rpm test step
2026-01-12 15:53:04 -05:00
Jin Jung d4758bc29e SWDEV-570501 - Add Windows support for hipGraphicsGLRegisterBuffer (#2323) 2026-01-12 13:10:46 -06:00
SaleelK e6e0378acd clr: Always query new engine for intergpu copies (#2559) 2026-01-12 11:01:02 -08:00
Mythreya Kuricheti 36d9d33d90 Users/mkuriche/rocprofiler sdk fmt build fix memory header (#2537)
* [rocprofiler-sdk] Fix fmt::join build errors

- remedy use of fmt::join without including <fmt/ranges.h>

* include memory header

* Disable FMT build for SDK CI

* Add -DROCPROFILER_BUILD_FMT=OFF to sanitizer steps

* Add temporary workaround for rccl.h issue

* Add ROCPROFILER_INTERNAL_RCCL_API_TRACE to SDK CI builds

* disable clang-tidy for vendored includes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: jbonnell-amd <jason.bonnell@amd.com>
2026-01-12 12:59:47 -05:00