rocm-systems

Author	SHA1	Message	Date
akolliasAMD	e7269cb925	Tests package (#384) * added packaging for the tests and for the driver.sh * making .sh files into programs so they keep permissions	2026-01-16 09:10:36 -07:00
German Andryeyev	07a6b45535	rocr: restore the original line	2026-01-16 11:05:24 -05:00
Aurelien Bouteiller	ede2adfe49	new tester: put to all pes from all lanes concurrently (#112 ) * Add put to all pes from all lanes concurrently * Remove wg_init, use size_t for size params, 64bit data exchange (more bits for verification masking) * Rename to flood-test, add put,putnbi,p,get,getnbi,g variants, count time correctly * Add flood tester to the testing script * add to gda test case w/o the _g variant that is not implemented. [ROCm/rocshmem commit: `cca7872bcf`]	2026-01-16 10:40:48 -05:00
Aurelien Bouteiller	cca7872bcf	new tester: put to all pes from all lanes concurrently (#112 ) * Add put to all pes from all lanes concurrently * Remove wg_init, use size_t for size params, 64bit data exchange (more bits for verification masking) * Rename to flood-test, add put,putnbi,p,get,getnbi,g variants, count time correctly * Add flood tester to the testing script * add to gda test case w/o the _g variant that is not implemented.	2026-01-16 10:40:48 -05:00
vedithal-amd	f64d8e0f43	[rocprofiler-compute] Improve native tool discovery and partition detection (#2630 ) * Improve native tool discovery and partition detection - Enhanced native tool path resolution to support CMAKE_INSTALL_LIBDIR variations (lib, lib64, lib32, etc.) using glob pattern matching - Extracted path variables to avoid duplication in error messages - Improved error message clarity by showing exact paths searched for .so and .cpp files - Simplified code path construction using consistent Path.resolve().parents[x] syntax - Fixed redundant partition warnings on pre-MI300 GPUs by adding architecture check - Only query compute/memory partition on MI300+ series (gfx940+) - Added proper type hints for gpu_arch parameter - Moved gpu_info extraction after soc_info to ensure gpu_arch is available - Improved code comments for MI300 series threshold * Handle gpu arch like a hex string	2026-01-16 10:36:19 -05:00
Fábio Mestre	e6236417f7	SWDEV-571222 - Fix bf16 headers on gcc (#2260) GCC does not support anonymous structs with members that have non-trivial constructors. This commit changes the header to remove the union when compiling with gcc. This should be a non-breaking change for other compilers.	2026-01-16 15:02:48 +00:00
Edgar Gabriel	3ce10dc688	fix allreduce tester (#385) - use the reduce_psync buffers for synchronization in allreduce, not the barrier_psync. - execute a wwg barrier after the allreduce operation. After internal discussion it was determined that it is required for correctness. [ROCm/rocshmem commit: `6f512e92a5`]	2026-01-16 08:10:25 -06:00
Edgar Gabriel	6f512e92a5	fix allreduce tester (#385) - use the reduce_psync buffers for synchronization in allreduce, not the barrier_psync. - execute a wwg barrier after the allreduce operation. After internal discussion it was determined that it is required for correctness.	2026-01-16 08:10:25 -06:00
Fábio Mestre	7794ac9ac6	[hip-tests] Fix Float16 accuracy tests (#2178) Tests were relying on floats for calculating ulp values when validating the output. This is not correct given that the calculations are done using Float16. The fix is to update the test framework to use fp16 ulp instead.	2026-01-16 13:25:11 +00:00
Kian Cossettini	9f014db6a4	[rocprofiler-systems] Update install path for examples (#2625) * Update install path for examples to `share/rocprofiler-systems/examples` ---- Co-authored-by: Kian Cossettini <Kian.Cossettini@amd.com> Signed-off-by: David Galiffi <David.Galiffi@amd.com>	2026-01-15 21:51:16 -05:00
German Andryeyev	e438308541	rocr/libhskamt: Add wsl build in thunk	2026-01-15 17:29:50 -05:00
Omri Mor	93493e3e46	ionic: fix byteswap functions (added in #345), missed in #368 (#388) [ROCm/rocshmem commit: `885e41ec62`]	2026-01-15 14:19:19 -08:00
Omri Mor	885e41ec62	ionic: fix byteswap functions (added in #345), missed in #368 (#388)	2026-01-15 14:19:19 -08:00
German Andryeyev	5c5b9729ff	Add 'projects/rocr-runtime/libhsakmt/include/hsakmt/drm/' from commit '8c47e25315e70f9c8cdd57a5790d3e080938c969' git-subtree-dir: projects/rocr-runtime/libhsakmt/include/hsakmt/drm git-subtree-mainline: `5319163521` git-subtree-split: `8c47e25315`	2026-01-15 16:06:07 -05:00
Omri Mor	3260759dfd	Replace byteswap interface to align with C++23 std::byteswap (#368) * byteswap<T> returns by value * replace hand-rolled implementations with Clang __builtin_bswap<N> intrinsics * new high-level interface endian::to_be, endian::from_be, etc. to indicate conversion direction [ROCm/rocshmem commit: `cf8b72a047`]	2026-01-15 13:03:01 -08:00
Omri Mor	cf8b72a047	Replace byteswap interface to align with C++23 std::byteswap (#368) * byteswap<T> returns by value * replace hand-rolled implementations with Clang __builtin_bswap<N> intrinsics * new high-level interface endian::to_be, endian::from_be, etc. to indicate conversion direction	2026-01-15 13:03:01 -08:00
German Andryeyev	5319163521	Add 'projects/rocr-runtime/libhsakmt/include/impl/' from commit 'c34ec1e52fcb52da248c00207ebe646197ea9d3e' git-subtree-dir: projects/rocr-runtime/libhsakmt/include/impl git-subtree-mainline: `55f7d39fa5` git-subtree-split: `c34ec1e52f`	2026-01-15 15:54:37 -05:00
German Andryeyev	55f7d39fa5	Add 'projects/rocr-runtime/libhsakmt/src/dxg/' from commit '029690f0a4f62fefefbb67305a066a72e99f8c0b' git-subtree-dir: projects/rocr-runtime/libhsakmt/src/dxg git-subtree-mainline: `8760fb4976` git-subtree-split: `029690f0a4`	2026-01-15 15:51:21 -05:00
Mark Meserve	8760fb4976	attach: Formalize ROCAttach API (#1653 ) * attach: Formalize ROCAttach API - Make ROCAttach public with public headers - Change detach to take a PID - attach and detach are now reentrant - Cleanup of states and signal handling in ptrace session - Fixes mixed up definition of ROCPROF_ATTACH_TOOL_LIBRARY - ROCPROF_ATTACH_TOOL_LIBRARY now always means the tool library loaded by the attachment target - ROCPROF_ATTACH_LIBRARY refers to the library used to perform attachment - Add direct call of rocprof-attach - Fix python library call of rocprof-attach - Function now named attach(), changed from main() * attach: rocprof-compute ROCAttach updates - Update to new library names - Correct usage of C lib detach * attach: add test for rocattach - Disable ASan, TSan, and UBSan for the new parallel-attach test - Lower log level for LSan tests, existing behavior from other tests --------- Co-authored-by: Ammar ELWazir <aelwazir@amd.com>	2026-01-15 14:32:14 -06:00
dsclear-amd	2482bff0b7	Excludes (more) docs-only changes from .azuredevops/rocm_ci_caller.yml. (#2615 ) Motivation We wish to avoid triggering full Jenkins runs for docs-only PRs, as this takes up testing resources and slows development time. rocm_ci_caller.yml already excludes some docs-only changes, but this can be improved to exclude them along more paths. Technical Details The checks that rocm_ci_caller.yml uses to determine if a changed file in a PR is worth a Jenkins run has been increased to exclude more paths and more file suffixes. JIRA ID AIROCDOC-78, AIROCDOC-424 Test Plan Created a test branch users/dsclear/shorten_workflows_test_root with the changes in this PR, branched from develop. Branched users/dsclear/shorten_workflows_test_bin_3 and users/dsclear/shorten_workflows_test_text_3 from users/dsclear/shorten_workflows_test_root. Modified users/dsclear/shorten_workflows_test_bin_3 to add two .h files, and submitted a PR into users/dsclear/shorten_workflows_test_root (Test PR, do not merge. Test PR to test Jenkins CI/CD modifications. #2613). Modified users/dsclear/shorten_workflows_test_text_3 to add a new .txt file, and submitted a PR into users/dsclear/shorten_workflows_test_root (Test PR, do not merge. Test PR to test Jenkins CI/CD modifications (docs only). #2614). Test Result The test PR in step 3 caused rocm_ci_caller.yml to attempt to trigger Jenkins, as this is a 'non-docs' change. The test PR in step 4 had the attempt to trigger Jenkins skipped, as this is a 'docs-only' change.	2026-01-15 14:54:20 -05:00
Mario Limonciello	838b3dccf1	Adjust amdgpu version output for `amd-smi` (#2563 ) * Fix the amdgpu version string comparison The intention behind it was to avoid showing the string if it's not got information. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> * Display the kernel version in amd-smi output This is an interesting debugging point, especially in the case of not having a DKMS package installed. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> * Moving os_kernel_version to static --driver Signed-off-by: Maisam Arif <Maisam.Arif@amd.com> --------- Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Maisam Arif <Maisam.Arif@amd.com> Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>	2026-01-15 11:11:58 -08:00
yugang-amd	fe60c39256	Bump rocm-docs-core to 1.31.2 (#2627) * Update requirements.in * Update requirements.txt	2026-01-15 13:18:30 -05:00
yugang-amd	bcd9119dbc	Bump rocm-docs-core to 1.31.2 (#387) [ROCm/rocshmem commit: `491739c9b4`]	2026-01-15 13:17:51 -05:00
yugang-amd	491739c9b4	Bump rocm-docs-core to 1.31.2 (#387)	2026-01-15 13:17:51 -05:00
Bindhiya Kanangot Balakrishnan	aa16cca39a	[SWDEV-549108] Increase gpu_metrics API execution test threshold (#2617) Increased threshold from 2100 µs to 3100 µs to accommodate gpu_metric read time variation across Navi systems. Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>	2026-01-15 11:20:17 -06:00
Matthias Gehre	1883f736ad	Fix double-free crash when librocm_smi64.so and libamd_smi.so are loaded together (#2531 ) Problem: When TheRock-based PyTorch package is installed along with amdsmi, importing torch causes a double-free crash on exit (GitHub issue ROCm/TheRock#2269). Root cause: Both librocm_smi64.so and libamd_smi.so export the C++ static member 'amd::smi::Device::devInfoTypesStrings'. When libraries are loaded with RTLD_GLOBAL, the dynamic linker resolves libamd_smi.so's reference to this symbol to the one in librocm_smi64.so. This causes: 1. librocm_smi64.so registers its destructor for devInfoTypesStrings 2. libamd_smi.so also registers a destructor, but for the SAME address 3. On exit, both destructors run on the same object -> double-free Fix: Change devInfoTypesStrings from a class static member to a file-local static variable. This ensures the symbol has internal linkage and is not exported, preventing the symbol collision. Changes: - rocm_smi_device.h: Remove static member declaration - rocm_smi_device.cc: Change from 'Device::devInfoTypesStrings' to file-local 'static const std::map<...> devInfoTypesStrings' - rocm_smi.cc: Remove the global alias to the (now removed) class member Tested on gfx1151. `import torch` crashed on exit before the fix, and doesn't crash after the fix.	2026-01-15 08:43:47 -08:00
Filip Jankovic	29cd25df66	Add hipDeviceAttributeExpertSchedMode (#2435) * Add hipDeviceAttributeExpertSchedMode --------- Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com> * Update hipDeviceAttributeExpertSchedMode unit test * Move check to ROCr from thunk interface * Revert unrelated whitespace changes * Revert version bump --------- Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>	2026-01-15 08:41:39 -08:00
Milan Radosavljevic	940488ed58	[rocprofiler-systems] Fix naming and description of process_page category (#2606)	2026-01-15 16:10:50 +01:00
Milan Radosavljevic	318d13870f	[rocprofiler-systems] Update logging to use spdlog library (#2428 ) ## Motivation - Structured logging with proper log levels (TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL) - Better performance through compile-time formatting - Consistent formatting using fmt library - Runtime log level control via arguments and environment variables - Easier maintenance and debugging capabilities ## Technical Details - Added spdlog as a submodule and integrated it into CMake build system - Created new `rocprofiler-systems-logger` library wrapping spdlog functionality - Replaced custom logging macros (`ROCPROFSYS_VERBOSE`, `ROCPROFSYS_DEBUG`, `ROCPROFSYS_FATAL`, `ROCPROFSYS_REQUIRE`, `ROCPROFSYS_CI_THROW`, etc.) with spdlog equivalents (`LOG_DEBUG`, `LOG_WARNING`, `LOG_CRITICAL`, etc.) - Implemented log level control through command-line arguments and environment variables - Converted assertion macros to proper error handling with exceptions and std::abort()	2026-01-14 15:27:51 -05:00
Joseph Narlo	499127c0b9	[SWDEV-553434] No direct way to get the BASEBOARD temperature info (#2502 ) * [SWDEV-553434] No direct way to get the BASEBOARD temperature info. Need to iterate all gpus Signed-off-by: amd-josnarlo <josnarlo.amd.com> --------- Signed-off-by: amd-josnarlo <josnarlo.amd.com> Co-authored-by: amd-josnarlo <josnarlo.amd.com>	2026-01-14 13:52:58 -06:00
David Yat Sin	a3b445118d	SWDEV-519413 - Ignore ROCr shutdown events (#1616) ROCr now reports a shutdown event, but this is not a fatal error. Ignore this event.	2026-01-14 11:28:03 -08:00
xuchen-amd	71b9ea6ba0	[rocprofiler-compute] improve config management system (#2359)	2026-01-14 13:20:27 -05:00
Luca Bruni	d7ff927690	[clr] Fix device printf pointer advancement issue with string format specifiers (#1313)	2026-01-14 13:05:25 -05:00
habajpai-amd	bad8d915c3	Fix: Add visibility hidden to devInfoTypesStrings to prevent symbol interposition (#2575)	2026-01-14 09:48:49 -08:00
Gopesh Bhardwaj	b18db05091	[rocprofiler-sdk] Fixing docs build (#2608)	2026-01-14 10:13:17 -05:00
pghoshamd	d2a1fc945e	SWDEV-569319 Fix dangling reference warning (#2509) * SWDEV-569319 Fix dangling reference warning * fix nullptr warning * use emplace * return regular pointer	2026-01-13 15:39:03 -06:00
hongkzha-amd	9dc2488b6b	rocrtst: Add test cases for interrupt disabled mode (#2385) Add explicit test cases to verify ROCr functionality with interrupts disabled (HSA_ENABLE_INTERRUPT=0). This ensures compatibility with virtio, dtif, and WSL configurations which require interrupt-disabled mode. Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>	2026-01-13 12:10:11 -06:00
hongkzha-amd	b3c4e94e70	rocr: Improve memory protection and WSL compatibility (#2274 ) * rocr: Add ProtectMemory API and use it in RemoveAccess Replace munmap + mmap with mprotect when removing memory access. This improves performance by 5-10x, ensures atomicity (no race condition window), and prepares for WSL/DXG compatibility fixes. Suggested-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: Flora Cui <flora.cui@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> * rocr: Skip CPU mapping operations on WSL On WSL, CPU cannot access GPU VRAM due to platform restrictions. CPU access would fault-in system RAM instead, causing data corruption and memory leaks. Return HSA_STATUS_ERROR to fail fast rather than silently creating broken mappings. GPU-to-GPU mappings remain functional. Signed-off-by: Flora Cui <flora.cui@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> * rocr: reduce ifdef linux v2: Fix IsDXG check logic Signed-off-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> --------- Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Signed-off-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: Flora Cui <flora.cui@amd.com>	2026-01-13 12:08:20 -06:00
Geo Min	dfdb64572c	[TheRock CI] Adding working single node tests (#2142) * Adding working single node tests * Revert to old docker sha * adding back no perf tests --------- Co-authored-by: Aravind Ravikumar <arravikum@amd.com> [ROCm/rccl commit: `4b295c9893`]	2026-01-13 08:35:58 -08:00
Geo Min	4b295c9893	[TheRock CI] Adding working single node tests (#2142) * Adding working single node tests * Revert to old docker sha * adding back no perf tests --------- Co-authored-by: Aravind Ravikumar <arravikum@amd.com>	2026-01-13 08:35:58 -08:00
Jan Stephan	2e8c863341	Use doxysphinx includes for enums, macros and global types (#2273) Signed-off-by: Jan Stephan <jan.stephan@amd.com>	2026-01-13 17:33:49 +01:00
Jan Stephan	88584f3c0d	Fix wrong call to executable (#2290) Signed-off-by: Jan Stephan <jan.stephan@amd.com>	2026-01-13 17:31:10 +01:00
Adam Pryor	9425a2f687	[SWDEV-569427] Fix segfault calling bad page info (#2547)	2026-01-13 09:44:49 -06:00
AidanBeltonS	607d66e87c	Add messages to static asserts to prevent warnings (#1011)	2026-01-13 14:02:36 +00:00
Fábio Mestre	09a01ee11c	Replace usages of __ockl_clz with builtins (#2234)	2026-01-13 11:15:46 +01:00
Fábio Mestre	61325db1c8	Fix AMD_LOG_LEVEL_SIZE env variable (#2463) AMD_LOG_LEVEL_SIZE is being used in a global variable. This always uses the default value of 2048 because the HIP runtime doesn't have the opportunity to load environment variables at the point where global variables are initialized. The solution is to use AMD_LOG_LEVEL_SIZE inside truncate_log_file() function.	2026-01-13 09:57:49 +00:00
Jan Stephan	35a5274b84	CSS: Don't reference images that aren't generated by Doxygen (#2295) Signed-off-by: Jan Stephan <jan.stephan@amd.com>	2026-01-13 10:11:57 +01:00
David Galiffi	2daec0e4d0	Revert `63713f01e0` (#2585) ## Motivation Remove Fortran example due to Palamida scan violation. ## Technical Details Revert `63713f01e0`. New test to be added later. Signed-off-by: David Galiffi <David.Galiffi@amd.com>	2026-01-12 23:44:26 -05:00
randyh62	21b6021848	Restore Lane masks bit shift content (#2411) Co-authored-by: Christophe Paquot <35546540+chrispaquot@users.noreply.github.com>	2026-01-12 19:01:19 -05:00
dsclear-amd	d5f490fa2f	Sets heavy GitHub CI workflows to not trigger on text documentation-only changes. (#2417) Sets heavy GitHub CI workflows to not trigger on docs-only changes. Specifically, sets azure-ci-dispatcher.yml and therock-ci.yml, as well as many rocprofiler workflows, to not trigger when the change consists entirely of docs-only files.	2026-01-12 18:31:30 -05:00
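
Note on commit 7794ac9ac6 (Fix Float16 accuracy tests): the message says the tests measured error in float ulps although the computation runs in Float16. The sketch below is an illustration only and is not taken from hip-tests; the helper names (`to_fp16`, `ulp`, `fp16_ulp`, `fp32_ulp`, `ulp_error`) are hypothetical. It assumes IEEE 754 binary16 and binary32 layouts and shows how far apart the two ulp sizes are for the same absolute error.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def ulp(x: float, fraction_bits: int, min_normal_exp: int) -> float:
    """Spacing between adjacent representable values around x.

    fraction_bits: explicit fraction bits (10 for fp16, 23 for fp32).
    min_normal_exp: exponent of the smallest normal value (-14 / -126).
    """
    if x == 0.0:
        # Smallest subnormal of the format.
        return 2.0 ** (min_normal_exp - fraction_bits)
    _, e = math.frexp(abs(x))  # abs(x) == m * 2**e with 0.5 <= m < 1
    return 2.0 ** (max(e - 1, min_normal_exp) - fraction_bits)


def fp16_ulp(x: float) -> float:
    return ulp(x, 10, -14)


def fp32_ulp(x: float) -> float:
    return ulp(x, 23, -126)


def ulp_error(actual: float, expected: float, ulp_fn) -> float:
    """Absolute error expressed in units of ulp_fn(expected)."""
    return abs(actual - expected) / ulp_fn(expected)


# A result that is off by exactly one fp16 step from the expected value 1.0.
expected = 1.0
actual = to_fp16(1.0 + 2.0 ** -10)

err_in_fp16_ulps = ulp_error(actual, expected, fp16_ulp)
err_in_fp32_ulps = ulp_error(actual, expected, fp32_ulp)
print(err_in_fp16_ulps, err_in_fp32_ulps)
```

Under these assumptions a result that is one fp16 step away from the expected value counts as a single ulp when measured in fp16 but as thousands of ulps when measured in fp32, so a tolerance expressed in float ulps does not match the precision the computation actually has.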


74761 Commits