Commit graph

68533 Commits

Author SHA1 Message Date
Junhua Shen 9da1572c42 libhsakmt: Refactor for Multi-KFD Context Support (Multiple KFD FDs per Process) (#1701)
* Introduce HsaKFDContext structure and infrastructure for multiple KFD contexts, enabling
   independent contexts within a single process.
* Refactor core components (queue, event, FMM, topology) to be context-aware,
   using explicit HsaKFDContext parameters instead of global state.
* Replace global hsakmt_kfd_fd with context-specific file descriptors, ensuring full context isolation.
* Maintain backward compatibility by redirecting legacy APIs to use the primary context.

This refactoring establishes a foundation for multi-context support while preserving existing functionality.

Signed-off-by: Junhua Shen <Junhua.Shen@amd.com>
2025-11-10 11:19:58 +08:00
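The pattern described in the libhsakmt refactor (#1701) above, explicit contexts plus legacy entry points that redirect to a primary context, can be sketched roughly as follows. This is a minimal Python illustration and not libhsakmt code: `KfdContext`, `open_context`, `create_queue`, `primary_context`, and `legacy_create_queue` are hypothetical names, and plain integers stand in for real KFD file descriptors.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional

# Pretend file descriptors, for illustration only (assumption: no real device).
_fake_fds = count(100)


@dataclass
class KfdContext:
    """Hypothetical stand-in for per-context state such as HsaKFDContext."""
    fd: int
    queues: List[str] = field(default_factory=list)


def open_context() -> KfdContext:
    """Create an independent context with its own (fake) descriptor."""
    return KfdContext(fd=next(_fake_fds))


def create_queue(ctx: KfdContext, name: str) -> str:
    """Context-aware call: all state lives on the explicit ctx argument."""
    ctx.queues.append(name)
    return f"{name}@fd{ctx.fd}"


_primary: Optional[KfdContext] = None


def primary_context() -> KfdContext:
    """Lazily create the primary context used by legacy entry points."""
    global _primary
    if _primary is None:
        _primary = open_context()
    return _primary


def legacy_create_queue(name: str) -> str:
    """Legacy signature kept for compatibility; redirects to the primary context."""
    return create_queue(primary_context(), name)
```

Because every context-aware function takes its context as a parameter, two contexts opened in the same process never share queue state, while callers of the legacy function keep working unchanged.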
Jin Jung 324a5519b9 SWDEV-563842 - Fix Memory Address Offset Bug (#1749)
* SWDEV-563842 - Fix Memory Address Offset Bug

* Revert "SWDEV-563842 - Fix Memory Address Offset Bug"

This reverts commit 477958dc48300ee1fe0166aa6f0d3d8125b91f5e.

* SWDEV-563842 - Fix Memcpy Address Offset Bug

* SWDEV-563842 - Find Memcpy Device Address Offset

* Revert "SWDEV-563842 - Find Memcpy Device Address Offset"

This reverts commit 6c75a9e5b58b7dfabb9e3f91fa3dd892d42639cc.

* Revert "SWDEV-563842 - Fix Memcpy Address Offset Bug"

This reverts commit 0b89072a988074aa4da4e8fc7ba04c554f31ed44.

* SWDEV-563842 - MemObjMap_ Offset Support

This patch fixes the buffer offset handling bug.

* Revert "SWDEV-563842 - MemObjMap_ Offset Support"

This reverts commit 37fce3382465e3420721e5277377f943ec2b30a1.

* SWDEV-563842 - External Memory Buffer View
2025-11-09 12:52:35 -08:00
Victor Zhang 7580052878 SWDEV-564318 - Add support for allocating uncached device memory (#1670) 2025-11-09 12:51:41 -05:00
Gerardo Hernandez 99cab3500d SWDEV-561284 - Fix use of uninitialized memory in Unit_hipMemVmm_Basic and Unit_hipMemVmm_Uncached (#1677) 2025-11-09 12:12:24 +00:00
SaleelK 738bb19835 clr: Increase kernelArg/managedBuffer size (#1586)
* Increase the buffer to 4MB. That can help kernel launches limited by a deep kernel pipeline

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-08 18:32:43 -08:00
ajanicijamd 2f9017f706 Fix build failure with Clang 20. (#1667)
* Modified for Clang

* Updated timemory version so it compiles with Clang 20

* Using TBB version 2018.6 for both GCC and Clang builds
2025-11-08 11:36:12 -05:00
Pengda Xie 93947241d0 SWDEV-556684 - HSAIL cleanup (#1657) 2025-11-08 02:22:03 -08:00
Pengda Xie 5dd15e22ca SWDEV-559514 - Add queue validation to submitMarker sync path (#1308) 2025-11-08 02:21:36 -08:00
lancesix f7ffcd1402 clr: SWDEV-547890 - Bump PAL API version to 954 (#1680)
* clr: Adjust call to ICmdBuffer::CmdCopyMemoryToImage for PAL >= 955

Starting with version 955, PAL adds a new argument to
ICmdBuffer::CmdCopyMemoryToImage.  Adjust the call site to account
for this.

* clr: Handle new GpuUtil::TraceSessionState cases for PAL >= 939

Starting PAL API version 939, GpuUtil::TraceSessionState changes its
possible values.  Adjust for it.

* clr: require PAL version 954

Bump the required PAL version to 954, as this is required for proper
debugger support.
2025-11-08 00:52:04 +00:00
Pratik Basyal 0325de6538 [ROCm Systems Profiler] Path issue note added to Profiling python script (#1766)
* Note added to Profiling python script

* Doxygen reverted

* Update projects/rocprofiler-systems/docs/how-to/profiling-python-scripts.rst

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-07 18:49:23 -05:00
Jin Jung 291ff6c468 SWDEV-558855 - Enable Interop Map Buffer on Windows (#1748)
* Support Windows HANDLE in interop_map_buffer

* Refactored Windows HANDLE in interop_map_buffer

* ROCr System Dependent Handle Type

* Fix for ROCr Handle Conversion Bug

* Remove Windows Header
2025-11-07 12:47:01 -08:00
Jimbo 2006a411e5 SWDEV-561611 - fix codeql errors by increasing printf buffer sizes (#1507)
* SWDEV-561611 - fix codeql errors by increasing printf buffer sizes

* Replace sprintf with snprintf to prevent potential buffer overflow

---------

Co-authored-by: cadolphe-amd <chris.adolphe@amd.com>
2025-11-07 15:42:56 -05:00
David Yat Sin de3b7322f2 rocr/hsakmt: Fix asan compile errors - KFDQMTest (#1638)
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-07 14:52:36 -05:00
David Yat Sin 48cb61f378 rocr: Separate Linux coredump implementation (#1588)
Remove libamdhsacode/win32/elf.h due to license restrictions.

Separate Linux coredump implementation because we do not have the ELF
definitions on Windows.

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-07 14:52:08 -05:00
Larry Meadows e6fc009b28 SWDEV-552584 fix racy null pointer exception for ompt_callback_task_schedule for ompt_task_early_fulfill tasks (#980)
* Fix for SWDEV-552584
    Two calls to ompt_callback_task_schedule were issued for the same
    prior task. One of them was ompt_task_complete, which causes
    internal storage to be released and a pointer zeroed. The other
    was ompt_task_early_fulfill, which attempted to reference the
    pointer. The callbacks could come in any order as they were
    from different threads, thus causing a null pointer
    dereference on occasion.  The code was changed to do nothing
    for the early_fulfill. Additional null pointer checks were
    added.

* formatting

* Update ompt.cpp

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-07 12:15:48 -06:00
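The race described in the SWDEV-552584 fix above can be illustrated with a small sketch. It is not the code from ompt.cpp: `TaskTable`, its methods, and the string status values are hypothetical, and a lock-protected dictionary stands in for the internal per-task storage. The two ideas taken from the commit message are that the early-fulfill case does nothing and that every other case checks for already-released storage.

```python
import threading
from typing import Dict, List, Optional

# Hypothetical status names mirroring the two cases in the commit message.
TASK_COMPLETE = "complete"
TASK_EARLY_FULFILL = "early_fulfill"


class TaskTable:
    """Per-task storage that is released when the task completes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[int, Optional[List[str]]] = {}

    def start(self, task_id: int) -> None:
        with self._lock:
            self._data[task_id] = []

    def on_task_schedule(self, task_id: int, status: str) -> str:
        if status == TASK_EARLY_FULFILL:
            # Do nothing: a concurrent "complete" callback from another
            # thread may already have released the storage.
            return "ignored"
        with self._lock:
            record = self._data.get(task_id)
            if record is None:
                # Null check: storage was never created or already released.
                return "missing"
            if status == TASK_COMPLETE:
                self._data[task_id] = None  # release storage, zero the slot
                return "released"
            record.append(status)
            return "recorded"
```

With this shape, the two callbacks can arrive in either order without touching released storage.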
Milan Radosavljevic d9b00da102 Add clean up of buffered_storage files (#1738)
* Add clean up of buffered_storage files

* Add step to workflows to test for remaining temp files after tests

* Applied suggestions from code review

* add deletion of all cache files

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-07 11:51:09 -05:00
Ethan Trinh 6b73f6ab5c fix texture operator error (#1719) 2025-11-07 11:34:58 -05:00
Yiannis Papadopoulos 30785f8d18 rocr: Assume KFD in hsa_amd_interop functions (#1138) 2025-11-07 09:38:06 -06:00
Milan Radosavljevic a9082a7158 ROCpd schema fetching from rocprofiler-sdk (#1501)
- Integrate rocprofiler-systems with rocprofiler-sdk-rocpd to fetch schema
- If rocprofiler-sdk-rocpd is not available, use embedded schema files. With this we provide rocpd format support even if ROCm is not available
- Include detection in CMake if rocprofiler-sdk-rocpd package is available (and valid), and build database class upon that
- Update embedded schema that is used as a fallback.
- Update some validation tests to account for schema changes.
2025-11-07 09:45:29 -05:00
Ben Richard b299eece9b Fix bug in rocprof-compute parsing (#1664)
We were not handling the case where the eval result is None, e.g. some
columns have a peak value, but it is unused, so we use 'None', which
evaluates to the None object.

Return empty string in this case.
2025-11-07 09:33:43 -05:00
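The behaviour described in the rocprof-compute parsing fix above (#1664) can be sketched as follows. This is an assumption-laden illustration rather than the project's parser: `format_cell` is a hypothetical helper, and `eval` appears only because the commit message mentions an eval step.

```python
def format_cell(expression: str) -> str:
    """Evaluate a column expression; return '' when the result is None."""
    # Builtins are removed to keep this sketch contained; the literal
    # 'None' still evaluates to the None object.
    result = eval(expression, {"__builtins__": {}}, {})
    if result is None:
        return ""
    return str(result)
```

For example, `format_cell("None")` yields an empty string instead of propagating the None object, while `format_cell("1 + 2")` yields `"3"`.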
Gopesh Bhardwaj fabdab7aa4 [aqlprofile] Adding Strix Halo support (#1477)
* Adding Strix Halo support

* copilot review feedback

* Addressing feedback
2025-11-07 00:46:17 -06:00
Gopesh Bhardwaj 06bf110c84 Adding counters support for strix halo (#1358)
* Adding counters support for strix halo

* Updated counters list

* Added missing counter info

* Updated arch support
2025-11-07 00:45:03 -06:00
Jason Bonnell 6e195ded9b Update rocprofiler_config_interfaces.cmake to use different elf naming (#1722)
* Update rocprofiler_config_interfaces.cmake to use different elf naming

* try out conditional for libelf

* run cmake-format to fix formatting issue

* Remove libelf.patch file from therock-ci-windows.yml

* Remove libelf patch from therock-ci-linux.yml as well
2025-11-06 23:50:02 -05:00
habajpai-amd 590c6c3b4f fix: null pointer after delete in get_stream_id (#1720) 2025-11-06 23:43:34 -05:00
David Galiffi 89cf46eb55 Removing jlumbroso/free-disk-space action from workflows (#1700) 2025-11-06 18:11:09 -05:00
Sourabh U Betigeri 90d5dc6b3a SWDEV-564408 - Reduces hip-tests runtime Pt 1 (#1695)
* SWDEV-564408 - Reduces hip-tests runtime Pt 1

* Update cmd_options.hh
2025-11-06 13:45:36 -08:00
Pratik Basyal fdb557c88a [Systems-Profiler] Officially unsupported OS removed (#1740)
* Fedora and CentOS removed

* David's feedback incorporated

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-06 16:06:28 -05:00
Kian Cossettini f4d0aeb8f3 Adjust host thread count for OpenMP-VV tests (#1742)
Reducing test time
2025-11-06 16:04:47 -05:00
MachineTom 3bb8c2ac50 SWDEV-564392 - Clean up image tests (#1694)
Remove unnecessary checking.
Enable all disabled tests.
Move Mipmap test files into Windows section.
2025-11-06 15:07:53 -05:00
Joseph Macaranas 524f62ae67 TheRock CI Workflow Updates 20251106 (#1743)
- Update the pinned SHA for TheRock in CI workflows.
- Update the version for actions in those same workflows.
- Comment out the rm .patch line and provide details on its use.
2025-11-06 12:06:44 -05:00
Poag, Charis d73726698b [SWDEV-562295] Fix Dmesg errors when using CLI (#822)
* Changes:
  - Instead of attempting to open files to check
    permissions, check read access only.

Do not try to open all paths; doing so may cause driver issues.
Read access is sufficient to check permissions.

Reason: on GPUs that support partitioning (memory/compute),
logical devices are not valid until configured.
See `sudo amd-smi set -h` or applicable APIs
to configure on supported hardware.

Example error dmesg output:
[965358.883112] amdgpu 0000:15:00.0: amdgpu: renderD153 partition 1 not valid!
[965358.883283] amdgpu 0000:15:00.0: amdgpu: renderD154 partition 2 not valid!
[965358.883438] amdgpu 0000:15:00.0: amdgpu: renderD155 partition 3 not valid!
[965358.883594] amdgpu 0000:15:00.0: amdgpu: renderD156 partition 4 not valid!
[965358.883749] amdgpu 0000:15:00.0: amdgpu: renderD157 partition 5 not valid!
[965358.883904] amdgpu 0000:15:00.0: amdgpu: renderD158 partition 6 not valid!
[965358.884060] amdgpu 0000:15:00.0: amdgpu: renderD159 partition 7 not valid!

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-11-06 10:24:14 -06:00
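The idea in the SWDEV-562295 fix above, checking read permission without opening the device node, can be sketched like this. The commit message does not say which call the CLI uses; `os.access`, Python's wrapper over the POSIX `access()` check, is an assumption here, and `readable_paths` is a hypothetical helper.

```python
import os
from typing import Iterable, List


def readable_paths(candidates: Iterable[str]) -> List[str]:
    """Return the candidates the current user may read, without opening them.

    os.access only queries permissions, so no open() reaches the driver
    for device nodes that are not yet valid.
    """
    return [path for path in candidates if os.access(path, os.R_OK)]
```

A caller might pass a list of render node paths and open only the ones it actually needs afterwards, rather than probing every path with `open()`.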
Poag, Charis ced0642b4b [SWDEV-562295] Fix Dmesg errors when using CLI (#822)
* Changes:
  - Instead of attempting to open files to check
    permissions, check read access only.

Do not try to open all paths; doing so may cause driver issues.
Read access is sufficient to check permissions.

Reason: on GPUs that support partitioning (memory/compute),
logical devices are not valid until configured.
See `sudo amd-smi set -h` or applicable APIs
to configure on supported hardware.

Example error dmesg output:
[965358.883112] amdgpu 0000:15:00.0: amdgpu: renderD153 partition 1 not valid!
[965358.883283] amdgpu 0000:15:00.0: amdgpu: renderD154 partition 2 not valid!
[965358.883438] amdgpu 0000:15:00.0: amdgpu: renderD155 partition 3 not valid!
[965358.883594] amdgpu 0000:15:00.0: amdgpu: renderD156 partition 4 not valid!
[965358.883749] amdgpu 0000:15:00.0: amdgpu: renderD157 partition 5 not valid!
[965358.883904] amdgpu 0000:15:00.0: amdgpu: renderD158 partition 6 not valid!
[965358.884060] amdgpu 0000:15:00.0: amdgpu: renderD159 partition 7 not valid!

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: d73726698b]
2025-11-06 10:24:14 -06:00
jamessiddeley-amd 37bbb58a19 [rocprof-compute] fix unit regex 'ns' in analyze mode (#1689)
* fix unit regex in analyze mode

* ruff format
2025-11-06 11:13:10 -05:00
Galantsev, Dmitrii 8bdf951d32 Add numbers to .so because wheels don't allow symlinks (#820)
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-11-06 03:57:31 -06:00
Galantsev, Dmitrii 181659ea1f Add numbers to .so because wheels don't allow symlinks (#820)
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>

[ROCm/amdsmi commit: 8bdf951d32]
2025-11-06 03:57:31 -06:00
marandje 0ad05ed515 SWDEV-556947 - Parse the HIP version from the Git tag (#1135) 2025-11-06 10:18:26 +01:00
Galantsev, Dmitrii aac09912ec Add downloaded gtest as fallback
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-11-06 01:26:40 -06:00
Galantsev, Dmitrii 4e8d89306e Add downloaded gtest as fallback
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: aac09912ec]
2025-11-06 01:26:40 -06:00
Satyanvesh Dittakavi 478cee0f68 SWDEV-559525 - Add the HIP_POINTER_ATTRIBUTE_IS_LEGACY_HIP_IPC_CAPABLE attribute support (#1647)
* SWDEV-559525 - Add the HIP_POINTER_ATTRIBUTE_IS_LEGACY_HIP_IPC_CAPABLE attribute implementation

* Update indentation in hip_memory.cpp
2025-11-06 12:07:32 +05:30
systems-assistant[bot] 27f85500f8 Update amdgpu-windows-interop with latest changes 20251105 (#1728)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-11-05 22:09:25 -05:00
habajpai-amd ea31a0bf18 rocprofiler-sdk: fix per-record group_by_queue scoping (#1676)
* rocprofiler-sdk: fix per-record group_by_queue scoping

* added under resolved issues to CHANGELOG.md

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-05 21:46:44 -05:00
Xie, AlexBin c877be2afe rocr: make sure the member variable is constructed (#1387)
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
2025-11-05 17:19:33 -05:00
alexxu-amd a330fb6b91 Fix issue where latest docs don't get synchronized (#1714) 2025-11-05 17:08:19 -05:00
Joseph Macaranas 865a8d4d59 Revert "Update amdgpu-windows-interop with latest changes (#1718)" (#1725)
This reverts commit 321e497048.
2025-11-05 15:38:23 -05:00
systems-assistant[bot] 321e497048 Update amdgpu-windows-interop with latest changes (#1718)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-11-05 21:13:32 +01:00
lancesix 280cda3196 clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue (#1669)
* clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue

To simplify the shader debugger implementation, maintain the relevant
parts of the emulated AQL queue's MQD (amd_queue_t): read_dispatch_id,
write_dispatch_id, compute_tmpring_size.

With this MQD, the shader debugger can handle the emulated AQL queue
the same way it does the real AQL queue, no specialization is required.

* clr: SWDEV-547890 - Conservatively update the MQD's read_dispatch_id

The read_dispatch_id cannot be smaller than the current aql_packet_id
- hsa_queue.size for the debugger to work correctly.

The read_dispatch_id really should be updated when the CmdBuf is marked
as complete. Left a FIXME to address it in a future commit.

---------

Co-authored-by: Laurent Morichetti <laurent.morichetti@amd.com>
2025-11-05 17:39:33 +00:00
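The constraint stated in the commit above, that read_dispatch_id cannot be smaller than the current aql_packet_id minus hsa_queue.size, amounts to a simple lower bound. The sketch below is illustrative only: `conservative_read_dispatch_id` is a hypothetical helper, and both the clamp at zero (assuming unsigned IDs) and the rule that the value never moves backwards are assumptions, not something the commit message states.

```python
def conservative_read_dispatch_id(current_read: int,
                                  aql_packet_id: int,
                                  queue_size: int) -> int:
    """Raise read_dispatch_id just enough to satisfy the lower bound.

    Bound from the commit message: read >= aql_packet_id - queue_size.
    Assumption: IDs are unsigned, so the bound is clamped at zero.
    """
    lower_bound = max(aql_packet_id - queue_size, 0)
    # Never move the read index backwards (assumption: it is monotonic).
    return max(current_read, lower_bound)
```

For instance, with a queue of 4 slots and aql_packet_id at 10, a read index of 0 is raised to 6, while a read index of 8 is left alone.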
Rakesh Roy 8797bb0150 Revert "SWDEV-562996 - Build fix: Ubertrace callback calling convention mismatch on x86 (#1587)" (#1717)
This reverts commit 8d31383dfe.

Reason for revert: It is breaking TheRock build on Windows
2025-11-05 11:48:02 -05:00
Apurv Mishra eded1f3529 rocrtst: Add check for SVM support in Runtime (#1687)
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
Approved-by: David Yat Sin <David.YatSin@amd.com>
2025-11-05 11:36:38 -05:00
Galantsev, Dmitrii 982737a852 Fix missing iomanip and cstdio in tests
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-11-05 10:14:19 -06:00
Galantsev, Dmitrii 87ace88e72 Fix missing iomanip and cstdio in tests
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 982737a852]
2025-11-05 10:14:19 -06:00