63 Commits

Author SHA1 Message Date
pghoshamd 793755532f SWDEV-561708 Initial shared queue pool apis (#1614)
* SWDEV-561708 Initial shared queue pool apis

* Validate params; some fixes in callback function (but still needs to be checked)

* Dtor cleanup

* minor

* Enable profiling; remove callback since aql_queue takes care of it

* setPriority and setCuMask APIs updated for counted queues

* Increasing step and minor version for rocprofiler

* Tests for CountedQueueManager

* tests

* Code refactored to make pool manager part of GpuAgent only (incomplete); unique handles issue pending

* Refactored code to support CQM inside GpuAgent and unique handles; multithreaded test added

* Changed to ASSERT_SUCCESS macros for all tests

* Ring buffer overflow test added

* tests fixed; cleanup added at hsa_shutdown

* priority conversion table changes

* Compiler warnings fixed

* Rewrite 1 test; add desc and improve SetUp() code

* Improvement

* Unified getinfo for both counted and non-counted queues

* Address PR feedback

* Addressing feedback: memleak, data type mismatch, documentation

* improve comment

* format

* Missing HSA_API macros for roctracer

* Revert "Addressing feedback: memleak, data type mismatch, documentation"

This reverts commit 5e498a55fb3640e00d06cec63dcec79293fb23de.

* Improving acquire api doc

* release api doc improved

* error codes for release api doc
2026-01-21 15:30:04 -05:00
Jin Jung d4758bc29e SWDEV-570501 - Add Windows support for hipGraphicsGLRegisterBuffer (#2323) 2026-01-12 13:10:46 -06:00
Mario Limonciello bc5d48e76c Run pre-commit's whitespace related hooks on projects/rocr-runtime (#2130)
* Run pre-commit's whitespace related hooks on projects/rocr-runtime

In order for pre-commit to be useful, everything needs to meet a common
baseline.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Add missing semicolon which would block compilation on big endian CPUs

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

---------

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-08 07:56:50 -06:00
Jin Jung 291ff6c468 SWDEV-558855 - Enable Interop Map Buffer on Windows (#1748)
* Support Windows HANDLE in interop_map_buffer

* Refactored Windows HANDLE in interop_map_buffer

* ROCr System Dependent Handle Type

* Fix for ROCr Handle Conversion Bug

* Remove Windows Header
2025-11-07 12:47:01 -08:00
hkasivis 5e7210980e Users/hkasivis/add ais support v2.1 (#928)
* libhsakmt: Update hsakmt_fmm_get_handle to support address range

Currently, hsakmt_fmm_get_handle works only if the address is the
allocation's starting address. Update it so it can find the handle if
the address falls anywhere in the valid allocated range. This is useful
for the AMD Infinity Storage feature, where data needs to be transferred
to any memory within the allocated range.
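
The range lookup described above can be sketched as follows. This is a minimal illustration, not the libhsakmt implementation: `Allocation`, `AllocationMap`, `AddAllocation`, and `FindHandle` are hypothetical names, and an ordered map keyed by start address is an assumed data structure.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical allocation record; not the libhsakmt data structure.
struct Allocation {
  std::uintptr_t size;   // length of the allocation in bytes
  std::uint64_t handle;  // opaque handle associated with the allocation
};

// Allocations keyed by their start address (assumed container).
using AllocationMap = std::map<std::uintptr_t, Allocation>;

void AddAllocation(AllocationMap& map, std::uintptr_t start,
                   std::uintptr_t size, std::uint64_t handle) {
  assert(size > 0);  // an empty range cannot contain any address
  map[start] = Allocation{size, handle};
}

// Looks up the allocation whose range [start, start + size) contains addr.
// Returns true and writes the handle on success, false otherwise.
bool FindHandle(const AllocationMap& map, std::uintptr_t addr,
                std::uint64_t* handle) {
  auto it = map.upper_bound(addr);  // first allocation starting after addr
  if (it == map.begin()) return false;
  --it;                             // last allocation starting at or before addr
  if (addr - it->first >= it->second.size) return false;
  *handle = it->second.handle;
  return true;
}
```

An exact-match lookup would only succeed for the start address; the `upper_bound` step is what lets an interior address resolve to its owning allocation.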

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>

* libhsakmt: Introduce AMD Infinity Storage (AIS) API

Add hsaKmtAisReadWriteFile() API to support AMD Infinity Storage. The
API moves data directly from GPU VRAM to a file.

v2: Add in/out ioctl arguments to provide more status information to
user space. Modify hsaKmt API also accordingly.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>

* rocr: Initial implementation of AMD Infinity Storage (AIS)

Implement first two API: hsa_amd_ais_file_write and hsa_amd_ais_file_read

v2: Change API from hsa_amd_ to hsa_amd_ais_
    Change API to take in handle instead of fd for compatibility across
     different platforms

Original Author: Chris Freehill <Chris.Freehill@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>

---------

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2025-09-20 11:30:05 -04:00
Chris Freehill 287986ab65 rocr: Add hsa_amd_portable_export_dmabuf_v2
The original version of hsa_amd_portable_export_dmabuf() did not
consider the conditions under which a dmabuf could be shared.
In the new version (hsa_amd_portable_export_dmabuf_v2()), the caller
can specify the flag HSA_AMD_DMABUF_MAPPING_TYPE_PCIE, which means they
want to share the dmabuf over PCIe. In that case, if the GPU is a PCIe
GPU that is not in an XGMI hive and large BAR is not supported, the call
returns an error.


[ROCm/ROCR-Runtime commit: 3a9d14bb66]
2025-06-09 15:42:58 -05:00
Tony Gutierrez 18404ba8a8 rocr: Remove empty shared.cpp
[ROCm/ROCR-Runtime commit: 11d1d2cd25]
2025-04-23 15:53:29 -04:00
Saleel Kudchadker 945d6da90b rocr: return preferred SDMA engine mask
- Add a new AMD extension API to return preferred SDMA engine mask.
This can be used in conjunction with copy_on_engine API to get
optimal bandwidth.


[ROCm/ROCR-Runtime commit: 57c0c643ce]
2025-04-22 13:28:38 -07:00
Tony Gutierrez ff52d6fc13 rocr: Add WaitMultiple to core Signal
Replaces WaitAny with WaitMultiple to more closely align with the
underlying driver API for waiting on multiple events.

WaitMultiple adds a single parameter, wait_on_all, to the WaitAny
interface providing a single function for waiting on multiple
events when we only need AND and OR semantics for the signal
checking logic.
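
The AND/OR checking logic mentioned above can be sketched as below. This is an illustration under stated assumptions, not the ROCr `WaitMultiple` code: `Condition`, `Satisfied`, and `ConditionsMet` are hypothetical names, and the sketch only evaluates already-read signal values rather than blocking on events.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical comparison kinds, for illustration only.
enum class Condition { kEq, kNe, kLt, kGte };

// Evaluates one signal value against its condition.
bool Satisfied(std::int64_t value, Condition cond, std::int64_t compare) {
  switch (cond) {
    case Condition::kEq:  return value == compare;
    case Condition::kNe:  return value != compare;
    case Condition::kLt:  return value < compare;
    case Condition::kGte: return value >= compare;
  }
  return false;
}

// wait_on_all == true  -> every condition must hold (AND).
// wait_on_all == false -> at least one condition must hold (OR).
bool ConditionsMet(const std::vector<std::int64_t>& values,
                   const std::vector<Condition>& conds,
                   const std::vector<std::int64_t>& compares,
                   bool wait_on_all) {
  assert(values.size() == conds.size() && values.size() == compares.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    const bool ok = Satisfied(values[i], conds[i], compares[i]);
    if (wait_on_all && !ok) return false;  // AND: one failure decides
    if (!wait_on_all && ok) return true;   // OR: one success decides
  }
  // AND over everything succeeded, or OR found no match.
  return wait_on_all;
}
```

A single flag is enough here because the two modes differ only in which outcome ends the loop early.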

Change-Id: I68a4a45d48151d9d69aef02fd8f7263b9e6c0e75


[ROCm/ROCR-Runtime commit: 8a38f121ea]
2025-01-27 09:21:43 -05:00
Chris Freehill b617b05c2a rocr: Ensure globals are initialized at first use
When ROCr is built as a static library, global variables
were often not initialized to valid values at their first
use. This change addresses that problem.
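
One common way to guarantee initialization at first use is the construct-on-first-use idiom, sketched below. Whether this commit uses exactly this idiom is an assumption; `Settings`, `GlobalSettings`, and `Banner` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical settings object standing in for a runtime global.
struct Settings {
  std::string log_prefix;
  int verbosity;
  Settings() : log_prefix("rocr"), verbosity(1) {}
};

// A function-local static is constructed the first time the function is
// called (thread-safe since C++11), so callers never see it uninitialized,
// regardless of link order or static-library packaging.
Settings& GlobalSettings() {
  static Settings instance;
  return instance;
}

// A second "global" whose initializer depends on the first one.
const std::string& Banner() {
  // The dependency is constructed on demand by this call.
  assert(!GlobalSettings().log_prefix.empty());
  static const std::string banner = GlobalSettings().log_prefix + ": ready";
  return banner;
}
```

With namespace-scope globals in separate translation units, `Banner`'s initializer could run before `Settings` is constructed; wrapping both in accessor functions removes that ordering dependency.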

Change-Id: I550fa41feb3bc04b9cc686bcfb4acf2a7b651a88


[ROCm/ROCR-Runtime commit: 9b13bcd0ac]
2024-10-16 23:19:48 -04:00
Saleel Kudchadker bdc02d3054 Initial external logging API
New API to accept a file stream for logging

Co-authored-by: David Yat Sin <David.YatSin@amd.com>

Change-Id: Ie09c35ae14ca86a97eb25f61251be287c55d7169
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 26e105d9ab]
2024-08-07 02:59:00 +00:00
David Yat Sin 140b5fbd40 Add hsa_amd_vmem_address_reserve_align API
New API to support alignment parameter when reserving virtual addresses.
If the alignment is 0, then the default size is used. Otherwise the
alignment needs to be a power of 2 and greater than or equal to page
size.
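
The alignment rule stated above can be sketched as a small check. This only restates the rule from the message; `IsPowerOfTwo` and `IsValidReserveAlignment` are hypothetical helper names, not part of the ROCr API, and the page size is passed in as a parameter by assumption.

```cpp
#include <cassert>
#include <cstddef>

// True when v has exactly one bit set.
bool IsPowerOfTwo(std::size_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Returns true when the alignment would be accepted under the stated rule:
// 0 selects the default; otherwise it must be a power of 2 that is at
// least the page size.
bool IsValidReserveAlignment(std::size_t alignment, std::size_t page_size) {
  assert(IsPowerOfTwo(page_size));  // page sizes are powers of two
  if (alignment == 0) return true;
  return IsPowerOfTwo(alignment) && alignment >= page_size;
}
```

For a 4096-byte page, alignments of 0, 4096, and 2 MiB pass, while 2048 (too small) and 12288 (not a power of 2) are rejected.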

Existing hsa_amd_vmem_address_reserve marked for future deprecation.

Change-Id: I17cee75420183dea5842fc1ecc2514cdcd760bac
Signed-off-by: Chris Freehill <cfreehil@amd.com>


[ROCm/ROCR-Runtime commit: 08c44fbda6]
2024-06-25 12:57:22 -05:00
David Yat Sin ed7cdb88e4 Adding queue information queries
New hsa_amd_queue_get_info API to support:

- HSA_AMD_QUEUE_INFO_AGENT: Agent that owns the underlying HW queue

- HSA_AMD_QUEUE_INFO_DOORBELL_ID: KFD doorbell ID of the queue
completion signal.

Change-Id: I98842131bcbdd08552649791a5d43e578a615808


[ROCm/ROCR-Runtime commit: d6d5786051]
2024-04-11 12:53:48 -04:00
David Yat Sin b19929b090 Temporary: Set AllocateGTTAccess and node_id for MES
Temporary change to set the AllocateGTTAccess flag and node_id
on MES devices.

Change-Id: I22385d11b17b76cfb44278fa0d8a09bc8721cea6


[ROCm/ROCR-Runtime commit: efe455c2fa]
2024-03-29 19:38:19 +00:00
Mythreya e5d4513c7b Initial support for scratch allocation tracking
Add new tools table and functions to notify in case of an event

Change-Id: I47f0c2f3c8e02d7bcb74d649903eb4f86721c154


[ROCm/ROCR-Runtime commit: a67af3807f]
2024-02-07 16:56:52 +00:00
David Yat Sin 66b9fdc2d6 Implement async scratch reclaim
For devices where the CP FW supports asynchronous scratch reclaim, ROCr
is able to claw-back scratch memory that was assigned to an AQL queue.
With that ability, ROCr does not have to rely on using USO
(use-scratch-once) when assigning large amounts of memory to a queue.
If we reach a situation where we are running low on device memory, ROCr
will attempt to claw-back the scratch memory.

Change-Id: Iddf8ec84e37ab8b9fdc58bafbe2b61fe2acb6eb7


[ROCm/ROCR-Runtime commit: dca8f3a21d]
2023-12-04 15:05:22 +00:00
David Yat Sin bf41567189 Add retain handle and get allocation properties
Support function to retain allocation handle for memory mappings.
The get allocation properties function will return the current
allocation properties for existing memory mappings.

This is part of patch series for Virtual Memory API.

Change-Id: I0a53a11b6efc2b5bf9d463512a489a2abd812551


[ROCm/ROCR-Runtime commit: 687eb043d4]
2023-07-21 15:17:01 -04:00
David Yat Sin 0bcc573ed7 Support exporting and importing memory mappings
Support exporting and importing dmabuf file descriptors for memory
mappings. The exported dmabuf file descriptors are shareable POSIX
file descriptors that can be used for cross-vendor, cross-device
and cross-process memory sharing.

This is part of patch series for Virtual Memory API.

Change-Id: I3673fc009f7e73bc26be8349e19f66e20d0607c5


[ROCm/ROCR-Runtime commit: b03c96c264]
2023-07-21 15:17:01 -04:00
David Yat Sin 933aac4cda Support Get and Set access for memory mappings
Mapping memory handles to virtual memory addresses does not make them
accessible. The set access function is needed to make the memory
mappings accessible to specific agents. The get access function
returns current access properties for individual agents.

This is part of patch series for Virtual Memory API.

Change-Id: I152ba0557fd2a802eb9d840568b68cdd1911b72c


[ROCm/ROCR-Runtime commit: 13fbd8a232]
2023-07-21 15:17:01 -04:00
David Yat Sin 203934445a Support mapping and unmapping memory handles
Add support for mapping and unmapping memory handles to virtual
address ranges.

This is part of patch series for Virtual Memory API.

Change-Id: If512d49ff4211e68f2064249add607a3200e458a


[ROCm/ROCR-Runtime commit: 179dcf1c77]
2023-07-21 15:17:01 -04:00
David Yat Sin 15aa42edb5 Support memory handles
Add support for creating and releasing memory handles. Memory
handles are memory allocations on device memory without a virtual
address.

This is part of patch series for Virtual Memory API.

Change-Id: I5dfb162eb1661621cce171b2870a3c93b24d840e


[ROCm/ROCR-Runtime commit: e4a84c4a9c]
2023-07-21 15:17:01 -04:00
David Yat Sin b219d0224d Support Virtual Address reservations
Add support for reserving virtual address ranges. Virtual address
ranges are addresses without any memory backing. These address ranges
need to be mapped to memory handles later.

This is part of patch series for Virtual Memory API.

Change-Id: I5d066e7421d6896f933f524312afc230a13d594e


[ROCm/ROCR-Runtime commit: 1085311f1a]
2023-07-21 15:17:01 -04:00
Philipp Knechtges 4a7c3a2607 fix link-time ordering condition
This fixes a segfault in cases where the linking order of
compilation units varies. The reason for the segfault is that one
global variable in one compilation unit depends on another global
variable in another compilation unit, but there is no guarantee that
this other compilation unit is initialized first. The fix forces a
reinitialization at the first invocation of the library.

Change-Id: I1428592c6898bca13a330c4588941de260ff0370


[ROCm/ROCR-Runtime commit: d220e16000]
2023-06-29 10:08:29 -04:00
Sean Keely deee152909 Add support for exporting portable handles to GPU allocations.
Adds hsa_amd_portable_export_dmabuf and hsa_amd_portable_close_dmabuf
which allow obtaining dmabuf handles to rocr allocations.  These handles
may be shared with other APIs to support cross vendor & cross device
memory sharing.
Adds query to return whether dmabuf export is supported

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>

Change-Id: I7f98501087d9563d07fc2cb428cc886b1e518b1e


[ROCm/ROCR-Runtime commit: 42243c1e8f]
2023-03-06 12:39:01 -05:00
Jonathan Kim ff620e9fdc Add interface to DMA copy directly to a target engine.
Change-Id: Ic87cfeabb11c1a465f98f3f444d39955f5300525


[ROCm/ROCR-Runtime commit: 30920fc94d]
2023-02-13 13:50:49 -05:00
Jonathan Kim f161963c09 Make SDMA engine availability status queryable.
Report the availability of SDMA engines for memory copies.

Change-Id: Ie31b02d6b65355122bb8c98bc73700a59bee166e


[ROCm/ROCR-Runtime commit: 8f27f495c6]
2023-02-13 13:50:49 -05:00
Cordell Bloor 921ccf5f60 Fix static initialization order
Change-Id: I1d51e150b526d050b988fe5a422644667a561cd7


[ROCm/ROCR-Runtime commit: 5873a78d58]
2023-02-09 13:51:08 -05:00
David Yat Sin 93c4ffe473 Add Stream Performance Monitor(SPM) APIs
Change-Id: I0d48782887814ef245b7e0182e2d5570aa8c3f50


[ROCm/ROCR-Runtime commit: 6bfe57aeb2]
2022-12-08 13:56:29 -05:00
Graham Sider ff52cbb201 Make queue memory allocation non-paged
Non-paged allocation for queue memory necessary for binding wptr to
GART. Required to support usermode queue oversubscription with MES for
GFX11.

Adds AllocateNonPaged entry to MemoryRegion::AllocateEnum for clarity;
aliases AllocateIPC.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I1a97a1820da26cf2433d9c237b2e6d2b0b8628b4


[ROCm/ROCR-Runtime commit: 061aa04147]
2022-08-04 11:21:00 -04:00
Sean Keely e4b3eb87e2 Minor interface improvement to pointer info.
Take in const void* rather than void*.  This does not break the
abi or existing code.  Existing code would need to cast away any
const which is unnecessary and annoying.

Change-Id: I28787e8fab1b600bf6871ea82835e10a4f475c5b


[ROCm/ROCR-Runtime commit: 270d042ef8]
2021-08-04 16:43:23 -04:00
Sean Keely 4c6ea88cf5 Add HSA_CU_MASK
New environment variable HSA_CU_MASK allows users to
specify a cu mask to every queue allocated from any
GPU.  hsa_amd_queue_cu_set_mask is restricted from
escaping this mask.
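
One way to keep a per-queue mask from escaping a global mask is a bitwise AND, sketched below. This is an assumption about the mechanism, not the ROCr implementation: `RestrictCuMask` is a hypothetical helper, and the masks are modeled as equal-length arrays of 32-bit words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Restricts a requested CU mask to the bits allowed by a global mask.
// Sets *reduced to true when any requested bit had to be dropped.
std::vector<std::uint32_t> RestrictCuMask(
    const std::vector<std::uint32_t>& requested,
    const std::vector<std::uint32_t>& global, bool* reduced) {
  assert(requested.size() == global.size());
  assert(reduced != nullptr);
  *reduced = false;
  std::vector<std::uint32_t> effective(requested.size());
  for (std::size_t i = 0; i < requested.size(); ++i) {
    effective[i] = requested[i] & global[i];  // keep only globally allowed CUs
    if (effective[i] != requested[i]) *reduced = true;
  }
  return effective;
}
```

Reporting whether bits were dropped lets a caller distinguish a mask applied as requested from one that was silently narrowed.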

A new API hsa_amd_queue_cu_get_mask is added to query
the current cu mask.

Change-Id: I846c03a5faaca9b95067c31db84b59cc9fce2f03


[ROCm/ROCR-Runtime commit: 4455250be1]
2021-07-29 02:23:34 -05:00
Sean Keely 1aae64e251 Revert "Revert SVM and XNACK support."
This reverts commit da41352a93.

Conflicts:
	opensrc/hsa-runtime/core/util/flag.h

Change-Id: I16daf41588e6139126d66af54b0693de2e7e39f3


[ROCm/ROCR-Runtime commit: 77046a1aaa]
2021-04-21 14:49:43 -05:00
Sean Keely da41352a93 Revert SVM and XNACK support.
KFD is not ready yet.

Change-Id: I61deb292ddb92185d33504c2115169888d56e211


[ROCm/ROCR-Runtime commit: 5bd153974d]
2021-04-02 02:10:59 -04:00
Sean Keely dd42ca6dbe Squash merge of cfreehil/amd-temp-gfx90a onto amd-staging.
Includes some workarounds and HMM.
Conflicts:
	opensrc/hsa-runtime/core/runtime/amd_topology.cpp
	opensrc/hsa-runtime/core/util/flag.h

Change-Id: I22976f07964a43dbb228a6231777dbd599112b8d


[ROCm/ROCR-Runtime commit: 7333c77e22]
2021-04-02 02:10:15 -04:00
Sean Keely 4047b1c3a8 Add hsa_amd_signal_value_pointer.
Enables partial signal interop with non-HSA devices.

Change-Id: Ic39bca84ed1709cbd2cc24b1eb0f4fc6cccb39cf


[ROCm/ROCR-Runtime commit: 01f42dbe46]
2021-02-10 18:47:54 -05:00
Sean Keely b6ed5e92bd Make explicit reference between init modules.
Make explicit reference to hsa_api_trace.cpp from
initialization of hsa_table_interface.cpp.  Breaks
the ability to use hsa_table_interface.cpp in plugins.

Change-Id: I22a42d3a132512b0d9ec7a1ca629b169e7f8eba7


[ROCm/ROCR-Runtime commit: f4fe7ddf47]
2020-07-15 16:02:15 -04:00
Sean Keely 0efce64e15 Move tools only table interfaces into namespace rocr.
Change-Id: Ic0b8d958c2d27c921c6955a56110c6cdf5ba5e8e


[ROCm/ROCR-Runtime commit: bd51c61af8]
2020-06-19 22:35:15 -04:00
Ramesh Errabolu b84e4987da Add rocr namespace to core header and impl files
Change-Id: I1e1b33f9bba1078d049bc19797889988c3e43360


[ROCm/ROCR-Runtime commit: fa13208698]
2020-06-19 22:34:21 -04:00
Sean Keely abd712f33f Update copyright date.
Change-Id: If4bf4c20cf051878bfe759080bb7345d884dd53d


[ROCm/ROCR-Runtime commit: ce19721c88]
2020-06-19 22:34:01 -04:00
Ramesh Errabolu 38747b8fec Update how code references publicly available ROCr headers
Change-Id: I357c51eb713a23704d4fee71081be46a73a71806


[ROCm/ROCR-Runtime commit: 627991b1c1]
2020-02-21 20:01:11 -05:00
Sean Keely 872c359ba2 Initial support for deallocation callbacks.
Adds hsa_amd_register_deallocation_callback and hsa_amd_deregister_deallocation_callback
to notify when HSA memory has been released.

Change-Id: I1f33cee250ca890e5c2e7fddfa4479aa5874651d


[ROCm/ROCR-Runtime commit: 299874f17d]
2019-06-26 04:12:17 -05:00
Felix Kuehling d810b66917 Use non-paged memory for IPC signals
Non-paged memory can be IPC-shared even when HSA_USERPTR_FOR_PAGED_MEM
is enabled.

Change-Id: I8b1fa6d7a4a9327c78a77b3679697fbf55397093
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 0c6b9532d4]
2019-04-29 09:20:11 -04:00
Sean Keely 59e91f0be8 Add hsa_amd_memory_lock_to_pool.
Makes malloc memory accessible to GPUs so that the memory has the
capabilities of the pool it is locked to.
This admits fine grained locked memory and reserves API space for any future
special CPU pools.

Change-Id: If8c3dd8582a43f19d3d36b3763c1a688cc419ef0


[ROCm/ROCR-Runtime commit: a535e18cc1]
2019-03-29 01:09:21 -05:00
Sean Keely ed18ee7f38 Add pooling for Signal ABI blocks (SharedSignal).
Makes better use of memory and greatly reduces mmap count.

Change-Id: Ib444cd1ccd144986adbcc7cec297a966e2c08bc7


[ROCm/ROCR-Runtime commit: 8323b2e1d7]
2018-11-12 22:37:28 -06:00
Sean Keely 61b53915d7 Implement SDMA copy rect for gfx9.
Fix pitch overflow due to small element detection.
Add wide pitch 2D copy handling.
Cleanup code duplication.

Change-Id: I93b1584aba8e5964957eb7ab3544df806ca3e2f9


[ROCm/ROCR-Runtime commit: e0839ab27e]
2018-08-29 19:13:07 -04:00
Jay Cornwall 8bd488dfb9 Add hsa_amd_queue_set_priority extension function
Controls dispatch and wavefront scheduling arbitration across queues.

Change-Id: I498f4898b544f79b8fb8514bf7e789ca9da29462


[ROCm/ROCR-Runtime commit: e388a23344]
2018-06-19 19:41:28 -05:00
Qingchuan Shi 286ca924f3 Debug support for queue error.
1/ Revised debug event handler to handle different events.
2/ Added queue error handler using the callback in queue create, which will print out wave info when the queue is in an error state.
3/ Preempt the queue instead of destroying it when the queue is in an error state.

Change-Id: Ib727d208de9caf1c72c76d42268483b24aaebde8


[ROCm/ROCR-Runtime commit: 49d2175c74]
2018-04-20 14:25:16 -04:00
Sean Keely e2efba0676 Exception support for Queue.
Remove "zombie" queue state and report queue creation failure via
exceptions.  Make Shared object a final container and support array
objects with Shared.  Add message printing to hsa_exception in
debug builds.

Change-Id: I459f38c80846018acbf45538874e95f91dd6b195


[ROCm/ROCR-Runtime commit: f312a7386e]
2017-11-08 15:50:02 -05:00
Sean Keely 2406218416 Add queue intercept support to the runtime.
Queue intercept is exposed as two tools-only APIs via the API
intercept table.

Change-Id: Iac9602ed3143974d85c3569e9092295ad18037f8


[ROCm/ROCR-Runtime commit: 0c7dde2d1f]
2017-11-08 15:50:01 -05:00
Qingchuan Shi 3e9a0561c0 Add APIs to support debugging vm fault
1. Add hsa ext api hsa_amd_register_vmfault_handler for debugger to register callback in case of VM fault.
2. Extend hsa_ven_amd_loader API to:
   (1) iterate loaded code objects in executable:
       hsa_ven_amd_loader_executable_iterate_loaded_code_objects
   (2) get loaded code object info:
       hsa_ven_amd_loader_loaded_code_object_get_info
3. Make the id of hsa_queue the same as the one used in communication with thunk (for amd_aql_queue)

Change-Id: I68910809e59e24297350d262606f00e96c14bcbd


[ROCm/ROCR-Runtime commit: ce6aee01ed]
2017-10-28 21:48:26 -04:00