The original version of hsa_amd_portable_export_dmabuf() did not
consider the conditions under which a dmabuf could be shared.
In the new version (hsa_amd_portable_export_dmabuf_v2()), the caller
can specify the flag HSA_AMD_DMABUF_MAPPING_TYPE_PCIE, which means they
want to share the dmabuf over PCIe. In that case, the new code will check
that if it is a PCIe GPU and it is not in a XGMI Hive then if
large-BAR is not supported, we will return an error.
[ROCm/ROCR-Runtime commit: 3a9d14bb66]
- Add a new AMD extension API to return preferred SDMA engine mask.
This can use used in conjunction with copy_on_engine API to get
optimal bandwidth.
[ROCm/ROCR-Runtime commit: 57c0c643ce]
Replaces WaitAny with WaitMultiple to more closely align with the
underlying driver API for waiting on multiple events.
WaitMultiple adds a single parameter, wait_on_all, to the WaitAny
interface providing a single function for waiting on multiple
events when we only need AND and OR semantics for the signal
checking logic.
Change-Id: I68a4a45d48151d9d69aef02fd8f7263b9e6c0e75
[ROCm/ROCR-Runtime commit: 8a38f121ea]
When ROCr is built as a static library, global variables
were often not initialized to valid values at their first
use. This change addresses that problem.
Change-Id: I550fa41feb3bc04b9cc686bcfb4acf2a7b651a88
[ROCm/ROCR-Runtime commit: 9b13bcd0ac]
New API to accept a file stream for logging
Co-authored-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: Ie09c35ae14ca86a97eb25f61251be287c55d7169
Signed-off-by: Chris Freehill <cfreehil@amd.com>
[ROCm/ROCR-Runtime commit: 26e105d9ab]
New API to support alignment parameter when reserving virtual addresses.
If the alignment is 0, then the default size is used. Otherwise the
alignment needs to be a power of 2 and greater than or equal to page
size.
Existing hsa_amd_vmem_address_reserve marked for future deprecation.
Change-Id: I17cee75420183dea5842fc1ecc2514cdcd760bac
Signed-off-by: Chris Freehill <cfreehil@amd.com>
[ROCm/ROCR-Runtime commit: 08c44fbda6]
New hsa_amd_queue_get_info API to support:
- HSA_AMD_QUEUE_INFO_AGENT: Agent that owns the underlying HW queue
- HSA_AMD_QUEUE_INFO_DOORBELL_ID: KFD doorbell ID of the queue
completion signal.
Change-Id: I98842131bcbdd08552649791a5d43e578a615808
[ROCm/ROCR-Runtime commit: d6d5786051]
Temporary change to set the AllocateGTTAccess flag and node_id
on MES devices.
Change-Id: I22385d11b17b76cfb44278fa0d8a09bc8721cea6
[ROCm/ROCR-Runtime commit: efe455c2fa]
Add new tools table and functions to notify in case of an event
Change-Id: I47f0c2f3c8e02d7bcb74d649903eb4f86721c154
[ROCm/ROCR-Runtime commit: a67af3807f]
For devices where the CP FW supports asynchronous scratch reclaim, ROCr
is able to claw-back scratch memory that was assigned to an AQL queue.
With that ability, ROCr does not have to rely on using USO
(use-scratch-once) when assigning large amounts of memory to a queue.
If we reach a situation where we are running low on device memory, ROCr
will attempt to claw-back the scratch memory.
Change-Id: Iddf8ec84e37ab8b9fdc58bafbe2b61fe2acb6eb7
[ROCm/ROCR-Runtime commit: dca8f3a21d]
Support function to retain allocation handle for memory mappings.
The get allocation properties function will return the current
allocation properties for existing memory mappings.
This is part of patch series for Virtual Memory API.
Change-Id: I0a53a11b6efc2b5bf9d463512a489a2abd812551
[ROCm/ROCR-Runtime commit: 687eb043d4]
Support exporting and importing dmabuf file descriptors for memory
mappings. The exported dmabuf file descriptors are shareable posix
file descriptors that can be used for cross-vendor, cross-device
and cross-process memory sharing.
This is part of patch series for Virtual Memory API.
Change-Id: I3673fc009f7e73bc26be8349e19f66e20d0607c5
[ROCm/ROCR-Runtime commit: b03c96c264]
Mapping memory handles to virtual memory addresses do not make them
accessible. The set access function is needed to make the memory
mappings accessible to specific agents. The get access function
returns current access properties for individual agents.
This is part of patch series for Virtual Memory API.
Change-Id: I152ba0557fd2a802eb9d840568b68cdd1911b72c
[ROCm/ROCR-Runtime commit: 13fbd8a232]
Add support for mapping and unmapping memory handles to virtual
address ranges.
This is part of patch series for Virtual Memory API.
Change-Id: If512d49ff4211e68f2064249add607a3200e458a
[ROCm/ROCR-Runtime commit: 179dcf1c77]
Add support for creating and releasing memory handles. Memory
handles are memory allocations on device memory without a virtual
address.
This is part of patch series for Virtual Memory API.
Change-Id: I5dfb162eb1661621cce171b2870a3c93b24d840e
[ROCm/ROCR-Runtime commit: e4a84c4a9c]
Add support for reserving virtual address ranges. Virtual address
ranges are addresses without any memory backing. These address ranges
need to be mapped to memory handles later.
This is part of patch series for Virtual Memory API.
Change-Id: I5d066e7421d6896f933f524312afc230a13d594e
[ROCm/ROCR-Runtime commit: 1085311f1a]
This fixes a segfault error in cases where the linking order of
compilation unit varies. Reason behind the segfault is that one
global variable in one compilation unit depends on another global
variable in another compilation unit, but there is no guarantee that
this other compilation unit is initialized first. The fix forces a
reinitialization at the first invocation of the library.
Change-Id: I1428592c6898bca13a330c4588941de260ff0370
[ROCm/ROCR-Runtime commit: d220e16000]
Adds hsa_amd_portable_export_dmabuf and hsa_amd_portable_close_dmabuf
which allow obtaining dmabuf handles to rocr allocations. These handles
may be shared with other APIs to support cross vendor & cross device
memory sharing.
Adds query to return whether dmabuf export is supported
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: I7f98501087d9563d07fc2cb428cc886b1e518b1e
[ROCm/ROCR-Runtime commit: 42243c1e8f]
Non-paged allocation for queue memory necessary for binding wptr to
GART. Required to support usermode queue oversubscription with MES for
GFX11.
Adds AllocateNonPaged entry to MemoryRegion::AllocateEnum for clarity;
aliases AllocateIPC.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I1a97a1820da26cf2433d9c237b2e6d2b0b8628b4
[ROCm/ROCR-Runtime commit: 061aa04147]
Take in const void* rather than void*. This does not break the
abi or existing code. Existing code would need to cast away any
const which is unnecessary and annoying.
Change-Id: I28787e8fab1b600bf6871ea82835e10a4f475c5b
[ROCm/ROCR-Runtime commit: 270d042ef8]
New environment variable HSA_CU_MASK allows users to
specify a cu mask to every queue allocated from any
GPU. hsa_amd_queue_cu_set_mask is restricted from
escaping this mask.
A new API hsa_amd_queue_cu_get_mask is added to query
the current cu mask.
Change-Id: I846c03a5faaca9b95067c31db84b59cc9fce2f03
[ROCm/ROCR-Runtime commit: 4455250be1]
Includes some workarounds and HMM.
Conflicts:
opensrc/hsa-runtime/core/runtime/amd_topology.cpp
opensrc/hsa-runtime/core/util/flag.h
Change-Id: I22976f07964a43dbb228a6231777dbd599112b8d
[ROCm/ROCR-Runtime commit: 7333c77e22]
Make explicit reference to hsa_api_trace.cpp from
initialization of hsa_table_interface.cpp. Breaks
the ability to use hsa_table_interface.cpp in plugins.
Change-Id: I22a42d3a132512b0d9ec7a1ca629b169e7f8eba7
[ROCm/ROCR-Runtime commit: f4fe7ddf47]
Adds hsa_amd_register_deallocation_callback and hsa_amd_deregister_deallocation_callback
to notify when HSA memory has been released.
Change-Id: I1f33cee250ca890e5c2e7fddfa4479aa5874651d
[ROCm/ROCR-Runtime commit: 299874f17d]
Non-paged memory can be IPC-shared even when HSA_USERPTR_FOR_PAGED_MEM
is enabled.
Change-Id: I8b1fa6d7a4a9327c78a77b3679697fbf55397093
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 0c6b9532d4]
Makes malloc memory accessible to GPUs so that the memory has the
capabilities of the pool it is locked to.
This admits fine grained locked memory and reserves API space for any future
special CPU pools.
Change-Id: If8c3dd8582a43f19d3d36b3763c1a688cc419ef0
[ROCm/ROCR-Runtime commit: a535e18cc1]
Fix pitch overflow due to small element detection.
Add wide pitch 2D copy handling.
Cleanup code duplication.
Change-Id: I93b1584aba8e5964957eb7ab3544df806ca3e2f9
[ROCm/ROCR-Runtime commit: e0839ab27e]
1/ Revised debug event handler to handle different events.
2/ Added queue error handler using the callback in queue create, which will print out wave info when queue in error state.
3/ Preempt queue instead of destory queue when queue error state.
Change-Id: Ib727d208de9caf1c72c76d42268483b24aaebde8
[ROCm/ROCR-Runtime commit: 49d2175c74]
Remove "zombie" queue state and report queue creation failure via
exceptions. Make Shared object a final container and support array
objects with Shared. Add message printing to hsa_exception in
debug builds.
Change-Id: I459f38c80846018acbf45538874e95f91dd6b195
[ROCm/ROCR-Runtime commit: f312a7386e]
Queue intercept is exposed as two tools-only APIs via the API
intercept table.
Change-Id: Iac9602ed3143974d85c3569e9092295ad18037f8
[ROCm/ROCR-Runtime commit: 0c7dde2d1f]
1. Add hsa ext api hsa_amd_register_vmfault_handler for debugger to register callback in case of VM fault.
2. Extend hsa_ven_amd_loader API to:
(1) iterate loaded code objects in executable:
hsa_ven_amd_loader_executable_iterate_loaded_code_objects
(2) get loaded code object info:
hsa_ven_amd_loader_loaded_code_object_get_info
3. Make the id of hsa_queue the same as the one used in communication with thunk (for amd_aql_queue)
Change-Id: I68910809e59e24297350d262606f00e96c14bcbd
[ROCm/ROCR-Runtime commit: ce6aee01ed]
Added an API for creating signals with attributes.
Added two APIs for IPC operations on signals.
Initial use of exceptions for error handling.
Add ref counting to signals.
Removed spin loops from signal destructors.
Signals are no longer to be destroyed with delete, use DeleteSignal instead.
Added delete safety to doorbells.
Added secondary hsa_signal_t -> Signal* translation path for IPC enabled signals.
Change-Id: Id59065d002f0c2566b0a9425694da2ed27cb7d7f
[ROCm/ROCR-Runtime commit: c9642cf7af]
Also separate signal ABI block allocations from the runtime interface object.
Change-Id: If16763338db664f29163a1348f8f4c38cf0597b2
[ROCm/ROCR-Runtime commit: 2732b18092]