Temporary measure. Must be reverted once CRAT tables have been fixed.
Change-Id: Id2f2673edbf7b6fc5752f8d871042b4bf4de653c
[ROCm/ROCR-Runtime commit: b49e5b4917]
Create an interface for doorbell signals to reduce code duplication.
No functional changes.
Change-Id: I101a8997dd582ff99e1537758c804b21fe3bb6af
[ROCm/ROCR-Runtime commit: d2e70bb999]
Prefer using memfd_create() for the ring buffer.
We were using /dev/shm, but this does not work on systems that
either lack /dev/shm or mount it with noexec, because for
everything other than gfx700 we map the ring buffer with PROT_EXEC.
memfd_create() is Linux-specific and was added in Linux 3.17, so we
fall back to /dev/shm on systems where memfd_create() is not
available.
Change-Id: I58fb533eebc362f6d29dc3e316a80801014d50e8
[ROCm/ROCR-Runtime commit: b93ffafdc7]
Correct the semantics used in hsa_queue_load_write_index_relaxed.
The semantics previously used in hsa_queue_load_write_index_relaxed
did not seem to match the name of the function.
Also remove a useless return keyword.
Change-Id: If3819d38fb367f122fc382edf8ee3771a23279ae
[ROCm/ROCR-Runtime commit: 5872b618de]
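
The commit above does not say which semantics was wrong. Assuming it refers to memory ordering, the distinction a "_relaxed" suffix implies can be sketched with std::atomic. The QueueIndices type and both function names below are hypothetical stand-ins, not the runtime's actual types.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a queue's write index; not the real ROCR type.
struct QueueIndices {
  std::atomic<uint64_t> write_index{0};
};

// A "relaxed" load: atomic, but imposes no ordering on surrounding
// memory operations.
uint64_t LoadWriteIndexRelaxed(const QueueIndices& q) {
  return q.write_index.load(std::memory_order_relaxed);
}

// An "acquire" load, shown for contrast: later reads and writes in this
// thread cannot be reordered before it.
uint64_t LoadWriteIndexAcquire(const QueueIndices& q) {
  return q.write_index.load(std::memory_order_acquire);
}
```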
Remove "zombie" queue state and report queue creation failure via
exceptions. Make Shared object a final container and support array
objects with Shared. Add message printing to hsa_exception in
debug builds.
Change-Id: I459f38c80846018acbf45538874e95f91dd6b195
[ROCm/ROCR-Runtime commit: f312a7386e]
Queue intercept is exposed as two tools-only APIs via the API
intercept table.
Change-Id: Iac9602ed3143974d85c3569e9092295ad18037f8
[ROCm/ROCR-Runtime commit: 0c7dde2d1f]
1. Add the HSA extension API hsa_amd_register_vmfault_handler so that a debugger can register a callback for VM faults.
2. Extend the hsa_ven_amd_loader API to:
(1) iterate the loaded code objects in an executable:
hsa_ven_amd_loader_executable_iterate_loaded_code_objects
(2) get loaded code object info:
hsa_ven_amd_loader_loaded_code_object_get_info
3. Make the id of hsa_queue the same as the one used in communication with the thunk (for amd_aql_queue).
Change-Id: I68910809e59e24297350d262606f00e96c14bcbd
[ROCm/ROCR-Runtime commit: ce6aee01ed]
Adds the thunk include and lib paths to the cache, removes paths
to indicator files from the cache, uses the cached path directory
(if any) as a search hint for indicator files.
Change-Id: I0859faa8d229a97abfaacb408d2c831e317aed5f
[ROCm/ROCR-Runtime commit: a8d818a6bc]
TensorFlow was running out of VRAM because allocations from the
legacy memory APIs were being padded up. These allocations now go
through the fragment allocator to improve VRAM utilization.
Change-Id: Ic680fff576a0434b3b17a4c91746da44e09957fa
[ROCm/ROCR-Runtime commit: 4f299a9909]
one or both directions. Users can enumerate the pools reported
by the system to specify which pools serve as source / destination.
Change-Id: I8e6d0adb3743b3328dd3ce9152762ca840ea613b
[ROCm/ROCR-Runtime commit: c2caa5ae2c]
Since access may only be manipulated on whole pages, suballocator fragments must cooperate to set the page's access.
Since the KFD does not migrate memory on access changes, this implementation makes agent access sticky across the requests in a fragmented page.
Change-Id: I88479ed45fb40e9782b704526a7b8ffb22e7bd76
[ROCm/ROCR-Runtime commit: e9a6f2c3e6]
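
The "sticky" page access described in the commit above can be illustrated with a small bookkeeping sketch. It assumes agents are represented as bits in a mask and that access, once granted on a page, is never cleared. The PageAccessTracker class and its methods are hypothetical and are not the runtime's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical tracker: records, per page, the union of all agent masks
// ever requested by fragments that live on that page.
class PageAccessTracker {
 public:
  explicit PageAccessTracker(std::size_t page_size) : page_size_(page_size) {}

  // Grants `agent_mask` on every page touched by [addr, addr + size).
  // Bits are only ever ORed in, so access is sticky across requests.
  void Grant(std::uintptr_t addr, std::size_t size, std::uint32_t agent_mask) {
    if (size == 0) return;
    const std::uintptr_t first = addr / page_size_;
    const std::uintptr_t last = (addr + size - 1) / page_size_;
    for (std::uintptr_t page = first; page <= last; ++page) {
      page_access_[page] |= agent_mask;
    }
  }

  // Returns the accumulated agent mask for the page containing `addr`.
  std::uint32_t AccessFor(std::uintptr_t addr) const {
    auto it = page_access_.find(addr / page_size_);
    return it == page_access_.end() ? 0u : it->second;
  }

 private:
  std::size_t page_size_;
  std::unordered_map<std::uintptr_t, std::uint32_t> page_access_;
};
```

With a 4096-byte page, two fragments at 0x1000 and 0x1100 that request agents 0x1 and 0x2 respectively leave the shared page accessible to both agents, which is the cooperation between fragments that the commit message refers to.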