Prefer using memfd_create() for the ring buffer.
We were using /dev/shm, but this won't work on systems that
either don't have /dev/shm or have mounted it with noexec, because
for everything other than gfx700 we map the ring buffer with PROT_EXEC.
memfd_create() is Linux specific and was added in Linux 3.17, so we
will fallback to using /dev/shm on systems where memfd_create() is
not available.
Change-Id: I58fb533eebc362f6d29dc3e316a80801014d50e8
Corrected semantics used in hsa_queue_load_write_index_relaxed.
The semantics that was used in hsa_queue_load_write_index_relaxed
didn't seem to match the name of the function.
I also removed a useless return keyword.
Change-Id: If3819d38fb367f122fc382edf8ee3771a23279ae
Remove "zombie" queue state and report queue creation failure via
exceptions. Make Shared object a final container and support array
objects with Shared. Add message printing to hsa_exception in
debug builds.
Change-Id: I459f38c80846018acbf45538874e95f91dd6b195
1. Add hsa ext api hsa_amd_register_vmfault_handler for debugger to register callback in case of VM fault.
2. Extend hsa_ven_amd_loader API to:
(1) iterate loaded code objects in executable:
hsa_ven_amd_loader_executable_iterate_loaded_code_objects
(2) get loaded code object info:
hsa_ven_amd_loader_loaded_code_object_get_info
3. Make the id of hsa_queue the same as the one used in communication with thunk (for amd_aql_queue)
Change-Id: I68910809e59e24297350d262606f00e96c14bcbd
Adds the thunk include and lib paths to the cache, removes paths
to indicator files from the cache, uses the cached path directory
(if any) as a search hint for indicator files.
Change-Id: I0859faa8d229a97abfaacb408d2c831e317aed5f
TensorFlow was running out of VRAM due to padding up allocations
from legacy memory APIs. These allocations have been added to
the fragment allocator to improve VRAM utilization.
Change-Id: Ic680fff576a0434b3b17a4c91746da44e09957fa
Since access may only be manipulated on whole pages, suballocator fragments must cooperate to set the page's access.
Since the KFD does not migrate memory on access changes this implementation makes agent access sticky across the requests in a fragmented page.
Change-Id: I88479ed45fb40e9782b704526a7b8ffb22e7bd76
GCC can't reasonably be told that the lock ptr isn't null. Adding a private bool
allows the branch to be eliminated, along with the bool.
Change-Id: I0605d69474d6a6e6951be93c0af1d8caf3f77124
Track pointer info for sub 2MB fragment allocations in allocation_map_.
Add fragment support to IPC.
Change-Id: I00cfc2e2fa289aac90a4718c392f9bb056a61a87
Added an API for creating signals with attributes.
Added two APIs for IPC operations on signals.
Initial use of exceptions for error handling.
Add ref counting to signals.
Removed spin loops from signal destructors.
Signals are no longer to be destroyed with delete, use DeleteSignal instead.
Added delete safety to doorbells.
Added secondary hsa_signal_t -> Signal* translation path for IPC enabled signals.
Change-Id: Id59065d002f0c2566b0a9425694da2ed27cb7d7f