Note: The implementation is the same as the 1.0 APIs for now.
A follow-up change will contain the complete implementation.
Change-Id: Ife633f74ff27eee0bb9b0c46952cf5233b0114e8
Also emit error messages to stderr if no async queue error callback was registered and queue fault messages are enabled (on by default).
Queue fault messages are controlled with env key HSA_ENABLE_QUEUE_FAULT_MESSAGE.
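A minimal sketch of the fallback described above, assuming the key is treated as enabled unless it is set to "0". The helper names, the callback type, and the message text are hypothetical and are not the runtime's actual code.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: fault messages are on by default and are switched
// off only when the env key is set to "0" (the parsing rule is an assumption).
inline bool QueueFaultMessagesEnabled() {
  const char* value = std::getenv("HSA_ENABLE_QUEUE_FAULT_MESSAGE");
  if (value == nullptr) return true;
  return std::strcmp(value, "0") != 0;
}

// Hypothetical callback type for this sketch; not the HSA signature.
using QueueErrorCallback = void (*)(int status, void* data);

// Forwards the fault to the registered callback if there is one.
// Otherwise prints to stderr when fault messages are enabled.
// Returns true only when a message was written to stderr.
inline bool ReportQueueFault(int status, QueueErrorCallback callback,
                             void* data) {
  if (callback != nullptr) {
    callback(status, data);
    return false;
  }
  if (!QueueFaultMessagesEnabled()) return false;
  std::fprintf(stderr, "Queue fault: status 0x%x\n",
               static_cast<unsigned>(status));
  return true;
}
```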
Change-Id: I496487b8d048b83aa95b9784e92928211f167b17
Uncommented HSA IPC code.
Changed hsa_amd_ipc_memory_t to hold 8 uint32_t values instead of 9 to
match the spec.
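One way to keep the handle size from drifting again is a compile-time check. The struct below is an illustrative stand-in; its name and field name are assumptions, not the real declaration.

```cpp
#include <cstdint>

// Illustrative stand-in for hsa_amd_ipc_memory_t (names are assumptions).
struct IpcMemoryHandleSketch {
  uint32_t handle[8];
};

// Fails the build if the handle is not exactly eight 32-bit words.
static_assert(sizeof(IpcMemoryHandleSketch) == 8 * sizeof(uint32_t),
              "IPC handle must be exactly eight 32-bit words");
```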
Change-Id: Id1523125e9b876a23c3743df1be29c98b47f6725
Ensure that the write index and ring buffer contents are visible
to the HW before sending the doorbell. The latter is a write-combined
MMIO store and must be ordered with prior cacheable non-MMIO stores.
Also be more explicit about memory semantics for doorbell stores.
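The ordering requirement can be sketched roughly as follows. The queue layout, the function name, and the value written to the doorbell are assumptions for illustration. A real write-combined doorbell on x86-64 may need a stronger barrier (for example sfence) than the portable fence shown here.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical queue layout; not the runtime's real types.
constexpr size_t kRingSlots = 16;

struct SketchQueue {
  uint64_t ring[kRingSlots] = {};           // packet slots, cacheable memory
  std::atomic<uint64_t> write_index{0};     // cacheable
  std::atomic<uint64_t>* doorbell = nullptr;  // stands in for the MMIO doorbell
};

inline void SubmitPacket(SketchQueue& q, uint64_t packet) {
  const uint64_t index = q.write_index.load(std::memory_order_relaxed);
  q.ring[index % kRingSlots] = packet;
  q.write_index.store(index + 1, std::memory_order_relaxed);

  // Make the ring contents and the write index visible before the doorbell.
  std::atomic_thread_fence(std::memory_order_release);

  // Explicit memory semantics on the doorbell store itself.
  // Writing the packet index is an assumption of this sketch.
  q.doorbell->store(index, std::memory_order_release);
}
```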
Change-Id: Ie4d96a7ee2a507237a8dbe7705fdf234d62ce9ba
If we issue too many copy commands without syncing and wrapping occurs,
we need to wait for the blits to finish before moving forward; otherwise
we will overwrite the kernel args of the blits in flight.
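A rough sketch of the idea, assuming the kernel args live in a fixed-size ring that is bump-allocated. The class name and the wait hook are hypothetical; the actual blit code is not shown in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical bump allocator over a fixed kernel-argument ring.
class KernargRingSketch {
 public:
  KernargRingSketch(size_t capacity, std::function<void()> wait_for_blits)
      : capacity_(capacity), wait_for_blits_(std::move(wait_for_blits)) {}

  // Returns the byte offset of a slot of `size` bytes.
  size_t Allocate(size_t size) {
    assert(size <= capacity_);
    if (offset_ + size > capacity_) {
      // Wrapping: earlier slots may still be read by blits in flight,
      // so wait for them before reusing the start of the ring.
      wait_for_blits_();
      offset_ = 0;
    }
    const size_t slot = offset_;
    offset_ += size;
    return slot;
  }

 private:
  size_t capacity_;
  size_t offset_ = 0;
  std::function<void()> wait_for_blits_;
};
```

Waiting only at the wrap point keeps the common path free of synchronization, at the cost of a full drain each time the ring is exhausted.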
Change-Id: I9a21e31ce07f8e8157ca38e96dc264ff47fd3639
Introduce a tiling format for images; LINEAR is still used for now.
Use the new KFD/Thunk API hsaKmtGetTileConfig for the address library.
Change-Id: Ic0677429dd320eef09ab62dddaf9b2dd94c4f904
C11 atomics are not statically guaranteed to be lock free and so
may not be atomic with respect to atomic operations originating
outside the standard library, such as platform atomics.
C11 macros to statically discover always lock free operations
(ATOMIC_*_LOCK_FREE) do not cover uint64_t in GCC and
std::atomic<uint64_t> is not a type alias of any covered type.
All use of __atomic by atomic_helpers.h is statically checked to be
always lock free.
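As an illustration of such a static check, the sketch below uses the GCC/Clang builtin __atomic_always_lock_free together with the __atomic load and store builtins. The wrapper names are hypothetical and do not reflect the contents of atomic_helpers.h.

```cpp
#include <cstdint>

// Refuse to compile on targets where 64-bit atomics might take a lock.
// The second argument 0 means "assume typical alignment for this size".
static_assert(__atomic_always_lock_free(sizeof(uint64_t), 0),
              "64-bit atomics must be always lock free on this target");

// Hypothetical wrappers over the __atomic builtins.
inline uint64_t LoadAcquire(const uint64_t* ptr) {
  return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
}

inline void StoreRelease(uint64_t* ptr, uint64_t value) {
  __atomic_store_n(ptr, value, __ATOMIC_RELEASE);
}
```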
GCC builtin fencing does not appear to be strong enough for WC memory.
Added an option (enabled) to enforce consistency for WC memory on x64.
__sync builtins were not used because GCC declares them legacy.
Added a strongly conservative option (ALWAYS_CONSERVATIVE) to enable
use of full memory fences in place of partial fences and compiler-driven,
processor-specific optimization.
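A conservative switch of this kind might look roughly like the sketch below. The macro SKETCH_ALWAYS_CONSERVATIVE and the function names are hypothetical and only model the idea of promoting every partial fence to a full one.

```cpp
#include <atomic>

// Hypothetical switch modeled on ALWAYS_CONSERVATIVE; off in this sketch.
#ifndef SKETCH_ALWAYS_CONSERVATIVE
#define SKETCH_ALWAYS_CONSERVATIVE 0
#endif

// Promotes any requested ordering to seq_cst when the switch is on.
constexpr std::memory_order EffectiveOrder(std::memory_order requested) {
  return SKETCH_ALWAYS_CONSERVATIVE ? std::memory_order_seq_cst : requested;
}

// Issues a fence with the effective ordering.
inline void Fence(std::memory_order requested) {
  std::atomic_thread_fence(EffectiveOrder(requested));
}
```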
Change-Id: Id7aaaca626144070f58759f6a348cbee4612bbc0
Change hsa_code_object_serialize and hsa_code_object_deserialize to use memcpy instead of hsa_memory_copy, since this is a system-to-system copy.
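The shape of the change, sketched with hypothetical names: both the source object and the serialized buffer are host memory, so a plain memcpy is enough. The allocation callback type here is simplified and does not match the real HSA callback signature.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical allocation callback; the real HSA callback differs.
using AllocFn = void* (*)(size_t size);

// Copies `size` bytes of a host-resident object into a freshly allocated
// host buffer. Returns false on bad arguments or allocation failure.
inline bool SerializeBlobSketch(const void* object, size_t size, AllocFn alloc,
                                void** out, size_t* out_size) {
  if (object == nullptr || alloc == nullptr || out == nullptr ||
      out_size == nullptr) {
    return false;
  }
  void* buffer = alloc(size);
  if (buffer == nullptr) return false;
  std::memcpy(buffer, object, size);  // system memory to system memory
  *out = buffer;
  *out_size = size;
  return true;
}
```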
Change-Id: I329e270ae4e2fc25e177dc8080d93662ffb261ab
- Includes Sean's latest changes
- Cleanups/improvements
- Fixes for a few bugs carried over from previous releases
Change-Id: I839dc4895bf13ebd0afc8843424387a9fef667b0
The PM4 IB must have executable permission.
A second part of this fix concerns robustness when this is not the case.
This remains under investigation.
This fix will shortly be cleaned up in a refactoring pass to consolidate
calls to hsaKmtAllocMemory.
Change-Id: I326fe01949a77669e0b07c3cadc9fd44b8065055