Fix pitch overflow due to small element detection.
Add wide pitch 2D copy handling.
Cleanup code duplication.
Change-Id: I93b1584aba8e5964957eb7ab3544df806ca3e2f9
Adds env flag HSA_REV_COPY_DIR. If set to 1 async copy will
copy from dst device to src device rather than from src to dst.
Change-Id: I3095642066fa026dc112c2eac06db9393341cd7e
Remove "zombie" queue state and report queue creation failure via
exceptions. Make Shared object a final container and support array
objects with Shared. Add message printing to hsa_exception in
debug builds.
Change-Id: I459f38c80846018acbf45538874e95f91dd6b195
GCC can't reasonably be told that the lock ptr isn't null. Adding a private bool
allows the branch to be eliminated, along with the bool.
Change-Id: I0605d69474d6a6e6951be93c0af1d8caf3f77124
Track pointer info for sub 2MB fragment allocations in allocation_map_.
Add fragment support to IPC.
Change-Id: I00cfc2e2fa289aac90a4718c392f9bb056a61a87
When a fatal memory fault occurs the scheduler context-saves all queues
in the process and notifies the runtime through the memory event. The
saved state contains all GPR/LDS data at the moment of the fault.
Retrieve this state and present it to the user if HSA_DEBUG_FAULT is set
to "analyze" and the wavefront caused the fault. If amdgcn-capable objdump
is in the PATH invoke this to disassemble code around the PC.
Queue lifetime is now managed by the runtime to allow querying the
context save state for all active queues.
Change-Id: I6fee662fad1c4f9aa125bf5c53d7d0ea1ab32f95
- Build system fixes
- No user-mode high-precision timer by default, use clock_gettime
- Use C11 aligned_alloc pending C++17 std::aligned_alloc
Change-Id: I268365bdfd11d1e817a89584b9e086ee5b86e1dc
Also emit error messages to stderr if no async queue error callback was registered and queue fault messages are enabled (on by default).
Queue fault messages are controlled with env key HSA_ENABLE_QUEUE_FAULT_MESSAGE.
Change-Id: I496487b8d048b83aa95b9784e92928211f167b17
C11 atomics are not statically guaranteed to be lock free and so
may not be atomic with respect to atomic operations originating
outside the standard library, such as platform atomics.
C11 macros to statically discover always lock free operations
(ATOMIC_*_LOCK_FREE) do not cover uint64_t in GCC and
std::atomic<uint64_t> is not a type alias of any covered type.
All use of __atomic by atomic_helpers.h is statically checked to be
always lock free.
GCC builtin fencing does not appear to be strong enough for WC memory.
Added an option (enabled) to enforce consistency for WC memory on x64.
__sync builtin's were not used as they were declared legacy by GCC.
Added a strongly conservative option (ALWAYS_CONSERVATIVE) to enable
use of full memory fences in place of partial fences and compiler
driven processor specific optimization.
Change-Id: Id7aaaca626144070f58759f6a348cbee4612bbc0
This option was disabled by default to address issues writing to stderr
in Windows applications. The lack of an error message for memory access
faults is confusing to users, however.
Enable the error message by default on Linux only.
Change-Id: I1f44ba42362f8874abdc7c8e63ddd54a855b5394