LinkInfo is already initialized to zero in its default constructor.
Change-Id: Ifa4fb886cce9b474c6879c9c82744044ab394082
[ROCm/ROCR-Runtime commit: 2843988dd7]
Remove fence pool and use two signals. Using two signals allows overlapped
submission and copy while reducing thread busy polling.
Change-Id: Idb5f8e4c7f482a596ffce9e7799191fdd785a216
[ROCm/ROCR-Runtime commit: 56ed5c8904]
Fix pitch overflow due to small element detection.
Add wide pitch 2D copy handling.
Clean up code duplication.
Change-Id: I93b1584aba8e5964957eb7ab3544df806ca3e2f9
[ROCm/ROCR-Runtime commit: e0839ab27e]
Can only check that the signal has some timestamp; cannot check whether
the translating agent matches the last used agent.
Change-Id: I62943a864318808059c617280bb65a269dfadd1b
[ROCm/ROCR-Runtime commit: aca00b7238]
Adds HSA_AMD_SYSTEM_INFO_BUILD_VERSION=0x200 to hsa_system_info_t.
This returns a const char* pointing at the build string (git describe).
Change-Id: I73e6612482bf6ffc4037fd365808eb9211a650ad
[ROCm/ROCR-Runtime commit: cd8e5c1da8]
Adds env flag HSA_REV_COPY_DIR. If set to 1, async copy will
copy from dst device to src device rather than from src to dst.
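
A minimal sketch of how such a flag could be read and applied, taking the description above at face value. The Agent struct and both helper functions are hypothetical names used for illustration only and are not the runtime's code; only the variable name HSA_REV_COPY_DIR comes from this change.

```cpp
#include <cstdlib>
#include <cstring>
#include <utility>

// Hypothetical stand-in for a device handle; not the runtime's agent type.
struct Agent {
  int id;
};

// Returns true only when HSA_REV_COPY_DIR is set to exactly "1".
// Assumption: any other value, or an unset variable, leaves the default.
inline bool RevCopyDirEnabled() {
  const char* value = std::getenv("HSA_REV_COPY_DIR");
  return value != nullptr && std::strcmp(value, "1") == 0;
}

// Returns the (from, to) pair for an async copy.
// With the flag set, the two devices are swapped.
inline std::pair<Agent, Agent> CopyDirection(Agent src, Agent dst) {
  if (RevCopyDirEnabled()) {
    return {dst, src};
  }
  return {src, dst};
}
```

The environment is read on every call here to keep the sketch short; a real runtime would more likely read the flag once at startup.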
Change-Id: I3095642066fa026dc112c2eac06db9393341cd7e
[ROCm/ROCR-Runtime commit: 6c47780620]
Conserves VMIDs when multiple processes are in use and memory operations
are not GPU specific. For instance, the HIP API hipHostMalloc does not accept
a target GPU, so when used with one process per GPU (i.e. GPU == MPI rank) we can
quickly exceed the available VMID slots if every process consumes a VMID on
every GPU.
Change-Id: Ib6fa051290089f71581029c09f9a44b9992237d1
[ROCm/ROCR-Runtime commit: 35a270ef7e]
SDMA will use atomic completion fences if KFD reports 64-bit atomic support.
Otherwise it will fall back to store completion fences.
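
The selection described above can be modeled on the host roughly as follows. This is only an illustrative sketch: FenceKind, SelectFence and CompleteCopy are hypothetical names, and std::atomic stands in for the completion signal; it is not the SDMA packet code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical names for the two completion fence styles.
enum class FenceKind { kAtomic, kStore };

// Picks the fence style from the reported 64-bit atomic capability.
inline FenceKind SelectFence(bool has_64bit_atomics) {
  return has_64bit_atomics ? FenceKind::kAtomic : FenceKind::kStore;
}

// Host-side model of what each fence style does to a completion signal.
inline void CompleteCopy(std::atomic<int64_t>& signal, FenceKind kind,
                         int64_t store_value) {
  if (kind == FenceKind::kAtomic) {
    // Atomic fence: read-modify-write, so the result depends on the
    // signal's current value.
    signal.fetch_sub(1);
  } else {
    // Store fence: plain write of a value computed ahead of time.
    signal.store(store_value);
  }
}
```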
Change-Id: I12b76f8a74ec3ee96372c250f9824d846051536e
[ROCm/ROCR-Runtime commit: 3e3aa37750]
These fixes are needed to find the hsakmt headers and libraries after
an upcoming hsakmt build system cleanup. The build should continue to work
with the original hsakmt build system.
Change-Id: I6b3fcea8f2588698c130c9ec50952c66712afa6c
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 5f25d024a8]
Disable some tests that rely on features not typically available
in the emulator, and use smaller data and iteration sets.
Change-Id: I587bf83162b114719e0361109ed44c6bf2adf34c
[ROCm/ROCR-Runtime commit: 2c8cbf61c3]
Avoids using non-atomic SDMA fences by default since that path can duplicate fences.
If HSA_ENABLE_SDMA is set, it overrides copy path selection and may use
non-atomic fences.
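
One way to picture the default and the override is sketched below. CopyPath and SelectCopyPath are hypothetical names, and the handling of the variable's value ("0" disables SDMA, anything else enables it) is an assumption made for illustration, not something this change states.

```cpp
#include <cstring>

// Hypothetical labels for the two copy paths.
enum class CopyPath { kSdma, kNonSdma };

// env_value holds the contents of HSA_ENABLE_SDMA, or nullptr when unset.
inline CopyPath SelectCopyPath(bool atomic_fences_available,
                               const char* env_value) {
  if (env_value != nullptr) {
    // Explicit override: honored even if only non-atomic fences exist.
    // Assumption: "0" disables SDMA, any other value enables it.
    return std::strcmp(env_value, "0") == 0 ? CopyPath::kNonSdma
                                            : CopyPath::kSdma;
  }
  // Default: use SDMA only when atomic fences are available.
  return atomic_fences_available ? CopyPath::kSdma : CopyPath::kNonSdma;
}
```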
Change-Id: I4747e9a766f7f649d21ddf6bfded047ac26fd60e
[ROCm/ROCR-Runtime commit: c593dfc6bf]
llvm.debugtrap and other trap IDs are reserved and should not place
the queue into an error state.
Change-Id: I98193a35ac7da94c4a42ee75d87754ee552ebea0
[ROCm/ROCR-Runtime commit: 536823482b]
Ensure the system release fence is set on GFX8 packets that use large scratch.
Change-Id: I13cfdcd35969482ea6e95e0b352f5cb3a0454b86
[ROCm/ROCR-Runtime commit: 5f25619bb7]
Use the async signal handler to satisfy dependencies for SDMA blits.
Change-Id: Ifa8d3ee6810509f400a568ca2387ac6ab3ab7c36
[ROCm/ROCR-Runtime commit: 7cd6e366ed]