Use the new ExtendedCoherent KFD HSA memory flag to achieve system
scope coherence on atomic instructions. Non-compliant systems may
need to perform explicit HDP flushes to achieve system scope
coherence with this flag.
Change-Id: Ic6b47c0e97285086fa1f52bbfa4597b81cadafeb
Some negative tests can trigger C++ exceptions, which cause the
code to leave the ref counts in an inconsistent state.
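
The message does not say how the ref counts are kept consistent. One
common approach is an RAII guard whose destructor drops the reference
during stack unwinding. The sketch below assumes that approach; Counted,
RefGuard, UseObject and RefsAfterFailure are hypothetical names, not
runtime types.

```cpp
#include <stdexcept>

// Hypothetical reference-counted object; not the runtime's actual type.
struct Counted {
  int refs = 0;
};

// Takes a reference on construction and drops it on destruction, so the
// count is restored on normal return and on stack unwinding alike.
class RefGuard {
 public:
  explicit RefGuard(Counted& obj) : obj_(obj) { ++obj_.refs; }
  ~RefGuard() { --obj_.refs; }
  RefGuard(const RefGuard&) = delete;
  RefGuard& operator=(const RefGuard&) = delete;

 private:
  Counted& obj_;
};

// Simulates an operation that may throw while holding a reference.
void UseObject(Counted& obj, bool fail) {
  RefGuard guard(obj);
  if (fail) throw std::runtime_error("simulated failure");
}

// Returns the ref count observed after a throwing call has been handled.
int RefsAfterFailure() {
  Counted obj;
  try {
    UseObject(obj, true);
  } catch (const std::runtime_error&) {
    // The guard's destructor already ran during unwinding.
  }
  return obj.refs;
}
```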
Change-Id: Ifa6d8be986941efcdf20d7ac8b86eb15a8fe9932
Modify hsa_amd_vmem_get_access to handle pointers that fall within the
VA range of an existing memory mapping.
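
A containing-range lookup of this kind can be sketched with an ordered
map keyed by base address. This is an illustration only: Mapping,
MappingTable and FindContaining are hypothetical names, and the runtime's
actual bookkeeping may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical record for one mapping; the map key is its base address.
struct Mapping {
  std::size_t size;
  int access;  // placeholder for the access permissions
};

using MappingTable = std::map<std::uintptr_t, Mapping>;

// Returns the mapping whose [base, base + size) range contains addr, or
// nullptr if addr is not inside any recorded mapping.
const Mapping* FindContaining(const MappingTable& table, std::uintptr_t addr) {
  auto it = table.upper_bound(addr);  // first mapping with base > addr
  if (it == table.begin()) return nullptr;
  --it;  // last mapping with base <= addr
  if (addr - it->first < it->second.size) return &it->second;
  return nullptr;
}
```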
Change-Id: I9f806ec39f6e9a33da8d86dd65d9a472438fa8ed
The debug address watch test hangs when run as part of the
entire KFD test.
Disable it for now.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I1d0479fa2717d2f398cc32e0605ca6dcc17ebcd5
Silence warnings about missing override declarations under more
stringent compile checks.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Iaa54dfc3dd74f5ee55763cafbbcf2db73493bb21
Debug test shaders should use camel case and the suffix *Isa to match
the naming convention of the other test shaders.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I64e14183ba1c7c9664b13a742a0e5683866e8223
On busy systems, memory allocation can take a long time and
increase calls to hsa_signal_create/hsa_amd_signal_create. This
change mitigates the issue.
Change-Id: Ib7640273262ebc3dbf1f07049ce5da10b1d6b158
As a const char *, MCPU always evaluates to true, so check the value it
points to instead.
Before: if (!MCPU) {
After: if (!*MCPU) {
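
The difference between the two checks, as a minimal sketch. PointerIsNull
and IsEmptyName are hypothetical helpers used only for illustration:

```cpp
// A non-null pointer to an empty string is still "true" as a pointer, so
// an emptiness test has to look at the first character instead.
bool PointerIsNull(const char* name) {
  return name == nullptr;  // same condition as !name
}

bool IsEmptyName(const char* name) {
  return *name == '\0';  // same condition as !*name; assumes name is non-null
}
```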
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I414e091ca764095937311648c534351d6abf30e6
For an unknown reason, non-Ubuntu builds hit memory corruption when
running this test, which affects the tests that run after it.
Disable it for now.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I5f54ee4c63286a33c6948bc818aa1501c4a6751e
Add compile time asserts to force incrementing the API table STEP
version each time a new function is added to a table. This is required
for the profiler team to be able to add preprocessor macros to determine
which versions contain the new APIs.
Also increment the major versions to 2 to indicate the new numbering
scheme.
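
One way such an assert can work is to tie the table size to the step
version, so that appending a function fails the build until both are
updated. The sketch below assumes that approach; ExampleApiTable,
EXAMPLE_TABLE_STEP_VERSION and kExpectedSizeForStep1 are hypothetical
names, not the actual tables or macros.

```cpp
#include <cstddef>

// Hypothetical step version macro for the example table.
#define EXAMPLE_TABLE_STEP_VERSION 1

// Hypothetical API table of function pointers.
struct ExampleApiTable {
  void (*func_a)();
  void (*func_b)();
  // Appending a function here changes sizeof(ExampleApiTable).
};

// Expected table size for step version 1.
constexpr std::size_t kExpectedSizeForStep1 = 2 * sizeof(void (*)());

// Fails to compile when the table grows without the step version and the
// expected size being updated together.
static_assert(EXAMPLE_TABLE_STEP_VERSION == 1 &&
                  sizeof(ExampleApiTable) == kExpectedSizeForStep1,
              "ExampleApiTable changed: bump EXAMPLE_TABLE_STEP_VERSION "
              "and update the expected size");
```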
Change-Id: I148a436a5ceab6be3906f8263b40ea9b07841577
Use memset in the debug tests to avoid padding issues with general zero
initialization and to fix ASAN compile issues.
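
A minimal sketch of zeroing a struct with memset so that padding bytes
are cleared along with the fields. ExampleArgs, ZeroArgs and AllBytesZero
are hypothetical names and do not correspond to the debug test
structures.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical argument struct; padding is typically inserted after op.
struct ExampleArgs {
  std::uint8_t op;
  std::uint64_t value;
};

// Clears every byte of the struct, including any padding.
void ZeroArgs(ExampleArgs* args) {
  std::memset(args, 0, sizeof(*args));
}

// Returns true if every byte of the struct, padding included, is zero.
bool AllBytesZero(const ExampleArgs& args) {
  unsigned char zeros[sizeof(ExampleArgs)] = {};
  return std::memcmp(&args, zeros, sizeof(args)) == 0;
}
```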
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I0a5aca5b7b631083599573b47f1ae87d5d0d5d71
Some GFX9 devices will drop commands if a ring buffer submission is less
than 64 DWORDs. Pad the submission with a NOP header and trailing null
DWORDs in this case.
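
The padding rule can be sketched as follows. The 64 DWORD minimum comes
from the message above. The layout (commands, then the NOP header, then
null DWORDs) is an assumption, and kNopPlaceholder is a made-up value
rather than a real packet encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kMinSubmissionDwords = 64;        // from the message
constexpr std::uint32_t kNopPlaceholder = 0xFFFFFFFFu;  // hypothetical value

// Returns the submission unchanged if it already has at least 64 DWORDs.
// Otherwise appends a NOP header and null DWORDs up to 64 DWORDs total.
std::vector<std::uint32_t> PadSubmission(
    const std::vector<std::uint32_t>& cmds) {
  if (cmds.size() >= kMinSubmissionDwords) return cmds;
  std::vector<std::uint32_t> out(cmds);
  out.reserve(kMinSubmissionDwords);
  out.push_back(kNopPlaceholder);        // NOP header (assumed placement)
  out.resize(kMinSubmissionDwords, 0u);  // trailing null DWORDs
  return out;
}
```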
Change-Id: I850af490fb699f7efe8aef96d97c600a8e76516b
Also change the enum values to leave a gap between enums that only exist
in hsa_region_info_t and enums that exist in both hsa_region_info_t and
hsa_amd_memory_pool_info_t.
Change-Id: I8f9f31200de66648e9328e4203ab283068c993f0
We don't need to keep track of specific blit engines in gang for
submission anymore as ganging early exits on pending bytes.
So tidy up the fluff.
Change-Id: I77e80bf1ad8f561a03fff77bce33aa09d02760c6
In ASAN builds, the compiler used is clang. Initializing a variable
sized array with the assignment operator causes a compilation failure
in these builds. Use memset to initialize the array instead.
Change-Id: I02aef3b99a6cad0cce3a378210a48732e07a88fb
Add test to catch trap on wave start or end override event.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Icb57af64475fbd2d8a6c0af9a2ee5db5d1a169c6
The address watch test now tests read and write operations.
It also checks that the operation is precise if precise
address watch is available.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I7ef835790e26bf6345682755d7dd26a35853bcd5
For GFX11 debugger testing, waves need to start in non-priv mode for
some test cases, so allow the tester to set this.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Iee93fda926bfd336d51c79c086f1f75bc35b70e5
When oversubscribing SDMA gangs, a circular deadlock can occur since
gang enqueue is staggered with respect to SDMA engine leader based
on source to destination.
As a result, an enqueued leader may be waiting on a gang item that is
waiting on another enqueued leader or gang item and so on.
To prevent this, first lock the submission to ensure that DMA status
queries and submissions are atomic. Once this is in place, be more
stringent with ganging: all SDMA engines must be available in order to gang.
Finally, re-enable SDMA ganging by default.
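
The all-or-nothing rule under a single lock can be sketched as below.
GangScheduler and its methods are hypothetical and only model engine
availability; they are not the runtime's blit or SDMA classes.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical model of engine availability guarded by one lock.
class GangScheduler {
 public:
  explicit GangScheduler(std::size_t engines) : busy_(engines, false) {}

  // Claims every engine, but only if all of them are idle. The status
  // check and the claim happen under the same lock, so no other
  // submission can slip in between them.
  bool TryGang() {
    std::lock_guard<std::mutex> lock(mutex_);
    for (bool b : busy_) {
      if (b) return false;
    }
    busy_.assign(busy_.size(), true);
    return true;
  }

  // Claims a single engine for non-ganged work if it is idle.
  bool TrySingle(std::size_t idx) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (idx >= busy_.size() || busy_[idx]) return false;
    busy_[idx] = true;
    return true;
  }

  // Marks one engine idle again.
  void Release(std::size_t idx) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (idx < busy_.size()) busy_[idx] = false;
  }

 private:
  std::mutex mutex_;
  std::vector<bool> busy_;
};
```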
Change-Id: I4511e3487db9d26475b5aece4897f10168cc5322
xGMI for compute partitioning in non-SPX modes does not have
a reported bandwidth.
Fix it to at most 2 since each partition is bounded
by either the number of xGMI links or the number of available
SDMA contexts.
Change-Id: I09094bd7548d9eee6f039b0efe849838e5de166e
SDMA ganging is causing regressions, with some applications hanging.
Temporarily disable SDMA ganging by default until the issue is fixed.
Change-Id: I65e172923a53a967df27b30d969ad5d215c4fa09
Add base debug operations to suspend and resume queues.
Routine will return the number of queues successfully
suspended or resumed.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I8f18317f70464b04231c5cf822e11d545ebfa02a
Check that a jump to trap event can be picked up by the debugger.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Iad5f87092f2b82d5018013bba548979122a9bd02
Add debug attach and runtime enable test for attaching to a spawned and
running process.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I72302ff73494d9dae0c79a299508085d7ca0552b
Add base debug class and attach/detach operations.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I60f3c166646f05838fec208ac2f59bba998c63f8
Even if the version of libdrm is older and does not support the
amdgpu_device_get_fd function, the device_handle stored in
amdgpu_handle[] is still valid and can be returned via
hsaKmtGetAMDGPUDeviceHandle.
Change-Id: I024a3e82e6cfebac5577aefe359b067746c4023e
Use all available SDMA engines, capped by xGMI bandwidth, for
all D2D copies within a hive.
By default, set the latency boundary copy size to 4KB and below.
Any copy size within this boundary will not gang.
Avoid oversubscribing engines by not ganging on engines with
pending non-ganged work.
An environment variable, HSA_ENABLE_SDMA_GANG, has been provided
to override the default ganging behaviour.
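
The ganging conditions above can be summarized in a small predicate. This
is a sketch only: ShouldGang and its parameters are hypothetical, and how
HSA_ENABLE_SDMA_GANG is parsed into gang_enabled is not shown because the
message does not describe it.

```cpp
#include <cstddef>

// Copies of this size or smaller are latency bound and do not gang.
constexpr std::size_t kLatencyBoundaryBytes = 4096;

// Returns true if a D2D copy should be split across a gang of engines.
// gang_enabled stands for the default or overridden setting, and
// engines_have_pending_work for pending non-ganged work on the engines.
bool ShouldGang(std::size_t copy_bytes, bool gang_enabled,
                bool engines_have_pending_work) {
  return gang_enabled && copy_bytes > kLatencyBoundaryBytes &&
         !engines_have_pending_work;
}
```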
Change-Id: Iccde76aa1af1d47ea2a151789432c9db4f0ffa8d