Addressed by:
4066dcd kfdtest: increase BigBufStressTest timeout and avoid VM fault
36776e9 kfdtest: avoid BigBufStressTest run on NUMA node 0
Change-Id: If21c6e42b4cf6aada1f74e77f0d8d1a2fdebcdb8
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
Remove the CHIP name from the shader ISA and add wave_size(32) to make the same
shader can be used for both GFX9 and GFX10
Change-Id: I16ea72f87980c3d9c11298e20c06a0a073fe9a28
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
KFDGraphicsInterop.RegisterForeignDeviceMem looks like it is running
now. Re-enable it for kfdtest for all platforms.
Change-Id: I6f6ee9cd11da793c5d525d8676bfc6d5bd8007bb
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
v_add_u32 was removed from gfx10, use carry-out explicit instruction
v_add_co_u32 instead on both gfx9 and gfx10
Change-Id: I1fcd5956844457a676757ad13bdce7f5304bb34b
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Modified shaders in KFD memory test to support gfx10.
There is no gprs register for flat_scratch on gfx10.
Use s_setreg_b32 instruction to set flat scratch base
address register
Change-Id: I505156a046056b61ce2d873343feb50ce635274a
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
This functionality doesn't work on GFX9+, and was disabled for gfx802.
Remove the test altogether for now, especially since some kernel changes
broke it on gfx803, and the functionality is deprecated now anyways. Leave
the code for reference, but "#if 0" it to prevent it from compiling or
being in the kfdtest binary
Change-Id: I848b4f23201f18612cbdc122a5b46e4010c4af2a
Some SDMA packet format might be different among asic versions
Change-Id: Ic7eda7554c23e3972e168480874ca67a92677346
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Because not all ASICs (like gfx908) have GFX rings, we should use SDMA
rings instead of GFX rings.
Change-Id: Ibcc9f9e555302ba4ce25ac76c2ca73b8c3962a58
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
use family ID as parameter when construct the packets
Change-Id: I6c1706954ab7b8cbb8bef2aab16edf21f5e1abf0
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Add release_mem and acquire_mem pm4 packet format for nv
Change-Id: I172407c3418005922c17937e1e43f57d153ea732
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Currently the test only covers relatively small buffers sizes. It's
useful to test buffer sizes up to 1GB to see the impact of features
that target the efficiency of large buffer allocations and mappings.
Change-Id: I2e8d5afd482894dbe2166f32d38091199b9c15e6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
This test is designed to reproduce soft-hangs cause by HWS running
with oversubscription.
Change-Id: I49861522b3ff5ba50df5ddc968545c35ccb25353
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Create KFDMultiProcessTest base class for tests forking multiple
child processes. Derive KFDEvictTest from that class.
Change-Id: Ie5f3362c45be2b807bf7a83839ab3820352a67f9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
This will faciliate ASIC bringup, including under simulation environment.
Change-Id: Ie027a77a2498cba739fea51f404d9843ce8dbeae
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
BasicAddressWatch causes issues where KFDEvictTest and
KFDQMTest.OverSubscribeCpQueues fails, and results in a GPU hang/reset.
PM4EventInterrupt just hangs indefinitely. Remove them for now to allow
the kernel merges to resume, and figure out what happened in the nv10
merge to cause it
Change-Id: I418f9561ecb3e71bc52ac48ea363fcbde82a8e2b
The SDMA blacklist should contain all tests that use SDMA. It will
be applied to all ASICs that are know to have SDMA stability issues.
Change-Id: I53e723382c12f99bddf9c535000e27737a7ea1f6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
1. Use s_mov_b32 to move 0xcafe to s18. s_movk_i32 is a sign extention move
instruction. Oxcafe will be extended to 0xffffcafe which is not desired
2. Add wait to s_load_dword instruction to make sure memory read finish before
the next store instruction.
Change-Id: I665d1d471019edfaba5693e07cdc567d4103573f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
If TTM eviction and restore happens, it may takes very long time if
retry, the longest time is 5 minutes during my test. There is chance
packet is submited to queue while eviction, we have to increase the
Wait4PacketConsumption timeout.
The queue will continue to execute after eviction and restore. If we
upmap the memory from GPU while queue is evicted, this will cause VM
fault. Change to unmap memory after queue is destroyed.
Change-Id: I1b44e2274ea7b83398b2e3293578dad6947cb5af
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Because dma32 zone is on node 0, use all system memory on node 0 will
cause TTM eviction to free dma32 zone for other devices which only
work with 32bit physical address. The TTM eviction and restore may take
too long and cause queue timeout.
Running on other NUMA nodes, the NUMA default memory policy is
MPOL_PREFERRED, means TTM will get pages from local node first, and then
get remaining pages from other nodes. Check /proc/buddyinfo can confirm
this.
Reset NUMA bind to all after the test.
Change-Id: I39b373c07a2d5aa396f5c7602bffabab0481930f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
PSDB and other jenkins jobs are currently failing on several kfd tests.
This is blocking user throughput for screening patches by PSDB.
Blacklist multiple tests and submit JIRA's.
KFDIPCTest.BasicTest (ROCMOPS-459) .CMABasicTest (ROCMOPS-460) .CrossMemoryAttachTest (ROCMOPS-461)
KFDMemoryTest.BigBufferStressTest (ROCMOPS-462)
KFDQMTest.MultipleSdmaQueues (ROCMOPS-463) (ROCMOPS-416)
KFDEvictTest.BurstyTest (ROCMOPS-464)
Change-Id: I2c7cdeabc26654f39823201ce86d4113b3a98a0e
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
This relates to the following commits:
1. commit aa7c13264a
2. commit 54807526b9
3. commit 6df62c78b8
Change-Id: I3d0d3214baba403b4709b358132b6756a15f42d7
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Allocating these before the big memory allocations minimizes the chances
of spurious out of memory errors.
Change-Id: I94aff9ec7ea34d4dc98ae08ac4cf9dc335b3df7f
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
This reduces thrashing due to graphics submissions only and
significantly speeds up the BasicTest when keeping idle compute
processes evicted. In the BasicTest compute is always idle, so
only one compute eviction and no restore is triggered. Then
graphics submissions complete quickly without thrashing each other.
Change-Id: Iae6da98903b20424a5097f235e1d09cf13e4b41b
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
1. umc error injection only accepts parameter "0 0".
2. flush output to file in order to make writing happen
immediately.
Change-Id: I8d3bde287caee6b90b6eec56c760f5a228be7595
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
The path was wrong based on assumption that GPU dri render
node starts from 0, because if there is a VGA device on
board, node 0 will be VGA and node 1 will be GPU. So the fix
will look at the name of GPU minor node and find the correct
primary node on which RAS debugfs entry exists.
Change-Id: Icc5e63ce48698d5d29105c0417e3bec8afa0a7c8
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
The queue IDs passed over to the kernel via kfd_ioctl_dbg_trap_args->ptr
should be a list of uint32_t's. Need to convert from the passed in
64 bit HSA_QUEUEID to 32 bit uint32_t's.
Change-Id: I8718566d9f9ffc90ce0b2ecc129b10c49d73186a
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
This is updating to the new suspend and resume API for the
KFD and the thunk. We now support passing in a list of queues
to suspend, and not just all of the queues for the process.
The kfdtest testcase was also updated so it still compiles.
Change-Id: I71d1b178476bd9df0c311bdedaa6a891528cebcf
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
- Run more graphics command submissions with shorter delay between
them
- Synchronize after every graphics command submission
- Include the big VRAM BO in the BOList of the command submission
to trigger more evictions
- In QueueTest, run AMDGPU command submissions concurrently with
compute shader on the user mode queue
- Submit AMDGPU commands to GFX queue instead of compute queue to
avoid deadlocks between user-mode and kernel-mode queues on the
same pipe
- Allocate slightly less memory from KFD to avoid allocation errors
due to fragmentation or memory leaks in previous tests
- Running only two processes maximizes the number of KFD evictions
(probably because of lower chances of evicting non-KFD BOs)
Change-Id: If05d53f5fcf690b6488998a3f933f120ddaa71ee
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Those two tests cover the basic queue creation and destruction
without submitting packets to CP and SDMA user queues. During bringup,
they bring values in term of untangling the issues arising in queue
creation and packet execution, which are two very different kinds.
Because of those two tests, we also rename some existing tests as
follows:
CreateCpQueue -> SubmitPacketCpQueue
CreateSdmaQueue -> SubmitPacketSdmaQueue
CreateMultipleCpQueues -> MultipleCpQueues
CreateMultipleSdmaQueues -> MultipleSdmaQueues
Lastly, move MultipleCpQueues test closer to the CP queue section
rather than leaving it behind the SDMA queue section.
Change-Id: I110fb3f3fb21878339045dd1d1c8c9d61b8988b7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
If a Render Node can't be found, we should finish off all child
processes immediately, then return.
Trying to do the check before forking the children results in the test
failing as well, regardless of the status of finding the render node,
which is likely why the forking occurred first in the test's initial
creation. This way we ensure that things finish cleanly before moving to
the next test
Change-Id: I2e1b62fed25c30ff1f179612127c23960da4ee5e
1. RAS error injection debugfs interface has been changed which
is using ras_ctrl instead of *_err_inject.
2. Remove ASSERT_SUCCESS for fwrite, because fwrite returns
the size of written item but not the error number.
3. Using throw exception instead of return to avoid a segment fault.
Change-Id: I6c4d9c2f7e66719faec99abd1552105a08c238a4
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
This reverts commit d00ec779ce.
Fixes for HMM change corner cases are merged in from drm-next.
Tests are passed on gfx900 with the latest amd-kfd-staging.
Change-Id: I6c00d1eacf6b3f1ce715e085ae622b4e9ff1b7ff
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Due to the recent HMM changes, the KFDIPCTest can intermittently fail,
combined with CrossMemoryAttach consistently failing. Remove it for now
while Philip Yang investigates
Change-Id: Icf272100bb7882eff4202ad6f4ced63b569f4e7d