This helps the user to troubleshoot the problem.
Change-Id: If6cf42c488097011285252a6c722d3d74c0f7ce7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: 4e7b2f2e27]
This will allow it to be installed with the ROCm suite
and centralize things a little more.
Also update run_kfdtest.sh to reflect the changes.
Lastly, remove the "die" reference, as compute_utils.sh
may not be packaged with KFDTest.
Change-Id: I4c30cd29979192496419e71e3685937d7417f739
[ROCm/ROCR-Runtime commit: a360c68b0c]
Those tests are currently all passing.
Change-Id: I233afe33e8275d482bab5b5590b856fce49af76d
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: 4f2ff25a3d]
After XGMI SDMA queues were separated from regular SDMA queues, they
were not covered in the current tests. Add tests for them now.
Change-Id: I036e3ca5d583ab7f022a9dc6cda3ef867f4773a0
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: fe97612800]
KFDExceptionTest is passing on those platforms.
Change-Id: I328ee4fd4ff5b339e560f2f79e754fd34459210a
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: 44db5cb011]
The initial baseline measurements are proving inconsistent, which
results in the test failing more often with different variant rates.
Change-Id: I1f4e04bf7d615cf39de9605bd5141a997b22cdfc
[ROCm/ROCR-Runtime commit: 8b14ea2e83]
The old names are not accurate enough, so rename them according to
their corresponding fault types.
Change-Id: Icf4d52ba0ab9d49af5d912a0feb82665b1e8d344
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: f7c0172385]
The InvalidPPR* tests are only useful for gfx801 right now, on which
they won't trigger exceptions. So they are not relevant in the
KFDExceptionTest category. In addition, given that AccessPPRMem already
tests the PPR memory functionality, we can just delete those two tests.
Change-Id: Id5c6e23c4c0ce47a4f04e9e1f0fa9083e0a9d0e0
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: 1c2c5a7b9a]
This puts all CP and SDMA queues in a single test, which is
currently missing.
Change-Id: I98bf58df1be65fe9daf6311c016a48569a8ab674
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: 10ffc63d7b]
The previous BigBufferStressTest covers too much and takes a long
time to run. By separating largest*BufferTest out into other
tests, we dramatically reduce the time to run BigBufferStressTest and
therefore make reproducing issues much easier.
Meanwhile, rename the test to BigSysBufferStressTest to make the name
more descriptive.
Change-Id: I5911f113c0bd50627ee6d84bbb4f2972cbed8886
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: b6cefa7bda]
Because of that, rename the test to AllCpQueues.
Change-Id: I57105f863db2558e850c703d151ffebcce2c7a17
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: a4d570fa2b]
This patch adds support for the gfx90c APU. So far we treat it as a
"dgpu" and as gfx900. The HSA gfxip table will be updated once ISA/LLVM
support is implemented for gfx90c.
Change-Id: I6ef164bf3e751fe6dd6287cac212a500dce84b1a
Signed-off-by: Huang Rui <ray.huang@amd.com>
[ROCm/ROCR-Runtime commit: fdba74c2fb]
The test actually tested all available SDMA queues, so change the name
to reflect the fact.
Change-Id: Ia23df3e5ac79b692b0b60194b05603ba8dd897a4
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: 4b36a1e728]
The tests are useful to triage the fundamental queue submission
functionality by excluding the packet format variable from the equation.
Change-Id: I2c7fcda811f93bdefc1b62396233559416be44e7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: ac5c433420]
The new filter can be used with "./run_kfdtest.sh -p core_sws".
Change-Id: I1c43669cfc07c09ccafb9fa2e2851932ac59307d
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: 98b0652917]
We are still working on those tests for gfx1010, so disable them
temporarily.
Change-Id: I5d51b4b02bc753137014684859cc033f759b2899
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: ffbdb726ac]
A data fabric/VBIOS issue causes a hard system hang while running
this test. It will be re-enabled after the HW/VBIOS fix.
Change-Id: Ic0753c2d92e9e4863c310da9a595b2af302f17f8
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: dadbbbb03c]
This will eliminate the need to update run_kfdtest.sh every time
a new platform is added.
Change-Id: I584d65b462de36a685fa2d29d43962078ba511dc
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: ec1375ac66]
A number of tests are no longer broken on gfx802.
Change-Id: If70c77423f8f14de59490ab8ca156b0c4e7b5cf1
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 8f91d6a222]
This saves us from maintaining a device ID to ASIC mapping in the scripts.
Moreover, stop using abbreviated ASIC names to avoid confusion.
Change-Id: I7ce583b26b09b627c142aae41932483b28c545d8
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: f1c0bc8e35]
Decimal is better than hex in this case.
Change-Id: Ic15a9373e99160880b98d3dcd6827d551c87b77a
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: 3e4c42ef13]
Addressed by:
ae92e8f kfdtest: increase BigBufStressTest timeout and avoid VM fault
3edf77b kfdtest: avoid BigBufStressTest run on NUMA node 0
Change-Id: If21c6e42b4cf6aada1f74e77f0d8d1a2fdebcdb8
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
[ROCm/ROCR-Runtime commit: 20cd954fe8]
KFDGraphicsInterop.RegisterForeignDeviceMem looks like it is running
now. Re-enable it for kfdtest for all platforms.
Change-Id: I6f6ee9cd11da793c5d525d8676bfc6d5bd8007bb
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
[ROCm/ROCR-Runtime commit: b6f6d9da1c]
The test is still viable on VG10/20. Phil is investigating why it takes
so long on gfx803.
Change-Id: I61669b29dc0e8407858a5c73cfa69c5ea923846f
[ROCm/ROCR-Runtime commit: 79a3995816]
This functionality doesn't work on GFX9+, and was disabled for gfx802.
Remove the test altogether for now, especially since some kernel changes
broke it on gfx803, and the functionality is deprecated now anyway. Leave
the code for reference, but "#if 0" it to prevent it from compiling or
being in the kfdtest binary.
Change-Id: I848b4f23201f18612cbdc122a5b46e4010c4af2a
[ROCm/ROCR-Runtime commit: 1ca1825b84]
This test is designed to reproduce soft hangs caused by HWS running
with oversubscription.
Change-Id: I49861522b3ff5ba50df5ddc968545c35ccb25353
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 5475e618e5]
This will facilitate ASIC bringup, including in simulation environments.
Change-Id: Ie027a77a2498cba739fea51f404d9843ce8dbeae
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
[ROCm/ROCR-Runtime commit: c27704ded9]
BasicAddressWatch causes issues where KFDEvictTest and
KFDQMTest.OverSubscribeCpQueues fail, and results in a GPU hang/reset.
PM4EventInterrupt just hangs indefinitely. Remove them for now to allow
the kernel merges to resume, and figure out what happened in the nv10
merge to cause it.
Change-Id: I418f9561ecb3e71bc52ac48ea363fcbde82a8e2b
[ROCm/ROCR-Runtime commit: be6ff2cdff]
The SDMA blacklist should contain all tests that use SDMA. It will
be applied to all ASICs that are known to have SDMA stability issues.
Change-Id: I53e723382c12f99bddf9c535000e27737a7ea1f6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 62ee7b4112]
The bus error bug was fixed in the KFD driver and Thunk.
Change-Id: Id02617fdc26f1c49307f90a0a939e05f22d739e7
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: be9ac578ef]
PSDB and other Jenkins jobs are currently failing on several kfd tests.
This is blocking user throughput for screening patches by PSDB.
Blacklist multiple tests and submit JIRA tickets.
KFDIPCTest.BasicTest (ROCMOPS-459)
KFDIPCTest.CMABasicTest (ROCMOPS-460)
KFDIPCTest.CrossMemoryAttachTest (ROCMOPS-461)
KFDMemoryTest.BigBufferStressTest (ROCMOPS-462)
KFDQMTest.MultipleSdmaQueues (ROCMOPS-463) (ROCMOPS-416)
KFDEvictTest.BurstyTest (ROCMOPS-464)
Change-Id: I2c7cdeabc26654f39823201ce86d4113b3a98a0e
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
[ROCm/ROCR-Runtime commit: 3f2d2e67c9]
This relates to the following commits:
1. commit 931dd817fa
2. commit 34e6346848
3. commit 880119d3a3
Change-Id: I3d0d3214baba403b4709b358132b6756a15f42d7
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
[ROCm/ROCR-Runtime commit: fe4db33875]
This reverts commit a349805264.
Fixes for HMM change corner cases are merged in from drm-next.
Tests pass on gfx900 with the latest amd-kfd-staging.
Change-Id: I6c00d1eacf6b3f1ce715e085ae622b4e9ff1b7ff
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 0bd9f35563]
Due to the recent HMM changes, the KFDIPCTest can intermittently fail,
and CrossMemoryAttach fails consistently. Remove it for now
while Philip Yang investigates.
Change-Id: Icf272100bb7882eff4202ad6f4ced63b569f4e7d
[ROCm/ROCR-Runtime commit: d00ec779ce]
Per Philip Yang:
For a forked child process, a userptr allocated on the heap (through
malloc) can span two VMAs. If the child process mallocs a small buffer
and frees it, the freed space lies in a VMA cloned from the parent
process. When the child then mallocs a larger buffer, the kernel puts
some pages in the previously freed space of the cloned VMA and creates
a new VMA for the rest of the pages. This is what IPCTest does.
Change-Id: I054771e20880f975d7cc774225f19aad5363843f
[ROCm/ROCR-Runtime commit: a0b8dd8462]
They are disabled for now.
Change-Id: I9c936130cbaf8c773f4b8e94bccf4af1f45eda65
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
[ROCm/ROCR-Runtime commit: 7349276860]
We need to blacklist this test case temporarily because
it is failing intermittently. The failure tends to happen only
when a certain build machine is used to build it.
This issue is being tracked by Jira ticket:
ROCMOPS-389
Change-Id: Ic4682c9da389ed731cbc034dff57e6646bba0e9d
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
[ROCm/ROCR-Runtime commit: 90a3697e1d]
These tests all make use of an SDMAQueue in one way or another, so add
them to the SDMA_BLACKLIST to be 100% certain.
Change-Id: Ic29e073c2f46249f3e5918145b13d276aec7bb33
[ROCm/ROCR-Runtime commit: 54807526b9]