KFDPerformanceTest.P2PBandWidthTest[push, push] takes about 3 seconds
on 4 gfx906, the default g_TestTimeout 2 seconds is not enough to wait
for sDMA queue rptr is consumed. Use kfdtest command line option
--timeout=6000, the test is finished and result is reasonable twice as
P2PBandWidthTest[push, none]. Change P2PBandWidthTest wait timeout to 6
seconds.
Add timeout argument to function WaitOnValue, BaseQueue.Wait4PacketConsumption
SDMAQueue.Wait4PacketConsumption, PM4Queue.Wait4PacketConsumption with
default value is g_TestTimeOut.
Change-Id: I0aa04d644339feaeea695e41647ae66568beab9e
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
This will support the sp3 library built on one gcc version to be
compatible with another gcc version.
Change-Id: If67714bd63376dc781c56ed025be335fe54b2ba5
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
These tests all make use of an SDMAQueue in one way or another, so add
them to the SDMA_BLACKLIST to be 100% certain
Change-Id: Ic29e073c2f46249f3e5918145b13d276aec7bb33
This is intermittently causing VM faults and excessive evictions, which
causes the rest of the tests to fail. Take it out for now until someone
can investigate
Change-Id: I9c43890bc9f03a4a31efbc18df0df5e40a232c58
On small bar multi-gpu system, hsaKmtMemoryMapToGPU will fail due to latest
kernel P2P sanity check. Swith to use hsaKmtMemoryMapToGPUNodes to fix
the failure
Change-Id: Id8b6329d1243df0e908cc9a171b5c7f9156f4a8b
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
BaseQueue class has a member function GetQueueType so m_Type
is duplicated. m_Type is only used in one function. Move it to
a local variable.
Change-Id: Ice144cf723178dd628cb49261c23d10605f9ee7d
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
On gfx900+, the test sometimes timeout due to cp fw bug.
Blacklist it until we address the root cause and have a fix.
Change-Id: Iff600a6f6dbd86c56e034f530484205520bced32
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
We observe this test fails on gfx900+. Looks like the sdma packets are not
executed at all after we submit sometimes.
Run it with timeout 2s on gfx900.
[ RUN ] KFDQMTest.SdmaEventInterrupt
[----------] SDMACopyData FAIL! 1485262707170 VS 1485262747814
[----------] Event On Queue 1:0 Timeout, try to resubmit packets!
[----------] The timeout event is signaled!
[ ] Time Consumption (ns)
[ ] 1: 1859427148
[ ] 2: 680148
[ ] 3: 6370
[ ] 4: 5481
/home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure
Value of: (ret)
Actual: 31
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[----------] SDMACopyData FAIL! 1485367669958 VS 1485367750022
[----------] Event On Queue 2:1 Timeout, try to resubmit packets!
[----------] The timeout event is signaled!
[ ] Time Consumption (ns)
[ ] 1: 1881615148
[ ] 2: 673629
[ ] 3: 6074
[ ] 4: 5481
/home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure
Value of: (ret)
Actual: 31
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[----------] SDMACopyData FAIL! 1485427671250 VS 1485427751238
[----------] Event On Queue 2:1 Timeout, try to resubmit packets!
[----------] The timeout event is signaled!
[ ] Time Consumption (ns)
[ ] 1: 1881508777
[ ] 2: 741629
[ ] 3: 6074
[ ] 4: 5481
/home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure
Value of: (ret)
Actual: 31
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[ FAILED ] KFDQMTest.SdmaEventInterrupt (23675 ms)
Change-Id: I7c1b752537d89782570df20838bf976578614f75
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Create an extra event so that the event id to test is non zero. That
way we can be sure the context id received in kernel ISR is non zero, which
is different from the default value 0 when context id is not set at all.
Change-Id: I7e261d1bbb783d5afd15558c7ac00493b1218cef
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Currently the FindDRMRenderNode function will access the sysfs
directly to find the render node. It doesn't work with the
GPU management changes. Have changed code to call hsaKmtGetNodeProperties
instead.
Change-Id: I3bb537a323bc1e8c49f38d8aabc60c13e268aecd
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Some infrastructures below,
Implement SdmaTimePacket which records the global GPU timestamp.
Introduce class AsyncMPSQ and AsyncMPMQ.
AsyncMPSQ is aka async multiple packet single queue. It takes a set of
packet when create and submits them to a GPU to run. While AsyncMPMQ is
aka async multiple packet multiple queue. It manages a set of AsyncMPSQ,
and use a forloop to do operations of AsyncMPSQ.
Implement sdma_multicopy helper functions.
Change-Id: I47e1d2ca9630113b2a1d85a0055f3f8ee629fb5f
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
For following test cases:
- KFDQMTest.QueueLatency
- KFDQMTest.BasicCuMaskingLinear
- KFDQMTest.BasicCuMaskingEven
- KFDMemoryTest.MMBandWidth
- KFDMemoryTest.MMapLarge
- KFDMemoryTest.MMBench
v2: xml element cannot start with a number, so change the key name of
MMBandWidth and MMBench accordingly
xml element cannot contain whitespaces, so trim whitespaces in "VRAM "
v3: introduce KFDLog-like way to use KFDRecord
Change-Id: Ifc3ed5657621252a7b39dccf1ef4f50a92593f77
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
This change is from commit 62f7dc2a("kfdtest: Do not set GTEST_FLAG
throw_on_failure").
But it is unexpected to reverted by commit 414042ab("kfdtest: Clean up
comments"). So add this change back.
Fix: 414042ab
Change-Id: Ia9e99c9ca17b99aab62b4db55017018ddae43dfb
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
The timestamp written by releaseMemory packet might still not be visible
when we fetch it.
To fix this bug, use event-based wait.
Change-Id: If2324eb3b3a632c711ee4dff4d03a93d5306c289
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Use the NodeFrom returned by hsaKmtGetNodeIoLinkProperties() to check
its correctness.
Change-Id: I6ce436dc7c5d5b192bee21156292bd3eff77f916
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Wait4PacketConsumption now can accept an event to wait all packets subbmitted
to be processed.
Change-Id: I1497b7704e892b04d05811b8d3e4742237c1be57
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Removed some tests from the blacklist that are now passing. Added two
new tests that hang the GPU.
Change-Id: I09e729590e5181311375058be492d387342ba2fe
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
As it will alloc as much as small system memory to reach the allocation limit.
We can try to alloc memory several times to see if any allocation in
the previous step cause memory leak.
Also we test if GPU can access these memory correctly or not.
Change-Id: I309f9821b6bc99c212a6bfbc21fe3086ab589fd3
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Similar with SdmaEventInterrupt, verify event interrupt on pm4 queue.
Change-Id: I0e43f26fd0d965126985820704215d2ef5e52c1a
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Simulate some workload there to verify the sDMA event interrupt.
Change-Id: Ib5ad0c238cc66898f7835e765df50427ef106b04
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
It should have PASS/FAIL report for the vram allocated size.
Change-Id: I546c02c2ed02f1cfb5278e0dfd7b18ade39faafb
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
ASSERT failures result in immediate termination of the test. EXPECT
returns a failure but continues execution. Reserve ASSERT for required
functionality (node initialization, queue creation, etc) where the rest
of the test cannot run if that call fails. Use EXPECT everywhere else
Change-Id: I1c11326fc3ae22b50fa83b07b3b49af1e1f4e69e
This should fix gtest compile errors.
code like below has trouble,
typedef char char8;
typedef unsigned char uchar8;
ASSERT_NE((uchar8)1, 0);
ASSERT_NE((unsigned char8)1, 0); // compile error here
or
ASSERT_NE((unsigned char8)1, 0);
ASSERT_NE((uchar8)1, 0); // compile error here
HSA[u]int64 are alias. So ASSERT_XX((unsigned HSAint64)..)
with ASSERT_XX((HSAuint64)..) fail to compile.
Change-Id: I4c24bc699a69bd4f37c4bc8aaaa9f1a92a24a33e
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
The flag makes EXPECT_* to behave like ASSERT_*, which actually work against
our favor, so disable the flag.
Change-Id: I2ea1dfeaf916b396593a504d081148abdac0fc70
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
A lot of tests were disabled on gfx900 for historical reasons that
are no longer valid. The only remaining one that won't work on
gfx900 is BasicAddressWatch.
Change-Id: I11507de0dfd31262713127d6cb15cc09c14b8b9f
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
When skipping a test, the output should be:
Skipping test: <reason>.
This will allow for easier identification, automation and general readability
Change-Id: I98bda1c068f9dbc83aeea74f642b6101121f234d
Make indentation consistent, which is that subsequent lines are aligned
with the variables declared above
Change-Id: I590f7768d93565145b986ad1fb6ac8e82f9c0d58
Clean up the KFDTest style via CPPLint. Some warnings remain regarding
volatile variables being cast to void*. This is the command used:
cpplint.py --linelength=120
--filter=-readability/multiline_string,-readability/todo,-build/include,-runtime/references
multiline_string is due to using ISA code
todo is to avoid errors that we don't have TODO(username) instead of TODO
include is about including the folder in the header includes
references is regarding non-const references '&' being const or using
pointers. That can be addressed later
Change-Id: I3c6622da0a13dd33ab29b2bfff48be25e763b750
When mapMemoryToGpu fails, we need unregister it with user address as
the gpu address is not available.
Change-Id: I4418eeaa7aa37008f5bffa144e2c2171f0d238fd
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Implement sDMA copy packet broadcast.
Each time sDMA will copy its local vram to sysbuf and next GPU's vram.
That will verify where the p2p link is broken.
Currently we just test push of p2p.
test result on 2 cpus, 4 gpus, numa enabled system.
[ RUN ] KFDQMTest.P2PTest
[ ] Test 2 -> 3
[ ] PASS 2 -> 3
[ ] Test 3 -> 4
[ ] PASS 3 -> 4
[ ] Test 4 -> 5
[ ] PASS 4 -> 5
[ ] Test 5 -> 0
[ ] PASS 5 -> 0
[ OK ] KFDQMTest.P2PTest (190 ms)
Change-Id: Ie6fb2604109e39465b8a873b3bb42abc6259825a
This test has been intermittently failing for various reasons and
was already disabled on all chips except Ellesmere. It stresses
memory management in unusual ways by having lots of memory allocated
but +# not mapped, which is not relevant to compute applications over
ROCr.
Change-Id: I6b791ca7e2e0fcfe93fc720063b4b56acfded751
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
This is to coordinate kfd kernel vram limit change, and adding
GFX vram allocation with submission of command nop is to
trigger eviction.
Change-Id: I18615cd13cfde034aae09c188ae3a82babde97b9
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
This will facilitate the user cases that some APU asics is used as dGPU.
Change-Id: Ib3a79ae31a03e7a618c7785166f56282a7617127
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
A README.txt file is added to help the opensource community to use kfdtest
effectively.
After building, run_kfdtest.sh in the building output folder can be used
to run the test.
Change-Id: I9612d9d5a63bd4cdc3a328efd9961d3cc92a6ba5
Signed-off-by: Yong Zhao <yong.zhao@amd.com>