Graf Tiomantas

33 Tiomáintí

Údar SHA1 Teachtaireacht Dáta
Yong Zhao fe97612800 kfdtest: Add basic tests for XGMI SDMA queues
After XGMI SDMA queues were separated from regular SDMA queues, they
were not covered in the current tests. Add tests for them now.

Change-Id: I036e3ca5d583ab7f022a9dc6cda3ef867f4773a0
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-01-10 15:32:12 -05:00
Yong Zhao 10ffc63d7b kfdtest: Add AllQueues test
This puts all CP and SDMA queues in a single test, which is
currently missing.

Change-Id: I98bf58df1be65fe9daf6311c016a48569a8ab674
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-17 19:25:14 -05:00
Eric Huang caa744944e kfdtest: change baseline in BasicCuMaskingEven
To keep normal performance, we have to use symmetrical cu masks
based on shader engines. So change baseline from 1 cu to 1 cu
per SE.

Change-Id: I9e83a87fb670bb406f7983714fa0d8ab673609eb
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2019-12-11 14:48:55 -05:00
Yong Zhao 8f140fc03d kfdtest: Add some logs to Atomics test
This will help us triage the unexpected hangs on the farm systems.
Meanwhile, simplify the logic.

Change-Id: Ie50b97a34cb86891720dce11f2d178bff9aa2cd5
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-02 12:56:03 -05:00
Yong Zhao a4d570fa2b kfdtest: Expand KFDQMTest.MultipleCpQueues to cover all CP queues
Because of that, rename the test to AllCpQueues.

Change-Id: I57105f863db2558e850c703d151ffebcce2c7a17
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-15 08:25:14 -05:00
Yong Zhao 4b36a1e728 kfdtest: Rename KFDQMTest.MultipleSdmaQueues to AllSdmaQueues
The test actually tested all available SDMA queues, so change the name
to reflect the fact.

Change-Id: Ia23df3e5ac79b692b0b60194b05603ba8dd897a4
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-08 19:00:37 -05:00
Yong Zhao 174484aac3 kfdtest: Use GetSystemTickCountInMicroSec() as much as possible
GetSystemTickCountInMicroSec() wraps the function gettimeofday().

Change-Id: I7b767a6efdd1db491fc8113313945b578ac69382
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-07 18:21:04 -05:00
Yong Zhao ac5c433420 kfdtest: Add a Nop packet submission test for CP and SDMA queue
The tests are useful to triage the fundamental queue submission
functionality by excluding the packet format variable from the equation.

Change-Id: I2c7fcda811f93bdefc1b62396233559416be44e7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-22 13:42:33 -04:00
Eric Huang 0174377351 kfdtest: add xgmi path for p2p tests
When large bar is not available, we can use
xgmi to do p2p tests.

Change-Id: Ib7b59fb8a4d41f605739a0428973f6b2f1a3450f
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2019-10-17 10:21:10 -04:00
Philip Yang 69d8f2d734 kfdtest: use flag NoNUMABind to allocate system memory
Allocate system memory from node id 0 will fail on NUMA system which has
no memory on node 0. Change to use new flag NoNUMABind to allocate
system memory from NUMA nodes which have free memory.

Change-Id: I8ef9ca28fc2ab5dd31d07a2d3eaf1d5886e798a0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-09-16 12:25:01 -04:00
Yong Zhao 3e4c42ef13 kfdtest: Improve the printing message for CuMasking tests
Decimal is better than hex in this case.

Change-Id: Ic15a9373e99160880b98d3dcd6827d551c87b77a
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-08-07 13:57:36 -04:00
shaoyunl 78e754ca5b KFDTest: Make shader compatiable for gfx9 and gfx10
Remove the CHIP name from the shader ISA and add wave_size(32) to make the same
shader can be  used for both  GFX9 and GFX10

Change-Id: I16ea72f87980c3d9c11298e20c06a0a073fe9a28
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2019-07-30 10:56:19 -04:00
shaoyunl 395750264d KFDTest : Add family ID when building SDMA packet
Some SDMA packet format might be different among asic versions

Change-Id: Ic7eda7554c23e3972e168480874ca67a92677346
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2019-07-22 16:36:49 -04:00
shaoyunl e9882daf11 KFDTest : Add gfx1xxx release_mem and acquire_mem packet support
use family ID as parameter when construct the packets

Change-Id: I6c1706954ab7b8cbb8bef2aab16edf21f5e1abf0
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2019-07-18 10:43:48 -04:00
Yong Zhao 5b18614eaf kfdtest: Add CreateDestroyCpQueue and CreateDestroySdmaQueue test
Those two tests cover the basic queue creation and destruction
without submitting packets to CP and SDMA user queues. During bringup,
they bring values in term of untangling the issues arising in queue
creation and packet execution, which are two very different kinds.

Because of those two tests, we also rename some existing tests as
follows:
CreateCpQueue             ->   SubmitPacketCpQueue
CreateSdmaQueue           ->   SubmitPacketSdmaQueue
CreateMultipleCpQueues    ->   MultipleCpQueues
CreateMultipleSdmaQueues  ->   MultipleSdmaQueues

Lastly, move MultipleCpQueues test closer to the CP queue section
rather than leaving it behind the SDMA queue section.

Change-Id: I110fb3f3fb21878339045dd1d1c8c9d61b8988b7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-04-23 15:08:17 -04:00
Yong Zhao 1d478f3cf2 kfdtest: Add a result check in CreateCpQueue test
With the orginal code, CreateCpQueue will report failure if
WaitOnValue return false. Add EXPECT_TRUE() so that in that case
the failure is reported.

Change-Id: I043d013958b452d7ccb9538dc296d99d024abf01
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-02-21 17:42:45 -05:00
Yong Zhao 7cd7830182 kfdtest: Use SDMA queue info from class KFDBaseComponentTest
Change-Id: Iacdb75006ef5f9ce6c4fbb0525d2c3a8535fdd23
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-01-31 14:00:57 -05:00
Oak Zeng 742fa5d871 Revert "Add test to allocate SDMA queue on specific engine"
This reverts commit af5b320c47.

Change-Id: I262d91afc60ba2618bf4a857f162ea5236d54131
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-01-21 10:36:54 -06:00
Oak Zeng af5b320c47 Add test to allocate SDMA queue on specific engine
Change-Id: I5b5140e4119fc01db250d63cca7389cf80ec0d16
Signed-off-by: Oak Zeng <ozeng@amd.com>
2018-11-20 11:17:43 -05:00
xinhui pan ab4610cff7 kfdtest: Add more debug information of sdma event interrupt test
We observe this test fails on gfx900+. Looks like the sdma packets are not
executed at all after we submit sometimes.

Run it with timeout 2s on gfx900.
[ RUN      ] KFDQMTest.SdmaEventInterrupt
[----------] SDMACopyData FAIL! 1485262707170 VS 1485262747814
[----------] Event On Queue 1:0 Timeout, try to resubmit packets!
[----------] The timeout event is signaled!
[          ] Time Consumption (ns)
[          ] 1: 1859427148
[          ] 2: 680148
[          ] 3: 6370
[          ] 4: 5481
/home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure
Value of: (ret)
  Actual: 31
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[----------] SDMACopyData FAIL! 1485367669958 VS 1485367750022
[----------] Event On Queue 2:1 Timeout, try to resubmit packets!
[----------] The timeout event is signaled!
[          ] Time Consumption (ns)
[          ] 1: 1881615148
[          ] 2: 673629
[          ] 3: 6074
[          ] 4: 5481
/home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure
Value of: (ret)
  Actual: 31
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[----------] SDMACopyData FAIL! 1485427671250 VS 1485427751238
[----------] Event On Queue 2:1 Timeout, try to resubmit packets!
[----------] The timeout event is signaled!
[          ] Time Consumption (ns)
[          ] 1: 1881508777
[          ] 2: 741629
[          ] 3: 6074
[          ] 4: 5481
/home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure
Value of: (ret)
  Actual: 31
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[  FAILED  ] KFDQMTest.SdmaEventInterrupt (23675 ms)

Change-Id: I7c1b752537d89782570df20838bf976578614f75
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-10-19 15:29:54 -04:00
xinhui pan e5a541eaf2 kfdtest: Add P2P bandwidth test
The test measures the bandwidth between GPUs. Currently we do not
care numa topology as some products really support across PCI-e root
complex p2p.

test result on two gfx900 system.
[ RUN      ] KFDPerformanceTest.P2PBandWidthTest
[          ] Copy from node to node by [push, NONE]
[          ] [1 -> 0] 6.13477 - 6.12695 GB/s
[          ] [1 -> 2] 3.77734 - 3.76855 GB/s
[          ] [2 -> 0] 6.67676 - 6.6543 GB/s
[          ] [2 -> 1] 6.14453 - 6.12793 GB/s
[          ] Copy from node to node by [pull, NONE]
[          ] [1 -> 0] 6.10547 - 6.08105 GB/s
[          ] [1 -> 2] 9.65527 - 9.65039 GB/s
[          ] [2 -> 0] 6.49805 - 6.4873 GB/s
[          ] [2 -> 1] 8.95508 - 8.85254 GB/s
[          ] Full duplex copy from node to node by [push|pull, NONE]
[          ] [1 -> 0] 11.0986 - 11.0986 GB/s
[          ] [1 -> 2] 7.54297 - 7.54297 GB/s
[          ] [2 -> 0] 12.0264 - 11.9639 GB/s
[          ] [2 -> 1] 12.0469 - 12.0371 GB/s
[          ] Full duplex copy from node to node by [push, push]
[          ] [1 <-> 2] 11.7324 - 11.4541 GB/s
[          ] Full duplex copy from node to node by [pull, pull]
[          ] [1 <-> 2] 11.4824 - 11.0508 GB/s
[          ] Copy from node to multiple nodes by [push, NONE]
[          ] [1 -> [0...2]] 5.625 - 5.73633 GB/s
[          ] [2 -> [0...2]] 6.45801 - 6.4707 GB/s
[          ] Copy from multiple nodes to node by [push, NONE]
[          ] [[1...2] -> 0] 12.8379 - 12.2578 GB/s

Now we can get more timestamp info like below.

Copy from node to node by [push, NONE]
[1 -> 0]
[1 : 0] #-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-###############################
[1 : 1] ####################################################################################################
[1 -> 2]
[1 : 0] #--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#-#--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#-#--#-#-#-#-#--#-#-######################################
[1 : 1] ##################################################################################################-#
[2 -> 0]
[2 : 0] ##-###-##-###-###-##-###-##-###-###-##-###-###-##-###-###-##-###-##-###-###-##-###-###-##-###-###-#################
[2 : 1] ###############################################################################-#############-###-##
[2 -> 1]
[2 : 0] ##-##-##-##-##-###-##-##-##-##-##-###-##-##-##-##-###-##-##-##-##-##-###-##-##-##-##-###-##-##-##-####################
[2 : 1] ################################################################################-###-############-##

[snip]

Full duplex copy from node to node by [push, push]
[1 <-> 2]
[1 : 0] #-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-####################################
[1 : 1] ################-###################################################-############-####-#############
[2 : 2] #-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-##-#-##-##-##-##-#-##-##-##-##-##-#-##################
[2 : 3] #####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-#####-##
Full duplex copy from node to node by [pull, pull]
[1 <-> 2]
[1 : 0] ######################################################################-##-#-###############-####-###
[1 : 1] #-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-############################
[2 : 2] ##-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-############
[2 : 3] #-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#########-#############
Copy from node to multiple nodes by [push, NONE]
[1 -> [0...2]]
[1 : 0] #-#--#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-###############################
[1 : 1] ########################################################################################-###-###-###
[2 -> [0...2]]
[2 : 0] ##-##-##-###-##-###-##-##-###-##-###-##-##-###-##-###-##-###-##-##-###-##-###-##-##-###-##-###-##-##################
[2 : 1] -################################################################################################-##
Copy from multiple nodes to node by [push, NONE]
[[1...2] -> 0]
[1 : 0] #-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-###############################
[1 : 1] ################################################################################################-#-#
[2 : 2] ##-##-##-###-##-##-###-##-##-##-###-##-##-###-##-##-###-##-##-###-##-##-##-###-##-##-###-##-##-###-##-##################
[2 : 3] #########################-#########################-#########################-#########################
[       OK ] KFDPerformanceTest.P2PBandWidthTest (15982 ms)

Change-Id: Ia90044191d51650ccb220476d31fb317aa3ad6ce
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-09-19 12:03:05 +08:00
Xiaojie Yuan 247fa9f1e0 Use 'RecordProperty' to record performance scores
For following test cases:
- KFDQMTest.QueueLatency
- KFDQMTest.BasicCuMaskingLinear
- KFDQMTest.BasicCuMaskingEven
- KFDMemoryTest.MMBandWidth
- KFDMemoryTest.MMapLarge
- KFDMemoryTest.MMBench

v2: xml element cannot start with a number, so change the key name of
    MMBandWidth and MMBench accordingly
    xml element cannot contain whitespaces, so trim whitespaces in "VRAM  "
v3: introduce KFDLog-like way to use KFDRecord

Change-Id: Ifc3ed5657621252a7b39dccf1ef4f50a92593f77
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
2018-09-18 17:41:14 +08:00
xinhui pan 07bd97a864 kfdtest: Fix queuelatency fail issue
The timestamp written by releaseMemory packet might still not be visible
when we fetch it.
To fix this bug, use event-based wait.

Change-Id: If2324eb3b3a632c711ee4dff4d03a93d5306c289
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-09-10 21:17:29 -04:00
xinhui pan 3e527bc7e8 kfdtest: add PM4EventInterrupt test
Similar with SdmaEventInterrupt, verify event interrupt on pm4 queue.

Change-Id: I0e43f26fd0d965126985820704215d2ef5e52c1a
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-08-24 13:21:01 +08:00
xinhui pan bdb1f8a066 kfdtest: Let SdmaEventInterrupt test more meaningful
Simulate some workload there to verify the sDMA event interrupt.

Change-Id: Ib5ad0c238cc66898f7835e765df50427ef106b04
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-08-24 11:27:34 +08:00
Kent Russell fe33461622 kfdtest: Consolidate logic for ASSERT vs EXPECT
ASSERT failures result in immediate termination of the test. EXPECT
returns a failure but continues execution. Reserve ASSERT for required
functionality (node initialization, queue creation, etc) where the rest
of the test cannot run if that call fails. Use EXPECT everywhere else

Change-Id: I1c11326fc3ae22b50fa83b07b3b49af1e1f4e69e
2018-08-23 06:20:18 -04:00
Kent Russell 414042abf7 kfdtest: Clean up comments
Consolidate style (use /* */ for multi-line), fix typos,
use dword instad of DWORD/DWord

Change-Id: I620e45c1687550db41127e45641b7d79d28223a1
2018-08-23 06:20:17 -04:00
xinhui pan 163fa2f3aa kfdtest: use HSAuint64 instead of unsigned HSAint64
This should fix gtest compile errors.

code like below has trouble,

typedef char char8;
typedef unsigned char uchar8;

ASSERT_NE((uchar8)1, 0);
ASSERT_NE((unsigned char8)1, 0); // compile error here
or
ASSERT_NE((unsigned char8)1, 0);
ASSERT_NE((uchar8)1, 0); // compile error here

HSA[u]int64 are alias. So ASSERT_XX((unsigned HSAint64)..)
with ASSERT_XX((HSAuint64)..) fail to compile.

Change-Id: I4c24bc699a69bd4f37c4bc8aaaa9f1a92a24a33e
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-08-16 16:03:52 +08:00
Kent Russell f2bd7e1d52 kfdtest: Consolidate log messages for skipped tests
When skipping a test, the output should be:
Skipping test: <reason>.

This will allow for easier identification, automation and general readability

Change-Id: I98bda1c068f9dbc83aeea74f642b6101121f234d
2018-08-14 10:11:50 -04:00
Kent Russell dffac0a97e kfdtest: Style cleanup
Clean up the KFDTest style via CPPLint. Some warnings remain regarding
volatile variables being cast to void*. This is the command used:
cpplint.py --linelength=120
--filter=-readability/multiline_string,-readability/todo,-build/include,-runtime/references

multiline_string is due to using ISA code
todo is to avoid errors that we don't have TODO(username) instead of TODO
include is about including the folder in the header includes
references is regarding non-const references '&' being const or using
pointers. That can be addressed later

Change-Id: I3c6622da0a13dd33ab29b2bfff48be25e763b750
2018-08-14 08:17:57 -04:00
xinhui pan 9d6d0911e4 kfdtest: make p2ptest go through all gpus
Implement sDMA copy packet broadcast.

Each time sDMA will copy its local vram to sysbuf and next GPU's vram.
That will verify where the p2p link is broken.
Currently we just test push of p2p.

test result on 2 cpus, 4 gpus, numa enabled system.
[ RUN      ] KFDQMTest.P2PTest
[          ] Test 2 -> 3
[          ] PASS 2 -> 3
[          ] Test 3 -> 4
[          ] PASS 3 -> 4
[          ] Test 4 -> 5
[          ] PASS 4 -> 5
[          ] Test 5 -> 0
[          ] PASS 5 -> 0
[       OK ] KFDQMTest.P2PTest (190 ms)

Change-Id: Ie6fb2604109e39465b8a873b3bb42abc6259825a
2018-08-07 21:13:37 -04:00
xinhui pan 86552aba4b kfdtest: make the output of QueueLatency test more readable
Change-Id: Ib33ac25509b23f2e5869bde126e3f11ef60f017e
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-08-01 10:06:33 +08:00
Yong Zhao 6df62c78b8 kfdtest: Add kfdtest source code
The code is a snapshot up to this commit around July 31 2018.

commit b00fadff36a3
Author: xinhui pan <xinhui.pan@amd.com>
Date:   Mon Jul 30 09:53:03 2018 +0800

    kfdtest: skip MMapLarge test on apu

    

Change-Id: I40e9a5a18e5c8f075e5290bb80532f1a3f689058
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-07-31 00:00:34 -04:00