xinhui pan
918a45a430
kfdtest: add P2POverheadTest
...
This measures the latency and overhead of SDMA packet
consumption over P2P.
It is similar to the QueueLatency test. In addition, it shows the queue's
overhead under different workloads in more detail.
Test results on two gfx900 GPUs:
[ RUN ] KFDPerformanceTest.P2POverheadTest
[ ] Test (avg. ns) | Size     4     8    16    64   256  1024
[ ] ---------------------------------------------------------
[ ] [push]     [1 -> 0]     333   148   185   111   148   148
[ ] [push]     [1 -> 1]     370   222   333    74   148   111
[ ] [push]     [1 -> 2]     333   148   148   148   148   148
[ ] [push]     [2 -> 0]     111   333   259   148   148   148
[ ] [push]     [2 -> 1]     222   148   185   148   148   148
[ ] [push]     [2 -> 2]     222   111   370   111    74   148
[ ] [pull]     [1 -> 0]     370   296   296   148   185   148
[ ] [pull]     [1 -> 1]     185   333   222   148   222   148
[ ] [pull]     [1 -> 2]     222   444   259   148   185   111
[ ] [pull]     [2 -> 0]     148   148   148   148   148   148
[ ] [pull]     [2 -> 1]     148   148   148   148   148   148
[ ] [pull]     [2 -> 2]     185   148   148    74   222   296
[ ] [push|pull][1 -> 0]    1259  1222  1259  1074  1037   962
[ ] [push|pull][1 -> 1]    1037  1037  1037   740   740  1000
[ ] [push|pull][1 -> 2]    1259  1259  1296  1037  1000  1074
[ ] [push|pull][2 -> 0]    1037  1037  1037  1074  1037  1148
[ ] [push|pull][2 -> 1]    1037  1037  1037  1037   925  1074
[ ] [push|pull][2 -> 2]     666   666   740   740   703   925
[ OK ] KFDPerformanceTest.P2POverheadTest (459 ms)
Change-Id: I422263cb52f7ce184f6f1ff4466d04c239fbe9c9
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-09-24 09:28:00 -04:00
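For post-processing the P2POverheadTest output above, a minimal sketch of a row parser follows. It is not part of kfdtest; the names SIZES, ROW and parse_overhead_row are hypothetical, and the column list is simply copied from the "Size" header of the result table. It assumes each result row carries a mode tag, a [src -> dst] pair, and one average (ns) per size column.

```python
import re

# Column headers copied from the "Size" row of the result table (assumption:
# every result row has exactly one value per listed size).
SIZES = [4, 8, 16, 64, 256, 1024]

# Matches "[push]", "[pull]" or "[push|pull]", then "[src -> dst]", then numbers.
ROW = re.compile(r"\[(push\|pull|push|pull)\]\s*\[(\d+) -> (\d+)\]\s+([\d\s]+)$")

def parse_overhead_row(line):
    """Return (mode, src, dst, {size: avg_ns}) or None for non-result lines."""
    m = ROW.search(line.strip())
    if m is None:
        return None
    mode, src, dst, values = m.groups()
    ns = [int(v) for v in values.split()]
    return mode, int(src), int(dst), dict(zip(SIZES, ns))

if __name__ == "__main__":
    print(parse_overhead_row("[ ] [push|pull][1 -> 0] 1259 1222 1259 1074 1037 962"))
```

Header and separator lines return None, so the function can be applied to every line of a captured log and the None results filtered out.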
xinhui pan
e5a541eaf2
kfdtest: Add P2P bandwidth test
...
The test measures the bandwidth between GPUs. Currently we ignore the
NUMA topology, as some products do support P2P across PCIe root
complexes.
Test results on a system with two gfx900 GPUs:
[ RUN ] KFDPerformanceTest.P2PBandWidthTest
[ ] Copy from node to node by [push, NONE]
[ ] [1 -> 0] 6.13477 - 6.12695 GB/s
[ ] [1 -> 2] 3.77734 - 3.76855 GB/s
[ ] [2 -> 0] 6.67676 - 6.6543 GB/s
[ ] [2 -> 1] 6.14453 - 6.12793 GB/s
[ ] Copy from node to node by [pull, NONE]
[ ] [1 -> 0] 6.10547 - 6.08105 GB/s
[ ] [1 -> 2] 9.65527 - 9.65039 GB/s
[ ] [2 -> 0] 6.49805 - 6.4873 GB/s
[ ] [2 -> 1] 8.95508 - 8.85254 GB/s
[ ] Full duplex copy from node to node by [push|pull, NONE]
[ ] [1 -> 0] 11.0986 - 11.0986 GB/s
[ ] [1 -> 2] 7.54297 - 7.54297 GB/s
[ ] [2 -> 0] 12.0264 - 11.9639 GB/s
[ ] [2 -> 1] 12.0469 - 12.0371 GB/s
[ ] Full duplex copy from node to node by [push, push]
[ ] [1 <-> 2] 11.7324 - 11.4541 GB/s
[ ] Full duplex copy from node to node by [pull, pull]
[ ] [1 <-> 2] 11.4824 - 11.0508 GB/s
[ ] Copy from node to multiple nodes by [push, NONE]
[ ] [1 -> [0...2]] 5.625 - 5.73633 GB/s
[ ] [2 -> [0...2]] 6.45801 - 6.4707 GB/s
[ ] Copy from multiple nodes to node by [push, NONE]
[ ] [[1...2] -> 0] 12.8379 - 12.2578 GB/s
We can now also get more detailed timestamp info, as shown below.
Copy from node to node by [push, NONE]
[1 -> 0]
[1 : 0] #-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-###############################
[1 : 1] ####################################################################################################
[1 -> 2]
[1 : 0] #--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#-#--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#-#--#-#-#-#-#--#-#-######################################
[1 : 1] ##################################################################################################-#
[2 -> 0]
[2 : 0] ##-###-##-###-###-##-###-##-###-###-##-###-###-##-###-###-##-###-##-###-###-##-###-###-##-###-###-#################
[2 : 1] ###############################################################################-#############-###-##
[2 -> 1]
[2 : 0] ##-##-##-##-##-###-##-##-##-##-##-###-##-##-##-##-###-##-##-##-##-##-###-##-##-##-##-###-##-##-##-####################
[2 : 1] ################################################################################-###-############-##
[snip]
Full duplex copy from node to node by [push, push]
[1 <-> 2]
[1 : 0] #-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-####################################
[1 : 1] ################-###################################################-############-####-#############
[2 : 2] #-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-##-#-##-##-##-##-#-##-##-##-##-##-#-##################
[2 : 3] #####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-#####-##
Full duplex copy from node to node by [pull, pull]
[1 <-> 2]
[1 : 0] ######################################################################-##-#-###############-####-###
[1 : 1] #-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-############################
[2 : 2] ##-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-############
[2 : 3] #-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#########-#############
Copy from node to multiple nodes by [push, NONE]
[1 -> [0...2]]
[1 : 0] #-#--#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-###############################
[1 : 1] ########################################################################################-###-###-###
[2 -> [0...2]]
[2 : 0] ##-##-##-###-##-###-##-##-###-##-###-##-##-###-##-###-##-###-##-##-###-##-###-##-##-###-##-###-##-##################
[2 : 1] -################################################################################################-##
Copy from multiple nodes to node by [push, NONE]
[[1...2] -> 0]
[1 : 0] #-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-###############################
[1 : 1] ################################################################################################-#-#
[2 : 2] ##-##-##-###-##-##-###-##-##-##-###-##-##-###-##-##-###-##-##-###-##-##-##-###-##-##-###-##-##-###-##-##################
[2 : 3] #########################-#########################-#########################-#########################
[ OK ] KFDPerformanceTest.P2PBandWidthTest (15982 ms)
Change-Id: Ia90044191d51650ccb220476d31fb317aa3ad6ce
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-09-19 12:03:05 +08:00
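To compare P2PBandWidthTest runs, the result lines and the '#'/'-' traces above can be reduced to numbers. The following is a minimal sketch, not part of kfdtest; parse_bandwidth and hash_fraction are hypothetical helper names. The commit message does not say what the two GB/s figures on each line or the '#' and '-' marks represent, so the code only extracts and counts them without interpreting them.

```python
import re

PREFIX = re.compile(r"^\[\s*\]\s*")  # empty gtest-style "[ ]" column, if present
RESULT = re.compile(r"^(\[.*\])\s+([\d.]+) - ([\d.]+) GB/s$")

def parse_bandwidth(line):
    """Return (route, first_gbps, second_gbps) or None for non-result lines."""
    body = PREFIX.sub("", line.strip())
    m = RESULT.match(body)
    if m is None:
        return None
    return m.group(1), float(m.group(2)), float(m.group(3))

def hash_fraction(trace):
    """Fraction of '#' among the '#' and '-' marks of one trace string."""
    marks = [c for c in trace if c in "#-"]
    if not marks:
        return 0.0
    return sum(c == "#" for c in marks) / len(marks)

if __name__ == "__main__":
    print(parse_bandwidth("[ ] [1 -> [0...2]] 5.625 - 5.73633 GB/s"))
    print(hash_fraction("#-#-##"))
```

Pass only the mark portion of a trace line to hash_fraction (the text after the "[node : index]" label), since any '-' in a label would otherwise be counted as a mark.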
Felix Kuehling
608dddbe9d
kfdtest: Fix gfx902 blacklist
...
Removed some tests from the blacklist that are now passing. Added two
new tests that hang the GPU.
Change-Id: I09e729590e5181311375058be492d387342ba2fe
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-08-31 15:04:50 -04:00
Felix Kuehling
d3fdaaca3a
kfdtest: Enable more tests for gfx900
...
A lot of tests were disabled on gfx900 for historical reasons that
are no longer valid. The only remaining one that won't work on
gfx900 is BasicAddressWatch.
Change-Id: I11507de0dfd31262713127d6cb15cc09c14b8b9f
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-08-15 14:22:19 -04:00
Felix Kuehling
5c742f3e5e
kfdtest: Blacklist Fragmentation test on all chips
...
This test has been intermittently failing for various reasons and
was already disabled on all chips except Ellesmere. It stresses
memory management in unusual ways by having lots of memory allocated
but not mapped, which is not relevant to compute applications over
ROCr.
Change-Id: I6b791ca7e2e0fcfe93fc720063b4b56acfded751
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-08-03 20:14:46 -04:00
Yong Zhao
1d43938ac7
kfdtest: Add run utility files for kfdtest
...
A README.txt file is added to help the open-source community use kfdtest
effectively.
After building, run_kfdtest.sh in the build output folder can be used
to run the tests.
Change-Id: I9612d9d5a63bd4cdc3a328efd9961d3cc92a6ba5
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-07-31 00:02:04 -04:00