Gráfico de Commits

46 Commits

Autor SHA1 Mensagem Data
Philip Yang 92076f6f1b kfdtest: add KFDMemoryTest MultiThreadRegisterUserptrTest
Test Thunk multiple threads register and deregister same userptr race
condition, to emulate application register same userptr to multiple
GPUs using multiple threads.

Use thread barrier to sync the threads, to start register userptr at
same time.



Change-Id: I6723dc39f75908026fa14a490e39e1fe49a13a1b
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2021-07-07 17:52:31 -04:00
Felix Kuehling 25288e07dc kfdtest: Handle EINTR in waitpid
If the signal arrives too late, it interrupts waitpid. Handle this
situation gracefully.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: If4925c352c81ba7fef8a940460b91f5e720b451e
2021-05-03 11:01:11 -04:00
Eric Huang a6703395f6 kfdtest: remove scc bit for cache coherence tests
It is to address gfx90a HW memory model changes.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: Ie5c5c5ee5ddfb75c0b4f625baf59ce37b4cc7c31
2021-04-26 19:55:49 -04:00
Kent Russell 83d80074f7 Merge gfx90a into amd-staging
Conflicts:
	CMakeLists.txt
	include/hsakmt.h
	src/libhsakmt.h
	src/libhsakmt.ver
	src/queues.c
	src/topology.c
	tests/kfdtest/src/KFDMemoryTest.cpp
	tests/kfdtest/src/KFDTestUtil.hpp

Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: Ic2732e7c0b5e42c1a3a91223f65a65064b602181
2021-03-02 07:48:22 -05:00
Eric Huang 9aa521d1ff KFDTest: add cache coherence tests for gfx90a
Three kfd subtests are added to verify new XGMI connection with
cache coherence HW link on A+A.

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Change-Id: I6960ec91cbfb696c4e6acb3b79fd83107003acdd
2021-02-23 12:22:32 -05:00
Harish Kasiviswanathan 085005f07b kfdtest: Add gfx9_PollNCMemory function to support NC memory
In A+A all system memory is mapped as NC. So add a new function
gfx9_PollNCMemory which will support NC memory.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I097b95fb156f73d6f480cd4fd262cc6fa5933f69
2021-02-23 12:20:29 -05:00
Harish Kasiviswanathan 57f46b53ec kfdtest: A+A: CP writes to NC mem need flush
Refer to commit "Mark buffers accessed by CP as UC"

A+A buffers are mapped as NC. CP (PM4Writes) need ReleaseMem function to
ensure the write go through to the memory

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I4ee55a6e40fba078f5950d95c8fee7ee076260bf
2021-02-23 12:20:29 -05:00
Mukul Joshi c861873dae Add SP3 assembler support for gfx90a.
Add updated SP3 static library with support for gfx90a and
also add initial corresponding changes in kfdtest.

Change-Id: I71bc6404ace7f9bf0dd74e712287136aa2b8a03d
2021-02-23 12:20:29 -05:00
Yifan Zhang 742f718722 kfdtest: Take vram size into account when calculate buffer number.
Vram size is relatively smaller in APU, e.g. 512MB.
Current MMBench doesn't support small vram system.
Running MMBench may have below errors:

[ RUN      ] KFDMemoryTest.MMBench
[          ] Found VRAM of 512MB.
[          ] Test (avg. ns)        alloc   mapOne  umapOne   mapAll  umapAll     free
[          ] --------------------------------------------------------------------------
[          ]   4K-SysMem-noSDMA         4569    20098     1292    18835      926     2218
[          ]  64K-SysMem-noSDMA        12738    20469     1030    19201     1293     4560
[          ]   2M-SysMem-noSDMA       256384    21020     1022    20568     1196    36294
[          ]  32M-SysMem-noSDMA      4031812    83750     5406    61156     4312   535656
[          ]   1G-SysMem-noSDMA    129260000   427000    34000   390000    30000 18548000
[          ] --------------------------------------------------------------------------
[          ]   4K-VRAM-noSDMA         3594    19637      979    19624     1357     2829
[          ]  64K-VRAM-noSDMA         3540    21062     1407    19614     1654     3024
/home/foreman/build/hsakmt-roct-amdgpu-1.0.9/sources/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:1119: Failure
Value of: (hsaKmtAllocMemory(allocNode, bufSize, memFlags, &bufs[i]))
  Actual: 6
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[  FAILED  ] KFDMemoryTest.MMBench (723 ms)

Fix this issue by changing buffer number calculation in MMBench.

Change-Id: I5cce95707a048248f1e825c807586818619eddaf
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
2020-12-17 07:41:24 -05:00
Chengming Gui 3ed8b96bf0 kfdtest: remove unsupported modifier 'offset'
fix 
v2: fix VGPR conflict
v3: use s_addc_u32 to replace s_add_u32

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Change-Id: I8fe6bf1f5bf99544038ad16128c2bebd559d3da9
2020-12-14 17:29:13 +08:00
Gang Ba 8e94dde685 kfdtest: check peer accessible with new function
check GPU peer accessible with p2p_links in system

Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: I026f16564303b687811d6648f0b7f84be6819979
2020-11-26 10:34:06 -05:00
Chengming Gui f283fe2854 kfdtest: update shader code for gfx10.3 kfdtest
s_store_* instruction set was retired from gfx10.3

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Change-Id: Ibe41a3fe7e053fb345b1af6ad4abc22a0885bc81
2020-11-03 22:25:39 -05:00
Yong Zhao 2b70d73f68 kfdtest: Improve the stablility of SignalHandling test
On gfx1012, allocating 1/4 of the system memory on a 32G RAM machine
could fail, resulting in this test to fail. Limit the maximum buffer
to allocate to be smaller than 3G to accommodate this situation.

Change-Id: I38b0a0f7da1f0b9ca851e04d2d0a51767858c801
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-01-07 17:28:57 -05:00
Yong Zhao 4daa25fceb kfdtest: Merge the two largest buffer test helper functions
This is cleaner.

Change-Id: I7740f3e0f93a63b35fefc3cb69712dfad68df552
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-25 18:45:23 -05:00
Yong Zhao b6cefa7bda kfdtest: Split BigBufferStressTest into two smaller tests
The previous BigBufferStressTest has too much stuff and takes a long
time to run. By separating largest*BufferTest out into other
tests, we dramatically reduce the time to run BigBufferStressTest and
therefore make reproducing issues much easier.

Meanwhile, rename the test to BigSysBufferStressTest to express more
information.

Change-Id: I5911f113c0bd50627ee6d84bbb4f2972cbed8886
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-25 18:28:17 -05:00
Yong Zhao cbe21fa261 kfdtest: Remove the queue submission in BigBufferStressTest
In order to accommodate the flaky queue submission under memory
shortage scenarios, BigBufferStressTest has become very much a hack,
undermining its purpose of testing the basic memory related operations.

Therefore, remove the queue submission part. The EvictTest should serve
the purpose of testing the memory and queue submission functionalities
when memory eviction happens.

Change-Id: I3c3603a0e834267eccb72f46efeabe1e053c8fc5
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-25 18:25:23 -05:00
Philip Yang 2fa7d23a82 kfdtest: use flag NoNUMABind for more test cases
If NUMA system no available memory on node 0, mbind will fail on node
0, so set flag NoNUMABind=1 to bypass mbind for all test cases which use
node 0 and allocate system memory.

Change-Id: I7962938ad2bed5a293ca5e6a8500c7f7e15ff453
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-11-06 18:33:46 -05:00
Oak Zeng 7593b41575 Unmap memory from GPU before free
Change-Id: Ic33b17cbaee5de7908d37527254f4f146e6b71e3
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-11-01 08:58:55 -04:00
Eric Huang 0174377351 kfdtest: add xgmi path for p2p tests
When large bar is not available, we can use
xgmi to do p2p tests.

Change-Id: Ib7b59fb8a4d41f605739a0428973f6b2f1a3450f
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2019-10-17 10:21:10 -04:00
Oak Zeng da789a2584 Fix memory map issue in KFDMemoryTest.CacheInvalidationOnRemoteWrite
The memory need to be mapped for both local and remote GPU access

Change-Id: I4aeaffc0851b6107fc91e9eaa6150764b06f5ca9
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-10-08 16:33:51 -05:00
Oak Zeng d7c53bb1fa Test new RW mtype for gfx908
Change-Id: Ia859c8f2e3c486f119772231a2d887f6783caf36
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-10-04 10:49:15 -04:00
Philip Yang 69d8f2d734 kfdtest: use flag NoNUMABind to allocate system memory
Allocate system memory from node id 0 will fail on NUMA system which has
no memory on node 0. Change to use new flag NoNUMABind to allocate
system memory from NUMA nodes which have free memory.

Change-Id: I8ef9ca28fc2ab5dd31d07a2d3eaf1d5886e798a0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-09-16 12:25:01 -04:00
Felix Kuehling 8f91d6a222 kfdtest: Enable more tests on gfx802
A number of tests are no longer broken on gfx802.

Change-Id: If70c77423f8f14de59490ab8ca156b0c4e7b5cf1
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-08-30 19:06:24 -04:00
Eric Huang cdc10991a9 kfdtest: avoid TTM eviction in KFDMemoryTest.BigBufferStressTest
Reserve half of dma32 zone for non-NUMA system.



Change-Id: Id7aea7b6ff6cc1cc7983ecd95f8078b7f1be630c
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2019-08-07 16:35:29 -04:00
Yong Zhao 5ae5854302 kfdtest: Improve FlatScratchAccess by not hardcoding the value
We should use the SE number reflected by NumShaderBanks of the node
rather than hardcoding it.

Change-Id: I945fb001f81ce506249cf485a7ce25aee8219bc7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-08-01 23:10:28 -04:00
shaoyunl 78e754ca5b KFDTest: Make shader compatiable for gfx9 and gfx10
Remove the CHIP name from the shader ISA and add wave_size(32) to make the same
shader can be  used for both  GFX9 and GFX10

Change-Id: I16ea72f87980c3d9c11298e20c06a0a073fe9a28
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2019-07-30 10:56:19 -04:00
shaoyunl 85a9821519 KFDTest: Shader modification for gfx10
Modified shaders in KFD memory test to support gfx10.
There is no gprs register for flat_scratch on gfx10.
Use s_setreg_b32 instruction to set flat scratch base
address register

Change-Id: I505156a046056b61ce2d873343feb50ce635274a
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-07-25 10:54:30 -04:00
shaoyunl 395750264d KFDTest : Add family ID when building SDMA packet
Some SDMA packet format might be different among asic versions

Change-Id: Ic7eda7554c23e3972e168480874ca67a92677346
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2019-07-22 16:36:49 -04:00
shaoyunl e9882daf11 KFDTest : Add gfx1xxx release_mem and acquire_mem packet support
use family ID as parameter when construct the packets

Change-Id: I6c1706954ab7b8cbb8bef2aab16edf21f5e1abf0
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2019-07-18 10:43:48 -04:00
Felix Kuehling 4e9ff4393d kfdtest: MMBench: Test a more useful range of buffer sizes
Currently the test only covers relatively small buffers sizes. It's
useful to test buffer sizes up to 1GB to see the impact of features
that target the efficiency of large buffer allocations and mappings.

Change-Id: I2e8d5afd482894dbe2166f32d38091199b9c15e6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-07-11 17:15:39 -04:00
Oak Zeng 3b014adccc Device HDP flush test
Change-Id: I1c19e44caeee4a6e59200dceb718896fcff9bf82
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-07-07 21:59:37 -04:00
Oak Zeng 5d163cd821 Fix HostHdpFlush shader
1. Use s_mov_b32 to move 0xcafe to s18. s_movk_i32 is a sign extention move
instruction. Oxcafe will be extended to 0xffffcafe which is not desired
2. Add wait to s_load_dword instruction to make sure memory read finish before
the next store instruction.

Change-Id: I665d1d471019edfaba5693e07cdc567d4103573f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-06-21 11:51:51 -04:00
Philip Yang 4066dcd542 kfdtest: increase BigBufStressTest timeout and avoid VM fault
If TTM eviction and restore happens, it may takes very long time if
retry, the longest time is 5 minutes during my test. There is chance
packet is submited to queue while eviction, we have to increase the
Wait4PacketConsumption timeout.

The queue will continue to execute after eviction and restore. If we
upmap the memory from GPU while queue is evicted, this will cause VM
fault. Change to unmap memory after queue is destroyed.



Change-Id: I1b44e2274ea7b83398b2e3293578dad6947cb5af
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18 09:28:43 -04:00
Philip Yang 36776e9917 kfdtest: avoid BigBufStressTest run on NUMA node 0
Because dma32 zone is on node 0, use all system memory on node 0 will
cause TTM eviction to free dma32 zone for other devices which only
work with 32bit physical address. The TTM eviction and restore may take
too long and cause queue timeout.

Running on other NUMA nodes, the NUMA default memory policy is
MPOL_PREFERRED, means TTM will get pages from local node first, and then
get remaining pages from other nodes. Check /proc/buddyinfo can confirm
this.

Reset NUMA bind to all after the test.



Change-Id: I39b373c07a2d5aa396f5c7602bffabab0481930f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18 09:28:20 -04:00
Oak Zeng b26580788b Host HDP flush test
Change-Id: I396ac021d15da972f4841d6d8f90d4b175e64ecd
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-05-03 09:31:41 -04:00
shaoyunl d8009b4fd3 KFDTest: fix failure when run KFDTest on multi-GPU small bar system
On small bar multi-gpu system, hsaKmtMemoryMapToGPU will fail due to latest
kernel P2P sanity check. Swith to use hsaKmtMemoryMapToGPUNodes to fix
the failure

Change-Id: Id8b6329d1243df0e908cc9a171b5c7f9156f4a8b
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2018-11-19 16:09:31 -05:00
Xiaojie Yuan 247fa9f1e0 Use 'RecordProperty' to record performance scores
For following test cases:
- KFDQMTest.QueueLatency
- KFDQMTest.BasicCuMaskingLinear
- KFDQMTest.BasicCuMaskingEven
- KFDMemoryTest.MMBandWidth
- KFDMemoryTest.MMapLarge
- KFDMemoryTest.MMBench

v2: xml element cannot start with a number, so change the key name of
    MMBandWidth and MMBench accordingly
    xml element cannot contain whitespaces, so trim whitespaces in "VRAM  "
v3: introduce KFDLog-like way to use KFDRecord

Change-Id: Ifc3ed5657621252a7b39dccf1ef4f50a92593f77
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
2018-09-18 17:41:14 +08:00
xinhui pan a040a24243 kfdtest: Let BigBufferStressTest detect memory leak
As it will alloc as much as small system memory to reach the allocation limit.
We can try to alloc memory several times to see if any allocation in
the previous step cause memory leak.

Also we test if GPU can access these memory correctly or not.

Change-Id: I309f9821b6bc99c212a6bfbc21fe3086ab589fd3
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-08-28 22:50:42 -04:00
xinhui pan 1076075a1c kfdtest: Add some asserts in BigBufferStressTest
It should have PASS/FAIL report for the vram allocated size.

Change-Id: I546c02c2ed02f1cfb5278e0dfd7b18ade39faafb
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-08-23 23:01:20 -04:00
Kent Russell fe33461622 kfdtest: Consolidate logic for ASSERT vs EXPECT
ASSERT failures result in immediate termination of the test. EXPECT
returns a failure but continues execution. Reserve ASSERT for required
functionality (node initialization, queue creation, etc) where the rest
of the test cannot run if that call fails. Use EXPECT everywhere else

Change-Id: I1c11326fc3ae22b50fa83b07b3b49af1e1f4e69e
2018-08-23 06:20:18 -04:00
Kent Russell 414042abf7 kfdtest: Clean up comments
Consolidate style (use /* */ for multi-line), fix typos,
use dword instad of DWORD/DWord

Change-Id: I620e45c1687550db41127e45641b7d79d28223a1
2018-08-23 06:20:17 -04:00
Yong Zhao 62f7dc2a48 kfdtest: Do not set GTEST_FLAG throw_on_failure
The flag makes EXPECT_* to behave like ASSERT_*, which actually work against
our favor, so disable the flag.

Change-Id: I2ea1dfeaf916b396593a504d081148abdac0fc70
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2018-08-15 18:08:39 -04:00
Kent Russell f2bd7e1d52 kfdtest: Consolidate log messages for skipped tests
When skipping a test, the output should be:
Skipping test: <reason>.

This will allow for easier identification, automation and general readability

Change-Id: I98bda1c068f9dbc83aeea74f642b6101121f234d
2018-08-14 10:11:50 -04:00
Kent Russell dffac0a97e kfdtest: Style cleanup
Clean up the KFDTest style via CPPLint. Some warnings remain regarding
volatile variables being cast to void*. This is the command used:
cpplint.py --linelength=120
--filter=-readability/multiline_string,-readability/todo,-build/include,-runtime/references

multiline_string is due to using ISA code
todo is to avoid errors that we don't have TODO(username) instead of TODO
include is about including the folder in the header includes
references is regarding non-const references '&' being const or using
pointers. That can be addressed later

Change-Id: I3c6622da0a13dd33ab29b2bfff48be25e763b750
2018-08-14 08:17:57 -04:00
xinhui pan 3f7b6356fd kfdtest: fix a memory leak issue in MMapLarge test
When mapMemoryToGpu fails, we need unregister it with user address as
the gpu address is not available.

Change-Id: I4418eeaa7aa37008f5bffa144e2c2171f0d238fd
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-08-10 05:26:06 -04:00
Yong Zhao 6df62c78b8 kfdtest: Add kfdtest source code
The code is a snapshot up to this commit around July 31 2018.

commit b00fadff36a3
Author: xinhui pan <xinhui.pan@amd.com>
Date:   Mon Jul 30 09:53:03 2018 +0800

    kfdtest: skip MMapLarge test on apu

    

Change-Id: I40e9a5a18e5c8f075e5290bb80532f1a3f689058
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-07-31 00:00:34 -04:00