rocm-systems

Author	SHA1	Message	Date
David Francis	30b1f23f7a	kfdtest: Add coherency tests for Aqua Vanjaram Aqua Vanjaram is intended to have fine-grained coherency from anywhere to anywhere else using read-acquire and write-release primitives. Add a test that writes to memory covered by five different cache lines, then write-releases, while another thread read-acquires, then reads those five locations in memory. There are nine variations of the test to cover CPU-GPU, same-GPU and across-GPU, vector instructions and scalar instructions, and data local to the acquirer or receiver. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: I20d2db5c53bd280e971479aad7e61df6ed5d3623	2023-04-19 10:28:05 -04:00
Philip Yang	598e3e8d86	kfdtest: KFDMemoryTest.DeviceHdpFlush requires large bar KFDMemoryTest.DeviceHdpFlush requires device node 0 is large bar to check VRAM content from CPU, run the test only if device 0 is large bar GPU. Change-Id: I874b153219550c50b724625e971e3ed3a84dc652 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2023-04-14 10:03:38 -04:00
David Francis	e32278a612	kfdtest: Restrict DriverHDPFlush to systems with PCIe Nodes with XGMI have no HDP, so DriverHDPFlush should skip. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: If5a87e660712e51d03e750d8e044786036b2e603	2023-04-14 10:03:38 -04:00
David Francis	16c6530330	kfdtest: Deprecate PollNCMemoryIsa Even with the restriction to only compile on gfx90a, this shader still fails CompileShaders test. There don't seem to be any systems that actually use it. Leave it in the shader store, but remove it otherwise Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: I41bec6ba10363d42b163ac101c3a92edaad6d6df	2023-04-14 10:03:38 -04:00
Alex Sierra	63c8cf115a	src: use SVM mechanism to register userptr memory Register and map userptrs through the Shared Virtual Memory (SVM) API at the kernel level when available. With this approach, performance improves because registering and unregistering memory does not trigger any system call to the KFD driver. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: I3726b4b5e1c6a52a83786fbe0af6322eb29ae7c9	2023-03-22 13:33:35 -05:00
Daniel Phillips	d3bb1ca4af	kfdtests: Relax MemoryAllocAll failure criteria The MemoryAllocAll test in kfdtests exercises the new KFD memory availability API by trying to allocate a single buffer object that exactly fills all of VRAM. Desired object size is determined using the memory availability KFD ioctl via libhsakmt, then an object is allocated slightly larger than that size. If the allocation attempt fails then the test tries to allocate a slightly smaller object, and continues trying with smaller sizes until the allocation succeeds. The test succeeds if the successfully allocated object is within some specified tolerance of the available memory reported. There are a number of known issues that can cause the successfully allocated object to be significantly smaller than reported availability. Until these issues are addressed, we should not fail the test, but just log the actual divergence between the size of the object we thought we could allocate, and what was actually possible. Signed-off-by: Daniel Phillips <daniel.phillips@amd.com> Change-Id: I165a30865ffbb2353286dcc896ad8e24af124615	2023-03-03 15:24:39 -08:00
Felix Kuehling	e5ab87ede7	kfdtest: Add test for hsaKmtExportDMABufHandle Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: Ia87377c1d4201fecfa00c2e0ca53b507608df2b3	2023-02-27 14:44:11 -05:00
kent.russell@amd.com	64aa9009e1	Add check for available_memory API If the KFD IOCTL version doesn't support available_memory, don't run the test. Just skip the test Change-Id: Iebf526d4563ab9f3c054bbfb38c214a1b893fcb5	2023-02-23 15:19:28 -05:00
Alex Sierra	f2bda56d04	Revert "src: use SVM mechanism to register userptr memory" This reverts commit `178a619b80`. There are some openMP issues that were introduced after SVM userptr feature was added. Signed-off-by: Alex Sierra <Alex.Sierra@amd.com> Change-Id: I7ef87c5232a3bcbe594c743fa4b4958601845ba5	2022-12-08 17:33:51 -06:00
Daniel Phillips	e71eb13784	kfdtest: Also detect under-reporting of available memory Detect under-reporting of available memory by initially attempting to allocate substantially more than reported available memory, and ensure that the allocation fails. Continue shrinking the attempted allocation until it succeeds, then fail the test if the successful allocation is either too much more than or too much less than reported available. Signed-off-by: Daniel Phillips <daniel.phillips@amd.com> Change-Id: Ib418f0aa26e8db80590a6c5f2578da56a4b60f2b	2022-11-28 11:43:48 -05:00
Eric Huang	8e8aa024fd	kfdtest: remove scc test in MapUnmapToNodes for gfx90a A+A Modifier scc is disabled from gfx90a's asm, so remove the shader for gfx90a A+A and keep it for newer asics with scc support. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Change-Id: Iec3c7ccd5156a855adb2b02feb3db0761876aa2f	2022-11-25 13:55:28 -05:00
Ramesh Errabolu	75428364a7	Add support for CRIU testing Change-Id: I8945a078ee8ae491245da6091e64b118584a48ab	2022-11-02 15:40:03 -04:00
Alex Sierra	178a619b80	src: use SVM mechanism to register userptr memory Register and map userptrs through the Shared Virtual Memory (SVM) API at the kernel level when available. With this approach, performance improves because registering and unregistering memory does not trigger any system call to the KFD driver. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: I20723cbeb340bf48b95e1115f0102c031397bc14	2022-10-21 15:32:02 -04:00
Daniel Phillips	169673a435	kfdtest: Add thunk test for KFD memory availability ioctl Signed-off-by: Daniel Phillips <Daniel.Phillips@amd.com> Change-Id: Ic4c1ffefdc3570718a1fce4e53ca5f1ebde8c479	2022-09-21 13:26:38 -04:00
jie1zhan	17fb40f1f6	Fix allocate memory failed in VRAM : The kernel driver will do align VRAM allocations to 2MB, instead of 4KB. Change-Id: Iea9d8c0f02999b9ea5fd931da82240a33f7bcc69	2022-07-30 01:18:50 -04:00
Graham Sider	ffa6c95858	kfdtest: Hotfix wrong isaBuffer used in DeviceHdpFlush In DeviceHdpFlush, isaBuffer was accidentally used instead of isaBuffer0 during LLVM re-work--revert. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I98d2322e772f821a39505bb336ceb4e6cd8722ef	2022-05-04 16:13:20 -04:00
David Francis	4b041a8ad9	kfdtest: Use correct isa buffer in GPU coherency test In the LLVM rework, a line was accidentally changed from isaBuffer1 to isaBuffer, causing VramCacheCoherenceWithRemoteGPU to fail. Change it back. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: Ie1f3465b5c46556f18682d1b3d1f086bb790c648	2022-05-02 09:13:20 -04:00
Graham Sider	c926d83b5a	kfdtest: Move KFDMemoryTest shaders to ShaderStore Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I3335ca1f9dbe849233cf85253e0e92b56a20b8c9	2022-04-26 13:14:33 -04:00
Graham Sider	039bce94a6	kfdtest: Update KFDMemoryTest to LLVM Asm - Reformat shaders for legibility - Move assembly processes from IsaGen (CompileShader) to Assembler (RunAssembleBuf) - LLVM syntax change on ScratchCopyDwordIsa_gfx10: hwreg(HW_REG_SHADER_FLAT_SCRATCH_LO/HI) -> hwreg(HW_REG_FLAT_SCR_LO/HI) - Fix bug in CopyOnSignalIsa_gfx10 and PollMemoryIsa_gfx10 whereby flat_store_dword used vector reg format v[n,n]. Changed to v[n:n] Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Id182cfb8aeb7372366c59affb5cbdd145909ee96	2022-04-26 13:14:33 -04:00
Prike Liang	6c103877dd	kfdtest: decrease granularityMB for handling small vram system It's not possible to allocate the 3/4 vram size with granularityMB being 128 when vram size < 512MB and decrease granularityMB to 16 has no significant impact on ROCt test on other system. So let's decrease granularityMB on small vram system for handling LargestVramBufferTest(). Change-Id: Iea7c29abfd382a20761b653730fd09a220ad2fd0 Signed-off-by: Prike Liang <Prike.Liang@amd.com>	2022-04-19 23:28:26 -04:00
Felix Kuehling	3ecd54f098	kfdtest: Skip slow tests in MMBandWidth Some VRAM access tests in MMBandWidth can be very slow on systems with complicated PCIe topology. Skip tests that take a long time to avoid excessively long running tests with little benefit. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: I2950237347fc2f764f6aa3292ab819051472bf37	2022-04-15 23:03:41 -04:00
Ruili Ji	0340c68031	kfdtest : adjust memory size for KFDMemoryTest. Total VRAM size on an APU is usually 512M, and the framebuffer is also allocated from VRAM, so there is not enough memory for this case. /home/ruiliji2/p5/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:1285: Failure Value of: (hsaKmtMapMemoryToGPUNodes(bufs[i], bufSize, &altVa, mapFlags, 1, &defaultGPUNode)) [ FAILED ] KFDMemoryTest.MMBench (1034 ms) Change-Id: Ib4201291122d85f6512a85859aea9a4713fb4f5c (cherry picked from commit a9f924484e7022a2d53ee02811b080f0833eba55)	2022-01-09 20:52:11 -05:00
Yang Wang	033b52c4e4	kfdtest: skip hdp flush test in sriov mode Skip the HDP flush test when the remap feature is not supported. Background: the HDP register remap is skipped in SR-IOV mode, which leaves the MMIO base as a null pointer. Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: Ib9aea1900931e30571656397a485ee4db051ec0a	2021-12-20 20:00:43 +08:00
Philip Yang	c3c1618db7	kfdtest: query userptr pointer alloc flags Test if query userptr pointer info return correct alloc flags, CoarseGrain by default. Test if query hsaKmtAllocMemory pointer info return correct alloc CoarseGrain flags. Change-Id: If3a1175645717e5d7c475d6ff35b02d6876a1f7c Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2021-07-22 21:14:46 -04:00
Philip Yang	92076f6f1b	kfdtest: add KFDMemoryTest MultiThreadRegisterUserptrTest Test Thunk multiple threads register and deregister same userptr race condition, to emulate application register same userptr to multiple GPUs using multiple threads. Use thread barrier to sync the threads, to start register userptr at same time. Change-Id: I6723dc39f75908026fa14a490e39e1fe49a13a1b Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2021-07-07 17:52:31 -04:00
Felix Kuehling	25288e07dc	kfdtest: Handle EINTR in waitpid If the signal arrives too late, it interrupts waitpid. Handle this situation gracefully. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: If4925c352c81ba7fef8a940460b91f5e720b451e	2021-05-03 11:01:11 -04:00
Eric Huang	a6703395f6	kfdtest: remove scc bit for cache coherence tests It is to address gfx90a HW memory model changes. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Change-Id: Ie5c5c5ee5ddfb75c0b4f625baf59ce37b4cc7c31	2021-04-26 19:55:49 -04:00
Kent Russell	83d80074f7	Merge gfx90a into amd-staging Conflicts: CMakeLists.txt include/hsakmt.h src/libhsakmt.h src/libhsakmt.ver src/queues.c src/topology.c tests/kfdtest/src/KFDMemoryTest.cpp tests/kfdtest/src/KFDTestUtil.hpp Signed-off-by: Kent Russell <kent.russell@amd.com> Change-Id: Ic2732e7c0b5e42c1a3a91223f65a65064b602181	2021-03-02 07:48:22 -05:00
Eric Huang	9aa521d1ff	KFDTest: add cache coherence tests for gfx90a Three kfd subtests are added to verify new XGMI connection with cache coherence HW link on A+A. Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com> Change-Id: I6960ec91cbfb696c4e6acb3b79fd83107003acdd	2021-02-23 12:22:32 -05:00
Harish Kasiviswanathan	085005f07b	kfdtest: Add gfx9_PollNCMemory function to support NC memory In A+A all system memory is mapped as NC. So add a new function gfx9_PollNCMemory which will support NC memory. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Change-Id: I097b95fb156f73d6f480cd4fd262cc6fa5933f69	2021-02-23 12:20:29 -05:00
Harish Kasiviswanathan	57f46b53ec	kfdtest: A+A: CP writes to NC mem need flush Refer to commit "Mark buffers accessed by CP as UC" A+A buffers are mapped as NC. CP (PM4Writes) need ReleaseMem function to ensure the write go through to the memory Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Change-Id: I4ee55a6e40fba078f5950d95c8fee7ee076260bf	2021-02-23 12:20:29 -05:00
Mukul Joshi	c861873dae	Add SP3 assembler support for gfx90a. Add updated SP3 static library with support for gfx90a and also add initial corresponding changes in kfdtest. Change-Id: I71bc6404ace7f9bf0dd74e712287136aa2b8a03d	2021-02-23 12:20:29 -05:00
Yifan Zhang	742f718722	kfdtest: Take vram size into account when calculate buffer number. Vram size is relatively smaller in APU, e.g. 512MB. Current MMBench doesn't support small vram system. Running MMBench may have below errors: [ RUN ] KFDMemoryTest.MMBench [ ] Found VRAM of 512MB. [ ] Test (avg. ns) alloc mapOne umapOne mapAll umapAll free [ ] -------------------------------------------------------------------------- [ ] 4K-SysMem-noSDMA 4569 20098 1292 18835 926 2218 [ ] 64K-SysMem-noSDMA 12738 20469 1030 19201 1293 4560 [ ] 2M-SysMem-noSDMA 256384 21020 1022 20568 1196 36294 [ ] 32M-SysMem-noSDMA 4031812 83750 5406 61156 4312 535656 [ ] 1G-SysMem-noSDMA 129260000 427000 34000 390000 30000 18548000 [ ] -------------------------------------------------------------------------- [ ] 4K-VRAM-noSDMA 3594 19637 979 19624 1357 2829 [ ] 64K-VRAM-noSDMA 3540 21062 1407 19614 1654 3024 /home/foreman/build/hsakmt-roct-amdgpu-1.0.9/sources/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:1119: Failure Value of: (hsaKmtAllocMemory(allocNode, bufSize, memFlags, &bufs[i])) Actual: 6 Expected: HSAKMT_STATUS_SUCCESS Which is: 0 [ FAILED ] KFDMemoryTest.MMBench (723 ms) Fix this issue by changing buffer number calculation in MMBench. Change-Id: I5cce95707a048248f1e825c807586818619eddaf Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>	2020-12-17 07:41:24 -05:00
Chengming Gui	3ed8b96bf0	kfdtest: remove unsupported modifier 'offset' fix v2: fix VGPR conflict v3: use s_addc_u32 to replace s_add_u32 Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Change-Id: I8fe6bf1f5bf99544038ad16128c2bebd559d3da9	2020-12-14 17:29:13 +08:00
Gang Ba	8e94dde685	kfdtest: check peer accessible with new function check GPU peer accessible with p2p_links in system Signed-off-by: Gang Ba <gaba@amd.com> Change-Id: I026f16564303b687811d6648f0b7f84be6819979	2020-11-26 10:34:06 -05:00
Chengming Gui	f283fe2854	kfdtest: update shader code for gfx10.3 kfdtest s_store_* instruction set was retired from gfx10.3 Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Change-Id: Ibe41a3fe7e053fb345b1af6ad4abc22a0885bc81	2020-11-03 22:25:39 -05:00
Yong Zhao	2b70d73f68	kfdtest: Improve the stability of SignalHandling test On gfx1012, allocating 1/4 of the system memory on a 32G RAM machine could fail, causing this test to fail. Limit the maximum buffer to allocate to be smaller than 3G to accommodate this situation. Change-Id: I38b0a0f7da1f0b9ca851e04d2d0a51767858c801 Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>	2020-01-07 17:28:57 -05:00
Yong Zhao	4daa25fceb	kfdtest: Merge the two largest buffer test helper functions This is cleaner. Change-Id: I7740f3e0f93a63b35fefc3cb69712dfad68df552 Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>	2019-11-25 18:45:23 -05:00
Yong Zhao	b6cefa7bda	kfdtest: Split BigBufferStressTest into two smaller tests The previous BigBufferStressTest has too much stuff and takes a long time to run. By separating largest*BufferTest out into other tests, we dramatically reduce the time to run BigBufferStressTest and therefore make reproducing issues much easier. Meanwhile, rename the test to BigSysBufferStressTest to express more information. Change-Id: I5911f113c0bd50627ee6d84bbb4f2972cbed8886 Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>	2019-11-25 18:28:17 -05:00
Yong Zhao	cbe21fa261	kfdtest: Remove the queue submission in BigBufferStressTest In order to accommodate the flaky queue submission under memory shortage scenarios, BigBufferStressTest has become very much a hack, undermining its purpose of testing the basic memory related operations. Therefore, remove the queue submission part. The EvictTest should serve the purpose of testing the memory and queue submission functionalities when memory eviction happens. Change-Id: I3c3603a0e834267eccb72f46efeabe1e053c8fc5 Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>	2019-11-25 18:25:23 -05:00
Philip Yang	2fa7d23a82	kfdtest: use flag NoNUMABind for more test cases If a NUMA system has no available memory on node 0, mbind will fail on node 0, so set flag NoNUMABind=1 to bypass mbind for all test cases which use node 0 and allocate system memory. Change-Id: I7962938ad2bed5a293ca5e6a8500c7f7e15ff453 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2019-11-06 18:33:46 -05:00
Oak Zeng	7593b41575	Unmap memory from GPU before free Change-Id: Ic33b17cbaee5de7908d37527254f4f146e6b71e3 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-11-01 08:58:55 -04:00
Eric Huang	0174377351	kfdtest: add xgmi path for p2p tests When large bar is not available, we can use xgmi to do p2p tests. Change-Id: Ib7b59fb8a4d41f605739a0428973f6b2f1a3450f Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>	2019-10-17 10:21:10 -04:00
Oak Zeng	da789a2584	Fix memory map issue in KFDMemoryTest.CacheInvalidationOnRemoteWrite The memory need to be mapped for both local and remote GPU access Change-Id: I4aeaffc0851b6107fc91e9eaa6150764b06f5ca9 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-10-08 16:33:51 -05:00
Oak Zeng	d7c53bb1fa	Test new RW mtype for gfx908 Change-Id: Ia859c8f2e3c486f119772231a2d887f6783caf36 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-10-04 10:49:15 -04:00
Philip Yang	69d8f2d734	kfdtest: use flag NoNUMABind to allocate system memory Allocating system memory from node id 0 fails on a NUMA system that has no memory on node 0. Change to use the new flag NoNUMABind to allocate system memory from NUMA nodes which have free memory. Change-Id: I8ef9ca28fc2ab5dd31d07a2d3eaf1d5886e798a0 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2019-09-16 12:25:01 -04:00
Felix Kuehling	8f91d6a222	kfdtest: Enable more tests on gfx802 A number of tests are no longer broken on gfx802. Change-Id: If70c77423f8f14de59490ab8ca156b0c4e7b5cf1 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-08-30 19:06:24 -04:00
Eric Huang	cdc10991a9	kfdtest: avoid TTM eviction in KFDMemoryTest.BigBufferStressTest Reserve half of dma32 zone for non-NUMA system. Change-Id: Id7aea7b6ff6cc1cc7983ecd95f8078b7f1be630c Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>	2019-08-07 16:35:29 -04:00
Yong Zhao	5ae5854302	kfdtest: Improve FlatScratchAccess by not hardcoding the value We should use the SE number reflected by NumShaderBanks of the node rather than hardcoding it. Change-Id: I945fb001f81ce506249cf485a7ce25aee8219bc7 Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>	2019-08-01 23:10:28 -04:00
shaoyunl	78e754ca5b	KFDTest: Make shader compatible for gfx9 and gfx10 Remove the CHIP name from the shader ISA and add wave_size(32) so that the same shader can be used for both GFX9 and GFX10 Change-Id: I16ea72f87980c3d9c11298e20c06a0a073fe9a28 Signed-off-by: shaoyunl <shaoyun.liu@amd.com>	2019-07-30 10:56:19 -04:00
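
The probing strategy described in commits e71eb13784 and d3bb1ca4af (start above the reported available memory, expect that attempt to fail, then shrink until an allocation succeeds and compare against a tolerance) might look roughly like the sketch below. This is not the kfdtest implementation: `ProbeResult`, `ProbeAvailableMemory`, the `tryAlloc` callback, and all parameter names are hypothetical, and the real test calls into libhsakmt rather than a callback.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical result record for illustration; not a kfdtest type.
struct ProbeResult {
    bool overAllocationRejected;   // true if the initial oversized attempt failed, as expected
    std::uint64_t allocatedBytes;  // size of the first successful allocation, 0 if none succeeded
    bool withinTolerance;          // true if allocatedBytes is within tolerance of the reported size
};

// tryAlloc stands in for a real allocation call and returns true on success.
// Assumes reported + overshoot does not overflow.
ProbeResult ProbeAvailableMemory(std::uint64_t reported,
                                 std::uint64_t overshoot,
                                 std::uint64_t step,
                                 std::uint64_t tolerance,
                                 const std::function<bool(std::uint64_t)>& tryAlloc)
{
    assert(step > 0);
    ProbeResult result{false, 0, false};

    // First attempt: deliberately larger than the reported availability.
    std::uint64_t size = reported + overshoot;
    result.overAllocationRejected = !tryAlloc(size);

    if (!result.overAllocationRejected) {
        // Availability was under-reported: the oversized request succeeded.
        result.allocatedBytes = size;
    } else {
        // Shrink the request step by step until an allocation succeeds.
        while (size > step) {
            size -= step;
            if (tryAlloc(size)) {
                result.allocatedBytes = size;
                break;
            }
        }
    }

    // Compare the achieved size with the reported size in either direction.
    const std::uint64_t diff = result.allocatedBytes > reported
                                   ? result.allocatedBytes - reported
                                   : reported - result.allocatedBytes;
    result.withinTolerance = result.allocatedBytes != 0 && diff <= tolerance;
    return result;
}
```

In this sketch the caller decides what to do with the result: the earlier commit treats a divergence beyond the tolerance as a test failure, while the later one only logs it.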


70 commits