KFDDBGTest and KFDNegative test can eat into the memory and event
resources of subsequent test iterations if they are not deallocated.
Change-Id: Iea170c20df8d487703441181b6c152b61f02d3db
Queue 2's wave blocks queue 1's wave save, which causes the unmap
queue preemption to fail. Add a nop, as suggested by SQ.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Change-Id: Iea7f280e35487059c4499ea999b9e0cdf841d1e1
The current test has 4 processes, and each process allocates and
accesses 512 buffers, so the test needs 2048 waves to access 2048
buffers at the same time in order to finish. In CPX compute partition
mode, each compute node has fewer waves, which causes random test
failures. Change the test to 2 processes that use 1024 waves to access
1024 buffers with an increased buffer size.
Add a waves_num check to avoid test failures on new ASICs or
simulators: skip the test if fewer than 1024 waves are available.
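The skip condition described above can be sketched roughly as follows.
RequiredWaves and ShouldSkipForWaves are hypothetical helpers written
for illustration, not functions from kfdtest; the sketch assumes one
wave per buffer, as the description implies.

```cpp
#include <cstdint>

// Hypothetical helper: waves needed when every buffer is accessed by its
// own wave at the same time (one wave per buffer, as described above).
constexpr uint32_t RequiredWaves(uint32_t processes, uint32_t buffersPerProcess) {
    return processes * buffersPerProcess;
}

// Hypothetical helper: true when the node cannot run all waves
// concurrently, in which case the test is skipped instead of failing.
constexpr bool ShouldSkipForWaves(uint32_t availableWaves, uint32_t requiredWaves) {
    return availableWaves < requiredWaves;
}

static_assert(RequiredWaves(4, 512) == 2048, "old layout: 4 processes");
static_assert(RequiredWaves(2, 512) == 1024, "new layout: 2 processes");
```

A node that reports exactly 1024 available waves still runs the test;
only a smaller count triggers the skip.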
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I64b5f9172b62cf38f62fbb0b48a801b8a11401c0
If you build thunk following the instructions in the thunk's README,
there is no /lib folder in the build folder. Adjust the include path,
and clean up the docs to reflect that. The header include is already
defined in the CMake file as ../../include, so we don't use
LIBHSAKMT_PATH for the headers, just for the lib location.
Change-Id: I73435d59adb9d01f527a28b1935086260e9d3d70
Signed-off-by: Kent Russell <kent.russell@amd.com>
Currently, KFDPerformanceTest.P2PBandWidthTest cannot work if there are
more than 16 KFD nodes in the system. This limit was put in to match the
number of SDMA queues supported on a single node.
This patch updates the test to make it run on systems with more than
16 KFD nodes.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I561d0cdef664cae84fb9c13a801052e2001256e5
In a VM with 6 vCPUs, work queued with
queue_delayed_work(system_freezable_wq) is scheduled less promptly than
on bare metal (BM). The HSA_SMI_EVENT_QUEUE_RESTORE event from case
HMMProfilingEvent/0 was delayed and caused HMMProfilingEvent/1 to fail.
The fix is to listen only for the HSA_SMI_EVENT_MIGRATE_START event and
ignore all other events.
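A rough sketch of the "listen to one event only" idea follows. The
event indices and the mask layout below are stand-ins chosen for
illustration and are assumptions, not values copied from the thunk or
KFD headers; EventMaskFromIndex and IsListenedEvent are hypothetical
names.

```cpp
#include <cstdint>

// Stand-in event indices for illustration only; the real values come
// from the thunk/KFD headers and are NOT reproduced here.
enum SketchSmiEvent : uint32_t {
    SKETCH_EVENT_MIGRATE_START = 5,
    SKETCH_EVENT_QUEUE_RESTORE = 10,
};

// Assumed layout: one bit per event index, index 1 mapping to bit 0.
constexpr uint64_t EventMaskFromIndex(uint32_t index) {
    return 1ULL << (index - 1);
}

// Subscribe to a single event instead of every event.
constexpr uint64_t kListenMask = EventMaskFromIndex(SKETCH_EVENT_MIGRATE_START);

// True when an incoming event is one the test asked to receive.
constexpr bool IsListenedEvent(uint32_t index) {
    return (kListenMask & EventMaskFromIndex(index)) != 0;
}
```

With a single-bit mask, a late queue restore event left over from a
previous case is never delivered to the next case.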
Change-Id: I534e49b030bd4c534bc7a63eb431f4907659c8cd
We had skipped doing it for PAGE_SIZE, but it should be left as the
regular PAGE_SHIFT name, especially for users who are using different
headers. We want PAGE_SHIFT and PAGE_SIZE to be consistent with one
another, so set them both explicitly to the same value if either
of them is undefined.
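One way to keep the two macros consistent is sketched below. The
4 KiB page size (shift of 12) is an assumption made for the example,
not a value stated in this change.

```cpp
// Assumption: 4 KiB pages (shift of 12). If either macro is missing,
// redefine both so they can never disagree with one another.
#if !defined(PAGE_SHIFT) || !defined(PAGE_SIZE)
#undef PAGE_SHIFT
#undef PAGE_SIZE
#define PAGE_SHIFT 12
#define PAGE_SIZE (1UL << PAGE_SHIFT)
#endif

static_assert(PAGE_SIZE == (1UL << PAGE_SHIFT),
              "PAGE_SIZE must match PAGE_SHIFT");
```

Deriving PAGE_SIZE from PAGE_SHIFT, rather than defining each one
independently, means a later change to one value cannot leave the other
stale.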
Change-Id: I121d81c48409dd77351b59a192d824e2419a2410
Signed-off-by: Kent Russell <kent.russell@amd.com>
To support fully-static library ROCm builds, ensure that all global
symbols are prefixed with something meaningful to avoid collisions with
other libraries.
A script was made using "objdump -C -t" to get a list of symbols,
then checking if the global symbols have a meaningful prefix (for thunk:
hsakmt or kmt in various cases).
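The prefix check itself can be sketched as below. HasThunkPrefix is a
hypothetical helper, not the script mentioned above, and matching the
prefix case-insensitively is an assumption about what "in various
cases" means.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lowercase copy so differently cased spellings of a prefix compare equal.
static std::string ToLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical check: does a global symbol name start with an accepted
// prefix ("hsakmt" or "kmt", compared case-insensitively)?
bool HasThunkPrefix(const std::string& symbol) {
    const std::string lower = ToLower(symbol);
    return lower.rfind("hsakmt", 0) == 0 || lower.rfind("kmt", 0) == 0;
}
```

Any global symbol for which the check returns false is a candidate for
renaming.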
Change-Id: Ifd353f64a3344eb60d1f6c4e041aa20967b38a59
Signed-off-by: Kent Russell <kent.russell@amd.com>
Fix some places where the ISA buffers are not declared as executable.
Previous Thunk code blindly set the exec bit on all memory allocations,
so this issue was masked.
Change-Id: Ic7a1169c69fb85ff9e8ea7bcc49a1845b37c08ff
The function can return NULL if it fails to create the backend, so check
for NULL before using it.
Change-Id: I4d6501bffd6dd0fc0d0f2224720f7d6dca1646f3
Signed-off-by: Kent Russell <kent.russell@amd.com>
This reverts commit 9f0f7741de.
For APUs, PCIe atomics are supported by default. However, the PCIe
atomic feature needs to be checked for dGPUs. The kfd driver already
sets PCIe atomic support for APUs, so this patch can be reverted.
Change-Id: I131d5b8e095c1104e1695e7cf8b1ed178bccddde
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
On Fedora, rocm-smi is a standard package and is installed to /usr/bin.
So when run_kfdtest.sh is run, this error is produced:
find: ‘/opt/rocm*’: No such file or directory
First, redirect stderr to /dev/null on the original search.
Then fall back to either looking for rocm-smi in BIN_DIR or
looking for it in the PATH.
Change-Id: I389ed0b9a4a4507263c9eb19894b25326c9a4222
Signed-off-by: Tom Rix <Tom.Rix@amd.com>
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Using "PROGRAMS" and "FILES" without specifying permissions will
automatically select the right permissions.
PROGRAMS is used for executables; FILES is used for data files.
Change-Id: I0fb6eff257a8f936848bd648cf877da6dc0b6906
Fix register COMPUTE_PGM_RSRC2 in Dispatch code.
Bit 6 (called TRAP_PRESENT on pre-GFX12) should not be set on GFX12
as it has a different meaning (DYNAMIC_VGPR).
Minor instruction changes for CopyOnSignalIsa and WriteAndSignalIsa
shaders.
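The COMPUTE_PGM_RSRC2 part of this change can be sketched as below.
ApplyTrapPresent is a hypothetical helper, not the Dispatch code
itself, and treating every major version of 12 or higher like GFX12 is
an assumption.

```cpp
#include <cstdint>

// Bit 6 of COMPUTE_PGM_RSRC2: TRAP_PRESENT before GFX12,
// DYNAMIC_VGPR on GFX12.
constexpr uint32_t kRsrc2Bit6 = 1u << 6;

// Hypothetical helper: set bit 6 only where it still means TRAP_PRESENT;
// for major version 12 and up (assumed) the bit is left clear.
constexpr uint32_t ApplyTrapPresent(uint32_t rsrc2, uint32_t gfxMajor) {
    return gfxMajor < 12 ? (rsrc2 | kRsrc2Bit6) : (rsrc2 & ~kRsrc2Bit6);
}
```

Clearing the bit on GFX12 rather than merely not setting it guards
against a caller passing in a value that already has bit 6 set.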
Change-Id: Ib4e75e3c92f220210bc45778738d81b91efb9d5e
Signed-off-by: David Belanger <david.belanger@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
A function call was refactored out of CommandLine.h, so add the header
that now provides it.
Change-Id: If5594e3abc2fdfdd59f108c4379802cedab127ee
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
The RDMATest.ContiguousVRAMAllocation test uses a 4GB buffer. Skip the
test if the total VRAM size is less than 5GB, to account for page
tables and other reserved VRAM usage.
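The threshold above can be expressed as a small check. ShouldSkipForVram
and the constant names are hypothetical and only illustrate the 4GB
buffer plus headroom reasoning.

```cpp
#include <cstdint>

constexpr uint64_t kGiB = 1ULL << 30;
constexpr uint64_t kBufferSize = 4 * kGiB;      // buffer used by the test
constexpr uint64_t kMinVramForTest = 5 * kGiB;  // buffer plus headroom

// Hypothetical helper: skip when total VRAM leaves no room for page
// tables and other reserved allocations next to the 4GB buffer.
constexpr bool ShouldSkipForVram(uint64_t totalVramBytes) {
    return totalVramBytes < kMinVramForTest;
}

static_assert(kMinVramForTest > kBufferSize, "threshold must exceed buffer");
```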
Change-Id: I0342417501cdd3477c2bf1b2f7d1e6bef61d1871
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Minor instruction changes for GFX12.
Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Iab2c430bb5d7d8fa2b166d07fd33ea15aca3a5cd
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Minor instruction changes for GFX12.
Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I57cca6393d4b4aae869a2bc9862d75eef1f29ed7
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Minor instruction changes for GFX12.
Change-Id: I78a37fa37950b378cdd2a1618c71c97c6ba66aac
Signed-off-by: David Belanger <david.belanger@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Update amdp2ptest.h to sync with the same file in the rdma test driver
folder.
Add ContiguousVRAMAllocation to verify that rdma get pages returns
contiguous VRAM pages. Skip RDMA getpages if amdp2ptest.ko is not
loaded.
Map the rdma buffer with the MAP_SHARED flag, because MAP_PRIVATE goes
down the COW path, which requires mmapping the entire vma and cannot
support multiple sg nents.
Change-Id: I5fbb1902251f1454616d4404a4b048a88996d4f7
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
The mmap system call parameters vma->vm_start and vma->vm_end are the
start and end of the mmap virtual address range; vma->vm_pgoff is the
rdma buffer GPU address, which is used to find the sg_table
dma_address.
Handle the case of multiple sg table nents, because sg->length is
limited to a maximum of 2GB.
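Why multiple entries matter can be shown with a userspace model. This
is not kernel code and not the driver's implementation: Segment and
OffsetToDma are hypothetical stand-ins for a scatter-gather entry and
for the walk over several entries.

```cpp
#include <cstdint>
#include <vector>

// Userspace stand-in for one scatter-gather entry (not the kernel struct).
struct Segment {
    uint64_t dmaAddress;  // start of the segment in DMA address space
    uint64_t length;      // segment size in bytes
};

// Walk the segments in order and translate a byte offset into the buffer
// to a DMA address. Returns false if the offset is past the last segment.
bool OffsetToDma(const std::vector<Segment>& segments, uint64_t offset,
                 uint64_t* dma) {
    for (const Segment& seg : segments) {
        if (offset < seg.length) {
            *dma = seg.dmaAddress + offset;
            return true;
        }
        offset -= seg.length;  // skip this segment and keep walking
    }
    return false;
}
```

Code that only ever looks at the first entry would return a wrong
address for any offset beyond that entry's length, which is what a
buffer larger than the per-entry limit runs into.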
Change-Id: I677dd6662ee58f0b5c93f8eef32b7009e1e890d8
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Since amdgpu driver commit 1f4ac94b59aebebf
https://lore.kernel.org/all/a121a72c-b441-4f42-94a3-4597b7f19e7d@amd.com/T/
both GTT and VRAM are available for compute.
So the vramSize obtained by function GetSysMemSize is actually about
50% of system memory. But small APUs don't have much system memory, and
the kernel memory limit is smaller for them. Therefore, registering the
SVM range for SysBuffer and SysBuffer2 fails.
Example:
System Memory size: 3373M Kernel memory limit:1791M
VRAM Memory Size: 256M GTT Memory Size: 1686M
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Change-Id: Ib3826933100ab7b432cb476caaf2d91cc9cdb948
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Update CMakeLists.txt to use Thunk pkgconfig.
Add an rdma contiguous memory allocation test to verify that KFD rdma
get pages pins the buffer on contiguous VRAM pages.
Change-Id: I7cc617fc083ce1998c214c327c130f033ce41d6f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Update the Makefile for newer kernel versions and support building with
the dkms amdgpu driver. Use symbol_request to get the KFD peerdirect
interface.
Sync up with the KFD peerdirect interface changes and remove the free
callback, which is no longer used.
Change-Id: I01d8906d9ffa427a058a26e88e36f6b80e9e22c2
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
It's unnecessary to register non-userptr.
Change-Id: Iefd329578365e036e2fe7e4d5c9c0c3d0976f67c
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Small APUs now use the same memory allocation approach as APP APUs, so
skip these tests on them as well.
Change-Id: I13c953cc53da071f6f36af0d4a0153a48ea066fe
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Add test cases that are excluded on GFX11 to the GFX12 list if they are
also unstable on GFX12.
Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Ifeab24f8ea94085250ea86128a3e401479bdb53d
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Minor instruction changes.
Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I40d6aaffd78cf27f7c3b436cea5403d39b5b88ec
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Minor changes to instructions for GFX12.
Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Iac5be900e3755099d83010fb1a2066b4dbb52dda
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Updated ShaderStore shader (used by CWSR test) for GFX12.
The workgroup ID is now passed in a different register.
Minor changes for new scope syntax.
Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I6fdabc8b62cba201d7777a736d3d43cfae28ca4c
Signed-off-by: Chris Freehill <cfreehil@amd.com>
New watchpoint exception status bits have been assigned to the 4 least
significant bits, so change the test verification mask to check against
the first watchpoint ID accordingly.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: If83950207ea9f66cd230c23e7386a97b3893c2eb
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Fix traphandler for KFD debugger testing.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Ib8f5aac3d1b99e4463ac56b5f6d5dee2c367c447
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Initial template for GFX12.0.1.
Change-Id: I5d2be1f594bf057c04f6feee75a80c61a9d7e4a8
Signed-off-by: David Belanger <david.belanger@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Initial template for GFX12.
Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I552374bfcc0dd6272d170df85d36d0dbca0196d5
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Skip test when PC Sampling is not supported by ASIC.
Change-Id: I6f9be0bdaed66e51052723b6df6908079470cefb
Signed-off-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
We need a ':' to end each subtest, except for the last entry.
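The separator rule can be illustrated with a small join helper.
JoinFilter is hypothetical and only demonstrates the rule; it is not
how the test lists in this repository are built.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Join subtest patterns with ':' between entries and nothing after the
// last one, so no entry runs into its neighbor.
std::string JoinFilter(const std::vector<std::string>& patterns) {
    std::string out;
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        out += patterns[i];
        if (i + 1 < patterns.size()) {
            out += ':';
        }
    }
    return out;
}
```

Without the separator, two adjacent names concatenate into a single
pattern that matches neither test.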
Change-Id: I9515d90703c9679e06a4acd124883540c1d5b832
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>