Commit graph

1189 Commits

Author SHA1 Message Date
Jonathan Kim b0e84183c1 kfdtest: add snapshot operations
Add queue and devices snapshot operations.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I836884c9f3b65dd9e5e444d554d3eb87938e1634
2023-08-09 09:26:29 -04:00
Jonathan Kim 5a675921ea kfdtest: add suspend and resume queues operation
Add base debug operations to suspend and resume queues.
Routine will return the number of queues successfully
suspended or resumed.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I8f18317f70464b04231c5cf822e11d545ebfa02a
2023-08-09 09:26:09 -04:00
Jonathan Kim b77189cf83 kfdtest: add hit trap event test
Check that a jump to trap event can be picked up by the debugger.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Iad5f87092f2b82d5018013bba548979122a9bd02
2023-08-08 16:01:23 -04:00
Jonathan Kim 97fc25bb8d kfdtest: add set exceptions enable base debug operation
Add set exceptions enabled debug operation.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I6ee1769bbbb90a74074d8100974c4bfeabaf7f2c
2023-08-08 16:01:03 -04:00
Jonathan Kim 097ee967d1 kfdtest: add runtime enable and attach test
Add debug attach and runtime enable test for attaching to a spawned and
running process.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I72302ff73494d9dae0c79a299508085d7ca0552b
2023-08-08 16:00:44 -04:00
Jonathan Kim bfb0d15ee8 kfdtest: add query operations
Add polling query debug event operation.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Ic82ce4a393bfb28c9f32e7920f80c12da7f627d5
2023-08-08 16:00:25 -04:00
Jonathan Kim c7129edcb8 kfdtest: add send exception operation
Add send exception operation.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Iecb43004a1a7ffc75e94badf203cd0927ffe0909
2023-08-08 16:00:04 -04:00
Jonathan Kim dd56b38c2f kfdtest: add base debug class and debug attach/detach operation
Add base debug class and attach/detach operations.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I60f3c166646f05838fec208ac2f59bba998c63f8
2023-08-08 15:59:35 -04:00
David Yat Sin 66b66e42cd Keep libdrm device_handle on older libdrm
Even if the version of libdrm is older and does not support the
amdgpu_device_get_fd function, the device_handle stored in
amdgpu_handle[] is still valid and can be returned via
hsaKmtGetAMDGPUDeviceHandle.

Change-Id: I024a3e82e6cfebac5577aefe359b067746c4023e
2023-08-01 10:52:26 -04:00
Jonathan Kim aaab019960 libhsakmt: add debug trap thunk call for testing
Add generic thunk call for debug testing that assumes
caller populates trap arguments correctly.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I33a0bc66ca77e29f5b663d4bfe73f8684df8bfb6
2023-07-26 10:29:27 -04:00
Jonathan Kim 98c6784cc1 kfdtest: remove deprecated debug references
Remove all unused material from KFDDBGTest.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I13ed68656efadef7bbaf8bb737ce5a04829eca9b
2023-07-26 10:29:23 -04:00
Jonathan Kim 8471f80bac libhsakmt: remove old debugger versioning
Current debugger uses KFD version directly.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I212a53560a94dd24c599addce72f59c527c8af25
2023-07-26 09:41:38 -04:00
Philip Yang a395dd7306 kfdtest: KFDSVMEvictTest support large VRAM or small system memory
For xnack off, skip SVM evict tests if memory allocation size is larger
than 15/16 total system memory, because the test may fail to allocate
CWSR svm range to create queue after allocating test memory.

Limit eviction size from total VRAM size to 1/2 total VRAM size,
because for 192GB VRAM, evicting 192GB may take more than 120 seconds
and cause the test to fail with a timeout.

Change-Id: Ib1483b9aab580a8539187b2943cadea0fd5a7c71
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-07-25 11:11:55 -04:00
Kent Russell 1958224379 run_kfdtest.sh: Clarify parameters taking arguments
The --node and --exclude flags take arguments, but usage was
unclear. This led to attempts like --node=1, which will not work
appropriately. Add examples for flags that take parameters, as well as
the requirements for those parameters. Also change --exclude parsing to
match --node parsing, for consistency.

Change-Id: I563ba9b370a24d9a84b9c39093f3cb1a5d723cef
2023-07-21 10:53:31 -04:00
Jonathan Kim 2d3a09cbd6 kfdtest: disable gws tests for gfx11
GFX11 will no longer use GWS for cooperative launch so disable the test.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I8611c8158e1654782150ad10f1f65edb578e6435
2023-07-20 11:22:56 -04:00
Ori Messinger 7cc3ffc115 kfdtest: Fix kfdtest.exclude "ReadOnlyRangeTest" Issue
The purpose of this patch is to fix an issue in kfdtest.exclude's
blacklist for KFDSVMRangeTest.ReadOnlyRangeTest.

Excluding "KFDSVMRangeTest.ReadOnlyRangeTest" without adding a "*"
to the end causes the test to still run, since after a recent patch
the test actually runs these two variants instead:
   -"KFDSVMRangeTest.ReadOnlyRangeTest/0"
   -"KFDSVMRangeTest.ReadOnlyRangeTest/1"
(For XNACK OFF/ON)

Now, the test is excluded as "KFDSVMRangeTest.ReadOnlyRangeTest*"
to cover those two XNACK ON/OFF variants.

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I067c4c99fe839ce6cec5d134bd605e8cb41b8291
2023-06-29 23:14:30 -04:00
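The matching behavior described in the commit above can be sketched as follows. This is a minimal illustration, assuming the exclusion pattern is matched against the whole test name with glob-style wildcards; Python's `fnmatchcase` stands in for the real filter matching, and the helper `excluded` is hypothetical.

```python
from fnmatch import fnmatchcase

# Test names as listed in the commit message; the /0 and /1 suffixes
# are the XNACK OFF/ON variants of the parameterized test.
variants = [
    "KFDSVMRangeTest.ReadOnlyRangeTest/0",
    "KFDSVMRangeTest.ReadOnlyRangeTest/1",
]


def excluded(test_name, pattern):
    """Hypothetical helper: True if the pattern matches the full test name."""
    return fnmatchcase(test_name, pattern)


old_pattern = "KFDSVMRangeTest.ReadOnlyRangeTest"    # no trailing wildcard
new_pattern = "KFDSVMRangeTest.ReadOnlyRangeTest*"   # covers both variants

old_hits = [excluded(v, old_pattern) for v in variants]
new_hits = [excluded(v, new_pattern) for v in variants]
```

Under this assumption the old pattern matches neither variant, so both still run, while the pattern with the trailing `*` excludes both.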
Jeremy Newton 473a66d115 Don't install asan license if disabled
Change-Id: I8bffe5ec8496ff11e6d66995dd470cddb13f3c0d
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
2023-06-29 09:34:49 -04:00
Alex Sierra 0cbf26c148 src: add debug API to support GPU core dump
Add API functions to extract the following information from KFD:
runtime information, device info and queues snapshot.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: If995ecc54497ab61189bb0f209c64af0bbb0f56f
2023-06-26 18:58:15 +00:00
Alex Sierra 5e0a32d7b3 add hsaKmtGetRuntimeCapabilities API
Queries for runtime capabilities after it has been enabled.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I098c0e9862c0c1d5e304b111cdc281c0ccd09691
2023-06-26 18:58:15 +00:00
Ori Messinger 3447b795df kfdtest: Fix gfx_target_version Parsing Issue
The purpose of this patch is to fix an issue in the run_kfdtest.sh
script's gfx_target_version parsing.

When the character length of the "gfx_target_version" value is
equal to 5 instead of 6, it will now be zero padded on the left to
allow each Major/Minor/Stepping value to be parsed correctly.

Also, kfdtest.exclude file now replaces the default filter for
aqua_vanjaram with the following 3 gfx filters:
gfx940, gfx941, & gfx942

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I1f0264d3705803f24ad3c458e6bd367fbbec62be
2023-06-23 13:18:05 -04:00
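The zero-padding fix described in the commit above can be sketched as follows. This is not the script's actual code: it assumes the value is a decimal string with two digits each for major, minor and stepping, and that the gfx name prints minor and stepping in hexadecimal. The function names and the sample inputs are illustrative.

```python
def parse_gfx_target_version(raw):
    """Split a gfx_target_version string into (major, minor, stepping).

    Assumes two decimal digits per field; a 5-character value is
    zero-padded on the left so the fields line up.
    """
    padded = raw.strip().zfill(6)            # "90402" -> "090402"
    if len(padded) != 6 or not padded.isdigit():
        raise ValueError(f"unexpected gfx_target_version: {raw!r}")
    return int(padded[0:2]), int(padded[2:4]), int(padded[4:6])


def gfx_name(raw):
    """Format the parsed version as a gfx target name (assumed convention)."""
    major, minor, stepping = parse_gfx_target_version(raw)
    return f"gfx{major}{minor:x}{stepping:x}"
```

Without the padding, slicing the 5-character string "90402" in two-digit fields would read the major version as 90 instead of 9, which is the kind of misparse the commit describes.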
Kent Russell 9a22bade89 kfdtest.exclude: Blacklist CuMaskingEven on all ASICs
This has slowly become less and less reliable on more and more ASICs,
so just blacklist it altogether. Using wall clock for performance
is not a reliable method for testing performance, so skip it to avoid
more failure reports on various systems.

Change-Id: I1a5744604e4620bc7675a629d146ba4ffba669d2
2023-06-15 11:24:04 -04:00
Ruili Ji 9bf1cbe4ed kfdtest: Update COMPUTE_PGM_RSRC1 for software trap
If ASICs within the GFX11 domain don't need software traps,
testing with COMPUTE_PGM_RSRC1.PRIV = 1 will make the system hang.

Change-Id: I00cf8eb6d6b07856885c77bd343ca3c41cc3cad5
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
2023-06-14 07:46:51 -04:00
Philip Yang 29b04c2534 kfdtest: Fix KFDSVMEvictTest.QueueTest OOM
Fix a typo in calculating bufferSize from vramBufSizeInPages. The OOM shows up
only with HSA_XNACK=1 because HSA_XNACK=0 doesn't support VRAM
oversubscription. We changed to run SVM tests with both XNACK off and
on.

Change-Id: I3949959288fd92f4e7f4a87115a5f1547e225042
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-06-13 21:15:31 -04:00
James Zhu 4ba8f1fe77 kfdtest: Add test for event wait with event age tracking enable
Add 5 different test scenarios to cover new event age tracking features.

Change-Id: Icab43240fd127208b18abbd7542d6444127ef0c7
Signed-off-by: James Zhu <James.Zhu@amd.com>
2023-06-10 11:41:50 -04:00
James Zhu a0cbf90b90 libhsakmt: add event age tracking
Keep the last signaled event age to avoid race conditions
for HSA_EVENTTYPE_SIGNAL when the event age init value is non-zero.

Change-Id: Ifb9a11a6868e5762a9f92f579e45a0a2c8fa1017
Signed-off-by: James Zhu <James.Zhu@amd.com>
2023-06-10 11:41:50 -04:00
Ori Messinger 4675492852 kfdtest: Fix minor typo
The purpose of this patch is to fix a minor typo in KFDSVMRangeTest.
Before:
"Skipping test: no enough system memory."
After:
"Skipping test: Not enough system memory."

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I247cb558a177a1d25c393bf16c7386f4d79d0fba
2023-06-08 15:58:25 -04:00
Graham Sider d1a095123d kfdtest: Update GFX11 blacklist
KFDQMTest.MultipleCpQueuesStressDispatch is fixed as of MES SCHQ version
0x3c ().

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I437f3eb5f12dc159339a9b7c7cff2e2b8214ad7c
2023-06-08 14:11:30 -04:00
Sreekant Somasekharan 1428a7538e kfdtest: RoundToPowerOf2 function modified for compiler compliant bit shift values
The behavior of a bit shift is undefined if the right operand is negative,
or greater than or equal to the width of the promoted left operand.
For release builds with address sanitizer enabled, the resulting compiler
optimization leads to an unsupported queue size value, since the
current method shifts by up to 128 bits on a 64-bit value.

Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com>
Change-Id: Iafdc82d0dfb7f79e3012fb7bb70eda80e4b7a7a6
2023-06-05 18:14:58 -04:00
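A rounding routine that keeps every shift amount below the operand width can be sketched as follows. This is an illustration only: it assumes the function rounds up to the next power of two, and the name `round_up_to_power_of_2` is hypothetical. Python integers have no undefined shifts, so the 64-bit mask and the loop bound only mirror what a C or C++ version would need.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit unsigned value


def round_up_to_power_of_2(val):
    """Round val up to the next power of two (assumed semantics).

    Only shifts by 1, 2, 4, 8, 16 and 32 are used, so the shift amount
    never reaches the 64-bit width of the operand.
    """
    if val <= 1:
        return 1
    val = (val - 1) & MASK64
    shift = 1
    while shift < 64:           # stops before a shift by 64 would occur
        val |= val >> shift     # smear the highest set bit downwards
        shift <<= 1
    return (val + 1) & MASK64
```

The loop condition `shift < 64` is the point of the sketch: a bound of `<= 64` or higher would request a shift equal to or wider than the operand, which is the undefined case the commit describes.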
Alex Sierra 728162c2c8 libhsakmt: include changes for upstream debugger API
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Id296e13dff431c7a151c5aae0b93412b1e116467
2023-06-02 15:56:19 -04:00
Kent Russell 718d95de77 fmm.c: Fix possibly uninitialized variable usage
If we end up in the first if clause, aperture_base is not set, unlike
the other 2 clauses. Initialize it to NULL at declaration time, and only
change its value in the final else clause, where we set it to
aperture->base

Change-Id: I2bf44dc93cae8a03e66f41cedd85d57be2115bba
Signed-off-by: Kent Russell <kent.russell@amd.com>
2023-05-30 15:46:14 -04:00
Xiaogang Chen f6183f937e libhsakmt: allow gpu nodeid array to be null and number of gpus to be zero.
Allow the hsaKmtRegisterGraphicsHandleToNodes parameters NodeArray to be null
and NumberOfNodes to be zero at the same time. This is the case where we want the
imported buffer not to be registered by kfd. Set gpu_id_array = NULL explicitly to
avoid freeing an uninitialized gpuid array.

Report: Yat Sin, David<David.YatSin@amd.com>
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3babc1160c9573e38dd11d81965c8de2b70cae2e
2023-05-29 00:15:14 -04:00
Xiaogang Chen 7e4e57ae5f libhsakmt: have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu.
Have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu to keep consistency.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ifabb72301e1d5a6c1310973bb1321714e12a1fa6
2023-05-29 00:15:14 -04:00
Xiaogang Chen ac1db60fc2 libhsakmt: query/use render node fds that libdrm uses.
Query render node fds that libdrm uses for current process and
use them at Thunk if available.

v2: avoid naming conflict with amdgpu_device_get_fd from amdgpu.h

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Id7288c03730f4a4c9c3644e37ca4725fec71a471
2023-05-29 00:15:14 -04:00
Xiaogang Chen 9bebb276be libhsakmt: add NodeId at HsaGraphicsResourceInfo.
Return GPU NodeId that exported the DMA buffer from amdgpu graphic driver
at fmm_register_graphics_handle.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Iaeccce6e6d0b7e27f10b15ed89d1b5310d03d44b
2023-05-29 00:15:14 -04:00
Xiaogang Chen 989c6c617c libhsakmt: add DMABuf import without address allocation.
When gpu map info is not provided, import DMABuf without a VA assigned.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I996ab4eb46977af5064126529c28a8bf20a67292
2023-05-29 00:15:14 -04:00
Xiaogang Chen d2a37894bb libhsakmt: support allocating a fixed address at mmap_aperture.
When HsaMemFlags.ui32.FixedAddress=1, allocate a fixed address at mmap_aperture.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I1f3b532ec3c1a4fb0962126a0bd56441abaf6a9c
2023-05-29 00:15:14 -04:00
Xiaogang Chen 11ac57d293 libhsakmt: update HsaPointerInfo for address-only allocated VRAM.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ib88b34dff772997d2b2e5f3c7e333cef3092ef56
2023-05-29 00:15:14 -04:00
Xiaogang Chen 108c0e5f92 kfdtest: add kfdtest cases for VA-only, VRAM-only allocated VRAM.
Alloc vram by kfd, then map by GEM api to GPU VM and map to CPU VM.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ib5b2f35662cd5473f622f6ffc9b62925fe57ae42
2023-05-29 00:15:14 -04:00
Xiaogang Chen 0138487aa4 libhsakmt: support vram-only and VA-only alloc/free.
Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com>
Change-Id: I47cf53642d2ea197c08b20e84d7cae04b2d431e0
2023-05-29 00:15:14 -04:00
Xiaogang Chen 0a2989083b libhsakmt: add/init a new manageable_aperture_t from NON_CANONICAL space.
This new manageable_aperture_t is used for VRAM allocation-only and
VA allocation-only.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3866ef9d35386d6aef7b6934ac8d4a89ef843b50
2023-05-29 00:15:14 -04:00
Xiaogang Chen cc4fb2d1a9 libhsakmt: Revert "libhsakmt: Update FD creation logic"
This reverts commit fd48f14ceb.
Current amdgpu exposes one render node for one gpu node/partition,
revert to previous way to open render node at Thunk.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I436be74f8e872a7ab5c4a1420b4ea884f5a00e57
2023-05-29 00:15:14 -04:00
Kent Russell 478a68d49c kfdtest: Test XNACK on and off for SVM tests
Add parameterization for KFDSVM tests so that we test with both XNACK
enabled and XNACK disabled. This will be overridden by HSA_XNACK, if set

Change-Id: Ie96eb61c03115f947e08cfa076ac459f7440f5d8
2023-05-25 12:08:16 -04:00
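The parameterization with an environment override described in the commit above can be sketched as follows. This is a hypothetical illustration, not the test's code: the function `xnack_modes` is invented here, and it assumes HSA_XNACK carries the value 0 or 1.

```python
def xnack_modes(env):
    """Return the XNACK settings a parameterized SVM test would run with.

    env is a mapping of environment variables, for example os.environ.
    """
    override = env.get("HSA_XNACK")
    if override is not None:
        return [int(override)]   # HSA_XNACK=0 or HSA_XNACK=1 pins the mode
    return [0, 1]                # default: run once with XNACK off, once on
```

Passing the environment in as a mapping keeps the sketch easy to exercise without touching the real process environment.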
Philip Yang 5df82e3d14 kfdtest: Enable KFDEvictTest and KFDSVMEvictTest on aqua_vanjaram
For aqua_vanjaram APU mode, KFDEvictTest and KFDSVMEvictTest are
skipped. Those tests passed on dGPU mode with memory reporting partition
support on GFX 9.4.3.

Change-Id: I56357843c6743b01b807359dbb37b32391fd9a25
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-05-17 17:17:46 -04:00
Bing Ma 1e6d728730 libhsakmt: Add support functions for ASAN
Add support functions to remap the first page of device memory (GPU/GTT)
to share host ASAN logic.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: I4c27d5417ba80a172dccb0a079a597c5dc1c8f85
2023-05-17 13:38:19 -04:00
Kent Russell d966243783 kfdtest: Add include directory for ROCr merge
When we merge thunk into ROCr, kfdtest will be in a different folder
structure. Add the new location to ensure that we can build now and in
the future with no disruptions

Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I6517e061cb0da7137d903abbc380bfc7126f40d4
2023-05-15 10:13:49 -04:00
Yifan Zhang d319660838 kfdtest: Using non-paged memory allocation for wptr on devices that have MES scheduler
Starting with GFX11, wptr BOs must be mapped to GART for MES to determine work
on unmapped queues for usermode queue oversubscription (no aggregated doorbell)

Change-Id: I10e30fdc2bec587cef9427faa4874957988c34b3
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
2023-05-12 01:06:37 -04:00
Yifan Zhang 53ed978c3d kfdtest: add non paged wptr judging API.
If MES is enabled, wptr has to be non-paged memory.
Add an API to check this condition.

Change-Id: I53af1f6687d5332d102e7062c3d760e33b96e722
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
2023-05-12 01:06:37 -04:00
Ranjith Ramakrishnan b487f87363 Set the default value of ROCM_HEADER_WRAPPER_WERROR to OFF
Using wrapper header files will result in a #warning message by default.

Change-Id: I8301e433d39f3e5d39384ede6f0e4464d0eb20a6
2023-05-10 12:36:00 -04:00
Shane Xiao 5d6f900353 kfdtest: DeviceHdpFlush need set target ASIC with different Gfx versions
If Dev0 and Dev1 are not the same gfx, we should temporarily
set the target ASIC for compiling Shader code.

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Signed-off-by: Shikai Guo <shikai.guo@amd.com>
Change-Id: I5836beb16ade519f5a148d3d2b9c2875554f0c35
2023-05-09 09:50:07 -04:00
Graham Sider 54136f60a0 kfdtest: Add Assembler::RunAssembleBuf overload
Overload Assembler::RunAssembleBuf to take in an extra Gfxv parameter.
Using this overload will temporarily set the target ASIC to Gfxv before
calling RunAssemble, and copy back the original MCPU literal upon
completion. The copy to reset the original MCPU in this case is safe as
the MCPU length is always known.

This will be useful in multi-device test cases whereby the devices are
not necessarily the same gfx version. The overload is explicitly for the
RunAssembleBuf wrapper rather than RunAssemble to ensure the default
MCPU is always reset independent of errors in RunAssemble.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I7fe5a962876314b6df32e4b7160174949d98f9e3
2023-05-08 11:35:32 -04:00
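The save-and-restore behavior described in the commit above can be sketched as follows. This is an illustration only: the `Assembler` class here is a hypothetical stand-in, its method names and return value are invented, and a Python context manager replaces the copy-back of the MCPU literal that the commit describes.

```python
from contextlib import contextmanager


class Assembler:
    """Hypothetical stand-in for the kfdtest assembler wrapper."""

    def __init__(self, mcpu):
        self.mcpu = mcpu  # default target, e.g. the first device's gfx version

    def run_assemble(self, source):
        # Placeholder for the real assembly step; fails on empty input.
        if not source:
            raise ValueError("empty shader source")
        return f"{self.mcpu}:{source}"

    @contextmanager
    def _target(self, gfxv):
        saved = self.mcpu
        self.mcpu = gfxv
        try:
            yield
        finally:
            self.mcpu = saved  # restored even if assembly raises

    def run_assemble_buf(self, source, gfxv=None):
        """Assemble for gfxv if given, otherwise for the default target."""
        if gfxv is None:
            return self.run_assemble(source)
        with self._target(gfxv):
            return self.run_assemble(source)
```

Putting the restore in the wrapper rather than in the assembly step matches the reasoning in the commit message: the default target is reset regardless of errors in the inner call.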