Commit graph

1206 commits

Author SHA1 Message Date
James Zhu 277d5e27ff libhsakmt: remove iommu_block which supports IOMMUv2 performance
IOMMUv2 is removed from AMDGPU/KFD.

Change-Id: I9fcf20ae9288cb40bb4b696284fc70534fb6484b
Signed-off-by: James Zhu <James.Zhu@amd.com>
2023-09-30 08:54:10 -04:00
James Zhu 274b5b51ca libhsakmt: remove IOMMUv2 performance monitor support
IOMMUv2 is removed from AMDGPU/KFD.

Change-Id: Ib87f501c07d9de90e6b83b98f98daacd5913e98a
Signed-off-by: James Zhu <James.Zhu@amd.com>
2023-09-30 08:54:10 -04:00
David Yat Sin 8e06dce573 Add extended coherence memory flag
Add support for new flag for memory allocation that will provide
system-scope coherent atomics

Change-Id: I426d66223e8d2b570f69b4c0e61145ce9b2290d2
2023-09-22 11:03:00 -04:00
Jonathan Kim 986e82d677 kfdtest: temporarily exclude address watch testing
The debug address watch test will hang when running with the
entire KFD test.

Disable it for now.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I1d0479fa2717d2f398cc32e0605ca6dcc17ebcd5
2023-09-14 09:07:20 -04:00
Jonathan Kim fcec22716a Use camel case for KFDDBGTest shaders
Debug test shaders should use camel case and the *Isa suffix to match the
naming convention of other test shaders.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I64e14183ba1c7c9664b13a742a0e5683866e8223
2023-09-12 15:38:12 -04:00
Ori Messinger 5f117f7608 kfdtest: Fix String NULL Check
MCPU is a const char * that is never NULL, so testing the pointer cannot detect an empty string; check the value it points to instead.

Before: if (!MCPU) {
After: if (!*MCPU) {

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I414e091ca764095937311648c534351d6abf30e6
2023-09-08 16:36:01 -04:00
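The difference between the two checks in the commit above can be sketched as follows. This is a minimal illustration, not the kfdtest code: `is_blank` is a hypothetical helper, and the argument stands in for `MCPU`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when the string is missing or empty.
 * `!s` only tests the pointer itself; `!*s` tests the first character,
 * which is '\0' for an empty string. */
static bool is_blank(const char *s)
{
    return s == NULL || *s == '\0';
}
```

For a pointer that is known to be non-NULL, `is_blank(s)` reduces to `!*s`, which is the form the commit switches to.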
Jonathan Kim 6ec529fe68 kfdtest: temporarily exclude debug suspend queues test
For some reason, non-Ubuntu builds have some sort of memory
corruption when running this test, which affects subsequently run
tests.  Disable it for now.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I5f54ee4c63286a33c6948bc818aa1501c4a6751e
2023-09-08 12:12:13 -04:00
Jonathan Kim f9e20c8a93 kfdtest: replace 0 initialized dbg structs with memset
Use memset in the debug tests to avoid padding issues with zero
initializers and compile errors in ASAN builds.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I0a5aca5b7b631083599573b47f1ae87d5d0d5d71
2023-08-29 11:25:56 -04:00
Lang Yu 65ca3317f2 kfdtest: add blacklist for gfx1150 and gfx1151
Change-Id: If78840e57c2523696c620d28f4c4ffb004128c0c
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
2023-08-24 17:27:04 +08:00
Ranjith Ramakrishnan 65911e8368 Use memset for initializing variable sized array
ASAN builds use clang as the compiler. Initializing a variable-sized
array with the assignment operator causes a compilation failure in
those builds. Use memset instead.

Change-Id: I02aef3b99a6cad0cce3a378210a48732e07a88fb
2023-08-21 12:01:58 -07:00
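A minimal sketch of the pattern described above, assuming a C99-style variable-length array. The function and variable names are hypothetical and do not come from the repository.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zeroes a variable-length array with memset,
 * stores one value in slot 0, and returns the sum of all slots.
 * `n` must be greater than zero. */
static int first_slot_sum(size_t n, int first)
{
    int slots[n];                     /* VLA: an `= {0}` initializer is rejected here */
    memset(slots, 0, sizeof(slots));  /* sizeof on a VLA is evaluated at run time */
    slots[0] = first;

    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += slots[i];
    return sum;
}
```

Because every slot other than the first was zeroed, the sum equals the value stored in slot 0.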
Jonathan Kim a3f8085025 kfdtest: add trap on wave start and end test
Add test to catch trap on wave start or end override event.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Icb57af64475fbd2d8a6c0af9a2ee5db5d1a169c6
2023-08-18 12:15:08 -04:00
Jonathan Kim 8311ca5bfa kfdtest: add address watch test
Address watch test will test read and write operations.
Test will also check if operation is precise if precise
address watch is available.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I7ef835790e26bf6345682755d7dd26a35853bcd5
2023-08-18 12:15:07 -04:00
Jonathan Kim 431dc8d403 kfdtest: Add ops for address watch test
Add wave launch override, set/clear address watch and precise memops
test.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Ib405d5570cd304e02c2e76eca3593cbd9a5937d9
2023-08-18 12:11:48 -04:00
Jonathan Kim d4029a9492 kfdtest: add memory violation test
Add memory violation detection test.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I1b56f684682836fc84fbec713bd81c53bdd6d413
2023-08-18 12:11:48 -04:00
Jonathan Kim 6c5121faff kfdtest: allow toggle of dispatch privilege
For GFX11 debugger testing, waves must start in non-privileged mode for
some test cases, so allow the tester to set this.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Iee93fda926bfd336d51c79c086f1f75bc35b70e5
2023-08-18 12:09:07 -04:00
Ranjith Ramakrishnan 96c8bb11b1 Disable file reorg backward compatibility support by default
Change-Id: I157e05e52a1a61b86fa2fc6f29d31361a688fa10
2023-08-09 12:31:27 -04:00
Jonathan Kim d20f0bbb90 kfdtest: add suspend resume test
Add queue suspend and resume test.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I2ade721026cbb458a3597b7858a164e70fe05f4f
2023-08-09 09:26:46 -04:00
Jonathan Kim b0e84183c1 kfdtest: add snapshot operations
Add queue and devices snapshot operations.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I836884c9f3b65dd9e5e444d554d3eb87938e1634
2023-08-09 09:26:29 -04:00
Jonathan Kim 5a675921ea kfdtest: add suspend and resume queues operation
Add base debug operations to suspend and resume queues.
Routine will return the number of queues successfully
suspended or resumed.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I8f18317f70464b04231c5cf822e11d545ebfa02a
2023-08-09 09:26:09 -04:00
Jonathan Kim b77189cf83 kfdtest: add hit trap event test
Check that a jump to trap event can be picked up by the debugger.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Iad5f87092f2b82d5018013bba548979122a9bd02
2023-08-08 16:01:23 -04:00
Jonathan Kim 97fc25bb8d kfdtest: add set exceptions enable base debug operation
Add set exceptions enabled debug option

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I6ee1769bbbb90a74074d8100974c4bfeabaf7f2c
2023-08-08 16:01:03 -04:00
Jonathan Kim 097ee967d1 kfdtest: add runtime enable and attach test
Add debug attach and runtime enable test for attaching to a spawned and
running process.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I72302ff73494d9dae0c79a299508085d7ca0552b
2023-08-08 16:00:44 -04:00
Jonathan Kim bfb0d15ee8 kfdtest: add query operations
Add polling query debug event operation.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Ic82ce4a393bfb28c9f32e7920f80c12da7f627d5
2023-08-08 16:00:25 -04:00
Jonathan Kim c7129edcb8 kfdtest: add send exception operation
Add send exception operation.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: Iecb43004a1a7ffc75e94badf203cd0927ffe0909
2023-08-08 16:00:04 -04:00
Jonathan Kim dd56b38c2f kfdtest: add base debug class and debug attach/detach operation
Add base debug class and attach/detach operations.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I60f3c166646f05838fec208ac2f59bba998c63f8
2023-08-08 15:59:35 -04:00
David Yat Sin 66b66e42cd Keep libdrm device_handle on older libdrm
Even if the version of libdrm is older and does not support the
amdgpu_device_get_fd function, the device_handle stored in
amdgpu_handle[] is still valid and can be returned via
hsaKmtGetAMDGPUDeviceHandle.

Change-Id: I024a3e82e6cfebac5577aefe359b067746c4023e
2023-08-01 10:52:26 -04:00
Jonathan Kim aaab019960 libhsakmt: add debug trap thunk call for testing
Add a generic thunk call for debug testing that assumes the
caller populates the trap arguments correctly.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I33a0bc66ca77e29f5b663d4bfe73f8684df8bfb6
2023-07-26 10:29:27 -04:00
Jonathan Kim 98c6784cc1 kfdtest: remove deprecated debug references
Remove all unused material from KFDDBGTest.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I13ed68656efadef7bbaf8bb737ce5a04829eca9b
2023-07-26 10:29:23 -04:00
Jonathan Kim 8471f80bac libhsakmt: remove old debugger versioning
Current debugger uses KFD version directly.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I212a53560a94dd24c599addce72f59c527c8af25
2023-07-26 09:41:38 -04:00
Philip Yang a395dd7306 kfdtest: KFDSVMEvictTest support large VRAM or small system memory
With XNACK off, skip the SVM evict tests if the memory allocation size is larger
than 15/16 of total system memory, because the test may fail to allocate
the CWSR SVM range needed to create a queue after allocating test memory.

Limit the eviction size from the total VRAM size to 1/2 of the total VRAM size,
because with 192GB of VRAM, evicting 192GB may take more than 120 seconds
and cause the test to fail with a timeout.

Change-Id: Ib1483b9aab580a8539187b2943cadea0fd5a7c71
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-07-25 11:11:55 -04:00
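The two size rules in the commit above amount to simple arithmetic, sketched here under assumptions: the function names are hypothetical, sizes are in bytes, and the real test may compute its inputs differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the requested allocation is larger than 15/16 of system
 * memory. Dividing before multiplying keeps the arithmetic from
 * overflowing for large memory sizes. */
static bool exceeds_sysmem_budget(uint64_t alloc_bytes, uint64_t sysmem_bytes)
{
    return alloc_bytes > sysmem_bytes / 16 * 15;
}

/* Eviction size capped at half of total VRAM. */
static uint64_t eviction_size(uint64_t vram_bytes)
{
    return vram_bytes / 2;
}
```

With 192GB of VRAM, this cap yields a 96GB eviction size.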
Kent Russell 1958224379 run_kfdtest.sh: Clarify parameters taking arguments
The --node and --exclude flags take arguments, but the usage text was
unclear. This led to attempts like --node=1 , which will not work
appropriately. Add examples for flags that take parameters, as well as
the requirements for those parameters. Also change --exclude parsing to
match --node parsing, for consistency.

Change-Id: I563ba9b370a24d9a84b9c39093f3cb1a5d723cef
2023-07-21 10:53:31 -04:00
Jonathan Kim 2d3a09cbd6 kfdtest: disable gws tests for gfx11
GFX11 will no longer use GWS for cooperative launch so disable the test.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I8611c8158e1654782150ad10f1f65edb578e6435
2023-07-20 11:22:56 -04:00
Ori Messinger 7cc3ffc115 kfdtest: Fix kfdtest.exclude "ReadOnlyRangeTest" Issue
The purpose of this patch is to fix an issue in kfdtest.exclude's
blacklist for KFDSVMRangeTest.ReadOnlyRangeTest.

Excluding "KFDSVMRangeTest.ReadOnlyRangeTest" without adding a "*"
to the end causes the test to still run, since after a recent patch
the test actually runs these two variants instead:
   -"KFDSVMRangeTest.ReadOnlyRangeTest/0"
   -"KFDSVMRangeTest.ReadOnlyRangeTest/1"
(For XNACK OFF/ON)

Now, the test is excluded as "KFDSVMRangeTest.ReadOnlyRangeTest*"
to cover those two XNACK ON/OFF variants.

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I067c4c99fe839ce6cec5d134bd605e8cb41b8291
2023-06-29 23:14:30 -04:00
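Why the trailing "*" matters can be illustrated with a glob match. This sketch uses POSIX `fnmatch` as a stand-in; the test filter used by kfdtest has its own pattern matcher, and `is_excluded` is a hypothetical helper.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Hypothetical helper: true when test_name matches the exclusion pattern.
 * With flags set to 0, '*' also matches the '/' in parameterized names. */
static bool is_excluded(const char *pattern, const char *test_name)
{
    return fnmatch(pattern, test_name, 0) == 0;
}
```

The bare name matches only itself, so the "/0" and "/1" variants slip past it, while the pattern ending in "*" covers both.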
Jeremy Newton 473a66d115 Don't install asan license if disabled
Change-Id: I8bffe5ec8496ff11e6d66995dd470cddb13f3c0d
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
2023-06-29 09:34:49 -04:00
Alex Sierra 0cbf26c148 src: add debug API to support GPU core dump
Add API functions to extract the following information from KFD:
runtime information, device info, and queue snapshots.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: If995ecc54497ab61189bb0f209c64af0bbb0f56f
2023-06-26 18:58:15 +00:00
Alex Sierra 5e0a32d7b3 add hsaKmtGetRuntimeCapabilities API
Queries the runtime capabilities after the runtime has been enabled.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I098c0e9862c0c1d5e304b111cdc281c0ccd09691
2023-06-26 18:58:15 +00:00
Ori Messinger 3447b795df kfdtest: Fix gfx_target_version Parsing Issue
The purpose of this patch is to fix an issue in the run_kfdtest.sh
script's gfx_target_version parsing.

When the character length of the "gfx_target_version" value is
equal to 5 instead of 6, it will now be zero padded on the left to
allow each Major/Minor/Stepping value to be parsed correctly.

Also, kfdtest.exclude file now replaces the default filter for
aqua_vanjaram with the following 3 gfx filters:
gfx940, gfx941, & gfx942

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I1f0264d3705803f24ad3c458e6bd367fbbec62be
2023-06-23 13:18:05 -04:00
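The zero-padding step can be sketched in C, although the actual fix lives in the run_kfdtest.sh shell script. The struct, the function name, and the two-digits-per-field layout are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for the parsed fields. */
struct gfx_version {
    unsigned major, minor, stepping;
};

/* Pads the value to six digits on the left, then reads two decimal
 * digits each for major, minor, and stepping (assumed layout).
 * Returns 0 on success, -1 on malformed input. */
static int parse_gfx_target_version(const char *text, struct gfx_version *out)
{
    char padded[7];
    char *end;
    unsigned long value = strtoul(text, &end, 10);

    if (end == text || *end != '\0' || value > 999999UL)
        return -1;

    snprintf(padded, sizeof(padded), "%06lu", value);
    if (sscanf(padded, "%2u%2u%2u",
               &out->major, &out->minor, &out->stepping) != 3)
        return -1;
    return 0;
}
```

Without the padding, a five-character value would be split at the wrong digit boundaries and the major field would absorb a digit that belongs to the minor field.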
Kent Russell 9a22bade89 kfdtest.exclude: Blacklist CuMaskingEven on all ASICs
This has slowly become less and less reliable on more and more ASICs,
so just blacklist it altogether. Using wall clock for performance
is not a reliable method for testing performance, so skip it to avoid
more failure reports on various systems.

Change-Id: I1a5744604e4620bc7675a629d146ba4ffba669d2
2023-06-15 11:24:04 -04:00
Ruili Ji 9bf1cbe4ed kfdtest: Update COMPUTE_PGM_RSRC1 for software trap
For ASICs in the GFX11 domain that don't need software traps,
testing with COMPUTE_PGM_RSRC1.PRIV = 1 makes the system hang.

Change-Id: I00cf8eb6d6b07856885c77bd343ca3c41cc3cad5
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
2023-06-14 07:46:51 -04:00
Philip Yang 29b04c2534 kfdtest: Fix KFDSVMEvictTest.QueueTest OOM
Fix a typo in the calculation of bufferSize from vramBufSizeInPages. The OOM
shows up only with HSA_XNACK=1 because HSA_XNACK=0 doesn't support VRAM
oversubscription. The SVM tests were changed to run with both XNACK off and
on.

Change-Id: I3949959288fd92f4e7f4a87115a5f1547e225042
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-06-13 21:15:31 -04:00
James Zhu 4ba8f1fe77 kfdtest: Add test for event wait with event age tracking enable
Add 5 different test scenarios to cover the new event age tracking features.

Change-Id: Icab43240fd127208b18abbd7542d6444127ef0c7
Signed-off-by: James Zhu <James.Zhu@amd.com>
2023-06-10 11:41:50 -04:00
James Zhu a0cbf90b90 libhsakmt: add event age tracking
Keep the last signaled event age to avoid race conditions
for HSA_EVENTTYPE_SIGNAL when the initial event age value is non-zero.

Change-Id: Ifb9a11a6868e5762a9f92f579e45a0a2c8fa1017
Signed-off-by: James Zhu <James.Zhu@amd.com>
2023-06-10 11:41:50 -04:00
Ori Messinger 4675492852 kfdtest: Fix minor typo
The purpose of this patch is to fix a minor typo in KFDSVMRangeTest.
Before:
"Skipping test: no enough system memory."
After:
"Skipping test: Not enough system memory."

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I247cb558a177a1d25c393bf16c7386f4d79d0fba
2023-06-08 15:58:25 -04:00
Graham Sider d1a095123d kfdtest: Update GFX11 blacklist
KFDQMTest.MultipleCpQueuesStressDispatch is fixed as of MES SCHQ version
0x3c ().

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I437f3eb5f12dc159339a9b7c7cff2e2b8214ad7c
2023-06-08 14:11:30 -04:00
Sreekant Somasekharan 1428a7538e kfdtest: RoundToPowerOf2 function modified for compiler compliant bit shift values
The behavior of a shift is undefined if the right operand is negative,
or greater than or equal to the width of the promoted left operand.
In release builds with address sanitizer enabled, the compiler's
handling of this undefined behavior leads to an unsupported queue size
value, since the current method shifts by up to 128 bits on a 64-bit value.

Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com>
Change-Id: Iafdc82d0dfb7f79e3012fb7bb70eda80e4b7a7a6
2023-06-05 18:14:58 -04:00
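One way to round up to a power of two without an out-of-range shift is sketched below. This is not the kfdtest implementation: the function name and the choice to round up (rather than to the nearest power) are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: smallest power of two that is >= value.
 * Returns 0 when the result would not fit in 64 bits. Every shift is
 * by exactly one bit, so the shift count never reaches the type width. */
static uint64_t round_up_to_power_of_2(uint64_t value)
{
    if (value <= 1)
        return 1;
    if (value > (UINT64_C(1) << 63))
        return 0;               /* not representable in 64 bits */

    uint64_t result = 1;
    while (result < value)
        result <<= 1;
    return result;
}
```

Checking the upper bound first guarantees the loop terminates before `result` could overflow.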
Alex Sierra 728162c2c8 libhsakmt: include changes for upstream debugger API
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Id296e13dff431c7a151c5aae0b93412b1e116467
2023-06-02 15:56:19 -04:00
Kent Russell 718d95de77 fmm.c: Fix possibly initialized variable usage
If we end up in the first if clause, aperture_base is not set, unlike
the other 2 clauses. Initialize it to NULL at declaration time, and only
change its value in the final else clause, where we set it to
aperture->base

Change-Id: I2bf44dc93cae8a03e66f41cedd85d57be2115bba
Signed-off-by: Kent Russell <kent.russell@amd.com>
2023-05-30 15:46:14 -04:00
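The shape of this fix can be sketched as follows. The enum, the function, and the branch conditions are hypothetical; only the variable name `aperture_base` and the initialize-at-declaration approach come from the commit message.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the three branches in fmm.c. */
enum branch_kind { BRANCH_FIRST, BRANCH_SECOND, BRANCH_FINAL };

/* aperture_base starts as NULL at declaration, so every path leaves it
 * with a defined value; only the final else assigns the real base. */
static void *select_aperture_base(enum branch_kind kind, void *base)
{
    void *aperture_base = NULL;

    if (kind == BRANCH_FIRST) {
        /* previously left aperture_base unset on this path */
    } else if (kind == BRANCH_SECOND) {
        /* no assignment needed: already NULL */
    } else {
        aperture_base = base;
    }
    return aperture_base;
}
```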
Xiaogang Chen f6183f937e libhsakmt: allow gpu nodeid arrary is null and number of gpu is zero.
Allow the hsaKmtRegisterGraphicsHandleToNodes parameters NodeArray to be null
and NumberOfNodes to be zero at the same time. This is the case where we do not
want the imported buffer to be registered by KFD. Set gpu_id_array = NULL
explicitly to avoid freeing an uninitialized GPU ID array.

Report: Yat Sin, David<David.YatSin@amd.com>
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3babc1160c9573e38dd11d81965c8de2b70cae2e
2023-05-29 00:15:14 -04:00
Xiaogang Chen 7e4e57ae5f libhsakmt: have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu.
Have hsaKmtMapMemoryToGPU return the same value as fmm_map_to_gpu for consistency.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ifabb72301e1d5a6c1310973bb1321714e12a1fa6
2023-05-29 00:15:14 -04:00
Xiaogang Chen ac1db60fc2 libhsakmt: query/use render node fds that libdrm uses.
Query the render node fds that libdrm uses for the current process and
use them in the Thunk if available.

v2: avoid naming conflict with amdgpu_device_get_fd from amdgpu.h

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Id7288c03730f4a4c9c3644e37ca4725fec71a471
2023-05-29 00:15:14 -04:00