Add base debug class and attach/detach operations.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I60f3c166646f05838fec208ac2f59bba998c63f8
[ROCm/ROCR-Runtime commit: dd56b38c2f]
Even if the version of libdrm older and does not support the
amdgpu_device_get_fd function, the device_handle stored in
amdgpu_handle[] is still valid and can be returned via
hsaKmtGetAMDGPUDeviceHandle.
Change-Id: I024a3e82e6cfebac5577aefe359b067746c4023e
[ROCm/ROCR-Runtime commit: 66b66e42cd]
Remove all unused material from KFDDBGTest.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I13ed68656efadef7bbaf8bb737ce5a04829eca9b
[ROCm/ROCR-Runtime commit: 98c6784cc1]
Current debugger uses KFD version directly.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I212a53560a94dd24c599addce72f59c527c8af25
[ROCm/ROCR-Runtime commit: 8471f80bac]
For xnack off, skip SVM evict tests if memory allocation size is larger
than 15/16 total system memory, because the test may fail to allocate
CWSR svm range to create queue after allocating test memory.
Limit eviction size from total VRAM size to 1/2 total VRAM size,
because for 192GB VRAM, evict 192GB may takes more than 120 seconds
and cause test timeout failed.
Change-Id: Ib1483b9aab580a8539187b2943cadea0fd5a7c71
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: a395dd7306]
For --node and --exclude, these flags take arguments, but usage was
unclear. This led to attempts like --node=1 , which will not work
appropriately. Add examples for flags that take parameters, as well as
the requirements for those parameters. Also change --exclude parsing to
match --node parsing, for consistency
Change-Id: I563ba9b370a24d9a84b9c39093f3cb1a5d723cef
[ROCm/ROCR-Runtime commit: 1958224379]
GFX11 will no longer use GWS for cooperative launch so disable the test.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I8611c8158e1654782150ad10f1f65edb578e6435
[ROCm/ROCR-Runtime commit: 2d3a09cbd6]
The purpose of this patch is to fix an issue in kfdtest.exclude's
blacklist for KFDSVMRangeTest.ReadOnlyRangeTest.
Excluding "KFDSVMRangeTest.ReadOnlyRangeTest" without adding a "*"
to the end causes the test to still run, since after a recent patch
the test actually runs these two variants instead:
-"KFDSVMRangeTest.ReadOnlyRangeTest/0"
-"KFDSVMRangeTest.ReadOnlyRangeTest/1"
(For XNACK OFF/ON)
Now, the test is excluded as "KFDSVMRangeTest.ReadOnlyRangeTest*"
to cover those two XNACK ON/OFF variants.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I067c4c99fe839ce6cec5d134bd605e8cb41b8291
[ROCm/ROCR-Runtime commit: 7cc3ffc115]
Functions to API added to extract the following information from KFD
Runtime information, device info and queues snapshot.
Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: If995ecc54497ab61189bb0f209c64af0bbb0f56f
[ROCm/ROCR-Runtime commit: 0cbf26c148]
Queries for runtime capabilities after its being enabled
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I098c0e9862c0c1d5e304b111cdc281c0ccd09691
[ROCm/ROCR-Runtime commit: 5e0a32d7b3]
The purpose of this patch is to fix an issue in the run_kfdtest.sh
script's gfx_target_version parsing.
When the character length of the "gfx_target_version" value is
equal to 5 instead of 6, it will now be zero padded on the left to
allow each Major/Minor/Stepping value to be parsed correctly.
Also, kfdtest.exclude file now replaces the default filter for
aqua_vanjaram with the following 3 gfx filters:
gfx940, gfx941, & gfx942
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I1f0264d3705803f24ad3c458e6bd367fbbec62be
[ROCm/ROCR-Runtime commit: 3447b795df]
This has slowly become less and less reliable on more and more ASICs,
so just blacklist it altogether. Using wall clock for performance
is not a reliable method for testing performance, so skip it to avoid
more failure reports on various systems.
Change-Id: I1a5744604e4620bc7675a629d146ba4ffba669d2
[ROCm/ROCR-Runtime commit: 9a22bade89]
If asics don't need software traps within GFX11 domain,
test with COMPUTE_PGM_RSRC1.PRIV = 1 will make system hang.
Change-Id: I00cf8eb6d6b07856885c77bd343ca3c41cc3cad5
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
[ROCm/ROCR-Runtime commit: 9bf1cbe4ed]
Typo to calculate bufferSize from vramBufSizeInPages. The OOM shows up
only with HSA_XNACK=1 because HSA_XNACK=0 doesn't support VRAM
oversubscription. We changed to run SVM tests with both XNACK off and
on.
Change-Id: I3949959288fd92f4e7f4a87115a5f1547e225042
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 29b04c2534]
Add 5 different test scenario to cover new event age tracking features.
Change-Id: Icab43240fd127208b18abbd7542d6444127ef0c7
Signed-off-by: James Zhu <James.Zhu@amd.com>
[ROCm/ROCR-Runtime commit: 4ba8f1fe77]
Keeping last signaled event age to avoid race conditions
for HSA_EVENTTYPE_SIGNAL when event age init value is non-zero.
Change-Id: Ifb9a11a6868e5762a9f92f579e45a0a2c8fa1017
Signed-off-by: James Zhu <James.Zhu@amd.com>
[ROCm/ROCR-Runtime commit: a0cbf90b90]
The purpose of this patch is to fix a minor typo in KFDSVMRangeTest.
Before:
"Skipping test: no enough system memory."
After:
"Skipping test: Not enough system memory."
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I247cb558a177a1d25c393bf16c7386f4d79d0fba
[ROCm/ROCR-Runtime commit: 4675492852]
KFDQMTest.MultipleCpQueuesStressDispatch is fixed as of MES SCHQ version
0x3c ().
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I437f3eb5f12dc159339a9b7c7cff2e2b8214ad7c
[ROCm/ROCR-Runtime commit: d1a095123d]
Compiler behavior is undefined if the right operand is negative,
or greater than or equal to the width of the promoted left operand.
For release builds with address sanitizer enabled, this compiler
optimization behavior leads to unsupported queue size value since
current method shifts till 128 bits on a 64 bit value.
Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com>
Change-Id: Iafdc82d0dfb7f79e3012fb7bb70eda80e4b7a7a6
[ROCm/ROCR-Runtime commit: 1428a7538e]
If we end up in the first if clause, aperture_base is not set, unlike
the other 2 clauses. Initialize it to NULL at declaration time, and only
change its value in the final else clause, where we set it to
aperture->base
Change-Id: I2bf44dc93cae8a03e66f41cedd85d57be2115bba
Signed-off-by: Kent Russell <kent.russell@amd.com>
[ROCm/ROCR-Runtime commit: 718d95de77]
Allow hsaKmtRegisterGraphicsHandleToNodes parameters NodeArray be null
and NumberOfNodes be zero at same time. It is the case we want the imported
buffer not be registered by kfd. Set gpu_id_array = NULL explicitly to avoid
free uninitialized gpuid array.
Report: Yat Sin, David<David.YatSin@amd.com>
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3babc1160c9573e38dd11d81965c8de2b70cae2e
[ROCm/ROCR-Runtime commit: f6183f937e]
Have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu to keep consistency.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ifabb72301e1d5a6c1310973bb1321714e12a1fa6
[ROCm/ROCR-Runtime commit: 7e4e57ae5f]
Query render node fds that libdrm uses for current process and
use them at Thunk if available.
v2: avoid naming conflict with amdgpu_device_get_fd from amdgpu.h
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Id7288c03730f4a4c9c3644e37ca4725fec71a471
[ROCm/ROCR-Runtime commit: ac1db60fc2]
When gpu map info is not provided import DMABuf without VA assigned.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I996ab4eb46977af5064126529c28a8bf20a67292
[ROCm/ROCR-Runtime commit: 989c6c617c]
Alloc vram by kfd, then map by GEM api to GPU VM and map to CPU VM.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ib5b2f35662cd5473f622f6ffc9b62925fe57ae42
[ROCm/ROCR-Runtime commit: 108c0e5f92]
This new manageable_aperture_t is used for VRAM allocation-only and
VA allocation-only.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3866ef9d35386d6aef7b6934ac8d4a89ef843b50
[ROCm/ROCR-Runtime commit: 0a2989083b]
This reverts commit 89ce41694f.
Current amdgpu exposes one render node for one gpu node/partition,
revert to previous way to open render node at Thunk.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I436be74f8e872a7ab5c4a1420b4ea884f5a00e57
[ROCm/ROCR-Runtime commit: cc4fb2d1a9]
Add parameterization for KFDSVM tests so that we test with both XNACK
enabled and XNACK disabled. This will be overridden by HSA_XNACK, if set
Change-Id: Ie96eb61c03115f947e08cfa076ac459f7440f5d8
[ROCm/ROCR-Runtime commit: 478a68d49c]
For aqua_vanjaram APU mode, KFDEvictTest and KFDSVMEvictTest are
skipped. Those tests passed on dGPU mode with memory reporting partition
support on GFX 9.4.3.
Change-Id: I56357843c6743b01b807359dbb37b32391fd9a25
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 5df82e3d14]
Add support functions to remap the first page of device memory (GPU/GTT)
to share host ASAN logic.
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: I4c27d5417ba80a172dccb0a079a597c5dc1c8f85
[ROCm/ROCR-Runtime commit: 1e6d728730]
When we merge thunk into ROCr, kfdtest will be in a different folder
structure. Add the new location to ensure that we can build now and in
the future with no disruptions
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I6517e061cb0da7137d903abbc380bfc7126f40d4
[ROCm/ROCR-Runtime commit: d966243783]
Starting with GFX11, wptr BOs must be mapped to GART for MES to determine work
on unmapped queues for usermode queue oversubscription (no aggregated doorbell)
Change-Id: I10e30fdc2bec587cef9427faa4874957988c34b3
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
[ROCm/ROCR-Runtime commit: d319660838]
If MES is enabled, wptr has to be non paged memory,
Add an API to check this condition.
Change-Id: I53af1f6687d5332d102e7062c3d760e33b96e722
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
[ROCm/ROCR-Runtime commit: 53ed978c3d]
Using wrapper header files will result in #warning message by default
Change-Id: I8301e433d39f3e5d39384ede6f0e4464d0eb20a6
[ROCm/ROCR-Runtime commit: b487f87363]
If Dev0 and Dev1 are not the same gfx, we should temporarily
set the target ASIC for compiling Shader code.
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Signed-off-by: Shikai Guo <shikai.guo@amd.com>
Change-Id: I5836beb16ade519f5a148d3d2b9c2875554f0c35
[ROCm/ROCR-Runtime commit: 5d6f900353]
Overload Assembler::RunAssembleBuf to take in an extra Gfxv parameter.
Using this overload will temporarily set the target ASIC to Gfxv before
calling RunAssemble, and copy back the original MCPU literal upon
completion. The copy to reset the original MCPU in this case is safe as
the MCPU length is always known.
This will be useful in multi-device test cases whereby the devices are
not necessarily the same gfx version. The overload is explicitly for the
RunAssembleBuf wrapper rather than RunAssemble to ensure the default
MCPU is always reset independent of errors in RunAssemble.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I7fe5a962876314b6df32e4b7160174949d98f9e3
[ROCm/ROCR-Runtime commit: 54136f60a0]
LLVM MC does not seem to accept multi-line conditionals. This may be
fixable in the future with macros. The Aqua Vanjaram shader spec states
that while buffer_invl2 has been replaced by buffer_inv, the former may
still be used for compatibility. However, this does not seem to be
implemented. For now, fix conditional.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I7f8b64c96055371d7e0090b758d2cfd2a37ecd3c
[ROCm/ROCR-Runtime commit: 92f3d4a458]
Previous code might fail to get the correct ln node. And trigger extra
walk through of the tree. Fix it.
While walking through the tree, better to search from right to left as
the node->start likely close to *address*.
Change-Id: If86ddf73e59a1eb88225d1ea90797818e8165488
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
[ROCm/ROCR-Runtime commit: 77761836ae]
These tests should also pass on Aqua Vanjaram, so enable them
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ibbb9cd43d653c63b08c39efd1d7326cfac1f8411
[ROCm/ROCR-Runtime commit: eed5518e4c]
Aqua Vanjaram is intended to have fine-grained coherency
from anywhere to anywhere else using read-acquire and
write-release primitives.
Add a test that writes to memory covered by five
different cache lines, then write-releases, while
another thread read-acquires, then reads those
five locations in memory.
There are nine variations of the test to cover
CPU-GPU, same-GPU and across-GPU, vector instructions and
scalar instructions, and data local to the
acquirer or receiver.
Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I20d2db5c53bd280e971479aad7e61df6ed5d3623
[ROCm/ROCR-Runtime commit: 30b1f23f7a]
For vector iterator loop access current node directly, don't need
gpuNodesAll.at(i), which also causes out of range access.
Change vector index loop to iterator loop to simplify the code.
Change-Id: I2627ef8d13b5d2c9cd8c51cf4dacc3e8a97fcfb0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 0696f06c16]
AppAPU VRAM is part of system memory managed by Linux kernel, no
VRAM eviction and restore is needed between VRAM and system memory.
Those Evict test failed on AppAPU now, skip those tests on AppAPU.
No page migration between VRAM and system on AppAPU, HMMProfilingEvent
depends on migration event, skip it on AppAPU.
Change-Id: I4c809b97c947e809d136c1f88db2278cf74f5b47
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 21abaef3f8]
If there is connection between GPU and CPU with weight 13,
KFD_CRAT_INTRA_SOCKET_WEIGHT, then this is AppAPU.
This will be used to skip tests not suitable for AppAPU.
Change-Id: If6fad81528b52afd4ac4cefa508d787b0f6637ca
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: e2df2c21af]