KFDDBGTest is deprecated, so just remove its references to IsaGen.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I9f094d847a8ae43cb3793253b34a7d7ed2179ac1
[ROCm/ROCR-Runtime commit: ac48163885]
Use the ReadMemoryIsa shader, moved over and updated from KFDEvictTest.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I566f9ec36398bc4d08ab90231688600356df4d6a
[ROCm/ROCR-Runtime commit: 097b11abad]
Use macros to simplify shader code that has instruction-level
differences depending on GFX version. These macros are extensible and
are prepended to every shader so that they are usable everywhere.
This patch introduces three macros, used within the IterateIsa and
ReadMemoryIsa shaders.
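A minimal sketch of the prepend step, for illustration only. The macro
name and body below are hypothetical (this message does not list the three
macros) and have not been checked against the assembler; kShaderMacros and
PrependMacros are likewise assumed names, not the ones used in kfdtest.

```cpp
#include <string>

// Hypothetical macro block. The .macro/.if/.endm directives are LLVM
// assembler syntax; the instruction choice per GFX generation is only an
// illustration of an "instruction-level difference".
static const char* kShaderMacros = R"(
.macro V_ADD_CO_U32 vdst, src0, vsrc1
    .if (.amdgcn.gfx_generation_number >= 10)
        v_add_co_u32 \vdst, vcc_lo, \src0, \vsrc1
    .else
        v_add_co_u32 \vdst, vcc, \src0, \vsrc1
    .endif
.endm
)";

// Return the shader source with the common macro block placed in front,
// so every shader can use the macros without declaring them itself.
std::string PrependMacros(const char* shader) {
    return std::string(kShaderMacros) + shader;
}
```

With this shape, each shader string stays free of per-GFX branches and the
assembler resolves the macro for the selected target.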
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: If954e1b6d2027e9f55bf7e99bd9df2668d1da524
[ROCm/ROCR-Runtime commit: 5ceb35f428]
Initial commit for ShaderStore.hpp. It will contain the const char*
strings for all shaders used within KFDTest.
The LLVM assembler now takes care of the correct instructions to be used
for various GFX versions using directives embedded into the shader assembly.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I2887a03b33d5c2cc382e4f96c2bc3e067715ab54
[ROCm/ROCR-Runtime commit: 34ca37d9e8]
- Reformat shaders for legibility
- Move assembly from IsaGen (CompileShader) to Assembler
(RunAssembleBuf)
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Id1eb3856bc74bf0da46685c5dc08e91f5df66d4f
[ROCm/ROCR-Runtime commit: a7b85fdb08]
- Reformat shaders for legibility
- Move assembly from IsaGen (CompileShader) to Assembler
(RunAssembleBuf)
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I7333d0e45ccd3f43690a2a01227f89a6e04fcecb
[ROCm/ROCR-Runtime commit: b44d6762bd]
- Reformat shaders for legibility
- Move assembly from IsaGen (CompileShader) to Assembler
(RunAssembleBuf)
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I174f1ea5332c499440b30d9bcf06836274428a0f
[ROCm/ROCR-Runtime commit: c845b976d0]
- Reformat shaders for legibility
- Move assembly from IsaGen (CompileShader) to Assembler
(RunAssembleBuf)
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I669f076b5c34eb90349865eeca1b29e17c9e80d6
[ROCm/ROCR-Runtime commit: 08d38fb140]
- Reformat shaders for legibility
- Move assembly from IsaGen (CompileShader) to Assembler
(RunAssembleBuf)
- LLVM syntax change on ScratchCopyDwordIsa_gfx10:
hwreg(HW_REG_SHADER_FLAT_SCRATCH_LO/HI) -> hwreg(HW_REG_FLAT_SCR_LO/HI)
- Fix bug in CopyOnSignalIsa_gfx10 and PollMemoryIsa_gfx10 whereby
flat_store_dword used vector reg format v[n,n]. Changed to v[n:n]
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Id182cfb8aeb7372366c59affb5cbdd145909ee96
[ROCm/ROCR-Runtime commit: 039bce94a6]
Instantiate in KFDBaseComponentTest::SetUp() and destroy in TearDown().
This ensures m_pAsm is available for all tests.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I8b98a5350a9739d71455f14552c9879bdb1c475d
[ROCm/ROCR-Runtime commit: 235636d598]
Initial commit for transition from IsaGenerator/SP3 assembler model to
the LLVM AMDGPU (AMDGCN) assembler backend:
- Add Assembler class, which may be instantiated for assembly in the
same way as IsaGenerator.
- Add Assembler and LLVM archive dependencies to build process.
- CXX bumped to gnu++14 as required for LLVM compilation.
- Compatible with LLVM 7.0 and greater (latest Lightning/llvm-git
version should be used for up-to-date gfx support). Note that this is
just a build dependency and *not* a runtime dependency. LLVM does not
need to be installed on the host machine to run kfdtest.
- CMake will first look for a Lightning build. Lightning itself does not
need to be installed system-wide, just built. If this fails, it will
attempt to find a system-wide LLVM install.
General Assembler usage and notes:
- Similar to IsaGenerator, applicable test classes will contain an
Assembler object pointer which may be instantiated in the test
constructor.
- Instantiation requires the GFXIP version in order to find the
appropriate LLVM AMDGPU Target ID.
- The RunAssemble() member func takes in a standard const char* shader and
fills the TextData member with the output binary; TextSize with the size
of TextData. These may be accessed via GetInstrStream() and
GetInstrStreamSize(), or the output binary may be copied into an
IsaBuffer via CopyInstrStream(). RunAssembleBuf() combines RunAssemble()
and CopyInstrStream() and additionally takes an optional BufSize
parameter to specify the size of the output buffer (defaults to
PAGE_SIZE).
- Assembler object deletion is to be done in the base test destructor.
Assembler-specific memory allocation is freed in the Assembler
destructor.
- For debug, one can call PrintTextHex() to print out a formatted hex
representation of the output binary, or PrintELFHex() to print out the
intermediate ELF object. Note that PrintTextHex() is public whereas
PrintELFHex() is private.
- Prints use the LLVM outs() call as that allows for use of the LLVM
format_hex() func in the aforementioned debug prints. This is subject to
change if the LOG() call would be preferred.
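As a rough illustration of the kind of formatted hex output described
above, here is a minimal sketch using only the standard library.
FormatTextHex and its row layout are assumptions made for this example;
the actual PrintTextHex() uses LLVM's outs() and format_hex(), and its
exact layout is not specified in this message.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Format a buffer of 32-bit instruction words as rows of zero-padded hex
// values. Hypothetical stand-in for a debug dump of the output binary.
std::string FormatTextHex(const uint32_t* words, size_t count,
                          size_t perRow = 4) {
    if (perRow == 0) perRow = 1;  // avoid division by zero
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (size_t i = 0; i < count; ++i) {
        out << "0x" << std::setw(8) << words[i];
        const bool endOfRow = ((i + 1) % perRow == 0) || (i + 1 == count);
        out << (endOfRow ? '\n' : ' ');
    }
    return out.str();
}
```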
RunAssemble control flow:
- Ensure correct Assembler initialization and clear previous run
TextData (if necessary).
- Initialize LLVM AMDGPU target, required interfaces, and buffers.
- Set parser to specified target/subtarget and assemble into ELF code
object.
- Extract .text section from ELF, allocate space for TextData and store.
- On success, returns 0 (HSAKMT_STATUS_SUCCESS). On error, returns -1
(subject to change to be in line with HSAKMT_STATUS enum).
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I1d96230824db651d3ffbaa46eb68fc274e7066b5
[ROCm/ROCR-Runtime commit: 65b1e0c058]
Set XNACK mode ON or OFF according to the HSA_XNACK=1 or 0 environment
setting when running KFDSVMRangeTest and KFDSVMEvictTest. If HSA_XNACK
is not defined, use the system boot-time XNACK mode setting.
Restore the original XNACK mode when the test finishes.
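The selection logic can be sketched as follows. XnackRequest, ParseXnack
and XnackFromEnvironment are hypothetical names for this example, and
treating any value other than "0" or "1" as "use the boot-time setting" is
an assumption; the message only defines the 1, 0 and undefined cases.

```cpp
#include <cstdlib>
#include <cstring>

enum class XnackRequest { UseBootDefault, Off, On };

// Interpret a raw HSA_XNACK value. Assumption: anything other than "0" or
// "1" (including an unset variable) falls back to the boot-time setting.
XnackRequest ParseXnack(const char* value) {
    if (value == nullptr) return XnackRequest::UseBootDefault;
    if (std::strcmp(value, "1") == 0) return XnackRequest::On;
    if (std::strcmp(value, "0") == 0) return XnackRequest::Off;
    return XnackRequest::UseBootDefault;
}

// Read the request from the process environment.
XnackRequest XnackFromEnvironment() {
    return ParseXnack(std::getenv("HSA_XNACK"));
}
```

A test fixture would then record the current mode before applying the
request and put it back during teardown.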
Change-Id: Ia896a1b0a90854646c8a79acca38a7d46098efde
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 23ec6e880e]
AQL firmware can sometimes send invalid signal interrupts with 0 context
ID. This test simulates this by submitting similar events using PM4
packets and measures the performance of signaling a normal event after
that.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I69028dc6dd98a5a93f18daad4efbe1b16b6098f9
[ROCm/ROCR-Runtime commit: e738e57fc4]
The KFD patch "drm/amdkfd: Ignore bogus signals from MEC efficiently" will
reserve one signal slot that user mode cannot use any more. Update
the maximum event number in KFDEventTest to match that change.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ic789e16b6d73dfea66ab51c5bbc075c8e8e2d052
[ROCm/ROCR-Runtime commit: 347bf6a03c]
Some platforms have only 256MB of VRAM, so allocating 256MB of VRAM
fails. Limit the test to a smaller VRAM allocation so that the
allocation succeeds.
Change-Id: Iba4c469de56925675e5624b300a6153e24ab19b3
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
[ROCm/ROCR-Runtime commit: c86a0b8332]
With granularityMB set to 128, it is not possible to allocate 3/4 of
the VRAM size when the VRAM size is below 512MB, and decreasing
granularityMB to 16 has no significant impact on the ROCt tests on other
systems. So decrease granularityMB on small-VRAM systems so that
LargestVramBufferTest() can handle them.
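To see why the step size matters, here is a small sketch. GranularityMB
and LargestCandidateMB are hypothetical helpers, and rounding the 3/4
target down to a multiple of the granularity is an assumption about how
the search behaves, not the test's actual code.

```cpp
#include <cstdint>

// Thresholds from the message: 128MB steps by default, 16MB steps when
// the VRAM size is below 512MB.
uint64_t GranularityMB(uint64_t vramSizeMB) {
    return vramSizeMB < 512 ? 16 : 128;
}

// Assumed model: the largest size a search can land on is 3/4 of VRAM
// rounded down to a multiple of the granularity.
uint64_t LargestCandidateMB(uint64_t vramSizeMB) {
    const uint64_t g = GranularityMB(vramSizeMB);
    return (vramSizeMB * 3 / 4) / g * g;
}
```

Under this model a 256MB system targets 192MB, which a 16MB step can reach
exactly, whereas a 128MB step could only offer 128MB or 256MB.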
Change-Id: Iea7c29abfd382a20761b653730fd09a220ad2fd0
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
[ROCm/ROCR-Runtime commit: 6c103877dd]
Tested on Talos II with Vega 64.
POWER systems allocate NUMA nodes on multiples of 8 to allow CPU
onlining / offlining.
Set the correct NUMA mask bits when requesting node-bound memory
allocations.
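One reading of the fix is that the mask bit must be indexed by the NUMA
node ID itself (0, 8, 16, ... on POWER) rather than by the node's position
in a list. A minimal sketch under that assumption, with MakeNodeMask as a
hypothetical helper that does not call libnuma or the thunk:

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Build a node mask with one bit per NUMA node ID. The mask is sized from
// the highest node ID, so sparse IDs such as 0 and 8 both fit.
std::vector<unsigned long> MakeNodeMask(const std::vector<unsigned>& nodeIds) {
    const size_t bitsPerWord = sizeof(unsigned long) * CHAR_BIT;
    unsigned maxId = 0;
    for (unsigned id : nodeIds) {
        if (id > maxId) maxId = id;
    }
    std::vector<unsigned long> mask(maxId / bitsPerWord + 1, 0UL);
    for (unsigned id : nodeIds) {
        mask[id / bitsPerWord] |= 1UL << (id % bitsPerWord);
    }
    return mask;
}
```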
This is a cleanup/squash/rebase of:
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/47
Change-Id: Id4af6dff7e66e9d464d6b17a1e99087eb3ac8e51
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: 5fd3c868b2]
Some VRAM access tests in MMBandWidth can be very slow on systems with
complicated PCIe topology. Skip tests that take a long time to avoid
excessively long running tests with little benefit.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I2950237347fc2f764f6aa3292ab819051472bf37
[ROCm/ROCR-Runtime commit: 3ecd54f098]
Map failures happen in AllocBuffers function when there
isn't enough space to move BO to vram. In such cases, the
function retries allocation/map until successful to continue
testing eviction and restore.
Print a message in KFDEvictTest when this happens to correlate
to the message seen in the kernel log.
amdgpu 0000:c1:00.0: amdgpu: Failed to map peer:0000:c1:00.0 mem_domain:4
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I0475d8d9521a07612182e54fc7cddb9bd44353e6
[ROCm/ROCR-Runtime commit: 0d07b3477b]
If PCIe Atomics aren't supported, we shouldn't try to run a test that
tests PCIe Atomics. Check for support, and bail early if it's not there.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: Ie9aa0fed3ece07fb83a33e6cacef2961626afab4
[ROCm/ROCR-Runtime commit: f62e9b9821]
While this is currently only used in one subtest, it's useful to have
this separated into the test utilities. This will also allow us to check
for PCI Atomics support before trying to run them.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I9704d151bfaa627eceae8399cc46c15babde6ff1
[ROCm/ROCR-Runtime commit: 8b54459e12]
Import the latest version from the kernel tree.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: If5f998ad55085ebd5020adaa382181204d834e3e
[ROCm/ROCR-Runtime commit: f88aaa933b]
These error messages should be handled by the caller.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I68d879d6d41835f47b8ac138c2218eaa6b86a512
[ROCm/ROCR-Runtime commit: dc33a092c0]
Currently, context save area size passed to KFD includes the
size of the debug area. Change this to report the actual size
of the context save area to KFD.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I5d440ae802255a97ade046775f6a000bae79d5d5
[ROCm/ROCR-Runtime commit: b8dc875b3c]
Include the upgrade operation check in the prerm and postun scripts
in the package.
Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
Change-Id: Ib95ea72f15bfbf4141b69b0a8ca4d3a71fe1c093
[ROCm/ROCR-Runtime commit: 046f2e9116]
Add PCI DID for cyan skillfish.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: I1d06936cccdf99af76fe5ca3ff323538fac76c9c
[ROCm/ROCR-Runtime commit: 052b7957ea]
The gfx version of gfx90c is 90C instead of 902.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Change-Id: Id009c9357f816b8ccab605090df47626f1a579ef
[ROCm/ROCR-Runtime commit: 7cdf38f6c0]
Increase the timeout in proportion to the number of peers so that the
test passes on some PCIe link platforms.
Change-Id: Ifcb8c7297d6960c96fc18d29bc0a48733ca50165
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
[ROCm/ROCR-Runtime commit: 7c62a12918]
Mapped memory areas become invalid after fork, and the child process is
required to remap the memory areas after a fork. So we mark these device
memory mappings with MADV_DONTFORK so that they are removed from the
child process after fork.
This was causing some issues when doing CRIU checkpoint/restore because
CRIU and amdgpu_plugin were not able to handle these mappings.
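A minimal sketch of the marking step, using an anonymous mapping as a
stand-in for a device memory mapping (an assumption for this example).
MapDontFork is a hypothetical helper; mmap(), madvise() and MADV_DONTFORK
are the standard Linux interfaces.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Map a region and mark it MADV_DONTFORK so that a child created by
// fork() does not inherit the mapping. Returns nullptr on failure.
void* MapDontFork(size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return nullptr;
    if (madvise(p, size, MADV_DONTFORK) != 0) {
        munmap(p, size);  // do not leak the mapping if marking fails
        return nullptr;
    }
    return p;
}
```

In the parent the mapping behaves normally; after fork() the range is
simply absent from the child's address space.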
Change-Id: I50eb334aecea6dab7522d94da0273adcf4fb1ce0
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
[ROCm/ROCR-Runtime commit: 4986f4a5c2]
Total VRAM size on an APU is usually 512M, and the framebuffer is also
allocated from VRAM, so there is not enough memory for this case.
/home/ruiliji2/p5/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:1285: Failure
Value of: (hsaKmtMapMemoryToGPUNodes(bufs[i], bufSize, &altVa, mapFlags, 1, &defaultGPUNode))
[ FAILED ] KFDMemoryTest.MMBench (1034 ms)
Change-Id: Ib4201291122d85f6512a85859aea9a4713fb4f5c
(cherry picked from commit a9f924484e7022a2d53ee02811b080f0833eba55)
[ROCm/ROCR-Runtime commit: 0340c68031]
Skip the HDP flush test when the remap feature is not supported.
Background:
the HDP register remap is skipped in SR-IOV mode,
which leaves the MMIO base as a null pointer.
Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ib9aea1900931e30571656397a485ee4db051ec0a
[ROCm/ROCR-Runtime commit: 033b52c4e4]
Explicitly free the user buffer pointer before the test's teardown.
Otherwise the svm_bo object is never released, which causes a BUG error
due to a late callback to svm_migrate_page_free when the prange no
longer exists. Also make cosmetic adjustments.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I989c62de8a9634faa84e42def956cecb3f84e329
[ROCm/ROCR-Runtime commit: 2dbee30232]
The AMD compiler team has confirmed that they expect gfx90c
to be gfx90c, with a major/minor/stepping of 9, 0, and 12
respectively. It appears that there is a typo in the libhsakmt
topology information that lists this part as gfx902. This patch
fixes the issue.
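The naming can be checked with a short sketch. GfxName is a hypothetical
helper, and printing the minor and stepping fields as single hexadecimal
digits is an assumption that matches the 9, 0, 12 example here; it is not
taken from the libhsakmt topology code.

```cpp
#include <cstdio>
#include <string>

// Build a target name from major/minor/stepping. The major number is
// printed in decimal, minor and stepping in lowercase hexadecimal.
std::string GfxName(unsigned major, unsigned minor, unsigned stepping) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "gfx%u%x%x", major, minor, stepping);
    return std::string(buf);
}
```

With stepping 12 the last digit is "c", giving gfx90c; a stepping of 2
would give gfx902, which is the value the topology table listed by mistake.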
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Change-Id: I6f907a7aa6f190b12aba8bb4210c7b341b3c720b
[ROCm/ROCR-Runtime commit: a06d1a3884]
This is causing issues with side by side, sorry for the noise.
This license location isn't ideal but it's good enough for now.
Change-Id: Iba2a84cedf22466fdaaf3c63b6ea49c9fc277967
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: 3f90750304]
Calling cmake replaces this file, so no need to commit it.
Change-Id: Ic4747cc9eebd9cbfc61d524a31d2025c04eda12e
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: 3b64517787]
The copyright file will conflict if multiple thunks are installed. This
should resolve the issue by adding the version to the install path.
Change-Id: Ieac5a3eba979b3e934fb9100f890b92fc7c35d71
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: 348a3613d6]