For GC 9.4.0, modifications were made to various shaders since certain
flat_ instructions no longer support glc/slc modifiers (replaced with
nt/sc1/sc0). Instead of repeating conditionals inside various shader
bodies, we can make use of LLVM AMDGCN macros.
This patch modularizes the shader macros into separate defines. Prior
to the core raw-string literal, each shader now starts with the
SHADER_START literal (".text\n") plus any number of SHADER_MACRO_*
literals. This allows us to separate the macro definitions logically and
use the preprocessor to include only the required macro groups on a
per-shader basis.
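As a rough illustration of the layout described above, the sketch below builds one shader string from SHADER_START, one macro group, and a raw-string body using adjacent string literal concatenation. The names SHADER_MACRO_FLAT, FLAT_LOAD_DWORD_NSS and kCopyDwordIsa are hypothetical, and the assembly text is illustrative only; it has not been checked against an assembler.

```cpp
#include <string>

// Every shader begins with this literal.
#define SHADER_START ".text\n"

// Hypothetical macro group: one assembler macro that hides the
// glc/slc vs. nt/sc1/sc0 difference behind a single name.
// The instruction text is an assumption made for illustration.
#define SHADER_MACRO_FLAT                                      \
    ".macro FLAT_LOAD_DWORD_NSS dst, addr\n"                   \
    "  .if (.amdgcn.gfx_generation_number == 9 && "            \
    ".amdgcn.gfx_generation_minor == 4)\n"                     \
    "    flat_load_dword \\dst, \\addr nt sc1 sc0\n"           \
    "  .else\n"                                                \
    "    flat_load_dword \\dst, \\addr glc slc\n"              \
    "  .endif\n"                                               \
    ".endm\n"

// Hypothetical shader: SHADER_START, then only the macro groups this
// shader needs, then the core raw-string literal. The compiler joins
// the adjacent literals into a single string.
const char* const kCopyDwordIsa =
    SHADER_START
    SHADER_MACRO_FLAT
    R"(
        FLAT_LOAD_DWORD_NSS v4, v[0:1]
        s_waitcnt vmcnt(0) & lgkmcnt(0)
        s_endpgm
    )";
```

A shader that needs no macros would list only SHADER_START before its raw-string body, so unused macro groups never reach the assembler.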
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I19eb3fd14252a0601bb7509249051b68e7fdb02a
[ROCm/ROCR-Runtime commit: e2435d9e93]
Previously, KFDEvictTest.QueueTest and KFDSVMEvictTest.QueueTest
would create a variable number of wavefronts, one for each 64MB
of memory under test. This ran into limits on the buffers used
by the wavefronts, and may at some point have exceeded the
wavefront limit.
Restrict the number of wavefronts to 512, and adjust the shader
to accommodate a variable buffer size.
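The sizing change can be sketched on the host side as follows. Only the 64MB granularity and the cap of 512 come from the message above; the names EvictDispatch and PlanEvictDispatch, and the choice to split the memory evenly across wavefronts, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>

constexpr uint64_t kChunkSize = 64ULL << 20;  // 64MB, from the message
constexpr uint32_t kMaxWavefronts = 512;      // cap, from the message

// Hypothetical result type: how many waves to launch and how many
// bytes each wave is responsible for.
struct EvictDispatch {
    uint32_t wavefronts;
    uint64_t bytesPerWave;
};

// Hypothetical helper. Below the cap this matches the old behaviour
// (one wave per 64MB); above it the per-wave buffer grows instead.
inline EvictDispatch PlanEvictDispatch(uint64_t totalBytes) {
    uint64_t chunks = (totalBytes + kChunkSize - 1) / kChunkSize;
    chunks = std::max<uint64_t>(chunks, 1);
    uint32_t waves = static_cast<uint32_t>(
        std::min<uint64_t>(chunks, kMaxWavefronts));
    // Round up so the whole range is covered (assumed even split).
    uint64_t bytesPerWave = (totalBytes + waves - 1) / waves;
    return {waves, bytesPerWave};
}
```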
Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I2ec292e2900e2efa62a08313bca3d2f4bdabca8b
[ROCm/ROCR-Runtime commit: 680c8ca5a9]
A gfx940 code path was erroneously added to this shader.
It's unnecessary; without this path, the shader uses
the scalar store, which works just fine on gfx940 without changes.
Remove it.
Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I825cbbebbdb25c4a7c2f16e228c2bea6a6bcc30c
[ROCm/ROCR-Runtime commit: 2a01e5c33b]
gfx940 changed the semantics of the glc and slc coherency options
on vector stores and loads. This means that shaders that use
those bits no longer compile on gfx940.
Add precompilation if statements to those shaders to use the
new coherency bits.
Also add gfx940 to ASMTest so that compilation is tested.
Note: One of the tests enabled by this patch on gfx940,
KFDEvictTest.QueueTest, does not pass on gfx940 emulators.
Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I942f9d2536e9eb5510c4d5af30df6ff1a95c8cf7
[ROCm/ROCR-Runtime commit: 30da9a3cf9]
The scc modifier is disabled in the gfx90a assembler, so remove the
shader for gfx90a A+A and keep it for newer ASICs with scc
support.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: Iec3c7ccd5156a855adb2b02feb3db0761876aa2f
[ROCm/ROCR-Runtime commit: 8e8aa024fd]
To avoid confusion since this shader has changed to be persistent
(original IterateIsa may be re-used for debugger tests).
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I4643692765fc7665933257e89d5b922e779ad2e5
[ROCm/ROCR-Runtime commit: 6467664ec7]
IterateIsa had some leftover instructions from when the shader was
getting updated for KFDCWSRTest.BasicTest.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I41ae7b7948cbe2aff8bf61b170b9a7d498b836a3
[ROCm/ROCR-Runtime commit: 82a41c7e4d]
This patch restructures the CWSR basic test and allows for
creating parameterized CWSR tests. This patch introduces four
parameterizations. These tests behave as follows:
This test dispatches the IterateIsa shader, which continuously
increments a VGPR for (num_witems / WAVE_SIZE) waves. While this shader
is running, dequeue/requeue requests are sent in a loop to trigger
CWSRs.
This test defines a CWSR threshold. Once the number of CWSRs triggered
reaches the threshold, a known value is written to inputBuf to
signal the shader to exit.
4 parameterized tests are defined:
KFDCWSRTest.BasicTest/0
KFDCWSRTest.BasicTest/1
KFDCWSRTest.BasicTest/2
KFDCWSRTest.BasicTest/3
0: 1 work-item, CWSR threshold of 10
1: 256 work-items, CWSR threshold of 50
2: 512 work-items, CWSR threshold of 100
3: 1024 work-items, CWSR threshold of 1000
Tuple Format: (num_witems, cwsr_thresh)
num_witems: Defines the number of work-items.
cwsr_thresh: Defines the number of CWSRs to trigger.
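The four parameterizations can be written down as (num_witems, cwsr_thresh) tuples, as in the sketch below. The names CwsrParam, kCwsrParams and WaveCount are hypothetical, WAVE_SIZE is assumed to be 64, and rounding the wave count up (so the 1 work-item case still launches one wave) is an assumption rather than something the message states.

```cpp
#include <array>
#include <tuple>

constexpr unsigned WAVE_SIZE = 64;  // assumed wavefront width

// Tuple format: (num_witems, cwsr_thresh)
using CwsrParam = std::tuple<unsigned, unsigned>;

// Index in this array matches the BasicTest/N suffix.
const std::array<CwsrParam, 4> kCwsrParams = {{
    std::make_tuple(1u, 10u),       // BasicTest/0
    std::make_tuple(256u, 50u),     // BasicTest/1
    std::make_tuple(512u, 100u),    // BasicTest/2
    std::make_tuple(1024u, 1000u),  // BasicTest/3
}};

// Number of waves for a given work-item count, rounded up
// (assumption: a partial wave still occupies a whole wave).
inline unsigned WaveCount(unsigned numWitems) {
    return (numWitems + WAVE_SIZE - 1) / WAVE_SIZE;
}
```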
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I639eb7bd75b14ee70e190b4bd19dcf34096fc7bf
[ROCm/ROCR-Runtime commit: 0dbac97b75]
LoopIsa is a shader that performs a variety of intensive
calculations in a loop. It is used by tests such as
KFDQMTest.QueuePriorityOn*
It contained a scalar load, despite not having any buffer to
read from. This load causes page faults on GFX11. It is
unclear why it did not cause page faults on earlier ASICs.
Remove the load.
Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I7426d0db48e933f3bb870467ea88476f7a283040
[ROCm/ROCR-Runtime commit: 39e8a85aac]
Includes a simple AssembleShader test which loops through all shaders
for all supported targets, dispatching a RunAssemble call for each
shader.
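The loop over targets and shaders might look roughly like the sketch below. NamedShader, AssembleFn and AssembleAll are hypothetical names, and the callback only stands in for the RunAssemble call, whose real signature is not given here.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical pairing of a shader's name with its source text.
struct NamedShader {
    std::string name;
    const char* source;
};

// Stand-in for RunAssemble: returns true if `source` assembles for
// `target`. The real call's signature is an unknown here.
using AssembleFn =
    std::function<bool(const std::string& target, const char* source)>;

// Try every shader on every target and collect the failures as
// "shader@target" so a test can report all of them at once.
inline std::vector<std::string> AssembleAll(
    const std::vector<std::string>& targets,
    const std::vector<NamedShader>& shaders,
    const AssembleFn& assemble) {
    std::vector<std::string> failures;
    for (const std::string& target : targets) {
        for (const NamedShader& shader : shaders) {
            if (!assemble(target, shader.source)) {
                failures.push_back(shader.name + "@" + target);
            }
        }
    }
    return failures;
}
```

Collecting failures instead of stopping at the first one means a single run shows every shader and target combination that no longer assembles.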
Also adds extra safety to a couple of shaders that only work on
gfx9/gfx90a.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I3ca1c92136f3871eb62fcb9645694f22287aaeec
[ROCm/ROCR-Runtime commit: 7eeba830f8]
Makes use of macros to simplify shader code with instruction-level
differences depending on GFX version. These macros are extensible and
are prepended to every shader so that they are usable everywhere.
This patch introduces three macros used within IterateIsa and
ReadMemoryIsa shaders.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: If954e1b6d2027e9f55bf7e99bd9df2668d1da524
[ROCm/ROCR-Runtime commit: 5ceb35f428]
Initial commit for ShaderStore.hpp. Will contain const char* definitions
for all shaders used within KFDTest.
The LLVM assembler now takes care of the correct instructions to be used
for various GFX versions using directives embedded into the shader assembly.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I2887a03b33d5c2cc382e4f96c2bc3e067715ab54
[ROCm/ROCR-Runtime commit: 34ca37d9e8]