Commit Graph

2959 commits

Author SHA1 Message Date
Shane Xiao 5d6f900353 kfdtest: DeviceHdpFlush needs to set target ASIC for different Gfx versions
If Dev0 and Dev1 are not the same gfx version, we should temporarily
set the target ASIC for compiling shader code.

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Signed-off-by: Shikai Guo <shikai.guo@amd.com>
Change-Id: I5836beb16ade519f5a148d3d2b9c2875554f0c35
2023-05-09 09:50:07 -04:00
Sam Wu 57b3fcde51 add sphinx configurations
Change-Id: I1a66a02b18fb699415a87a6473eb72c097a13b5f
2023-05-08 15:58:01 -06:00
Graham Sider 54136f60a0 kfdtest: Add Assembler::RunAssembleBuf overload
Overload Assembler::RunAssembleBuf to take in an extra Gfxv parameter.
Using this overload will temporarily set the target ASIC to Gfxv before
calling RunAssemble, and copy back the original MCPU literal upon
completion. The copy to reset the original MCPU in this case is safe as
the MCPU length is always known.

This will be useful in multi-device test cases whereby the devices are
not necessarily the same gfx version. The overload is explicitly for the
RunAssembleBuf wrapper rather than RunAssemble to ensure the default
MCPU is always reset independent of errors in RunAssemble.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I7fe5a962876314b6df32e4b7160174949d98f9e3
2023-05-08 11:35:32 -04:00
Graham Sider 7a4c9273d7 rocrtst: Move kernel object loading outside of loops
Negative queue validation tests were doing many redundant from-file
kernel object loads in a loop. This was creating many simultaneous open
file handles within many dynamically allocated CodeObject objects. While
the CodeObject class implements RAII on the file handles to clean up on
destruction, clear_code_object() only gets called on the destruction of
the TestBase-derived test objects (these being a suite abstraction).

Due to this we were hitting file open() EMFILE errors (too many open
files) in gfx94x CPX mode. Move LoadKernelFromObjFile outside of the
test loops and clear_code_object() for each test on each agent.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I6f9d23fd122720c49a58c22698f097906d2fc97c
2023-04-27 16:16:12 -04:00
David Yat Sin a180c9ee78 Add env var to override SRAM ECC
Add HSA_ENABLE_SRAMECC environment variable that can be used to
override SRAM ECC mode reported by KFD

Change-Id: I2b95511820a2d3d146a76b03070659c0695b61fd
2023-04-27 16:16:05 -04:00
David Yat Sin f024d21e3d Add query for number of XCCs per agent
Change-Id: I4b694b4904ba0326c998356388a62c19a972a7ff
2023-04-27 16:15:59 -04:00
Mike Li 46b667e530 Return failure with any IMAGE attribute for gfx940
The gfx940 does not support IMAGE instructions. Any get_info with
IMAGE attributes should return failure.

Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: I12005628f92780f551ab6f8b41526c66b54c6a59
2023-04-27 16:15:51 -04:00
Mike Li d7fa654338 Do not use the function part of the location_id
The function IDs used to be 0 on previous ASICs, but on gfx94x and newer
ASICs these bits are set. These bits are used by user applications to
uniquely identify the locations of GPU nodes. These extra bits break
hwloc and are not needed for rocrtst.

Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: I1202f504645b0662d009b9c0926eebb7ddc08d73
2023-04-27 16:15:43 -04:00
Mike Li 9554e95de0 Scratch memory changes to support multi-xcc
Change-Id: I115ba4cfe250c59cb7421217cfe0fad6302f25b3
2023-04-27 16:15:30 -04:00
Laurent Morichetti f31b312611 Update the trap handler for gfx940
gfx940 uses ttmp11 to hold the queue packet index so the first level
trap handler uses ttmp13 instead to save ib_sts.

Repurpose ttmp11[31] to mean that the ttmps are initialized. The issue
was that the debugger could not tell whether ttmp6 was written by the
trap handler when determining the stop reason.

If ttmp11[31]=0, then the trap handler has not been executed and ttmp6
should be assumed to be 0.  If ttmp11[31]=1, then ttmp6 holds the
trap_id, if an s_trap instruction caused the exception.

Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com>
Signed-off-by: Lancelot Six <lancelot.six@amd.com>

Change-Id: I9af903abae044b9ec530306229caf3b883f3ee46
2023-04-27 16:15:14 -04:00
Mike Li de4d1ce424 Add gfx940 to AmdHsaCode
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: Ib4f7c801c3d3bac9a04c880c5bf86b72bfa3404f
2023-04-27 16:09:26 -04:00
Mike Li bd98a1e5bf Added gfx940 ISA
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: Icb1830fe186abc69fe7ee709b7f12b882cab9e87
2023-04-27 16:08:58 -04:00
Graham Sider 92f3d4a458 kfdtest: Fix new shader directives
LLVM MC does not seem to accept multi-line conditionals. This may be
fixable in the future with macros. The Aqua Vanjaram shader spec states
that while buffer_invl2 has been replaced by buffer_inv, the former may
still be used for compatibility. However, this does not seem to be
implemented. For now, fix conditional.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I7f8b64c96055371d7e0090b758d2cfd2a37ecd3c
2023-04-27 10:48:44 -04:00
xinhui pan 77761836ae thunk: Fix and optimise for pointer range search
The previous code might fail to get the correct ln node and trigger an
extra walk through the tree. Fix it.

While walking through the tree, it is better to search from right to left,
as node->start is likely close to *address*.

Change-Id: If86ddf73e59a1eb88225d1ea90797818e8165488
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2023-04-20 19:36:29 -04:00
David Francis eed5518e4c kfdtest: Enable gfx90a coherency tests on Aqua Vanjaram
These tests should also pass on Aqua Vanjaram, so enable them

Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ibbb9cd43d653c63b08c39efd1d7326cfac1f8411
2023-04-19 10:28:05 -04:00
David Francis 30b1f23f7a kfdtest: Add coherency tests for Aqua Vanjaram
Aqua Vanjaram is intended to have fine-grained coherency
from anywhere to anywhere else using read-acquire and
write-release primitives.

Add a test that writes to memory covered by five
different cache lines, then write-releases, while
another thread read-acquires, then reads those
five locations in memory.

There are nine variations of the test to cover
CPU-GPU, same-GPU and across-GPU, vector instructions and
scalar instructions, and data local to the
acquirer or receiver.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I20d2db5c53bd280e971479aad7e61df6ed5d3623
2023-04-19 10:28:05 -04:00
Philip Yang 0696f06c16 kfdtest: fix KFDSVMRangeTest.MultiGPU tests vector iterator
In a vector iterator loop, access the current node directly; there is no
need for gpuNodesAll.at(i), which also causes out-of-range access.

Change vector index loop to iterator loop to simplify the code.

Change-Id: I2627ef8d13b5d2c9cd8c51cf4dacc3e8a97fcfb0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-04-18 17:58:06 -04:00
Alex Sierra e82025bffa use mkstemp instead of tempnam for temp file
tempnam has been marked as obsolete.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Ie64d9a351bf386da00a96ceff059f685e11f2cca
2023-04-17 15:38:59 -04:00
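The switch from tempnam to mkstemp described in e82025bffa can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `OpenTempFile` and the path prefix are hypothetical names, and only the POSIX `mkstemp` call itself is taken as given. Unlike tempnam, mkstemp creates and opens the file in one step, so there is no window between choosing a name and opening it.

```cpp
#include <cassert>
#include <stdlib.h>   // mkstemp (POSIX)
#include <string>
#include <unistd.h>   // close, unlink

// Hypothetical helper: creates and opens a unique temporary file.
// Returns the open file descriptor (or -1 on failure) and stores the
// generated path in *out_path.
int OpenTempFile(const std::string& prefix, std::string* out_path) {
  assert(out_path != nullptr);
  std::string tmpl = prefix + "XXXXXX";  // mkstemp requires six trailing X's
  int fd = mkstemp(&tmpl[0]);            // replaces the X's in place
  if (fd >= 0) *out_path = tmpl;
  return fd;
}
```

The caller owns the descriptor and is expected to `close()` it and `unlink()` the path when done.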
Philip Yang 21abaef3f8 kfdtest: AppAPU Skip KFDEvictTest, KFDSVMEvictTest, HMMProfilingEvent
AppAPU VRAM is part of system memory managed by the Linux kernel, so no
VRAM eviction and restore is needed between VRAM and system memory.
Those evict tests currently fail on AppAPU, so skip them on AppAPU.

There is no page migration between VRAM and system memory on AppAPU, and
HMMProfilingEvent depends on migration events, so skip it on AppAPU.

Change-Id: I4c809b97c947e809d136c1f88db2278cf74f5b47
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-04-17 13:28:23 -04:00
Philip Yang e2df2c21af kfdtest: Add helper to check if IsAppAPU system
If there is a connection between GPU and CPU with weight 13,
KFD_CRAT_INTRA_SOCKET_WEIGHT, then this is AppAPU.

This will be used to skip tests not suitable for AppAPU.

Change-Id: If6fad81528b52afd4ac4cefa508d787b0f6637ca
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-04-17 13:28:23 -04:00
Lancelot SIX 183f5d90aa linux os_thread: improve error handling
On Linux, the os_thread abstraction is built on top of pthread.  Many of
the pthread calls might fail and return error codes.  The error
conditions are only checked via assertions (if ever checked) which means
that when doing a release build, no error condition is checked.  The
same goes for dlsym/dlinfo and clock_gettime.

This commit improves the situation by checking the error conditions
and acting accordingly.  When the error condition is detected in a
function with a means to indicate some error to its caller, then this
patch prints some error message and returns.  If there is no way to
propagate the error up the call stack, print some error message and
abort the process.

For the os_info::os_info ctor, the only user is CreateThread, which
checks that the built thread is Valid().  If not, nullptr is returned to
the caller.

It could be possible to use exceptions when functions cannot pass
errors, but for now I only use abort as it is what assert would do with
a debug build.

Change-Id: I815703c3b95777cc29bb89a7d654ac879c14a759
2023-04-17 09:48:11 -04:00
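The two error-handling policies described in 183f5d90aa (report and return when the caller can be told, print and abort when it cannot) might look roughly like the sketch below. `InitMutex` and `LockOrDie` are hypothetical wrappers written for illustration and are not taken from os_thread; only the pthread calls are real.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <pthread.h>

// The caller can be told about the failure: print a message and return false.
// The check is a plain `if`, so it survives release (NDEBUG) builds.
bool InitMutex(pthread_mutex_t* mutex) {
  assert(mutex != nullptr);  // programmer error only, not a runtime condition
  int err = pthread_mutex_init(mutex, nullptr);
  if (err != 0) {
    std::fprintf(stderr, "pthread_mutex_init failed: %s\n", std::strerror(err));
    return false;
  }
  return true;
}

// A void function has no way to propagate the error: print and abort.
void LockOrDie(pthread_mutex_t* mutex) {
  int err = pthread_mutex_lock(mutex);
  if (err != 0) {
    std::fprintf(stderr, "pthread_mutex_lock failed: %s\n", std::strerror(err));
    std::abort();
  }
}
```

pthread functions return the error number directly rather than setting `errno`, which is why the return value is passed to `strerror`.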
Lancelot SIX 72219b8237 Runtime::GetSystemInfo: Suppress parentheses warning
When building with g++-11.3.0, I have the following warning:

    /home/.../core/runtime/runtime.cpp: In member function ‘hsa_status_t rocr::core::Runtime::GetSystemInfo(hsa_system_info_t, void*)’:
    /home/.../core/runtime/runtime.cpp:693:56: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
      693 |           kfd_version.KernelInterfaceMajorVersion == 1 &&
          |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
      694 |               kfd_version.KernelInterfaceMinorVersion >= 12)
          |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This patch adds the parentheses as suggested.  This silences the
compiler warning.

No functional change expected.

Change-Id: I69c1a73a432b0f2393dbaf36d4424cf0056c535f
2023-04-17 09:43:02 -04:00
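The precedence point behind 72219b8237 can be shown in a small sketch. The left-hand side of the `||` is not visible in the warning output, so the `major > 1` term below is an assumption, and `KfdVersion` and `AtLeast1_12` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the version struct named in the warning.
struct KfdVersion {
  unsigned KernelInterfaceMajorVersion;
  unsigned KernelInterfaceMinorVersion;
};

// `a || b && c` already parses as `a || (b && c)` because && binds tighter
// than ||. The explicit parentheses do not change the result; they document
// the intent and silence -Wparentheses.
bool AtLeast1_12(const KfdVersion& v) {
  return v.KernelInterfaceMajorVersion > 1 ||   // assumed left-hand term
         (v.KernelInterfaceMajorVersion == 1 &&
          v.KernelInterfaceMinorVersion >= 12);
}

// Usage: 1.12 and 2.0 qualify, 1.11 does not.
void CheckExamples() {
  assert(AtLeast1_12({1, 12}));
  assert(AtLeast1_12({2, 0}));
  assert(!AtLeast1_12({1, 11}));
}
```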
Jesse Zhang 4d54d6e706 libhsakmt: Add compute core check for APU
We should check compute core instead of cpu core,
in order to exclude the case of APU.

Signed-off-by: Jesse zhang <jesse.zhang@amd.com>
Change-Id: I2ec2a6807f51f49f80e0e500f5d9af81c2efae37
2023-04-17 09:34:37 +08:00
Graham Sider 11a04fe1f5 kfdtest: Fix PersistentIterateShader for gfx target 9.4.x
Replace 'flat_load_dword <...> glc' with appropriate macro.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I9fdc7c916c685304457cd9698e741577f6c10c82
2023-04-14 12:05:08 -04:00
Graham Sider e2435d9e93 kfdtest: Add flat compatibility macros for gfx target 9.4.x
For GC 9.4.0, modifications were made to various shaders since certain
flat_ instructions no longer support glc/slc modifiers (replaced with
nt/sc1/sc0). Instead of repeating conditionals inside various shader
bodies, we can make use of LLVM AMDGCN macros.

This patch modularizes the shader macros into separate defines. Prior
to the core raw-string literal, each shader now starts with the
SHADER_START literal (".text\n") plus any number of SHADER_MACRO_*
literals. This allows us to separate the macro definitions logically and
use the pre-processor to only include the required macro groups on a
per-shader basis.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I19eb3fd14252a0601bb7509249051b68e7fdb02a
2023-04-14 12:05:08 -04:00
David Francis 680c8ca5a9 kfdtest: Make queue evict tests use constant number of wavefronts.
Previously, KFDEvictTest.QueueTest and KFDSVMEvictTest.QueueTest
would create a variable number of wavefronts, one for each 64MB
of memory under test. This ran into limits on the buffers used
by the wavefronts, and may at some point have exceeded the
wavefront limit.

Restrict the number of wavefronts to 512, and adjust the shader
to accommodate a variable buffer size.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I2ec292e2900e2efa62a08313bca3d2f4bdabca8b
2023-04-14 12:05:08 -04:00
Graham Sider 831d1ad352 libhsakmt: Mask stepping version for GC 9.4.3 checks
GC 9.4.3 sets the gfx target version to 9.4.x depending on revision and
capabilities. Due to this, where applicable, mask off the gfx target
stepping version and only check the major/minor version (9.4). There are no
collisions due to this change since GC 9.4.3 is the only ASIC that uses
gfx target version 9.4.x.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I72803e594c421f054d18ccfa7e92c507128fa5be
2023-04-14 12:03:23 -04:00
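The stepping mask described in 831d1ad352 could be sketched as below. The packing used here (one byte each for major, minor and stepping) is an assumption made for illustration and may not match the encoding libhsakmt actually uses; all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed packing for this sketch: 0x00MMmmss (major, minor, stepping).
inline std::uint32_t PackGfxVersion(std::uint32_t major, std::uint32_t minor,
                                    std::uint32_t stepping) {
  assert(major <= 0xFF && minor <= 0xFF && stepping <= 0xFF);
  return (major << 16) | (minor << 8) | stepping;
}

const std::uint32_t kSteppingMask = 0xFFu;

// Clears the stepping field so that 9.4.0, 9.4.1, 9.4.2, ... compare equal.
inline std::uint32_t MaskStepping(std::uint32_t gfxv) {
  return gfxv & ~kSteppingMask;
}

// True for any 9.4.x target, regardless of stepping.
inline bool IsGfx94x(std::uint32_t gfxv) {
  return MaskStepping(gfxv) == PackGfxVersion(9, 4, 0);
}
```

Comparing on major/minor only is safe under the commit's stated premise that no other ASIC uses a 9.4.x target version.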
Philip Yang 598e3e8d86 kfdtest: KFDMemoryTest.DeviceHdpFlush requires large bar
KFDMemoryTest.DeviceHdpFlush requires device node 0 to be large BAR in
order to check VRAM content from the CPU, so run the test only if device 0
is a large-BAR GPU.

Change-Id: I874b153219550c50b724625e971e3ed3a84dc652
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2023-04-14 10:03:38 -04:00
David Francis e32278a612 kfdtest: Restrict DriverHDPFlush to systems with PCIe
Nodes with XGMI have no HDP, so DriverHDPFlush should skip.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: If5a87e660712e51d03e750d8e044786036b2e603
2023-04-14 10:03:38 -04:00
David Francis 16c6530330 kfdtest: Deprecate PollNCMemoryIsa
Even with the restriction to only compile on gfx90a, this
shader still fails CompileShaders test.

There don't seem to be any systems that actually use it.

Leave it in the shader store, but remove it otherwise

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I41bec6ba10363d42b163ac101c3a92edaad6d6df
2023-04-14 10:03:38 -04:00
David Francis 2a01e5c33b kfdtest: Use scalar path for PollMemoryIsa Shader on gfx940
A gfx940 code path was erroneously added to this shader.

It's unnecessary; without this path, the shader uses
the scalar store, which works just fine on gfx940 without changes.

Remove it.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I825cbbebbdb25c4a7c2f16e228c2bea6a6bcc30c
2023-04-14 10:03:38 -04:00
Ori Messinger c234f84245 kfdtest: Update blacklist for Aqua Vanjaram
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I8f822bb71e8e5dbee6bdb62f77cbe5ea83faabb5
2023-04-14 10:03:38 -04:00
David Francis 30da9a3cf9 kfdtest: Update shaders to compile on gfx940
gfx940 changed the semantics of the glc and slc coherency options
on vector stores and loads. This means that shaders that use
those bits no longer compile on gfx940.

Add precompilation if statements to those shaders to use the
new coherency bits.

Also add gfx940 to ASMTest so that compilation is tested.

Note: One of the tests enabled by this patch on gfx940,
KFDEvictTest.QueueTest, does not pass on gfx940 emulators.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I942f9d2536e9eb5510c4d5af30df6ff1a95c8cf7
2023-04-14 10:03:38 -04:00
Graham Sider ae659e5427 libhsakmt: Fix queue destroy SVM path free size
Use q->total_mem_alloc_size for munmap in SVM codepath of free_queue.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I2fecaa1ddb337b1fe71f9cbba45a0c9467eff0c0
2023-04-14 10:03:38 -04:00
Mukul Joshi a713fb766e libhsakmt: Fix memory leak on queue destroy for GFX9.4.3
Currently, on queue destroy, context save restore memory is freed
only for a single XCC. Instead, we need to free the entire context
save restore memory, which was allocated for all XCCs.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I51ebb12fa8d5ebed41979d68e74f7c5392dca062
2023-04-14 10:03:38 -04:00
David Belanger 252a2cf959 libhsakmt: EOP Removal
Do not allocate the EOP buffer when not required.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I1664a3f0a882219a72278174006cdb8d46fd4f5e
2023-04-14 10:03:38 -04:00
Mukul Joshi 8994c3ba0e kfdtest: Program COMPUTE_PGM_RSRC3 for GFX 9.4.3
Program ACCUM_OFFSET to match the number of VGPRS used
by the shader as part of Dispatch setup.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: Icfa1fbe4de2a62f00743de567f3ed382d3378b17
2023-04-14 10:03:38 -04:00
David Yat Sin f43a284b8e Change error reported when receiving code 128
We used to report HSA_STATUS_ERROR_INVALID_ISA when receiving error code
128, but there are several other reasons why we could be exceeding
number of VGPRs, so updating the error code.

Change-Id: I6a6980d5b07b09c93d00dee5207a0d52399bc77e
2023-04-14 09:12:07 -04:00
Graham Sider fd48f14ceb libhsakmt: Update FD creation logic
In multi-partition modes, e.g. CPX, we want to create a new file
descriptor despite using the same render node. Update
open_drm_render_device to use a gpu_id to fd map partitioned by render
node. Different gpu_id's requesting the same render node will be added
to that render node's map list for fetching its fd. Different gpu_id's
requesting different render nodes as well as the same gpu_id's
requesting the same render node will behave as they did previously.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ie153d42355d4d75b1c6ba6ff40fac3295bc87009
2023-04-13 15:25:09 -04:00
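The lookup rules described in fd48f14ceb (a gpu_id to fd map kept per render node) can be approximated with the sketch below. `RenderFdCache` is a hypothetical class, not the structure used by open_drm_render_device, and the opener callback stands in for opening the render node so the sketch can run without a GPU.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical cache: render node minor -> (gpu_id -> fd).
class RenderFdCache {
 public:
  // `opener` stands in for opening the render node device file.
  explicit RenderFdCache(std::function<int(int)> opener)
      : opener_(std::move(opener)) {
    assert(opener_);
  }

  int GetFd(int render_minor, std::uint32_t gpu_id) {
    std::map<std::uint32_t, int>& per_node = fds_[render_minor];
    auto it = per_node.find(gpu_id);
    if (it != per_node.end()) return it->second;  // same gpu_id: reuse its fd
    int fd = opener_(render_minor);  // new gpu_id on this node: open a new fd
    if (fd >= 0) per_node[gpu_id] = fd;
    return fd;
  }

 private:
  std::function<int(int)> opener_;
  std::map<int, std::map<std::uint32_t, int>> fds_;
};
```

With this layout, two partitions that share a render node each get their own descriptor, while repeated requests from the same gpu_id keep returning the one already opened.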
Mukul Joshi 97a669a979 libhsakmt: Update context save handling for multi XCC
Allocate debug area big enough for all XCCs in the partition. Also, fix
the cu_num calculations as driver now reports cu_num as the total number
of CUs in the partition.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I6e80d57196b770bb3c2506bc58cb366c0046084b
2023-04-13 15:25:09 -04:00
Graham Sider 6be4461a0d libhsakmt: Add Aqua Vanjaram support
Add gfx version for VGPR size per CU calc, add FAMILY_AV to KfdFamilyId,
add blacklist filter to kfdtest.exclude.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I9b8072e45f4d497e0a8fd3f8f97f1425238e8b42
2023-04-13 15:25:09 -04:00
David Yat Sin 511855d344 Fix assertion when _GLIBCXX_ASSERTIONS is enabled
On some platforms, e.g. Arch Linux, the -D_GLIBCXX_ASSERTIONS compile flag
is enabled by default, causing a runtime assertion.
Avoid the assertion by using the std::vector accessor function data().

Change-Id: I118cdf102c3e353f32c618823e363ee1059f3453
2023-04-11 11:40:10 +00:00
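A likely shape of the problem fixed in 511855d344 is taking `&v[0]` on a vector that may be empty. The sketch below assumes that pattern; `CopyOut` is a hypothetical helper, not code from the commit. With _GLIBCXX_ASSERTIONS defined, libstdc++ bounds-checks `operator[]`, so `&v[0]` on an empty vector aborts, whereas `data()` is valid to call on any vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies up to `cap` elements of `src` into `dst` and returns the count.
// src.data() never indexes an element, so it is safe when src is empty;
// &src[0] would trip the libstdc++ bounds assertion in that case.
std::size_t CopyOut(const std::vector<int>& src, int* dst, std::size_t cap) {
  assert(dst != nullptr || cap == 0);
  std::size_t n = src.size() < cap ? src.size() : cap;
  if (n != 0) std::memcpy(dst, src.data(), n * sizeof(int));
  return n;
}
```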
David Yat Sin c5bf7eb112 Fix for overwriting pointer info size
Fix for overwriting pointer info size provided by caller of
hsa_amd_pointer_info.

Change-Id: I2e5d73ab9ba1a32bc9b4d112bc29b4a99fd8b3b5
2023-04-06 16:35:37 -04:00
David Yat Sin 8ebf5f9c48 Adding scratch memory reservation
Some applications will keep trying to allocate device memory until the
allocation fails. This causes all device memory to be used up and we are
then unable to allocate scratch memory for dispatches. Reserve enough
memory for 1 small scratch allocation.

Change-Id: I968400d41540ba1aca8f28581f229693eec02225
2023-04-06 15:13:36 +00:00
Kent Russell d0c2770cde CMakeLists: Use pkgconfig more effectively with DRM_DIR
Instead of hard-coding lib64 and other include locations, just prepend
the DRM_DIR to the beginning of the CMake prefix path. Then let
pkgconfig find the package, the same way that it would if DRM_DIR wasn't
set. DRM_DIR takes precedence, but the default paths will be used if
DRM_DIR isn't set, or doesn't point to where libdrm is housed

Note that /lib and /lib/$ARCH aren't required for DRM_DIR, just the
path to the root folder for the package (e.g. /opt/amdgpu instead of
/opt/amdgpu/lib or /opt/amdgpu/lib64 or /opt/amdgpu/lib/x86_64-linux-gnu
etc)

Change-Id: I56767db28476d14e3fa77be1089c3904e2a32450
2023-04-06 10:39:40 -04:00
Kent Russell aab0e36538 README: Update README to point to current documentation
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I3fed80e94edf5ff08a70b2e43450fe8168c5d355
2023-04-05 10:35:49 -04:00
Graham Sider 287cb29340 Revert "kfdtest: add MES judging API in test utility."
See description of previous revert.

This reverts commit 564913526a.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I969dc6469e62b50cd7ba0595918538602afa7516
2023-03-27 17:08:03 -04:00
Graham Sider 0750856d4a Revert "kfdtest: Using non-paged memory allocation only on devices that have MES scheduler"
This patch and the previous made it such that the queue ring buffer was
allocated as non-paged for GFX11+. The queue ring buffer should not be
mapped as non-paged; the non-paged requirement on GFX11 is only needed
for the queue wptr.

This patch was causing issues on various tests, such as intermittent
CP_INTSRC_BAD_OPCODE interrupts.

This reverts commit e40ae8481e.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I55b64aed73dc3b792f0756ae00daf6e10d93ce10
2023-03-27 17:07:59 -04:00
Graham Sider 5d80a4d214 kfdtest: Add KFDQMTest.BasicCuMaskingEven to GFX11 blacklist
Test is inconsistent across ASICs. Add to blacklist to unblock QA.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I31e5aa2450165227107536bef8402db2c0dc6d7f
2023-03-23 11:14:58 -04:00
Alex Sierra 2a1d6ee8b5 libhsakmt: query svm info from userptrs at fault events
Get more debug information about user pointers that were registered
through SVM API, and triggered by memory exception events.
A new kfdtest with this use case was also included inside
KFDExceptionTest.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I0ef4929afe0625b9b5cbbbebef11ede66dda60ab
2023-03-22 13:34:02 -05:00