Instead of hard-coding lib64 and other include locations, just prepend
the DRM_DIR to the beginning of the CMake prefix path. Then let
pkgconfig find the package, the same way that it would if DRM_DIR wasn't
set. DRM_DIR takes precedence, but the default paths will be used if
DRM_DIR isn't set, or doesn't point to where libdrm is housed
Note that /lib and /lib/$ARCH aren't required for DRM_DIR, just the
path to the root folder for the package (e.g. /opt/amdgpu instead of
/opt/amdgpu/lib or /opt/amdgpu/lib64 or /opt/amdgpu/lib/x86_64-linux-gnu
etc)
Change-Id: I56767db28476d14e3fa77be1089c3904e2a32450
See description of previous revert.
This reverts commit 564913526a.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I969dc6469e62b50cd7ba0595918538602afa7516
This patch and the previous made it such that the queue ring buffer was
allocated as non-paged for GFX11+. The queue ring buffer should not be
mapped as non-paged; the non-paged requirement on GFX11 is only needed
for the queue wptr.
This patch was causing issues on various tests, such as intermittent
CP_INTSRC_BAD_OPCODE interrupts.
This reverts commit e40ae8481e.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I55b64aed73dc3b792f0756ae00daf6e10d93ce10
Test is inconsistent across ASICs. Add to blacklist to unblock QA.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I31e5aa2450165227107536bef8402db2c0dc6d7f
Get more debug information about user pointers that were registered
through SVM API, and triggered by memory exception events.
A new kfdtest with this use case was also included inside
KFDExceptionTest.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I0ef4929afe0625b9b5cbbbebef11ede66dda60ab
Register and map userptrs through Shared Virtual Memory(SVM) API at
the Kernel level when available. Using this approach, performance
will be improve as register/unregister memory will not trigger any
system call to KFD driver.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I3726b4b5e1c6a52a83786fbe0af6322eb29ae7c9
Using backward compatibility paths will provide an #error message. Compile time option added to enable/disable the #error message.
Disabling the same will provide a #warning message
Change-Id: Ibb84241ba35aefb7a8450d68231e52242a634ed3
The MemoryAllocAll test in kfdtests exercises the new KFD memory
availability API by trying to allocate a single buffer object that
exactly fills all of vram. Desired object size is determined using the
memory availility KFD ioctl via libhsakmt, then an object is allocated
slightly larger than that size. If the allocation attempt fails then
the test tries to allocate a slightly smaller object, and continues
trying with smaller sizes until the allocation succeeds. The test
succeeds if the successfully allocated object is within some specified
tolerance of the available memory reported.
There are a number of known issues that can cause the successfully
allocated object to be significantly smaller than reported availability.
Until these issues are addressed, we should not fail the test, but just
log the actual divergence between the size of the object we thought we
could allocate, and what was actually possible.
Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Change-Id: I165a30865ffbb2353286dcc896ad8e24af124615
Since KFD counts svm allocation as system memory usage,
KFDSVMEvictTest will fail on the case of small system
memory, adding check is to skip test.
Signed-off-by: Eric Huang <jinhuieric.Huang@amd.com>
Change-Id: I040f16f2dd0d4092d069a632cfba9c28293f781b
Implement hsaKmtExportDMABufHandle, which can be used for a new
upstreamable RDMA solution. It exports a DMABuf handle for an arbitrary
virtual address along with the offset of the address within the
allocation. It also checks that the size of the intended export does
not exceed the allocation.
This uses the new AMDKFD_IOC_EXPORT_DMABUF, which requires KFD ioctl
API version 1.12.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ie5fdb1f73ab3c7fa36c315ce326b1fb89eacc8b6
Remove BLACKLIST_GFX10_NV2X from GFX11 blacklists, update
BLACKLIST_GFX11 as needed.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I84bd91ba20a5d3df27478fb4c97afa12f8a3e76a
The Shader Engines number should be shadder array_count divided by simd_arrays_per_engine
not array_count.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I808d1fedd6b9843500719e902ecf759f5668a7d1
KFDTopologyTest.BasicTest duplicates Thunk logic to calculate VGPR size,
meaning it will always be the same, and SGPR size is a constant. Since
no benefit, remove comparisons.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I99e7ff6fb69ed07bc0716fdf43946b19c67b9268
Fixed VGPR memory size, size was too small for some GPU, causing a memory overflow.
Refactored macro code into a function.
Thanks to Jay Cornwall for locating the problem and proposing the fix.
Change-Id: Iffedea1c4f341967f02c56d810ff048225b02c16
Signed-off-by: David Belanger <david.belanger@amd.com>
This is a temporary work around for GPU hang issues observed on GFX11.
Change-Id: I98fbedbbd1c51fe402c2116b35ca548931a390c9
Signed-off-by: David Belanger <david.belanger@amd.com>
This reverts commit 7787a039bd.
It causes a regression in pytorch benchmark.
Change-Id: I96173dbd061cf38d6f451c02cb181ae51b7f625e
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
This reverts commit 178a619b80.
There are some openMP issues that were introduced after SVM userptr
feature was added.
Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: I7ef87c5232a3bcbe594c743fa4b4958601845ba5
This reverts commit 45fad29752.
There are some openMP issues that were introduced after SVM userptr
feature was added.
Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: I6566c9f0d39d05ecb92f38159880763f432939a5
This reverts commit 8a746bdaed.
There are some openMP issues that were introduced after SVM userptr
feature was added.
Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: Ib01046571d2c84fa0fd228ecba0dee0eae3f994d
Track Test Status in syslog, it will help understand
sys log assoicated with test cases.
Change-Id: I7c0749102db9bc73d6ae3a237ec347a8fefb12e9
Signed-off-by: James Zhu <James.Zhu@amd.com>
This is handled by __fmm_release calling aperture_release_area.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ib8ed300e1734f03aeb9dfc8074897ece310b8af9
Use a common helper for CPU mappings to reduce duplicate code.
Consistently use MAP_SHARED for all render_fd mappings.
Remove double-mapping for AQL queue buffers on the CPU. This workaround
is only needed on the GPU.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Iff86c8cc9f1e5c982614b3f11129bc2cf8cbba02
The NULL pointer check was the only way for that function to fail. And it
was done after the pointer was accessed. Simplify this by just returning
the result as a return value instead of using a pointer as output
parameter. This way the function can never fail and the caller doesn't
need to do any error handling.
Declare the function in libhsakmt.h instead of duplicating the
declaration in fmm.c.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I91b90d66166fd3b5cdc47c73a9bbc369c45b51fe
Setting this variable to '0' will force to disable memory
registration/allocation through SVM API mechanism.
Not setting this or setting to '1', SVM API will be used only if all
GPUs support it.
Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: Icdf7656de09aa9988b567ec6c024953398e9bb48
Detect under-reporting of available memory by initially attempting to
allocate substantially more than reported available memory, and ensure
that the allocation fails. Continue shrinking the attempted allocation
until it succeeds, then fail the test if the successful allocation is
either too much more than or too much less than reported available.
Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Change-Id: Ib418f0aa26e8db80590a6c5f2578da56a4b60f2b
When is hsaKmtCreateQueue called first time for node
doorbells[NodeId].size is initialized to zero in init_process_doorbells
but used to calculate the doorbell offset. It works just by accident
because doorbells[NodeId].size is uint32_t so -1 will be 0xFFFFFFFF which
is zero extended into 0x00000000FFFFFFFF and it will work as long as mmap
offset bits are not within lower 32 bits.
Bug: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/issues/78
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia791adfc51363d4704cb50fa4f01137b7dd48a75
Modifier scc is disabled from gfx90a's asm, so remove the
shader for gfx90a A+A and keep it for newer asics with scc
support.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: Iec3c7ccd5156a855adb2b02feb3db0761876aa2f
File reorganization feature was implemented with backward compatibility
The backward compatibility support will be deprecated in future release.
Changed the #pragma message to #warning for a smooth transition
Change-Id: I21025f4cefb40721f095130263b4247877979d36
When hsa is closed, it would close open fds for /dev/kfd but
not for /dev/dri/renderD*. This caused issues with CRIU
checkpoint, which expects that /dev/kfd will be open if
/dev/dri/renderD* is.
As a workaround for the CRIU behaviour, leave /dev/kfd open
when closing hsa.
Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: Ie1b2d5b1d8986750b0e560ae2934b7c73cff942e
To avoid confusion since this shader has changed to be persistent
(original IterateIsa may be re-used for debugger tests).
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I4643692765fc7665933257e89d5b922e779ad2e5
Don't use the full path to link against libc, but rather let
cmake find it.
Regarding gcc_s, it doesn't seem like this is still needed, so I've
removed it instead.
Change-Id: I1dc594f10c647b2abfdab7c5e0de90c331c6eeaf
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
To be more alligned with ROCr, libdrm dev package appears to be
required, but we don't care if it's ours or the distro's. So require
either but recommend our package to get the latest version.
Change-Id: I744ce4861644a83ba94c39e0bf4230eab58cc68a
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Get more debug information about user pointers that were registered
through SVM API, and triggered by memory exception events.
A new kfdtest with this use case was also included inside
KFDExceptionTest.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I8e9df3c1c6c3f42d7b9235d12406d80d31746443
Register and map userptrs through Shared Virtual Memory(SVM) API at
the Kernel level when available. Using this approach, performance
will be improve as register/unregister memory will not trigger any
system call to KFD driver.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I20723cbeb340bf48b95e1115f0102c031397bc14