Commit Graph

2925 Commits

Author SHA1 Message Date
Ori Messinger 378f9999ed kfdtest: Update blacklist for Aqua Vanjaram
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I8f822bb71e8e5dbee6bdb62f77cbe5ea83faabb5


[ROCm/ROCR-Runtime commit: c234f84245]
2023-04-14 10:03:38 -04:00
David Francis 78f489fb95 kfdtest: Update shaders to compile on gfx940
gfx940 changed the semantics of the glc and slc coherency options
on vector stores and loads. This means that shaders that use
those bits no longer compile on gfx940.

Add precompilation if statements to those shaders to use the
new coherency bits.

Also add gfx940 to ASMTest so that compilation is tested.

Note: One of the tests enabled by this patch on gfx940,
KFDEvictTest.QueueTest, does not pass on gfx940 emulators.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I942f9d2536e9eb5510c4d5af30df6ff1a95c8cf7


[ROCm/ROCR-Runtime commit: 30da9a3cf9]
2023-04-14 10:03:38 -04:00
Graham Sider 543fe60c96 libhsakmt: Fix queue destroy SVM path free size
Use q->total_mem_alloc_size for munmap in SVM codepath of free_queue.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I2fecaa1ddb337b1fe71f9cbba45a0c9467eff0c0


[ROCm/ROCR-Runtime commit: ae659e5427]
2023-04-14 10:03:38 -04:00
Mukul Joshi dc0d800908 libhsakmt: Fix memory leak on queue destroy for GFX9.4.3
Currently, on queue destroy, context save restore memory is freed
only for a single XCC. Instead, we need to free the entire context
save restore memory, which was allocated for all XCCs.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I51ebb12fa8d5ebed41979d68e74f7c5392dca062


[ROCm/ROCR-Runtime commit: a713fb766e]
2023-04-14 10:03:38 -04:00
David Belanger 7f91f54b27 libhsakmt: EOP Removal
Do not allocate the EOP buffer when not required.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I1664a3f0a882219a72278174006cdb8d46fd4f5e


[ROCm/ROCR-Runtime commit: 252a2cf959]
2023-04-14 10:03:38 -04:00
Mukul Joshi c5b17ec68f kfdtest: Program COMPUTE_PGM_RSRC3 for GFX 9.4.3
Program ACCUM_OFFSET to match the number of VGPRS used
by the shader as part of Dispatch setup.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: Icfa1fbe4de2a62f00743de567f3ed382d3378b17


[ROCm/ROCR-Runtime commit: 8994c3ba0e]
2023-04-14 10:03:38 -04:00
David Yat Sin 6812573d06 Change error reported when receiving code 128
We used to report HSA_STATUS_ERROR_INVALID_ISA when receiving error code
128, but there are several other reasons why we could be exceeding
number of VGPRs, so updating the error code.

Change-Id: I6a6980d5b07b09c93d00dee5207a0d52399bc77e


[ROCm/ROCR-Runtime commit: f43a284b8e]
2023-04-14 09:12:07 -04:00
Graham Sider 89ce41694f libhsakmt: Update FD creation logic
In multi-partition modes, e.g. CPX, we want to create a new file
descriptor despite using the same render node. Update
open_drm_render_device to use a gpu_id to fd map partitioned by render
node. Different gpu_id's requesting the same render node will be added
to that render node's map list for fetching its fd. Different gpu_id's
requesting different render nodes as well as the same gpu_id's
requesting the same render node will behave as they did previously.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ie153d42355d4d75b1c6ba6ff40fac3295bc87009


[ROCm/ROCR-Runtime commit: fd48f14ceb]
2023-04-13 15:25:09 -04:00
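
The lookup rule described in 89ce41694f can be sketched as a two-level map: render node first, then gpu_id. This is only an illustration under stated assumptions, not the libhsakmt code. The class name RenderFdCache, the injected opener callback, and the integer render-node key are all hypothetical; the real open_drm_render_device is a C function with its own bookkeeping.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical cache illustrating a gpu_id-to-fd map partitioned by render node.
class RenderFdCache {
 public:
  // The opener stands in for whatever actually opens the render device.
  using Opener = std::function<int(int render_node)>;

  explicit RenderFdCache(Opener opener) : opener_(std::move(opener)) {
    assert(static_cast<bool>(opener_));
  }

  // Same (render_node, gpu_id) pair: reuse the cached fd.
  // New gpu_id on an already-seen render node: open a fresh fd for it.
  int get_fd(int render_node, std::uint32_t gpu_id) {
    auto& per_node = fds_[render_node];
    auto it = per_node.find(gpu_id);
    if (it != per_node.end()) return it->second;
    const int fd = opener_(render_node);
    if (fd >= 0) per_node.emplace(gpu_id, fd);  // cache only successful opens
    return fd;
  }

 private:
  Opener opener_;
  std::map<int, std::map<std::uint32_t, int>> fds_;
};
```

With an opener that hands out increasing numbers, two partitions (two gpu_ids) on render node 128 receive distinct descriptors, while a repeated request for the same pair returns the cached one.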
Mukul Joshi 0f7cfe5e4b libhsakmt: Update context save handling for multi XCC
Allocate debug area big enough for all XCCs in the partition. Also, fix
the cu_num calculations as driver now reports cu_num as the total number
of CUs in the partition.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I6e80d57196b770bb3c2506bc58cb366c0046084b


[ROCm/ROCR-Runtime commit: 97a669a979]
2023-04-13 15:25:09 -04:00
Graham Sider ea4e2a82bb libhsakmt: Add Aqua Vanjaram support
Add gfx version for VGPR size per CU calc, add FAMILY_AV to KfdFamilyId,
add blacklist filter to kfdtest.exclude.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I9b8072e45f4d497e0a8fd3f8f97f1425238e8b42


[ROCm/ROCR-Runtime commit: 6be4461a0d]
2023-04-13 15:25:09 -04:00
David Yat Sin 6c4528ba33 Fix assertion when _GLIBCXX_ASSERTIONS is enabled
On some platforms, e.g. Arch Linux, the -D_GLIBCXX_ASSERTIONS compile flag
is enabled by default, causing a runtime assertion.
Avoid the assertion by using the std::vector accessor function data().

Change-Id: I118cdf102c3e353f32c618823e363ee1059f3453


[ROCm/ROCR-Runtime commit: 511855d344]
2023-04-11 11:40:10 +00:00
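
As an illustration of the kind of problem 6c4528ba33 describes: with _GLIBCXX_ASSERTIONS defined, libstdc++ checks the index passed to std::vector::operator[], so taking &v[0] on an empty vector aborts at run time, whereas data() is valid for any vector. The helper below is a hypothetical example, not code from the runtime.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies up to `capacity` elements out of `src`.
inline std::size_t copy_out(const std::vector<int>& src, int* dst,
                            std::size_t capacity) {
  const std::size_t n = std::min(src.size(), capacity);
  // Writing &src[0] here would index an empty vector when src is empty,
  // which libstdc++ rejects when _GLIBCXX_ASSERTIONS is defined.
  // data() may be called on an empty vector, and copying zero elements is fine.
  std::copy_n(src.data(), n, dst);
  return n;
}
```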
David Yat Sin f84f83702c Fix for overwriting pointer info size
Fix for overwriting pointer info size provided by caller of
hsa_amd_pointer_info.

Change-Id: I2e5d73ab9ba1a32bc9b4d112bc29b4a99fd8b3b5


[ROCm/ROCR-Runtime commit: c5bf7eb112]
2023-04-06 16:35:37 -04:00
David Yat Sin d476ff16eb Adding scratch memory reservation
Some applications will keep trying to allocate device memory until the
allocation fails. This causes all device memory to be used up and we are
then unable to allocate scratch memory for dispatches. Reserve enough
memory for 1 small scratch allocation.

Change-Id: I968400d41540ba1aca8f28581f229693eec02225


[ROCm/ROCR-Runtime commit: 8ebf5f9c48]
2023-04-06 15:13:36 +00:00
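
The reservation idea in d476ff16eb can be sketched as a pool that caps ordinary allocations below the full device size so that one scratch allocation always fits. Everything here is an assumption made for illustration: the DevicePool class, the is_scratch flag, and the byte counts are hypothetical and do not reflect how the runtime tracks memory.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical accounting-only pool; it tracks sizes and allocates nothing.
class DevicePool {
 public:
  DevicePool(std::uint64_t total, std::uint64_t scratch_reserve)
      : total_(total), reserve_(scratch_reserve) {
    assert(scratch_reserve <= total);
  }

  // Ordinary requests may use at most total - reserve bytes.
  // Scratch requests may also consume the reserved part.
  bool allocate(std::uint64_t size, bool is_scratch) {
    const std::uint64_t limit = is_scratch ? total_ : total_ - reserve_;
    if (used_ > limit || size > limit - used_) return false;
    used_ += size;
    return true;
  }

  std::uint64_t used() const { return used_; }

 private:
  std::uint64_t total_;
  std::uint64_t reserve_;
  std::uint64_t used_ = 0;
};
```

An application that allocates until failure stops at total minus the reserve, and a later small scratch request still succeeds.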
Kent Russell c758f24222 CMakeLists: Use pkgconfig more effectively with DRM_DIR
Instead of hard-coding lib64 and other include locations, just prepend
the DRM_DIR to the beginning of the CMake prefix path. Then let
pkgconfig find the package, the same way that it would if DRM_DIR wasn't
set. DRM_DIR takes precedence, but the default paths will be used if
DRM_DIR isn't set, or doesn't point to where libdrm is housed.

Note that /lib and /lib/$ARCH aren't required for DRM_DIR, just the
path to the root folder for the package (e.g. /opt/amdgpu instead of
/opt/amdgpu/lib or /opt/amdgpu/lib64 or /opt/amdgpu/lib/x86_64-linux-gnu
etc)

Change-Id: I56767db28476d14e3fa77be1089c3904e2a32450


[ROCm/ROCR-Runtime commit: d0c2770cde]
2023-04-06 10:39:40 -04:00
Kent Russell a4edd5bcce README: Update README to point to current documentation
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I3fed80e94edf5ff08a70b2e43450fe8168c5d355


[ROCm/ROCR-Runtime commit: aab0e36538]
2023-04-05 10:35:49 -04:00
Graham Sider 3e75f92cf2 Revert "kfdtest: add MES judging API in test utility."
See description of previous revert.

This reverts commit 8554f0df14.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I969dc6469e62b50cd7ba0595918538602afa7516


[ROCm/ROCR-Runtime commit: 287cb29340]
2023-03-27 17:08:03 -04:00
Graham Sider c298048035 Revert "kfdtest: Using non-paged memory allocation only on devices that have MES scheduler"
This patch and the previous made it such that the queue ring buffer was
allocated as non-paged for GFX11+. The queue ring buffer should not be
mapped as non-paged; the non-paged requirement on GFX11 is only needed
for the queue wptr.

This patch was causing issues on various tests, such as intermittent
CP_INTSRC_BAD_OPCODE interrupts.

This reverts commit 92a336d485.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I55b64aed73dc3b792f0756ae00daf6e10d93ce10


[ROCm/ROCR-Runtime commit: 0750856d4a]
2023-03-27 17:07:59 -04:00
Graham Sider 332effdb67 kfdtest: Add KFDQMTest.BasicCuMaskingEven to GFX11 blacklist
Test is inconsistent across ASICs. Add to blacklist to unblock QA.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I31e5aa2450165227107536bef8402db2c0dc6d7f


[ROCm/ROCR-Runtime commit: 5d80a4d214]
2023-03-23 11:14:58 -04:00
Alex Sierra e4c4b6369d libhsakmt: query svm info from userptrs at fault events
Get more debug information about user pointers that were registered
through SVM API, and triggered by memory exception events.
A new kfdtest with this use case was also included inside
KFDExceptionTest.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I0ef4929afe0625b9b5cbbbebef11ede66dda60ab


[ROCm/ROCR-Runtime commit: 2a1d6ee8b5]
2023-03-22 13:34:02 -05:00
Alex Sierra 9d8a548b17 src: use SVM mechanism to register userptr memory
Register and map userptrs through the Shared Virtual Memory (SVM) API at
the kernel level when available. With this approach, performance
will improve because registering and unregistering memory will not
trigger any system call to the KFD driver.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I3726b4b5e1c6a52a83786fbe0af6322eb29ae7c9


[ROCm/ROCR-Runtime commit: 63c8cf115a]
2023-03-22 13:33:35 -05:00
Konstantin Zhuravlyov 536f0aa118 Loader: Skip vdso.so code objects in GetUriFromMemoryInExecutableFile
Change-Id: Ie2cac880c406ed90d6fa614707fa8df7b87458da


[ROCm/ROCR-Runtime commit: a5932ef5ef]
2023-03-17 09:57:15 -04:00
Lang Yu 44b940e033 Switch to completion signal wait for amd_aql_pm4_ib processing
Wait on completion signal for amd_aql_pm4_ib processing
on ASICs with gfx version >= 9.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: Ia704d9cc5b2535dcf8564a30f694262b113f77a2


[ROCm/ROCR-Runtime commit: aec7200cb2]
2023-03-16 20:23:53 -04:00
Jonathan Kim ad1a3fc9c4 Fix Invalid Engine Offset Check
An engine offset equal to the maximum number of engines is still valid,
because offset enum 0 is occupied by blit copies, so raise the limit by 1.

Change-Id: I6fcab106290e6647702efe297a4281861da4e0b8


[ROCm/ROCR-Runtime commit: fc8f3f9fd5]
2023-03-16 09:50:10 -04:00
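
The off-by-one in ad1a3fc9c4 can be illustrated with a bounds check. The layout assumed here (offset 0 for the blit copy path, offsets 1 through num_engines for the engines) is inferred from the commit message only, and the function name is hypothetical.

```cpp
#include <cassert>

// Assumed layout: offset 0 = blit copies, offsets 1..num_engines = engines.
constexpr bool is_valid_engine_offset(unsigned offset, unsigned num_engines) {
  // A strict '<' would wrongly reject the last engine, whose offset
  // equals num_engines because slot 0 is already taken.
  return offset <= num_engines;
}

static_assert(is_valid_engine_offset(2, 2), "last engine must be accepted");
static_assert(!is_valid_engine_offset(3, 2), "one past the end is rejected");
```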
Shweta Khatri 2bb7d9cfe8 By default, disable mwaitx feature.
This can be enabled by setting HSA_ENABLE_MWAITX=1

Change-Id: I4be00892780beeb8b14c3c5f34aa10b158921bff


[ROCm/ROCR-Runtime commit: 83a307c449]
2023-03-15 19:57:25 -04:00
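
A minimal sketch of an opt-in environment switch such as HSA_ENABLE_MWAITX, assuming the values "1" and "0" mentioned in the commit messages. How the runtime actually parses the variable, and how it treats other values, is not stated in the log; parse_flag and mwaitx_enabled are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical parser: "1" enables, "0" disables, anything else keeps the default.
inline bool parse_flag(const char* value, bool default_value) {
  if (value == nullptr) return default_value;  // variable not set
  if (std::strcmp(value, "1") == 0) return true;
  if (std::strcmp(value, "0") == 0) return false;
  return default_value;  // unrecognised value (an assumption of this sketch)
}

// Default is "disabled", matching the behaviour described in 2bb7d9cfe8.
inline bool mwaitx_enabled() {
  return parse_flag(std::getenv("HSA_ENABLE_MWAITX"), /*default_value=*/false);
}
```

Passing true as the default instead would model the earlier opt-out behaviour from d022746cb7.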
AravindanC 768d8982e1 ASAN Packaging for libhsakmt
Change-Id: I0a6232cdb61742aa81394bb49d2b5e890b6ada6f


[ROCm/ROCR-Runtime commit: 0f977fd1d8]
2023-03-14 20:04:51 -07:00
Ranjith Ramakrishnan 1851235878 ASAN packaging for hsa
Package ASAN libraries and license file
Suffix "asan" added to package name

Change-Id: I2af416d86a9068a41e3880836a21c9005e45271b


[ROCm/ROCR-Runtime commit: dd9b7b3b3a]
2023-03-13 23:32:30 -07:00
Ranjith Ramakrishnan 92cc423428 Compile time flag to switch between #warning and #error message
Using backward compatibility paths produces an #error message. A compile-time option was added to enable or disable the #error message.
When it is disabled, a #warning message is produced instead.

Change-Id: Ibb84241ba35aefb7a8450d68231e52242a634ed3


[ROCm/ROCR-Runtime commit: c911848242]
2023-03-10 13:09:13 -08:00
Ranjith Ramakrishnan 0ba8268db3 Compile time flag to switch between #warning and #error message
Using backward compatibility paths produces an #error message. A compile-time option was added to enable or disable the #error message.
When it is disabled, a #warning message is produced instead.

Change-Id: Ib48e361b72176e2845c8f74f980f0234e7eb4a7d


[ROCm/ROCR-Runtime commit: 629ddde072]
2023-03-10 08:39:54 -08:00
Konstantin Zhuravlyov c05cf2ea0f ISA/NFC: Change tabs to spaces
Change-Id: Iabc541ec78607881a2828cd79916a928b39dcfcb


[ROCm/ROCR-Runtime commit: 7e403f08a6]
2023-03-08 19:39:15 -05:00
Konstantin Zhuravlyov d861267d20 Loader/NFC: Factor out mach information into the struct
Change-Id: I9304c96336c434570bd5da92cd197ee764945907


[ROCm/ROCR-Runtime commit: 8043fe9ee0]
2023-03-07 14:41:03 -05:00
Sean Keely deee152909 Add support for exporting portable handles to GPU allocations.
Adds hsa_amd_portable_export_dmabuf and hsa_amd_portable_close_dmabuf
which allow obtaining dmabuf handles to rocr allocations.  These handles
may be shared with other APIs to support cross vendor & cross device
memory sharing.
Adds query to return whether dmabuf export is supported

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>

Change-Id: I7f98501087d9563d07fc2cb428cc886b1e518b1e


[ROCm/ROCR-Runtime commit: 42243c1e8f]
2023-03-06 12:39:01 -05:00
Jonathan Kim 57064af98d Fix Engine Offsetting for Copy on Engine
Forgot SDMA blit engine indices are offset by DevToDev 0-position in
a couple of places.

Change-Id: Ie811d8281bc812738ed0107694f3dffde5e93685


[ROCm/ROCR-Runtime commit: 7364a93b98]
2023-03-03 20:45:35 -05:00
Daniel Phillips 7691c4600f kfdtests: Relax MemoryAllocAll failure criteria
The MemoryAllocAll test in kfdtests exercises the new KFD memory
availability API by trying to allocate a single buffer object that
exactly fills all of vram. Desired object size is determined using the
memory availability KFD ioctl via libhsakmt, then an object is allocated
slightly larger than that size. If the allocation attempt fails then
the test tries to allocate a slightly smaller object, and continues
trying with smaller sizes until the allocation succeeds. The test
succeeds if the successfully allocated object is within some specified
tolerance of the available memory reported.

There are a number of known issues that can cause the successfully
allocated object to be significantly smaller than reported availability.
Until these issues are addressed, we should not fail the test, but just
log the actual divergence between the size of the object we thought we
could allocate, and what was actually possible.

Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Change-Id: I165a30865ffbb2353286dcc896ad8e24af124615


[ROCm/ROCR-Runtime commit: d3bb1ca4af]
2023-03-03 15:24:39 -08:00
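
The probing loop described in 7691c4600f can be sketched as follows: start slightly above the reported availability, shrink until an allocation succeeds, then record the shortfall instead of failing. The names probe_largest_alloc and AllocProbe, the fixed step size, and the injected try_alloc callback are assumptions for illustration; the real test uses libhsakmt calls and its own sizes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

struct AllocProbe {
  std::uint64_t allocated;  // size that finally succeeded (0 if none did)
  std::uint64_t shortfall;  // reported availability minus allocated, floored at 0
};

// Hypothetical helper; try_alloc stands in for the real allocation attempt.
inline AllocProbe probe_largest_alloc(
    std::uint64_t reported, std::uint64_t step,
    const std::function<bool(std::uint64_t)>& try_alloc) {
  assert(step > 0);
  std::uint64_t size = reported + step;  // begin slightly above the report
  while (size > 0 && !try_alloc(size)) {
    size = (size > step) ? size - step : 0;  // retry with a smaller object
  }
  const std::uint64_t shortfall = (reported > size) ? reported - size : 0;
  return {size, shortfall};
}
```

A caller in the spirit of the relaxed test would log the shortfall and pass regardless, rather than comparing it against a tolerance.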
Eric Huang 9e41c799a0 kfdtest: add the check for svm usage limit
Since KFD counts SVM allocations as system memory usage,
KFDSVMEvictTest fails on systems with little system memory.
Add a check that skips the test in that case.

Signed-off-by: Eric Huang <jinhuieric.Huang@amd.com>
Change-Id: I040f16f2dd0d4092d069a632cfba9c28293f781b


[ROCm/ROCR-Runtime commit: 3f55ba9fb8]
2023-03-03 11:03:17 -05:00
Yifan Zhang 7daedd5eef gfx11 is able to perform atomic ops even if PCI reports no atomic support.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: Ie0d8af5a64ed717b140ac14db654c65ec7aa5ebb


[ROCm/ROCR-Runtime commit: 9f0f7741de]
2023-03-02 09:23:37 -05:00
Felix Kuehling ecf502f50d kfdtest: Add test for hsaKmtExportDMABufHandle
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia87377c1d4201fecfa00c2e0ca53b507608df2b3


[ROCm/ROCR-Runtime commit: e5ab87ede7]
2023-02-27 14:44:11 -05:00
Felix Kuehling caf8b70da7 libhsakmt: Implement dmabuf export for RDMA
Implement hsaKmtExportDMABufHandle, which can be used for a new
upstreamable RDMA solution. It exports a DMABuf handle for an arbitrary
virtual address along with the offset of the address within the
allocation. It also checks that the size of the intended export does
not exceed the allocation.

This uses the new AMDKFD_IOC_EXPORT_DMABUF, which requires KFD ioctl
API version 1.12.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ie5fdb1f73ab3c7fa36c315ce326b1fb89eacc8b6


[ROCm/ROCR-Runtime commit: 332f59eb2a]
2023-02-27 14:44:11 -05:00
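
The two checks mentioned in caf8b70da7 (computing the offset of an address within its allocation, and rejecting an export that runs past the end) can be sketched with plain integer arithmetic. This is not the hsaKmtExportDMABufHandle implementation; check_export and ExportRange are hypothetical names, and the allocation base and size are assumed to be known already.

```cpp
#include <cassert>
#include <cstdint>

struct ExportRange {
  bool ok;               // true when the requested range fits in the allocation
  std::uint64_t offset;  // offset of addr within the allocation (valid when ok)
};

// Hypothetical range check, written to avoid unsigned overflow.
inline ExportRange check_export(std::uint64_t alloc_base, std::uint64_t alloc_size,
                                std::uint64_t addr, std::uint64_t size) {
  if (addr < alloc_base) return {false, 0};
  const std::uint64_t offset = addr - alloc_base;
  // Compare against the remaining bytes instead of computing offset + size.
  if (offset > alloc_size || size > alloc_size - offset) return {false, 0};
  return {true, offset};
}
```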
Yifan Zhang 92a336d485 kfdtest: Using non-paged memory allocation only on devices that have MES scheduler
Change-Id: I9181b353aac791f546aa7679ffd7cb8d9f8ef765


[ROCm/ROCR-Runtime commit: e40ae8481e]
2023-02-27 10:32:15 +08:00
Yifan Zhang 8554f0df14 kfdtest: add MES judging API in test utility.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I978fc85b7c81ea65b97953a50d2d0312bcba95bf


[ROCm/ROCR-Runtime commit: 564913526a]
2023-02-26 21:22:39 -05:00
kent.russell@amd.com b6467c7bc8 Add check for available_memory API
If the KFD IOCTL version doesn't support available_memory, skip the
test instead of running it.

Change-Id: Iebf526d4563ab9f3c054bbfb38c214a1b893fcb5


[ROCm/ROCR-Runtime commit: 64aa9009e1]
2023-02-23 15:19:28 -05:00
David Yat Sin 908a2b1eba Revert "Add flag for external memory allocations"
This reverts commit 000f4c0547.

Change-Id: I32a92672553c4c38ffae53a085f83c0403c160ae


[ROCm/ROCR-Runtime commit: 7ed6d73b6d]
2023-02-23 11:31:15 -05:00
Graham Sider 179ddc4870 kfdtest: Update GFX11 blacklists
Remove BLACKLIST_GFX10_NV2X from GFX11 blacklists, update
BLACKLIST_GFX11 as needed.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I84bd91ba20a5d3df27478fb4c97afa12f8a3e76a


[ROCm/ROCR-Runtime commit: 60831e86b2]
2023-02-23 09:11:27 -05:00
David Yat Sin a2e4ba149c Revert "Enforce uncached memory on AllocatePCIeRW request"
This reverts commit 36da397f96.

Change-Id: I5a7fe9e99685f589f95dd89eacf04d44e5587f2f


[ROCm/ROCR-Runtime commit: 37b5b421b3]
2023-02-22 21:55:48 -05:00
David Yat Sin d022746cb7 Use mwaitx when busy-waiting signals
Use mwaitx instructions when busy waiting for signals to reduce CPU
energy usage.
This can be disabled by setting HSA_ENABLE_MWAITX=0

Change-Id: Ic207895a491b2bf6dacba47ef0921df3faad5b5a


[ROCm/ROCR-Runtime commit: cc48dfdbff]
2023-02-22 16:55:43 +00:00
David Yat Sin 72e7fe7aec Add function to parse CPUID information
Used to detect whether mwaitx instruction is supported

Change-Id: I66fe906325aa523c8815133cf782df3a17a7edab


[ROCm/ROCR-Runtime commit: 0ed1568afc]
2023-02-22 16:55:42 +00:00
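
One way such a detection could look, as a sketch rather than the function added in 72e7fe7aec: AMD documents MONITORX/MWAITX support as bit 29 of ECX in CPUID leaf 0x80000001. The code below assumes GCC or Clang (for <cpuid.h> and __get_cpuid) and returns false on non-x86 targets; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang helper header; an assumption of this sketch
#endif

// CPUID leaf 0x80000001, ECX bit 29: MONITORX/MWAITX support.
constexpr std::uint32_t kMonitorxBit = 1u << 29;

constexpr bool ecx_has_monitorx(std::uint32_t ecx) {
  return (ecx & kMonitorxBit) != 0;
}

inline bool cpu_supports_mwaitx() {
#if defined(__x86_64__) || defined(__i386__)
  unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
  // __get_cpuid returns 0 when the requested leaf is not available.
  if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx) == 0) return false;
  return ecx_has_monitorx(ecx);
#else
  return false;
#endif
}
```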
Yifan Zhang 90e271e19b Fix MemoryConcurrentTest failure for APUs w/ small VRAM
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I85b8f9f1ff0fbb5a063b310aa6f72b9b5cdc13b4


[ROCm/ROCR-Runtime commit: d0330d7958]
2023-02-16 20:23:38 +08:00
Yifan Zhang 565665f141 Fix rocrtstPerf.Memory_Async_Copy failure for APUs w/ small VRAM
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: Ieec5b76f0e058d5655145b51fdea48e3d87560b4


[ROCm/ROCR-Runtime commit: 83cb79510e]
2023-02-16 20:18:04 +08:00
Yifan Zhang e93868b503 Fix rocrtstFunc.Memory_Available failure for APUs w/ small VRAM.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I0e9d5f1880c0e484e88ed424888d94d1bcac4d53


[ROCm/ROCR-Runtime commit: 9bab46130a]
2023-02-16 20:16:28 +08:00
Yifan Zhang 94d5ab8c9e Avoid memory leak when rocrtstFunc.Memory_Available fails
A failed assert aborts the test thread with memPtr1 still allocated.
Free memPtr1 to avoid the memory leak.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I4e1a202c1acb9ba71a23e112254f875bf5a0abcf


[ROCm/ROCR-Runtime commit: afae35b0fd]
2023-02-16 20:13:15 +08:00
Yifan Zhang 04ea6db7e6 Fix rocrtstFunc.Memory_Max_Mem failure for APUs w/ small VRAM
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I1c0f481af8b1d2a0939d28fb184ff6887747ab03


[ROCm/ROCR-Runtime commit: 4ebb9857ee]
2023-02-16 20:12:19 +08:00