Commit Graph

2959 Commits

Author SHA1 Message Date
Alex Sierra 63c8cf115a src: use SVM mechanism to register userptr memory
Register and map userptrs through the Shared Virtual Memory (SVM) API at
the kernel level when available. With this approach, performance
improves because registering and unregistering memory no longer triggers
a system call to the KFD driver.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I3726b4b5e1c6a52a83786fbe0af6322eb29ae7c9
2023-03-22 13:33:35 -05:00
Konstantin Zhuravlyov a5932ef5ef Loader: Skip vdso.so code objects in GetUriFromMemoryInExecutableFile
Change-Id: Ie2cac880c406ed90d6fa614707fa8df7b87458da
2023-03-17 09:57:15 -04:00
Lang Yu aec7200cb2 Switch to completion signal wait for amd_aql_pm4_ib processing
Wait on completion signal for amd_aql_pm4_ib processing
on ASICs with gfx version >= 9.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: Ia704d9cc5b2535dcf8564a30f694262b113f77a2
2023-03-16 20:23:53 -04:00
Jonathan Kim fc8f3f9fd5 Fix Invalid Engine Offset Check
Engine offset that is the maximum number of engines is still valid
as offset enum 0 is occupied by blit copies so raise the limit by 1.

Change-Id: I6fcab106290e6647702efe297a4281861da4e0b8
2023-03-16 09:50:10 -04:00
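The bounds check described in fc8f3f9fd5 above can be sketched as follows. This is only an illustration of the off-by-one reasoning, not the ROCr code: the function name and parameters are hypothetical, and the one assumption carried over from the commit message is that offset 0 is reserved for blit copies, so the engines themselves occupy offsets 1 through the engine count.

```python
def is_valid_engine_offset(offset: int, num_engines: int) -> bool:
    """Return True if `offset` selects the blit slot (0) or one of the engines.

    Hypothetical helper: offset 0 is taken by blit copies, so the engines
    occupy offsets 1..num_engines and the upper bound is inclusive.
    """
    return 0 <= offset <= num_engines
```

With two engines, offsets 0, 1 and 2 pass the check and offset 3 does not; an exclusive upper bound would wrongly reject the last engine.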
Shweta Khatri 83a307c449 By default, disable mwaitx feature.
This can be enabled by setting HSA_ENABLE_MWAITX=1

Change-Id: I4be00892780beeb8b14c3c5f34aa10b158921bff
2023-03-15 19:57:25 -04:00
AravindanC 0f977fd1d8 ASAN Packaging for libhsakmt
Change-Id: I0a6232cdb61742aa81394bb49d2b5e890b6ada6f
2023-03-14 20:04:51 -07:00
Ranjith Ramakrishnan dd9b7b3b3a ASAN packaging for hsa
Package ASAN libraries and license file
Suffix "asan" added to package name

Change-Id: I2af416d86a9068a41e3880836a21c9005e45271b
2023-03-13 23:32:30 -07:00
Ranjith Ramakrishnan c911848242 Compile time flag to switch between #warning and #error message
Using backward compatibility paths will provide an #error message. Compile time option added to enable/disable the #error message.
Disabling the same will provide a #warning message

Change-Id: Ibb84241ba35aefb7a8450d68231e52242a634ed3
2023-03-10 13:09:13 -08:00
Ranjith Ramakrishnan 629ddde072 Compile time flag to switch between #warning and #error message
Using backward compatibility paths will provide an #error message. Compile time option added to enable/disable the #error message.
Disabling the same will provide a #warning message

Change-Id: Ib48e361b72176e2845c8f74f980f0234e7eb4a7d
2023-03-10 08:39:54 -08:00
Konstantin Zhuravlyov 7e403f08a6 ISA/NFC: Change tabs to spaces
Change-Id: Iabc541ec78607881a2828cd79916a928b39dcfcb
2023-03-08 19:39:15 -05:00
Konstantin Zhuravlyov 8043fe9ee0 Loader/NFC: Factor out mach information into the struct
Change-Id: I9304c96336c434570bd5da92cd197ee764945907
2023-03-07 14:41:03 -05:00
Sean Keely 42243c1e8f Add support for exporting portable handles to GPU allocations.
Adds hsa_amd_portable_export_dmabuf and hsa_amd_portable_close_dmabuf
which allow obtaining dmabuf handles to rocr allocations.  These handles
may be shared with other APIs to support cross vendor & cross device
memory sharing.
Adds query to return whether dmabuf export is supported

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>

Change-Id: I7f98501087d9563d07fc2cb428cc886b1e518b1e
2023-03-06 12:39:01 -05:00
Jonathan Kim 7364a93b98 Fix Engine Offsetting for Copy on Engine
Forgot SDMA blit engine indices are offset by DevToDev 0-position in
a couple of places.

Change-Id: Ie811d8281bc812738ed0107694f3dffde5e93685
2023-03-03 20:45:35 -05:00
Daniel Phillips d3bb1ca4af kfdtests: Relax MemoryAllocAll failure criteria
The MemoryAllocAll test in kfdtests exercises the new KFD memory
availability API by trying to allocate a single buffer object that
exactly fills all of vram. Desired object size is determined using the
memory availability KFD ioctl via libhsakmt, then an object is allocated
slightly larger than that size. If the allocation attempt fails then
the test tries to allocate a slightly smaller object, and continues
trying with smaller sizes until the allocation succeeds. The test
succeeds if the successfully allocated object is within some specified
tolerance of the available memory reported.

There are a number of known issues that can cause the successfully
allocated object to be significantly smaller than reported availability.
Until these issues are addressed, we should not fail the test, but just
log the actual divergence between the size of the object we thought we
could allocate, and what was actually possible.

Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Change-Id: I165a30865ffbb2353286dcc896ad8e24af124615
2023-03-03 15:24:39 -08:00
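The retry strategy described in d3bb1ca4af above can be sketched roughly as follows. This is a minimal illustration, not the kfdtest code: `try_alloc`, `step` and `tolerance` are hypothetical stand-ins for the real allocation call, the size decrement and the accepted divergence, none of which are specified in the commit message.

```python
from typing import Callable, Optional, Tuple


def alloc_near_available(
    available: int,
    try_alloc: Callable[[int], bool],
    step: int,
    tolerance: int,
) -> Tuple[Optional[int], bool]:
    """Shrink the request until an allocation succeeds.

    Returns (allocated_size, within_tolerance); allocated_size is None
    if no size could be allocated at all.
    """
    size = available + step  # start slightly above the reported availability
    while size > 0:
        if try_alloc(size):
            divergence = available - size
            within = divergence <= tolerance
            if not within:
                # Relaxed criteria: record the divergence instead of failing.
                print(f"allocated {size} of {available} reported, "
                      f"divergence {divergence}")
            return size, within
        size -= step  # retry with a slightly smaller object
    return None, False
```

A caller following the relaxed criteria would treat any non-None size as a pass and use the second value only for logging.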
Eric Huang 3f55ba9fb8 kfdtest: add the check for svm usage limit
Since KFD counts SVM allocations as system memory usage,
KFDSVMEvictTest fails on systems with little system
memory. Add a check to skip the test in that case.

Signed-off-by: Eric Huang <jinhuieric.Huang@amd.com>
Change-Id: I040f16f2dd0d4092d069a632cfba9c28293f781b
2023-03-03 11:03:17 -05:00
Yifan Zhang 9f0f7741de gfx11 is able to perform atomic ops even if PCI reports no atomic support.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: Ie0d8af5a64ed717b140ac14db654c65ec7aa5ebb
2023-03-02 09:23:37 -05:00
Felix Kuehling e5ab87ede7 kfdtest: Add test for hsaKmtExportDMABufHandle
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia87377c1d4201fecfa00c2e0ca53b507608df2b3
2023-02-27 14:44:11 -05:00
Felix Kuehling 332f59eb2a libhsakmt: Implement dmabuf export for RDMA
Implement hsaKmtExportDMABufHandle, which can be used for a new
upstreamable RDMA solution. It exports a DMABuf handle for an arbitrary
virtual address along with the offset of the address within the
allocation. It also checks that the size of the intended export does
not exceed the allocation.

This uses the new AMDKFD_IOC_EXPORT_DMABUF, which requires KFD ioctl
API version 1.12.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ie5fdb1f73ab3c7fa36c315ce326b1fb89eacc8b6
2023-02-27 14:44:11 -05:00
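The offset and size check described in 332f59eb2a above can be illustrated with plain integer arithmetic. This sketch does not call libhsakmt or any KFD ioctl; the function name and its parameters are hypothetical and only model the rule that the exported range must stay inside the allocation containing the address.

```python
def export_offset(alloc_base: int, alloc_size: int,
                  addr: int, export_size: int) -> int:
    """Return the offset of `addr` within its allocation.

    Raises ValueError if `addr` lies outside the allocation or if
    `export_size` bytes starting at `addr` would run past its end.
    """
    if not (alloc_base <= addr < alloc_base + alloc_size):
        raise ValueError("address is outside the allocation")
    offset = addr - alloc_base
    if offset + export_size > alloc_size:
        raise ValueError("export would exceed the allocation")
    return offset
```

For an allocation of 0x4000 bytes at 0x1000, exporting 0x1000 bytes from address 0x2000 yields offset 0x1000, while exporting 0x2000 bytes from address 0x4000 is rejected.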
Yifan Zhang e40ae8481e kfdtest: Using non-paged memory allocation only on devices that have MES scheduler
Change-Id: I9181b353aac791f546aa7679ffd7cb8d9f8ef765
2023-02-27 10:32:15 +08:00
Yifan Zhang 564913526a kfdtest: add MES judging API in test utility.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I978fc85b7c81ea65b97953a50d2d0312bcba95bf
2023-02-26 21:22:39 -05:00
kent.russell@amd.com 64aa9009e1 Add check for available_memory API
If the KFD IOCTL version doesn't support available_memory, don't run the
test. Just skip the test

Change-Id: Iebf526d4563ab9f3c054bbfb38c214a1b893fcb5
2023-02-23 15:19:28 -05:00
David Yat Sin 7ed6d73b6d Revert "Add flag for external memory allocations"
This reverts commit 59685f4492.

Change-Id: I32a92672553c4c38ffae53a085f83c0403c160ae
2023-02-23 11:31:15 -05:00
Graham Sider 60831e86b2 kfdtest: Update GFX11 blacklists
Remove BLACKLIST_GFX10_NV2X from GFX11 blacklists, update
BLACKLIST_GFX11 as needed.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I84bd91ba20a5d3df27478fb4c97afa12f8a3e76a
2023-02-23 09:11:27 -05:00
David Yat Sin 37b5b421b3 Revert "Enforce uncached memory on AllocatePCIeRW request"
This reverts commit ed0a1be2c3.

Change-Id: I5a7fe9e99685f589f95dd89eacf04d44e5587f2f
2023-02-22 21:55:48 -05:00
David Yat Sin cc48dfdbff Use mwaitx when busy-waiting signals
Use mwaitx instructions when busy waiting for signals to reduce CPU
energy usage.
This can be disabled by setting HSA_ENABLE_MWAITX=0

Change-Id: Ic207895a491b2bf6dacba47ef0921df3faad5b5a
2023-02-22 16:55:43 +00:00
David Yat Sin 0ed1568afc Add function to parse CPUID information
Used to detect whether mwaitx instruction is supported

Change-Id: I66fe906325aa523c8815133cf782df3a17a7edab
2023-02-22 16:55:42 +00:00
Yifan Zhang d0330d7958 Fix MemoryConcurrentTest failure for APUs w/ small VRAM
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I85b8f9f1ff0fbb5a063b310aa6f72b9b5cdc13b4
2023-02-16 20:23:38 +08:00
Yifan Zhang 83cb79510e Fix rocrtstPerf.Memory_Async_Copy failure for APUs w/ small VRAM
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: Ieec5b76f0e058d5655145b51fdea48e3d87560b4
2023-02-16 20:18:04 +08:00
Yifan Zhang 9bab46130a Fix rocrtstFunc.Memory_Available failure for APUs w/ small VRAM.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I0e9d5f1880c0e484e88ed424888d94d1bcac4d53
2023-02-16 20:16:28 +08:00
Yifan Zhang afae35b0fd Avoid memory leak when rocrtstFunc.Memory_Available fails
A failed assert aborts the test thread with memPtr1 still allocated.
Free memPtr1 to avoid the memory leak.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I4e1a202c1acb9ba71a23e112254f875bf5a0abcf
2023-02-16 20:13:15 +08:00
Yifan Zhang 4ebb9857ee Fix rocrtstFunc.Memory_Max_Mem failure for APUs w/ small VRAM
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I1c0f481af8b1d2a0939d28fb184ff6887747ab03
2023-02-16 20:12:19 +08:00
Lang Yu 8501c0bcb1 Fix memory async copy test performance issue
Copying memory from device to host with a CPU agent
causes poor performance because the CPU reads
uncached device memory.

Fix it by using a GPU agent.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: Ia3b562758fe73ef9efaa284f47e67bf569cc7b7b
2023-02-15 22:24:55 -05:00
Ranjith Ramakrishnan 3636d487c9 File reorg backward compatibility message changed to #error
Change-Id: I699dee834865ee573a516d58b8b8faa1da4f288a
2023-02-14 21:46:43 -08:00
David Yat Sin 53d53655d7 Fix for uninitialized variables
Change-Id: Ie8a004db699248d0cde4213077520ea503754399
2023-02-14 14:19:31 +00:00
Jonathan Kim 30920fc94d Add interface to DMA copy directly to a target engine.
Change-Id: Ic87cfeabb11c1a465f98f3f444d39955f5300525
2023-02-13 13:50:49 -05:00
Jonathan Kim 8f27f495c6 Make SDMA engine availability status queryable.
Report the availability of SDMA engines for memory copies.

Change-Id: Ie31b02d6b65355122bb8c98bc73700a59bee166e
2023-02-13 13:50:49 -05:00
Jonathan Kim 4f283d9bb3 Make the number of per agent SDMA engines queryable.
Change-Id: Iae1cc9b7ec783fdda05f9384f0ad0327ea1a8cc3
2023-02-13 13:50:49 -05:00
Ranjith Ramakrishnan 053b89414a File reorg backward compatibility message changed to #error
Change-Id: I70b6f06b5e82242b3f50e7d1f0dac8a1eb8add11
2023-02-10 13:09:10 -05:00
David Yat Sin fb8f42233d Fix uninitialized variable warning in valgrind
Change-Id: I91e70d67671a8f7289b734407011380b6b97238a
2023-02-09 17:35:53 -05:00
Xiaogang Chen efcc9b275b libhsakmt: Correct reporting of Shader Engines number.
The number of Shader Engines should be the shader array_count divided by
simd_arrays_per_engine, not array_count itself.

Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com>
Change-Id: I808d1fedd6b9843500719e902ecf759f5668a7d1
2023-02-09 14:34:17 -05:00
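The correction in efcc9b275b above amounts to one division. A minimal sketch, assuming both inputs are plain integers taken from the topology data; the function name is hypothetical and the field names are borrowed from the commit message.

```python
def shader_engine_count(array_count: int, simd_arrays_per_engine: int) -> int:
    """Number of shader engines: shader arrays divided by arrays per engine."""
    if simd_arrays_per_engine <= 0:
        raise ValueError("simd_arrays_per_engine must be positive")
    # Reporting array_count directly would overcount whenever an engine
    # contains more than one shader array.
    return array_count // simd_arrays_per_engine
```

For example, 8 shader arrays with 2 arrays per engine give 4 shader engines rather than 8.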
Cordell Bloor 5873a78d58 Fix static initialization order
Change-Id: I1d51e150b526d050b988fe5a422644667a561cd7
2023-02-09 13:51:08 -05:00
David Yat Sin 59685f4492 Add flag for external memory allocations
ROCr internally uses the same allocation_map_ list to track memory
allocations that are both for internal allocations and allocations by
users of ROCr library. In some edge cases, the library user would call
hsa_amd_pointer_info on an invalid pointer, but ROCR would return the
pointer as valid because this pointer belongs to a memory range that
was allocated internally within ROCr. Adding a flag to differentiate
between internal and external allocations.

Change-Id: I98c52bd85f3985d1ba1b0e3101d2254b003412cf
2023-02-09 13:21:43 -05:00
Sean Keely 27596aef0c Track size of pending operations in blits.
Track and report the size, in bytes, of pending unexecuted blit
commands.  To be used in copy ganging.

Change-Id: Ia7453ff88571e927df771c6c819b73c17e67708e
2023-02-06 12:38:40 -05:00
Konstantin Zhuravlyov f115a3505c Compile image blit kernels with code object v4
Change-Id: I4b1923fe8f22dda1277409794d0856419228eceb
2023-02-02 17:33:15 -05:00
Graham Sider 3fb1496fb3 kfdtest: Remove redundant SGPR/VGPR size checks in KFDTopologyTest
KFDTopologyTest.BasicTest duplicates Thunk logic to calculate VGPR size,
meaning it will always be the same, and SGPR size is a constant. Since
the comparisons provide no benefit, remove them.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I99e7ff6fb69ed07bc0716fdf43946b19c67b9268
2023-01-26 15:08:12 -05:00
Ranjith Ramakrishnan a649357dec rocrtst package name updated as per ROCm standards
Change-Id: I6d7096a5c5c27648bf0cbfb4b1e83e72b7949421
2023-01-26 13:04:41 -05:00
Shweta Khatri 8aac885318 Fixes hang due to change in order of initialization of libraries
Fixes hang due to change in order of initialization of libraries
that have cyclical dependencies and they call hsa_init() during their
initialization phase.
This implementation looks for a symbol called "HSA_AMD_TOOL_PRIORITY"
across all loaded shared libraries using dynamic section entries of the
loaded lib instead of using dlopen and dlsym for the same purpose.

Change-Id: I4865f2fd18dd186ec311a432ec38fbb5583805d2
2023-01-26 01:17:22 -05:00
David Belanger 0eb0bae38b Revert "libhsakmt: Disabled allocation of CWSR with SVM for GFX11."
This reverts commit b25867c4b8.

Change-Id: I05bf82266f563c63c0b794a24b0926e7652ce42d
Signed-off-by: David Belanger <david.belanger@amd.com>
2023-01-25 10:48:46 -05:00
David Belanger a847a7b80e libhsakmt: Fixed VGPR memory size for GFX11.0 and GFX11.1.
Fixed VGPR memory size; the size was too small for some GPUs, causing a memory overflow.
Refactored macro code into a function.
Thanks to Jay Cornwall for locating the problem and proposing the fix.

Change-Id: Iffedea1c4f341967f02c56d810ff048225b02c16
Signed-off-by: David Belanger <david.belanger@amd.com>
2023-01-25 10:45:44 -05:00
Ranjith Ramakrishnan bc40579f96 Added OS details to kfdtest rpm package name
Change-Id: I600e094c364e1c7219ae3db12f0c4e1f7598c132
2023-01-23 12:13:32 -08:00