Граф коммитов

175 Коммитов

Автор SHA1 Сообщение Дата
Flora Cui a765dd7e94 rocr: add specific flag for blit kernel object
so that aql-to-pm4 conversion could verify the validity of the kernel
object.

Signed-off-by: Flora Cui <flora.cui@amd.com>
2025-07-17 21:55:02 +08:00
Honglei Huang 45af009c5d libhsakmt: use uint32_t for loop index variables
This patch changes the type of several loop index variables from int to
uint32_t in fmm.c. The affected functions are:
- __fmm_release
- _fmm_map_to_gpu
- _fmm_unmap_from_gpu

To fix compile warning:

warning: comparison of integer expressions of different signedness:
'int' and 'uint32_t' {aka 'unsigned int'} [-Wsign-compare]
 2009 |         for (i = 0; i < object->handle_num; i++) {

Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
2025-07-09 13:15:42 +08:00
Apurv Mishra 3115384874 kfdtest: Temporarily blacklist KFDEvictTest suite
blacklist the KFDEvictTest suite until the defects
SWDEV 535386 and 537002, where these test cases fail
inconsistently, are fixed

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
2025-07-04 11:47:20 -04:00
David Yat Sin 96d0f07b15 rocr: Fix compile warning when using clang 2025-06-12 10:38:58 -04:00
Apurv Mishra d9a95605cc kfdtest: Disable KFD RAS test case
disable KFD RAS test case as the tests cause GPU reset
which affects the active kfdtest, the tests can only be
run successfully as separate processes

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
2025-05-27 19:04:04 -04:00
Eric Huang afe7965796 libhsakmt: optimize big system buffer allocation
To change biggest single buffer to be huge page aligned
and other optimization.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
2025-05-26 18:30:00 -04:00
Eric Huang 8887d25304 libhsakmt: add big system buffer allocation support
when allocating userptr buffer in system ram with size bigger
than or equal 512G, TTM has limit and returns error, to split one
big buffer into multiple small buffers in vm_object will solve
this issue.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
2025-05-26 11:04:30 -04:00
Amber Lin 31d51acb26 kfdtest: blacklist KFDSVMEvictTest.QueueTest
Temporarily blacklist KFDSVMEvictTest.QueueTest on gfx950

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2025-05-23 01:22:11 -04:00
Ramakrishnan, Ranjith 1785cff6a5 CMake: Remove file reorganization backward compatibility code (#176)
The feature has already been disabled, and the related source code is no longer required
2025-05-22 09:47:26 -07:00
Philip Yang bd86fb1e63 kfdtest: Add KFDQMTest UserQueueBufValidation
Create CP queue and SDMA queue should fail with invalid queue ring
buffer or ring buffer size.

Test unmap or free queue buffers should fail before queue is destroyed.

Use child process to test unmap CWSR buffer will evict queue.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I5dcd51d6b43445d19a986f8b0b82063e20348a5f
2025-05-22 10:06:42 -04:00
Philip Yang 3e6f51b715 libhsakmt: unmap from GPU error handling
If unmap from GPU return failed, for example, unmap user queue buffer
while queue is active, we should not free obj->mapped_node_id_array,
otherwise, the following unmap user queue buffer after queue is
destroyed still return failed.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I32aeb18871c2e971d01900d92916c54680f5c9fa
2025-05-22 10:06:42 -04:00
Apurv Mishra f853dda9ba kfdtest: Disable tests that cause unwanted behavior
disable KFDLocalMemoryTest.Fragmentation and
KFDEventTest.MeasureInterruptConsumption as
part of the  KFD test suite improvement feature

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
2025-05-21 16:29:15 -04:00
Ben Vanik d54124383f kfdtest: Fix SVM profiler QUEUE_RESTORE parsing 2025-05-21 13:17:25 -04:00
Searles, Mark ac1e6d59c2 Update createMCObjectStreamer() to use new LLVM API (#156) (#157)
* Update createMCObjectStreamer() to use new LLVM API

Obsolete interfaces were removed via llvm-project's
f2ff298867d7733122e32eead5a8c524b09dfdb1

* Fix typo: LLVM_VERSION -> LLVM_VERSION_MAJOR

* Fix typo
2025-05-05 13:18:05 -07:00
Apurv Mishra aa0a32a166 kfdtest: Update ROCr homepage in CMakeLists.txt
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
2025-05-01 11:22:49 -04:00
Amber Lin 5e28208cec kfdtest: Skip SVMEvict with xnack=0
Random driver deadlock on svm_range_evict_svm_bo_worker() is obeserved on
NPS2/DPX mode. It's seen with xnack off and happens more often on the
partition with less VRAM because of TMR.

Temporarily skip SVM Evict tests on Family AV when xnack is disabled.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2025-04-25 12:45:36 -04:00
Tony Gutierrez 6e3c375bf1 rocr: Flags to alloc queue buf/struct in dev mem
This builds on a prior change that allowed for allocating
a user-mode queue's packet buffer in device memory to also
allocate the queue struct in device memory. This provides
additional latency benefits particularly for cases where
dispatches are performed from the GPU itself. Flags are
added to support the various use cases.
2025-04-23 15:53:29 -04:00
Amber Lin bdb6e43b54 Revert "kfdtest: Temporarily blacklist KFDNegativeTest"
This reverts commit fcf3f91379.

MEC v18 starts to support pipe reset
2025-04-21 14:14:10 -04:00
Jonathan Kim 4c3a0698f8 kfdtest: fix trap on start for gfx 9 and 11
Similar to GFX 12, GFX 9 and 11 need to exit without forwarding
the PC.
2025-04-10 14:48:19 -04:00
Eric Huang df6048429c kfdtest: fix max queues on multi-gpu mode
The max queues per process is 1024 in KFD,
KFDQMTest.OverSubscribeCpQueues fails with multi-gpu mode
on more than 15 gpus, because 65x16=1040 exceeds 1024, so
changing MAX_CP_QUEUES to adapt it will fix the issue.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
2025-04-08 12:57:00 -04:00
Eric Huang d3265234e9 kfdtest: fix ptrace error on multi-gpu mode
The parent process can only be ptraced by 1 process
once, to avoid the error we have to add mutex to
synchronize the ptrace call.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
2025-04-08 09:58:28 -04:00
Apurv Mishra 10530fa2a7 kfdtest: support for upstream kernel driver
detect if the loaded driver is upstream or DKMS version and
add a filter for for the tests that fail in upstream driver

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
2025-03-27 16:55:21 -04:00
Jonathan Kim c710a06ee0 kfdtest: fix trap on wave start and end
The debugger override will set the initial request mask to the
previously set request mask so use a different mask to assert
enablement.
Trap on wave start and end also run back to back, so fix the
previous override mask check as well.

In addition, unlike instruction traps, trap on wave start and end
will not require a rewind of the program counter on wave exit.
2025-03-24 20:44:27 -04:00
jordans d4b85b6bf5 hsakmt: Initial Commit for the HSA KMT Model
The over arching goal it so provide an API that pre-silicon models can latch into for software bring up.# Please enter the commit message for your changes. Lines starting
2025-03-18 16:22:17 -04:00
Stella Laurenzo c36ccaaf4b rocr: Search for libnuma with find_package before find_library.
This avoids a false dependence on a system library when not desired.
2025-03-14 08:16:13 -07:00
Emily Deng 42f79776cd kfdtest: Fix the childStatus is 0x7f error for KFDDBGTest.HitMemoryViolation
For the case parent goes faster then child, and child hasn't call the second
raise(SIGSTOP), then parent's "waitpid(childPid, &childStatus, 0)" will return,
and the childStatus will be 0x137f, which is SIGSTOP signal id.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
2025-03-13 13:38:46 +08:00
Emily Deng 91ef44d3ec kfdtest: Fix DeviceSnapshot return fail error for KFDDBGTest.HitMemoryViolation
For the case that the child goes to the second raise(SIGSTOP),
and parent sends PTRACE_CONT, than child exits. Parent will assert at
DeviceSnapshot, as in kfd_ioctl, couldn't get the mm from child pid.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
2025-03-13 13:38:46 +08:00
Apurv Mishra 85c4b0020a kfdtest: limit GFX VRAM allocation to 1/4 sys mem
reduce the allocated memory for GFX VRAM as
KFD Evict test faced intermittent page faults,
which can be due to larger GFX CS BO size
2025-03-12 13:54:04 -04:00
Apurv Mishra de8f8f076d kfdtest: add blacklist for RHEL9 system
add tests for exclusion when running kfdtest
on RHEL9 system, tested with Navi 31

Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>
2025-03-11 16:40:25 -04:00
Longlong Yao 5916467552 libhsakmt: set node_id to 0 for OnlyAddress
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
2025-03-11 10:16:58 -04:00
Amber Lin fcf3f91379 kfdtest: Temporarily blacklist KFDNegativeTest
Blacklist KFDNegativeTest.BasicPipeReset from gfx950 until MEC can
support pipe reset on GC 9.5.0.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2025-03-10 10:37:19 -07:00
Jonathan Kim c879fdefcf kfdtest: Add KFD SDMA queue reset testing
The KFD can per-SDMA queue reset similar to compute queue reset.
Add test.
2025-03-06 14:04:42 -05:00
Jonathan Kim ee890e7d2b kfdtest: Add KFD SDMA queue reset testing
The KFD can per-SDMA queue reset similar to compute queue reset.
Add test.
2025-03-06 14:04:42 -05:00
Jonathan Kim d047708317 kfdtest: Allow user to modify packet size for SDMA write packets
This is primarily used for debug and negative testing for SDMA queue
reset and shouldn't be used for normal run cases.
2025-03-06 14:04:42 -05:00
Jonathan Kim 9e57ce48e8 kfdtest: Add create SDMA queue by target engine
KFD supports SDMA queue creation by target engine.
Enable this for testing.
2025-03-06 14:04:42 -05:00
Jonathan Kim a957b24153 kfdtest: Add SDMA poll memory register packet support
The SDMA can wait on poll user memory.  This is being added to
support per-SDMA queue reset testing.
2025-03-06 14:04:42 -05:00
Jonathan Kim e3d09e30dc hsakmt: Expose per-SDMA queue reset capabilities
Expose new capabilities field that flags per-sdma queue reset
support.
2025-03-06 14:04:42 -05:00
David Belanger 3ceb131df5 kfdtest: Fix ExtendedCuMasking test case
Modify test case to support XL cards.

Change-Id: I6ad45a290d50a5238804ce7417bcdb33a3912872
Signed-off-by: David Belanger <david.belanger@amd.com>
2025-02-27 21:25:19 -05:00
James Zhu f8d8b8011f kfdtest: fix resource leakage
Resource allocated in SetUp/HsaNodeInfo::Init,
needs be delete in TearDown/HsaNodeInfo::Delete.

Signed-off-by: James Zhu <James.Zhu@amd.com>
2025-02-24 19:38:59 -05:00
Longlong Yao 26f001d3cb libhsakmt: allocate va in host path
Change-Id: I40a4395aca99ea8dfd8ff0ecde64eb2c3840d867
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
2025-02-15 07:56:45 -05:00
Harish Kasiviswanathan 2a64fa5e06 libhsakmt: gfx950: Add option to enable HIGH_PRECISION
Environment variable HSA_HIGH_PRECISION_MODE can be used to control MFMA
precision

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: Ib78dd9dd8867025e090a3cca96ab6db4f65dea12
2025-02-10 16:05:25 -05:00
sonadeem ff01f62777 cmake: Fix BUILD_SHARED_LIBS option and README for it
BUILD_SHARED_LIBS is a global flag so we don't need to set a default
option for it in both libhsakmt and hsa-runtime, only the top level
CMakeLists file. Also updated README to reflect that libhsakmt is
always built statically and gets linked to libhsa-runtime.

Change-Id: I1511f68a268032bec9758bc731d8074f33ec980f
2025-01-30 14:17:27 -05:00
David Belanger f24d789dee kfdtest: Convert ExtendedCuMask test to multi-GPU framework
Convert test to use multi-GPU framework.

Add mutex to fix intermixed log issue and annotate logging with
gpu node number.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Ic2beeadb1eb4b5a9a0710ac1dbd60b9bf1d84c33
2025-01-30 11:41:00 -05:00
Sv. Lockal 5d04bd42f3 Fix build issues for musl libc (#267)
Change-Id: Ia31330b0f96669966712b58986abeca754c2cbb9
2025-01-29 14:31:05 +00:00
Lang Yu d159b29dc6 kfdtest: update AtomicIncIsa for gfx12
"s_waitcnt 0" (deprecated in gfx12) is redundant here.

s_endpgm will wait for all outstanding instructions
to complete before executing.

Change-Id: Ia8b4dd0fd8dd713e7ba2cba9db85b7b12cee1dd4
Signed-off-by: Lang Yu <lang.yu@amd.com>
2025-01-28 20:32:41 -05:00
James Zhu 9509af4b98 libhsakmt: increase default svm.alignment_order
Since GFX950 can support page table fragment up to 18 without
performance loss. So set GFX950  default svm.alignment_order to 18.

Change-Id: Ibcdb7f041fb07a38e924c471beec261ea227ca1d
Signed-off-by: James Zhu <James.Zhu@amd.com>
2025-01-28 08:27:19 -05:00
Amber Lin 0b6e457201 kfdtest: Create gfx950 blacklist
This patch creates the blacklist for gfx950 by copying gfx942 but adding
KFDGWSTest.Semaphore as GWS support is completely removed from gfx950.

Change-Id: I5d7c17e57b8cfd9fae63780ecc9dd55662cfdade
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2025-01-28 08:26:44 -05:00
Lancelot Six 76052ba028 libhsakmt: gfx950 uses same VGPR block size as gfx940
Make sure to use allocate the same amount of size for VGPR data in
gfx950 as it is done for gfx940.

Change-Id: I6a0820996389627ccbdfef856e5150c46fac92a1
Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
2025-01-27 14:06:42 -05:00
Lancelot Six c51aa0d155 libhsakmt: Use the node info to determine LDS size
The CWSR area size needs to take into account the size of LDS each
active workgroup can have.  The current implementation uses a constant
for that.  This patch refactors this to use the HsaNodeProperties of the
device's the CWSR area is for to figure out the size of LDS.

Change-Id: Ib8585b2b7140ec5c99e7b7d62e67f785697c028a
Signed-off-by: Lancelot Six <Lancelot.Six@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2025-01-26 21:46:32 -05:00
Alex Sierra 268054cd28 kfdtest: add support for gfx9.5.0 in shader store
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I48b98ff631bd1aa1a044b60583ff256e43b17423
2025-01-26 21:45:07 -05:00