Commit Graph

172 Commits

Author SHA1 Message Date
David Yat Sin 8982f2c2c6 rocr: Fix compile warning when using clang
[ROCm/ROCR-Runtime commit: 96d0f07b15]
2025-06-12 10:38:58 -04:00
Apurv Mishra 226d8126c9 kfdtest: Disable KFD RAS test case
Disable the KFD RAS test case because the tests cause a GPU reset,
which affects the active kfdtest run. The tests can only be
run successfully as separate processes.

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: d9a95605cc]
2025-05-27 19:04:04 -04:00
Eric Huang 0d5e261f39 libhsakmt: optimize big system buffer allocation
Make the biggest single buffer huge-page aligned,
along with other optimizations.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>


[ROCm/ROCR-Runtime commit: afe7965796]
2025-05-26 18:30:00 -04:00
Eric Huang 2c6f84b12c libhsakmt: add big system buffer allocation support
When allocating a userptr buffer in system RAM with a size greater
than or equal to 512G, TTM hits a limit and returns an error. Splitting
one big buffer into multiple small buffers in vm_object solves
this issue.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>


[ROCm/ROCR-Runtime commit: 8887d25304]
2025-05-26 11:04:30 -04:00
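
The splitting idea described in this commit can be sketched as follows. This is a minimal illustration, not the libhsakmt implementation: the function name `split_buffer` and the 256 GiB per-chunk limit are assumptions made for the example. The commit message itself only states that allocations of 512G or more fail.

```python
GIB = 1 << 30


def split_buffer(start, size, max_chunk=256 * GIB):
    """Split [start, start + size) into consecutive (address, length) chunks.

    max_chunk is a hypothetical per-chunk limit chosen for illustration;
    it is not a value taken from libhsakmt.
    """
    if size <= 0 or max_chunk <= 0:
        raise ValueError("size and max_chunk must be positive")
    chunks = []
    offset = 0
    while offset < size:
        # The last chunk may be shorter than max_chunk.
        length = min(max_chunk, size - offset)
        chunks.append((start + offset, length))
        offset += length
    return chunks
```

Each chunk stays below the assumed limit, and the chunks together cover the original range without gaps or overlap.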
Amber Lin 9c6828647b kfdtest: blacklist KFDSVMEvictTest.QueueTest
Temporarily blacklist KFDSVMEvictTest.QueueTest on gfx950

Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 31d51acb26]
2025-05-23 01:22:11 -04:00
Ramakrishnan, Ranjith 85cd72987f CMake: Remove file reorganization backward compatibility code (#176)
The feature has already been disabled, and the related source code is no longer required

[ROCm/ROCR-Runtime commit: 1785cff6a5]
2025-05-22 09:47:26 -07:00
Philip Yang 4ac71d1f5d kfdtest: Add KFDQMTest UserQueueBufValidation
Creating a CP queue or an SDMA queue should fail with an invalid queue ring
buffer or ring buffer size.

Unmapping or freeing queue buffers should fail before the queue is destroyed.

Use a child process to test that unmapping the CWSR buffer evicts the queue.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I5dcd51d6b43445d19a986f8b0b82063e20348a5f


[ROCm/ROCR-Runtime commit: bd86fb1e63]
2025-05-22 10:06:42 -04:00
Philip Yang 50886316e9 libhsakmt: unmap from GPU error handling
If unmap from GPU fails, for example when unmapping a user queue buffer
while the queue is active, we should not free obj->mapped_node_id_array.
Otherwise, a later unmap of the user queue buffer after the queue is
destroyed still fails.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I32aeb18871c2e971d01900d92916c54680f5c9fa


[ROCm/ROCR-Runtime commit: 3e6f51b715]
2025-05-22 10:06:42 -04:00
Apurv Mishra 5c42a9f1bf kfdtest: Disable tests that cause unwanted behavior
Disable KFDLocalMemoryTest.Fragmentation and
KFDEventTest.MeasureInterruptConsumption as
part of the KFD test suite improvement feature

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: f853dda9ba]
2025-05-21 16:29:15 -04:00
Ben Vanik ba02a7b1ca kfdtest: Fix SVM profiler QUEUE_RESTORE parsing
[ROCm/ROCR-Runtime commit: d54124383f]
2025-05-21 13:17:25 -04:00
Searles, Mark f698518819 Update createMCObjectStreamer() to use new LLVM API (#156) (#157)
* Update createMCObjectStreamer() to use new LLVM API

Obsolete interfaces were removed via llvm-project's
f2ff298867d7733122e32eead5a8c524b09dfdb1

* Fix typo: LLVM_VERSION -> LLVM_VERSION_MAJOR

* Fix typo

[ROCm/ROCR-Runtime commit: ac1e6d59c2]
2025-05-05 13:18:05 -07:00
Apurv Mishra aa896090f8 kfdtest: Update ROCr homepage in CMakeLists.txt
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: aa0a32a166]
2025-05-01 11:22:49 -04:00
Amber Lin 9d98d7479d kfdtest: Skip SVMEvict with xnack=0
A random driver deadlock in svm_range_evict_svm_bo_worker() is observed in
NPS2/DPX mode. It is seen with xnack off and happens more often on the
partition with less VRAM because of TMR.

Temporarily skip SVM Evict tests on Family AV when xnack is disabled.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 5e28208cec]
2025-04-25 12:45:36 -04:00
Tony Gutierrez 6f37386eb2 rocr: Flags to alloc queue buf/struct in dev mem
This builds on a prior change that allowed for allocating
a user-mode queue's packet buffer in device memory to also
allocate the queue struct in device memory. This provides
additional latency benefits particularly for cases where
dispatches are performed from the GPU itself. Flags are
added to support the various use cases.


[ROCm/ROCR-Runtime commit: 6e3c375bf1]
2025-04-23 15:53:29 -04:00
Amber Lin bf3bb1f1a1 Revert "kfdtest: Temporarily blacklist KFDNegativeTest"
This reverts commit fffdffc3ce.

MEC v18 starts to support pipe reset


[ROCm/ROCR-Runtime commit: bdb6e43b54]
2025-04-21 14:14:10 -04:00
Jonathan Kim a595c0bd25 kfdtest: fix trap on start for gfx 9 and 11
Similar to GFX 12, GFX 9 and 11 need to exit without forwarding
the PC.


[ROCm/ROCR-Runtime commit: 4c3a0698f8]
2025-04-10 14:48:19 -04:00
Eric Huang 13cdca7fb3 kfdtest: fix max queues on multi-gpu mode
The maximum number of queues per process is 1024 in KFD.
KFDQMTest.OverSubscribeCpQueues fails in multi-GPU mode
on more than 15 GPUs because 65x16=1040 exceeds 1024, so
adapting MAX_CP_QUEUES accordingly fixes the issue.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>


[ROCm/ROCR-Runtime commit: df6048429c]
2025-04-08 12:57:00 -04:00
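
The arithmetic behind this fix can be checked with a short sketch. The 1024 limit and the 65x16 figure come from the commit message; the helper names `total_queues` and `max_queues_per_gpu` are hypothetical, and the actual way kfdtest adapts MAX_CP_QUEUES is not shown in this log.

```python
KFD_MAX_QUEUES_PER_PROCESS = 1024  # per-process limit stated in the commit message


def total_queues(queues_per_gpu, num_gpus):
    """Queues a single process creates when every GPU gets the same count."""
    return queues_per_gpu * num_gpus


def max_queues_per_gpu(num_gpus, limit=KFD_MAX_QUEUES_PER_PROCESS):
    """Largest per-GPU queue count that keeps the process within the limit.

    Hypothetical helper for illustration; not the kfdtest implementation.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return limit // num_gpus
```

With 16 GPUs, 65 queues per GPU gives 1040 queues in total, which is over the limit, while a per-GPU cap of 1024 // 16 = 64 fits exactly.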
Eric Huang 9055cf8092 kfdtest: fix ptrace error on multi-gpu mode
The parent process can be ptraced by only one process
at a time. To avoid the error, add a mutex to
synchronize the ptrace call.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>


[ROCm/ROCR-Runtime commit: d3265234e9]
2025-04-08 09:58:28 -04:00
Apurv Mishra b490aec8e6 kfdtest: support for upstream kernel driver
Detect whether the loaded driver is the upstream or DKMS version and
add a filter for the tests that fail with the upstream driver

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: 10530fa2a7]
2025-03-27 16:55:21 -04:00
Jonathan Kim 20d9a9a15a kfdtest: fix trap on wave start and end
The debugger override will set the initial request mask to the
previously set request mask so use a different mask to assert
enablement.
Trap on wave start and end also run back to back, so fix the
previous override mask check as well.

In addition, unlike instruction traps, trap on wave start and end
will not require a rewind of the program counter on wave exit.


[ROCm/ROCR-Runtime commit: c710a06ee0]
2025-03-24 20:44:27 -04:00
jordans 938b34da24 hsakmt: Initial Commit for the HSA KMT Model
The overarching goal is to provide an API that pre-silicon models can latch into for software bring-up.


[ROCm/ROCR-Runtime commit: d4b85b6bf5]
2025-03-18 16:22:17 -04:00
Stella Laurenzo 5a3b9a1fdf rocr: Search for libnuma with find_package before find_library.
This avoids a false dependence on a system library when not desired.


[ROCm/ROCR-Runtime commit: c36ccaaf4b]
2025-03-14 08:16:13 -07:00
Emily Deng af293c4a61 kfdtest: Fix the childStatus is 0x7f error for KFDDBGTest.HitMemoryViolation
When the parent runs faster than the child and the child hasn't yet called the second
raise(SIGSTOP), the parent's "waitpid(childPid, &childStatus, 0)" returns
and childStatus is 0x137f, which encodes the SIGSTOP signal ID.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>


[ROCm/ROCR-Runtime commit: 42f79776cd]
2025-03-13 13:38:46 +08:00
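
To see why 0x137f corresponds to SIGSTOP, the status word can be decoded by hand. The sketch below assumes the common Linux/glibc wait-status layout, where a low byte of 0x7f marks a stopped child and the next byte holds the stopping signal, and it assumes SIGSTOP is 19, as on x86 and ARM Linux. The helper names are hypothetical and only mirror what the C macros WIFSTOPPED and WSTOPSIG compute.

```python
def is_stopped(status):
    """True if the wait status describes a stopped (not exited) child."""
    return (status & 0xFF) == 0x7F


def stop_signal(status):
    """Signal number that stopped the child, taken from bits 8..15."""
    if not is_stopped(status):
        raise ValueError("status does not describe a stopped child")
    return (status >> 8) & 0xFF
```

For 0x137f, the low byte is 0x7f (stopped) and the next byte is 0x13, which is 19 in decimal.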
Emily Deng 46bb10ff2d kfdtest: Fix DeviceSnapshot return fail error for KFDDBGTest.HitMemoryViolation
When the child reaches the second raise(SIGSTOP)
and the parent sends PTRACE_CONT, the child then exits. The parent will assert at
DeviceSnapshot because kfd_ioctl cannot get the mm from the child pid.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>


[ROCm/ROCR-Runtime commit: 91ef44d3ec]
2025-03-13 13:38:46 +08:00
Apurv Mishra 1e279a19c3 kfdtest: limit GFX VRAM allocation to 1/4 sys mem
Reduce the memory allocated for GFX VRAM, as the
KFD Evict test faced intermittent page faults,
which may be due to the larger GFX CS BO size


[ROCm/ROCR-Runtime commit: 85c4b0020a]
2025-03-12 13:54:04 -04:00
Apurv Mishra 77f4bbfdf1 kfdtest: add blacklist for RHEL9 system
Add tests to exclude when running kfdtest
on a RHEL9 system. Tested with Navi 31.

Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>


[ROCm/ROCR-Runtime commit: de8f8f076d]
2025-03-11 16:40:25 -04:00
Longlong Yao ef1740b88b libhsakmt: set node_id to 0 for OnlyAddress
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>


[ROCm/ROCR-Runtime commit: 5916467552]
2025-03-11 10:16:58 -04:00
Amber Lin fffdffc3ce kfdtest: Temporarily blacklist KFDNegativeTest
Blacklist KFDNegativeTest.BasicPipeReset from gfx950 until MEC can
support pipe reset on GC 9.5.0.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: fcf3f91379]
2025-03-10 10:37:19 -07:00
Jonathan Kim 8cbb23183c kfdtest: Add KFD SDMA queue reset testing
The KFD can perform per-SDMA queue reset, similar to compute queue reset.
Add a test.


[ROCm/ROCR-Runtime commit: c879fdefcf]
2025-03-06 14:04:42 -05:00
Jonathan Kim 36c69a6cff kfdtest: Add KFD SDMA queue reset testing
The KFD can perform per-SDMA queue reset, similar to compute queue reset.
Add a test.


[ROCm/ROCR-Runtime commit: ee890e7d2b]
2025-03-06 14:04:42 -05:00
Jonathan Kim 06b2c3aeb6 kfdtest: Allow user to modify packet size for SDMA write packets
This is primarily used for debug and negative testing for SDMA queue
reset and shouldn't be used for normal run cases.


[ROCm/ROCR-Runtime commit: d047708317]
2025-03-06 14:04:42 -05:00
Jonathan Kim 297e8f729e kfdtest: Add create SDMA queue by target engine
KFD supports SDMA queue creation by target engine.
Enable this for testing.


[ROCm/ROCR-Runtime commit: 9e57ce48e8]
2025-03-06 14:04:42 -05:00
Jonathan Kim 303cdb8f7e kfdtest: Add SDMA poll memory register packet support
The SDMA can poll user memory and wait on it. This is being added to
support per-SDMA queue reset testing.


[ROCm/ROCR-Runtime commit: a957b24153]
2025-03-06 14:04:42 -05:00
Jonathan Kim 599a20ee2d hsakmt: Expose per-SDMA queue reset capabilities
Expose new capabilities field that flags per-sdma queue reset
support.


[ROCm/ROCR-Runtime commit: e3d09e30dc]
2025-03-06 14:04:42 -05:00
David Belanger 2c11a41adc kfdtest: Fix ExtendedCuMasking test case
Modify test case to support XL cards.

Change-Id: I6ad45a290d50a5238804ce7417bcdb33a3912872
Signed-off-by: David Belanger <david.belanger@amd.com>


[ROCm/ROCR-Runtime commit: 3ceb131df5]
2025-02-27 21:25:19 -05:00
James Zhu b42578b070 kfdtest: fix resource leakage
Resources allocated in SetUp/HsaNodeInfo::Init
need to be deleted in TearDown/HsaNodeInfo::Delete.

Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: f8d8b8011f]
2025-02-24 19:38:59 -05:00
Longlong Yao 082c6b7830 libhsakmt: allocate va in host path
Change-Id: I40a4395aca99ea8dfd8ff0ecde64eb2c3840d867
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>


[ROCm/ROCR-Runtime commit: 26f001d3cb]
2025-02-15 07:56:45 -05:00
Harish Kasiviswanathan 729f98b05f libhsakmt: gfx950: Add option to enable HIGH_PRECISION
Environment variable HSA_HIGH_PRECISION_MODE can be used to control MFMA
precision

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: Ib78dd9dd8867025e090a3cca96ab6db4f65dea12


[ROCm/ROCR-Runtime commit: 2a64fa5e06]
2025-02-10 16:05:25 -05:00
sonadeem 02edf09f87 cmake: Fix BUILD_SHARED_LIBS option and README for it
BUILD_SHARED_LIBS is a global flag so we don't need to set a default
option for it in both libhsakmt and hsa-runtime, only the top level
CMakeLists file. Also updated README to reflect that libhsakmt is
always built statically and gets linked to libhsa-runtime.

Change-Id: I1511f68a268032bec9758bc731d8074f33ec980f


[ROCm/ROCR-Runtime commit: ff01f62777]
2025-01-30 14:17:27 -05:00
David Belanger 75a060fc53 kfdtest: Convert ExtendedCuMask test to multi-GPU framework
Convert test to use multi-GPU framework.

Add mutex to fix intermixed log issue and annotate logging with
gpu node number.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Ic2beeadb1eb4b5a9a0710ac1dbd60b9bf1d84c33


[ROCm/ROCR-Runtime commit: f24d789dee]
2025-01-30 11:41:00 -05:00
Sv. Lockal d1507361ec Fix build issues for musl libc (#267)
Change-Id: Ia31330b0f96669966712b58986abeca754c2cbb9


[ROCm/ROCR-Runtime commit: 5d04bd42f3]
2025-01-29 14:31:05 +00:00
Lang Yu 85125b1054 kfdtest: update AtomicIncIsa for gfx12
"s_waitcnt 0" (deprecated in gfx12) is redundant here.

s_endpgm will wait for all outstanding instructions
to complete before executing.

Change-Id: Ia8b4dd0fd8dd713e7ba2cba9db85b7b12cee1dd4
Signed-off-by: Lang Yu <lang.yu@amd.com>


[ROCm/ROCR-Runtime commit: d159b29dc6]
2025-01-28 20:32:41 -05:00
James Zhu 647b705679 libhsakmt: increase default svm.alignment_order
GFX950 can support page table fragments up to 18 without
performance loss, so set the GFX950 default svm.alignment_order to 18.

Change-Id: Ibcdb7f041fb07a38e924c471beec261ea227ca1d
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: 9509af4b98]
2025-01-28 08:27:19 -05:00
Amber Lin e262729f6f kfdtest: Create gfx950 blacklist
This patch creates the blacklist for gfx950 by copying gfx942 but adding
KFDGWSTest.Semaphore as GWS support is completely removed from gfx950.

Change-Id: I5d7c17e57b8cfd9fae63780ecc9dd55662cfdade
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 0b6e457201]
2025-01-28 08:26:44 -05:00
Lancelot Six a95209dde6 libhsakmt: gfx950 uses same VGPR block size as gfx940
Make sure to allocate the same size for VGPR data in
gfx950 as is done for gfx940.

Change-Id: I6a0820996389627ccbdfef856e5150c46fac92a1
Signed-off-by: Lancelot SIX <lancelot.six@amd.com>


[ROCm/ROCR-Runtime commit: 76052ba028]
2025-01-27 14:06:42 -05:00
Lancelot Six c7b1fd714e libhsakmt: Use the node info to determine LDS size
The CWSR area size needs to take into account the size of LDS each
active workgroup can have. The current implementation uses a constant
for that. This patch refactors this to use the HsaNodeProperties of the
device the CWSR area is for to determine the size of LDS.

Change-Id: Ib8585b2b7140ec5c99e7b7d62e67f785697c028a
Signed-off-by: Lancelot Six <Lancelot.Six@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: c51aa0d155]
2025-01-26 21:46:32 -05:00
Alex Sierra da483d7588 kfdtest: add support for gfx9.5.0 in shader store
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I48b98ff631bd1aa1a044b60583ff256e43b17423


[ROCm/ROCR-Runtime commit: 268054cd28]
2025-01-26 21:45:07 -05:00
Alex Sierra 840a613723 kfdtest: Add gfx 9.5 as FAMILY_AV
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Ib5696eee1d4f64c9c87d714eae7c80fbbd1e2b23


[ROCm/ROCR-Runtime commit: e94ff8a36c]
2025-01-26 21:43:55 -05:00
Tony Gutierrez e82871f20b rocr: Generalize driver discovery
Generalize the driver discovery and move driver-specific
functionality to the concrete driver implementations.
Currently, this process is tightly coupled to the hsakmt
which is GPU and OS specific.

Change-Id: Ie1c53fef407a71b5ec4c6eaf3a3ed00871184409


[ROCm/ROCR-Runtime commit: 15107afb11]
2025-01-23 15:09:14 -05:00
Shweta Khatri c5822cee5a Revert "Revert "hsakmt: Only set exec flag when requested""
This reverts commit 5a8092bccf.

Reason for revert: This restores change ID Id1154f08f6ba21c633905fd46b06053994d6f3cc in the ROCR repo, which prevents memory allocations from being automatically granted the 'executable' flag, addressing previously incorrect and unsafe behavior in the ROCm driver.

Change-Id: I3d45c45859929a80f7791681b411251e099a1901


[ROCm/ROCR-Runtime commit: 2d4a578020]
2025-01-23 09:08:25 -05:00