122 Commits

Author SHA1 Message Date
Alysa Liu 13091e18ad libhsakmt: Add THEROCK_SANITIZER support for ASAN builds (#2978)
Add THEROCK_SANITIZER support for ASAN builds.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2026-01-30 10:02:10 -05:00
Alysa Liu 5be4fddf06 kfdtest: Support blit kernel copy (#677)
Add support for blit kernel copy.
Add GpuMemCopyTest test for KFDQMTest.
2026-01-07 16:48:11 -05:00
Maneesh Gupta 4a9833e70e Revert "Add HasExpertSchedMode device prop (#2241)" (#2371)
This reverts commit c0b4aef5ad.
2025-12-17 21:26:44 -08:00
Filip Jankovic c0b4aef5ad Add HasExpertSchedMode device prop (#2241)
* Add HasExpertSchedMode device prop

* Add unit tests for HasExpertSchedMode

* Add gfx12 check for HasExpertSchedMode prop

* Update gfx major version check and test for ExpertSchedMode

* Minor fix and ROCr version bump

* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h

* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h

* Apply suggestion from @dayatsin-amd

* Apply suggestion from @dayatsin-amd

---------

Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>
Co-authored-by: David Yat Sin <77975354+dayatsin-amd@users.noreply.github.com>
2025-12-17 17:06:08 +01:00
Alysa Liu 3a7b5571c0 kfdtest: Replace pthread with std::thread (#1448)
* kfdtest: Replace pthread with std::thread

Modify concurrent kfdtest to use std::thread
instead of pthread, eventually modify KFDTestLaunch
to take in a member function of test instance
instead of static function.

Convert KFDQMTest to pass in member function for
multi-gpu kfdtest.

* kfdtest: Convert KFDPerfCountersTest to use std::thread

Convert KFDPerfCountersTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDGraphicsInterop to use std::thread

Convert KFDGraphicsInterop to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDGWSTest to use std::thread

Convert KFDGWSTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDCWSRTest to use std::thread

Convert KFDCWSRTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDEventTest to use std::thread

Convert KFDEventTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDExceptionTest to use std::thread

Convert KFDExceptionTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDLocalMemoryTest to use std::thread

Convert KFDLocalMemoryTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDMemoryTest to use std::thread

Convert KFDMemoryTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDSVMRangeTest to use std::thread

Convert KFDSVMRangeTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Convert KFDHWSTest to use std::thread

Convert KFDHWSTest to use std::thread for
multi-gpu kfdtest.

* kfdtest: Remove pthread multigpu test structure

Remove older multi-gpu test framework which
uses pthread.
2025-12-02 10:25:21 -05:00
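The member-function launch pattern described in commit 3a7b5571c0 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the kfdtest code: `FakeTest`, `RunOnNode`, and `LaunchOnNodes` are hypothetical names, and the real KFDTestLaunch signature does not appear in this log.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a test fixture; not a real kfdtest class.
class FakeTest {
public:
    explicit FakeTest(std::size_t nodeCount) : results_(nodeCount, 0) {}

    // Per-GPU body. Each thread writes only its own slot, so no lock is needed.
    void RunOnNode(int gpuNode) {
        results_.at(static_cast<std::size_t>(gpuNode)) = gpuNode * 2;
    }

    const std::vector<int>& results() const { return results_; }

private:
    std::vector<int> results_;
};

// Hypothetical launcher: runs a member function of one test instance
// on one std::thread per GPU node, then joins all threads.
template <typename T>
void LaunchOnNodes(T* test, void (T::*body)(int), const std::vector<int>& nodes) {
    std::vector<std::thread> threads;
    threads.reserve(nodes.size());
    for (int node : nodes) {
        threads.emplace_back(body, test, node);
    }
    for (std::thread& t : threads) {
        t.join();
    }
}
```

A pthread entry point must be a static or free function taking `void*`, whereas `std::thread` accepts the member-function pointer and the instance directly, which is what lets the per-GPU body live on the test instance.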
andmar-amd da6e939c6c Disable PCSampling on upstream branches (#1421)
- PC Sampling ioctls/tests are not up-streamed. They should be skipped
   for any and all upstream branches.
2025-11-19 14:15:40 -08:00
andmar-amd 70fc774ad0 Disable KFDDBGTest.HitMemoryViolation for navi 10 (#1423)
- Filter out KFDDBGTest.HitMemoryViolation for navi10, which is
   currently failing
2025-11-19 14:15:05 -08:00
andmar-amd 2b4d17078a Improve test script logic and error handling (#1424)
- Fix exclude+gtest_filter logic
 - Improve error handling when detecting upstream branches
2025-11-19 14:14:40 -08:00
David Yat Sin de3b7322f2 rocr/hsakmt: Fix asan compile errors - KFDQMTest (#1638)
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-07 14:52:36 -05:00
systems-assistant[bot] 740b27528f kfdtest: Enable GPU selection via CLI for multi-GPU tests (#245)
* kfdtest: Enable GPU selection via CLI for multi-GPU tests

Replaced environment variable-based GPU selection with
GPU selection via command-line parameter --concurrentnodes (-c)
Modified g_TestGPUsNum to be passed in via command-line
parameter --testnodenum (t)

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>

---------

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
Co-authored-by: Alysa Liu <Alysa.Liu@amd.com>
2025-11-03 09:27:38 -05:00
David Bélanger 02294e3852 kfdtest: Fix ExtendedCuMasking on GPUs with inactive CUs (#726)
Modify the code that computes the adjusted CU mask array to account
for additional cases of inactive CUs.

Signed-off-by: David Belanger <david.belanger@amd.com>
2025-10-17 08:26:12 -07:00
Alysa Liu 2b2b8329b5 rocr: Add copyright for new files (#886)
Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-09-11 10:56:31 -04:00
Kent Russell 991f72bb9f kfdtest: Remove gfx940/941 references
Support was removed for these eng samples, so remove them from the
blacklist, and make sure that we're using 942 for the shader store


[ROCm/ROCR-Runtime commit: f755981f03]
2025-07-22 08:47:34 -04:00
Apurv Mishra 6c89d61cef kfdtest: Temporarily blacklist KFDEvictTest suite
blacklist the KFDEvictTest suite until the defects
SWDEV 535386 and 537002, where these test cases fail
inconsistently, are fixed

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: 3115384874]
2025-07-04 11:47:20 -04:00
Apurv Mishra 226d8126c9 kfdtest: Disable KFD RAS test case
disable KFD RAS test case as the tests cause GPU reset
which affects the active kfdtest; the tests can only be
run successfully as separate processes

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: d9a95605cc]
2025-05-27 19:04:04 -04:00
Amber Lin 9c6828647b kfdtest: blacklist KFDSVMEvictTest.QueueTest
Temporarily blacklist KFDSVMEvictTest.QueueTest on gfx950

Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 31d51acb26]
2025-05-23 01:22:11 -04:00
Philip Yang 4ac71d1f5d kfdtest: Add KFDQMTest UserQueueBufValidation
Create CP queue and SDMA queue should fail with invalid queue ring
buffer or ring buffer size.

Test unmap or free queue buffers should fail before queue is destroyed.

Use child process to test unmap CWSR buffer will evict queue.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I5dcd51d6b43445d19a986f8b0b82063e20348a5f


[ROCm/ROCR-Runtime commit: bd86fb1e63]
2025-05-22 10:06:42 -04:00
Apurv Mishra 5c42a9f1bf kfdtest: Disable tests that cause unwanted behavior
disable KFDLocalMemoryTest.Fragmentation and
KFDEventTest.MeasureInterruptConsumption as
part of the KFD test suite improvement feature

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: f853dda9ba]
2025-05-21 16:29:15 -04:00
Ben Vanik ba02a7b1ca kfdtest: Fix SVM profiler QUEUE_RESTORE parsing
[ROCm/ROCR-Runtime commit: d54124383f]
2025-05-21 13:17:25 -04:00
Searles, Mark f698518819 Update createMCObjectStreamer() to use new LLVM API (#156) (#157)
* Update createMCObjectStreamer() to use new LLVM API

Obsolete interfaces were removed via llvm-project's
f2ff298867d7733122e32eead5a8c524b09dfdb1

* Fix typo: LLVM_VERSION -> LLVM_VERSION_MAJOR

* Fix typo

[ROCm/ROCR-Runtime commit: ac1e6d59c2]
2025-05-05 13:18:05 -07:00
Apurv Mishra aa896090f8 kfdtest: Update ROCr homepage in CMakeLists.txt
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: aa0a32a166]
2025-05-01 11:22:49 -04:00
Amber Lin 9d98d7479d kfdtest: Skip SVMEvict with xnack=0
Random driver deadlock on svm_range_evict_svm_bo_worker() is observed on
NPS2/DPX mode. It's seen with xnack off and happens more often on the
partition with less VRAM because of TMR.

Temporarily skip SVM Evict tests on Family AV when xnack is disabled.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 5e28208cec]
2025-04-25 12:45:36 -04:00
Amber Lin bf3bb1f1a1 Revert "kfdtest: Temporarily blacklist KFDNegativeTest"
This reverts commit fffdffc3ce.

MEC v18 starts to support pipe reset


[ROCm/ROCR-Runtime commit: bdb6e43b54]
2025-04-21 14:14:10 -04:00
Jonathan Kim a595c0bd25 kfdtest: fix trap on start for gfx 9 and 11
Similar to GFX 12, GFX 9 and 11 need to exit without forwarding
the PC.


[ROCm/ROCR-Runtime commit: 4c3a0698f8]
2025-04-10 14:48:19 -04:00
Eric Huang 13cdca7fb3 kfdtest: fix max queues on multi-gpu mode
KFD allows at most 1024 queues per process. In multi-gpu mode,
KFDQMTest.OverSubscribeCpQueues fails on more than 15 gpus
because 65x16=1040 exceeds 1024. Changing MAX_CP_QUEUES to adapt
to this limit fixes the issue.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>


[ROCm/ROCR-Runtime commit: df6048429c]
2025-04-08 12:57:00 -04:00
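The arithmetic behind commit 13cdca7fb3 can be checked with a short sketch. Only the 1024-queue limit and the 65 x 16 figure come from the commit message; the helper names are hypothetical, and dividing the limit evenly across GPUs is an assumption about one possible fix, not necessarily how MAX_CP_QUEUES was actually changed.

```cpp
#include <cassert>

// Per-process queue limit quoted in the commit message.
constexpr unsigned kKfdMaxQueuesPerProcess = 1024;

// Hypothetical helper: total queues requested when every GPU
// creates the same number of queues in one process.
constexpr unsigned TotalQueues(unsigned queuesPerGpu, unsigned gpuCount) {
    return queuesPerGpu * gpuCount;
}

// Hypothetical helper (assumed policy): largest per-GPU queue count
// that keeps the process total within the limit.
inline unsigned QueuesPerGpuWithinLimit(unsigned gpuCount) {
    assert(gpuCount > 0);
    return kKfdMaxQueuesPerProcess / gpuCount;
}

// 16 GPUs x 65 queues overshoots the limit; 15 GPUs x 65 queues fits.
static_assert(TotalQueues(65, 16) == 1040, "65 x 16 = 1040");
static_assert(TotalQueues(65, 16) > kKfdMaxQueuesPerProcess, "exceeds 1024");
static_assert(TotalQueues(65, 15) <= kKfdMaxQueuesPerProcess, "975 fits");
```

Under this assumed policy, a 16-GPU run would be capped at 64 queues per GPU.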
Eric Huang 9055cf8092 kfdtest: fix ptrace error on multi-gpu mode
The parent process can be ptraced by only one process at a
time; to avoid the error, add a mutex to synchronize the
ptrace call.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>


[ROCm/ROCR-Runtime commit: d3265234e9]
2025-04-08 09:58:28 -04:00
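The serialization idea in commit 9055cf8092 might look roughly like the sketch below. It is an illustration only: `PtraceMutex` and `WithPtraceLock` are hypothetical names, and the log does not show where kfdtest actually places its lock or what the guarded sequence contains.

```cpp
#include <mutex>
#include <utility>

// Hypothetical process-wide lock shared by all test threads.
inline std::mutex& PtraceMutex() {
    static std::mutex m;
    return m;
}

// Runs `body` while holding the lock, so that only one thread at a
// time performs its ptrace sequence. The lock is released when the
// guard goes out of scope, including when `body` throws.
template <typename F>
auto WithPtraceLock(F&& body) -> decltype(body()) {
    std::lock_guard<std::mutex> guard(PtraceMutex());
    return std::forward<F>(body)();
}
```

Each multi-GPU test thread would wrap its ptrace-related section in `WithPtraceLock`, so the sections run one after another instead of concurrently.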
Apurv Mishra b490aec8e6 kfdtest: support for upstream kernel driver
detect whether the loaded driver is the upstream or DKMS version and
add a filter for the tests that fail with the upstream driver

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>


[ROCm/ROCR-Runtime commit: 10530fa2a7]
2025-03-27 16:55:21 -04:00
Jonathan Kim 20d9a9a15a kfdtest: fix trap on wave start and end
The debugger override will set the initial request mask to the
previously set request mask so use a different mask to assert
enablement.
Trap on wave start and end also run back to back, so fix the
previous override mask check as well.

In addition, unlike instruction traps, trap on wave start and end
will not require a rewind of the program counter on wave exit.


[ROCm/ROCR-Runtime commit: c710a06ee0]
2025-03-24 20:44:27 -04:00
Emily Deng af293c4a61 kfdtest: Fix the childStatus is 0x7f error for KFDDBGTest.HitMemoryViolation
When the parent runs faster than the child and the child hasn't yet called the
second raise(SIGSTOP), the parent's "waitpid(childPid, &childStatus, 0)" returns
with a childStatus of 0x137f, which carries the SIGSTOP signal id.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>


[ROCm/ROCR-Runtime commit: 42f79776cd]
2025-03-13 13:38:46 +08:00
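The 0x137f value mentioned in commit af293c4a61 can be decoded with the standard wait-status macros. The sketch below is not taken from kfdtest; `StoppedBySigstop` is a hypothetical helper, and the numeric check assumes Linux on x86-64 or a similar ABI where SIGSTOP is 19 (0x13).

```cpp
#include <csignal>
#include <sys/wait.h>

// Hypothetical helper: true when a waitpid() status reports that the
// child is stopped (not exited) and the stopping signal is SIGSTOP.
inline bool StoppedBySigstop(int status) {
    return WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP;
}
```

For 0x137f, the low byte 0x7f marks a stopped child and the next byte 0x13 is the stop signal number, so on the assumed platform the helper returns true for that status and false for a normal exit status of 0.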
Emily Deng 46bb10ff2d kfdtest: Fix DeviceSnapshot return fail error for KFDDBGTest.HitMemoryViolation
When the child reaches the second raise(SIGSTOP) and the parent sends
PTRACE_CONT, the child then exits. The parent will assert at
DeviceSnapshot because kfd_ioctl couldn't get the mm from the child pid.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>


[ROCm/ROCR-Runtime commit: 91ef44d3ec]
2025-03-13 13:38:46 +08:00
Apurv Mishra 1e279a19c3 kfdtest: limit GFX VRAM allocation to 1/4 sys mem
reduce the allocated memory for GFX VRAM as
KFD Evict test faced intermittent page faults,
which can be due to larger GFX CS BO size


[ROCm/ROCR-Runtime commit: 85c4b0020a]
2025-03-12 13:54:04 -04:00
Apurv Mishra 77f4bbfdf1 kfdtest: add blacklist for RHEL9 system
add tests for exclusion when running kfdtest
on RHEL9 system, tested with Navi 31

Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>


[ROCm/ROCR-Runtime commit: de8f8f076d]
2025-03-11 16:40:25 -04:00
Amber Lin fffdffc3ce kfdtest: Temporarily blacklist KFDNegativeTest
Blacklist KFDNegativeTest.BasicPipeReset from gfx950 until MEC can
support pipe reset on GC 9.5.0.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: fcf3f91379]
2025-03-10 10:37:19 -07:00
Jonathan Kim 8cbb23183c kfdtest: Add KFD SDMA queue reset testing
The KFD can reset an individual SDMA queue, similar to compute queue reset.
Add test.


[ROCm/ROCR-Runtime commit: c879fdefcf]
2025-03-06 14:04:42 -05:00
Jonathan Kim 36c69a6cff kfdtest: Add KFD SDMA queue reset testing
The KFD can reset an individual SDMA queue, similar to compute queue reset.
Add test.


[ROCm/ROCR-Runtime commit: ee890e7d2b]
2025-03-06 14:04:42 -05:00
Jonathan Kim 06b2c3aeb6 kfdtest: Allow user to modify packet size for SDMA write packets
This is primarily used for debug and negative testing for SDMA queue
reset and shouldn't be used for normal run cases.


[ROCm/ROCR-Runtime commit: d047708317]
2025-03-06 14:04:42 -05:00
Jonathan Kim 297e8f729e kfdtest: Add create SDMA queue by target engine
KFD supports SDMA queue creation by target engine.
Enable this for testing.


[ROCm/ROCR-Runtime commit: 9e57ce48e8]
2025-03-06 14:04:42 -05:00
Jonathan Kim 303cdb8f7e kfdtest: Add SDMA poll memory register packet support
The SDMA can wait by polling user memory. This is being added to
support per-SDMA queue reset testing.


[ROCm/ROCR-Runtime commit: a957b24153]
2025-03-06 14:04:42 -05:00
David Belanger 2c11a41adc kfdtest: Fix ExtendedCuMasking test case
Modify test case to support XL cards.

Change-Id: I6ad45a290d50a5238804ce7417bcdb33a3912872
Signed-off-by: David Belanger <david.belanger@amd.com>


[ROCm/ROCR-Runtime commit: 3ceb131df5]
2025-02-27 21:25:19 -05:00
James Zhu b42578b070 kfdtest: fix resource leakage
Resources allocated in SetUp/HsaNodeInfo::Init
need to be deleted in TearDown/HsaNodeInfo::Delete.

Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: f8d8b8011f]
2025-02-24 19:38:59 -05:00
David Belanger 75a060fc53 kfdtest: Convert ExtendedCuMask test to multi-GPU framework
Convert test to use multi-GPU framework.

Add mutex to fix intermixed log issue and annotate logging with
gpu node number.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Ic2beeadb1eb4b5a9a0710ac1dbd60b9bf1d84c33


[ROCm/ROCR-Runtime commit: f24d789dee]
2025-01-30 11:41:00 -05:00
Sv. Lockal d1507361ec Fix build issues for musl libc (#267)
Change-Id: Ia31330b0f96669966712b58986abeca754c2cbb9


[ROCm/ROCR-Runtime commit: 5d04bd42f3]
2025-01-29 14:31:05 +00:00
Lang Yu 85125b1054 kfdtest: update AtomicIncIsa for gfx12
"s_waitcnt 0" (deprecated in gfx12) is redundant here.

s_endpgm will wait for all outstanding instructions
to complete before executing.

Change-Id: Ia8b4dd0fd8dd713e7ba2cba9db85b7b12cee1dd4
Signed-off-by: Lang Yu <lang.yu@amd.com>


[ROCm/ROCR-Runtime commit: d159b29dc6]
2025-01-28 20:32:41 -05:00
Amber Lin e262729f6f kfdtest: Create gfx950 blacklist
This patch creates the blacklist for gfx950 by copying gfx942 but adding
KFDGWSTest.Semaphore as GWS support is completely removed from gfx950.

Change-Id: I5d7c17e57b8cfd9fae63780ecc9dd55662cfdade
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 0b6e457201]
2025-01-28 08:26:44 -05:00
Alex Sierra da483d7588 kfdtest: add support for gfx9.5.0 in shader store
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I48b98ff631bd1aa1a044b60583ff256e43b17423


[ROCm/ROCR-Runtime commit: 268054cd28]
2025-01-26 21:45:07 -05:00
Alex Sierra 840a613723 kfdtest: Add gfx 9.5 as FAMILY_AV
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Ib5696eee1d4f64c9c87d714eae7c80fbbd1e2b23


[ROCm/ROCR-Runtime commit: e94ff8a36c]
2025-01-26 21:43:55 -05:00
Harish Kasiviswanathan e004ab79f5 kfdtest: Fix KFDASMTest failure on older ASICs
HW_REG_HW_ID1 is only available from gfx12 onwards

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: Ibf4bd62e01ada3dee6dd88762ccb853bab63ff87


[ROCm/ROCR-Runtime commit: 1d71975fcc]
2025-01-13 15:22:20 -05:00
Harish Kasiviswanathan 0c461ee74a kfdtest: Add gfx12 to TargetList for AssembleShaders
Add gfx12 so that it gets tested when KFDASMTest.AssembleShaders is run.
GWS support has been removed for gfx12. Modify shaders to take that into
account.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I70e87febb6388852ea54d69cf9201339a7910581


[ROCm/ROCR-Runtime commit: f8ae5c47ba]
2025-01-13 15:22:15 -05:00
Lang Yu 6c18e6188d kfdtest: consolidate LoopIsa
1. Initializing registers before using them is best practice.
Though the use case here doesn't care whether the registers are
initialized or not, some emulators complain about the "read_before_write"
behavior. Initialize the registers used to silence these complaints.

2. Update the s_wait instructions for gfx12.

Change-Id: I462b2b0b5017dd2876a5954169d3b6b2f1c2a75b
Signed-off-by: Lang Yu <lang.yu@amd.com>


[ROCm/ROCR-Runtime commit: fe5f12342d]
2025-01-10 21:27:23 -05:00
Kent Russell f256811bab kfdtest: Can't initialize variable-sized objects
Do a memset, since we can't initialize variable-sized objects

Change-Id: I57faf4a0581a29f9d30391aa387812c2b7bb5011
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: cc7ff73e7f]
2025-01-09 10:36:06 -05:00