* Add HasExpertSchedMode device prop
* Add unit tests for HasExpertSchedMode
* Add gfx12 check for HasExpertSchedMode prop
* Update gfx major version check and test for ExpertSchedMode
* Minor fix and ROCr version bump
* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h
* Apply suggestion from @dayatsin-amd
---------
Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>
Co-authored-by: David Yat Sin <77975354+dayatsin-amd@users.noreply.github.com>
* kfdtest: Replace pthread with std::thread
Modify the concurrent kfdtest to use std::thread
instead of pthread, and ultimately modify KFDTestLaunch
to take a member function of the test instance
instead of a static function.
Convert KFDQMTest to pass in a member function for
multi-gpu kfdtest.
* kfdtest: Convert KFDPerfCountersTest to use std::thread
Convert KFDPerfCountersTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDGraphicsInterop to use std::thread
Convert KFDGraphicsInterop to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDGWSTest to use std::thread
Convert KFDGWSTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDCWSRTest to use std::thread
Convert KFDCWSRTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDEventTest to use std::thread
Convert KFDEventTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDExceptionTest to use std::thread
Convert KFDExceptionTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDLocalMemoryTest to use std::thread
Convert KFDLocalMemoryTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDMemoryTest to use std::thread
Convert KFDMemoryTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDSVMRangeTest to use std::thread
Convert KFDSVMRangeTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDHWSTest to use std::thread
Convert KFDHWSTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Remove pthread multigpu test structure
Remove older multi-gpu test framework which
uses pthread.
* kfdtest: Enable GPU selection via CLI for multi-GPU tests
Replaced environment variable-based GPU selection with
GPU selection via command-line parameter --concurrentnodes (-c)
Modified g_TestGPUsNum to be passed in via command-line
parameter --testnodenum (t)
Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
---------
Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
Co-authored-by: Alysa Liu <Alysa.Liu@amd.com>
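The pthread to std::thread conversion described above can be sketched as
follows. This is a minimal illustration, not the kfdtest code: DemoTest,
RunOnNode and LaunchOnAllNodes are hypothetical names, and the real
KFDTestLaunch signature is not shown in this log. It only shows why a
member function is convenient: each thread can reach the fixture's state
without routing it through a static function and a void* argument.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a kfdtest fixture (not the real KFDQMTest).
class DemoTest {
public:
    // Per-GPU test body. Being a non-static member, it can use fixture state.
    void RunOnNode(int gpuNode) {
        visited_.fetch_add(1);
        nodeMask_.fetch_or(1u << gpuNode);
    }
    int visited() const { return visited_.load(); }
    unsigned nodeMask() const { return nodeMask_.load(); }

private:
    std::atomic<int> visited_{0};
    std::atomic<unsigned> nodeMask_{0};
};

// Hypothetical launcher: starts one std::thread per GPU node, each calling
// the given member function on the same test instance, then joins them all.
template <typename T>
void LaunchOnAllNodes(T* test, void (T::*fn)(int),
                      const std::vector<int>& gpuNodes) {
    std::vector<std::thread> workers;
    workers.reserve(gpuNodes.size());
    for (int node : gpuNodes) {
        workers.emplace_back(fn, test, node);  // invokes (test->*fn)(node)
    }
    for (auto& worker : workers) {
        worker.join();
    }
}
```

A caller would write something like
LaunchOnAllNodes(&test, &DemoTest::RunOnNode, {0, 1, 2, 3}).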
Modify the code that computes the adjusted CU mask array to take
into account additional cases for inactive CUs.
Signed-off-by: David Belanger <david.belanger@amd.com>
Support was removed for these engineering samples, so remove them from
the blacklist, and make sure that we're using 942 for the shader store.
[ROCm/ROCR-Runtime commit: f755981f03]
Blacklist the KFDEvictTest suite until defects
SWDEV 535386 and 537002, in which these test cases fail
inconsistently, are fixed.
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
[ROCm/ROCR-Runtime commit: 3115384874]
Disable the KFD RAS test cases because they cause a GPU reset,
which affects the active kfdtest. These tests can only be
run successfully as separate processes.
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
[ROCm/ROCR-Runtime commit: d9a95605cc]
Creating a CP queue or SDMA queue should fail with an invalid queue ring
buffer or ring buffer size.
Test that unmapping or freeing queue buffers fails before the queue is
destroyed.
Use a child process to test that unmapping the CWSR buffer evicts the queue.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I5dcd51d6b43445d19a986f8b0b82063e20348a5f
[ROCm/ROCR-Runtime commit: bd86fb1e63]
disable KFDLocalMemoryTest.Fragmentation and
KFDEventTest.MeasureInterruptConsumption as
part of the KFD test suite improvement feature
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
[ROCm/ROCR-Runtime commit: f853dda9ba]
* Update createMCObjectStreamer() to use new LLVM API
Obsolete interfaces were removed via llvm-project's
f2ff298867d7733122e32eead5a8c524b09dfdb1
* Fix typo: LLVM_VERSION -> LLVM_VERSION_MAJOR
* Fix typo
[ROCm/ROCR-Runtime commit: ac1e6d59c2]
Random driver deadlock on svm_range_evict_svm_bo_worker() is observed on
NPS2/DPX mode. It's seen with xnack off and happens more often on the
partition with less VRAM because of TMR.
Temporarily skip SVM Evict tests on Family AV when xnack is disabled.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 5e28208cec]
KFD allows at most 1024 queues per process.
KFDQMTest.OverSubscribeCpQueues fails in multi-gpu mode
on more than 15 gpus because 65x16=1040 exceeds 1024, so
change MAX_CP_QUEUES to stay within that limit.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
[ROCm/ROCR-Runtime commit: df6048429c]
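The arithmetic behind the failure can be checked with a short sketch. The
1024 limit and the 65 queues per GPU come from the message above;
TotalQueues and MaxQueuesPerGpu are hypothetical helpers, and dividing the
limit evenly across GPUs is only one possible way to adapt MAX_CP_QUEUES,
not necessarily what the commit does.

```cpp
#include <cassert>

// Per-process queue limit in KFD, as stated in the commit message.
constexpr unsigned kMaxQueuesPerProcess = 1024;

// Hypothetical helper: total queues when every GPU creates perGpu queues.
constexpr unsigned TotalQueues(unsigned perGpu, unsigned gpus) {
    return perGpu * gpus;
}

// Hypothetical helper: largest per-GPU queue count that still fits within
// the per-process limit (assumed policy: split the limit evenly).
constexpr unsigned MaxQueuesPerGpu(unsigned gpus) {
    return gpus == 0 ? 0 : kMaxQueuesPerProcess / gpus;
}

// 15 GPUs fit, 16 GPUs do not.
static_assert(TotalQueues(65, 15) <= kMaxQueuesPerProcess, "15 GPUs fit");
static_assert(TotalQueues(65, 16) > kMaxQueuesPerProcess, "16 GPUs overflow");
```

With 16 GPUs this policy would allow 64 queues per GPU, which is exactly
1024 in total.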
The parent process can only be ptraced by one process
at a time. To avoid the error, add a mutex to
serialize the ptrace calls.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
[ROCm/ROCR-Runtime commit: d3265234e9]
Detect whether the loaded driver is the upstream or DKMS version and
add a filter for the tests that fail with the upstream driver.
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
[ROCm/ROCR-Runtime commit: 10530fa2a7]
The debugger override will set the initial request mask to the
previously set request mask so use a different mask to assert
enablement.
Trap on wave start and end also run back to back, so fix the
previous override mask check as well.
In addition, unlike instruction traps, trap on wave start and end
will not require a rewind of the program counter on wave exit.
[ROCm/ROCR-Runtime commit: c710a06ee0]
If the parent runs faster than the child and the child hasn't yet called the
second raise(SIGSTOP), the parent's "waitpid(childPid, &childStatus, 0)" will
return with childStatus 0x137f, which encodes a stop by signal 0x13 (SIGSTOP).
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
[ROCm/ROCR-Runtime commit: 42f79776cd]
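The 0x137f status above can be decoded with the standard wait macros. The
sketch below assumes Linux, where a stopped child is reported with 0x7f in
the low byte and the stop signal in the next byte; DecodeWaitStatus and
StopInfo are hypothetical names, not part of kfdtest.

```cpp
#include <cassert>
#include <sys/wait.h>

// Hypothetical result type: whether the child is stopped, and by which signal.
struct StopInfo {
    bool stopped;
    int signal;
};

// Decode a status value as returned by waitpid(). Assumes the Linux
// encoding of wait statuses.
inline StopInfo DecodeWaitStatus(int status) {
    if (WIFSTOPPED(status)) {
        return {true, WSTOPSIG(status)};
    }
    return {false, 0};
}
```

For 0x137f this reports a stopped child with signal 0x13, which is the
value of SIGSTOP on x86 and ARM Linux.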
If the child has reached the second raise(SIGSTOP) and the parent sends
PTRACE_CONT, the child then exits. The parent will assert at
DeviceSnapshot because kfd_ioctl cannot get the mm from the child pid.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
[ROCm/ROCR-Runtime commit: 91ef44d3ec]
Reduce the memory allocated for GFX VRAM, as the
KFD Evict test hit intermittent page faults,
which may be due to the larger GFX CS BO size.
[ROCm/ROCR-Runtime commit: 85c4b0020a]
Blacklist KFDNegativeTest.BasicPipeReset from gfx950 until MEC can
support pipe reset on GC 9.5.0.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: fcf3f91379]
This is primarily used for debug and negative testing for SDMA queue
reset and shouldn't be used for normal run cases.
[ROCm/ROCR-Runtime commit: d047708317]
Modify test case to support XL cards.
Change-Id: I6ad45a290d50a5238804ce7417bcdb33a3912872
Signed-off-by: David Belanger <david.belanger@amd.com>
[ROCm/ROCR-Runtime commit: 3ceb131df5]
Resources allocated in SetUp/HsaNodeInfo::Init
need to be deleted in TearDown/HsaNodeInfo::Delete.
Signed-off-by: James Zhu <James.Zhu@amd.com>
[ROCm/ROCR-Runtime commit: f8d8b8011f]
Convert test to use multi-GPU framework.
Add mutex to fix intermixed log issue and annotate logging with
gpu node number.
Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: Ic2beeadb1eb4b5a9a0710ac1dbd60b9bf1d84c33
[ROCm/ROCR-Runtime commit: f24d789dee]
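The logging fix described above follows a common pattern, sketched here
under assumptions: NodeLogger and the "[node N]" prefix format are
hypothetical and not taken from kfdtest. Each line is built first and then
written under a mutex, so output from concurrent per-GPU threads cannot
interleave within a line.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical logger: prefixes each message with the GPU node number and
// serializes writes to the shared stream.
class NodeLogger {
public:
    explicit NodeLogger(std::ostream& out) : out_(out) {}

    void Log(int gpuNode, const std::string& message) {
        // Format outside the lock to keep the critical section short.
        std::ostringstream line;
        line << "[node " << gpuNode << "] " << message << '\n';

        std::lock_guard<std::mutex> guard(mutex_);
        out_ << line.str();  // one write per line, under the lock
    }

private:
    std::ostream& out_;
    std::mutex mutex_;
};
```

Each test thread would call Log with its own node number, for example
logger.Log(2, "queue created").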
"s_waitcnt 0" (deprecated in gfx12) is redundant here.
s_endpgm will wait for all outstanding instructions
to complete before executing.
Change-Id: Ia8b4dd0fd8dd713e7ba2cba9db85b7b12cee1dd4
Signed-off-by: Lang Yu <lang.yu@amd.com>
[ROCm/ROCR-Runtime commit: d159b29dc6]
This patch creates the blacklist for gfx950 by copying gfx942 but adding
KFDGWSTest.Semaphore as GWS support is completely removed from gfx950.
Change-Id: I5d7c17e57b8cfd9fae63780ecc9dd55662cfdade
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 0b6e457201]
HW_REG_HW_ID1 is only available from gfx12 onwards
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: Ibf4bd62e01ada3dee6dd88762ccb853bab63ff87
[ROCm/ROCR-Runtime commit: 1d71975fcc]
Add gfx12 so that it gets tested when KFDASMTest.AssembleShaders is run.
GWS support has been removed for gfx12. Modify shaders to take that into
account.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I70e87febb6388852ea54d69cf9201339a7910581
[ROCm/ROCR-Runtime commit: f8ae5c47ba]
1. Initializing registers before using them is best practice.
Though the use case here doesn't care whether the registers are
initialized or not, some emulators complain about the "read_before_write"
behavior. Initialize the registers used to silence these complaints.
2. Update the s_wait instructions for gfx12.
Change-Id: I462b2b0b5017dd2876a5954169d3b6b2f1c2a75b
Signed-off-by: Lang Yu <lang.yu@amd.com>
[ROCm/ROCR-Runtime commit: fe5f12342d]
Do a memset, since we can't initialize variable-sized objects
Change-Id: I57faf4a0581a29f9d30391aa387812c2b7bb5011
Signed-off-by: Kent Russell <kent.russell@amd.com>
[ROCm/ROCR-Runtime commit: cc7ff73e7f]