IterateIsa had some leftover instructions from when the shader was
getting updated for KFDCWSRTest.BasicTest.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I41ae7b7948cbe2aff8bf61b170b9a7d498b836a3
[ROCm/ROCR-Runtime commit: 82a41c7e4d]
When a process forks, the copy-on-write MMU notifier on the CWSR range
evicts user queues, then updates the GPU mapping and resumes the queues.
Use MADV_DONTFORK to avoid the COW MMU notifier callback on the CWSR SVM
range.
Use mmap to allocate the SVM range for CWSR, because posix_memalign does
not allocate a new range in the child process. Registering the SVM range
then fails because the range is an invalid address in the forked child
process.
Change-Id: Ibaea56a691dd6f577ed2e1f2d43f4a3500b8316f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 093cf898fb]
Use mmap to allocate a larger address range that includes alignment
padding plus guard pages, then unmap the padding and guard pages at the
beginning and end of the range and return the aligned address range.
Change-Id: Iaf3c711a079c744289efbafee9b5e63aaf724765
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: b7710a1dda]
GFX1036 (ISA version) is not included in the previous range.
This patch extends the range to include all gfx10 series ASICs.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Change-Id: I0e28dbfc031c216166b306b9fb39f644f75a330f
[ROCm/ROCR-Runtime commit: 06a90612e9]
Avoid a segfault: runtime debugger enable is not supported
if the GPU firmware doesn't support debug exceptions.
Signed-off-by: jie1zhan <jesse.zhang@amd.com>
Change-Id: Ifad57a6e78cb1c92b1f8927355ece8c64e89c51b
[ROCm/ROCR-Runtime commit: d98c729ff9]
Remove potential double free condition when free_queue() is called
after hsaKmtDestroyQueue() if mapping doorbell fails during queue
creation.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: If2aa19c455b30d2940b232dbafb9cc1eaad721a5
[ROCm/ROCR-Runtime commit: 57a1c6f3ff]
When running kfdtest, test cases are reported as unsupported because the
filter node for the new chip is missing in libhsakmt. Add a new test node
so that the kfdtest cases are supported.
Signed-off-by: shikaguo <shikai.guo@amd.com>
Change-Id: I0cd9ffd7d4387129cfb0f8de6b669f431949ab49
[ROCm/ROCR-Runtime commit: 4951495fca]
Queue ctx_save_restore memory is allocated with size
ctx_save_restore_size + debug_memory_size. Use the same size
in free_queue to free the ctx_save_restore memory.
Change-Id: I4902ff15fb82ddea64b8342b89776a1bf5c38d13
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 3dbf5feffe]
Avoid a segfault when an invalid SharedMemoryHandle is passed to
fmm_register_shared_memory.
Change-Id: I0e0bbed01487fc10afcbb170eb9330e70b209d14
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
[ROCm/ROCR-Runtime commit: 1c385fb257]
Now that HsaNodeProperties is passed in to
topology_get_node_props_from_drm, check that pointer instead of the
pointer for MarketingName (which throws a compiler warning)
Signed-off-by: kent.russell@amd.com <kent.russell@amd.com>
Change-Id: If76b24e1bab5a62e514ab440b6316c7b7cd264c1
[ROCm/ROCR-Runtime commit: ea4d4917c1]
Query the family ID from the DRM render node so that
ROCr can get this info directly from Thunk
instead of parsing it itself.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: I030bd27ab2379fbf87f3d787302c3b8613456278
[ROCm/ROCR-Runtime commit: 66e9e97e0d]
Required due to LLVM retirement of llvm::apply_tuple, instead using
std::apply which was introduced in C++17.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I6646ebcca7d71d3e1bcf340ccfa3db2c15a3110a
[ROCm/ROCR-Runtime commit: 4267c4b524]
Failures with the new CWSR tests were reported for GFX10; add them to the blacklist for now.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I5b2bd9ec61c64ad66e1c34ba2c192bece808f56f
[ROCm/ROCR-Runtime commit: 0055ef46c4]
This patch restructures the CWSR basic test and allows for
creating parameterized CWSR tests. This patch introduces four
parameterizations. These tests behave as follows:
This test dispatches the IterateIsa shader, which continuously
increments a vgpr for (num_witems / WAVE_SIZE) waves. While this shader
is running, dequeue/requeue requests are sent in a loop to trigger
CWSRs.
This test defines a CWSR threshold. Once the number of CWSRs triggered
reaches the threshold, a known value is written into the inputBuf to
signal the shader to exit.
4 parameterized tests are defined:
KFDCWSRTest.BasicTest/0
KFDCWSRTest.BasicTest/1
KFDCWSRTest.BasicTest/2
KFDCWSRTest.BasicTest/3
0: 1 work-item, CWSR threshold of 10
1: 256 work-items, CWSR threshold of 50
2: 512 work-items, CWSR threshold of 100
3: 1024 work-items, CWSR threshold of 1000
Tuple Format: (num_witems, cwsr_thresh)
num_witems: Defines the number of work-items.
cwsr_thresh: Defines the number of CWSRs to trigger.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I639eb7bd75b14ee70e190b4bd19dcf34096fc7bf
[ROCm/ROCR-Runtime commit: 0dbac97b75]
The debugger can now request snapshot copies with entry size and
set/clear watchpoints by device.
v3: drop min version check to v10.0
v2: check runtime allowance from v10.3 to 13.x
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I9befefb596201a11591de218db29a9317b41e69b
[ROCm/ROCR-Runtime commit: c1d8ac8437]
This didn't return anything, so add a "return 0" at the end, since the
function is declared to return an int value.
Change-Id: I17c398e431b2ce4571e6ca4abe6d567f110ea2a7
[ROCm/ROCR-Runtime commit: 90ada94141]
The kernel driver will align VRAM allocations to 2MB instead of 4KB.
Change-Id: Iea9d8c0f02999b9ea5fd931da82240a33f7bcc69
[ROCm/ROCR-Runtime commit: 17fb40f1f6]
The debugger depends on the CWSR area being executable. Set the right
flag when registering SVM memory.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Laurent Morichetti <Laurent.Morichetti@amd.com>
Change-Id: I7441e214d1a4da8324d775e777976fabd1c81a6f
[ROCm/ROCR-Runtime commit: deb7a20c92]
KFDExceptionTest.SdmaQueueException allocates VRAM with host access. This
fails on small-BAR GPUs. This error was incorrectly ignored before
7ccda4ba26 ("kfdtest: Full TearDown and SetUp in child process").
The test doesn't really need host access to the memory. Therefore the fix
is to disable the HostAccess flag.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ifec279eeb6c1ecb1160db9b692e6dc8816d761a3
[ROCm/ROCR-Runtime commit: 9d33827a84]
The CMA feature is deprecated and about to be removed from the DKMS
branch. It was never supported upstream. Leave dummy functions in
place for now.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I9e51403d753cb91630553aff4f19e931af509740
[ROCm/ROCR-Runtime commit: 9b2b81e555]
The CMA feature is deprecated and about to be removed from the DKMS
branch. It was never supported upstream.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I64b6213eb3adbdc550542e51181cd8ba6ca4cb45
[ROCm/ROCR-Runtime commit: cdaaf8236a]
hsaKmtMapMemoryToGPU should not try to map VRAM on peer GPUs that don't
have an IO-Link to the memory. The new P2P mapping code in KFD will
fail otherwise.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I6d59b55651b98756865a0f69eafef3e386372cf3
[ROCm/ROCR-Runtime commit: 9ac2c75171]
This allows init_process_apertures to use the whole consistent topology
instead of taking its own partial snapshot.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia13e7aa7fcd090ea8d6cacd4babb29a27c20207f
[ROCm/ROCR-Runtime commit: 87aca673e8]
With the next patch, child processes need to fully reinitialize the
topology in order to recreate the process apertures. Just calling
hsaKmtOpenKFD is no longer sufficient. Tests based on
KFDMultiProcessTest already did this correctly (KFDHWSTest, eviction
tests). This patch fixes KFDExceptionTest and KFDIPCTest.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Iaad24e88ddd29c1105bf791a77891cc55a6072ff
[ROCm/ROCR-Runtime commit: 412b24137e]
We should link against numa without hardcoding the path to it.
CMake should determine how to link numa automatically, similar to how rt
and pthread are linked.
Fixes
Change-Id: Ifb9ac30e200c66cbd7f1cf80d25fffef1dcf8d2f
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: aa25cb1acc]
LoopIsa is a shader that performs a variety of intensive
calculations in a loop. It is used by tests such as
KFDQMTest.QueuePriorityOn*
It contained a scalar load, despite not having any buffer to
read from. This load causes page faults on GFX11. It is
unclear why it did not cause page faults on earlier ASICs.
Remove the load.
Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I7426d0db48e933f3bb870467ea88476f7a283040
[ROCm/ROCR-Runtime commit: 39e8a85aac]
When the shaders were moved to ShaderStore,
KFDQMTest.EmptyDispatch was erroneously
changed to use LoopIsa instead of NoopIsa.
Change it back.
Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: Iaf7d0d107e3bf3bd8b7d616b137a1740e309cf91
[ROCm/ROCR-Runtime commit: 4b8c74bf04]
Previously we omitted the version and arch in the filenames. Adding these,
as well as the ROCM build variable, allows for easy version detection on
systems. Instead of kfdtest = v1.0.0, the name will now feature the build
number, making it easier to identify which version is installed.
Change-Id: I311ed7010486e7c70af669d282910fe29ee8db45
[ROCm/ROCR-Runtime commit: 9745db3053]
To improve performance on queue preemption, allocate ctx s/r
area in VRAM instead of system memory, and migrate it back
to system memory when VRAM is full.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: If775782027188dbe84b6868260e429373675434c
[ROCm/ROCR-Runtime commit: 37be876cad]
Add a new option for always keeping the GPU mapping,
and bump the KFD version for the unified save/restore
memory feature.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: Iebee35e6de4d52fa29f82dd19f6bbf5640249492
[ROCm/ROCR-Runtime commit: e1d1a6fbb0]
Open the SMI event file handle, prefetch to migrate the SVM range to the
GPU, read the HMM profiling events, then check that event_id, address,
size, pid, and event triggers have the expected values.
Start a separate thread to read SMI events, the same way applications do.
Use a thread barrier to ensure no event is dropped.
Change-Id: I0683969d18d1579847e125d86aa4257602adb13f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 7799611c01]
The KFD no longer allows debug ops that modify HW state prior to
trap activation, so permit a bump in the major version.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I072d3998b7b043df9a67f0f6762b0afdfa9382c6
[ROCm/ROCR-Runtime commit: 79cd63fab6]
Kernel amd-staging-drm-next branch changes GFX11 fish_colour sysfs
naming to "ip discovery". Update run_kfdtest.sh to use sysfs
gfx_target_version for ASICs that have transitioned to IP discovery
topology.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: If202a0ceeed7324364539a33661f0abcf0973f07
[ROCm/ROCR-Runtime commit: 350eba3a07]
Non-paged allocation for queue memory is necessary for binding wptr to
GART. This is required to support usermode queue oversubscription with
MES on GFX11.
This change ensures queue memory does not specify ATS.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I10b23b0205c90dad902c711a88cfb5e9b4979617
[ROCm/ROCR-Runtime commit: e17b159230]
System Management Interface events are read from an anonymous file handle.
This helper wraps the ioctl interface to get the anonymous file handle for
a GPU nodeid.
Define SMI event IDs and event triggers, copying the same values from
kfd_ioctl.h to avoid translation.
Change-Id: I5c8ba5301473bb3b80bb4e2aa33a9f675bedb001
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 405fbd6f93]
Add KFDGpuID to HsaNodeProperties to return gpu_id to the upper layer.
gpu_id is a hash ID generated by KFD to distinguish GPUs on the system.
ROCr and ROCProfiler will use gpu_id to analyze SMI event messages.
Change-Id: I6eabe6849230e04120674f5bc55e6ea254a532d6
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 88778e13dc]
Currently, using --gtest_filter will override any exclusions from
kfdtest.exclude. To add an additional test (or set of tests) to the
exclusion list dynamically, the --exclude (-e) flag will allow the user
to pass in a string of GTest-style exclusions to add to the list
generated by kfdtest.exclude
e.g. run_kfdtest.sh -e "*KFDLocalMemoryTest.*:KFDEventTest.*"
will use kfdtest.exclude, but will also exclude all LocalMemory and
Event tests
Change-Id: Ic23ec271ba2cd2240d2e98558c0117ff2a064ed2
[ROCm/ROCR-Runtime commit: f7978b1ff6]
This env variable sets the max VA alignment size as
"PAGE_SIZE * 2^alignment order" during mapping. By default the order
is set to 9 (2MB with 4KB pages).
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I01ae4e0963f4d21c7c367464e60f865bc58d7fac
[ROCm/ROCR-Runtime commit: 0e908f05bb]
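The arithmetic behind the alignment order in commit 0e908f05bb can be
checked with a short sketch. Both function names below are hypothetical
and are not taken from libhsakmt, the environment variable name is
deliberately left out because the message above does not give it, and a
4KB page size is assumed for the 2MB figure: 4096 * 2^9 = 2097152 bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse an alignment order from the string value
 * of an environment variable. Falls back to def_order when the value
 * is unset, empty, malformed, or too large to shift safely. */
unsigned parse_align_order_sketch(const char *value, unsigned def_order)
{
    char *end = NULL;
    unsigned long order;

    if (value == NULL || *value == '\0')
        return def_order;

    errno = 0;
    order = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0' || order >= 8 * sizeof(size_t))
        return def_order;

    return (unsigned)order;
}

/* Hypothetical helper: max VA alignment in bytes,
 * computed as page_size * 2^order. */
size_t max_va_align_bytes_sketch(size_t page_size, unsigned order)
{
    assert(order < 8 * sizeof(size_t));
    return page_size << order;
}
```

A caller would typically pass the result of getenv() for the relevant
variable as value, with 9 as def_order to match the default described
above.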
Add blacklist for tests on GFX11 that are under debug/not functional.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I32f375f29036a2fc1a54f2792b31ebc45f4d668b
[ROCm/ROCR-Runtime commit: 852d29bca2]
New API function to report available memory per GPU
Signed-off-by: Daniel Phillips <Daniel.Phillips@amd.com>
Change-Id: I63c1e4ca0020c657977ab3635947ab0ed0a81440
[ROCm/ROCR-Runtime commit: 6da6058d4a]
Use GNUInstallDirs variables in post install scripts
License file installed in CMAKE_INSTALL_DOCDIR
Change-Id: I182ca292e03787a6c189e8de31d32244b65b5687
[ROCm/ROCR-Runtime commit: 707200e26e]
GFX11 set to FAMILY_NV.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I545241c39bbe739e39a8943b242b9fc49a65a7e1
[ROCm/ROCR-Runtime commit: c198b91c94]
With this patch gfx11 is supported in amd-staging.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I4bbfc87345a486dd8f2e0091ea7b82c255a8ad15
[ROCm/ROCR-Runtime commit: cee64d4825]
These were required due to dependency issues in earlier ROCm
releases. With thunk being static now and with better dependency
definitions being used, we can remove these.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I266a783993edf32811caf027f4289ede0cbfcb16
[ROCm/ROCR-Runtime commit: 65aec53bca]