close the file at the end of every test, instead of the whole test
Change-Id: Ia510990dad8d0bd82625bbd9b2958181e8f1dd25
[ROCm/ROCR-Runtime commit: 8941e7135c]
Now that HsaNodeProperties is passed in to
topology_get_node_props_from_drm, check that pointer instead of the
pointer for MarketingName (which triggers a compiler warning).
Signed-off-by: kent.russell@amd.com <kent.russell@amd.com>
Change-Id: If76b24e1bab5a62e514ab440b6316c7b7cd264c1
[ROCm/ROCR-Runtime commit: ea4d4917c1]
Add agent info query HSA_AMD_AGENT_INFO_ASIC_FAMILY_ID.
This lets us remove the code that parses the family ID.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: I3ac4746d3015e89b32322ebc0f8a3084f98677a4
[ROCm/ROCR-Runtime commit: d0e7c617df]
Query family id info from drm render node, then
ROCr can query this info directly from Thunk
instead of parsing the info by itself.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: I030bd27ab2379fbf87f3d787302c3b8613456278
[ROCm/ROCR-Runtime commit: 66e9e97e0d]
Required due to LLVM retirement of llvm::apply_tuple, instead using
std::apply which was introduced in C++17.
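For reference, a minimal sketch of the std::apply call pattern (C++17, <tuple>). The functions add and add_tuple are hypothetical and only illustrate unpacking a tuple into arguments; they are not taken from the affected source.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical callee, used only to show how the tuple is unpacked.
static int add(int a, int b) { return a + b; }

// std::apply invokes `add` with the tuple's elements as its arguments.
// This is the standard-library replacement for the retired llvm::apply_tuple.
static int add_tuple(const std::tuple<int, int>& args) {
  return std::apply(add, args);
}
```

Calling add_tuple(std::make_tuple(2, 3)) forwards 2 and 3 to add. The code must be built with C++17 or newer.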
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I6646ebcca7d71d3e1bcf340ccfa3db2c15a3110a
[ROCm/ROCR-Runtime commit: 4267c4b524]
New CWSR tests are reported to fail on GFX10; add them to the blacklist for now.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I5b2bd9ec61c64ad66e1c34ba2c192bece808f56f
[ROCm/ROCR-Runtime commit: 0055ef46c4]
This reverts commit 005a0b6979.
The change from using RUNPATH to RPATH was not approved formally.
Reverting this patch until this gets approved.
Change-Id: Ibc1a8f9d5dfa6694adacccfd9e3b0d053660e848
[ROCm/ROCR-Runtime commit: 0647960019]
This patch restructures the CWSR basic test and allows for
creating parameterized CWSR tests. This patch introduces four
parameterizations. These tests behave as follows:
This test dispatches the IterateIsa shader, which continuously
increments a vgpr for (num_witems / WAVE_SIZE) waves. While this shader
is running, dequeue/requeue requests are sent in a loop to trigger
CWSRs.
This test defines a CWSR threshold. Once the number of CWSRs triggered
reaches the threshold, a known value is written into the inputBuf to
signal the shader to exit.
4 parameterized tests are defined:
KFDCWSRTest.BasicTest/0
KFDCWSRTest.BasicTest/1
KFDCWSRTest.BasicTest/2
KFDCWSRTest.BasicTest/3
0: 1 work-item, CWSR threshold of 10
1: 256 work-items, CWSR threshold of 50
2: 512 work-items, CWSR threshold of 100
3: 1024 work-items, CWSR threshold of 1000
Tuple Format: (num_witems, cwsr_thresh)
num_witems: Defines the number of work-items.
cwsr_thresh: Defines the number of CWSRs to trigger.
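The parameter list above can be sketched as a table of tuples. This is an illustration only: CwsrParams, kCwsrParams and should_signal_exit are hypothetical names, and the actual test presumably feeds the tuples through the test framework's own parameterization mechanism.

```cpp
#include <array>
#include <cassert>
#include <tuple>

// Tuple format from the description above: (num_witems, cwsr_thresh).
using CwsrParams = std::tuple<int, int>;

// The four parameterizations, indexed like BasicTest/0 .. BasicTest/3.
static const std::array<CwsrParams, 4> kCwsrParams = {
    CwsrParams{1, 10}, CwsrParams{256, 50},
    CwsrParams{512, 100}, CwsrParams{1024, 1000}};

// Hypothetical helper: true once the number of CWSRs triggered so far
// has reached the threshold, i.e. the shader should be told to exit.
static bool should_signal_exit(const CwsrParams& params, int cwsr_count) {
  return cwsr_count >= std::get<1>(params);
}
```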
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I639eb7bd75b14ee70e190b4bd19dcf34096fc7bf
[ROCm/ROCR-Runtime commit: 0dbac97b75]
The debugger can now request snapshot copies with entry size and
set/clear watchpoints by device.
v3: drop min version check to v10.0
v2: check runtime allowance from v10.3 to 13.x
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I9befefb596201a11591de218db29a9317b41e69b
[ROCm/ROCR-Runtime commit: c1d8ac8437]
The allocation logic of the SPI does not take into account compute
user thread management settings for masking CUs with the exception of
skipping fully disabled SEs. This means that occupancy limited
dispatches such as cooperative launch may over allocate onto hardware
resources that are not immediately available, resulting in a potential
barrier logic hang as occupying work groups are waiting on enqueued
work groups to reach the barrier.
Further work will have to be done to get the per-SA CU enablement count
from the KFD in order to correctly clip the cooperative CU limit based
on the CU mask, which will require breaking the current ABI.
For now, report that cooperative launch is not supported while a CU
mask has been applied to prevent potential shader hangs.
Change-Id: I8be4bb47d65ceb62d805f36ef6ef3996d756021f
[ROCm/ROCR-Runtime commit: 2b75a73ce7]
Change default behavior for library search to use RPATH instead of
RUNPATH.
Change-Id: I328766006d02c2a8c76a3b1e0780ae5ca678ed86
[ROCm/ROCR-Runtime commit: c904cc5856]
New environment variable HSA_OVERRIDE_CPU_AFFINITY_DEBUG to
enable/disable overriding CPU affinity.
Default value is enabled (1).
This is a temporary variable and may be removed in the future.
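A minimal sketch of how such a flag could be read, assuming "0" disables the override and an unset variable means the default (enabled). The helper names are hypothetical and the runtime's real flag parsing may differ.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical parser: nullptr (variable unset) yields the default, enabled.
// "0" disables the override; any other value leaves it enabled.
static bool parse_override_flag(const char* value) {
  if (value == nullptr) return true;  // default value is enabled (1)
  return std::strcmp(value, "0") != 0;
}

// Hypothetical wrapper that reads the variable named in this change.
static bool override_cpu_affinity_enabled() {
  return parse_override_flag(std::getenv("HSA_OVERRIDE_CPU_AFFINITY_DEBUG"));
}
```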
Change-Id: Id6a7c611730471ddc276ca333fde1e57046bf32a
[ROCm/ROCR-Runtime commit: df3fe8c2fb]
Add support to expose executable bit.
Change-Id: I054f5c3173822c369dd9908eec5c449459600ce1
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
[ROCm/ROCR-Runtime commit: a7db31c5d1]
Fix for regression in commit:
da0ca94219
When running rocrtstNeg.Queue_Validation_InvalidWorkGroupSize, each
time rocrtst::LoadKernelFromObjFile is called, a new CodeObject is
created and not deleted until end of the whole test. Each CodeObject
keeps an open file descriptor of the kernel file and this can exceed
maximum allowed open files on some systems. Delete the CodeObjects
after each iteration of the test.
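The per-iteration cleanup can be sketched as follows. FileBackedObject is a hypothetical stand-in for an object that keeps its file open; it is not the rocrtst CodeObject class.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical stand-in for an object that holds a file open for its lifetime.
class FileBackedObject {
 public:
  explicit FileBackedObject(const char* path) : file_(std::fopen(path, "rb")) {}
  ~FileBackedObject() {
    if (file_ != nullptr) std::fclose(file_);  // release the descriptor
  }
  FileBackedObject(const FileBackedObject&) = delete;
  FileBackedObject& operator=(const FileBackedObject&) = delete;
  bool is_open() const { return file_ != nullptr; }

 private:
  std::FILE* file_;
};

// Creates one object per iteration and destroys it before the next one,
// so at most one descriptor is held at a time.
// Returns the number of iterations in which the file opened successfully.
static int load_repeatedly(const char* path, int iterations) {
  int opened = 0;
  for (int i = 0; i < iterations; ++i) {
    auto object = std::make_unique<FileBackedObject>(path);
    if (object->is_open()) ++opened;
  }  // `object` is destroyed here, closing the file
  return opened;
}
```

Had the objects been collected in a container that outlives the loop, every iteration would keep one more descriptor open, which is the failure mode described above.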
Change-Id: I388e56f95f7b671ecc29d5ecb4eb8ac2d0ddc412
[ROCm/ROCR-Runtime commit: 50b636d1d8]
Add new test for GPU agents memory available
Change-Id: Ib07e2003a21659b99732b535cd004081635d6aa1
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
[ROCm/ROCR-Runtime commit: ec759c7995]
Add max enum value to force size of enum and avoid clang compile
warnings.
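A sketch of the technique, using a hypothetical enum (the names below are invented and do not come from the header being changed):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum used only to illustrate the pattern.
typedef enum {
  EXAMPLE_MODE_A = 0,
  EXAMPLE_MODE_B = 1,
  // Sentinel value: the underlying type must be able to represent it,
  // so the enum cannot be narrower than 32 bits.
  EXAMPLE_MODE_MAX = 0x7fffffff
} example_mode_t;

static_assert(sizeof(example_mode_t) >= sizeof(int32_t),
              "enum must be at least 32 bits wide");
```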
Change-Id: I9cdf529517cc605a5039c3a924fd718ece16029d
[ROCm/ROCR-Runtime commit: 86e4cb1ddd]
For gfx11, the image type table has some different values compared to
previous ASIC families (e.g. TYPE_SRGB). Create a new LUT class to
use these new values.
Change-Id: Ifdfc6cd29bfd5f4ec2643c848fcb9986eb874f9e
[ROCm/ROCR-Runtime commit: 117495fe88]
Update image table enums and format tables for gfx11.
Remove some entries that are not needed.
Change-Id: I060c1e285925a6d428ef1c5498f5dd89f5d79d97
[ROCm/ROCR-Runtime commit: f971834d7a]
This library was taken from public MESA library:
https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/amd/addrlib
with top commit:
2866ae32da0348caf71ad2d11c353321df626ff4
Removing macros.h as it is no longer used by addrlib
Change-Id: I0fdabfe48b74c259b4d29d81beae89604bbc141a
[ROCm/ROCR-Runtime commit: a742b7e830]
Non-paged allocation for queue memory is necessary for binding wptr to
GART. It is required to support usermode queue oversubscription with MES
for GFX11.
Adds AllocateNonPaged entry to MemoryRegion::AllocateEnum for clarity;
aliases AllocateIPC.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I1a97a1820da26cf2433d9c237b2e6d2b0b8628b4
[ROCm/ROCR-Runtime commit: 061aa04147]
Adding new ImageManager class for GFX11 GPUs
ImageManagerGfx11 functions copied from ImageManagerNv.
Register descriptions in resource_gfx11.h updated for gfx11.
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Change-Id: I48b39f6a633aef14aa829f7240a43fe0feb1c290
[ROCm/ROCR-Runtime commit: 907e05c1b3]
GPUs excluded by RVD are not expected to have scratch, memory, trap
handling or memory regions set up. Now that these GPUs are added to
a new list, return early on agent destruction to prevent bad function
calls on destroy.
Also fix up broken memory releases between the GPU lists and ugly braces.
Change-Id: I52fc6e86ceba0a0383cedc63310eb409515eaf9f
[ROCm/ROCR-Runtime commit: 9d2fe1ac2a]
This function didn't return anything, so add a "return 0" at the end,
since the function is expected to return an int value.
Change-Id: I17c398e431b2ce4571e6ca4abe6d567f110ea2a7
[ROCm/ROCR-Runtime commit: 90ada94141]
The kernel driver will align VRAM allocations to 2MB instead of 4KB.
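To illustrate the effect on allocation sizes, a small sketch of rounding up to an alignment boundary. The helper align_up and the constants are hypothetical and not taken from the driver or the tests.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kAlign4KB = 4ull * 1024;
constexpr uint64_t kAlign2MB = 2ull * 1024 * 1024;

// Rounds `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr uint64_t align_up(uint64_t size, uint64_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}
```

With 2MB alignment, a 4KB request occupies a full 2MB of VRAM, so code that assumes 4KB granularity when accounting for VRAM usage needs to be adjusted.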
Change-Id: Iea9d8c0f02999b9ea5fd931da82240a33f7bcc69
[ROCm/ROCR-Runtime commit: 17fb40f1f6]
Fix the rocrtst test failure "The runtime failed to allocate the necessary resources".
Change-Id: Ie4ffeb939fb322db068f3132a7973a359c204176
[ROCm/ROCR-Runtime commit: 8a0fe6a832]
Atomic memory operations on these memory buffers are not guaranteed
to be visible at system scope
Change-Id: I4cccde114632071a000384502a83bc191e77e85b
[ROCm/ROCR-Runtime commit: 364715cbc6]
The debugger depends on the CWSR area being executable. Set the right
flag when registering SVM memory.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Laurent Morichetti <Laurent.Morichetti@amd.com>
Change-Id: I7441e214d1a4da8324d775e777976fabd1c81a6f
[ROCm/ROCR-Runtime commit: deb7a20c92]
The current state of hsa-rocr does
NOT require the thunk lib as a dependency.
Pulling in the thunk package while
installing rocr is unnecessary. This patch
corrects that.
Change-Id: Id98ede8b66ffd9aaf4a47da96ba2f981f4c3da73
[ROCm/ROCR-Runtime commit: a229f5c320]
The maintainer distribution list field had wrong information.
Use the distribution list newly formed by the component team.
Change-Id: I61651e429375cdc512d0fe4b0768f917506b5392
[ROCm/ROCR-Runtime commit: 23f908708a]
KFDExceptionTest.SdmaQueueException allocates VRAM with host access. This
fails on small-BAR GPUs. This error was incorrectly ignored before
7ccda4ba26 ("kfdtest: Full TearDown and SetUp in child process").
The test doesn't really need host access to the memory. Therefore the fix
is to disable the HostAccess flag.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ifec279eeb6c1ecb1160db9b692e6dc8816d761a3
[ROCm/ROCR-Runtime commit: 9d33827a84]
The CMA feature is deprecated and about to be removed from the DKMS
branch. It was never supported upstream. Leave dummy functions in
place for now.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I9e51403d753cb91630553aff4f19e931af509740
[ROCm/ROCR-Runtime commit: 9b2b81e555]
The CMA feature is deprecated and about to be removed from the DKMS
branch. It was never supported upstream.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I64b6213eb3adbdc550542e51181cd8ba6ca4cb45
[ROCm/ROCR-Runtime commit: cdaaf8236a]
hsaKmtMapMemoryToGPU should not try to map VRAM on peer GPUs that don't
have an IO-Link to the memory. The new P2P mapping code in KFD will
fail otherwise.
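The filtering idea can be sketched as below. This is a simplified model with hypothetical names (LinkSet, mappable_nodes); it does not reproduce the Thunk's topology structures or the hsaKmtMapMemoryToGPU implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model: each entry (from, to) means node `from` has an
// IO link to node `to`.
using LinkSet = std::set<std::pair<uint32_t, uint32_t>>;

// Keeps only the nodes that may map VRAM owned by `owner`: the owner itself
// and any peer that has an IO link to the owner.
static std::vector<uint32_t> mappable_nodes(
    uint32_t owner, const std::vector<uint32_t>& candidates,
    const LinkSet& links) {
  std::vector<uint32_t> result;
  for (uint32_t node : candidates) {
    if (node == owner || links.count(std::make_pair(node, owner)) != 0) {
      result.push_back(node);
    }
  }
  return result;
}
```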
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I6d59b55651b98756865a0f69eafef3e386372cf3
[ROCm/ROCR-Runtime commit: 9ac2c75171]
This allows init_process_apertures to use the whole consistent topology
instead of taking its own partial snapshot.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia13e7aa7fcd090ea8d6cacd4babb29a27c20207f
[ROCm/ROCR-Runtime commit: 87aca673e8]
With the next patch, child processes need to fully reinitialize the
topology in order to recreate the process apertures. Just calling
hsaKmtOpenKFD is no longer sufficient. Tests based on
KFDMultiProcessTest already did this correctly (KFDHWSTest, eviction
tests). This patch fixes KFDExceptionTest and KFDIPCTest.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Iaad24e88ddd29c1105bf791a77891cc55a6072ff
[ROCm/ROCR-Runtime commit: 412b24137e]
We should link against numa without hardcoding the path to it.
CMake should determine how to link numa automatically, similar to how rt
and pthread are linked.
Fixes
Change-Id: Ifb9ac30e200c66cbd7f1cf80d25fffef1dcf8d2f
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: aa25cb1acc]