Commit Graph

1166 Commits

Author SHA1 Message Date
Philip Yang 6bf1babb51 kfdtest: Fix KFDSVMEvictTest.QueueTest OOM
Fix a typo in calculating bufferSize from vramBufSizeInPages. The OOM shows up
only with HSA_XNACK=1 because HSA_XNACK=0 doesn't support VRAM
oversubscription. We changed to run SVM tests with both XNACK off and
on.

Change-Id: I3949959288fd92f4e7f4a87115a5f1547e225042
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 29b04c2534]
2023-06-13 21:15:31 -04:00
James Zhu 1ff2f1f7d8 kfdtest: Add test for event wait with event age tracking enable
Add 5 different test scenarios to cover the new event age tracking features.

Change-Id: Icab43240fd127208b18abbd7542d6444127ef0c7
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: 4ba8f1fe77]
2023-06-10 11:41:50 -04:00
James Zhu 498b718e83 libhsakmt: add event age tracking
Keep the last signaled event age to avoid race conditions
for HSA_EVENTTYPE_SIGNAL when the initial event age value is non-zero.

Change-Id: Ifb9a11a6868e5762a9f92f579e45a0a2c8fa1017
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: a0cbf90b90]
2023-06-10 11:41:50 -04:00
Ori Messinger 0f44742bc4 kfdtest: Fix minor typo
The purpose of this patch is to fix a minor typo in KFDSVMRangeTest.
Before:
"Skipping test: no enough system memory."
After:
"Skipping test: Not enough system memory."

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I247cb558a177a1d25c393bf16c7386f4d79d0fba


[ROCm/ROCR-Runtime commit: 4675492852]
2023-06-08 15:58:25 -04:00
Graham Sider 73f293fc01 kfdtest: Update GFX11 blacklist
KFDQMTest.MultipleCpQueuesStressDispatch is fixed as of MES SCHQ version
0x3c ().

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I437f3eb5f12dc159339a9b7c7cff2e2b8214ad7c


[ROCm/ROCR-Runtime commit: d1a095123d]
2023-06-08 14:11:30 -04:00
Sreekant Somasekharan c84cdca17e kfdtest: RoundToPowerOf2 function modified for compiler compliant bit shift values
The behavior of a bit shift is undefined if the right operand is negative,
or greater than or equal to the width of the promoted left operand.
For release builds with address sanitizer enabled, compiler optimization
around this undefined behavior leads to an unsupported queue size value,
since the current method shifts by up to 128 bits on a 64-bit value.

Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com>
Change-Id: Iafdc82d0dfb7f79e3012fb7bb70eda80e4b7a7a6


[ROCm/ROCR-Runtime commit: 1428a7538e]
2023-06-05 18:14:58 -04:00
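The shift rule described in the RoundToPowerOf2 commit above can be illustrated with a minimal sketch. This is not the kfdtest implementation; the name RoundUpToPowerOf2 and its handling of 0 are assumptions made for illustration. The point is only that every shift count stays strictly below the 64-bit operand width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the kfdtest code: rounds a value up to the next
// power of two while keeping every shift count strictly below 64.
inline uint64_t RoundUpToPowerOf2(uint64_t value) {
    if (value <= 1) {
        return 1;  // assumption for this sketch: 0 and 1 both round to 1
    }
    // Values above 2^63 have no 64-bit power of two to round up to.
    assert(value <= (UINT64_C(1) << 63));
    value -= 1;
    // Smear the highest set bit into all lower bits.
    // The shift counts used are 1, 2, 4, 8, 16 and 32, never 64 or more.
    for (unsigned shift = 1; shift < 64; shift <<= 1) {
        value |= value >> shift;
    }
    return value + 1;
}
```

Because the loop stops before the shift count reaches 64, the result does not depend on how a given compiler or sanitizer build treats an out-of-range shift.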
Alex Sierra 6998fcee45 libhsakmt: include changes for upstream debugger API
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Id296e13dff431c7a151c5aae0b93412b1e116467


[ROCm/ROCR-Runtime commit: 728162c2c8]
2023-06-02 15:56:19 -04:00
Kent Russell c11b1022f1 fmm.c: Fix possibly uninitialized variable usage
If we end up in the first if clause, aperture_base is not set, unlike
the other 2 clauses. Initialize it to NULL at declaration time, and only
change its value in the final else clause, where we set it to
aperture->base

Change-Id: I2bf44dc93cae8a03e66f41cedd85d57be2115bba
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: 718d95de77]
2023-05-30 15:46:14 -04:00
Xiaogang Chen 682173c851 libhsakmt: allow gpu nodeid array is null and number of gpu is zero.
Allow the hsaKmtRegisterGraphicsHandleToNodes parameters NodeArray to be null
and NumberOfNodes to be zero at the same time. This is the case when we want the
imported buffer not to be registered by kfd. Set gpu_id_array = NULL explicitly to
avoid freeing an uninitialized gpuid array.

Report: Yat Sin, David<David.YatSin@amd.com>
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3babc1160c9573e38dd11d81965c8de2b70cae2e


[ROCm/ROCR-Runtime commit: f6183f937e]
2023-05-29 00:15:14 -04:00
Xiaogang Chen ebce4177ad libhsakmt: have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu.
Have hsaKmtMapMemoryToGPU return the same value as fmm_map_to_gpu for consistency.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ifabb72301e1d5a6c1310973bb1321714e12a1fa6


[ROCm/ROCR-Runtime commit: 7e4e57ae5f]
2023-05-29 00:15:14 -04:00
Xiaogang Chen dd8954e83e libhsakmt: query/use render node fds that libdrm uses.
Query render node fds that libdrm uses for current process and
use them at Thunk if available.

v2: avoid naming conflict with amdgpu_device_get_fd from amdgpu.h

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Id7288c03730f4a4c9c3644e37ca4725fec71a471


[ROCm/ROCR-Runtime commit: ac1db60fc2]
2023-05-29 00:15:14 -04:00
Xiaogang Chen 6800fbec43 libhsakmt: add NodeId at HsaGraphicsResourceInfo.
Return GPU NodeId that exported the DMA buffer from amdgpu graphic driver
at fmm_register_graphics_handle.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Iaeccce6e6d0b7e27f10b15ed89d1b5310d03d44b


[ROCm/ROCR-Runtime commit: 9bebb276be]
2023-05-29 00:15:14 -04:00
Xiaogang Chen eeec387ca2 libhsakmt: add DMABuf import without address allocation.
When gpu map info is not provided, import the DMABuf without a VA assigned.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I996ab4eb46977af5064126529c28a8bf20a67292


[ROCm/ROCR-Runtime commit: 989c6c617c]
2023-05-29 00:15:14 -04:00
Xiaogang Chen 4d6b75d857 libhsakmt: support allocating a fixed address at mmap_aperture.
When HsaMemFlags.ui32.FixedAddress=1 allocate fixed address at mmap_aperture.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I1f3b532ec3c1a4fb0962126a0bd56441abaf6a9c


[ROCm/ROCR-Runtime commit: d2a37894bb]
2023-05-29 00:15:14 -04:00
Xiaogang Chen 038916c727 libhsakmt: update HsaPointerInfo for address-only allocated VRAM.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ib88b34dff772997d2b2e5f3c7e333cef3092ef56


[ROCm/ROCR-Runtime commit: 11ac57d293]
2023-05-29 00:15:14 -04:00
Xiaogang Chen a7ccb14b9c kfdtest: add kfdtest cases for VA-only, VRAM-only allocated VRAM.
Allocate VRAM through kfd, then map it to the GPU VM via the GEM API and map it to the CPU VM.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ib5b2f35662cd5473f622f6ffc9b62925fe57ae42


[ROCm/ROCR-Runtime commit: 108c0e5f92]
2023-05-29 00:15:14 -04:00
Xiaogang Chen b4b03aca20 libhsakmt: support vram-only and VA-only alloc/free.
Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com>
Change-Id: I47cf53642d2ea197c08b20e84d7cae04b2d431e0


[ROCm/ROCR-Runtime commit: 0138487aa4]
2023-05-29 00:15:14 -04:00
Xiaogang Chen ee0c668706 libhsakmt: add/init a new manageable_aperture_t from NON_CANONICAL space.
This new manageable_aperture_t is used for VRAM allocation-only and
VA allocation-only.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3866ef9d35386d6aef7b6934ac8d4a89ef843b50


[ROCm/ROCR-Runtime commit: 0a2989083b]
2023-05-29 00:15:14 -04:00
Xiaogang Chen 51392afedb libhsakmt: Revert "libhsakmt: Update FD creation logic"
This reverts commit 89ce41694f.
The current amdgpu driver exposes one render node per gpu node/partition,
so revert to the previous way of opening render nodes in Thunk.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I436be74f8e872a7ab5c4a1420b4ea884f5a00e57


[ROCm/ROCR-Runtime commit: cc4fb2d1a9]
2023-05-29 00:15:14 -04:00
Kent Russell 8bbcfca082 kfdtest: Test XNACK on and off for SVM tests
Add parameterization for KFDSVM tests so that we test with both XNACK
enabled and XNACK disabled. This will be overridden by HSA_XNACK, if set

Change-Id: Ie96eb61c03115f947e08cfa076ac459f7440f5d8


[ROCm/ROCR-Runtime commit: 478a68d49c]
2023-05-25 12:08:16 -04:00
Philip Yang 3795454c1b kfdtest: Enable KFDEvictTest and KFDSVMEvictTest on aqua_vanjaram
For aqua_vanjaram APU mode, KFDEvictTest and KFDSVMEvictTest are
skipped. Those tests passed on dGPU mode with memory reporting partition
support on GFX 9.4.3.

Change-Id: I56357843c6743b01b807359dbb37b32391fd9a25
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 5df82e3d14]
2023-05-17 17:17:46 -04:00
Bing Ma a868d0972b libhsakmt: Add support functions for ASAN
Add support functions to remap the first page of device memory (GPU/GTT)
to share host ASAN logic.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: I4c27d5417ba80a172dccb0a079a597c5dc1c8f85


[ROCm/ROCR-Runtime commit: 1e6d728730]
2023-05-17 13:38:19 -04:00
Kent Russell 4d26b1cf48 kfdtest: Add include directory for ROCr merge
When we merge thunk into ROCr, kfdtest will be in a different folder
structure. Add the new location to ensure that we can build now and in
the future with no disruptions

Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I6517e061cb0da7137d903abbc380bfc7126f40d4


[ROCm/ROCR-Runtime commit: d966243783]
2023-05-15 10:13:49 -04:00
Yifan Zhang 7b4b914b72 kfdtest: Using non-paged memory allocation for wptr on devices that have MES scheduler
Starting with GFX11, wptr BOs must be mapped to GART for MES to determine work
on unmapped queues for usermode queue oversubscription (no aggregated doorbell)

Change-Id: I10e30fdc2bec587cef9427faa4874957988c34b3
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>


[ROCm/ROCR-Runtime commit: d319660838]
2023-05-12 01:06:37 -04:00
Yifan Zhang 32eb2e1b33 kfdtest: add non paged wptr judging API.
If MES is enabled, wptr has to be in non-paged memory.
Add an API to check this condition.

Change-Id: I53af1f6687d5332d102e7062c3d760e33b96e722
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>


[ROCm/ROCR-Runtime commit: 53ed978c3d]
2023-05-12 01:06:37 -04:00
Ranjith Ramakrishnan b2279a9f0d Set the default value of ROCM_HEADER_WRAPPER_WERROR to OFF
Using wrapper header files will result in #warning message by default

Change-Id: I8301e433d39f3e5d39384ede6f0e4464d0eb20a6


[ROCm/ROCR-Runtime commit: b487f87363]
2023-05-10 12:36:00 -04:00
Shane Xiao 1f112bced0 kfdtest: DeviceHdpFlush need set target ASIC with different Gfx versions
If Dev0 and Dev1 are not the same gfx, we should temporarily
set the target ASIC for compiling Shader code.

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Signed-off-by: Shikai Guo <shikai.guo@amd.com>
Change-Id: I5836beb16ade519f5a148d3d2b9c2875554f0c35


[ROCm/ROCR-Runtime commit: 5d6f900353]
2023-05-09 09:50:07 -04:00
Graham Sider ccb7c62b79 kfdtest: Add Assembler::RunAssembleBuf overload
Overload Assembler::RunAssembleBuf to take in an extra Gfxv parameter.
Using this overload will temporarily set the target ASIC to Gfxv before
calling RunAssemble, and copy back the original MCPU literal upon
completion. The copy to reset the original MCPU in this case is safe as
the MCPU length is always known.

This will be useful in multi-device test cases whereby the devices are
not necessarily the same gfx version. The overload is explicitly for the
RunAssembleBuf wrapper rather than RunAssemble to ensure the default
MCPU is always reset independent of errors in RunAssemble.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I7fe5a962876314b6df32e4b7160174949d98f9e3


[ROCm/ROCR-Runtime commit: 54136f60a0]
2023-05-08 11:35:32 -04:00
Graham Sider 312b960e9c kfdtest: Fix new shader directives
LLVM MC does not seem to accept multi-line conditionals. This may be
fixable in the future with macros. The Aqua Vanjaram shader spec states
that while buffer_invl2 has been replaced by buffer_inv, the former may
still be used for compatibility. However, this does not seem to be
implemented. For now, fix conditional.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I7f8b64c96055371d7e0090b758d2cfd2a37ecd3c


[ROCm/ROCR-Runtime commit: 92f3d4a458]
2023-04-27 10:48:44 -04:00
xinhui pan a945a9824e thunk: Fix and optimise for pointer range search
The previous code might fail to get the correct ln node and trigger an
extra walk through the tree. Fix it.

While walking through the tree, it is better to search from right to left,
as node->start is likely close to *address*.

Change-Id: If86ddf73e59a1eb88225d1ea90797818e8165488
Signed-off-by: xinhui pan <xinhui.pan@amd.com>


[ROCm/ROCR-Runtime commit: 77761836ae]
2023-04-20 19:36:29 -04:00
David Francis d6edf970cf kfdtest: Enable gfx90a coherency tests on Aqua Vanjaram
These tests should also pass on Aqua Vanjaram, so enable them

Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ibbb9cd43d653c63b08c39efd1d7326cfac1f8411


[ROCm/ROCR-Runtime commit: eed5518e4c]
2023-04-19 10:28:05 -04:00
David Francis b7dcb91b58 kfdtest: Add coherency tests for Aqua Vanjaram
Aqua Vanjaram is intended to have fine-grained coherency
from anywhere to anywhere else using read-acquire and
write-release primitives.

Add a test that writes to memory covered by five
different cache lines, then write-releases, while
another thread read-acquires, then reads those
five locations in memory.

There are nine variations of the test to cover
CPU-GPU, same-GPU and across-GPU, vector instructions and
scalar instructions, and data local to the
acquirer or receiver.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I20d2db5c53bd280e971479aad7e61df6ed5d3623


[ROCm/ROCR-Runtime commit: 30b1f23f7a]
2023-04-19 10:28:05 -04:00
Philip Yang 25a9421f64 kfdtest: fix KFDSVMRangeTest.MultiGPU tests vector iterator
In a vector iterator loop, access the current node directly; there is no need
for gpuNodesAll.at(i), which also causes an out-of-range access.

Change the vector index loop to an iterator loop to simplify the code.

Change-Id: I2627ef8d13b5d2c9cd8c51cf4dacc3e8a97fcfb0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 0696f06c16]
2023-04-18 17:58:06 -04:00
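The loop change described in the KFDSVMRangeTest.MultiGPU commit above can be sketched as follows. Only the name gpuNodesAll comes from the commit message; the function CollectNodes, the element type and the non-negative check are assumptions made for illustration, and the actual test code may look different.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the test loop: visits every GPU node ID once.
inline std::vector<int> CollectNodes(const std::vector<int>& gpuNodesAll) {
    std::vector<int> visited;
    // Range-based loop: `node` already is the current element, so there is
    // no index to misuse. Writing gpuNodesAll.at(node) here would treat a
    // node ID as a position and can throw std::out_of_range.
    for (int node : gpuNodesAll) {
        assert(node >= 0);  // assumption for this sketch: IDs are non-negative
        visited.push_back(node);
    }
    return visited;
}
```

With node IDs such as 1, 2 and 5 in a three-element vector, the value 5 is a valid element but not a valid index, which is the kind of out-of-range access the commit message describes.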
Philip Yang fed84df49f kfdtest: AppAPU Skip KFDEvictTest, KFDSVMEvictTest, HMMProfilingEvent
AppAPU VRAM is part of system memory managed by the Linux kernel, so no
VRAM eviction and restore is needed between VRAM and system memory.
Those evict tests currently fail on AppAPU, so skip them there.

There is no page migration between VRAM and system memory on AppAPU, and
HMMProfilingEvent depends on migration events, so skip it on AppAPU as well.

Change-Id: I4c809b97c947e809d136c1f88db2278cf74f5b47
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 21abaef3f8]
2023-04-17 13:28:23 -04:00
Philip Yang e80be9112f kfdtest: Add helper to check if IsAppAPU system
If there is a connection between a GPU and a CPU with weight 13
(KFD_CRAT_INTRA_SOCKET_WEIGHT), then this is an AppAPU.

This will be used to skip tests not suitable for AppAPU.

Change-Id: If6fad81528b52afd4ac4cefa508d787b0f6637ca
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: e2df2c21af]
2023-04-17 13:28:23 -04:00
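The detection rule in the IsAppAPU commit above can be sketched with stand-in types. The struct IoLink, the constant kIntraSocketWeight and the function LooksLikeAppAPU are all hypothetical names introduced here; only the weight value 13 and its meaning come from the commit message. The real helper presumably reads link information from the topology rather than from a plain vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one link from a GPU node to a peer node.
struct IoLink {
    bool peerIsCpu;   // true when the other end of the link is a CPU node
    uint32_t weight;  // link weight as reported for this connection
};

// Value the commit message gives for KFD_CRAT_INTRA_SOCKET_WEIGHT.
constexpr uint32_t kIntraSocketWeight = 13;

// Returns true when any GPU-to-CPU link carries the intra-socket weight.
inline bool LooksLikeAppAPU(const std::vector<IoLink>& gpuLinks) {
    return std::any_of(gpuLinks.begin(), gpuLinks.end(),
                       [](const IoLink& link) {
                           return link.peerIsCpu &&
                                  link.weight == kIntraSocketWeight;
                       });
}
```

A test could then call such a helper once during setup and skip eviction or migration cases when it returns true.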
Jesse Zhang 76c0c340f9 libhsakmt: Add compute core check for APU
We should check compute core instead of cpu core,
in order to exclude the case of APU.

Signed-off-by: Jesse zhang <jesse.zhang@amd.com>
Change-Id: I2ec2a6807f51f49f80e0e500f5d9af81c2efae37


[ROCm/ROCR-Runtime commit: 4d54d6e706]
2023-04-17 09:34:37 +08:00
Graham Sider 17465d0e4f kfdtest: Fix PersistentIterateShader for gfx target 9.4.x
Replace 'flat_load_dword <...> glc' with appropriate macro.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I9fdc7c916c685304457cd9698e741577f6c10c82


[ROCm/ROCR-Runtime commit: 11a04fe1f5]
2023-04-14 12:05:08 -04:00
Graham Sider beae8d0713 kfdtest: Add flat compatibility macros for gfx target 9.4.x
For GC 9.4.0, modifications were made to various shaders since certain
flat_ instructions no longer support glc/slc modifiers (replaced with
nt/sc1/sc0). Instead of repeating conditionals inside various shader
bodies, we can make use of LLVM AMDGCN macros.

This patch modularizes the shader macros into separate defines. Prior
to the core raw-string literal, each shader now starts with the
SHADER_START literal (".text\n") plus any number of SHADER_MACRO_*
literals. This allows us to separate the macro definitions logically and
use the pre-processor to only include the required macro groups on a
per-shader basis.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I19eb3fd14252a0601bb7509249051b68e7fdb02a


[ROCm/ROCR-Runtime commit: e2435d9e93]
2023-04-14 12:05:08 -04:00
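The layout described in the flat compatibility macros commit above relies on adjacent string literals being concatenated by the compiler. A minimal sketch, assuming a made-up macro group: SHADER_START and its value come from the commit message, while SHADER_MACRO_EXAMPLE, EXAMPLE_NOP, kExampleShader and ExampleShaderSource are hypothetical names that do not exist in kfdtest.

```cpp
#include <cassert>
#include <string>

// Section header that every shader source starts with (from the commit text).
#define SHADER_START ".text\n"

// Hypothetical macro group: one assembler macro wrapping a no-op instruction.
#define SHADER_MACRO_EXAMPLE      \
    "    .macro EXAMPLE_NOP\n"    \
    "        s_nop 0\n"           \
    "    .endm\n"

// Adjacent literals are joined at compile time, so the shader text is the
// header, then the selected macro groups, then the raw-string shader body.
static const char* const kExampleShader =
    SHADER_START
    SHADER_MACRO_EXAMPLE
    R"(
        EXAMPLE_NOP
        s_endpgm
    )";

// Returns the assembled source text for inspection.
inline std::string ExampleShaderSource() {
    return kExampleShader;
}
```

A shader that does not need a given macro group simply omits that SHADER_MACRO_* line, so the assembler only ever sees the definitions the shader actually uses.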
David Francis 50fbd49bbb kfdtest: Make queue evict tests use constant number of wavefronts.
Previously, KFDEvictTest.QueueTest and KFDSVMEvictTest.QueueTest
would create a variable number of wavefronts, one for each 64MB
of memory under test. This ran into limits on the buffers used
by the wavefronts, and may at some point have exceeded the
wavefront limit.

Restrict the number of wavefronts to 512, and adjust the shader
to accommodate a variable buffer size.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I2ec292e2900e2efa62a08313bca3d2f4bdabca8b


[ROCm/ROCR-Runtime commit: 680c8ca5a9]
2023-04-14 12:05:08 -04:00
Graham Sider d8960d6b57 libhsakmt: Mask stepping version for GC 9.4.3 checks
GC 9.4.3 sets the gfx target version to 9.4.x depending on revision and
capabilities. Due to this, where applicable, mask off the gfx target
stepping version and only check major/minor version (9.4). There are no
collisions due to this change since GC 9.4.3 is the only ASIC that uses
gfx target version 9.4.x.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I72803e594c421f054d18ccfa7e92c507128fa5be


[ROCm/ROCR-Runtime commit: 831d1ad352]
2023-04-14 12:03:23 -04:00
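The masking described in the GC 9.4.3 commit above can be sketched as follows. The packed layout (major, minor and stepping in one byte each) is an assumption made for this example, and PackGfxVersion, kGfxVersionSteppingMask, kGfxVersion94x and IsGfx94x are hypothetical names; the library's actual encoding and helpers may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for this sketch: 0x00MMmmss (major, minor, stepping).
constexpr uint32_t kGfxVersionSteppingMask = 0xFFu;
constexpr uint32_t kGfxVersion94x = 0x090400u;  // 9.4 with stepping cleared

// Packs the three version fields into one integer using the assumed layout.
constexpr uint32_t PackGfxVersion(uint32_t major, uint32_t minor,
                                  uint32_t stepping) {
    return (major << 16) | (minor << 8) | stepping;
}

// Compares only major/minor by clearing the stepping byte first.
constexpr bool IsGfx94x(uint32_t fullVersion) {
    return (fullVersion & ~kGfxVersionSteppingMask) == kGfxVersion94x;
}
```

Under this layout, 9.4.0 and 9.4.2 both match, while a version such as 9.0.10 does not, which mirrors the commit's note that the check cannot collide with other ASICs.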
Philip Yang bc5d6f5bf7 kfdtest: KFDMemoryTest.DeviceHdpFlush requires large bar
KFDMemoryTest.DeviceHdpFlush requires device node 0 to be a large bar GPU so
that VRAM content can be checked from the CPU. Run the test only if device 0
is a large bar GPU.

Change-Id: I874b153219550c50b724625e971e3ed3a84dc652
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 598e3e8d86]
2023-04-14 10:03:38 -04:00
David Francis 252acefd9c kfdtest: Restrict DriverHDPFlush to systems with PCIe
Nodes with XGMI have no HDP, so DriverHDPFlush should skip.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: If5a87e660712e51d03e750d8e044786036b2e603


[ROCm/ROCR-Runtime commit: e32278a612]
2023-04-14 10:03:38 -04:00
David Francis 3560adf58c kfdtest: Deprecate PollNCMemoryIsa
Even with the restriction to only compile on gfx90a, this
shader still fails CompileShaders test.

There don't seem to be any systems that actually use it.

Leave it in the shader store, but remove it otherwise

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I41bec6ba10363d42b163ac101c3a92edaad6d6df


[ROCm/ROCR-Runtime commit: 16c6530330]
2023-04-14 10:03:38 -04:00
David Francis 47ddda3c6d kfdtest: Use scalar path for PollMemoryIsa Shader on gfx940
A gfx940 code path was erroneously added to this shader.

It's unnecessary; without this path, the shader uses
the scalar store, which works just fine on gfx940 without changes.

Remove it.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I825cbbebbdb25c4a7c2f16e228c2bea6a6bcc30c


[ROCm/ROCR-Runtime commit: 2a01e5c33b]
2023-04-14 10:03:38 -04:00
Ori Messinger 378f9999ed kfdtest: Update blacklist for Aqua Vanjaram
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I8f822bb71e8e5dbee6bdb62f77cbe5ea83faabb5


[ROCm/ROCR-Runtime commit: c234f84245]
2023-04-14 10:03:38 -04:00
David Francis 78f489fb95 kfdtest: Update shaders to compile on gfx940
gfx940 changed the semantics of the glc and slc coherency options
on vector stores and loads. This means that shaders that use
those bits no longer compile on gfx940.

Add precompilation if statements to those shaders to use the
new coherency bits.

Also add gfx940 to ASMTest so that compilation is tested.

Note: One of the tests enabled by this patch on gfx940,
KFDEvictTest.QueueTest, does not pass on gfx940 emulators.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I942f9d2536e9eb5510c4d5af30df6ff1a95c8cf7


[ROCm/ROCR-Runtime commit: 30da9a3cf9]
2023-04-14 10:03:38 -04:00
Graham Sider 543fe60c96 libhsakmt: Fix queue destroy SVM path free size
Use q->total_mem_alloc_size for munmap in SVM codepath of free_queue.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I2fecaa1ddb337b1fe71f9cbba45a0c9467eff0c0


[ROCm/ROCR-Runtime commit: ae659e5427]
2023-04-14 10:03:38 -04:00
Mukul Joshi dc0d800908 libhsakmt: Fix memory leak on queue destroy for GFX9.4.3
Currently, on queue destroy, context save restore memory is freed
only for a single XCC. Instead, we need to free the entire context
save restore memory, which was allocated for all XCCs.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I51ebb12fa8d5ebed41979d68e74f7c5392dca062


[ROCm/ROCR-Runtime commit: a713fb766e]
2023-04-14 10:03:38 -04:00
David Belanger 7f91f54b27 libhsakmt: EOP Removal
Do not allocate the EOP buffer when not required.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I1664a3f0a882219a72278174006cdb8d46fd4f5e


[ROCm/ROCR-Runtime commit: 252a2cf959]
2023-04-14 10:03:38 -04:00
Mukul Joshi c5b17ec68f kfdtest: Program COMPUTE_PGM_RSRC3 for GFX 9.4.3
Program ACCUM_OFFSET to match the number of VGPRS used
by the shader as part of Dispatch setup.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: Icfa1fbe4de2a62f00743de567f3ed382d3378b17


[ROCm/ROCR-Runtime commit: 8994c3ba0e]
2023-04-14 10:03:38 -04:00