Commit Graph

2925 Commits

Author SHA1 Message Date
Jeremy Newton 8e8c335aa5 Don't install asan license if disabled
Change-Id: I8bffe5ec8496ff11e6d66995dd470cddb13f3c0d
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>


[ROCm/ROCR-Runtime commit: 473a66d115]
2023-06-29 09:34:49 -04:00
Alex Sierra 6fb53cdb1b src: add debug API to support GPU core dump
Functions added to the API to extract the following information from KFD:
runtime information, device info and a queues snapshot.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: If995ecc54497ab61189bb0f209c64af0bbb0f56f


[ROCm/ROCR-Runtime commit: 0cbf26c148]
2023-06-26 18:58:15 +00:00
Alex Sierra b5ba77994a add hsaKmtGetRuntimeCapabilities API
Queries for runtime capabilities after the runtime has been enabled.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I098c0e9862c0c1d5e304b111cdc281c0ccd09691


[ROCm/ROCR-Runtime commit: 5e0a32d7b3]
2023-06-26 18:58:15 +00:00
Ori Messinger ab3b0098ba kfdtest: Fix gfx_target_version Parsing Issue
The purpose of this patch is to fix an issue in the run_kfdtest.sh
script's gfx_target_version parsing.

When the character length of the "gfx_target_version" value is
equal to 5 instead of 6, it will now be zero padded on the left to
allow each Major/Minor/Stepping value to be parsed correctly.

Also, kfdtest.exclude file now replaces the default filter for
aqua_vanjaram with the following 3 gfx filters:
gfx940, gfx941, & gfx942

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I1f0264d3705803f24ad3c458e6bd367fbbec62be


[ROCm/ROCR-Runtime commit: 3447b795df]
2023-06-23 13:18:05 -04:00
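
The gfx_target_version parsing fix above comes down to left-padding the value to six digits before splitting it into two-digit Major/Minor/Stepping fields. The actual change lives in the run_kfdtest.sh shell script, which is not shown here; the Python below is only a sketch of the same idea. The function names are hypothetical, and the formatting of the target name (stepping printed in hexadecimal) is an assumption.

```python
from typing import Tuple


def parse_gfx_target_version(raw: str) -> Tuple[int, int, int]:
    """Split a gfx_target_version string into (major, minor, stepping)."""
    digits = raw.strip()
    if not digits.isdigit() or len(digits) > 6:
        raise ValueError(f"unexpected gfx_target_version: {raw!r}")
    # Left-pad to six digits so that each field is exactly two characters wide.
    padded = digits.zfill(6)
    return int(padded[0:2]), int(padded[2:4]), int(padded[4:6])


def gfx_name(raw: str) -> str:
    """Format the parsed version as a target name (assumed naming scheme)."""
    major, minor, stepping = parse_gfx_target_version(raw)
    return f"gfx{major}{minor}{stepping:x}"
```

Without the padding, a five-character value such as 90402 would be split at the wrong offsets and the Major field would swallow part of Minor.
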
David Yat Sin 175265aef4 Add query for driver gpu_id
Add a query for the OS driver node ID (gpu_id).

Change-Id: I72ebc54d8ae5dbcd1346535912160a642b1065ae


[ROCm/ROCR-Runtime commit: 60a0fd64c4]
2023-06-23 15:02:48 +00:00
Konstantin Zhuravlyov e126b5a054 Cache referenced symbol table when pulling data in relocation section
Change-Id: I6ef21cedde1aca6fd1ec5e5d5634563f030eaab8


[ROCm/ROCR-Runtime commit: 8a6edb07d9]
2023-06-21 16:35:45 -04:00
Jonathan Kim dbf125b5cf Prevent unnecessary SDMA queue creation on copy on status
Unless SDMA blits have actually been used for copies, prevent the DMA
copy status from querying the blit's pending byte status to avoid
creating an unnecessary HW queue.

Change-Id: Ied1fbed73c08f0408f0e3583f9b56f2768c71708


[ROCm/ROCR-Runtime commit: 92467fd282]
2023-06-21 03:10:53 -04:00
Jonathan Kim 2147e8ccbf Prevent blit copy pending bytes query when out of SDMA resources
Querying pending bytes on a blit kernel is unnecessary when the runtime
runs out of SDMA resources, since we are returning an SDMA availability
mask.

Change-Id: I347efba0c85b70ea3ba8749d76a499afc23909e8


[ROCm/ROCR-Runtime commit: 8c60f04a99]
2023-06-21 03:10:52 -04:00
Shweta Khatri 76cc9034ff Defined a new extended scope memory region
Added HSA_AMD_MEMORY_POOL_GLOBAL_FLAG_EXT_SCOPE_FINE_GRAINED flag to enable extended scope memory region
where the device-scope atomics act as system-scope atomics

Change-Id: I79fc3207cb630dfc68bed2f8aabd75f35fe80b12


[ROCm/ROCR-Runtime commit: 77bf357647]
2023-06-20 11:00:05 -04:00
James Zhu 3ac5245f3b Enable sleep for all waiters
Enable sleep for all waiters when the kernel supports event age tracking.

Change-Id: Icd4e1e8d83b4a54e9f6aaa99691a6573211b3337
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: 36666f5895]
2023-06-20 09:32:16 -04:00
James Zhu 2cf7c88b34 Add kernel version flag supports event age
KFD kernel version 1.13 adds support for event age
tracking, which helps eliminate unnecessary busy waiting.

Change-Id: Ib447ed6e0350f3110a4d6b9b80a0388000dd0e72
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: 5871b28503]
2023-06-20 09:32:03 -04:00
Sreekant Somasekharan bf22d10ceb rocrtst: Fix RoundToPowerOf2 function
The behavior of a shift is undefined if the right operand is negative,
or greater than or equal to the width of the promoted left operand.
For release builds with address sanitizer enabled, the resulting compiler
optimization leads to an unsupported queue size value, since the
current method shifts by up to 128 bits on a 64-bit value.

Change-Id: Iddcc15b43d2331bc8bf5fc3aa4725f76844655ec
Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com>


[ROCm/ROCR-Runtime commit: ea2f832a43]
2023-06-19 19:17:49 -04:00
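
The RoundToPowerOf2 fix above concerns shift amounts that reach or exceed the operand width. The log does not show the actual rocrtst code, so the following is only a minimal sketch of one common way to round up to a power of two while keeping every shift below 64. The function name `round_to_power_of_2` and the constant `MASK64` are illustrative, and because Python integers are unbounded, the mask merely models a 64-bit unsigned value.

```python
MASK64 = (1 << 64) - 1  # models a 64-bit unsigned integer (assumption)


def round_to_power_of_2(value: int) -> int:
    """Round value up to the next power of two using bounded shifts."""
    if value <= 1:
        return 1
    v = (value - 1) & MASK64
    shift = 1
    # Shift amounts are 1, 2, 4, 8, 16 and 32: always below the 64-bit width,
    # which is the condition C and C++ require for a well-defined shift.
    while shift < 64:
        v |= v >> shift
        shift <<= 1
    # Wraps to 0 for inputs above 2**63, as unsigned arithmetic would in C.
    return (v + 1) & MASK64
```

In this sketch, a value that is already a power of two is returned unchanged.
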
Jonathan Kim 63463b14c3 Ensure HSA_ENABLE_SDMA=0 persists on new copy on engine API
Copy on engine API still needs to respect HSA_ENABLE_SDMA settings.

Change-Id: I26038b1e3082d62687c2e279615557583d20f229


[ROCm/ROCR-Runtime commit: 3e3e11bc5a]
2023-06-19 13:48:59 -04:00
raghavmedicherla 2758da98cd [hsa-runtime] Add support to hsa-runtime to find symbols from ".dynsym" section.
Previously, hsa-runtime was unable to find symbols in a stripped ELF image because
it had no support for looking up symbols in the ".dynsym" section.

Looking for symbols in .dynsym is enabled by setting the LOADER_USE_DYNSYM=1
environment variable.

Change-Id: I4f0e8dd0eb053a6066d4d49b670c52e51149531a


[ROCm/ROCR-Runtime commit: 4142a77375]
2023-06-16 14:40:50 -04:00
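
The LOADER_USE_DYNSYM switch described above can be pictured with a toy model. This is a sketch only: the real loader is not shown in this log, the two symbol tables are modelled as plain dictionaries, and the fallback order (try .symtab first, then .dynsym when enabled) is an assumption made for illustration. The names `use_dynsym` and `find_symbol` are hypothetical.

```python
import os


def use_dynsym(environ=os.environ) -> bool:
    """Return True when LOADER_USE_DYNSYM is set to 1."""
    return environ.get("LOADER_USE_DYNSYM", "0") == "1"


def find_symbol(name, symtab, dynsym, environ=os.environ):
    """Look up name in .symtab, then in .dynsym if the switch is enabled.

    symtab and dynsym are dicts standing in for the ELF sections; a stripped
    image is modelled as an empty symtab. The lookup order is an assumption.
    """
    if name in symtab:
        return symtab[name]
    if use_dynsym(environ) and name in dynsym:
        return dynsym[name]
    return None
```

With an empty `symtab`, the lookup succeeds only when the environment mapping passed in contains `LOADER_USE_DYNSYM=1`.
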
Kent Russell 410f1dbe2d kfdtest.exclude: Blacklist CuMaskingEven on all ASICs
This has slowly become less and less reliable on more and more ASICs,
so just blacklist it altogether. Using wall clock for performance
is not a reliable method for testing performance, so skip it to avoid
more failure reports on various systems.

Change-Id: I1a5744604e4620bc7675a629d146ba4ffba669d2


[ROCm/ROCR-Runtime commit: 9a22bade89]
2023-06-15 11:24:04 -04:00
David Yat Sin 8c3acb3974 Update documentation for IPC handles
Explicitly mention that IPC handles can only be created on GPU agents.

Change-Id: I19bc3578d6e5243c795bf6fbf981ea4bd3bfc2e8


[ROCm/ROCR-Runtime commit: 5e4490f180]
2023-06-14 16:21:26 -04:00
Ruili Ji 0a8096b34a kfdtest: Update COMPUTE_PGM_RSRC1 for software trap
For ASICs in the GFX11 domain that don't need software traps,
running the test with COMPUTE_PGM_RSRC1.PRIV = 1 makes the system hang.

Change-Id: I00cf8eb6d6b07856885c77bd343ca3c41cc3cad5
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>


[ROCm/ROCR-Runtime commit: 9bf1cbe4ed]
2023-06-14 07:46:51 -04:00
Philip Yang 6bf1babb51 kfdtest: Fix KFDSVMEvictTest.QueueTest OOM
Fix a typo in calculating bufferSize from vramBufSizeInPages. The OOM shows up
only with HSA_XNACK=1 because HSA_XNACK=0 doesn't support VRAM
oversubscription. We changed to run SVM tests with both XNACK off and
on.

Change-Id: I3949959288fd92f4e7f4a87115a5f1547e225042
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 29b04c2534]
2023-06-13 21:15:31 -04:00
Jonathan Kim 1772d866c9 Soften trap handler loading failure when exception handling not supported
GFX11 and up including some GFX9 devices will not support
old trap handling without the new exception handling.

Instead of a hard assert failure that results in a core dump,
let ROCr initialization continue.

Change-Id: I309becdc72ef4fb2fafd118c1faf0801407e658e


[ROCm/ROCR-Runtime commit: bfb94b3b6e]
2023-06-13 13:05:47 -04:00
James Zhu 1ff2f1f7d8 kfdtest: Add test for event wait with event age tracking enable
Add 5 test scenarios to cover the new event age tracking features.

Change-Id: Icab43240fd127208b18abbd7542d6444127ef0c7
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: 4ba8f1fe77]
2023-06-10 11:41:50 -04:00
James Zhu 498b718e83 libhsakmt: add event age tracking
Keep the last signaled event age to avoid race conditions
for HSA_EVENTTYPE_SIGNAL when the initial event age value is non-zero.

Change-Id: Ifb9a11a6868e5762a9f92f579e45a0a2c8fa1017
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: a0cbf90b90]
2023-06-10 11:41:50 -04:00
Laurent Morichetti 3736a0ffeb Fix a race condition in the trap handler
status.priv may be read after returning from the trap handler, which
causes sq_interrupt_word_wave.priv to be 0 even though the s_sendmsg
instruction was initiated when status.priv was 1.

To work around this, add an s_waitcnt lgkmcnt(0) after s_sendmsg
to make sure the message is sent before continuing.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Laurent Morichetti <Laurent.Morichetti@amd.com>
Change-Id: Ieb75005ca1559ef03d0efac80e966f521e41fcb7


[ROCm/ROCR-Runtime commit: 6a82b0a038]
2023-06-09 10:03:55 -04:00
Ammar ELWazir 5675ed837a : Adding support to UMC & MMEA System Blocks
Change-Id: I92601f37757e0cff3f1fdc10f2e5e0db51c1ee2d


[ROCm/ROCR-Runtime commit: fc603d58d2]
2023-06-08 21:22:19 +00:00
Ori Messinger 0f44742bc4 kfdtest: Fix minor typo
The purpose of this patch is to fix a minor typo in KFDSVMRangeTest.
Before:
"Skipping test: no enough system memory."
After:
"Skipping test: Not enough system memory."

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I247cb558a177a1d25c393bf16c7386f4d79d0fba


[ROCm/ROCR-Runtime commit: 4675492852]
2023-06-08 15:58:25 -04:00
Graham Sider 73f293fc01 kfdtest: Update GFX11 blacklist
KFDQMTest.MultipleCpQueuesStressDispatch is fixed as of MES SCHQ version
0x3c ().

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I437f3eb5f12dc159339a9b7c7cff2e2b8214ad7c


[ROCm/ROCR-Runtime commit: d1a095123d]
2023-06-08 14:11:30 -04:00
Jonathan Kim 21f24c1348 Remove Tab Indent on SDMA Status Fix
Use spaces not tabs.

Change-Id: Icaeb16158ebaddd8e5ac518103d285d55fe976f3


[ROCm/ROCR-Runtime commit: 233413eb08]
2023-06-07 16:47:04 -04:00
Xiaomeng Hou 99d3d2afbd Do not reserve scratch memory on asic with finite vram resource
Change-Id: I0a2207cb01f464ed3e73331637cfa9bd62f03d97


[ROCm/ROCR-Runtime commit: 389cd3564b]
2023-06-06 22:01:31 +08:00
Sreekant Somasekharan c84cdca17e kfdtest: RoundToPowerOf2 function modified for compiler compliant bit shift values
The behavior of a shift is undefined if the right operand is negative,
or greater than or equal to the width of the promoted left operand.
For release builds with address sanitizer enabled, the resulting compiler
optimization leads to an unsupported queue size value, since the
current method shifts by up to 128 bits on a 64-bit value.

Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com>
Change-Id: Iafdc82d0dfb7f79e3012fb7bb70eda80e4b7a7a6


[ROCm/ROCR-Runtime commit: 1428a7538e]
2023-06-05 18:14:58 -04:00
David Yat Sin c83eee3f2b Removing __linux__ definition in CMake
Removing this definition as this should already be defined by compiler.
This is causing compile errors on newer versions of llvm because the
macro is being redefined.

Change-Id: Ica6a06f46a14e16d3f52e83b9b5ee8cfd7359510


[ROCm/ROCR-Runtime commit: e4fffa140a]
2023-06-05 12:23:56 -04:00
Graham Sider 5ec7dcd4c4 Revert "Disable Queue_Validation_InvalidGroupMemory"
This reverts commit 7a157d8e55.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I8424c96d5e5c3c9a9e7711ecff7c5372190b0d2d


[ROCm/ROCR-Runtime commit: e2c3c3e510]
2023-06-05 09:41:02 -04:00
Graham Sider 74f9ba24e0 rocrtst: Remove extra clear_code_object() calls
A patch was made in gfx940 npi branch to move the kernel object file
loading to outside the rocrtstNeg.Queue_Validation_* main queue creation
and submission loops, and added a clear_code_object() after the loop.

Another patch was made to the non-npi branch which adds a
clear_code_object() inside the loop. When the npi branch patch was
merged, this was causing the code object to be cleared at the end of
the first loop. Remove these clear_code_object() calls.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Id4188e78411e81c5071bf715c1f02491f571ab79


[ROCm/ROCR-Runtime commit: dbe2a82e35]
2023-06-05 09:41:02 -04:00
Alex Sierra 6998fcee45 libhsakmt: include changes for upstream debugger API
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Id296e13dff431c7a151c5aae0b93412b1e116467


[ROCm/ROCR-Runtime commit: 728162c2c8]
2023-06-02 15:56:19 -04:00
Xiaomeng Hou 381ea164ba Correct the SDMA engine mask reported on apu
There is only one SDMA instance on small APUs.

Change-Id: I9d4dda511c40fc78f002be720e5f1909dc5b91e4


[ROCm/ROCR-Runtime commit: 557da77c4e]
2023-06-02 19:10:08 +08:00
David Yat Sin 9c54cdaaf1 Change failure to parse CPUID to warning
Change-Id: If42dbcd11ac1be09597e43a8f11caa91cf37903e


[ROCm/ROCR-Runtime commit: fc3b554121]
2023-05-31 11:46:52 -04:00
Kent Russell c11b1022f1 fmm.c: Fix possibly uninitialized variable usage
If we end up in the first if clause, aperture_base is not set, unlike
the other 2 clauses. Initialize it to NULL at declaration time, and only
change its value in the final else clause, where we set it to
aperture->base

Change-Id: I2bf44dc93cae8a03e66f41cedd85d57be2115bba
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: 718d95de77]
2023-05-30 15:46:14 -04:00
Xiaogang Chen 682173c851 libhsakmt: allow gpu nodeid array to be null and number of gpu to be zero.
Allow the hsaKmtRegisterGraphicsHandleToNodes parameters NodeArray to be null
and NumberOfNodes to be zero at the same time. This covers the case where the imported
buffer should not be registered by kfd. Set gpu_id_array = NULL explicitly to avoid
freeing an uninitialized gpuid array.

Report: Yat Sin, David<David.YatSin@amd.com>
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3babc1160c9573e38dd11d81965c8de2b70cae2e


[ROCm/ROCR-Runtime commit: f6183f937e]
2023-05-29 00:15:14 -04:00
Xiaogang Chen ebce4177ad libhsakmt: have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu.
Have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu to keep consistency.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ifabb72301e1d5a6c1310973bb1321714e12a1fa6


[ROCm/ROCR-Runtime commit: 7e4e57ae5f]
2023-05-29 00:15:14 -04:00
Xiaogang Chen dd8954e83e libhsakmt: query/use render node fds that libdrm uses.
Query render node fds that libdrm uses for current process and
use them at Thunk if available.

v2: avoid naming conflict with amdgpu_device_get_fd from amdgpu.h

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Id7288c03730f4a4c9c3644e37ca4725fec71a471


[ROCm/ROCR-Runtime commit: ac1db60fc2]
2023-05-29 00:15:14 -04:00
Xiaogang Chen 6800fbec43 libhsakmt: add NodeId at HsaGraphicsResourceInfo.
Return GPU NodeId that exported the DMA buffer from amdgpu graphic driver
at fmm_register_graphics_handle.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Iaeccce6e6d0b7e27f10b15ed89d1b5310d03d44b


[ROCm/ROCR-Runtime commit: 9bebb276be]
2023-05-29 00:15:14 -04:00
Xiaogang Chen eeec387ca2 libhsakmt: add DMABuf import without address allocation.
When GPU map info is not provided, import the DMABuf without assigning a VA.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I996ab4eb46977af5064126529c28a8bf20a67292


[ROCm/ROCR-Runtime commit: 989c6c617c]
2023-05-29 00:15:14 -04:00
Xiaogang Chen 4d6b75d857 libhsakmt: support allocating a fixed address at mmap_aperture.
When HsaMemFlags.ui32.FixedAddress=1, allocate a fixed address in mmap_aperture.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I1f3b532ec3c1a4fb0962126a0bd56441abaf6a9c


[ROCm/ROCR-Runtime commit: d2a37894bb]
2023-05-29 00:15:14 -04:00
Xiaogang Chen 038916c727 libhsakmt: update HsaPointerInfo for address-only allocated VRAM.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ib88b34dff772997d2b2e5f3c7e333cef3092ef56


[ROCm/ROCR-Runtime commit: 11ac57d293]
2023-05-29 00:15:14 -04:00
Xiaogang Chen a7ccb14b9c kfdtest: add kfdtest cases for VA-only, VRAM-only allocated VRAM.
Allocate VRAM through kfd, then map it into the GPU VM via the GEM API and into the CPU VM.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ib5b2f35662cd5473f622f6ffc9b62925fe57ae42


[ROCm/ROCR-Runtime commit: 108c0e5f92]
2023-05-29 00:15:14 -04:00
Xiaogang Chen b4b03aca20 libhsakmt: support vram-only and VA-only alloc/free.
Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com>
Change-Id: I47cf53642d2ea197c08b20e84d7cae04b2d431e0


[ROCm/ROCR-Runtime commit: 0138487aa4]
2023-05-29 00:15:14 -04:00
Xiaogang Chen ee0c668706 libhsakmt: add/init a new manageable_aperture_t from NON_CANONICAL space.
This new manageable_aperture_t is used for VRAM allocation-only and
VA allocation-only.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3866ef9d35386d6aef7b6934ac8d4a89ef843b50


[ROCm/ROCR-Runtime commit: 0a2989083b]
2023-05-29 00:15:14 -04:00
Xiaogang Chen 51392afedb libhsakmt: Revert "libhsakmt: Update FD creation logic"
This reverts commit 89ce41694f.
Current amdgpu exposes one render node per gpu node/partition,
so revert to the previous way of opening the render node in the Thunk.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I436be74f8e872a7ab5c4a1420b4ea884f5a00e57


[ROCm/ROCR-Runtime commit: cc4fb2d1a9]
2023-05-29 00:15:14 -04:00
David Yat Sin 3661d76c74 Bump interface versions due to hsa_amd_memory_async_copy_on_engine added
Change-Id: Iff36719e800280d58217647bb70d3b5d5fcc91fe


[ROCm/ROCR-Runtime commit: b290d65ec9]
2023-05-26 12:04:06 +00:00
Kent Russell 8bbcfca082 kfdtest: Test XNACK on and off for SVM tests
Add parameterization for KFDSVM tests so that we test with both XNACK
enabled and XNACK disabled. This will be overridden by HSA_XNACK, if set

Change-Id: Ie96eb61c03115f947e08cfa076ac459f7440f5d8


[ROCm/ROCR-Runtime commit: 478a68d49c]
2023-05-25 12:08:16 -04:00
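
The XNACK parameterization described above (run SVM tests with XNACK both off and on, unless HSA_XNACK is set) can be sketched as a small helper that picks the modes to test. kfdtest itself is a C++ test suite and its implementation is not shown in this log; the function name `xnack_modes` and the treatment of any value other than "1" as off are assumptions.

```python
import os


def xnack_modes(environ=os.environ):
    """Return the list of XNACK settings a parameterized test should run with."""
    override = environ.get("HSA_XNACK")
    if override is not None:
        # The environment variable wins: test only the requested mode.
        return [1 if override == "1" else 0]
    # No override: cover both XNACK disabled and XNACK enabled.
    return [0, 1]
```

A test runner could then loop over `xnack_modes()` and execute each SVM test once per returned mode.
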
Graham Sider f0eeb60222 rocrtst: Throw on LocateKernelFile open() failures
Throw runtime error instead of returning empty string when open() fails
in LocateKernelFile()

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Iafa360fbc2d3c9b01b9fe7ea4c11d70bd254ccce


[ROCm/ROCR-Runtime commit: 0772e8d618]
2023-05-24 14:31:26 -04:00
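
The LocateKernelFile change above replaces a silent empty-string return with a runtime error when open() fails. The real function is C++ and its search logic is not shown in this log, so the following is only a sketch of the fail-loudly pattern; the name `locate_kernel_file` and the `search_dirs` parameter are hypothetical.

```python
import os


def locate_kernel_file(filename, search_dirs):
    """Return the first openable path for filename, or raise if none opens."""
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        try:
            # Opening the file proves it exists and is readable.
            with open(candidate, "rb"):
                return candidate
        except OSError:
            continue
    # Raising here surfaces the problem at the call site instead of letting
    # an empty path propagate into later loading steps.
    raise RuntimeError(f"could not open kernel file {filename!r}")
```

Raising at the point of failure makes a missing kernel object show up as a clear error rather than as a confusing failure further down the test.
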
David Yat Sin 3345ada378 Adding gfx941 and gfx942
Adding support for gfx941 and gfx942 ISAs.
gfx940 ISA will use sc0:1 sc1:1 on load/store operations
gfx942 ISA will use default load/store operations

Change-Id: If1efbef86f59e2cf2d48fe359cd4166405a0a579


[ROCm/ROCR-Runtime commit: 41f6d0426d]
2023-05-23 11:13:16 -04:00