Commit graph

2930 commits

Author SHA1 Message Date
David Yat Sin bd5d032fba Add clang-format file
Change-Id: Iaa207895b027227c0eae51b899f0d2e41e3f64b8


[ROCm/ROCR-Runtime commit: b7a420f7b7]
2024-12-05 11:47:35 -05:00
David Yat Sin b7208786a2 rocr: Avoid polling for SDMA signals
When all 64-bits of the signal value are 0, we can skip polling for that
signal.
We need to keep signals as 64-bit numbers as part of the spec. But most
users of ROCr will never set the signal value to more than 32 bits.
When the dependent-signals are less than 32-bits, avoid adding extra
SDMA poll packet as this adds latency to the SDMA copies.

Change-Id: I37dca65fe3f060dc7164f49b98cb1985023663c4


[ROCm/ROCR-Runtime commit: 0544c2336b]
2024-12-04 16:45:04 -05:00
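
The rule described in commit b7208786a2 above can be sketched roughly as follows. This is one reading of the commit message, not ROCr's actual code: the helper name `SdmaPollPacketsNeeded` and the model of one poll packet per 32-bit half of the signal value are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not a ROCr function): how many 32-bit SDMA poll
// packets a dependent signal needs, assuming each poll packet checks one
// 32-bit half of the 64-bit signal value.
int SdmaPollPacketsNeeded(int64_t signal_value) {
  const uint64_t bits = static_cast<uint64_t>(signal_value);
  if (bits == 0) {
    return 0;  // all 64 bits are zero: nothing to poll for
  }
  const uint32_t high = static_cast<uint32_t>(bits >> 32);
  // When the upper half is zero, the extra poll packet can be skipped.
  return high == 0 ? 1 : 2;
}
```

Under this model, a typical signal value that fits in 32 bits costs a single poll packet, which is the latency saving the commit message describes.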
Chris Freehill 2a492e6f04 rocr: Add gfx9-4-generic support
Change-Id: I4ebfbf0dcffa5b784d7fbfda7398d44dcc47aaef


[ROCm/ROCR-Runtime commit: f32e264933]
2024-12-03 19:33:57 -05:00
Lang Yu 04a13608b7 kfdtest: update PersistentIterateIsa for gfx12
1, Use s_wait_* instead of s_waitcnt
2, Remove a redundant s_waitcnt

Change-Id: Id0f31db0fc520adadd81eb574ad389f63859303a
Signed-off-by: Lang Yu <lang.yu@amd.com>


[ROCm/ROCR-Runtime commit: 37135aadfa]
2024-12-02 19:59:57 -05:00
taosang2 a5de0f048d rocr: Support different address modes
Support different address modes in X, Y, Z directions

Change-Id: If1db5a8af33c92ddc4b48968c3d8eceb97daea6a


[ROCm/ROCR-Runtime commit: df250a49a5]
2024-12-02 09:07:56 -05:00
David Yat Sin 91a28fce54 rocr: Move _loader_debug_state to rocr namespace
This avoids exposing the symbol to the default namespace

Change-Id: I2fe5fbab4b59f271effacab93eeb2d95c236ae02


[ROCm/ROCR-Runtime commit: 147abb6ca0]
2024-11-29 10:44:23 -05:00
Apurv Mishra 55064cbb4c hsakmt: modified to free all_gpu_id_array in fmm.c
Add free() for 'all_gpu_id_array' in
hsakmt_fmm_destroy_process_apertures() and
removed it from 'hsakmt_fmm_clear_all_mem()'

Change-Id: I32d2d22e7152f62a3f2e7da4f601f0db7cebd534
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>


[ROCm/ROCR-Runtime commit: c066ec13dd]
2024-11-28 13:08:03 -05:00
Apurv Mishra 05e927bcb0 hsakmt: minor code cleanup and refactor topology.c
removed unused value assignment for HSAKMT_STATUS,
restructured 'topology_sysfs_check_node_supported'

Change-Id: I21cdccb3e3c5e42981f10597426de479d0f4ee6a
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>


[ROCm/ROCR-Runtime commit: 79f0ac2534]
2024-11-28 13:06:23 -05:00
Emily Deng 7aa33126cf kfdtest: fix event leaks issue
DisableCpQueueByUpdateWithZeroPercentage needs to destroy its event to avoid
an event leak.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Change-Id: I4fb51b670fbff1edcd7fd61517f5c8a6674003c0


[ROCm/ROCR-Runtime commit: 1f9c080932]
2024-11-27 19:33:10 -05:00
Chris Freehill ed575729bd rocr: Dynamically allocate supported_isas map
This was missing from a previous commit regarding
dynamically allocated static data structures.

Change-Id: Iae1c674e762f85e3aebf338210ba96942ba80278


[ROCm/ROCR-Runtime commit: eec2130443]
2024-11-27 11:11:22 -05:00
Apurv Mishra baf737a3cb rocr: declare 'args' as class member in 'os_thread'
Removed 'args' as a unique pointer and deletion in
'ThreadTrampoline', then declared as a class member.

Change-Id: Ia52058392d0170e8b5e57cfdd2c587f47a6f93f0
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>


[ROCm/ROCR-Runtime commit: 89115369cc]
2024-11-27 10:27:40 -05:00
Emily Deng 7f0c2de9ad kfdtest: Fix the deadlock in SignalHandling
The issue arises in the CatchSignal function, which attempts to write to
the standard error stream upon receiving a signal. However, the standard
error stream may already be locked at this point, as the parent process
also attempts to write to the standard error stream after mapping the GPU
memory. This leads to a deadlock, with the program waiting for the
release of the lock on the standard error stream.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Change-Id: Ie69354f4342b96ffe1f2a87f655687da1cbee4b9


[ROCm/ROCR-Runtime commit: c8031f2a69]
2024-11-26 19:39:45 -05:00
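
The commit message for 7f0c2de9ad above does not say how the deadlock was fixed. One common way to avoid this class of problem is to keep the signal handler away from stdio entirely, as in the sketch below. The names `CatchSignalSafely`, `InstallHandler`, and `g_signal_seen` are hypothetical and are not taken from kfdtest.

```cpp
#include <signal.h>
#include <unistd.h>

// Set by the handler, read later by normal (non-handler) code.
volatile sig_atomic_t g_signal_seen = 0;

// Hypothetical handler: it only stores to a sig_atomic_t flag and calls
// write(2), which POSIX lists as async-signal-safe. It never touches the
// stdio stderr stream, so it cannot block on that stream's lock.
extern "C" void CatchSignalSafely(int signum) {
  g_signal_seen = signum;
  static const char msg[] = "signal caught\n";
  ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
  (void)ignored;  // nothing useful can be done about a failed write here
}

// Installs the handler for one signal; returns true on success.
bool InstallHandler(int signum) {
  struct sigaction sa = {};
  sa.sa_handler = CatchSignalSafely;
  sigemptyset(&sa.sa_mask);
  return sigaction(signum, &sa, nullptr) == 0;
}
```

Code outside the handler can then check `g_signal_seen` and do any formatted logging from a context where taking the stream lock is safe.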
Eric Huang c931ec17d4 kfdtest: increase test timeout and optimize evict tests timeout
There are timeout issues with the evict tests on recent new boards.
This change resolves those issues, optimizes the evict timeout, and
lets the user change the timeout on the command line.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: I2f40c8ea809c55675b0d0b62296b663481e5fb16


[ROCm/ROCR-Runtime commit: 09b899b079]
2024-11-26 11:04:29 -05:00
Apurv Mishra b552e9d15d rocr: initialized missing fields in ext_table
Added initializations for 'ext_table' in 'hsa_system_get_major_extension_table()'

Change-Id: I5e46592192b7d7a294d30011481f16e93db11794
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>
Reviewed-by: David Yat Sin <david.yatsin@amd.com>


[ROCm/ROCR-Runtime commit: d91a14ae0c]
2024-11-26 10:45:29 -05:00
Shweta Khatri 7019761f4f rocrtst: Disable FreeQueueRingBufferTest temporarily
This test is disabled until kernel patches are added to handle invalid
user actions gracefully. These patches validate and block operations
like freeing active queue buffers, which can corrupt the driver's state
if unhandled.

Currently, such operations result in driver state corruption, leading
to segmentation faults and subsequent failures during runtime.

Change-Id: If4c321a14df950a639141fc96048889659c14477


[ROCm/ROCR-Runtime commit: 2cf3813f9f]
2024-11-26 09:18:47 -05:00
German Andryeyev 18c28ebdaf rocr: Add logic to track the age of events
Some KFD versions can return from hsaKmtWaitOnMultipleEvents_Ext without
any wait and require the second call without age array init.

Change-Id: I8358c33080084d47c273c2a2827085d0570c8201


[ROCm/ROCR-Runtime commit: 816af44b05]
2024-11-25 14:55:22 -05:00
Apurv Mishra 76812e7110 rocr: uninitialized pointer read in InitScratchPool
Initialized 'scratch_base' as a nullptr to avoid
uninitialized read in hsaKmtAllocMemory()

Change-Id: I3b0e67f3fd3b591e1d21d691f0777b1d1a059b73
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>


[ROCm/ROCR-Runtime commit: 6f6ee9679c]
2024-11-25 14:02:37 -05:00
Apurv Mishra 6ff8fb022a rocr: Uninitialized scalar variables and pointer
Added check and initialized parameters for PtrInfo().

v1: Checking if PtrInfo() returns success.
v2: Initialization for variables being passed to PtrInfo().

Change-Id: If3ec4608c8e58be259b4fd51ad681b9bc34ddff6
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>
Reviewed-by: David Yat Sin <david.yatsin@amd.com>


[ROCm/ROCR-Runtime commit: 610f8a1e0f]
2024-11-22 16:23:29 -05:00
Jonathan Kim 65ebb89e70 kfdtest: exclude negative testing from gfx908
GFX 9.0.8 may not properly support pipe reset capabilities so disable
test for now.

Change-Id: I3061cdad87eb979ba884c194f4229c0cbb144ee2


[ROCm/ROCR-Runtime commit: 0f02ed6ffb]
2024-11-20 12:23:09 -05:00
Jonathan Kim 4b7e1d79b7 kfdtest: fix dispatch pointers and event leaks tests
KFDDBGTest and KFDNegative test can eat into memory and event resources
for subsequent test iterations if unallocated.

Change-Id: Iea170c20df8d487703441181b6c152b61f02d3db


[ROCm/ROCR-Runtime commit: 26d338df12]
2024-11-19 11:25:24 -05:00
Emily Deng f27d995211 kfdtest: Fix InterruptRestore randomly hang
Queue 2's wave blocked queue 1's wave save, which causes unmap
queue preemption to fail. Add a nop per SQ, as suggested.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Change-Id: Iea7f280e35487059c4499ea999b9e0cdf841d1e1


[ROCm/ROCR-Runtime commit: f047f96161]
2024-11-14 23:36:06 -05:00
Konstantin Zhuravlyov bee079fc24 loader: add gfx9-4-generic support
Change-Id: Icb148f7a78a4ce0fc661e35d0df605e05db2de3d


[ROCm/ROCR-Runtime commit: 4c7a9a0f67]
2024-11-14 12:47:46 -05:00
David Yat Sin ed5bbc1eeb rocr: Fix sem_post overflow errors
WaitSemaphore and PostSemaphore are used in the HybridMutex
implementation. If HybridMutex did not have to call WaitSemaphore when
acquired, then calling PostSemaphore would cause the internal count
inside sem_t to slowly grow to large values and eventually cause
overflow.

Change-Id: I173fc17c874b49926e56991405e9086ea8c138fc


[ROCm/ROCR-Runtime commit: f58aff630c]
2024-11-13 21:57:26 -05:00
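
The overflow described in commit ed5bbc1eeb above comes from posting a semaphore that nobody waited on. A minimal sketch of the general idea, posting only when another thread is contending, is the classic "benaphore" pattern below. This is not the ROCr HybridMutex; the class name `BenaphoreMutex` and its members are assumptions for illustration.

```cpp
#include <atomic>
#include <cerrno>
#include <semaphore.h>

// Illustrative lock: an atomic counter handles the uncontended case and the
// semaphore is touched only when two or more threads contend.
class BenaphoreMutex {
 public:
  BenaphoreMutex() { sem_init(&sem_, 0, 0); }
  ~BenaphoreMutex() { sem_destroy(&sem_); }
  BenaphoreMutex(const BenaphoreMutex&) = delete;
  BenaphoreMutex& operator=(const BenaphoreMutex&) = delete;

  void Acquire() {
    // A previous count above zero means another thread holds the lock.
    if (contenders_.fetch_add(1) > 0) {
      while (sem_wait(&sem_) != 0 && errno == EINTR) {
        // Retry if the wait was interrupted by a signal.
      }
    }
  }

  void Release() {
    // Post only if some other thread has registered as a contender, so every
    // sem_post is matched by exactly one sem_wait.
    if (contenders_.fetch_sub(1) > 1) {
      sem_post(&sem_);
    }
  }

  // Current semaphore count, exposed for inspection only.
  int SemaphoreValue() {
    int value = 0;
    sem_getvalue(&sem_, &value);
    return value;
  }

 private:
  std::atomic<int> contenders_{0};
  sem_t sem_;
};
```

Because a post happens only when a waiter exists, repeated uncontended acquire/release cycles leave the semaphore count at zero instead of letting it creep upward.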
David Yat Sin 3e694d739a rocr: Add HSA_SIGNAL_WAIT_ABORT_TIMEOUT
Add support for abort timeout when hsa_signal_wait_relaxed is called and
signal does not clear within timeout.
timeout is in seconds

Change-Id: If1db5a8af33c82ddc4b48968c3d8eceb97d0ea6d


[ROCm/ROCR-Runtime commit: 4ec730f1dc]
2024-11-13 21:57:02 -05:00
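
The behaviour described in commit 3e694d739a above can be sketched as a wait loop with a deadline. Several things here are assumptions: that `HSA_SIGNAL_WAIT_ABORT_TIMEOUT` is read from the environment, that 0 means "disabled", and the helper names `AbortTimeoutSeconds` and `WaitUntilZero`. The sketch returns a status instead of aborting so that it is easy to test.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <cstdlib>
#include <thread>

enum class WaitResult { kSignaled, kAbortTimeout };

// Reads the timeout in seconds; 0 (or an unset variable) disables it.
// Treating the setting as an environment variable is an assumption.
unsigned AbortTimeoutSeconds() {
  const char* text = std::getenv("HSA_SIGNAL_WAIT_ABORT_TIMEOUT");
  return text ? static_cast<unsigned>(std::strtoul(text, nullptr, 10)) : 0;
}

// Hypothetical wait loop: polls until the signal value reaches zero and
// gives up once the abort timeout, if enabled, has elapsed.
WaitResult WaitUntilZero(const std::atomic<int64_t>& signal,
                         unsigned timeout_s) {
  const auto start = std::chrono::steady_clock::now();
  while (signal.load(std::memory_order_relaxed) != 0) {
    const auto waited = std::chrono::steady_clock::now() - start;
    if (timeout_s != 0 && waited >= std::chrono::seconds(timeout_s)) {
      return WaitResult::kAbortTimeout;  // a runtime might abort here instead
    }
    std::this_thread::yield();
  }
  return WaitResult::kSignaled;
}
```

A caller would pass `AbortTimeoutSeconds()` as `timeout_s`, so an unset variable keeps the original wait-forever behaviour.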
Jonathan Kim 8af8a65320 kfdtest: add per-pipe reset negative test
Add basic KFD per-pipe reset support.

Change-Id: I0f68c4d33e6d043de0b5cbda1d494640ba8175f1


[ROCm/ROCR-Runtime commit: 865e32baf4]
2024-11-13 13:34:44 -05:00
Jonathan Kim 71bd95ed9e hsakmt: Update HSA capabilities with per-queue reset
Per-queue reset is now supported and flagged in HSA capabilities.

Change-Id: I21e2421da73b9fafae19c903dc3eeeab1f84968d


[ROCm/ROCR-Runtime commit: 1a4adaf7bc]
2024-11-13 13:34:35 -05:00
Konstantin Zhuravlyov 5133b16637 loader: add gfx12-generic support
Change-Id: I0bf5d48ec357278bdb7a9c4eae61a7b7995411f0


[ROCm/ROCR-Runtime commit: ec3d4aa5e9]
2024-11-11 16:27:47 -05:00
Konstantin Zhuravlyov a384ada964 loader: add gfx1153 support
Change-Id: Ie3f0ecf1c6631d95cbff5e14ddc48e751f4c356d


[ROCm/ROCR-Runtime commit: cf9c2efbbd]
2024-11-11 16:27:39 -05:00
Konstantin Zhuravlyov 048a6dc0bd loader/nfc: reorder cases when switching on targets, specific first, generic second
Change-Id: I47f38c1691b9b6ff589f7ff445143997b0801dc6


[ROCm/ROCR-Runtime commit: 7d9a51e22a]
2024-11-11 16:27:34 -05:00
Konstantin Zhuravlyov 68f7fb4fa7 loader: add missing support for gfx700
Change-Id: Ia08e93b0e2d300a183a7a5fb92604cd801b2d52a


[ROCm/ROCR-Runtime commit: 4344f012b6]
2024-11-11 16:27:27 -05:00
Ranjith Ramakrishnan 0954cb2724 Correct the provides field of hsa-rocr and hsa-rocr-devel package
Both the runtime and devel packages currently provide the hsakmt packages. Only the devel package needs to provide them.
Change the package replaces/obsoletes fields accordingly.

Change-Id: Ia1a4f128a1f6928faf57faee5f301a77c21acca2


[ROCm/ROCR-Runtime commit: 2970545ded]
2024-11-08 13:51:10 -05:00
Konstantin Zhuravlyov 45c824a387 amd_hsa_elf.h: bring EF_AMDGPU_MACH_* in sync with llvm-project
- formatting
- add EF_AMDGPU_MACH_AMDGCN_RESERVED_0X56
- add EF_AMDGPU_MACH_AMDGCN_RESERVED_0X57
- add EF_AMDGPU_MACH_AMDGCN_GFX1153
- add EF_AMDGPU_MACH_AMDGCN_GFX12_GENERIC

Change-Id: Ibad464c659137c0c98fa9fa9d1f293ea62684ee6


[ROCm/ROCR-Runtime commit: d9404a52ed]
2024-11-07 18:03:27 -05:00
Chris Freehill 9c7e73ff98 rocr: Dynamically allocate static global memory
To allow non-POD global variables to last until the last thread
has exited, use "new" to allocate the memory instead of static
allocation.

Change-Id: Ica571b61ff8068a52e472c49cb1c44917e60c8c8


[ROCm/ROCR-Runtime commit: 0878deda17]
2024-11-07 09:53:31 -05:00
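
The technique named in commit 9c7e73ff98 above, allocating a non-POD global with "new" so it is never destroyed at exit, can be sketched as below. The `IsaMap` type, the `SupportedIsas` accessor, and the map contents are hypothetical stand-ins, not ROCr's real data structures.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a non-POD global such as an ISA table.
using IsaMap = std::map<std::string, int>;

// The map lives on the heap and is intentionally never deleted, so no
// destructor runs during process exit while another thread may still be
// reading it. The operating system reclaims the memory at termination.
IsaMap& SupportedIsas() {
  static IsaMap* const isas = new IsaMap{{"gfx900", 1}, {"gfx1100", 2}};
  return *isas;
}
```

The trade-off is a deliberate one-time leak in exchange for an object whose lifetime outlasts every thread that might touch it.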
Jaydeep Patel 6dc8a4ae55 rocr: Decrement counter only if event is popped
Also restore dead signals cleanup for old path when HSA_WAIT_ANY_DEBUG
is used.

Change-Id: I51a7404991443c9f6cbf57b4b9e9faa694b9538c


[ROCm/ROCR-Runtime commit: 700f1d9abd]
2024-11-07 01:03:09 -05:00
AravindanC 697d500cb7 Update static package dependency of rocrtst
Change-Id: Ic12a6f2ec3bd03d871815810cc79488e7d5c57ab


[ROCm/ROCR-Runtime commit: 1a0de862aa]
2024-11-06 07:06:37 -08:00
Yiannis Papadopoulos 83513d4daf rocr: Adding pointer to the owner driver in Agent class
Change-Id: If913d7c7e4caf6d6e6eee3a858a27c6027c2923f


[ROCm/ROCR-Runtime commit: 2837825b14]
2024-10-31 12:29:10 -04:00
Chris Freehill 4005fd9b9d rocr: Fix supported_isas transient memory issue
An ASAN run of the release build revealed some elements of
the supported_isas static map were still using stack data. This
change makes it use heap data so it will persist.

Change-Id: Ie51887e88b9e2dec27acfc97ea45a6219fea971c


[ROCm/ROCR-Runtime commit: c7521a5f2a]
2024-10-31 11:59:29 -04:00
Chris Freehill dd33820b23 rocr: Fix several rocrtst memory errors
Change-Id: I9049a3905fb26cf9b8ad0839684a70771a49f616


[ROCm/ROCR-Runtime commit: 4256630fd0]
2024-10-30 20:36:25 -04:00
Jonathan Kim ce09a178d3 rocr: revert back to old copy behaviour with no xgmi sdma engines
SDMA queue resources are limited when all SDMA copies are bottlenecked
into 2 engines. Callers will not be able to make the best decisions
to allocate queue resources fairly, so have ROCr fall back to the old
round-robin behaviour dictated by KFD.

Change-Id: I93d52297976d74e20129c5eb1dcfbfa5aa5067a7


[ROCm/ROCR-Runtime commit: 7f8676e177]
2024-10-29 16:01:01 -04:00
Chris Freehill 8fe7c40390 rocr: Generic ISA targets support
Change-Id: I6a0341ec9c1ec1e710143676b80a8a3c1a78f725


[ROCm/ROCR-Runtime commit: 0c18ff22e1]
2024-10-28 08:54:06 -05:00
Chris Freehill dd037425ed rocr: Quiet some ROCr compile warnings
These are mostly AIE related, but there are a couple of others.

Change-Id: I549e004772160ca282d4c94dc9d94dd2ccae8b1c


[ROCm/ROCR-Runtime commit: 08699069d6]
2024-10-28 09:08:14 -04:00
German Andryeyev 6617af10e6 rocr: Disable WaitAny() in AsyncEventsLoop()
- Add a new path to avoid WaitAny() calls in AsyncEventsLoop() with the
HSA_WAIT_ANY_DEBUG key. The new path is selected by default.
The optimization combines all the logic of WaitAny() in a single processing loop
and avoids extra memory allocations or ref counting. It also won't spin
on the CPU if all events are busy.

Change-Id: I197ce60d0d023fbb672f700d6e87702686f1f55a


[ROCm/ROCR-Runtime commit: 0fc7369ba5]
2024-10-25 14:37:02 -04:00
David Yat Sin 7d84abbc3b rocr: find first dispatch pkt that needs scratch
On GPUs where EOP is handled in asic, the read_dispatch_id is not always
updated after each packet. Look for the first dispatch packet that needs
scratch memory before allocating scratch.

Change-Id: Ibf4b4b485f99bf2fabfe48e9609ca99111fdafbe


[ROCm/ROCR-Runtime commit: d90fbee9c4]
2024-10-25 14:36:40 -04:00
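
The scan described in commit 7d84abbc3b above can be sketched as a forward search through the queue. The simplified `DispatchPacket` struct and the function `FirstPacketNeedingScratch` are assumptions for illustration; only the idea of looking past a possibly stale read index comes from the commit message.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified packet: a real AQL dispatch packet has many more
// fields. A private segment size of 0 means the dispatch needs no scratch.
struct DispatchPacket {
  uint32_t private_segment_size;
};

// Scans from the last known read index up to the write index and returns the
// index of the first packet that needs scratch memory. Returns write_index
// when no pending packet needs scratch.
uint64_t FirstPacketNeedingScratch(const std::vector<DispatchPacket>& ring,
                                   uint64_t read_index,
                                   uint64_t write_index) {
  if (ring.empty()) {
    return write_index;
  }
  for (uint64_t i = read_index; i < write_index; ++i) {
    const DispatchPacket& pkt = ring[i % ring.size()];  // ring buffer wrap
    if (pkt.private_segment_size != 0) {
      return i;
    }
  }
  return write_index;
}
```

Scanning forward matters because the packet at the reported read index may already have completed without the index having been updated.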
Philip Yang b0558f264c kfdtest: Update KFDSVMEvictTest.QueueTest for CPX mode
The current test has 4 processes, and each process allocates and accesses 512
buffers, so it requires 2048 waves to access 2048 buffers at the same time to
finish the test. In CPX compute partition mode, each compute node has
fewer waves, which causes random test failures. Change the test to 2 processes
that use 1024 waves to access 1024 buffers with an increased buffer size.

Add a waves_num check to avoid test failures on new ASICs or simulators:
skip the test if fewer than 1024 waves are available.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I64b5f9172b62cf38f62fbb0b48a801b8a11401c0


[ROCm/ROCR-Runtime commit: e6d4a32c42]
2024-10-24 12:57:30 -04:00
Yiannis Papadopoulos 6525bb1a5d rocr/aie: Remove unused set container and error when using AIE agents in MemoryRegion
Change-Id: Icf1e56412c840810a679f376293a616068841b8c


[ROCm/ROCR-Runtime commit: c7785a6da1]
2024-10-23 09:42:32 -04:00
Chris Freehill 234de802e6 rocr: supported_isas map elements should persist
The supported_isas static unordered_map was adding stack
allocated Isa objects. Instead, make the objects statically
allocated, as supported_isas itself is.

Change-Id: I23405e218290d48deea6f984f76c57e7b43e314e


[ROCm/ROCR-Runtime commit: fd99b74287]
2024-10-22 18:09:03 -05:00
David Yat Sin 2e6a37f111 kfdtest: Inherit CXX flags
Change-Id: I2e902ec3e6fd582c53a6d95cd49fe2b18f56b8ca


[ROCm/ROCR-Runtime commit: e1865f7b16]
2024-10-17 14:17:08 -04:00
Chris Freehill b617b05c2a rocr: Ensure globals are initialized at first use
When ROCr is built as a static library, global variables
were often not initialized to valid values at their first
use. This change addresses that problem.

Change-Id: I550fa41feb3bc04b9cc686bcfb4acf2a7b651a88


[ROCm/ROCR-Runtime commit: 9b13bcd0ac]
2024-10-16 23:19:48 -04:00
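
The commit message for b617b05c2a above does not say which mechanism was used. A common way to guarantee that a global is valid at its first use is the construct-on-first-use idiom sketched below. The names `LogLines`, `Registrar`, `g_first`, and `g_second` are hypothetical.

```cpp
#include <string>
#include <vector>

// The vector is a function-local static, so it is constructed the first time
// LogLines() is called, regardless of when other globals are initialized.
std::vector<std::string>& LogLines() {
  static std::vector<std::string> lines;
  return lines;
}

// Hypothetical global object whose constructor runs during static
// initialization and already depends on the shared vector.
struct Registrar {
  explicit Registrar(const char* name) { LogLines().push_back(name); }
};

// Safe even though these run before main(): LogLines() builds the vector on
// demand instead of relying on initialization order between globals.
Registrar g_first("first");
Registrar g_second("second");
```

With a plain global vector defined in a different translation unit, the two `Registrar` constructors could run before the vector was constructed, which is the kind of failure the commit message describes for static builds.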
David Yat Sin 5a8092bccf Revert "hsakmt: Only set exec flag when requested"
This reverts commit cfb1ab45ac.

Reason for revert: 
This is currently breaking some tools. Will put it back as soon as tools update their code.

Change-Id: I05c82d443f3a274a618d05e6dc5a87943f5dc7a4


[ROCm/ROCR-Runtime commit: 80da7d5ee4]
2024-10-16 20:31:27 -04:00
David Yat Sin 35187a00df rocr: Add executable flag for memory allocations
Change-Id: I8307cd3562c3ab9c12fef8c457a59916e33b7923


[ROCm/ROCR-Runtime commit: d58c9dea0a]
2024-10-15 16:52:00 +00:00