Commit graph

2649 commits

Author SHA1 Message Date
Konstantin Zhuravlyov 048a6dc0bd loader/nfc: reorder cases when switching on targets, specific first, generic second
Change-Id: I47f38c1691b9b6ff589f7ff445143997b0801dc6


[ROCm/ROCR-Runtime commit: 7d9a51e22a]
2024-11-11 16:27:34 -05:00
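The ordering named in this commit title can be illustrated with a small sketch. The `Target` enum and `DescribeTarget` function below are hypothetical and are not taken from the loader. In a `switch` without fallthrough the order of cases does not change behaviour, so the benefit is readability: each generic target sits directly after the specific targets of its family.

```cpp
#include <cassert>
#include <string>

// Hypothetical target list for illustration; not the loader's real enum.
enum class Target { Gfx900, Gfx906, Gfx9Generic, Gfx1100, Gfx11Generic };

// Within each family, specific targets come first and the generic target
// comes second. The order has no effect on which case is selected.
std::string DescribeTarget(Target t) {
  switch (t) {
    // gfx9 family: specific first
    case Target::Gfx900: return "gfx900";
    case Target::Gfx906: return "gfx906";
    // gfx9 family: generic second
    case Target::Gfx9Generic: return "gfx9-generic";
    // gfx11 family: specific first
    case Target::Gfx1100: return "gfx1100";
    // gfx11 family: generic second
    case Target::Gfx11Generic: return "gfx11-generic";
  }
  return "unknown";
}
```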
Konstantin Zhuravlyov 68f7fb4fa7 loader: add missing support for gfx700
Change-Id: Ia08e93b0e2d300a183a7a5fb92604cd801b2d52a


[ROCm/ROCR-Runtime commit: 4344f012b6]
2024-11-11 16:27:27 -05:00
Ranjith Ramakrishnan 0954cb2724 Correct the provides field of hsa-rocr and hsa-rocr-devel packages
Both the runtime and devel packages are providing the hsakmt packages. Only the devel package needs to provide them.
Change the package replaces/obsoletes fields accordingly.

Change-Id: Ia1a4f128a1f6928faf57faee5f301a77c21acca2


[ROCm/ROCR-Runtime commit: 2970545ded]
2024-11-08 13:51:10 -05:00
Konstantin Zhuravlyov 45c824a387 amd_hsa_elf.h: bring EF_AMDGPU_MACH_* in sync with llvm-project
- formatting
  - add EF_AMDGPU_MACH_AMDGCN_RESERVED_0X56
  - add EF_AMDGPU_MACH_AMDGCN_RESERVED_0X57
  - add EF_AMDGPU_MACH_AMDGCN_GFX1153
  - add EF_AMDGPU_MACH_AMDGCN_GFX12_GENERIC

Change-Id: Ibad464c659137c0c98fa9fa9d1f293ea62684ee6


[ROCm/ROCR-Runtime commit: d9404a52ed]
2024-11-07 18:03:27 -05:00
Chris Freehill 9c7e73ff98 rocr: Dynamically allocate static global memory
To allow non-POD global variables to last until the last thread
has exited, use "new" to allocate the memory instead of static
allocation.

Change-Id: Ica571b61ff8068a52e472c49cb1c44917e60c8c8


[ROCm/ROCR-Runtime commit: 0878deda17]
2024-11-07 09:53:31 -05:00
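A minimal sketch of the technique this commit message describes, assuming a hypothetical `Registry` type and `GlobalRegistry` accessor (neither is a ROCr name). The object is allocated with `new` and never deleted, so no destructor runs for it during static destruction and threads that are still running at process exit can keep using it. The actual ROCr change may be structured differently.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical non-POD global standing in for a runtime-wide table.
using Registry = std::map<std::string, int>;

// The object is created with `new` and intentionally never deleted, so its
// destructor is not run at exit while other threads may still be using it.
Registry& GlobalRegistry() {
  static Registry* const instance = new Registry();
  return *instance;
}
```

The trade-off is a deliberate, bounded leak: the memory is reclaimed by the operating system when the process ends.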
Jaydeep Patel 6dc8a4ae55 rocr: Decrement counter only if event is popped
Also restore dead signals cleanup for old path when HSA_WAIT_ANY_DEBUG
is used.

Change-Id: I51a7404991443c9f6cbf57b4b9e9faa694b9538c


[ROCm/ROCR-Runtime commit: 700f1d9abd]
2024-11-07 01:03:09 -05:00
AravindanC 697d500cb7 Update static package dependency of rocrtst
Change-Id: Ic12a6f2ec3bd03d871815810cc79488e7d5c57ab


[ROCm/ROCR-Runtime commit: 1a0de862aa]
2024-11-06 07:06:37 -08:00
Yiannis Papadopoulos 83513d4daf rocr: Adding pointer to the owner driver in Agent class
Change-Id: If913d7c7e4caf6d6e6eee3a858a27c6027c2923f


[ROCm/ROCR-Runtime commit: 2837825b14]
2024-10-31 12:29:10 -04:00
Chris Freehill 4005fd9b9d rocr: Fix supported_isas transient memory issue
An ASAN run of the release build revealed some elements of
the supported_isas static map were still using stack data. This
change makes it use heap data so it will persist.

Change-Id: Ie51887e88b9e2dec27acfc97ea45a6219fea971c


[ROCm/ROCR-Runtime commit: c7521a5f2a]
2024-10-31 11:59:29 -04:00
Chris Freehill dd33820b23 rocr: Fix several rocrtst memory errors
Change-Id: I9049a3905fb26cf9b8ad0839684a70771a49f616


[ROCm/ROCR-Runtime commit: 4256630fd0]
2024-10-30 20:36:25 -04:00
Jonathan Kim ce09a178d3 rocr: revert back to old copy behaviour with no xgmi sdma engines
SDMA queue resources are limited when all SDMA copies are bottlenecked
into 2 engines.  Callers will not be able to make the best decisions
to allocate queue resources fairly, so have ROCr fall back to the old
round-robin behaviour dictated by KFD.

Change-Id: I93d52297976d74e20129c5eb1dcfbfa5aa5067a7


[ROCm/ROCR-Runtime commit: 7f8676e177]
2024-10-29 16:01:01 -04:00
Chris Freehill 8fe7c40390 rocr: Generic ISA targets support
Change-Id: I6a0341ec9c1ec1e710143676b80a8a3c1a78f725


[ROCm/ROCR-Runtime commit: 0c18ff22e1]
2024-10-28 08:54:06 -05:00
Chris Freehill dd037425ed rocr: Quiet some ROCr compile warnings
These are mostly AIE related, but there are a couple of others.

Change-Id: I549e004772160ca282d4c94dc9d94dd2ccae8b1c


[ROCm/ROCR-Runtime commit: 08699069d6]
2024-10-28 09:08:14 -04:00
German Andryeyev 6617af10e6 rocr: Disable WaitAny() in AsyncEventsLoop()
- Add the new path to avoid WaitAny() calls in AsyncEventsLoop() with
HSA_WAIT_ANY_DEBUG key. The new path is selected by default.
The optimization combines all logic of WaitAny() in a single processing loop
and avoids extra memory allocations or ref counting.  Also, it won't spin
on the CPU if all events are busy.

Change-Id: I197ce60d0d023fbb672f700d6e87702686f1f55a


[ROCm/ROCR-Runtime commit: 0fc7369ba5]
2024-10-25 14:37:02 -04:00
David Yat Sin 7d84abbc3b rocr: find first dispatch pkt that needs scratch
On GPUs where EOP is handled in the ASIC, the read_dispatch_id is not always
updated after each packet. Look for the first dispatch packet that needs
scratch memory before allocating scratch.

Change-Id: Ibf4b4b485f99bf2fabfe48e9609ca99111fdafbe


[ROCm/ROCR-Runtime commit: d90fbee9c4]
2024-10-25 14:36:40 -04:00
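The scan described in this commit message could look roughly like the sketch below. The `Packet` struct, the `kNoPacket` sentinel and the `FirstScratchDispatch` function are hypothetical simplifications, not ROCr code; real AQL packets and queue indices carry much more state.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified packet for illustration only.
struct Packet {
  bool is_dispatch;
  uint32_t scratch_bytes;  // 0 means the packet needs no scratch memory
};

// Sentinel returned when no packet in the range needs scratch.
constexpr uint64_t kNoPacket = UINT64_MAX;

// Walks the ring from read_id up to (not including) write_id and returns the
// id of the first dispatch packet that needs scratch. Because the read index
// may lag behind, the packet at read_id is not assumed to be the one that
// needs scratch.
uint64_t FirstScratchDispatch(const std::vector<Packet>& ring,
                              uint64_t read_id, uint64_t write_id) {
  assert(!ring.empty());
  for (uint64_t id = read_id; id < write_id; ++id) {
    const Packet& p = ring[id % ring.size()];
    if (p.is_dispatch && p.scratch_bytes > 0) return id;
  }
  return kNoPacket;
}
```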
Philip Yang b0558f264c kfdtest: Update KFDSVMEvictTest.QueueTest for CPX mode
The current test has 4 processes, and each process allocates and accesses
512 buffers, so the test needs 2048 waves accessing 2048 buffers at the same
time to finish. In CPX compute partition mode, each compute node has
fewer waves, which causes random test failures. Change the test to 2
processes that use 1024 waves to access 1024 buffers with the increased
buffer size.

Add a waves_num check to avoid test failures on new ASICs or simulators:
skip the test if fewer than 1024 waves are available.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Change-Id: I64b5f9172b62cf38f62fbb0b48a801b8a11401c0


[ROCm/ROCR-Runtime commit: e6d4a32c42]
2024-10-24 12:57:30 -04:00
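The wave arithmetic in this commit message can be checked with a short sketch. `RequiredWaves` and `ShouldSkip` are hypothetical helpers, not kfdtest functions, and one wave per buffer is assumed, as the message implies.

```cpp
#include <cassert>

// One wave per buffer is assumed, following the commit message.
constexpr unsigned RequiredWaves(unsigned processes, unsigned buffers_per_process) {
  return processes * buffers_per_process;
}

// Hypothetical skip condition: the device offers fewer waves than needed.
constexpr bool ShouldSkip(unsigned available_waves, unsigned required_waves) {
  return available_waves < required_waves;
}

static_assert(RequiredWaves(4, 512) == 2048, "old configuration: 4 processes");
static_assert(RequiredWaves(2, 512) == 1024, "new configuration: 2 processes");
```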
Yiannis Papadopoulos 6525bb1a5d rocr/aie: Remove unused set container and error when using AIE agents in MemoryRegion
Change-Id: Icf1e56412c840810a679f376293a616068841b8c


[ROCm/ROCR-Runtime commit: c7785a6da1]
2024-10-23 09:42:32 -04:00
Chris Freehill 234de802e6 rocr: supported_isas map elements should persist
The supported_isas static unordered_map was adding stack
allocated Isa objects. Instead, make the objects statically
allocated, as supported_isas itself is.

Change-Id: I23405e218290d48deea6f984f76c57e7b43e314e


[ROCm/ROCR-Runtime commit: fd99b74287]
2024-10-22 18:09:03 -05:00
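A sketch of the lifetime problem this commit message describes, under the assumption that the map refers to its elements by pointer. The `Isa` struct and `SupportedIsas` function below are simplified stand-ins, not the ROCr definitions. Objects with static storage duration live as long as the map itself, so the stored pointers stay valid.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical, simplified stand-in for an ISA description.
struct Isa {
  std::string name;
  int major;
};

// The map stores pointers, so the pointed-to objects must outlive every
// lookup. Pointing at locals with automatic storage would leave dangling
// pointers once the function returns; static storage avoids that.
const std::unordered_map<std::string, const Isa*>& SupportedIsas() {
  static const Isa gfx900{"gfx900", 9};
  static const Isa gfx1100{"gfx1100", 11};
  static const std::unordered_map<std::string, const Isa*> isas = {
      {gfx900.name, &gfx900},
      {gfx1100.name, &gfx1100},
  };
  return isas;
}
```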
David Yat Sin 2e6a37f111 kfdtest: Inherit CXX flags
Change-Id: I2e902ec3e6fd582c53a6d95cd49fe2b18f56b8ca


[ROCm/ROCR-Runtime commit: e1865f7b16]
2024-10-17 14:17:08 -04:00
Chris Freehill b617b05c2a rocr: Ensure globals are initialized at first use
When ROCr is built as a static library, global variables
were often not initialized to valid values at their first
use. This change addresses that problem.

Change-Id: I550fa41feb3bc04b9cc686bcfb4acf2a7b651a88


[ROCm/ROCR-Runtime commit: 9b13bcd0ac]
2024-10-16 23:19:48 -04:00
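One common way to guarantee initialization at first use is a function-local static, sketched below. This is an illustration of the general idiom, not necessarily what the commit does; `Config`, `GetConfig` and `g_flag_count` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical configuration object; not a ROCr type.
struct Config {
  std::vector<std::string> flags{"default"};
};

// A function-local static is initialized the first time control reaches its
// declaration (thread-safely since C++11), so callers never see it
// uninitialized, whatever order translation units are initialized in.
Config& GetConfig() {
  static Config config;
  return config;
}

// Another global whose initializer depends on the configuration. This is safe
// because GetConfig() constructs the object on demand.
const std::size_t g_flag_count = GetConfig().flags.size();
```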
David Yat Sin 5a8092bccf Revert "hsakmt: Only set exec flag when requested"
This reverts commit cfb1ab45ac.

Reason for revert: 
This is currently breaking some tools. Will put it back as soon as tools update their code.

Change-Id: I05c82d443f3a274a618d05e6dc5a87943f5dc7a4


[ROCm/ROCR-Runtime commit: 80da7d5ee4]
2024-10-16 20:31:27 -04:00
David Yat Sin 35187a00df rocr: Add executable flag for memory allocations
Change-Id: I8307cd3562c3ab9c12fef8c457a59916e33b7923


[ROCm/ROCR-Runtime commit: d58c9dea0a]
2024-10-15 16:52:00 +00:00
David Yat Sin d0c5158374 rocrtst: Fix VirtMemory_Basic_Test permissions
Adjust the VirtMemory_Basic_Test permissions for the earlier change in
hsa_amd_vmem_set_access behavior that was made with this patch:

rocr/vmm: Only modify permissions for specified agents

Change-Id: I97230600b9b9144459b08ca3da3a5bfbdbb98231


[ROCm/ROCR-Runtime commit: ead3aafcda]
2024-10-11 10:41:11 -04:00
jokim 1e7d5c5833 rocr: Workaround segfault on GFX9 devices older than GFX90a
Devices older than GFX90a hit a segfault on queue unmap when an
SDMA queue has been assigned a fixed engine.  Bypass fixing the
engine for these devices for now.

Change-Id: I7d2f882d2377f004a7bb65f3b397396db07ce6d3


[ROCm/ROCR-Runtime commit: 1d6ff45673]
2024-10-10 14:41:10 -04:00
Kent Russell 9f4a01e8a9 kfdtest: Fix in-tree scriptless build
If you build thunk following the instructions in the thunk's README,
there is no /lib folder in the build folder. Adjust the include path,
and clean up the docs to reflect that. The header include is already
defined in the CMake file as ../../include, so we don't use
LIBHSAKMT_PATH for that linking, just the lib location

Change-Id: I73435d59adb9d01f527a28b1935086260e9d3d70
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: ccd80d19ba]
2024-10-08 14:42:33 -04:00
Shweta Khatri 6506a664e7 hsakmt: pmc_table.c: Fix Coverity reported warnings
Eliminate out-of-bounds access in get_block_properties

Change-Id: I3abee1e36fafdda053d4bc4a611698d676b01d5c


[ROCm/ROCR-Runtime commit: 8bc4efc8ca]
2024-10-07 14:15:26 -04:00
Shweta Khatri f2bd877365 hsakmt: debug.c: Fix Coverity reported warnings
Fix potential memory leak reported by Coverity

Change-Id: Iacbaa99be3f4fe7fae5fb6a10bd41dfc34b96059


[ROCm/ROCR-Runtime commit: 52e7fd1480]
2024-10-07 14:14:26 -04:00
Shweta Khatri 42043a59ea hsakmt: fmm.c: Fix Coverity reported warnings
Fixed multiple issues related to memory management, atomicity,
and error handling across various functions: handle null checks,
use-after-free, unchecked returns, and memory leaks.

Change-Id: Ia7c76320cc20e24001052fbba2dd0600bd412140


[ROCm/ROCR-Runtime commit: c9454794b6]
2024-10-07 13:54:03 -04:00
David Yat Sin 2445514435 rocr: Fix memory leak on non-visible GPUs
Fix memory leak for memory regions objects when GPU is masked using
ROCR_VISIBLE_DEVICES.

Change-Id: I610842a18adbc3cdc854b12650844e271bc00592


[ROCm/ROCR-Runtime commit: dbae8da515]
2024-10-04 17:40:47 -04:00
Jonathan Kim 2e7b02add9 rocr: Use new extended graphics handle registration call on IPC import
To correctly map to all GPUs after an import, use the new extended
registration call that can import a virtual address without having to
specify a target node.

Change-Id: Ifca8f6f6ee24fa99b2af357dcc3ea1de3ab234f7


[ROCm/ROCR-Runtime commit: 0ae064fe2d]
2024-10-03 14:06:37 -04:00
Jonathan Kim 43ebf5b524 hsakmt: Enable graphics handle registration with a virtual address
Currently registering graphics memory without specifying a target
node will return a memory handle that's not a virtual address.

As a result, ROCr is forced to register with a target node for
IPC usage.

Mapping memory afterwards without specifying a target node will
result in mapping only to the node that was imported, because the
previous import call flags that node as the target for future mappings.

For ROCr IPC usage, ROCr wants to map to all GPU nodes if the target node
is not specified.

Allow the caller to register graphics handles and get back a virtual
address without having to specify the target node, so that the caller
can make a subsequent map call to all GPUs.

Change-Id: I5a935092b885cc3568e4f3a5dd951c7ec6c84fca


[ROCm/ROCR-Runtime commit: 03463ed2c0]
2024-10-03 14:06:31 -04:00
Ranjith Ramakrishnan 7b21e011d2 cmake component grouping should not be ignored
In a static build, the dev and binary components are grouped to generate the static package.
Remove the line that was ignoring the component grouping.

Change-Id: Ie0ca9db109f2002891260985634f2e6b1ea7f236


[ROCm/ROCR-Runtime commit: f27ae44b8c]
2024-10-01 14:22:21 -07:00
Shweta Khatri b4ebbe7448 hsakmt: spm.c: Fix Coverity reported warnings
Fix unused ret value and initialize gpu_id

Change-Id: Ib3acc7db4bbab519318d0970786a5dc641dcc9eb


[ROCm/ROCR-Runtime commit: 9f43c9fd51]
2024-09-30 19:46:51 -04:00
David Yat Sin 298ec3d840 rocr/vmm: Only modify permissions for specified agents
When hsa_amd_vmem_set_access is called, do not remove permissions for
unspecified agents. Also updating documentation in header to clarify
this.

Change-Id: I3bb4cf08ba399f85cc67b17fd13a4a40d862415f


[ROCm/ROCR-Runtime commit: 73f6bfa747]
2024-09-30 17:41:58 -04:00
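The behaviour described in this commit message can be sketched with a simple permission table. The `Access` enum, `AgentId` alias, `AccessDesc` struct and `SetAccess` function are hypothetical and do not reproduce the hsa_amd_vmem_set_access signature; the point is only that agents absent from the request keep their existing permissions.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical types for illustration; not the HSA API types.
enum class Access { None, ReadOnly, ReadWrite };
using AgentId = int;

struct AccessDesc {
  AgentId agent;
  Access access;
};

// Applies only the listed descriptors. Agents that do not appear in `descs`
// keep whatever permissions they already had.
void SetAccess(std::map<AgentId, Access>& table,
               const std::vector<AccessDesc>& descs) {
  for (const AccessDesc& d : descs) {
    table[d.agent] = d.access;
  }
}
```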
Mukul Joshi e1ffd97abd kfdtest: Update KFDPerformanceTest.P2PBandWidthTest for CPX mode
Currently, KFDPerformanceTest.P2PBandWidthTest cannot work if there are
more than 16 KFD nodes in the system. This limit was put in to match the
number of SDMA queues supported on a single node.
This patch updates the test to make it run on systems with more than
16 KFD nodes.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I561d0cdef664cae84fb9c13a801052e2001256e5


[ROCm/ROCR-Runtime commit: b81e45f03c]
2024-09-30 11:28:33 -04:00
Jonathan Kim 86d28294b7 rocr: Fix race condition in IPC DMABUF socket server
Socket server accept calls do not guarantee synchronous actions
post-accept. This can result in a race condition.

To resolve this, first limit the socket server's listen backlog to a
single connection. This will force competing clients to busy-retry
until timeout.

Second, make the DMABUF IPC file descriptor send-receive and import
calls into an atomic routine per connection.

With these fixes, not only do we resolve potential races, but
we also guarantee that any exporter process will create at most one
file descriptor that will only last for the duration of the import
transaction.  This alleviates any concern about running into system
limits on the number of open file descriptors per process.

Change-Id: I6d8b14795a680d89a2707e082fa027d525792e05


[ROCm/ROCR-Runtime commit: 909b82d463]
2024-09-27 14:40:59 -04:00
Jonathan Kim ff4690de61 rocr: Fix IPC DMA Buf fragment handling and enable for development
Discarding blocks for reallocation on IPC export, done for better memory
performance, triggers memory violations with DMA Buf exports, so bypass
this for now, as application performance drops haven't been observed
with the bypass.

The raw fragment should be passed to the DMA Buf export call as well
since offsets will be implicitly applied in the Thunk/KFD for
export/import calls.

Also, use the agent information directly from the pointer
information so that the export call doesn't have to scan memory to find
this.  Pass the node ID in the handle so that the import call doesn't
have to make two thunk imports to fetch the node ID for GPU memory
imports.

Finally, allow the user to use DMA Buf IPC via
HSA_ENABLE_IPC_MODE_LEGACY=0 for developer testing as legacy mode will
be applied by default.

Change-Id: Ie8fe267f8768fa5df37126078406f7065f69ff4e


[ROCm/ROCR-Runtime commit: 32bb0764b7]
2024-09-27 14:40:42 -04:00
Kent Russell 17d23cbd78 rocrtst: Various codeql fixes
Fix some potentially unreleased memory, null value checks, files not
closed, and other such issues reported by codeql

Change-Id: Ia679aff97a773a642d8c8cbadeae30955554a62e
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: d64e33520f]
2024-09-27 09:56:18 -04:00
Samuel Zhang 0c81e6a391 SWDEV-484614 - KFDSVMRangeTest.HMMProfilingEvent/1 random fail in VM
In a VM with 6 vCPUs, work queued with
queue_delayed_work(system_freezable_wq) is scheduled more slowly than on
bare metal (BM). The HSA_SMI_EVENT_QUEUE_RESTORE event from case
HMMProfilingEvent/0 was delivered late and caused HMMProfilingEvent/1 to fail.

The fix is to listen only for the HSA_SMI_EVENT_MIGRATE_START event and
ignore all other events.

Change-Id: I534e49b030bd4c534bc7a63eb431f4907659c8cd


[ROCm/ROCR-Runtime commit: 5a1b6bf14d]
2024-09-26 13:37:08 +08:00
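One way to listen for a single event type is a subscription bitmask, sketched below. The `Event` indices and the `MaskFor`, `kSubscribed` and `IsDelivered` names are hypothetical; the real event values come from the KFD and thunk headers and are not reproduced here, and the actual test may filter events differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical event indices for illustration only.
enum Event : unsigned { kMigrateStart = 1, kMigrateEnd = 2, kQueueRestore = 3 };

// One bit per event index.
constexpr uint64_t MaskFor(Event e) { return 1ULL << (e - 1); }

// Subscribing to a single event means a late event of another kind, such as
// a delayed queue-restore from a previous test case, is filtered out.
constexpr uint64_t kSubscribed = MaskFor(kMigrateStart);

constexpr bool IsDelivered(Event e) { return (kSubscribed & MaskFor(e)) != 0; }
```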
Jonathan Kim 4cd7f3b675 hsakmt: Update hsa capabilities with precise ALU ops
Update the HSA capabilities field with precise ALU ops bit support
for GPU debugging.

Change-Id: I796f2c2e0559577828aba510c401ed5187e10179


[ROCm/ROCR-Runtime commit: 027af8dacd]
2024-09-24 13:45:04 -04:00
Jonathan Kim a5954a2442 hsakmt: Update thunk doc comments for debug firmware support caps
Update commentary on HWS scheduler support bit for GPU debugging in
the HSA capabilities node properties field.

Change-Id: I59c519d74a528d5ecf5817ef94e75091314bd844


[ROCm/ROCR-Runtime commit: a926a070ee]
2024-09-23 16:34:31 -04:00
Shweta Khatri fb7b04e06e hsakmt: queues.c: Fix Coverity reported warnings
Move variable declarations inline and add NULL checks to prevent errors

Change-Id: Ia5bf5e245bcc0f756a15bc799b55c5e2a8459f89


[ROCm/ROCR-Runtime commit: 681610937a]
2024-09-23 15:07:28 -04:00
Longlong Yao 745d799f9b rocrtst: fix resource leak
Change-Id: Ib57dccad0b639539e1076daba31eef278f2cf638
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>


[ROCm/ROCR-Runtime commit: 3b829d0e62]
2024-09-23 15:04:43 -04:00
Young Hui ea8cf386f4 docs: move .readthedocs.yaml to the root of repo
Change-Id: I6c5afb806c47c029359a2dee2a7e73c6d076cfb1


[ROCm/ROCR-Runtime commit: b530e0f619]
2024-09-23 15:49:19 +00:00
Young Hui e2b14cc3ec docs: path adjustments to allow documentation to build again
- adding doc files to .gitignore

Change-Id: Ia6b2358bb1f236298ad1d705c1bed0636026632d


[ROCm/ROCR-Runtime commit: 75b674f0ad]
2024-09-23 15:49:13 +00:00
Shweta Khatri 5dff20b660 hsakmt: events.c: Fix Coverity reported warnings
Fix data race by protecting events_page access with mutex in event create
Fix potential NULL dereference in hsaKmtWaitOnMultipleEvents_Ext
Fix unchecked return value in hsaKmtCreateEvent function

Change-Id: I434bef43666e5205a8b061259569c1d99a952752


[ROCm/ROCR-Runtime commit: 857200e28c]
2024-09-23 11:35:02 -04:00
Shweta Khatri b4db09e269 hsakmt: hsakmttypes.h: Fix Coverity reported warnings
Eliminated declared but not referenced variables to fix warnings

Change-Id: I80032a699fb59ce4635c5001f669d009ba60e588


[ROCm/ROCR-Runtime commit: 303c02690d]
2024-09-23 11:34:44 -04:00
Shweta Khatri fa0978e19a hsakmt: topology.c: Fix Coverity reported warnings
Refactor fscanf_str to use fgets for safer string handling, remove unused code

Change-Id: Ibf4b4b485f99bf2fabfe48e9609ca99111feaf1e


[ROCm/ROCR-Runtime commit: 659fa04d8c]
2024-09-23 11:34:28 -04:00
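A minimal sketch of bounded line reading with `fgets`, the approach named in this commit message. `ReadLine` is a hypothetical helper and is not the thunk's `fscanf_str`. Unlike an unbounded `fscanf("%s")`, `fgets` writes at most `size - 1` characters plus the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Reads one line into buf (at most size - 1 characters) and strips the
// trailing newline if present. Returns false on EOF, error or bad arguments.
bool ReadLine(std::FILE* f, char* buf, std::size_t size) {
  if (f == nullptr || buf == nullptr || size == 0) return false;
  if (std::fgets(buf, static_cast<int>(size), f) == nullptr) return false;
  buf[std::strcspn(buf, "\n")] = '\0';
  return true;
}
```

A line longer than the buffer is truncated rather than overflowing it; the remainder stays in the stream for the next read.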
Yiannis Papadopoulos 14b3b04250 rocr/aie: Correct reporting of dev heap size
Storing the correct dev heap size in the memory region.

Change-Id: I14b053330c187da1d7d0213256625e50795b9902


[ROCm/ROCR-Runtime commit: 48fdc17179]
2024-09-20 12:44:23 -04:00
Kent Russell 4901f1a528 hsakmt: Undo HSAKMT prefix for PAGE_SHIFT
We had skipped doing it for PAGE_SIZE, but it should be left as the
regular PAGE_SHIFT name, especially for users who are using different
headers. We want PAGE_SHIFT and PAGE_SIZE to be consistent with one
another, so set them both explicitly to the same value if either
of them is undefined

Change-Id: I121d81c48409dd77351b59a192d824e2419a2410
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: daad183bf8]
2024-09-20 11:04:34 -04:00
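The consistency rule in this commit message can be sketched with the preprocessor as follows. The 4 KiB page size (a shift of 12) is an assumption made for this sketch, and the block is not copied from the thunk headers.

```cpp
#include <cassert>

// If either macro is missing, define both from a single source of truth so
// they can never disagree. The shift of 12 (4 KiB pages) is an assumption.
#if !defined(PAGE_SHIFT) || !defined(PAGE_SIZE)
#undef PAGE_SHIFT
#undef PAGE_SIZE
#define PAGE_SHIFT 12
#define PAGE_SIZE (1UL << PAGE_SHIFT)
#endif

static_assert(PAGE_SIZE == (1UL << PAGE_SHIFT),
              "PAGE_SIZE and PAGE_SHIFT must agree");
```

If another header has already defined both macros, this block leaves them untouched, which is why the names keep their conventional, unprefixed form.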