Commit Graph

1069 Commits

Author SHA1 Message Date
Konstantin Zhuravlyov 048a6dc0bd loader/nfc: reorder cases when switching on targets, specific first, generic second
Change-Id: I47f38c1691b9b6ff589f7ff445143997b0801dc6


[ROCm/ROCR-Runtime commit: 7d9a51e22a]
2024-11-11 16:27:34 -05:00
Konstantin Zhuravlyov 68f7fb4fa7 loader: add missing support for gfx700
Change-Id: Ia08e93b0e2d300a183a7a5fb92604cd801b2d52a


[ROCm/ROCR-Runtime commit: 4344f012b6]
2024-11-11 16:27:27 -05:00
Konstantin Zhuravlyov 45c824a387 amd_hsa_elf.h: bring EF_AMDGPU_MACH_* in sync with llvm-project
- formatting
  - add EF_AMDGPU_MACH_AMDGCN_RESERVED_0X56
  - add EF_AMDGPU_MACH_AMDGCN_RESERVED_0X57
  - add EF_AMDGPU_MACH_AMDGCN_GFX1153
  - add EF_AMDGPU_MACH_AMDGCN_GFX12_GENERIC

Change-Id: Ibad464c659137c0c98fa9fa9d1f293ea62684ee6


[ROCm/ROCR-Runtime commit: d9404a52ed]
2024-11-07 18:03:27 -05:00
Chris Freehill 9c7e73ff98 rocr: Dynamically allocate static global memory
To allow non-POD global variables to last until the last thread
has exited, use "new" to allocate the memory instead of static
allocation.

Change-Id: Ica571b61ff8068a52e472c49cb1c44917e60c8c8


[ROCm/ROCR-Runtime commit: 0878deda17]
2024-11-07 09:53:31 -05:00
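
The idea in the commit above (allocate a non-POD global with "new" so it is never destroyed) can be sketched as follows. This is a minimal illustration, not the actual ROCr change: `Registry`, `GlobalRegistry` and `CountEntries` are hypothetical names, and the function-local static pointer is one common way to hold the allocation.

```cpp
#include <map>
#include <string>

// Hypothetical non-POD type standing in for a ROCr global.
using Registry = std::map<std::string, int>;

// Returns a process-wide registry that is allocated with `new` and never
// deleted. No destructor is registered for it, so the object stays valid
// for threads that are still running during static destruction.
Registry& GlobalRegistry() {
  static Registry* const instance = new Registry();
  return *instance;
}

// Small helper that reads the global through the accessor.
int CountEntries() { return static_cast<int>(GlobalRegistry().size()); }
```

The trade-off is an intentional one-time leak at process exit in exchange for a lifetime that outlasts every thread.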
Jaydeep Patel 6dc8a4ae55 rocr: Decrement counter only if event is popped
Also restore dead signals cleanup for old path when HSA_WAIT_ANY_DEBUG
is used.

Change-Id: I51a7404991443c9f6cbf57b4b9e9faa694b9538c


[ROCm/ROCR-Runtime commit: 700f1d9abd]
2024-11-07 01:03:09 -05:00
Yiannis Papadopoulos 83513d4daf rocr: Adding pointer to the owner driver in Agent class
Change-Id: If913d7c7e4caf6d6e6eee3a858a27c6027c2923f


[ROCm/ROCR-Runtime commit: 2837825b14]
2024-10-31 12:29:10 -04:00
Chris Freehill 4005fd9b9d rocr: Fix supported_isas transient memory issue
An ASAN run of the release build revealed some elements of
the supported_isas static map were still using stack data. This
change makes it use heap data so it will persist.

Change-Id: Ie51887e88b9e2dec27acfc97ea45a6219fea971c


[ROCm/ROCR-Runtime commit: c7521a5f2a]
2024-10-31 11:59:29 -04:00
Jonathan Kim ce09a178d3 rocr: revert back to old copy behaviour with no xgmi sdma engines
SDMA queue resources are limited when all SDMA copies are bottlenecked
into 2 engines.  Callers will not be able to make the best decisions
to allocate queue resources fairly, so have ROCr fall back to the old
round-robin behaviour dictated by KFD.

Change-Id: I93d52297976d74e20129c5eb1dcfbfa5aa5067a7


[ROCm/ROCR-Runtime commit: 7f8676e177]
2024-10-29 16:01:01 -04:00
Chris Freehill 8fe7c40390 rocr: Generic ISA targets support
Change-Id: I6a0341ec9c1ec1e710143676b80a8a3c1a78f725


[ROCm/ROCR-Runtime commit: 0c18ff22e1]
2024-10-28 08:54:06 -05:00
Chris Freehill dd037425ed rocr: Quiet some ROCr compile warnings
These are mostly AIE related, but there are a couple of others.

Change-Id: I549e004772160ca282d4c94dc9d94dd2ccae8b1c


[ROCm/ROCR-Runtime commit: 08699069d6]
2024-10-28 09:08:14 -04:00
German Andryeyev 6617af10e6 rocr: Disable WaitAny() in AsyncEventsLoop()
- Add a new path that avoids WaitAny() calls in AsyncEventsLoop(),
controlled by the HSA_WAIT_ANY_DEBUG key. The new path is selected by default.
The optimization combines all logic of WaitAny() in a single processing loop
and avoids extra memory allocations or ref counting.  Also, it won't spin
on the CPU if all events are busy.

Change-Id: I197ce60d0d023fbb672f700d6e87702686f1f55a


[ROCm/ROCR-Runtime commit: 0fc7369ba5]
2024-10-25 14:37:02 -04:00
David Yat Sin 7d84abbc3b rocr: find first dispatch pkt that needs scratch
On GPUs where EOP is handled in the ASIC, the read_dispatch_id is not always
updated after each packet. Look for the first dispatch packet that needs
scratch memory before allocating scratch.

Change-Id: Ibf4b4b485f99bf2fabfe48e9609ca99111fdafbe


[ROCm/ROCR-Runtime commit: d90fbee9c4]
2024-10-25 14:36:40 -04:00
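
The scan described in the commit above might look roughly like this. It is a sketch under stated assumptions: `Packet` is a simplified, hypothetical stand-in for an AQL packet, the ring size is assumed to be a power of two, and `FirstPacketNeedingScratch` is an illustrative name rather than a ROCr function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified, hypothetical stand-in for an AQL packet.
struct Packet {
  bool is_kernel_dispatch;        // false for barrier or vendor packets
  uint32_t private_segment_size;  // scratch bytes requested per work-item
};

// Scans the ring from read_index up to (not including) write_index and
// returns the index of the first kernel dispatch that requests scratch.
// Returns write_index when no such packet is pending.
uint64_t FirstPacketNeedingScratch(const std::vector<Packet>& ring,
                                   uint64_t read_index, uint64_t write_index) {
  // Assumption: the ring size is a non-zero power of two.
  assert(!ring.empty() && (ring.size() & (ring.size() - 1)) == 0);
  const uint64_t mask = ring.size() - 1;
  for (uint64_t i = read_index; i < write_index; ++i) {
    const Packet& pkt = ring[i & mask];  // wrap the monotonic index
    if (pkt.is_kernel_dispatch && pkt.private_segment_size > 0) return i;
  }
  return write_index;
}
```

Starting from the read index and walking forward means a stale read index only costs a few extra iterations instead of picking the wrong packet.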
Yiannis Papadopoulos 6525bb1a5d rocr/aie: Remove unused set container and error when using AIE agents in MemoryRegion
Change-Id: Icf1e56412c840810a679f376293a616068841b8c


[ROCm/ROCR-Runtime commit: c7785a6da1]
2024-10-23 09:42:32 -04:00
Chris Freehill 234de802e6 rocr: supported_isas map elements should persist
The supported_isas static unordered_map was adding stack
allocated Isa objects. Instead, make the objects statically
allocated, as supported_isas itself is.

Change-Id: I23405e218290d48deea6f984f76c57e7b43e314e


[ROCm/ROCR-Runtime commit: fd99b74287]
2024-10-22 18:09:03 -05:00
Chris Freehill b617b05c2a rocr: Ensure globals are initialized at first use
When ROCr is built as a static library, global variables
were often not initialized to valid values at their first
use. This change addresses that problem.

Change-Id: I550fa41feb3bc04b9cc686bcfb4acf2a7b651a88


[ROCm/ROCR-Runtime commit: 9b13bcd0ac]
2024-10-16 23:19:48 -04:00
David Yat Sin 35187a00df rocr: Add executable flag for memory allocations
Change-Id: I8307cd3562c3ab9c12fef8c457a59916e33b7923


[ROCm/ROCR-Runtime commit: d58c9dea0a]
2024-10-15 16:52:00 +00:00
jokim 1e7d5c5833 rocr: Workaround segfault on GFX9 devices older than GFX90a
Devices older than GFX90a hit a segfault on queue unmap when an
SDMA queue has been assigned a fixed engine.  Bypass fixing the
engine for these devices for now.

Change-Id: I7d2f882d2377f004a7bb65f3b397396db07ce6d3


[ROCm/ROCR-Runtime commit: 1d6ff45673]
2024-10-10 14:41:10 -04:00
David Yat Sin 2445514435 rocr: Fix memory leak on non-visible GPUs
Fix a memory leak of memory region objects when a GPU is masked using
ROCR_VISIBLE_DEVICES.

Change-Id: I610842a18adbc3cdc854b12650844e271bc00592


[ROCm/ROCR-Runtime commit: dbae8da515]
2024-10-04 17:40:47 -04:00
Jonathan Kim 2e7b02add9 rocr: Use new extended graphics handle registration call on IPC import
To correctly map to all GPUs after an import, use the new extended
registration call that can import a virtual address without having to
specify a target node.

Change-Id: Ifca8f6f6ee24fa99b2af357dcc3ea1de3ab234f7


[ROCm/ROCR-Runtime commit: 0ae064fe2d]
2024-10-03 14:06:37 -04:00
David Yat Sin 298ec3d840 rocr/vmm: Only modify permissions for specified agents
When hsa_amd_vmem_set_access is called, do not remove permissions for
unspecified agents. Also update the documentation in the header to clarify
this.

Change-Id: I3bb4cf08ba399f85cc67b17fd13a4a40d862415f


[ROCm/ROCR-Runtime commit: 73f6bfa747]
2024-09-30 17:41:58 -04:00
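
The behaviour described in the commit above (only the listed agents change, everyone else keeps their permissions) can be illustrated with a small table update. This is a hypothetical model, not the hsa_amd_vmem_set_access implementation: `Access`, `AccessDesc` and `SetAccess` are names invented for the sketch.

```cpp
#include <cstdint>
#include <map>
#include <vector>

enum class Access { kNone, kReadOnly, kReadWrite };

// Hypothetical request entry: one agent id and the permission to apply.
struct AccessDesc {
  uint64_t agent_id;
  Access permission;
};

// Applies the requested permissions and leaves every agent that is not
// named in `requests` exactly as it was.
void SetAccess(std::map<uint64_t, Access>& table,
               const std::vector<AccessDesc>& requests) {
  for (const AccessDesc& req : requests) {
    table[req.agent_id] = req.permission;
  }
}
```

Under this model, revoking access has to be requested explicitly by passing the agent with `Access::kNone`.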
Jonathan Kim 86d28294b7 rocr: Fix race condition in IPC DMABUF socket server
Socket server accept calls do not guarantee synchronous actions
post-accept. This can result in a race condition.

To resolve this, first limit the socket server's listen backlog to a
single connection. This will force competing clients to busy-retry
until timeout.

Second, make the DMABUF IPC file descriptor send-receive and import
calls into an atomic routine per connection.

With these fixes, not only do we resolve potential races but
we guarantee that any exporter process will create at most one
file descriptor that will only last for the duration of the import
transaction.  This alleviates any concern on running into system
limits for the number of open file descriptors per process.

Change-Id: I6d8b14795a680d89a2707e082fa027d525792e05


[ROCm/ROCR-Runtime commit: 909b82d463]
2024-09-27 14:40:59 -04:00
Jonathan Kim ff4690de61 rocr: Fix IPC DMA Buf fragment handling and enable for development
Discarding blocks for reallocation on IPC export (done for better memory
performance) triggers memory violations with DMA Buf exports, so bypass
this for now, as application performance drops haven't been observed
with the bypass.

The raw fragment should be passed to the DMA Buf export call as well
since offsets will be implicitly applied in the Thunk/KFD for
export/import calls.

Also, use the agent information directly from the pointer
information so that the export call doesn't have to scan memory to find
this.  Pass the node ID in the handle so that the import call doesn't
have to make two thunk imports to fetch the node ID for GPU memory
imports.

Finally, allow the user to use DMA Buf IPC via
HSA_ENABLE_IPC_MODE_LEGACY=0 for developer testing as legacy mode will
be applied by default.

Change-Id: Ie8fe267f8768fa5df37126078406f7065f69ff4e


[ROCm/ROCR-Runtime commit: 32bb0764b7]
2024-09-27 14:40:42 -04:00
Young Hui ea8cf386f4 docs: move .readthedocs.yaml to the root of repo
Change-Id: I6c5afb806c47c029359a2dee2a7e73c6d076cfb1


[ROCm/ROCR-Runtime commit: b530e0f619]
2024-09-23 15:49:19 +00:00
Young Hui e2b14cc3ec docs: path adjustments to allow documentation to build again
- adding doc files to .gitignore

Change-Id: Ia6b2358bb1f236298ad1d705c1bed0636026632d


[ROCm/ROCR-Runtime commit: 75b674f0ad]
2024-09-23 15:49:13 +00:00
Yiannis Papadopoulos 14b3b04250 rocr/aie: Correct reporting of dev heap size
Storing the correct dev heap size in the memory region.

Change-Id: I14b053330c187da1d7d0213256625e50795b9902


[ROCm/ROCR-Runtime commit: 48fdc17179]
2024-09-20 12:44:23 -04:00
James Xu 94ee49bdbd rocr: Add nullptr check in IterateExecutables
When an entry is deleted from the array, it's set to nullptr
but not removed. Most other functions that
iterate over the array check if the entry is nullptr
but this loop in IterateExecutables did not.

Change-Id: I763b361eea59f6df201bb86ead0234e95f2cf79c


[ROCm/ROCR-Runtime commit: f3664fd124]
2024-09-19 19:44:53 +00:00
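
A minimal sketch of the guard described in the commit above, assuming a container whose deleted slots are set to nullptr. `Executable` and this `IterateExecutables` signature are simplified placeholders, not the ROCr loader types.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Simplified placeholder for a loaded executable.
struct Executable {
  int id;
};

// Calls `visit` for every live entry and returns how many were visited.
// Deleted entries remain in the vector as nullptr and must be skipped.
int IterateExecutables(const std::vector<Executable*>& slots,
                       const std::function<void(const Executable&)>& visit) {
  assert(visit && "callback must be callable");
  int visited = 0;
  for (const Executable* exec : slots) {
    if (exec == nullptr) continue;  // deleted slot, nothing to visit
    visit(*exec);
    ++visited;
  }
  return visited;
}
```

Without the nullptr test, the first deleted slot would be dereferenced inside the callback.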
David Yat Sin c6bb1eef90 rocr: Add extended fine-grain memory on host
Change-Id: Id9317fee89b51a5097459255e0a3092820eff430


[ROCm/ROCR-Runtime commit: 7f3dcd4e0b]
2024-09-19 19:44:53 +00:00
David Yat Sin 098bccbda4 rocr: Return err when freeing invalid pointer
Return false if trying to free a NULL pointer (or invalid size)
internally in ROCr. This is to detect errors within ROCr when trying
to free NULL pointers. If a user of ROCr tries to free a NULL
pointer, this condition should be caught at the beginning of the
Runtime::FreeMemory(...) function and return HSA_STATUS_SUCCESS. This
matches the behavior of the free(...) or delete functions, which
silently ignore calls when passed a NULL pointer.

Change-Id: I84bc26928b35023e19cd9f214b42c6ee9508029c


[ROCm/ROCR-Runtime commit: 0af7a54ebe]
2024-09-19 19:44:53 +00:00
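
The split described in the commit above (public entry point tolerates NULL, internal helper reports it as an error) could be sketched like this. `Status`, `InternalFree` and this `FreeMemory` signature are illustrative assumptions, and `std::free` stands in for the real deallocation path.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical status codes for the sketch.
enum Status { kSuccess = 0, kError = 1 };

// Internal helper: a null pointer or zero size at this level points to a
// bug inside the runtime, so it is reported as a failure.
bool InternalFree(void* ptr, std::size_t size) {
  if (ptr == nullptr || size == 0) return false;
  std::free(ptr);
  return true;
}

// Public entry point: like free(), a null pointer is silently accepted
// before the internal helper is ever reached.
Status FreeMemory(void* ptr, std::size_t size) {
  if (ptr == nullptr) return kSuccess;
  return InternalFree(ptr, size) ? kSuccess : kError;
}
```

Keeping the strict check internal lets user code stay as forgiving as free() while still surfacing runtime-side mistakes.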
David Yat Sin 5322dd3116 rocr: extend agents_allow_access to support VMM
Extend hsa_amd_agents_allow_access API to handle memory allocations done
via VMM APIs.

Change-Id: I4ae51d3e42dd104e98d513b1da86133d312a7203


[ROCm/ROCR-Runtime commit: 561c44a4a9]
2024-09-19 19:44:53 +00:00
David Yat Sin c34ba41fe4 rocr: refactor VMemorySetAccess function
Refactor VMemorySetAccess so that it can be re-used in the following
patch.

Change-Id: I341241da7a59724bb3611172f0d26b0689d7bb46


[ROCm/ROCR-Runtime commit: 8f1b05660a]
2024-09-19 19:44:53 +00:00
Sam Wu 01c1e91c70 Update rocm-docs-core to 1.8.0
Updating past rocm-docs-core 0.x.x requires Python 3.10 (specified in .readthedocs.yaml config)

Change-Id: I3421aa92cda62d48a1466046f1fdeb0a3abf3ef7


[ROCm/ROCR-Runtime commit: 5506b9af7a]
2024-09-19 19:44:53 +00:00
Tony Gutierrez 29de0df770 rocr/aie: Support VMEM handle creation
Adds support for AllocateMemoryOnly inside XDNA driver.

Move the IsLocalMemory() check inside the KFD driver
since the XDNA driver can, and needs to, create handles
on system memory buffer objects.

Changed handle variable name from thunk_handle to user_mode_driver_handle,
which is more representative if we support non-GPU drivers.

Change-Id: I95db9d575afd1ab0ff2de74cea5175d9a12a721b


[ROCm/ROCR-Runtime commit: 4bf102dc6b]
2024-09-19 19:44:53 +00:00
Tony Gutierrez f58a06c656 rocr/aie: Allocate AIE queue's ring buf
Change-Id: I799a8223d695ec5c0ea2eaea012bc1b5d877e103


[ROCm/ROCR-Runtime commit: 54a459c05c]
2024-09-19 19:44:53 +00:00
Tony Gutierrez db03f1130f rocr/aie: Init mem regions for AIE agents
Change-Id: If180bdbcb3eb659f0d05a710526864494316d7a9


[ROCm/ROCR-Runtime commit: a851f73da5]
2024-09-19 19:44:53 +00:00
Tony Gutierrez f096174cfc rocr/aie: Add AMD AIE Embedded Runtime vendor packets
Adds support for the packet interface for interacting with
the Embedded Runtime (ERT) on AIE agents. The ERT is what
interprets command packets sent to the AIE agent work
queues.

Change-Id: Id28fb98056b2c046354c446bdc9568d74385bea1


[ROCm/ROCR-Runtime commit: 6abb993f65]
2024-09-19 19:44:53 +00:00
Tony Gutierrez 1103441d22 rocr/aie: Add support for creating AIE queue context
Adds support for initializing the XDNA driver so that
a hardware context can be created for an AIE queue.

Right now this initializes the device heap in the driver,
gets the relevant tile parameters for the AIE agent,
and creates a hardware context that backs the AIE queue.

Change-Id: Ib90e1bc67a8637f6db3ff2bebe34677843796417


[ROCm/ROCR-Runtime commit: 931733d51a]
2024-09-19 19:44:53 +00:00
David Yat Sin 3189a6b827 rocr: Remove unnecessary function declarations
Change-Id: Ia2613ce74cac808f9239fc24049b57b7b1abaed9


[ROCm/ROCR-Runtime commit: d6ec7b6489]
2024-09-19 19:44:53 +00:00
David Yat Sin c4874743ce rocr: Fix compile error
Change-Id: Iae6bf08e834a426f6f97cbc51d2a1a38199015bd


[ROCm/ROCR-Runtime commit: 12e299e8d4]
2024-09-19 19:44:53 +00:00
David Yat Sin e000dabf8b rocr: Increase queue size for co-op queues
Increase queue-size for co-op queues to 16K to improve performance on
some workloads

Change-Id: I4d3bf0ecbd30ebb648b68d9c5fdabadc670a386c


[ROCm/ROCR-Runtime commit: 924e11ba7f]
2024-09-19 19:44:53 +00:00
Jonathan Kim 2ecc057548 rocr: Reverse host-device copy engines on GFX94x
GFX 9.4.x has better performance for CPU-GPU copies when using
engines in reverse order from other devices.

Change-Id: I1eaebf0e837bb7f44712f40d5115df618f6a73d7


[ROCm/ROCR-Runtime commit: 509e8d863a]
2024-09-06 19:02:59 -04:00
Jonathan Kim 9e58f63b9f rocr: Fix backwards compatible host-device copies on target engines
If the KFD doesn't support targeting SDMA engines, ensure that ROCr
selects the correct downstream queue type by using an invalid engine.

Change-Id: Ia6848126f67f3d35ab37248633e8e0e6e2d77fff


[ROCm/ROCR-Runtime commit: 24b25003b0]
2024-09-06 19:02:51 -04:00
Saleel Kudchadker 8d1fe1f7ea rocr: Allocate AQL queue on device memory
- Use HSA_ALLOCATE_QUEUE_DEV_MEM=1 to create AQL queue in device
memory.
- Before writing AQL packet header to the queue use an SFENCE to ensure
that there is no reordering of the writes over PCIe.

Change-Id: I5eacdc35108c4a1e245c75ae349b7495451aa60d


[ROCm/ROCR-Runtime commit: 3baaa6e9c0]
2024-09-05 17:48:02 -04:00
Chris Freehill b56dd0ef3b hsakmt: Fix ROCr static lib build in new layout
Change-Id: Idc71524924b96a44d63be9b1d0fccbe0e328d96e


[ROCm/ROCR-Runtime commit: 7e13b9e62f]
2024-09-05 10:26:06 -04:00
Tony Gutierrez e58ac6b591 rocr: Generalize AMD::MemoryRegion Allocate and Free
Remove KFD-specific Allocate/Free calls from the AMD::MemoryRegion.
The KFD-driver-specific Allocate/Free calls are now implemented in
the KfdDriver. Future changes will migrate the remaining KFD-specific
calls out of AMD::MemoryRegion.

This allows the MemoryRegion to be used across AMD drivers like the
XDNA driver.

Change-Id: Ib6a2a9e5e1a15e61644d2592beb3a8e6578c3010


[ROCm/ROCR-Runtime commit: 68669f4e1a]
2024-08-28 14:35:07 -07:00
Tony Gutierrez 6e02f63333 rocr/kfd-driver: Add initial KFD driver interface
Adds the initial KFD driver interface and use it to open the
KFD from amd_topology.cpp.

This change is to show the direction of the Driver interface for
initially supporting the KFD and to get feedback on the approach.
For now we wrap relevant ROCt calls behind this generic driver
interface so that we can generalize core ROCr components like
MemoryRegion, Runtime, etc.

Now that ROCt is incorporated into ROCr, we can more fully integrate
ROCt into the Driver interface. Ideally, we get to a point where
the generic Driver interface can support KFD, XDNA, and potential
future drivers.

Change-Id: I4573fd6af1f8398233ee9d3814d9f3139dd0279c


[ROCm/ROCR-Runtime commit: c42ff44a6a]
2024-08-28 14:34:54 -07:00
Tony Gutierrez c75f2d749d rocr/xdna-driver: Initial support for amdxdna driver
Change-Id: I319b55d89dc644e7151228cb6c19d1a633171295


[ROCm/ROCR-Runtime commit: 86f40ae489]
2024-08-28 14:34:39 -07:00
Shweta Khatri ed5b2e6661 Set internal cache for rocprofiler-register dependency
Change-Id: I8a661818c11c4de0df9743dacb78b7c5163b6da9


[ROCm/ROCR-Runtime commit: da69ffff0f]
2024-08-28 14:48:51 -04:00
Tony Gutierrez 64e2d37be8 rocr/aie: Add initial support for AIE agents
This change adds the initial classes for the AIE agent and AIE AQL
queue.

An AIE agent list is added to the core runtime object.

Change-Id: I84b02f52171b80726dfb2c8431582a3ea2986eb3


[ROCm/ROCR-Runtime commit: 8ea62f1cea]
2024-08-27 14:47:05 -07:00
David Yat Sin 5cb81438c7 Set ELF_GETSHDRSTRNDX when cxx compiler is not loaded
Change-Id: Ia26b8999909f688ce78d9bbe4cb2a7262df2ee02


[ROCm/ROCR-Runtime commit: cb672ebcd1]
2024-08-22 17:20:37 -04:00
David Yat Sin de85c5738e rocr: Handle pthread_create returning errors
Rewriting logic to fix issue where pthread_create would return errors
other than EINVAL, and these errors would be ignored.

Change-Id: I573958724dcf886c20e8c14e6a9182303b3ffa06


[ROCm/ROCR-Runtime commit: c8dd4d2b3b]
2024-08-22 12:15:10 -04:00