Fix VirtMemory_Basic_Test permissions to account for the
hsa_amd_vmem_set_access behavior change introduced by this patch:
rocr/vmm: Only modify permisions for specified agents
Change-Id: I97230600b9b9144459b08ca3da3a5bfbdbb98231
[ROCm/ROCR-Runtime commit: ead3aafcda]
Devices older than GFX90a hit a segfault on queue unmap when an
SDMA queue has been assigned a fixed engine. Bypass fixing the
engine for these devices for now.
Change-Id: I7d2f882d2377f004a7bb65f3b397396db07ce6d3
[ROCm/ROCR-Runtime commit: 1d6ff45673]
If you build thunk following the instructions in the thunk's README,
there is no /lib folder in the build folder. Adjust the include path,
and clean up the docs to reflect that. The header include is already
defined in the CMake file as ../../include, so LIBHSAKMT_PATH is only
needed for the library location, not for the headers.
Change-Id: I73435d59adb9d01f527a28b1935086260e9d3d70
Signed-off-by: Kent Russell <kent.russell@amd.com>
[ROCm/ROCR-Runtime commit: ccd80d19ba]
Fixed multiple issues related to memory management, atomicity,
and error handling across various functions: handle null checks,
use-after-free, unchecked returns, and memory leaks.
Change-Id: Ia7c76320cc20e24001052fbba2dd0600bd412140
[ROCm/ROCR-Runtime commit: c9454794b6]
Fix memory leak for memory region objects when a GPU is masked using
ROCR_VISIBLE_DEVICES.
Change-Id: I610842a18adbc3cdc854b12650844e271bc00592
[ROCm/ROCR-Runtime commit: dbae8da515]
To correctly map to all GPUs after an import, use the new extended
registration call that can import a virtual address without having to
specify a target node.
Change-Id: Ifca8f6f6ee24fa99b2af357dcc3ea1de3ab234f7
[ROCm/ROCR-Runtime commit: 0ae064fe2d]
Currently registering graphics memory without specifying a target
node will return a memory handle that's not a virtual address.
As a result, ROCr is forced to register with a target node for
IPC usage.
Mapping memory without specifying a target node afterwards will
result in mapping to the target node that was imported because the
previous import call flags this node targeting action to future mapping.
For ROCr IPC usage, ROCr wants to map to all GPU nodes if the target node
is not specified.
Allow the caller to register graphics handles that return a virtual
address without having to specify the target node so that the caller
can make a subsequent map call to all GPUs.
Change-Id: I5a935092b885cc3568e4f3a5dd951c7ec6c84fca
[ROCm/ROCR-Runtime commit: 03463ed2c0]
In a static build, the dev and binary components are grouped to generate a static package.
Remove the line that was ignoring the component grouping.
Change-Id: Ie0ca9db109f2002891260985634f2e6b1ea7f236
[ROCm/ROCR-Runtime commit: f27ae44b8c]
When hsa_amd_vmem_set_access is called, do not remove permissions for
unspecified agents. Also update the documentation in the header to
clarify this.
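
The intended semantics can be sketched with a toy permission table. This
is an illustrative model only, not the ROCr implementation: AgentId,
Access and SetAccess are hypothetical names introduced for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for agent handles and access permissions.
using AgentId = int;
enum class Access { None, ReadOnly, ReadWrite };

// Apply the requested permissions. Agents that do not appear in
// `requests` keep whatever permission they already had.
void SetAccess(std::map<AgentId, Access>& table,
               const std::vector<std::pair<AgentId, Access> >& requests) {
  for (std::size_t i = 0; i < requests.size(); ++i) {
    table[requests[i].first] = requests[i].second;
  }
}
```

With this model, a call that only names agent 1 changes agent 1 and
leaves the entry for agent 0 as it was.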
Change-Id: I3bb4cf08ba399f85cc67b17fd13a4a40d862415f
[ROCm/ROCR-Runtime commit: 73f6bfa747]
Currently, KFDPerformanceTest.P2PBandWidthTest cannot work if there are
more than 16 KFD nodes in the system. This limit was put in to match the
number of SDMA queues supported on a single node.
This patch updates the test to make it run on systems with more than
16 KFD nodes.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I561d0cdef664cae84fb9c13a801052e2001256e5
[ROCm/ROCR-Runtime commit: b81e45f03c]
Socket server accept calls do not guarantee synchronous actions
post-accept. This can result in a race condition.
To resolve this, first limit the socket server's listen backlog to a
single connection. This will force competing clients to busy-retry
until timeout.
Second, make the DMABUF IPC file descriptor send-receive and import
calls into an atomic routine per connection.
By doing these fixes, not only do we resolve potential races but
we guarantee that any exporter process will create at most one
file descriptor that will only last for the duration of the import
transaction. This alleviates any concern on running into system
limits for the number of open file descriptors per process.
Change-Id: I6d8b14795a680d89a2707e082fa027d525792e05
[ROCm/ROCR-Runtime commit: 909b82d463]
Discarding blocks for reallocation on IPC export for better memory
performance triggers memory violations with DMA BUF exports, so bypass
this for now, as no application performance drops have been observed
with the bypass.
The raw fragment should be passed to the DMA Buf export call as well
since offsets will be implicitly applied in the Thunk/KFD for
export/import calls.
Also, use the agent information directly from the pointer
information so that the export call doesn't have to scan memory to find
this. Pass the node ID in the handle so that the import call doesn't
have to make two thunk imports to fetch the node ID for GPU memory
imports.
Finally, allow the user to use DMA Buf IPC via
HSA_ENABLE_IPC_MODE_LEGACY=0 for developer testing as legacy mode will
be applied by default.
Change-Id: Ie8fe267f8768fa5df37126078406f7065f69ff4e
[ROCm/ROCR-Runtime commit: 32bb0764b7]
Fix some potentially unreleased memory, null value checks, files not
closed, and other such issues reported by CodeQL.
Change-Id: Ia679aff97a773a642d8c8cbadeae30955554a62e
Signed-off-by: Kent Russell <kent.russell@amd.com>
[ROCm/ROCR-Runtime commit: d64e33520f]
In a VM with 6 vCPUs, work queued with
queue_delayed_work(system_freezable_wq) is scheduled later than on bare
metal (BM). The HSA_SMI_EVENT_QUEUE_RESTORE event from case
HMMProfilingEvent/0 was therefore delivered late and caused
HMMProfilingEvent/1 to fail.
The fix is to listen only for the HSA_SMI_EVENT_MIGRATE_START event and
ignore all other events.
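
A minimal sketch of the filtering idea, assuming events arrive as a
sequence of type tags. The SmiEvent enum and the CountMigrateStart helper
are hypothetical and do not reflect the actual test code or the KFD
event encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event tags; the real definitions live in the KFD headers.
enum class SmiEvent { MigrateStart, MigrateEnd, QueueEviction, QueueRestore };

// Count only MigrateStart events and skip everything else, so a late
// QueueRestore left over from an earlier test case cannot affect the
// result.
int CountMigrateStart(const std::vector<SmiEvent>& events) {
  int count = 0;
  for (std::size_t i = 0; i < events.size(); ++i) {
    if (events[i] != SmiEvent::MigrateStart) continue;
    ++count;
  }
  return count;
}
```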
Change-Id: I534e49b030bd4c534bc7a63eb431f4907659c8cd
[ROCm/ROCR-Runtime commit: 5a1b6bf14d]
Update the HSA capabilities field with precise ALU ops bit support
for GPU debugging.
Change-Id: I796f2c2e0559577828aba510c401ed5187e10179
[ROCm/ROCR-Runtime commit: 027af8dacd]
Update commentary on HWS scheduler support bit for GPU debugging in
the HSA capabilities node properties field.
Change-Id: I59c519d74a528d5ecf5817ef94e75091314bd844
[ROCm/ROCR-Runtime commit: a926a070ee]
Fix data race by protecting events_page access with mutex in event create
Fix potential NULL dereference in hsaKmtWaitOnMultipleEvents_Ext
Fix unchecked return value in hsaKmtCreateEvent function
Change-Id: I434bef43666e5205a8b061259569c1d99a952752
[ROCm/ROCR-Runtime commit: 857200e28c]
Eliminated declared but not referenced variables to fix warnings
Change-Id: I80032a699fb59ce4635c5001f669d009ba60e588
[ROCm/ROCR-Runtime commit: 303c02690d]
We had skipped doing it for PAGE_SIZE, but it should be left as the
regular PAGE_SHIFT name, especially for users who are using different
headers. We want PAGE_SHIFT and PAGE_SIZE to be consistent with one
another, so set them both explicitly to the same value if either
of them is undefined.
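
One way to express this with the preprocessor is sketched below. The
shift value of 12 (4 KiB pages) is an assumption made for the example
and is not taken from the patch.

```cpp
#include <cassert>

// If either macro is missing, (re)define both from a single shift value
// so that they cannot disagree. #undef on an undefined name is harmless.
#if !defined(PAGE_SHIFT) || !defined(PAGE_SIZE)
#undef PAGE_SHIFT
#undef PAGE_SIZE
#define PAGE_SHIFT 12                   /* assumed value: 4 KiB pages */
#define PAGE_SIZE (1UL << PAGE_SHIFT)
#endif
```

Deriving PAGE_SIZE from PAGE_SHIFT keeps a single source of truth, so a
header that defines only one of the two cannot cause a mismatch.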
Change-Id: I121d81c48409dd77351b59a192d824e2419a2410
Signed-off-by: Kent Russell <kent.russell@amd.com>
[ROCm/ROCR-Runtime commit: daad183bf8]
Add check before close to prevent closing invalid file descriptors
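
The guard can be sketched as follows. CloseIfValid is a hypothetical
helper written for this note, not a function from the thunk or ROCr.

```cpp
#include <cassert>
#include <unistd.h>

// Close `fd` only if it is non-negative, then mark it as closed.
// Returns true if close() was called and succeeded.
bool CloseIfValid(int& fd) {
  if (fd < 0) return false;          // never hand an invalid fd to close()
  const bool ok = (close(fd) == 0);
  fd = -1;                           // guards against a later double close
  return ok;
}
```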
Change-Id: Ie1d50e0d55159512a14a70c1e4be058218aae668
[ROCm/ROCR-Runtime commit: ff6e1b44bf]
When an entry is deleted from the array, it's set to nullptr
but not removed. Most other functions that
iterate over the array check if the entry is nullptr
but this loop in IterateExecutables did not.
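
The pattern is sketched below with a hypothetical Executable type and
IterateLive helper. It only illustrates the nullptr check and is not the
actual loader code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

struct Executable { int id; };  // hypothetical placeholder type

// Visit every live entry. Deleted slots stay in the array as nullptr,
// so the loop has to skip them before dereferencing.
void IterateLive(const std::vector<std::unique_ptr<Executable> >& slots,
                 const std::function<void(const Executable&)>& visit) {
  for (std::size_t i = 0; i < slots.size(); ++i) {
    if (slots[i] == nullptr) continue;
    visit(*slots[i]);
  }
}
```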
Change-Id: I763b361eea59f6df201bb86ead0234e95f2cf79c
[ROCm/ROCR-Runtime commit: f3664fd124]
Return false if trying to free a NULL pointer (or invalid size)
internally in ROCr. This is to detect errors within ROCr when trying
to free NULL pointers. If a user of ROCr tries to free a NULL
pointer, this condition should be caught at the beginning of the
Runtime::FreeMemory(...) function and return HSA_STATUS_SUCCESS. This
matches the behavior of the free(...) or delete functions, which
silently ignore calls when passed a NULL pointer.
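
The split between the public and internal paths can be sketched like
this. Status, InternalFree and PublicFree are hypothetical names for the
example; the real entry point is Runtime::FreeMemory(...), which returns
HSA_STATUS_SUCCESS in the NULL case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

enum class Status { Success, Error };  // hypothetical status codes

// Internal helper: a NULL pointer or an invalid size points to a bug
// inside the runtime, so report it by returning false.
bool InternalFree(void* ptr, std::size_t size) {
  if (ptr == nullptr || size == 0) return false;
  std::free(ptr);
  return true;
}

// Public entry point: NULL is accepted silently, like free(NULL), and
// never reaches the internal helper.
Status PublicFree(void* ptr, std::size_t size) {
  if (ptr == nullptr) return Status::Success;
  return InternalFree(ptr, size) ? Status::Success : Status::Error;
}
```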
Change-Id: I84bc26928b35023e19cd9f214b42c6ee9508029c
[ROCm/ROCR-Runtime commit: 0af7a54ebe]
Refactor VMemorySetAccess so that it can be re-used in the following
patch.
Change-Id: I341241da7a59724bb3611172f0d26b0689d7bb46
[ROCm/ROCR-Runtime commit: 8f1b05660a]
Individual simple tests such as CPUAccessToGPUMemoryTest are taking
several hours on emulators as the total amount of VRAM keeps increasing.
Limit the pool sizes to 2GB, only on emulator.
Change-Id: I4b33e8549f89413da255731e6748f606ca64a663
[ROCm/ROCR-Runtime commit: 588a5a2fd3]
Adds support for AllocateMemoryOnly inside XDNA driver.
Move the IsLocalMemory() check inside the KFD driver
since the XDNA driver can, and needs to, create handles
on system memory buffer objects.
Changed handle variable name from thunk_handle to user_mode_driver_handle,
which is more representative if we support non-GPU drivers.
Change-Id: I95db9d575afd1ab0ff2de74cea5175d9a12a721b
[ROCm/ROCR-Runtime commit: 4bf102dc6b]
If BUILD_SHARED_LIBS doesn't get set at all, other projects importing
hsakmt may have an inconsistent state regarding SHARED vs STATIC.
Instead of setting the variable and declaring it as an option(), just
detect the variable and use it.
Change-Id: I9d5a5fc6049ca5351f5e7c63d38ee9bfcb89bdad
Signed-off-by: Kent Russell <kent.russell@amd.com>
[ROCm/ROCR-Runtime commit: 2a9572dda0]
The fmm_node_[added|removed] functions were added in the initial FMM
support, but weren't used. Remove them now since no one's referencing
them
Change-Id: I1e46e57294a72012227b38f46c7099de0b9263be
Signed-off-by: Kent Russell <kent.russell@amd.com>
[ROCm/ROCR-Runtime commit: 3b61f75f49]
The license file is already shipped in the hsa-rocr package. The Devel package does not need to include it as well.
Change-Id: I08cceeb169d0c061078cd495342f78c089087f0d
[ROCm/ROCR-Runtime commit: d60f56ab32]
Adds support for the packet interface for interacting with
the Embedded Runtime (ERT) on AIE agents. The ERT is what
interprets command packets sent to the AIE agent work
queues.
Change-Id: Id28fb98056b2c046354c446bdc9568d74385bea1
[ROCm/ROCR-Runtime commit: 6abb993f65]
Adds support for initializing the XDNA driver so that
a hardware context can be created for an AIE queue.
Right now this initializes the device heap in the driver,
gets the relevant tile parameters for the AIE agent,
and creates a hardware context that backs the AIE queue.
Change-Id: Ib90e1bc67a8637f6db3ff2bebe34677843796417
[ROCm/ROCR-Runtime commit: 931733d51a]
Ensure rocprofiler-register is linked and added to DEB and RPM package dependencies.
GitHub ticket - https://github.com/ROCm/ROCm/issues/3654
Change-Id: Iaaaca8bfa81ca33da147673ef1be798109b70aa5
[ROCm/ROCR-Runtime commit: c30ff893a6]