Since the sub-buffer(virtual memory that is mapped to device memory) is associated with device memory, it should utilize the device context instead of the host context. The original implementation caused hipMemcpyPeer to not take the P2P path, as the memory object was treated as host memory.
* SWDEV-527299 - Support HIP_POINTER_ATTRIBUTE_CONTEXT
As HIP enables UVA by default, it seems we can simply expose the context to support this feature.
Currently, we check if there's enough system RAM even if we don't allocate on host device. This is incorrect logic.
We should not check for this size on windows because PAL checks for memory allocation. See SWDEV-467263.
Co-authored-by: Jimbo Xie <jiabaxie@amd.com>
- This change tries to save extra synchronization packets we may insert
as we didnt track the completion signals for every command. We track
the current enqueued command until it exits the enqueue stage. We also
record the exit scope to know if we flushed the caches
- Handle correct release scopes and store completion signal as HW events
- Use a new finishCommand implementation to only wait for the command
passed as the argument
Change-Id: Ie4350c5dd24f5d48dfa6ccbabd892f0544caadcc
- Added missing validation as graph node should not be created
if parameters are invalid
- Fix conversion of input params to graphNode params
Change-Id: I37ab04942b5fb2eb07386850cb7dbbf26f9ca967
On Windows, hipHostRegister may add a single object in the MemObjMap
that maps to memory that is allocated on different devices.
This change ensures that the offset that is returned from
getMemoryObject() is computed relative to the memory that is allocated
on the current device.
Change-Id: I5fd3af200bf6f4926fdeaea12dcb9d0154d3a843
This change replaces some asserts, that were only available in debug
mode, with standard error handling.
Signed-off-by: Sebastian Luzynski <Sebastian.Luzynski@amd.com>
Change-Id: I112f9e56f921abd72daf0d11e4ecdcb7b1a9f9e6
This reverts commit efce2f77c4.
Reason for revert: Even though this change is valid, this would break backward compatibility.
Change-Id: I9c7cab83198c8d5c8485b11194099162e3e7a874
PAL supports allocating from system memory once device memory is used up
or allocation is larger than the device memory.
Change-Id: Iccd3377e95a6cc6d23e45d4738a17af8b9ee32d7
Although unpinned copies require synchronizations
in HIP, runtime can avoid syncs for H2D copies with
a staging buffer
Change-Id: If2203c6bc0cbd89742823688dc8e89e9acd873b2
This change adds a new HIP API `hipExtHostAlloc` which preserves
the functionality of `hipHostMalloc`.
Change-Id: I13504c6fc13465ddd7aed329795bb4f2fef1baff
- awaitCompletion would wait for host side command compelete(aka
cpuWait). The correct way is to check the completion signal and if not
dispatch a marker that has a signal.
Change-Id: I0f4f23c7ea68c329bf1d5f05e9735f631e5e3808