We unmap memory with a different pointer.
The ROCr runtime might be confused and silently ignore the unmap request.
Change-Id: Ic5a1387a426cf02a985a4ef8ff8ff05e6a870cbf
[ROCm/clr commit: e5588f188c]
PAL may internally align up the allocation size to the page size
reported by KMD. This will cause a mismatch in size between OCL and PAL.
To avoid this, use PAL size when updating the free memory counter on
both alloc and free.
Change-Id: Ic6e8c861a52170476474fb70a769eef93be3261f
[ROCm/clr commit: be66e29e94]
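The accounting described in the PAL size commit above can be sketched as follows. This is a minimal illustration, not the runtime's actual code: `AlignUp` and `FreeMemCounter` are hypothetical names, and the 4096-byte page size in the usage note is an assumed value rather than one reported by KMD. The point is that the counter is adjusted by the size the backend actually allocated, on both alloc and free, so the two sides cannot drift apart.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds size up to the next multiple of page.
constexpr std::uint64_t AlignUp(std::uint64_t size, std::uint64_t page) {
  return ((size + page - 1) / page) * page;
}

// Hypothetical stand-in for the runtime's free memory counter.
class FreeMemCounter {
 public:
  explicit FreeMemCounter(std::uint64_t total) : total_(total), free_(total) {}

  // backendSize is the size the backend really allocated (already aligned),
  // not the size the caller requested.
  void OnAlloc(std::uint64_t backendSize) {
    assert(backendSize <= free_);
    free_ -= backendSize;
  }

  // Must be called with the same backend size that was passed to OnAlloc.
  void OnFree(std::uint64_t backendSize) {
    assert(free_ + backendSize <= total_);
    free_ += backendSize;
  }

  std::uint64_t free() const { return free_; }

 private:
  std::uint64_t total_;
  std::uint64_t free_;
};
```

For example, a 5000-byte request with a 4096-byte page is charged as `AlignUp(5000, 4096)` bytes on alloc and credited by the same amount on free, so the counter returns to its starting value.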
Fix dumpImage() issue exposed on Win10.
It is called by the internal compiler on Navi14.
Change-Id: I693ffd45b6b03657822afdc872781901bc69b65d
[ROCm/clr commit: b74d120627]
Enable this optimization when the barrier is disabled, since
reuse requires a signal wait.
Use the size of pending AQL signals as the size of the signal pool.
Change-Id: I2754a0f8b67e19d2601c58945e10fdf0e8be1624
[ROCm/clr commit: a5661192b6]
On ReBar systems the invisible heap is not present, so in theory we should
fail to create the suballocation chunk. However, PAL doesn't report any
errors.
To make sure we never fail, allow creating the allocation in the visible
heap and system memory.
Change-Id: Iea9cc68d98b9cb396a2b7a37398b98b66274083b
[ROCm/clr commit: 330b674821]
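The heap fallback described in the ReBar commit above can be pictured as an ordered preference list. The sketch below is illustrative only: the `Heap` enum and `PickHeap` are hypothetical names and do not reflect PAL's actual heap enumeration or allocation API. It shows the idea of trying the invisible heap first and falling back to the visible heap and then system memory.

```cpp
#include <array>
#include <optional>

// Hypothetical heap identifiers; the real PAL enum differs.
enum class Heap { Invisible, Visible, System };

// Returns the first heap, in preference order, that the caller-supplied
// predicate reports as present. Returns std::nullopt if none is present.
template <typename Pred>
std::optional<Heap> PickHeap(Pred isPresent) {
  constexpr std::array<Heap, 3> kOrder = {Heap::Invisible, Heap::Visible,
                                          Heap::System};
  for (Heap heap : kOrder) {
    if (isPresent(heap)) {
      return heap;
    }
  }
  return std::nullopt;
}
```

With a predicate that reports the invisible heap as absent, as on a ReBar system, `PickHeap` selects the visible heap instead of failing.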
Now rocm/rocdevice.cpp also includes comgrctx.hpp, and we don't want to statically link against comgr when building shared libs.
Change-Id: Ic330bd860559b3e07b776c951afe6126b0f43f7d
[ROCm/clr commit: b38317cb3c]
This is helpful when debugging issues on low-end ASICs. Navi14 can be emulated as Navi10. Likewise, Navi22 can be emulated as Navi21.
Change-Id: I693ffd45a5b03657822afdc872781901bc69b65c
[ROCm/clr commit: 26d1b28b16]
With the PAL_ALWAYS_RESIDENT flag, memory objects are resident at allocation time, so there is no need to make them resident again before submit.
Also, we should never evict anything with this setting, or we'll generate a VM fault.
Change-Id: Ieacc6af88ab4e09c20efd94100e148b2502e1d70
[ROCm/clr commit: fd09a7a23c]
The change reuses HSA signals for dispatches as a wait signal.
Skipping the barrier requires disabling the L2 cache for sysmem
allocations and extra tracking for HDP access with the large BAR.
ROC_BARRIER_SYNC=0 activates the new logic. Barrier sync is
still used by default.
ROC_ACTIVE_WAIT=1 enables unconditional active wait in ROCr.
The change also consolidates the ROCr wait logic under a single function.
Change-Id: I6bd1be30aa88258da1b1f9de319ef5a45852afd8
[ROCm/clr commit: d9397590de]
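As a rough illustration of how the two settings from the signal reuse commit above interact, the sketch below reads them from the environment. It is not the runtime's flag machinery: `ParseFlag`, `ReadFlag`, `SyncConfig`, and `LoadSyncConfig` are hypothetical names. The default of barrier sync being on comes from the commit message; the default of active wait being off is an assumption inferred from "ROC_ACTIVE_WAIT=1 enables".

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical parser: unset or empty -> default, "0" -> false,
// any other value -> true.
inline bool ParseFlag(const char* value, bool defaultValue) {
  if (value == nullptr || *value == '\0') {
    return defaultValue;
  }
  return std::strcmp(value, "0") != 0;
}

// Reads a boolean flag from the process environment.
inline bool ReadFlag(const char* name, bool defaultValue) {
  return ParseFlag(std::getenv(name), defaultValue);
}

// Hypothetical container for the two settings named in the commit message.
struct SyncConfig {
  bool barrierSync;  // ROC_BARRIER_SYNC, on by default per the commit message
  bool activeWait;   // ROC_ACTIVE_WAIT, assumed off by default
};

inline SyncConfig LoadSyncConfig() {
  return SyncConfig{ReadFlag("ROC_BARRIER_SYNC", true),
                    ReadFlag("ROC_ACTIVE_WAIT", false)};
}
```

Running with `ROC_BARRIER_SYNC=0` in the environment would make `LoadSyncConfig().barrierSync` false, which corresponds to selecting the new signal reuse path.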
SWDEV-249719 - root cause: queues with a custom CU mask are not inserted
into queuePool_ (i.e., the queue of reusable HSA queues) of the ROC device class,
causing a crash when creating hostcall buffers for printf.
Change-Id: Ieee7005d9a5a30b3113394ce23ee65927126d0d6
[ROCm/clr commit: 2e199bd492]