A CPU read updates L2 with the latest values and requires an
invalidation afterwards, because SDMA doesn't use L2 and the data can
become out of sync.
Change-Id: I98d1c91ca78a103fa5409e638f97485d62d5b11e
When the OCL ROCr backend performs CL_MEM_COPY_HOST_PTR, it may attempt
to access the amd::Memory object it's currently creating, which isn't
ready yet. The logic creates a temporary dummy object to perform the
copy transfer. This change makes sure the runtime skips allocating the
same device::Memory object a second time.
Change-Id: I14c6a00a3941fdcaa6aea299e9f096e4c3f5cadf
Make sure the logic updates the command status when the command is
done in HW, not on submission.
Add last-command tracking; otherwise the queue sync logic in the HIP
upper layer may skip synchronization, assuming the queue is empty.
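
A minimal sketch of the idea, not the runtime's actual code. The names
Status, Command, Queue, submit(), onHwComplete() and hasPendingWork()
are hypothetical and exist only for illustration. It assumes an
in-order queue where the HW reports completion per command.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types for illustration; not the runtime's real classes.
enum class Status { Queued, Submitted, Complete };

struct Command {
  Status status = Status::Queued;
};

class Queue {
 public:
  // Submission only records the command as the most recent one.
  // It must not mark the command complete.
  void submit(const std::shared_ptr<Command>& cmd) {
    assert(cmd != nullptr);
    cmd->status = Status::Submitted;
    last_ = cmd;
  }

  // Called when the hardware signals that the command has finished.
  void onHwComplete(const std::shared_ptr<Command>& cmd) {
    assert(cmd != nullptr);
    cmd->status = Status::Complete;
    if (last_ == cmd) last_.reset();  // nothing left in flight
  }

  // The upper layer may skip synchronization only when this is false.
  bool hasPendingWork() const { return last_ != nullptr; }

 private:
  std::shared_ptr<Command> last_;  // last submitted, not yet completed
};
```

With this shape, a sync call that checks hasPendingWork() cannot
mistake a queue with in-flight work for an empty one.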
Change-Id: I2d046792553e74df090a10f7d7a78914610f6df2
Use barrier packets for every profile marker that gets submitted
and use the completion signal to get the GPU timestamp. This gives the
most accurate dispatch time. Combine cache flushes with the profile
marker if there is a pending dispatch that needs a cache flush. This
optimization saves an extra barrier and improves wall time.
Change-Id: Ib62d6d7aabf4743827b561be6c9c5afa813203da
[PAL to KFD/ROCr][ROCr_Runtime][Vega10] OCLSeparateCompile subtest of
oclcompiler from ocltst test package is encountering clLinkProgram()
failed (chksum 0x00000001) error
If the runtime does not provide a dump file name to the ELF library,
the ELF library uses a temp file in the current folder.
The current folder may not be writable for several reasons:
1. The application's current folder might be a system folder for which
the user does not have write permission.
2. The current folder is on a read-only file system. This happens for
embedded customers.
Tested on VEGA10; the issue is fixed.
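
One possible way for the runtime to supply a file name, sketched under
assumptions: the helper makeDumpFilePath() is hypothetical, and placing
the file under the system temporary directory is an assumption for
illustration, not necessarily what this change does. Requires C++17.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: build a dump-file path under the system temporary
// directory rather than the current working directory.
inline std::string makeDumpFilePath(const std::string& baseName) {
  assert(!baseName.empty());
  std::error_code ec;
  fs::path dir = fs::temp_directory_path(ec);
  if (ec) {
    // Fall back to the current directory if no temp directory is known.
    dir = fs::path(".");
  }
  return (dir / baseName).string();
}
```

The caller would pass the returned path to the ELF library as the dump
file name, so the library never has to pick a location on its own.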
Change-Id: Ic0e9f040b7c7583914301673cce237ab28b0c0cb
PAL doesn't perform chunking for system memory allocations, hence we
should fall back to using pinned memory for mapping large buffers.
Change-Id: I1b472616b72d12ed0105fb65532acacdb98ac7b3
If deferred allocation is disabled, then make sure the image view
is created without a delay. Also reset the allocation state, since the
create() method isn't called for view creation.
Change-Id: I7aa22a62bff18289ade83e56b5d3305ba68c715b
The hack doesn't really track the command status. It may not be
necessary for HIP, but will cause early resource release.
Change-Id: I791ad36dd8abd3b6b3d2c9b16a210a555c08ca64
A device's offset in Pal::AsicRevision can change from time to time,
while the current implementation assumes the offset never changes.
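
A minimal sketch of a lookup that does not depend on enum offsets. The
AsicRevision enumerators, their values, DeviceInfo and findDeviceInfo()
are all invented for illustration and do not mirror the real
Pal::AsicRevision. Requires C++17.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for Pal::AsicRevision; values are made up.
enum class AsicRevision : int { Unknown = 0, ChipA = 7, ChipB = 12 };

struct DeviceInfo {
  std::string name;
};

// Key the table by the enumerator itself, so renumbering the enum does
// not silently shift entries the way `table[rev - firstRev]` would.
inline std::optional<DeviceInfo> findDeviceInfo(AsicRevision rev) {
  static const std::unordered_map<int, DeviceInfo> table = {
      {static_cast<int>(AsicRevision::ChipA), {"chip-a"}},
      {static_cast<int>(AsicRevision::ChipB), {"chip-b"}},
  };
  auto it = table.find(static_cast<int>(rev));
  if (it == table.end()) return std::nullopt;  // unknown revision
  return it->second;
}
```

An unknown revision yields an empty result instead of reading a
neighbouring entry.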
Change-Id: Id993512aa0da6e0b2356f594d5e58f76d1f97f16