Enqueue a handler callback for hipEventRecords (aka marker_ts_) every
64 submits. This recycles the memory if we don't end up calling
synchronize for a long time.
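
The periodic recycling described above could look roughly like the sketch
below. MarkerQueue, submitMarker and the handler signature are hypothetical
names chosen for illustration; only the interval of 64 submits comes from
this change.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a queue that keeps event-record markers alive
// until someone synchronizes. Not the actual rocclr class.
class MarkerQueue {
 public:
  // Interval taken from the commit text ("every 64 submits").
  static constexpr std::size_t kCallbackInterval = 64;

  explicit MarkerQueue(std::function<void(std::size_t)> onRecycle)
      : onRecycle_(std::move(onRecycle)) {}

  // Records one marker. On every kCallbackInterval-th submit the handler is
  // invoked with the number of pending markers, which are then released.
  void submitMarker(int marker) {
    pending_.push_back(marker);
    if (++submits_ % kCallbackInterval == 0) {
      onRecycle_(pending_.size());
      pending_.clear();
    }
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  std::function<void(std::size_t)> onRecycle_;
  std::vector<int> pending_;
  std::size_t submits_ = 0;
};
```

With this shape, pending markers never grow past 64 entries even if the
application never synchronizes.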
Change-Id: I3d39fe76d52a5d81387927edd85b5663b563682c
[ROCm/clr commit: fa76f03654]
Disable hostcall buffer in OCL for now. COv5 can add hostcall buffer
metadata for an unknown reason. OCL may then fail the buffer allocation
and the kernel launch.
Change-Id: I34a6a45bac86c57422b764c0d69760c96920d6c5
[ROCm/clr commit: 934149ff0a]
Disable devlib linking when the runtime links multiple objects from
the app. Otherwise devlibs will be linked twice, which may cause
undefined behavior with COv5.
Change-Id: I3b8640c64ff898893225fe3af5b4b4a32d42bf40
[ROCm/clr commit: c275d9b4b3]
Implement map/unmap for PAL backend
Create commands since PAL uses the IQueue to map/unmap
Change-Id: I97e26a7d28ae5e10774c9ca65307153100945621
[ROCm/clr commit: 67657d6099]
- Check PCIe atomic support for printf functionality
- If it is not enabled, printf won't work
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: Ib366e8e71772b02210c4a830bca4bd8cc7a11664
[ROCm/clr commit: 15f1632dfa]
This change improves error handling; it does not fix the underlying issue.
Before this change, the hostcallBuffer_ pointer was initialized at the end of
the create() function. If create() failed and returned early, the
hostcallBuffer_ pointer was left uninitialized. The uninitialized pointer
could cause an access violation when the object was destructed.
This change moves the initialization of the pointer into the constructor.
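
A minimal sketch of the pattern, assuming a simplified class: SketchDevice,
its create(bool) parameter and the malloc-based buffer are hypothetical; only
the member name hostcallBuffer_ comes from this change.

```cpp
#include <cstdlib>

// Hypothetical, simplified device class used only to illustrate the fix.
class SketchDevice {
 public:
  // The pointer is set in the constructor, so it is valid (null) even if
  // create() never runs to completion.
  SketchDevice() : hostcallBuffer_(nullptr) {}

  // Safe on every path: std::free(nullptr) is a no-op.
  ~SketchDevice() { std::free(hostcallBuffer_); }

  SketchDevice(const SketchDevice&) = delete;
  SketchDevice& operator=(const SketchDevice&) = delete;

  // failEarly simulates an error before the buffer is allocated.
  bool create(bool failEarly) {
    if (failEarly) return false;  // early return, pointer stays null
    hostcallBuffer_ = std::malloc(64);
    return hostcallBuffer_ != nullptr;
  }

  const void* hostcallBuffer() const { return hostcallBuffer_; }

 private:
  void* hostcallBuffer_;
};
```

Without the constructor initialization, the destructor would read an
indeterminate pointer whenever create() returned early.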
Change-Id: I7fb6e764eb0547196dca03db237e49d3ff0fd06a
[ROCm/clr commit: 5528812aa9]
Adding virtual memory management APIs to rocclr.
The HIP layer will handle virtual allocs on devices.
Change-Id: Ia978f105c2c3fed3959c77580ba228e845105754
[ROCm/clr commit: b5f555f9ec]
- Add a global cache state for a device to indicate the scopes of submitted
AQL packets
- Remove scopes for the TS marker if hipEventReleaseToDevice is passed. Set
env ROC_EVENT_NO_FLUSH=1 to use NOP AQL for event records.
By default, event records flush caches with a system scope release.
- Calling finish() should check whether caches are flushed and, if not, queue
a marker
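
The finish() behavior in the last item can be sketched as below. All names
(ReleaseScope, SketchCacheDevice, submit, markersQueued) are hypothetical, and
the model assumes that only a system scope release counts as a flush.

```cpp
// Hypothetical release scopes attached to a submitted AQL packet.
enum class ReleaseScope { None, Agent, System };

// Hypothetical per-device cache state; not the actual rocclr structure.
struct SketchCacheDevice {
  bool cachesFlushed = true;  // nothing submitted yet, nothing to flush
  int markersQueued = 0;      // counts markers that finish() had to add

  // Track the release scope of the most recent packet. In this model only
  // a system scope release leaves the caches flushed.
  void submit(ReleaseScope release) {
    cachesFlushed = (release == ReleaseScope::System);
  }

  // finish() queues a flushing marker only when the last packet did not
  // already flush the caches.
  void finish() {
    if (!cachesFlushed) {
      ++markersQueued;
      cachesFlushed = true;
    }
  }
};
```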
Change-Id: Ibbbdbb1cd7ac61cb35649169212142545be159e0
[ROCm/clr commit: 8eeaa998c0]
Remove assert for kernel arg size, because COv5 reports a value
bigger than the actual usage in most cases
Change-Id: I8e15bc45a9e21b58a5894f9977511ca84408ce61
[ROCm/clr commit: 2be0b1e612]
- Clean up detection by using Visual Studio macros to detect arch; I
didn't list all possible ARM platforms (can be done later if desired)
- Fixed two incorrect uses of !defined(ATI_ARCH_ARM) to instead use
defined(ATI_ARCH_X86), as they contain X86-specific code
- Fixed one use of __ARM_ARCH_7A__ to use ATI_ARCH_ARM instead
This is an improvement to the fixes in the last patch for SWDEV-323669
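
A sketch of this style of detection, assuming the ATI_ARCH_* macros are
defined from compiler-predefined macros. The _M_* names are the Visual Studio
predefined macros; the GCC/Clang names and the archName() helper are additions
for illustration and may not match the actual header.

```cpp
// Derive ATI_ARCH_* from predefined compiler macros (assumed mapping).
#if defined(_M_IX86) || defined(_M_X64) || defined(__i386__) || defined(__x86_64__)
#define ATI_ARCH_X86 1
#elif defined(_M_ARM) || defined(_M_ARM64) || defined(__arm__) || defined(__aarch64__)
#define ATI_ARCH_ARM 1
#endif

// Hypothetical helper: guard X86-specific code with defined(ATI_ARCH_X86)
// rather than !defined(ATI_ARCH_ARM), so other architectures are not
// mistaken for X86.
inline const char* archName() {
#if defined(ATI_ARCH_X86)
  return "x86";
#elif defined(ATI_ARCH_ARM)
  return "arm";
#else
  return "unknown";
#endif
}
```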
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I8568167293c34ad5331902105877f3ab6e25acb3
[ROCm/clr commit: 00efdc1cd6]
With COv5, local size calculation must occur before the
runtime programs kernel arguments
Change-Id: I0726c6529bde69b8fcf5360aa83986cf84e04168
[ROCm/clr commit: caa6110c29]
Adding an opaque data handle to memory. This is used to look up the HIP object associated with it.
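
One way such a handle might be used, as a sketch only: SketchMemory,
setUserData, userData, SketchHipObject and ownerOf are hypothetical names and
do not reflect the actual rocclr or HIP interfaces.

```cpp
// Hypothetical memory object carrying an opaque handle set by the layer above.
class SketchMemory {
 public:
  void setUserData(void* data) { userData_ = data; }
  void* userData() const { return userData_; }

 private:
  void* userData_ = nullptr;  // opaque to this layer, never dereferenced here
};

// Hypothetical object owned by the upper (HIP) layer.
struct SketchHipObject {
  int id;
};

// The upper layer knows the real type and casts the handle back.
inline SketchHipObject* ownerOf(const SketchMemory& mem) {
  return static_cast<SketchHipObject*>(mem.userData());
}
```

Keeping the handle as void* means the memory layer needs no knowledge of the
HIP types stored in it.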
Change-Id: I1bbb14a915bed79c6c3593a29a627778c7aaf13a
[ROCm/clr commit: 867346520f]