PCMark10 counts the time spent in clCreateKernel as part of execution
time, so as a workaround for the PAL path, move code object loading
back to clBuildProgram.
Change-Id: I3b9cf1879ece08ab59f447ec165b0525bc8593a4
[ROCm/clr commit: 1d0364e590]
Pass the device agent specified by the user to the ROCr API instead of the device agent attached to the specified stream.
Change-Id: I86c98935b9dc404eaa6d47ccdd082a8c3678fb36
[ROCm/clr commit: 169cc857fd]
Fix the segfault that occurs when the hipMemRangeAttributeAccessedBy
attribute is queried using hipMemRangeGetAttribute.
Change-Id: I2ceb2267d89bfc31a55d9eae2685610c7ad89b1f
[ROCm/clr commit: 48c1b895c0]
Reuse the FillMemory function, which should fix the cache syncs from the host.
Change-Id: Ieebec5fc3ed3a322b88d5187c8dca4805ec6f84b
[ROCm/clr commit: 24442be35a]
This patch allows substituting the binary of an OpenCL program. It is supposed to be used as follows:
1. Run the OpenCL program with -save-temps.
2. Open the cl temp and find the following text in the program header:
Hash to override:
Source: 0xd66bcfa20e69e605
Source + clang options: 0x656a9dd8aedcbfb6
3. Create a config file (ASCII text) with one or more pairs:
<hash> <path_to_binary_to_substitute>
where hash is the hex value from step 2 (without the leading 0x). You can use either hash,
depending on what you want to match:
only the source text of the program, or the source text along with its clang options.
4. Set the env variable AMD_OCL_SUBST_OBJFILE to the path of your config file.
5. Rerun the OpenCL program.
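The config format from step 3 can be illustrated with a small parser. This is only a sketch under stated assumptions, not the code added by this patch: the names `SubstitutionMap`, `parseSubstitutionConfig`, and `findSubstitute` are hypothetical, and it assumes one pair per line and paths without whitespace.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical types and helpers for illustration only; they are not part
// of ROCclr. Maps a 64-bit program hash to the path of a replacement binary.
using SubstitutionMap = std::unordered_map<std::uint64_t, std::string>;

// Parses "<hash> <path_to_binary_to_substitute>" pairs, one per line.
// The hash is hexadecimal, written without a leading 0x.
inline SubstitutionMap parseSubstitutionConfig(std::istream& in) {
  SubstitutionMap result;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string hashText;
    std::string path;
    if (!(fields >> hashText >> path)) {
      continue;  // skip blank or incomplete lines
    }
    assert(!path.empty());  // extraction succeeded, so the token is non-empty

    std::uint64_t hash = 0;
    std::size_t consumed = 0;
    try {
      hash = std::stoull(hashText, &consumed, 16);
    } catch (const std::exception&) {
      continue;  // first field is not a hex number (or is out of range)
    }
    if (consumed != hashText.size()) {
      continue;  // trailing non-hex characters in the hash field
    }
    result[hash] = path;
  }
  return result;
}

// Returns the substitute path for a hash, or nullptr if none is configured.
inline const std::string* findSubstitute(const SubstitutionMap& map,
                                         std::uint64_t hash) {
  auto it = map.find(hash);
  return it == map.end() ? nullptr : &it->second;
}
```

With the example hashes from step 2, a config line such as `d66bcfa20e69e605 /tmp/a.bin` (the path is made up) would match on the source text alone.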
Change-Id: I977c80fe529ea14458194918c6ddfbe2de6a8857
[ROCm/clr commit: 51cc9c2f8c]
Current logic when creating a buffer view will end up going into the
allocation block. Even though no memory will be allocated, since
owner()->getSvmPtr() is already allocated, we'll still end up
calling updateFreeMemory().
Checking whether we're creating a view skips the SVM allocation logic
and lets us fall into the actual view creation logic. This won't end up
updating the free memory counter.
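The control flow described above can be sketched as follows. This is a simplified illustration, not the ROCclr code: `DeviceInfo`, `Buffer`, and `createBuffer` are hypothetical stand-ins, and the backing storage is passed in by the caller instead of being allocated.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; these are not ROCclr types.
struct DeviceInfo {
  std::size_t freeMemory;
  // Mirrors the idea of a free memory counter that allocations decrement.
  void updateFreeMemory(std::size_t size) { freeMemory -= size; }
};

struct Buffer {
  void* svmPtr = nullptr;
  std::size_t size = 0;
  const Buffer* parent = nullptr;  // non-null when this buffer is a view
  std::size_t offset = 0;          // byte offset into the parent
};

// Creates either a real buffer or a view. The view check comes first, so a
// view never enters the allocation path and never touches the counter.
inline bool createBuffer(DeviceInfo& dev, Buffer& buf, void* backing) {
  const bool isView = buf.parent != nullptr;
  if (!isView) {
    // Allocation path: take ownership of storage and account for it.
    buf.svmPtr = backing;
    if (buf.svmPtr == nullptr) {
      return false;
    }
    dev.updateFreeMemory(buf.size);
    return true;
  }
  // View path: alias the parent's already-allocated memory.
  assert(buf.parent->svmPtr != nullptr);
  buf.svmPtr = static_cast<char*>(buf.parent->svmPtr) + buf.offset;
  return true;
}
```

In this sketch, creating a view after its parent leaves `freeMemory` unchanged, which is the behavior the fix aims for.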
Change-Id: I1c260a9ef57895130b272ea1246e06e812b25b37
[ROCm/clr commit: f167136918]
The new query MemRangeAttribute::CoherencyMode returns the current
coherency mode for the provided memory region. The coherency mode is
one of the following types: FineGrain, CoarseGrain, or
Indeterminate.
Change-Id: Ib66feeeb14f57a8b1cc731c65bb3d0276d297ff7
[ROCm/clr commit: 992830bab7]
Redshift sees around a 3x performance uplift with this change.
Turning this on for OpenCL might cause unwanted behaviour, due to
apps like RSX running in the background all the time.
Change-Id: I9f32d5f2e05b6697a8aaa9ddf74474b5531bb7e1
[ROCm/clr commit: 2f00782829]
There is a possible race condition in which signal reuse can
access a destroyed Timestamp object, because the callback
runs asynchronously. Use a reference counter and a lock
to allow asynchronous timestamp updates.
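The general pattern can be sketched as below. It is an illustration under assumptions, not the actual fix: this `Timestamp` class and `makeSignalCallback` are hypothetical, and `std::shared_ptr` stands in for whatever reference counter the runtime uses. The callback holds its own reference, so the object stays alive until the callback has been destroyed, and the mutex serializes updates against reads.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical Timestamp for illustration; not the ROCclr class.
class Timestamp {
 public:
  // May be called from a callback on another thread, so take the lock.
  void update(std::uint64_t start, std::uint64_t end) {
    std::lock_guard<std::mutex> guard(lock_);
    start_ = start;
    end_ = end;
  }

  std::uint64_t elapsed() const {
    std::lock_guard<std::mutex> guard(lock_);
    return end_ - start_;
  }

 private:
  mutable std::mutex lock_;
  std::uint64_t start_ = 0;
  std::uint64_t end_ = 0;
};

// Builds a callback that can run later, possibly on another thread.
// Capturing the shared_ptr by value gives the callback its own reference,
// so the Timestamp cannot be destroyed while the callback still exists.
inline std::function<void()> makeSignalCallback(std::shared_ptr<Timestamp> ts,
                                                std::uint64_t start,
                                                std::uint64_t end) {
  assert(ts != nullptr);
  return [ts = std::move(ts), start, end] { ts->update(start, end); };
}
```

Even if the original owner drops its reference before the callback fires, the update lands on a live object; the object is released only once the callback itself is destroyed.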
Change-Id: I6224f7c62cb0a03a7466fcc512e5e5afb06736fa
[ROCm/clr commit: ec89348291]
Note that this requires base driver CL#2340320+ to have SQ interrupt
functionality enabled by default.
Change-Id: I04b936819ebe1eb7cf5de1db4fafe83af3a1b5f6
[ROCm/clr commit: 4171e9e0a3]
This module will be used to add any specific compiler options to ROCclr
and its clients.
Currently it only adds a workaround to remove the MSVC flag /GR, which
is added by default by CMake versions older than 3.20. This resolves
the conflict with PAL adding /GR-.
Change-Id: If83adb271bcec86812a6e9de940da3920fc75393
[ROCm/clr commit: ce9182e62b]