We are passing this arg as an address, and memcpy complains about
overreading (8 bytes instead of 4).
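A minimal sketch of how this kind of overread can arise, assuming a
4-byte argument whose copy size was taken from the pointer width. The
helper names storeArg and roundTrip are hypothetical and are not taken
from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies one kernel argument into a packed buffer.
// argSize must be the size of the object that arg points to.
inline void storeArg(unsigned char* dst, const void* arg, std::size_t argSize) {
  assert(dst != nullptr && arg != nullptr);
  std::memcpy(dst, arg, argSize);
}

inline std::uint32_t roundTrip(std::uint32_t value) {
  unsigned char buffer[sizeof(std::uint32_t)] = {};
  // Wrong: storeArg(buffer, &value, sizeof(void*)) would read 8 bytes
  // from a 4-byte object on a 64-bit target.
  storeArg(buffer, &value, sizeof(value));  // size of the pointee
  std::uint32_t out = 0;
  std::memcpy(&out, buffer, sizeof(out));
  return out;
}
```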
Change-Id: Ica9207f6c5f6056a4bfc968280c76e779ded13ae
- Added DEBUG_CLR_SKIP_RELEASE_SCOPE flag to force release scope to
SCOPE_NONE in AQL packet header
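A rough sketch of the effect on the header, assuming the standard AQL
bit layout (type at bit 0, acquire fence scope at bit 9, release fence
scope at bit 11). The constants are local copies for illustration, and
the skipReleaseScope parameter stands in for however the runtime reads
DEBUG_CLR_SKIP_RELEASE_SCOPE, which this message does not show.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of the AQL header layout, assumed to mirror hsa.h.
constexpr unsigned kTypeShift = 0;
constexpr unsigned kAcquireScopeShift = 9;
constexpr unsigned kReleaseScopeShift = 11;
constexpr std::uint16_t kScopeNone = 0;
constexpr std::uint16_t kScopeSystem = 2;
constexpr std::uint16_t kPacketTypeKernelDispatch = 2;

// Hypothetical helper: builds a 16-bit AQL packet header.
inline std::uint16_t makeAqlHeader(std::uint16_t type, std::uint16_t acquire,
                                   std::uint16_t release, bool skipReleaseScope) {
  assert(acquire <= kScopeSystem && release <= kScopeSystem);
  // With the debug flag set, the release scope is forced to SCOPE_NONE.
  const std::uint16_t effectiveRelease = skipReleaseScope ? kScopeNone : release;
  return static_cast<std::uint16_t>((type << kTypeShift) |
                                    (acquire << kAcquireScopeShift) |
                                    (effectiveRelease << kReleaseScopeShift));
}

// Extracts the 2-bit release fence scope from a header.
inline std::uint16_t releaseScopeOf(std::uint16_t header) {
  return static_cast<std::uint16_t>((header >> kReleaseScopeShift) & 0x3);
}

// Extracts the 2-bit acquire fence scope from a header.
inline std::uint16_t acquireScopeOf(std::uint16_t header) {
  return static_cast<std::uint16_t>((header >> kAcquireScopeShift) & 0x3);
}
```

Only the release field changes in this sketch; the packet type and
acquire scope are left as requested.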
Change-Id: Ife02cddb9d5cd4749103ce585d3d5fe9024c6868
The URI decoder currently skips memory URIs silently.
This patch enables the existing logic to process the offset and size
of a loaded code object that has a memory URI.
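As an illustration only, the following sketch extracts offset and size
from a URI of the assumed form memory://<pid>#offset=<n>&size=<n>. The
type UriRange and the function parseMemoryUriRange are hypothetical and
do not reflect the decoder's actual interface.

```cpp
#include <cstdint>
#include <exception>
#include <string>

struct UriRange {
  std::uint64_t offset;
  std::uint64_t size;
};

// Hypothetical parser: returns true and fills out when uri is a memory
// URI that carries both an offset and a size.
inline bool parseMemoryUriRange(const std::string& uri, UriRange& out) {
  const std::string scheme = "memory://";
  const std::string offsetKey = "offset=";
  const std::string sizeKey = "&size=";
  if (uri.compare(0, scheme.size(), scheme) != 0) return false;

  const std::string::size_type offsetPos = uri.find(offsetKey, scheme.size());
  if (offsetPos == std::string::npos) return false;
  const std::string::size_type sizePos = uri.find(sizeKey, offsetPos);
  if (sizePos == std::string::npos) return false;

  const std::string::size_type offsetStart = offsetPos + offsetKey.size();
  try {
    // Base 0 accepts both decimal and 0x-prefixed hexadecimal numbers.
    const std::uint64_t offset =
        std::stoull(uri.substr(offsetStart, sizePos - offsetStart), nullptr, 0);
    const std::uint64_t size =
        std::stoull(uri.substr(sizePos + sizeKey.size()), nullptr, 0);
    out = UriRange{offset, size};
    return true;
  } catch (const std::exception&) {
    return false;  // missing or malformed number
  }
}
```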
Change-Id: If03579cefb11d91f667410464dc89404df9270a3
Updated CHANGELOG to include the performance fix for
kernel launch latency with increasing number of idle streams.
Change-Id: I509e14cb8f8cd3abe61c6ede78808e96ef8f06e1
NOPTION is meant for component options or alias runtime options, so
either the option group must not be OA_RUNTIME or OA_MISC_ALIAS must be
set. Otherwise we incorrectly assume that the option has an option
variable, and attempting to write to it corrupts OptionVariables.
Change-Id: Iafb5a8f743e5ed0f87be36061c44578178f6cfde
The vector with all kernels is preallocated at executable init.
Thus, reduce the scope of the global lock to binary creation only.
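A simplified sketch of the idea, using hypothetical Executable and
Kernel types rather than the real ones: because the vector is sized up
front and never reallocates, only the binary creation step needs to
hold the lock. The sketch assumes each slot is written by one thread.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct Kernel {
  std::string name;
  std::string binary;
};

class Executable {
 public:
  // All kernel slots are allocated here, so the vector never grows later.
  explicit Executable(std::size_t kernelCount) : kernels_(kernelCount) {}

  void buildKernel(std::size_t index, const std::string& name) {
    std::string binary;
    {
      // The global lock covers binary creation only.
      std::lock_guard<std::mutex> guard(globalLock());
      binary = createBinary(name);
    }
    // Writing a preallocated slot does not touch the other slots.
    kernels_[index].name = name;
    kernels_[index].binary = std::move(binary);
  }

  const Kernel& kernel(std::size_t index) const { return kernels_[index]; }

 private:
  static std::mutex& globalLock() {
    static std::mutex lock;
    return lock;
  }

  // Stand-in for the real binary creation step.
  static std::string createBinary(const std::string& name) {
    return "bin:" + name;
  }

  std::vector<Kernel> kernels_;
};
```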
Change-Id: I73035013a6562175069137e895bba815f466ee35
Support gfx9-4-generic target to cover mi3XX.
Support features sramecc and xnack in generic target.
Improve code formatting.
Add more compiler logging.
Change-Id: I6b3c6af55c60cffd43ce6f17b75998f751b75713
Some libs use __HIP_NO_HALF_OPERATORS__ and __HIP_NO_HALF_CONVERSIONS__
which results in operators being hidden and can cause errors.
Change-Id: I83c194d7d727cba30b46d7c296f7d396549f5fca
In active wait mode, use signals without interrupts by default and
switch to interrupts only if a callback is required.
Change-Id: Ibcde8f7d44c70f8fb8fa5e0a7fdd8b08a2982a8e
On Windows, hipHostRegister may add a single object in the MemObjMap
that maps to memory that is allocated on different devices.
This change ensures that the offset that is returned from
getMemoryObject() is computed relative to the memory that is allocated
on the current device.
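The intended computation can be sketched as follows, assuming a
hypothetical object that records one allocation per device. The types
DeviceAllocation and HostRegisteredObject and the function
offsetOnDevice are illustrative and are not the MemObjMap interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct DeviceAllocation {
  std::uintptr_t base;  // start address of the memory on this device
  std::size_t size;     // length of the allocation in bytes
};

// Hypothetical: one registered object backed by memory on several devices.
struct HostRegisteredObject {
  std::vector<DeviceAllocation> perDevice;
};

// Computes the offset of ptr relative to the allocation that belongs to
// deviceIndex. Returns false if ptr is outside that allocation.
inline bool offsetOnDevice(const HostRegisteredObject& obj,
                           std::size_t deviceIndex, std::uintptr_t ptr,
                           std::size_t& offset) {
  if (deviceIndex >= obj.perDevice.size()) return false;
  const DeviceAllocation& alloc = obj.perDevice[deviceIndex];
  if (ptr < alloc.base || ptr - alloc.base >= alloc.size) return false;
  offset = static_cast<std::size_t>(ptr - alloc.base);
  return true;
}
```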
Change-Id: I5fd3af200bf6f4926fdeaea12dcb9d0154d3a843
- Header files inside rocclr/utils, when included from hipamd or opencl,
should be included as #include "rocclr/utils/xxx.h" instead of
"utils/xxx.h"
Change-Id: Ic0760c33b9d091f5620dec67e5482c9698d22093
Some functions were __device__ only but should be both __host__ and
__device__; changed them to __HOST_DEVICE__.
Some functions were __HOST_DEVICE__ but used ockl functions; changed
them to __device__ only.
Change-Id: Ife9e7abe60415bda68f5f9a101e6e7c39ad51064
hipGraph uses VMM by default when allocating memory.
However, the handle of the physical memory has also been added to the
Memobj by default. Since the Memobj tracks the whole address range from
handle to handle + size, the system would need to reserve that whole
range. If the range has not been reserved by the system, clr may find
the wrong Memobj. This patch removes the handle from the Memobj to fix
this potential issue.
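A small sketch of the failure mode, using a hypothetical RangeMap in
place of the real lookup structure: once a handle value is registered
as the start of a range, any pointer that falls inside that range
resolves to the handle's entry.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical range-tracking map keyed by start address.
class RangeMap {
 public:
  void insert(std::uintptr_t start, std::size_t size, int id) {
    ranges_[start] = Entry{size, id};
  }

  void erase(std::uintptr_t start) { ranges_.erase(start); }

  // Returns the id of the range containing ptr, or -1 if there is none.
  int find(std::uintptr_t ptr) const {
    std::map<std::uintptr_t, Entry>::const_iterator it = ranges_.upper_bound(ptr);
    if (it == ranges_.begin()) return -1;
    --it;  // last range that starts at or before ptr
    return (ptr - it->first < it->second.size) ? it->second.id : -1;
  }

 private:
  struct Entry {
    std::size_t size;
    int id;
  };
  std::map<std::uintptr_t, Entry> ranges_;
};
```

In this sketch, erasing the handle entry is the analogue of the fix:
lookups for addresses in [handle, handle + size) stop matching it.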
Change-Id: I2da38e6b2d11d0d48e1afe66c46899500c290624
- Refactor blit code and clean ASAN instrumentation
- Use unified function for rocr copy
- Enable shader copy path for unpinned writeBuffer/readBuffer paths
- Set GPU_FORCE_BLIT_COPY_SIZE=16 which means we will use BLIT copy for
pinned copies or unpinned H2D/D2H copies < 16KB
Change-Id: I42045cca79234b340dbf53dafb93044199736ae4