The runtime has to release extra memory held by the pools
at synchronization points for an event, stream, or device.
Change-Id: Id533a5e1d137812aa72bdfe101b4b333c6a43d66
hipMemcpy2D erroneously fails with hipErrorInvalidValue when an offset is applied to the
source or destination pointer, the kind is set to hipMemcpyDefault, and the source or
destination is allocated with hipMallocManaged.
Change-Id: I0db4c17514f743652d8f9a2691da6601a2abb2a1
Fix sporadic segmentation fault in texture test
by retaining the image in the texture object that
references it.
The image will be released when the texture
object is destroyed.
Change-Id: Ic3fefa2d5dda6afebd1acd4d41ad310b138af6dd
Recent changes disabled system memory allocation
in the abstraction layer. That requires memory
allocation/destruction in ROCR. Add destruction logic.
Change-Id: I68fe6b0a620ca743fe5850052ea0efa8bb7931c2
An extra CPU read back will be performed before every submission to make sure
previous writes over PCIe have reached the GPU. The HDP flush is done by the CP.
Change-Id: I402d28ca26c8cee4a3920feb3599af8c285d0889
It cannot be moved to amd_device_functions.h because that causes circular
dependencies when trying to use the macro in other files. So we create a new
header and move all assert/abort macros to that common header.
As a side-effect, also fix the macro to correctly expand the entire condition
argument, and also consume the trailing semicolon.
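
A minimal sketch of the two macro fixes described above: parenthesizing the
condition so the whole argument is negated as a unit, and wrapping the body
in do { } while (0) so the caller's trailing semicolon is consumed. MY_ASSERT,
g_failures and check are hypothetical names for illustration, not the actual
macro in the header.

```cpp
#include <cstdio>

// Hypothetical counter standing in for a real abort path.
static int g_failures = 0;

// (cond) keeps the whole argument together, so `a == 1 || b == 1` is
// negated as a unit rather than as `!a == 1 || b == 1`.
// do { ... } while (0) makes the macro a single statement that needs the
// caller's semicolon, so it is safe inside an unbraced if/else.
#define MY_ASSERT(cond)                               \
  do {                                                \
    if (!(cond)) {                                    \
      ++g_failures;                                   \
      std::printf("assertion failed: %s\n", #cond);   \
    }                                                 \
  } while (0)

// Uses the macro in both arms of an unbraced if/else and returns the
// number of failed checks.
int check(int a, int b) {
  g_failures = 0;
  if (a > 0)
    MY_ASSERT(a == 1 || b == 1);
  else
    MY_ASSERT(b == 0);
  return g_failures;
}
```

Without the do/while wrapper, the semicolon after the first MY_ASSERT would
end the if statement and the else would no longer compile.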
Change-Id: I43688c8e61183503a3a1a039b91321a3779152af
The original logic didn't use pitch because the abstraction layer had
a sysmem copy without pitch. Since the extra sysmem copy was
disabled, the code has to accept pitch values from the app.
Change-Id: Ia9fba7b33ddff4e9109b4e63d0d6afa52f501c8f
Separating -mllvm from its option can, in rare circumstances, cause
either the option or the -mllvm itself to be dropped. Either case
can cause a compilation error. This issue was exposed while investigating
SWDEV-435276.
Change-Id: Ie665d49183b55a57c9b58619cad525e44f3be8a5
If the uuid is copied via strncpy, the copy stops at the first null character.
We need to copy all 16 bytes, which might contain a null on Windows.
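
A small sketch of the difference, assuming a 16-byte binary uuid as stated
above. The helper names and kUuidSize are hypothetical, chosen only for
illustration.

```cpp
#include <cstring>

// Assumed uuid length in bytes, taken from the description above.
constexpr std::size_t kUuidSize = 16;

// strncpy treats the source as a C string: it stops reading at the first
// '\0' and fills the remainder of the destination with zeros.
void copy_uuid_strncpy(char* dst, const char* src) {
  std::strncpy(dst, src, kUuidSize);
}

// memcpy copies exactly kUuidSize bytes regardless of their values.
void copy_uuid_memcpy(char* dst, const char* src) {
  std::memcpy(dst, src, kUuidSize);
}

// Byte-wise comparison of two uuids.
bool uuid_equal(const char* a, const char* b) {
  return std::memcmp(a, b, kUuidSize) == 0;
}
```

For a uuid whose third byte is zero, copy_uuid_strncpy preserves only the
first two bytes, while copy_uuid_memcpy reproduces all 16.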
Change-Id: I8667919cb251133eec3333a23768c356879727e8
Add clang pragma push and pop diagnostics for ignoring "-Weverything"
in the hiprtc builtins header. Otherwise this will ignore even the
genuine errors occurring in the hiprtc kernels.
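
The push/pop pattern can be sketched as follows. builtin_add and user_code
are hypothetical stand-ins for the header's contents and for a user kernel;
the __clang__ guard is an assumption added so the sketch also builds with
other compilers.

```cpp
// Save the current diagnostic state and silence diagnostics only for the
// declarations wrapped between push and pop.
#if defined(__clang__)
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Weverything"
#endif

// Hypothetical builtin standing in for the header's contents.
inline int builtin_add(int a, int b) { return a + b; }

// Restore the saved state so the suppression does not leak out of the header.
#if defined(__clang__)
#pragma clang diagnostic pop
#endif

// Code after the pop (for example, a user's kernel) is diagnosed normally.
inline int user_code(int x) { return builtin_add(x, 1); }
```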
Change-Id: I8c3dacf902732b2ea495d83e797369f8aebd75d6
- Allow capture to be less restrictive if we set the global thread
interaction mode for the thread.
Change-Id: I84f65d9418ac26ada0477c85a45a3831c2351ce4
If the compiler decides not to inline these functions, we might break the ODR (one definition rule), because this file is included in multiple files that are linked together.
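
For illustration only, two common ways to keep header-defined functions
ODR-safe are sketched below. The message above does not say which approach
this change takes, and scale and offset are hypothetical functions.

```cpp
// Imagine these definitions live in a header included by several
// translation units that are later linked together.

// `inline` permits the same definition to appear in every translation
// unit; the linker keeps one copy even when the call is not inlined.
inline int scale(int x) { return 2 * x; }

// `static` gives each translation unit its own private copy with internal
// linkage, so the copies never collide at link time.
static int offset(int x) { return x + 1; }
```

A plain `int scale(int x) { ... }` in the same header would instead emit an
external definition in every translation unit that includes it.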
Change-Id: Iacbfdabb53f5b4e5db8c690b23f3730ec9af16c0
This reverts commit 353dbe6e3b.
Reason for revert: This is considered a breaking change and requires
multiple apps to change their behavior. This will be reintroduced in later releases.
Change-Id: I3481627115af1872785585a155cc6a0ecfbe1372
This reverts commit 629e279f72.
Reason for revert: This is considered a breaking change and requires
multiple apps to change their behavior. This will be reintroduced in later releases.
Change-Id: I0354ce4e0f5e6c402499a7a8c2aaf43bf5b1bfc7
C++ does not allow a const qualifier on a function type; even if we add it, it is ignored and clang fails with a failed cast from const void* to func*. The const_cast here is necessary to make it work.
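
A minimal sketch of the cast sequence, assuming the function address arrives
as a const void*. Func, twice, as_handle and to_func are hypothetical names
and do not come from the changed code.

```cpp
// Hypothetical function pointer type.
using Func = int (*)(int);

int twice(int x) { return 2 * x; }

// Hypothetical helper that stores a function address as an opaque handle.
const void* as_handle(Func f) { return reinterpret_cast<const void*>(f); }

// reinterpret_cast is not allowed to remove const, and a function type
// cannot carry a const qualifier, so const is stripped with const_cast
// before converting to the function pointer type.
// Converting between void* and a function pointer is only conditionally
// supported by the standard; mainstream compilers accept it.
Func to_func(const void* p) {
  return reinterpret_cast<Func>(const_cast<void*>(p));
}
```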
Change-Id: I72cec8d9e715bdf9e163cb9b08393dd733dafaf2
__HIP_CLANG_ONLY__ is not recognized in HIPRTC, so some
math functions such as amd_mixed_dot were not included in the hiprtc builtins.
Change-Id: I1fe41e1ddc8911f6a5b5b1405dd4730d0170a4f7
The new copy kernel can limit the number of launched workgroups.
It can copy in chunks of 16 bytes or 4 bytes.
Workgroup size is increased to 512 or 1024.
Change-Id: Ic3fefa2d5bda6afebd1acc4d41ad310b138af6df