Add positive and negative test cases for the atomicAdd and
atomicAddNoRet device functions
SWDEV-238517 for enhancing hip unit tests
Change-Id: Id20ba2550d20f224004f105cdcd087002cb80e56
Test the heq2, hne2, hle2, hge2, hlt2 and hgt2 APIs for functionality
and NaN handling
SWDEV-238517 for enhancing hip unit tests
Change-Id: I88a9a8ead0d00a1261f3d650361d655f2f397e48
This technique should never be used directly; the functionality should
only be accessed through __builtins.
There's currently no builtin for groupstaticsize. I left ds_swizzle
alone because, for some reason, it switches to the builtin depending on
whether __HCC__ is defined.
Change-Id: If1e1394221dba83ea4add6db5e94d6b715552044
* Disable device side malloc
Currently device side malloc is not working and takes excessive
device memory.
Disable it for now until a working malloc is implemented.
Change-Id: I1ad908c1c53a83752383b4be96688a848642c699
The randomly generated offset+width may exceed 32, which leads to a
left shift by 32-offset-width. As an unsigned number, that shift amount
wraps around to a value greater than 32 and causes undefined behavior.
When the test is compiled without -mavx, it still happens to pass.
However, when the test is compiled with -mavx, the undefined behavior
causes wrong results and a test failure.
This patch adjusts width so that offset+width<=32 always holds.
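A minimal host-side sketch of the adjustment described above. The names clamp_width and bit_extract_ref are hypothetical and are not taken from the test itself. The sketch assumes offset has already been reduced to the range [0, 31] and only shows why keeping offset+width<=32 keeps both shift amounts below 32.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: shrink width so that offset + width <= 32.
// Assumes offset is already in [0, 31].
inline std::uint32_t clamp_width(std::uint32_t offset, std::uint32_t width) {
  return std::min<std::uint32_t>(width, 32u - offset);
}

// Hypothetical host reference: extract `width` bits of `src`, starting at
// bit `offset`. With offset + width <= 32 and width >= 1, both shift
// amounts stay in [0, 31], so neither shift is undefined.
inline std::uint32_t bit_extract_ref(std::uint32_t src, std::uint32_t offset,
                                     std::uint32_t width) {
  assert(offset < 32u && offset + width <= 32u);
  if (width == 0u) {
    return 0u;  // avoids the undefined shift by 32 in `>> (32 - width)`
  }
  return (src << (32u - offset - width)) >> (32u - width);
}
```

A test would call clamp_width on the random width before passing it to both the device function and the host reference, so that the two are compared only on well-defined inputs.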
HIP_VERSION_MAJOR, HIP_VERSION_MINOR, HIP_VERSION_PATCH and HIP_VERSION pre-processor macros are now defined in hipVersion.h instead of being set by hipcc.
Added new memory APIs: hipMemAllocPitch, hipMemAllocHost, hipMemsetD16, hipMemsetD16Async, hipMemsetD8Async.
Modified hipMemcpyParam2DAsync and hipMemcpyParam2D to support all scenarios.
Changed the third arg of the functions __hip_as_write_block and __ockl_as_write_block from ulong to uint64_t to fix a compilation error on Windows.
* Put 3-wide vector types on a ketogenic diet.
* Remove needless include.
* Do not be narrow-minded.
* Put the C people on a diet too.
- Current clang disallows any invocation of wrong-side functions, even
in contexts that only inspect types. Work around that by adding a
variant of `std::declval` with the `__device__` attribute.
- It's a common mistake to assume that 1 << shamt is promoted to
64-bit when shamt is a 64-bit integer. That's not the case: the result
has the type of the left operand. Replace that left shift with a
64-bit one to ensure it won't fall into undefined behavior.
- Fix the host-side implementation as well for device function testing.
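
The first two items above can be illustrated with a small sketch. It is not the code from this change: device_declval, Scale and bit64 are hypothetical names, and `__device__` is defined to nothing when the file is not compiled as HIP so the sketch also builds with a plain C++ compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Assumption: outside a HIP/CUDA compile, __device__ expands to nothing.
#ifndef __device__
#define __device__
#endif

// Hypothetical __device__ counterpart of std::declval. Declared only, never
// defined: it is meant to appear solely in unevaluated (type-inspection)
// contexts such as decltype.
template <typename T>
__device__ typename std::add_rvalue_reference<T>::type device_declval() noexcept;

// Hypothetical device-side functor whose result type we want to inspect.
struct Scale {
  __device__ float operator()(int x) const { return 0.5f * static_cast<float>(x); }
};

// Type inspection only; nothing is actually called.
using ScaleResult = decltype(device_declval<const Scale&>()(0));
static_assert(std::is_same<ScaleResult, float>::value,
              "result type is deduced without evaluating the call");

// The literal 1 is an int, so `1 << shamt` has type int no matter how wide
// shamt is. Shifting it by 32 or more is therefore undefined.
static_assert(
    std::is_same<decltype(1 << std::declval<std::uint64_t>()), int>::value,
    "the shift result takes the promoted type of the left operand");

// Making the left operand 64-bit yields a well-defined 64-bit shift.
inline std::uint64_t bit64(std::uint64_t shamt) {
  assert(shamt < 64u);  // shifts by 64 or more remain undefined
  return std::uint64_t{1} << shamt;
}
```

Writing `std::uint64_t{1}` (or `1ull`) on the left-hand side is the whole fix; the width of the shift amount plays no part in the result type.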
Since the rcp implementations with non-default rounding modes are neither correct nor supported in OCML, guard them with the same macro, OCML_BASIC_ROUNDED_OPERATIONS. Also update the docs and tests.