Add a test for fine-grained device to device coherency.
Add a test for fine-grained host to device coherency.
Change-Id: I62482cae917fa19feaa17adb53f3084527ad8fda
Addresses the scenarios where the size passed is larger than
the allocated size and where the device ID is invalid.
Change-Id: I6c9b62639096f655ffb61976905b1ce8c5f51ee7
Change-Id: I8a0d660924a8e2300c517aba6f9088626b8f6ef5
1. Added 21 test scenarios to test the hipComplex functions on both host and device.
2. Changed the floating-point comparisons to use a precision check.
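A minimal sketch of the kind of precision check meant in item 2. The helper
names and the default tolerance are hypothetical, chosen for illustration
only, and are not taken from the tests themselves.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical helper: true when a and b differ by at most eps, scaled by
// the larger magnitude (never scaled below 1, so eps is also an absolute floor).
inline bool nearlyEqual(float a, float b, float eps = 1e-5f) {
  assert(eps > 0.0f);
  const float scale = std::fmax(1.0f, std::fmax(std::fabs(a), std::fabs(b)));
  return std::fabs(a - b) <= eps * scale;
}

// Hypothetical helper: compare the real and imaginary parts separately.
inline bool nearlyEqualComplex(std::complex<float> a, std::complex<float> b,
                               float eps = 1e-5f) {
  return nearlyEqual(a.real(), b.real(), eps) &&
         nearlyEqual(a.imag(), b.imag(), eps);
}
```

Scaling the tolerance by the operand magnitude keeps the check meaningful for
both small and large values, which a fixed absolute epsilon does not.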
Change-Id: I9edfb0c635ced255935087c85b77d3cc6a1a82e3
For addMarker, assume T1 comes in first and enqueues a command C1.
Before T1 grabs event_::lock_ it gets preempted. At this time,
T2 comes in, enqueues C2, grabs lock_ and updates event_. Now T1
wakes up and updates the event with the older command C1.
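One way to close this window is to enqueue the command and update the event
inside the same critical section. The sketch below illustrates that idea with
hypothetical stand-in types (Command, Event, nextSeq_, latest_); it is not
necessarily the change made in this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a queued command; seq is its enqueue order.
struct Command { std::uint64_t seq; };

class Event {
 public:
  // Enqueue and record under one lock, so a thread cannot be preempted
  // between creating its command and attaching it to the event.
  Command addMarker() {
    std::lock_guard<std::mutex> guard(lock_);
    Command c{++nextSeq_};        // stands in for enqueueing the command
    assert(latest_.seq < c.seq);  // the event never moves back to an older command
    latest_ = c;
    return c;
  }

  std::uint64_t latestSeq() {
    std::lock_guard<std::mutex> guard(lock_);
    return latest_.seq;
  }

 private:
  std::mutex lock_;
  std::uint64_t nextSeq_ = 0;
  Command latest_{0};
};
```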
Change-Id: Ia423782b23026302c40976385623cfdede32d70b
Add device_id_ to hip::event to match CUDA behaviour in
hipEventQuery() and hipEventRecord().
Enable the hipEventElapsedTime test on the AMD platform.
Work around a sporadic crash in the hipEventIpc test caused by
a bug in event IPC.
Add missing hipEventDestroy() calls in some event tests.
Fix some logic errors.
Fix a typo in a comment.
Change-Id: I9ec74c475161b3e31df48d193449023e921f2924
HIP assumes that image width is in bytes, but OCL/ROCclr assumes that
it's in pixels. AtoD/DtoA need to account for this.
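The unit conversion can be sketched as follows. The helper names and the
bytesPerPixel parameter are hypothetical; the element size would come from
the array's channel format.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: HIP-style width in bytes -> width in pixels.
inline std::size_t widthBytesToPixels(std::size_t widthInBytes,
                                      std::size_t bytesPerPixel) {
  // The byte width is expected to be a whole number of pixels.
  assert(bytesPerPixel > 0 && widthInBytes % bytesPerPixel == 0);
  return widthInBytes / bytesPerPixel;
}

// Hypothetical helper: width in pixels -> width in bytes.
inline std::size_t widthPixelsToBytes(std::size_t widthInPixels,
                                      std::size_t bytesPerPixel) {
  assert(bytesPerPixel > 0);
  return widthInPixels * bytesPerPixel;
}
```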
Change-Id: I275bd41d8b03e141caaf951bc6b714e51ca72dfc
[dtest] Additional tests for Memcpy
APIs tested:
hipMemcpy, hipMemcpyAsync,
hipMemcpyHtoD, hipMemcpyHtoDAsync,
hipMemcpyDtoH, hipMemcpyDtoHAsync,
hipMemcpyDtoD, hipMemcpyDtoDAsync
1: This test case covers the negative
scenarios for the 8 hipMemcpy APIs.
2: This test launches NUM_THREADS threads.
Each thread in turn exercises the
8 hipMemcpy APIs.
3: This test case verifies the
hipMemcpy APIs over a range of memory sizes, from
the smallest one-unit transfer up to 1 GB.
Change-Id: If5c99527a78e817bafab2e1bd9b686a9ff916184
On Windows there's something fundamentally broken about redirecting IO
into a file and then restoring that IO to its original state. Even
though no syscalls would fail, the output would sometimes either go to
the CLI or straight up nowhere.
Simply using pipes instead of a temporary file magically resolves the
above issue ¯\_(ツ)_/¯
Unfortunately the max pipe size on Linux is 1 MB, which is not enough to
store all the data printed by the kernel. This leads to a soft hang in
vprintf().
Stick to using a temporary file on Linux, but switch to pipes on
Windows. Slightly refactor the CaptureStream struct to accommodate this
difference.
Change-Id: Id8e68f150df47815a4f652ee2bcd6cfb7c3e3bac
The following snippets behave differently depending on the platform.
printf("%p", 0x123abc);
Linux -> 0x123abc
Windows -> 123ABC
printf("%p", nullptr);
Linux -> (nil)
Windows -> 0000000000000000
The output of the %p specifier is implementation-defined according to the
C standard, so we need to adjust the reference string to be correct on Windows.
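As an alternative to hard-coding one reference string per platform, the
expected text could be generated with the host C library's own %p. This is
only a sketch: formatPointer is a hypothetical helper, and it assumes the
output under test follows the same convention as the host's printf, which
may not hold.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format a pointer using the host's "%p", so the
// reference string follows whatever convention the platform uses.
inline std::string formatPointer(const void* p) {
  char buf[64];
  // %p expects a void*, so drop the const qualifier for the call.
  const int n = std::snprintf(buf, sizeof(buf), "%p", const_cast<void*>(p));
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(n));
}
```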
Change-Id: I7059fa0f6cde611718bd76655637670fcbccf43c
The current callback implementation in ROCclr is
based on the OCL specification. If the same command
gets multiple callbacks in HIP, ROCclr processes them in
reverse order. Unique markers for each callback make
sure this doesn't happen.
Add a dependency wait for callbacks, since the HSA signal callback
doesn't guarantee ordering.
Change-Id: I9d514734e258312fe9a74d48132361eb17c52d67