On Windows there's something fundamentally broken about redirecting IO
into a file and then restoring that IO to its original state. Even
though no syscalls fail, the output sometimes ends up either in the
CLI or nowhere at all.
Simply using pipes instead of a temporary file magically resolves the
above issue ¯\_(ツ)_/¯
Unfortunately the max pipe size on Linux is 1 MiB, which is not enough
to store all the data printed by the kernel. Once the pipe fills up the
writer blocks, which leads to a soft hang in vprintf().
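A minimal sketch of how the pipe limit can be inspected on Linux. The
helper name defaultPipeCapacity is hypothetical and not part of the
test suite; F_GETPIPE_SZ is a Linux-specific fcntl command, so this
does not apply to Windows.

```cpp
#include <fcntl.h>   // fcntl, F_GETPIPE_SZ (Linux-specific)
#include <unistd.h>  // pipe, close

// Hypothetical helper: returns the capacity in bytes of a freshly
// created pipe, or -1 on failure. A writer blocks once this many
// bytes are queued and nobody is reading.
long defaultPipeCapacity() {
  int fds[2];
  if (pipe(fds) != 0) return -1;
  long capacity = fcntl(fds[0], F_GETPIPE_SZ);
  close(fds[0]);
  close(fds[1]);
  return capacity;
}
```

The capacity can be raised with F_SETPIPE_SZ, but for unprivileged
processes only up to the value in /proc/sys/fs/pipe-max-size.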
Stick to using a temporary file on Linux, but switch to pipes on
Windows. Slightly refactor the CaptureStream struct to accommodate this
difference.
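A rough sketch of the temporary-file path on POSIX, for illustration
only. CaptureStreamSketch and its begin()/end() members are
hypothetical names and do not reflect the actual CaptureStream layout;
the Windows pipe path is not shown.

```cpp
#include <cstdio>
#include <string>
#include <unistd.h>  // dup, dup2, close (POSIX)

// Hypothetical stand-in for CaptureStream (temp-file variant only).
struct CaptureStreamSketch {
  std::FILE* stream;  // stream being captured, e.g. stdout
  std::FILE* temp;    // temporary file that receives the output
  int savedFd;        // duplicate of the original descriptor

  explicit CaptureStreamSketch(std::FILE* s)
      : stream(s), temp(nullptr), savedFd(-1) {}

  // Redirect the stream's descriptor into a temporary file.
  bool begin() {
    std::fflush(stream);
    temp = std::tmpfile();
    if (!temp) return false;
    savedFd = dup(fileno(stream));
    if (savedFd < 0) return false;
    return dup2(fileno(temp), fileno(stream)) >= 0;
  }

  // Restore the original descriptor and return what was captured.
  std::string end() {
    std::fflush(stream);
    dup2(savedFd, fileno(stream));
    close(savedFd);
    savedFd = -1;

    std::string data;
    std::rewind(temp);
    char buf[4096];
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof(buf), temp)) > 0) {
      data.append(buf, n);
    }
    std::fclose(temp);
    temp = nullptr;
    return data;
  }
};
```

Usage would look like: construct with stdout, call begin(), print, then
call end() to get the captured text back as a string.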
Change-Id: Id8e68f150df47815a4f652ee2bcd6cfb7c3e3bac
The following snippets behave differently depending on the platform.
printf("%p", 0x123abc);
Linux -> 0x123abc
Windows -> 123ABC
printf("%p", nullptr);
Linux -> (nil)
Windows -> 0000000000000000
According to the C spec, the output of the %p specifier is
implementation defined, so we need to adjust the reference string to be
correct on Windows.
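One way to select the reference string per platform could look like the
sketch below. The function name pointerReference is hypothetical; the
strings are the ones quoted above and are not re-derived here.

```cpp
#include <string>

// Hypothetical helper: returns the expected "%p" output for the two
// cases listed above (0x123abc and nullptr) on the current platform.
std::string pointerReference(bool isNull) {
#if defined(_WIN32)
  return isNull ? "0000000000000000" : "123ABC";
#else
  return isNull ? "(nil)" : "0x123abc";
#endif
}
```

Keeping the platform switch in one helper avoids scattering #ifdefs
through the test body.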
Change-Id: I7059fa0f6cde611718bd76655637670fcbccf43c
The current callback implementation in ROCclr is based on the OpenCL
specification. If the same command gets multiple callbacks in HIP,
ROCclr processes them in reverse order. Using a unique marker for
each callback makes sure that won't happen.
Add a dependency wait for callbacks, since the HSA signal callback
doesn't guarantee the order.
Change-Id: I9d514734e258312fe9a74d48132361eb17c52d67
With direct dispatch, enqueue occurs before the callback update and
it can't be tracked in the device backend.
Change-Id: Ie8793e3ddb68cc5bb36348f7a8dcdbdc87a2487c
Some upstream projects load hip multiple times, so the ROCclr static
lib gets linked to hip::amdhip64 many times. This causes unexpected
issues, for example in hipBLAS. This patch prevents that from happening.
Change-Id: I6bb27659f74371dae6e59c59fd6bb2022cc062ff
The flags in NVCC_OPTIONS also need to be passed to the linker, because
compiler and linker flags are mixed in NVCC_OPTIONS.
Change-Id: I3db37b962808566ea145e3cbdefa66d373e2d360
which can cause segmentation faults. Hence checking for its sanity
Signed-off-by: Ashutosh Mishra <ashutosh.mishra@amd.com>
Change-Id: I78b4d029f0926a1369a8ebbeb4aef951a8f1f1d7
[dtest] Tests for hipEvent related APIs
Added Negative scenarios for hipEvent related APIs
1. Verifying all hipEvent related APIs by passing nullptr.
2. Pass illegal/unknown flag to hipEventCreateWithFlags API
Change-Id: Ia0a24065d16fe0f5ee28a88e280c25c1be0c3590
If MT is enabled, then a new callback can be received before the previous
command is processed, causing a conflict of 2 callbacks.
Change-Id: I5ff8f231208e8d62824d590d3c8e791e8e36affb
runtimeApi/event/hipEventElapsedTime reports an invalid resource error
on CUDA due to a wrong calling sequence. The fix arranges the calls in
the right sequence.
Change-Id: I3db28a962888566ea135e3cbdefa68d373e2d369
memory/hipMalloc_MultiThreaded_MultiGpu takes too long to finish.
1 GPU: about 1000s, 2 GPUs: about 2200s
But the Jenkins build needs a quick turnaround, and ctest kills tests
that last 1500+s, so we need to shorten the test time.
Change-Id: I3db27a962808566ea135e3cbdefa66d373e2d369
1. Added 1 scenario to validate the value of deviceProp.arch.has*
against the value of the __HIP_ARCH_HAS_* device flag.
SWDEV-238517 - Enhancing hip unit tests
Change-Id: Idb237a76b75180ce77808853a5351f19077a0d33
Added a test to generate a code object for multiple target
architectures (including for the current device),
load and execute the kernel.
SWDEV-238517 for enhancing hip unit tests
Change-Id: I509d01124abdc0495cfc770ab5508738f108c91c