SWDEV-201128 - [HIP] test_snli_cuda failure
Default to sync packet
Make sure GPU_NUM_MEM_DEPENDENCY is 0 for HIP
The no-sync packet is used only when memory dependency checks are enabled
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#22 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#86 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.hpp#28 edit
SWDEV-203438 - [HIP] AllGather RCCL test issue
The test launches a kernel on two devices at once, and the two kernels need to communicate with each other.
For that, it uses a custom stream for each device.
The problem is that getNullStream used to call syncStreams every time,
and that synced all the streams, even the ones on different devices.
As a result, the second kernel launch (on the 2nd device) waited for the first kernel to finish, which
would never happen because the first kernel was waiting for the second one.
The fix is to stop calling syncStreams from getNullStream because, in general, we already sync before that point.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#21 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_event.cpp#16 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#40 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#41 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#24 edit
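Note on SWDEV-203438: the sketch below is not the change itself (the change removes the syncStreams call from getNullStream). It only illustrates the hazard described above: a sync that ignores device ownership also waits on streams on other devices. DemoStream and syncStreamsOnDevice are hypothetical names, and the "wait" is a placeholder flag.

```cpp
#include <vector>

// Hypothetical stand-in for a HIP stream; not the runtime's real type.
struct DemoStream {
  int deviceId;  // device that owns the stream
  bool synced;   // placeholder for "we waited on this stream"
};

// Wait only on streams owned by deviceId and return how many were waited on.
// Streams on other devices are skipped, so a kernel on one device is never
// forced to wait for a kernel on another device.
inline int syncStreamsOnDevice(std::vector<DemoStream>& streams, int deviceId) {
  int waited = 0;
  for (DemoStream& s : streams) {
    if (s.deviceId != deviceId) continue;  // never block on another device
    s.synced = true;                       // placeholder for the real wait
    ++waited;
  }
  return waited;
}
```

With two cooperating kernels on devices 0 and 1, a sync on device 0 that also waited on device 1's stream would reproduce the circular wait described above.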
SWDEV-145570 - [HIP] Use a context with all devices in system for host register
hipHostRegister and hipMemcpy 0x10 and 0x20 fail on mGPU systems because
we only register the memory on the current device. In HIP, however, the
registration needs to happen on all devices.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#17 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#26 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#50 edit
SWDEV-145570 - [HIP] Track last used event and use last enqueued command in a stream rather than creating a new event
ReviewBoardURL = http://ocltc.amd.com/reviews/r/15996/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#15 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_event.cpp#7 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#90 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/commandqueue.cpp#28 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/commandqueue.hpp#20 edit
SWDEV-145570 - [HIP] Make streamSet global and protect it
Per the spec, streamSet should be global by default, not per thread.
There is a flag to make it per thread, but we don't handle it yet. To support it,
we would add another, thread-local variable and use that instead.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#12 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_device_runtime.cpp#10 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#11 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#8 edit
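Note on "Make streamSet global and protect it": a minimal sketch of one way to share a single stream set across threads behind a mutex. StreamHandle, StreamRegistry and globalStreams are hypothetical names for illustration; the actual container and lock used in hip_stream.cpp may differ.

```cpp
#include <cstddef>
#include <mutex>
#include <unordered_set>

struct StreamHandle {};  // hypothetical stand-in for a stream object

// A set of streams that any thread may modify; every access takes the lock.
class StreamRegistry {
 public:
  void add(StreamHandle* s) {
    std::lock_guard<std::mutex> lock(mutex_);
    streams_.insert(s);
  }
  // Returns true if the stream was present and has been removed.
  bool remove(StreamHandle* s) {
    std::lock_guard<std::mutex> lock(mutex_);
    return streams_.erase(s) == 1;
  }
  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return streams_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_set<StreamHandle*> streams_;
};

// One process-wide instance, shared by all threads.
inline StreamRegistry& globalStreams() {
  static StreamRegistry registry;
  return registry;
}
```

A per-thread variant, as mentioned above, could keep a second thread_local registry and select it when the per-thread flag is set.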
SWDEV-145570 - [HIP] Fix multithread init
Make the g_ihipInitialized variable per thread
And make sure to assign a default g_context
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#10 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#9 edit
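Note on "Fix multithread init": a minimal sketch of the pattern, assuming a thread_local flag and a thread_local current-context pointer. t_initialized and t_context are hypothetical stand-ins for g_ihipInitialized and g_context, and DemoContext is not the runtime's context type.

```cpp
struct DemoContext {
  int deviceId;
};

// Hypothetical stand-ins for g_ihipInitialized and g_context.
// thread_local gives every thread its own copy of both variables.
thread_local bool t_initialized = false;
thread_local DemoContext* t_context = nullptr;

// Shared default context (device 0 in this sketch).
inline DemoContext& defaultContext() {
  static DemoContext ctx{0};
  return ctx;
}

// The first call on each thread runs the init path, so a newly created
// thread never observes a null current context.
inline DemoContext* currentContext() {
  if (!t_initialized) {
    t_context = &defaultContext();
    t_initialized = true;
  }
  return t_context;
}
```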
SWDEV-145570 - [HIP] Get hipCtx_simple to pass
Implemented hipCtxGetDevice
hipCtxCreate must push the created context onto the context stack
hipCtxDestroy must check whether the top of the stack is the context being destroyed
and not just pop the top of the stack without checking.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#8 edit
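Note on the hipCtxDestroy check: a minimal sketch of "pop only if the destroyed context is on top". DemoCtx, t_ctxStack, pushCtx and popIfTop are hypothetical names, and the per-thread stack is an assumption; this is not the code in hip_context.cpp.

```cpp
#include <vector>

struct DemoCtx {
  int deviceId;
};

// Hypothetical per-thread context stack.
thread_local std::vector<DemoCtx*> t_ctxStack;

// What a create call would do in this sketch: make ctx the current context.
inline void pushCtx(DemoCtx* ctx) { t_ctxStack.push_back(ctx); }

// What a destroy call would do in this sketch: pop only when ctx is the
// current (top) context. Returns true if a pop happened.
inline bool popIfTop(DemoCtx* ctx) {
  if (t_ctxStack.empty() || t_ctxStack.back() != ctx) return false;
  t_ctxStack.pop_back();
  return true;
}
```

Destroying a context that is not on top leaves the stack unchanged, so the thread's current context is not silently replaced.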
SWDEV-145570 - Contexts
Create one amd::Context per device
g_context is now the thread's current context
HIP doesn't want more than one context per device so we always use the primary one
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#3 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_device.cpp#7 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#4 edit