SWDEV-198863 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_DB (phase 1)
1. The log macros is turned off for release build. So log functions has zero impact to release build.
2. The log macros have level, mask, condition control. So we can have more control to avoid log flooding.
I also adjusted some existing log to use new log functions.
1. To excercise and test the new log functions.
2. To improve performance slightly.
3. The change is mainly for HIP-ROCM, we can move more in next phases for PAL or ORCA.
4. I make these log feature unavailable for release build. We can revert to old log functions for release build in a case by case method.
Tests:
1. http://ocltc.amd.com:8111/viewModification.html?modId=128289&personal=true&tab=vcsModificationBuildshttp://ocltc.amd.com:8111/viewModification.html?modId=128358&personal=true&tab=vcsModificationBuilds
2. release build, run hip program, there is no log
3. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=4294967295
There was a lot of logs.
4. fastdebug build, run hip program,
export LOG_LEVEL=2
export GPU_LOG_MASK=4294967295
There was no logs.
5. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=4294967294
There was much less logs.
6. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=47102
There was even much less logs. The logs was expected according to the mask.
7. Tested step 2 to 6 similarily in Windows and Linux
ReviewBoard: http://ocltc.amd.com/reviews/r/18215
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#46 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hiprtc_internal.hpp#2 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_svm.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/comgrctx.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#68 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#137 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#91 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#100 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/commandqueue.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/runtime.cpp#40 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/debug.hpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#323 edit
SWDEV-203438 - [HIP] AllGather RCCL test issue
The test tries to launch a kernel on two devices at once and they need to communicate with each other.
For that, it uses a custom stream for each devices.
Problem is in getNullStream we used to call syncStreams all the time
and it was syncing all the streams even the ones on different devices.
So that made the second kernel launch (on 2n dev) to wait for the first kernel to finish which
would never occur since the first one was waiting for the second one.
The fix is to not call syncStreams from getNullStream because we sync already anyway prior in general.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#21 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_event.cpp#16 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#40 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#41 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#24 edit
SWDEV-145570 - Fix build failure due to type mismatch of amd::Event::CallBackFunction
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#16 edit
SWDEV-145570 - [HIP] Track last used event and use last enqueued command in a stream rather than creating a new event
ReviewBoardURL = http://ocltc.amd.com/reviews/r/15996/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#15 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_event.cpp#7 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#90 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/commandqueue.cpp#28 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/commandqueue.hpp#20 edit
SWDEV-145570 - [HIP] Make streamSet global and protect it
By default from the spec, streamSet should be global and not per thread.
There is a flag to make it per thread but we don't handle this yet. We
would just add another variable that will be thread local and use it instead.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#12 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_device_runtime.cpp#10 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#11 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#8 edit
SWDEV-145570 - [HIP] - Release a stream first before taking it off from the set.
- Queue::create() needs to be called before returning a valid queue.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/14830/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#7 edit
SWDEV-145570 - [HIP] Use the as_amd()->asHostQueue with streamSet
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#6 edit