0aa24f6d4d8b20cd14e678f3b80b2924d45ea898
SWDEV-203438 - [HIP] AllGather RCCL test issue

The test tries to launch a kernel on two devices at once, and the two kernels need to communicate with each other. For that, it uses a custom stream for each device. The problem is that getNullStream used to call syncStreams every time, and syncStreams synced all streams, even the ones on different devices. That made the second kernel launch (on the second device) wait for the first kernel to finish, which would never happen because the first kernel was waiting for the second one. The fix is to not call syncStreams from getNullStream, because synchronization generally already happens earlier anyway.

Affected files ...

... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#21 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_event.cpp#16 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#40 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#41 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#24 edit
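
The deadlock described above can be reproduced in a small host-only toy model. This is a sketch under stated assumptions, not the runtime's code: `Kernel`, `State`, and `allKernelsFinish` are hypothetical names, and the flag `syncAllStreamsOnLaunch` stands in for the old behavior in which a launch first waited on every stream regardless of device.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: hypothetical types, not the HIP runtime's data structures.
struct Kernel {
    int device;  // device the kernel runs on (ignored by the old sync-all behavior)
    int peer;    // index of the kernel it must communicate with, or -1 for none
};

enum class State { Pending, Running, Finished };

// Returns true if every kernel eventually finishes, false if the model deadlocks.
// Assumption: a kernel can finish as soon as its peer has been launched.
bool allKernelsFinish(const std::vector<Kernel>& kernels, bool syncAllStreamsOnLaunch) {
    std::vector<State> state(kernels.size(), State::Pending);
    bool progress = true;
    while (progress) {
        progress = false;
        // Launch phase: the host issues launches in order.
        for (std::size_t i = 0; i < kernels.size(); ++i) {
            if (state[i] != State::Pending) continue;
            bool blocked = (i > 0 && state[i - 1] == State::Pending);
            if (syncAllStreamsOnLaunch) {
                // Old behavior: wait for all earlier work, even on other devices.
                for (std::size_t j = 0; j < i; ++j) {
                    if (state[j] != State::Finished) blocked = true;
                }
            }
            if (!blocked) {
                state[i] = State::Running;
                progress = true;
            }
        }
        // Completion phase: a running kernel finishes once its peer is launched.
        for (std::size_t i = 0; i < kernels.size(); ++i) {
            if (state[i] != State::Running) continue;
            const int peer = kernels[i].peer;
            if (peer < 0 || state[static_cast<std::size_t>(peer)] != State::Pending) {
                state[i] = State::Finished;
                progress = true;
            }
        }
    }
    for (State s : state) {
        if (s != State::Finished) return false;
    }
    return true;
}
```

With two kernels on devices 0 and 1 that name each other as peers, the model gets stuck when `syncAllStreamsOnLaunch` is true (the second launch waits for the first kernel, which waits for the second) and completes when it is false, which mirrors the effect of dropping the syncStreams call from getNullStream.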
Repository languages: C++ 67.5%, C 20.6%, Python 6.6%, CMake 3.4%, Shell 0.6%, Other 1.1%