This commit ensures that the GPU finishes all kernels before the communicator thread is destroyed. [ROCm/rccl commit: 52654e2301]