* Using hip_bf16.h instead of hip_bfloat16.h for the __bf16 intrinsic
* Switching to hip_bf16.h from ROCm 6.0.0 onward

[ROCm/rccl commit: fb67e5b467]