Files
rocm-systems/projects
Mustafa Abduljabbar ef6d75b3ee MSCCL Multithreaded regression root cause fix (#1347)
* Make sure the target device is used for MSCCL

* Enable single process mode by default to use MSCCL in MT

* Create a per-rank state when GPUs share a thread

[ROCm/rccl commit: 03a3ef3c34]
2024-09-25 15:24:25 -04:00