Files
rocm-systems/ext-src
Nusrat Islam 0fb3b5eba9 ext-src: Improved allreduce performance in cpx mode for MI308 (#1393)
To get the improved performance for TP=4, the user needs to use
RCCL_MSCCL_FORCE_ENABLE=1 and MSCCLPP_READ_ALLRED=1. For TP=8, the
user should use MSCCLPP_HIERARCHICAL_ALLRED=1.
2024-10-30 08:30:15 -05:00
..
2024-09-11 09:55:16 -06:00