Wenkai Du a724f1ebb7 Add ring simple chunk size tuning (#1180)
* Add ring simple chunk size tuning

* modifying the tuning table to improve the performance of broadcast for 8MB to 32MB for single-node MI300X after ring simple chunk size tuning

* modifying the tuning table to improve the performance of reduce for 1MB to 4MB for single-node MI300X after ring simple chunk size tuning

---------

Co-authored-by: PedramAlizadeh <pmohamma@amd.com>

[ROCm/rccl commit: 73221b4230]
2024-05-29 07:59:47 -07:00
S
Описание
No description provided
282 MiB
Languages
C++ 67.5%
C 20.6%
Python 6.6%
CMake 3.4%
Shell 0.6%
Разное 1.1%