Changpeng Fang d8a06589c9 Tuning the inline and unroll to reduce the scratch usage
Summary:
 1. remove the noinline attribute from AllReduceTreeKernel;
 2. change AUTOUNROLL for tree functions to 1 or 2;
 Combining 1 and 2 reduces the scratch usage from 1256 to 952.


[ROCm/rccl commit: eec319038e]
2019-10-08 14:02:25 -07:00