3ce10dc6880959329aa9a03139981a4f5917af43
- Use the reduce_psync buffers, not barrier_psync, for synchronization in
allreduce.
- Execute a wwg barrier after the allreduce operation. After internal
discussion, this barrier was determined to be required for correctness.
[ROCm/rocshmem commit: 6f512e92a5]
Languages
C++ 67.5%
C 20.6%
Python 6.6%
CMake 3.4%
Shell 0.6%
Other 1.1%