Add instructions to README regarding benchmarking on pre ROCm 6.4.x versions with HSA_NO_SCRATCH_RECLAIM=1 (#114)

[ROCm/rccl-tests commit: 284ff2ac84]
Alex Breslow
2025-04-08 11:19:45 -05:00
committed by GitHub
parent 590c2b0187
commit 9da345dadf
+12
@@ -59,6 +59,18 @@ Running with 1 MPI process per GPU ensures a 1:1 mapping for CPUs and GPUs, whic
See the [Performance](doc/PERFORMANCE.md) page for explanation about numbers, and in particular the "busbw" column.
### Environment variables
On some ROCm versions before 6.4.0, setting `HSA_NO_SCRATCH_RECLAIM=1`
in the environment might be necessary to achieve better performance. When running without MPI, a command similar to the following should be sufficient:
```shell
HSA_NO_SCRATCH_RECLAIM=1 ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 8
```
For MPI, you might need to use a command similar to the following:
```shell
mpirun.mpich -np 8 -env NCCL_DEBUG=VERSION -env HSA_NO_SCRATCH_RECLAIM=1 ./build/all_reduce_perf -b 8M -e 128M -i 8388608 -g 1 -d bfloat16
```
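If the same launch script is shared across machines with different ROCm releases, the variable can be set conditionally. The following is only a sketch under stated assumptions: `rocm_older_than_6_4` is a hypothetical helper (not part of rccl-tests), the installed version is assumed to be recorded in `/opt/rocm/.info/version` in `MAJOR.MINOR.PATCH[-BUILD]` form, and `sort -V` from GNU coreutils is assumed to be available.

```shell
# Hypothetical helper: succeeds (exit status 0) when the given ROCm
# version string is older than 6.4.0.
rocm_older_than_6_4() {
    version="$1"
    # Drop any build suffix such as "-12345" (assumed format: MAJOR.MINOR.PATCH[-BUILD]).
    version="${version%%-*}"
    # 6.4.0 itself is not "older than 6.4.0".
    [ "$version" = "6.4.0" ] && return 1
    # sort -V orders version strings; if the input sorts first, it is older.
    lowest="$(printf '%s\n%s\n' "$version" "6.4.0" | sort -V | head -n 1)"
    [ "$lowest" = "$version" ]
}

# Assumption: the ROCm version file lives under $ROCM_PATH (default /opt/rocm).
rocm_version_file="${ROCM_PATH:-/opt/rocm}/.info/version"
if [ -r "$rocm_version_file" ] && rocm_older_than_6_4 "$(cat "$rocm_version_file")"; then
    export HSA_NO_SCRATCH_RECLAIM=1
fi
```

After this runs, a non-MPI benchmark started from the same shell inherits the exported variable. With MPI, the launcher may not forward the environment to every rank, so passing the variable explicitly, as in the command above, is likely still the safer choice.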
### Arguments
All tests support the same set of arguments: