Add instructions to README regarding benchmarking on pre ROCm 6.4.x versions with HSA_NO_SCRATCH_RECLAIM=1 (#114)
[ROCm/rccl-tests commit: 284ff2ac84]
Committed by GitHub
Parent: 590c2b0187
Commit: 9da345dadf
@@ -59,6 +59,18 @@ Running with 1 MPI process per GPU ensures a 1:1 mapping for CPUs and GPUs, whic
See the [Performance](doc/PERFORMANCE.md) page for explanation about numbers, and in particular the "busbw" column.

### Environment variables

On some ROCm versions older than 6.4.0, setting `HSA_NO_SCRATCH_RECLAIM=1` in the environment might be necessary to achieve better performance. When running without MPI, a command similar to the following should be sufficient:

```shell
HSA_NO_SCRATCH_RECLAIM=1 ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 8
```
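
If the variable should only be set on affected installations, the version check can be scripted. The following is a minimal sketch, not part of rccl-tests: the function names `rocm_older_than_6_4` and `run_with_scratch_workaround` are hypothetical, and the version string is assumed to look like `6.3.1` or `6.3.1-48`.

```shell
# Succeeds (exit status 0) when the ROCm version string in $1 is older than 6.4.0.
# The parentheses run the body in a subshell so the helper variables stay local.
rocm_older_than_6_4() (
    version=${1%%-*}        # drop a build suffix such as "-48", if present
    major=${version%%.*}    # text before the first dot
    rest=${version#*.}      # text after the first dot
    minor=${rest%%.*}       # text before the next dot
    [ "$major" -lt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -lt 4 ]; }
)

# Runs a command, adding HSA_NO_SCRATCH_RECLAIM=1 to its environment only
# when the ROCm version given as the first argument is older than 6.4.0.
run_with_scratch_workaround() {
    rocm_version=$1
    shift
    if rocm_older_than_6_4 "$rocm_version"; then
        HSA_NO_SCRATCH_RECLAIM=1 "$@"
    else
        "$@"
    fi
}
```

A possible invocation, assuming the installed version can be read from `/opt/rocm/.info/version` (this path is an assumption and may differ on your system): `run_with_scratch_workaround "$(cat /opt/rocm/.info/version)" ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 8`.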

For MPI, you might need to use a command similar to the following:

```shell
mpirun.mpich -np 8 -env NCCL_DEBUG=VERSION -env HSA_NO_SCRATCH_RECLAIM=1 ./build/all_reduce_perf -b 8M -e 128M -i 8388608 -g 1 -d bfloat16
```
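
The flag that forwards environment variables to the ranks depends on the MPI launcher. As a hedged sketch, the hypothetical helper `mpi_env_flags` below builds the flags for two launcher families: the `-env NAME=VALUE` form used in the command above, and Open MPI's `-x NAME=VALUE`. The labels `mpich` and `openmpi` are names chosen for this example only, and values containing spaces are not handled.

```shell
# Prints the launcher flags that forward each NAME=VALUE argument to the ranks.
# $1 selects the launcher family: "mpich" or "openmpi" (labels used only here).
# The parentheses run the body in a subshell so the helper variables stay local.
mpi_env_flags() (
    launcher=$1
    shift
    case $launcher in
        mpich)   flag=-env ;;   # form used in the mpirun.mpich command above
        openmpi) flag=-x ;;     # Open MPI's flag for exporting variables
        *)       echo "unknown launcher: $launcher" >&2; exit 1 ;;
    esac
    out=
    for assignment in "$@"; do
        out="${out:+$out }$flag $assignment"
    done
    printf '%s\n' "$out"
)
```

With Open MPI, the result could be used as `mpirun -np 8 $(mpi_env_flags openmpi NCCL_DEBUG=VERSION HSA_NO_SCRATCH_RECLAIM=1) ./build/all_reduce_perf -b 8M -e 128M -g 1`, leaving the substitution unquoted so the shell splits it into separate arguments.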

### Arguments

All tests support the same set of arguments: