Update README.md
Improve the MPI example to avoid confusion between the number of processes and the total number of GPUs. https://github.com/NVIDIA/nccl-tests/issues/54#issuecomment-1212023369
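The arithmetic behind the corrected example can be sketched as follows. This is a minimal illustration, assuming the default of one thread per process, so that the total GPU count is the `mpirun -np` value multiplied by the `-g` value. The variable names `nprocs`, `gpus_per_proc`, and `total_gpus` are hypothetical and are not part of nccl-tests.

```shell
# Hypothetical helper variables for illustration; not part of nccl-tests.
nprocs=10          # value passed to mpirun -np (number of MPI processes)
gpus_per_proc=4    # value passed to -g (GPUs driven by each process)

# Assumption: one thread per process, so total GPUs = processes * GPUs per process.
total_gpus=$((nprocs * gpus_per_proc))
echo "total GPUs: ${total_gpus}"

# The matching launch line from the README would then be:
# mpirun -np "${nprocs}" ./build/all_reduce_perf -b 8 -e 128M -f 2 -g "${gpus_per_proc}"
```

With the old wording, `-np 40` together with `-g 4` would have requested 160 GPUs under the same assumption, which is the confusion this change removes.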
+2
-2
@@ -29,9 +29,9 @@ Run on 8 GPUs (`-g 8`), scanning from 8 Bytes to 128MBytes :
$ ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 8
```
-Run with MPI on 40 processes (potentially on multiple nodes) with 4 GPUs each :
+Run with MPI on 10 processes (potentially on multiple nodes) with 4 GPUs each, for a total of 40 GPUs:
```shell
-$ mpirun -np 40 ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 4
+$ mpirun -np 10 ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 4
```
### Performance