From 0f03b55914dccff9d090678090901d7b0eb30c72 Mon Sep 17 00:00:00 2001
From: Sylvain Jeaugey
Date: Tue, 8 Aug 2017 16:28:46 -0700
Subject: [PATCH] Improve Readme

[ROCm/rccl-tests commit: a15599f5cfc6043e3514800c92ac9e55b8dec835]
---
 projects/rccl-tests/README.md | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/projects/rccl-tests/README.md b/projects/rccl-tests/README.md
index 1532a658fb..92b122c2f2 100644
--- a/projects/rccl-tests/README.md
+++ b/projects/rccl-tests/README.md
@@ -40,23 +40,24 @@ All tests support the same set of arguments :
 
 * Number of GPUs
   * `-t,--nthreads ` number of threads per process. Default : 1.
-  * `-g,--ngpus ` number of gpus per process. Default : 1.
+  * `-g,--ngpus ` number of gpus per thread. Default : 1.
 * Sizes to scan
   * `-b,--minbytes ` minimum size to start with. Default : 32M.
   * `-e,--maxbytes ` maximum size to end at. Default : 32M.
   * Increments can be either fixes of a multiplication factor. Only one of those should be used
-  * `-i,--stepbytes ` fixed increment between sizes. Default : (max-min)/10.
-  * `-f,--stepfactor ` multiplication factor between sizes. Default : disabled.
+    * `-i,--stepbytes ` fixed increment between sizes. Default : (max-min)/10.
+    * `-f,--stepfactor ` multiplication factor between sizes. Default : disabled.
+* NCCL operations arguments
+  * `-o,--op ` Specify which reduction operation to perform. Only relevant for reduction operations like Allreduce, Reduce or ReduceScatter. Default : Sum.
+  * `-d,--datatype ` Specify which datatype to use. Default : Float.
+  * `-r,--root ` Specify which root to use. Only for operations with a root like broadcast or reduce. Default : 0.
 * Performance
   * `-n,--iters ` number of iterations. Default : 20.
   * `-w,--warmup_iters ` number of warmup iterations (not timed). Default : 5.
-* `-s,--swap_args <0/1>` when used with multiple threads, have threads manage different GPUs for each iteration. Default : 0.
-* `-p,--parallel_init <0/1>` use threads to initialize NCCL in parallel.
-* `-c,--check <0/1>` check correctness of results. This can be quite slow on large numbers of GPUs. Default : 1.
-* NCCL operations arguments
-  * `-o,--op ` Specify which reduction operation to perform. Only relevant for reduction operations. Default : Sum.
-  * `-d,--datatype ` Specify which datatype to use. Default : Float.
-  * `-r,--root ` Specify which root to use. Only for operations with a root like broadcast or reduce.
+* Test operation
+  * `-s,--swap_args <0/1>` when used with multiple threads, have threads manage different GPUs for each iteration. Default : 0.
+  * `-p,--parallel_init <0/1>` use threads to initialize NCCL in parallel. Default : 0.
+  * `-c,--check <0/1>` check correctness of results. This can be quite slow on large numbers of GPUs. Default : 1.
   * `-z,--blocking <0/1>` Make NCCL collective blocking, i.e. have CPUs wait and sync after each collective. Default : 0.
 
 ## Copyright
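
Note on the "Sizes to scan" options documented in this patch: the sweep they describe can be sketched in a few lines of Python. This is only an illustration of the documented semantics, not code from the test harness. The function name `scan_sizes`, its parameter names, and the guard against a zero step when the minimum and maximum are equal are all assumptions made for the example.

```python
def scan_sizes(min_bytes, max_bytes, step_bytes=None, step_factor=None):
    """Return the message sizes visited from min_bytes up to max_bytes.

    Hypothetical helper mirroring the documented -b/-e/-i/-f options.
    """
    if step_bytes is not None and step_factor is not None:
        # The README says only one of the two increments should be used.
        raise ValueError("use only one of step_bytes and step_factor")

    if step_factor is not None:
        if step_factor <= 1 or min_bytes <= 0:
            # Assumption: a multiplicative sweep needs a positive start
            # and a factor above 1, otherwise it would never advance.
            raise ValueError("step_factor needs factor > 1 and min_bytes > 0")
    else:
        if step_bytes is None:
            # Documented default: (max-min)/10.
            step_bytes = (max_bytes - min_bytes) // 10
        # Assumption: clamp to 1 so that min == max yields a single size.
        step_bytes = max(step_bytes, 1)

    sizes = []
    size = min_bytes
    while size <= max_bytes:
        sizes.append(size)
        if step_factor is not None:
            size *= step_factor
        else:
            size += step_bytes
    return sizes


print(scan_sizes(8, 128, step_factor=2))  # [8, 16, 32, 64, 128]
```

With the documented defaults (minimum and maximum both 32M), this sketch visits a single size, which matches a run that measures one message size only.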