a46ea10583
Add support for CUDA graphs.
Fuse BCM Gen4 switches to avoid suboptimal performance on some platforms. Issue #439.
Fix bootstrap issue caused by connection reordering.
Fix CPU locking block.
Improve CollNet algorithm.
Improve performance on DGX A100 for communicators with only one GPU per node.
7 lines
102 B
Makefile
##### version
NCCL_MAJOR := 2
NCCL_MINOR := 9
NCCL_PATCH := 6
NCCL_SUFFIX :=
PKG_REVISION := 1