Fix crash with CollnetChain on some node topologies
Fix hang when interleaving the capture of different graphs
Fix hang during init in multi-threaded mode
Fix potential data corruption with LL128 protocol on unaligned buffers
Fix CPU usage during preconnect
Fix double-free in the error path for ncclCommInitAll
Work around hang on H100 with Ring/LL128 on 2 GPUs
Sylvain Jeaugey
2022-10-25 00:55:55 -07:00
parent da8152e57a
commit cb111f764a
11 changed files with 281 additions and 147 deletions
+1 -1
@@ -1,6 +1,6 @@
 ##### version
 NCCL_MAJOR := 2
 NCCL_MINOR := 15
-NCCL_PATCH := 1
+NCCL_PATCH := 5
 NCCL_SUFFIX :=
 PKG_REVISION := 1