Commit graph

6 commits

Author SHA1 Message Date
Wenkai Du 6648c81dc6 Merge remote-tracking branch 'remotes/nccl/master' into rccl_2.5.6 2019-12-03 15:42:04 -08:00
Sylvain Jeaugey 299c554dcc 2.5.6-1 (#255)
Add LL128 Protocol.

Rewrite the topology detection and tree/ring creation (#179). Improve
tree performance by sending/receiving from different GPUs. Add
model-based tuning to switch between the different algorithms and
protocols.

Rework P2P/SHM detection in containers (#155, #248).

Detect duplicated devices and return an error (#231).

Add tuning for GCP
2019-11-19 14:57:39 -08:00
Wenkai Du 5e109ed400 Add bfloat16 support in RCCL
Preprocessor symbol RCCL_BFLOAT16 is used as feature indicator
2019-11-18 13:45:53 -08:00
Michael LIAO 9369f8d75d Fix build with hip-clang.
- Add necessary function attribute for HIP programming model.
- Explicitly include hsa headers.
2019-08-15 14:56:04 -04:00
Wenkai Du f11c8f60cd RCCL 2.4 update 2019-08-14 10:42:35 -07:00
David Addison f40ce73e89 NCCL 2.4.6-1
Added detection of IBM/Power NVLink bridge device.
Add NUMA support to PCI distance calculations.
Added NCCL_IGNORE_CPU_AFFINITY env var.
Fix memory leaks; GithubIssue#180
Compiler warning fix; GithubIssue#178
Replace non-standard variable length arrays. GithubIssue#171
Fix Tree+Shared Memory crash. GithubPR#185
Fix LL cleanup hang during long running DL jobs.
Fix NCCL_RINGS environment variable handling.
Added extra checks to catch repeat calls to ncclCommDestroy() GithubIssue#191
Improve bootstrap socket connection reliability at scale.
Fix hostname hashing issue. GithubIssue#187
Code cleanup to rename all non device files from *.cu to *.cc
2019-04-05 13:05:45 -07:00
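
Commit 5e109ed400 above states that the preprocessor symbol RCCL_BFLOAT16 is used as a feature indicator. As a minimal sketch of how calling code might test for it: the helper name `rccl_has_bfloat16` is hypothetical, and the snippet assumes the symbol would be made visible by including the RCCL header in a real build. The header is deliberately not included here so that the sketch compiles standalone.

```c
/* Sketch only: compile-time check for the RCCL_BFLOAT16 feature indicator.
 * Assumption: in a real build the symbol is defined by the RCCL header when
 * bfloat16 support is present. That header is not included in this sketch,
 * so here the symbol is defined only if passed on the compiler command line
 * (for example with -DRCCL_BFLOAT16). */

/* Hypothetical helper: returns 1 if the feature indicator is defined at
 * compile time, 0 otherwise. */
static int rccl_has_bfloat16(void) {
#ifdef RCCL_BFLOAT16
    return 1;
#else
    return 0;
#endif
}
```

Code that depends on bfloat16 collectives could guard those paths with the same `#ifdef`, so that it still builds against library versions that predate this commit.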