* Added `RCCL_MSCCL_ENABLE_SINGLE_PROCESS` runtime flag to return to the original MSCCL enablement behaviour, except when explicitly enabled for multi-threaded use.
* Added documentation for the new `RCCL_MSCCL_ENABLE_SINGLE_PROCESS` runtime env var.
* update documentation
- add version number to documentation
- rename .sphinx/.doxygen to sphinx/doxygen
- enable htmlzip, pdf, epub formats when publishing on Read the Docs
* add noCI label for dependabot PRs, since RTD CI is separate from math lib CI
* update rocm-docs-core to v0.13.4
* update README with link to rocm.docs.amd.com
* Fix typo in copyright
* Minor README improvements
- Prevent underscores from being interpreted as italics in test name format.
- Switch URL to HTTPS.
* Update docs scripts config
- Allow run_doc.sh and run_doxygen.sh to be called from any directory.
* Add docs build to Jenkins
Fix hang in corner cases of alltoallv using point to point send/recv.
Harmonize error messages.
Fix missing NVTX section in the license.
Update README.
* Adding the ability to force install dependencies (namely gtest); gtest library installation fix for CentOS
* Removing potentially unnecessary dependencies from install script
* Adding static library building option.
* Disabling running tests for static build
* Removing static packaging in CI
Co-authored-by: Saad Rahim <saad.rahim@amd.com>
* Making hip-clang the default compiler; documentation update
* Adding back --hip-clang to install.sh as a silent option for CI
* Documentation updates for NCCL 2.7
* Restoring deleted line in install script
* Fixing install script to actually install library when requested. Cleaning up unused code.
Removing unused arguments from install script.
Fixing inconsistent whitespace.
* Fixing install script to install to the correct location, /opt/rocm; it now creates a symlink in /opt/rocm/lib
* Updates and corrections to README and install script
Added detection of IBM/Power NVLink bridge device.
Add NUMA support to PCI distance calculations.
Added NCCL_IGNORE_CPU_AFFINITY env var.
Fix memory leaks; GithubIssue#180
Compiler warning fix; GithubIssue#178
Replace non-standard variable length arrays. GithubIssue#171
Fix Tree+Shared Memory crash. GithubPR#185
Fix LL cleanup hang during long running DL jobs.
Fix NCCL_RINGS environment variable handling.
Added extra checks to catch repeated calls to ncclCommDestroy(). GithubIssue#191
Improve bootstrap socket connection reliability at scale.
Fix hostname hashing issue. GithubIssue#187
Code cleanup to rename all non-device files from *.cu to *.cc.
Add support for inter-node communication using sockets and InfiniBand/RoCE.
Improve latency.
Add support for aggregation.
Improve LL/regular tuning.
Remove tests as those are now at github.com/nvidia/nccl-tests.