ROCm 4.3 changelog update (#379)
* Update CHANGELOG.md (#378)
* Updating CHANGELOG.md for ROCm 4.3
[ROCm/rccl commit: 903c84050d]
+21
-2
@@ -1,15 +1,34 @@
# Change Log for RCCL
Full documentation for RCCL is available at [https://rccl.readthedocs.io](https://rccl.readthedocs.io)
## [UNRELEASED]
### Added
- Compatibility with NCCL 2.9.9
## [Unreleased]
## [RCCL-2.8.4 for ROCm 4.3.0]
### Added
- Ability to select the number of channels to use for clique-based all reduce (RCCL_CLIQUE_ALLREDUCE_NCHANNELS). This can be adjusted to tune for performance when computation kernels are being executed in parallel.
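A minimal sketch of how the channel count might be set for a launched process, assuming the variable is read from the process environment at startup. The helper name `clique_env` and the value `4` are hypothetical and only illustrate the mechanism; valid ranges and defaults are not stated in this changelog. `RCCL_ENABLE_CLIQUE=1` is included because the notes say clique-based kernels still require opt-in.

```python
import os


def clique_env(nchannels, base_env=None):
    """Return a copy of an environment with clique all-reduce settings applied.

    `nchannels` is an illustrative value chosen by the caller; this changelog
    does not document which values RCCL accepts.
    """
    if nchannels < 1:
        raise ValueError("nchannels must be a positive integer")
    # Copy so the caller's mapping (or os.environ) is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["RCCL_ENABLE_CLIQUE"] = "1"  # opt in to clique-based kernels
    env["RCCL_CLIQUE_ALLREDUCE_NCHANNELS"] = str(nchannels)
    return env


# Example: build an environment from a small base mapping.
env = clique_env(4, base_env={"PATH": "/usr/bin"})
print(env["RCCL_CLIQUE_ALLREDUCE_NCHANNELS"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when starting a training job, which keeps the setting local to that job instead of changing the parent shell.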
### Optimizations
- Additional tuning for clique-based kernel AllReduce performance (still requires opt-in with RCCL_ENABLE_CLIQUE=1)
- Modification of default values for number of channels / byte limits for clique-based all reduce based on device architecture
### Changed
- Replaced RCCL_FORCE_ENABLE_CLIQUE with RCCL_CLIQUE_IGNORE_TOPO
- Clique-based kernels can now be enabled on topologies where all active GPUs are XGMI-connected
- Topologies not normally supported by clique-based kernels require RCCL_CLIQUE_IGNORE_TOPO=1
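For launch scripts that still set the old variable, the rename could be handled with a small helper like the one below. This is a sketch only: the function name `migrate_clique_env` is hypothetical, and carrying the old value over unchanged is an assumption, since the changelog states the rename but not whether the accepted values are identical.

```python
def migrate_clique_env(env):
    """Return a copy of `env` with the renamed clique variable applied.

    Assumption: the value previously given to RCCL_FORCE_ENABLE_CLIQUE can be
    reused as-is for RCCL_CLIQUE_IGNORE_TOPO.
    """
    updated = dict(env)
    old_value = updated.pop("RCCL_FORCE_ENABLE_CLIQUE", None)
    # An explicitly set new variable takes precedence over the old one.
    if old_value is not None and "RCCL_CLIQUE_IGNORE_TOPO" not in updated:
        updated["RCCL_CLIQUE_IGNORE_TOPO"] = old_value
    return updated


# Example: an environment written for an earlier release.
legacy = {"RCCL_ENABLE_CLIQUE": "1", "RCCL_FORCE_ENABLE_CLIQUE": "1"}
migrated = migrate_clique_env(legacy)
print(sorted(migrated))
```

The helper leaves the input mapping unmodified, so it can be applied to `os.environ` without side effects before the result is handed to a child process.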
### Known issues
- Managed memory is not currently supported for clique-based kernels
## [RCCL-2.8.4 for ROCm 4.2.0]
### Added
- Compatibility with NCCL 2.8.4
### Optimizations
- Additional tuning for clique-based kernels
- Enabling GPU direct RDMA read from GPU
- Fixing a potential memory leak when re-creating multiple communicators within the same process
- Improved topology detection
### Known issues
- None
## [RCCL-2.7.8 for ROCm 4.1.0]
### Added