Update CHANGELOG to match release branches 6.2 and 6.3 (#1391)
* [CHANGELOG] Add Known issues for ROCm 6.2.1

Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>

* Updated 6.2.1 known issues to match the content in develop.

* Updated CHANGELOG for ROCm 6.3 release. (#1380)

* Updated CHANGELOG for ROCm 6.3 release.

* Update CHANGELOG to new format.

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

---------

Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>
Co-authored-by: nileshnegi <Nilesh.Negi@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
Committed by: GitHub
Parent: 29f87c7191
Commit: 6ed513e1b9
+33
-10
@@ -2,19 +2,42 @@
Full documentation for RCCL is available at [https://rccl.readthedocs.io](https://rccl.readthedocs.io)

## RCCL 2.21.5 for ROCm 6.3.0

### Added

* MSCCL++ integration for specific contexts
* Performance collection to rccl_replayer
* Tuner Plugin example for MI300
* Tuning table for large number of nodes
* Support for amdclang++
* New Rome model

### Changed

* Compatibility with NCCL 2.21.5
* Increased channel count for MI300X multi-node
* Enabled MSCCL for single-process multi-threaded contexts
* Enabled gfx12
* Enabled CPX mode for MI300X
* Enabled tracing with rocprof
* Improved version reporting
* Enabled GDRDMA for Linux kernel 6.4.0+

### Resolved issues

* Fixed model matching with PXN enable

## RCCL 2.20.5 for ROCm 6.2.1

### Fixed
- GDR support flag now set with DMABUF
### Known issues
- On systems running Linux kernel 6.8.0, such as Ubuntu 24.04, Direct Memory Access (DMA) transfers between the GPU and NIC are disabled, which impacts multi-node RCCL performance.
- This issue was reproduced with RCCL 2.20.5 (ROCm 6.2.0 and 6.2.1) on systems with Broadcom Thor-2 NICs and affects other systems with RoCE networks using Linux 6.8.0 or newer.
- Older RCCL versions are also impacted.
- This issue will be addressed in a future ROCm release.

On systems running Linux kernel 6.8.0, such as Ubuntu 24.04, Direct Memory Access (DMA) transfers between the GPU and NIC are disabled, which impacts multi-node RCCL performance.

This issue was reproduced with RCCL 2.20.5 (ROCm 6.2.0 and 6.2.1) on systems with Broadcom Thor-2 NICs and affects other systems with RoCE networks using Linux 6.8.0 or newer.

Older RCCL versions are also impacted.

This issue will be addressed in a future ROCm release.

## Unreleased - RCCL 2.20.5 for ROCm 6.2.0
## RCCL 2.20.5 for ROCm 6.2.0
### Changed
- Compatibility with NCCL 2.20.5
- Compatibility with NCCL 2.19.4