Update CHANGELOG to match release branches 6.2 and 6.3 (#1391)
* [CHANGELOG] Add Known issues for ROCm 6.2.1
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>
* Updated 6.2.1 known issues to match the content in develop.
* Updated CHANGELOG for ROCm 6.3 release. (#1380)
* Updated CHANGELOG for ROCm 6.3 release.
* Update CHANGELOG to new format.
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
---------
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
---------
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>
Co-authored-by: nileshnegi <Nilesh.Negi@amd.com>
Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>
[ROCm/rccl commit: 6ed513e1b9]
Committed by GitHub
Parent: 928414ac06
Commit: 1c700083b2
@@ -2,19 +2,42 @@
Full documentation for RCCL is available at [https://rccl.readthedocs.io](https://rccl.readthedocs.io)

## RCCL 2.21.5 for ROCm 6.3.0

### Added

* MSCCL++ integration for specific contexts
* Performance collection to rccl_replayer
* Tuner Plugin example for MI300
* Tuning table for large number of nodes
* Support for amdclang++
* New Rome model

### Changed

* Compatibility with NCCL 2.21.5
* Increased channel count for MI300X multi-node
* Enabled MSCCL for single-process multi-threaded contexts
* Enabled gfx12
* Enabled CPX mode for MI300X
* Enabled tracing with rocprof
* Improved version reporting
* Enabled GDRDMA for Linux kernel 6.4.0+

### Resolved issues

* Fixed model matching with PXN enable

## RCCL 2.20.5 for ROCm 6.2.1

### Fixed
- GDR support flag now set with DMABUF
### Known issues
- On systems running Linux kernel 6.8.0, such as Ubuntu 24.04, Direct Memory Access (DMA) transfers between the GPU and NIC are disabled, which impacts multi-node RCCL performance.
- This issue was reproduced with RCCL 2.20.5 (ROCm 6.2.0 and 6.2.1) on systems with Broadcom Thor-2 NICs and affects other systems with RoCE networks using Linux 6.8.0 or newer.
- Older RCCL versions are also impacted.
- This issue will be addressed in a future ROCm release.

On systems running Linux kernel 6.8.0, such as Ubuntu 24.04, Direct Memory Access (DMA) transfers between the GPU and NIC are disabled, which impacts multi-node RCCL performance.

This issue was reproduced with RCCL 2.20.5 (ROCm 6.2.0 and 6.2.1) on systems with Broadcom Thor-2 NICs and affects other systems with RoCE networks using Linux 6.8.0 or newer.

Older RCCL versions are also impacted.

This issue will be addressed in a future ROCm release.

## Unreleased - RCCL 2.20.5 for ROCm 6.2.0
## RCCL 2.20.5 for ROCm 6.2.0
### Changed
- Compatibility with NCCL 2.20.5
- Compatibility with NCCL 2.19.4