From 22120c6303511b68ffb767bd9dcc8b19abcc86d7 Mon Sep 17 00:00:00 2001
From: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com>
Date: Fri, 23 May 2025 14:47:59 -0600
Subject: [PATCH] Fixed errors in the CHANGELOG for ROCm 7.0 (#1702)

* Updated 6.5 release to be 7.0
* Corrected the RCCL version for 6.4.1
* Moved items to the correct releases
* Added NCCL 2.25.1 compatibility item
* Fixed wording
* Added entry for `ManagedMem` and `ManagedMemGraph` test fix

[ROCm/rccl commit: 7b633d58441aec55b6cca9460e1fa77a3b300639]
---
 projects/rccl/CHANGELOG.md | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/projects/rccl/CHANGELOG.md b/projects/rccl/CHANGELOG.md
index ef3e738ab8..91c5a3b857 100644
--- a/projects/rccl/CHANGELOG.md
+++ b/projects/rccl/CHANGELOG.md
@@ -2,41 +2,37 @@
 
 Full documentation for RCCL is available at [https://rccl.readthedocs.io](https://rccl.readthedocs.io)
 
-## Unreleased - RCCL 2.24.3 for ROCm 6.5.0
+## Unreleased - RCCL 2.25.1 for ROCm 7.0.0
 
 ### Resolved issues
 
 * Resolved an issue when using more than 64 channels when multiple collectives are used in the same `ncclGroup()` call.
+* Fixed unit test failures in tests ending with `ManagedMem` and `ManagedMemGraph` suffixes.
 
 ### Added
 
 * Added new GPU target `gfx950`.
-
-### Changed
-
-* Compatibility with NCCL 2.24.3
-
-## Unreleased - RCCL 2.23.4 for ROCm 6.4.1
-
-### Added
-
 * Added MSCCL support for multinode gfx942/gfx950 (i.e., 16 and 32 GPUs). To enable, set the environment variable `RCCL_MSCCL_FORCE_ENABLE=1`. Max message size for MSCCL AllGather usage is `12292 * sizeof(datatype) * nGPUs`.
-* Added synchronization before destroying proxy thread to fix a rare hang caused by early termination.
 
 ### Changed
 
 * Compatibility with NCCL 2.23.4
+* Compatibility with NCCL 2.24.3
+* Compatibility with NCCL 2.25.1
+
+## RCCL 2.22.3 for ROCm 6.4.1
 
 ### Resolved issues
 
 * Fixed the accuracy issue for MSCCLPP `allreduce7` kernel in graph mode.
 * Fixed IntraNet performance.
+* Fixed an issue where, in rare circumstances, the application could stop responding due to a proxy thread synchronization issue.
 
 ### Known issues
 
 * When splitting a communicator using `ncclCommSplit` in some GPU configurations, MSCCL initialization can cause a segmentation fault. The recommended workaround is to disable MSCCL with `export RCCL_MSCCL_ENABLE=0`.
-* Within the RCCL-UnitTests test suite, failures occur in tests ending with the `.ManagedMem` and `.ManagedMemGraph` suffixes. These failures only affect the test results and do not affect the RCCL component itself. This issue will be resolved in the next major release.
+* Within the RCCL-UnitTests test suite, failures occur in tests ending with the `ManagedMem` and `ManagedMemGraph` suffixes. These failures only affect the test results and do not affect the RCCL component itself. This issue will be resolved in the next major release.
 
 ## RCCL 2.22.3 for ROCm 6.4.0