From ed152c5b89a145edc1669834b91506125a63574f Mon Sep 17 00:00:00 2001
From: Bertan Dogancay <111835151+BertanDogancay@users.noreply.github.com>
Date: Wed, 24 Apr 2024 09:07:49 -0600
Subject: [PATCH] Update CHANGELOG.md for RCCL 2.20.5 (#1150)

[ROCm/rccl commit: dcc75797a173f489dc0c1f5192a302ab8e9494c1]
---
 projects/rccl/CHANGELOG.md | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/projects/rccl/CHANGELOG.md b/projects/rccl/CHANGELOG.md
index c2844158df..3ba664a2f7 100644
--- a/projects/rccl/CHANGELOG.md
+++ b/projects/rccl/CHANGELOG.md
@@ -2,14 +2,33 @@
 
 Full documentation for RCCL is available at [https://rccl.readthedocs.io](https://rccl.readthedocs.io)
 
-## Unreleased
+## Unreleased - RCCL 2.20.5 for ROCm 6.2.0
 ### Changed
-- Modifying rings to be rail-optimized topology friendly
+- Compatibility with NCCL 2.20.5
+- Compatibility with NCCL 2.19.4
+- Performance tuning for some collective operations on MI300
+- Enabled NVTX code in RCCL
+- Replaced rccl_bfloat16 with hip_bfloat16
+- NPKit updates:
+  - Removed warm-up iteration removal by default, need to opt in now
+  - Doubled the size of buffers to accommodate for more channels
+- Modified rings to be rail-optimized topology friendly
+- Replaced ROCmSoftwarePlatform links with ROCm links
 ### Added
+- Support for fp8 and rccl_bfloat8
+- Support for using HIP contiguous memory
+- Implemented ROC-TX for host-side profiling
+- Enabled static build
+- Added new rome model
+- Added fp16 and fp8 cases to unit tests
+- New unit test for main kernel stack size
+- New -n option for topo_expl to override # of nodes
+- Improved debug messages of memory allocations
 ### Fixed
-### Removed
+- Bug when configuring RCCL for only LL128 protocol
+- Scratch memory allocation after API change for MSCCL
 
-## Unreleased - RCCL 2.18.6 for ROCm 6.1.0
+## RCCL 2.18.6 for ROCm 6.1.0
 ### Changed
 - Compatibility with NCCL 2.18.6