From ef0bb78310c4a2075ddc26815ed7c273da461d82 Mon Sep 17 00:00:00 2001
From: "Hernandez, Gerardo"
Date: Fri, 15 Aug 2025 20:40:05 +0100
Subject: [PATCH] SWDEV-525231 - clarify that reduce sync operations are new intrinsics in the 7.0 CHANGELOG (#876)

SWDEV-525231 - clarify that reduce sync operations are new intrinsics in 7.0, not an existing one

[ROCm/clr commit: a5be0f5346e8b6ee88be0497d2242f733e68efaa]
---
 projects/clr/CHANGELOG.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/projects/clr/CHANGELOG.md b/projects/clr/CHANGELOG.md
index 6a4fe5852f..cb0e7b19da 100644
--- a/projects/clr/CHANGELOG.md
+++ b/projects/clr/CHANGELOG.md
@@ -31,9 +31,9 @@ Full documentation for HIP is available at [rocm.docs.amd.com](https://rocm.docs
   - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs.
 * New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods.
 * New debug mask, to print precise code object information for logging.
-* The `_sync()` version of crosslane builtins such as `shfl_sync()` and `__reduce_add_sync` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`.
+* The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`.
 * Added `constexpr` operators for `fp16`/`bf16`.
-* Added `__syncwarp` operation.
+* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`)
 * Extended fine grained system memory pool.
 * `num_threads` total number of threads in the group. The legacy API size is alias.
 * Added PCI CHIP ID information as the device attribute.