SWDEV-525231 - clarify that reduce sync operations are new intrinsics in the 7.0 CHANGELOG (#876)
SWDEV-525231 - clarify that reduce sync operations are new intrinsics in 7.0, not an existing one
[ROCm/clr commit: a5be0f5346]
Committed by GitHub
Parent: daa2814cf3
Commit: ef0bb78310
@@ -31,9 +31,9 @@ Full documentation for HIP is available at [rocm.docs.amd.com](https://rocm.docs
 - HIP Extensions APIs for microscaling formats, which are supported on AMD GPUs.
 * New `wptr` and `rptr` values in `ClPrint`, for better logging in dispatch barrier methods.
 * New debug mask, to print precise code object information for logging.
-* The `_sync()` version of crosslane builtins such as `shfl_sync()` and `__reduce_add_sync` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`.
+* The `_sync()` version of crosslane builtins such as `shfl_sync()` are enabled by default. These can be disabled by setting the preprocessor macro `HIP_DISABLE_WARP_SYNC_BUILTINS`.
 * Added `constexpr` operators for `fp16`/`bf16`.
-* Added `__syncwarp` operation.
+* Added warp level primitives: `__syncwarp` and reduce intrinsics (e.g. `__reduce_add_sync()`)
 * Extended fine grained system memory pool.
 * `num_threads` total number of threads in the group. The legacy API size is alias.
 * Added PCI CHIP ID information as the device attribute.
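To make the reworded entry concrete, here is a minimal host-side sketch of what a masked reduce intrinsic such as `__reduce_add_sync()` is generally understood to compute: the sum of one value per participating lane, returned to every lane named in the mask. This is plain C++ that models the semantics only; it is not HIP device code and not the ROCm implementation. The names `model_reduce_add_sync` and `kMaxLanes` are hypothetical, and the 64-bit mask is an assumption chosen so that one bit can represent each lane of a 64-wide wavefront.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side model of a masked warp reduction.
// Assumption: one mask bit per lane, up to 64 lanes (a 64-wide wavefront).
constexpr std::size_t kMaxLanes = 64;

// Sums the values held by the lanes whose bit is set in `mask`.
// On a device, every participating lane would observe this same result.
inline int model_reduce_add_sync(std::uint64_t mask,
                                 const std::array<int, kMaxLanes>& lane_values) {
    int sum = 0;
    for (std::size_t lane = 0; lane < kMaxLanes; ++lane) {
        if ((mask >> lane) & 1u) {  // lane takes part in the reduction
            sum += lane_values[lane];
        }
    }
    return sum;
}
```

In this model, lanes whose mask bit is clear contribute nothing to the sum, which is why the mask argument matters when only part of a warp is active.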