Files
rocm-systems/tools
Wenkai Du 9a4213356d Support fused all reduce and elementwise operations (#1729)
* Support fused all reduce and elementwise operations

Add additional "acc" parameter to RCCL Replayer logs

Add flag which indicates availability of new API

* Fix Recorder json parsing

* Remove unreachable code

* Remove extra acc pointer check

* .

* Revert "[DEVICE] Adding ability to choose unroll factor at runtime (#1734)"

This reverts commit 9d72be7b2f.

* Use noinline to reduce kernels linking time

* Don't use noinline for gfx942 and gfx950 to avoid perf regression

---------

Co-authored-by: AtlantaPepsi <timhu102@amd.com>
Co-authored-by: BertanDogancay <bertan.dogancay@gmail.com>
2025-07-23 09:04:17 -07:00
..
2022-12-01 10:28:42 -07:00