Commit graph

5 commits

Author SHA1 Message Date
Wenkai Du 9a4213356d Support fused all reduce and elementwise operations (#1729)
* Support fused all reduce and elementwise operations

Add additional "acc" parameter to RCCL Replayer logs

Add flag which indicates availability of new API

* Fix Recorder json parsing

* Remove unreachable code

* Remove extra acc pointer check

* .

* Revert "[DEVICE] Adding ability to choose unroll factor at runtime (#1734)"

This reverts commit 9d72be7b2f.

* Use noinline to reduce kernels linking time

* Don't use noinline for gfx942 and gfx950 to avoid perf regression

---------

Co-authored-by: AtlantaPepsi <timhu102@amd.com>
Co-authored-by: BertanDogancay <bertan.dogancay@gmail.com>
2025-07-23 09:04:17 -07:00
Tim ba97c9c18b replayer update v0 (#1733)
* First version of new replayer, with comments on future TODOs

* plus minor fixes for UT

* Updated format of recorder, especially in binary department, according to replayer's need
2025-06-13 15:05:34 -04:00
Arm Patinyasakdikul 6c37ae9470 Added missing copyright message. (#1742)
* Added missing copyright message.

* addressed comments.
2025-06-12 09:58:01 -05:00
Tim 45e1c3f3e2 reverting change to RcclReplayer (#1657)
2025-04-23 15:36:46 -04:00
Tim 9a55ff60a9 RCCL Replayer update (#1603)
RCCL recorder w/ suggested change and UT
2025-04-19 00:21:27 -04:00