Wenkai Du 170cc1afde Print KL/CL/KE events for all warps (#1544)
* Print KL/CL/KE events for all warps

* Fix count off-by-one issue

* Fix opCount in KE and restore CPU thread option

* Simplify count calculation

[ROCm/rccl commit: ebf7e2305e]
2025-02-12 13:36:31 -08:00
Description
No description provided
282 MiB
Languages
C++ 67.5%
C 20.6%
Python 6.6%
CMake 3.4%
Shell 0.6%
Other 1.1%