rocm-systems/src
alex-breslow-amd 2f6b20c00a Use One Slice per Basic Primitive for AllReduce, ReduceScatter, AllGather (#1681) for Single Node on Some GFX9 Systems
Using a single slice instead of the usual two gives roughly a 5% speedup on some GFX9 systems in single-node runs; the exact gain varies.
2025-05-29 16:17:35 -07:00