23 commits

Author SHA1 Message Date
Mustafa Abduljabbar d665547eef Remove MSCCL single node AllGather XMLs (#1693)
* Remove MSCCL single node XMLs

* Remove comment on MSCCL AG single node support
2025-05-13 17:07:03 -05:00
Mustafa Abduljabbar aa7991dfc8 [AllGather MSCCL] Multinode and single node support up to certain send count (#1650)
* Add multinode and singlenode allgather XML
2025-04-24 09:02:03 -04:00
Pedram Alizadeh 5b36b68d06 single-node AR msccl algorithm tuning for MI300 (#1629) 2025-04-10 10:42:28 -04:00
Wenkai Du 62d10fdc25 msccl: disable 1-shot xmls (#1375)
MSCCL 1-shot xmls may cause different output values on different ranks.
Disabling them for now to avoid undefined behavior in applications.
2024-10-14 15:10:53 -07:00
Wenkai Du a680e329e6 Temporarily disable MSCCL all gather XMLs due to UT failure (#1373) 2024-10-12 08:43:16 -07:00
ClementLinCF cab25f919e Optimize NCHANNELS and MSCCL config for gfx942 80CUs (#1195)
* Optimize NCHANNELS and MSCCL config for gfx942 80CUs

Set appropriately for different NCCL_MIN_NCHANNELS and MSCCL config,
potentially improving communication perf on the MI300x 80CUs

* Delete tools/msccl-algorithms/allreduce_1step_mccl_8_2_16777216_LL.xml

* Change the factor of gfx94 and update msccl config
2024-06-01 07:07:46 -07:00
Wenkai Du 4e1b8c1cbb MSCCL: add support for out-of-place all reduce (#1156) 2024-04-28 19:49:09 -07:00
Pedram Alizadeh c2fc1d6809 msccl algorithms tuning for alltoall on MI300 (#1120)
Co-authored-by: PedramAlizadeh <amd@pmohamma.com>
2024-03-21 20:35:29 -04:00
Pedram Alizadeh 50f22e8317 msccl algorithms tuning for allgather on MI300 (#1110) 2024-03-14 12:18:26 -04:00
Pedram Alizadeh 5a0f9990a9 msccl algorithms tuning for allreduce on MI300 (#1088) 2024-02-21 11:31:56 -05:00
Ziyue Yang 0a53077c9c Improve MSCCL algorithms (#1023) 2024-01-03 14:51:34 -08:00
Ziyue Yang bb144dcd50 Tune MSCCL all-reduce algorithm (#1009) 2023-12-08 17:47:02 -06:00
Wen-Heng (Jack) Chung 8e8323252a Let 320KB message size uses LL protocol. (#1006) 2023-12-06 18:14:31 -06:00
Ziyue Yang e44e112a17 Fix mscclAlgoHandle not initialized issue (#995) 2023-12-01 07:58:01 -08:00
Ziyue Yang 4bb0b4a380 Move MSCCL algorithm loading to initialization to workaround HIP graph conflict (#982)
* MSCCL: pre-specify channels and pre-load algorithms

* add mutex

* fix bug

* clean include

* disable all-gathers temporarily
2023-11-30 09:47:20 -08:00
Ziyue Yang 7ae95db5b8 Optimize MSCCL all-gather algorithms for gfx942 (#964) 2023-11-15 08:18:59 -08:00
akolliasAMD 9f02ee8dea Revert "Introduce allgather for MSCCL on 8 sockets up to 320KB. (#931)" (#939)
This reverts commit bfb8642450.
2023-10-30 23:52:58 -06:00
Wen-Heng (Jack) Chung bfb8642450 Introduce allgather for MSCCL on 8 sockets up to 320KB. (#931) 2023-10-24 18:41:12 -05:00
Wen-Heng (Jack) Chung 3f9ffe4788 Introduce allgather MSCCL XML specification for MI250X up to 320KB. (#930) 2023-10-24 18:35:55 -05:00
Wen-Heng (Jack) Chung 72d5fbddfd Introduce 1-shot allreduce for MI250X Hayabusa. (#929) 2023-10-24 16:31:18 -05:00
Wen-Heng (Jack) Chung 341926c60a Introduce 1pass allreduce. Tailor it for very small message sizes <= 20KB. (#919) 2023-10-16 16:31:08 -05:00
Wenkai Du aeca1af374 Add MSCCL xml files (#861) 2023-08-23 14:12:34 -07:00
Ziyue Yang e3b2342f39 MSCCL: Improve executor and integrate scheduler (#694)
* MSCCL: improve executor and add scheduler for testing

* Use external scheduler

* Fix cmake error

* Address comments

* Fix thread safe issue

* Make MSCCL lifecycle APIs thread safe

* Make MSCCL internal scheduler aware of topology hint

* Revise error message
2023-03-14 14:34:25 -07:00