Arm Patinyasakdikul
4d71cae249
[topo-expl] update header file location. ( #1769 )
...
[ROCm/rccl commit: 35024ca1cb ]
2025-06-27 15:29:37 -05:00
Mustafa Abduljabbar
ab4a3eb0c1
Fix topo explorer's compatibility with NCCL 2.24 ( #1671 )
...
* Fix build issues
* Fix failure to find path remote rank
[ROCm/rccl commit: f3f3336468 ]
2025-05-05 15:26:29 -04:00
Mustafa Abduljabbar
0a81478bd9
Fix topo explorer's nccl 2.23 compatibility ( #1623 )
...
* Fix compiler issues due to broken compatibility
* Fix segfault and pass rank instead of busid and add a pointer to cover a new algorithm
[ROCm/rccl commit: aace4e27f8 ]
2025-04-02 09:47:29 -04:00
Benjamin Kitor
fe806d5427
Add Topologies for 16-GPU gfx942 SuperNode ( #1417 )
...
* Add Topologies for 16-GPU gfx942 SuperNode
- Add GigaIO topologies to tools/topo_expl for dev and testing
- Add GigaIO Columba 16 GPU romeModel and adjust topology matching algorithm in rome_models for 16 GPU system
- Fix bug which failed to match Rome Model when using subsets of system resources (i.e. ROCR_VISIBLE_DEVICES is set)
- Fixes for topo_expl
* Fix bug w/ 1H16P
[ROCm/rccl commit: a05329bd0d ]
2024-12-03 13:12:03 -08:00
BertanDogancay
9059445acb
Merge remote-tracking branch 'nccl/master' into develop
...
[ROCm/rccl commit: 84081064a0 ]
2024-10-02 09:31:25 -05:00
Wenkai Du
f98715baea
Merge remote-tracking branch 'nccl/master' into develop
...
[ROCm/rccl commit: abd0615351 ]
2023-06-26 22:51:56 +00:00
Wenkai Du
36e5e02e46
Merge remote-tracking branch 'nccl/master' into develop
...
[ROCm/rccl commit: 4f0e223db4 ]
2022-10-20 15:41:29 +00:00
Wenkai Du
7874a99c75
Merge remote-tracking branch 'nccl/master' into develop
...
[ROCm/rccl commit: a79d9e3586 ]
2022-09-09 16:05:38 +00:00
Wenkai Du
67e7e6507e
Merge remote-tracking branch 'nccl/master' into develop
...
[ROCm/rccl commit: d28e1cb44f ]
2022-04-18 11:15:25 -07:00
Wenkai Du
5bebcb0015
Setup collectives threshold for enabling intranet ( #387 )
...
* Setup collectives threshold for enabling intranet
* Use separate operation counters for coll and p2p
[ROCm/rccl commit: b815a2800f ]
2021-06-09 13:24:26 -07:00
Wenkai Du
c8a432dc25
Allow intranode use of network connection ( #383 )
...
* Allow intranode use of network connection
* Checking for graph for null pointer
[ROCm/rccl commit: a3a8c2d56b ]
2021-06-08 07:37:59 -07:00
Wenkai Du
a76bebf8b6
Merge remote-tracking branch 'nccl/master' into develop
...
[ROCm/rccl commit: a4ea1fed5b ]
2021-05-05 16:01:01 -07:00
Wenkai Du
287ed0f18a
Enable collnet in RCCL ( #333 )
...
* Enable CollNet and use different number of channels
* topo_expl: enable collnet
[ROCm/rccl commit: 1d6244b18d ]
2021-03-19 12:58:13 -07:00
Wenkai Du
adff98765c
Merge remote-tracking branch 'nccl/master' into no-target-id
...
[ROCm/rccl commit: d469947641 ]
2021-01-14 19:27:53 -05:00
Wenkai Du
69eb70ce43
topo_expl: update to 2.7
...
[ROCm/rccl commit: 71ec3e09df ]
2020-06-09 17:40:24 -07:00
Wenkai Du
779ee97ada
topo_expl: fix build error
...
[ROCm/rccl commit: 5743c6b7d2 ]
2020-04-27 17:17:05 +00:00
Wenkai Du
8852e54181
topo_expl: update to 2.6
...
[ROCm/rccl commit: 6f54b23503 ]
2020-04-01 13:37:08 -07:00
Wenkai Du
00f421ccbd
Add topology explorer
...
[ROCm/rccl commit: 55f8e2dec7 ]
2020-02-19 14:42:06 -08:00