Wenkai Du
e453f1ced9
Add another Rome model ( #1354 )
2024-10-01 17:41:27 -05:00
Wenkai Du
532b70afb6
Add new Rome model ( #1304 )
...
* Add another rome model and override
* Fix bug
* Fix typo
* Add ring
* Update ring
* Fix model matching
* Clean up
* Clean up
* Reverse rings for NCCL_RINGS input
* Only reverse NCCL_RINGS for ring graph
* Fix mapping issue when using NCCL_RINGS
* Add NCCL_RINGS_REMAP to handle inconsistent net names
2024-08-23 08:45:43 +08:00
mberenjk
519843d2cf
adding rocprof and pytorch parser scripts ( #1214 )
...
* adding rocprof parser script
* adding the support for multiple json files
* adding pytorch profiler script
* remove filtering from pytorch log
* addressing the comments and adding the feature to parse all kernels
* completing the report for torch profiler
---------
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com>
2024-07-19 14:51:28 -05:00
Jack Taylor
5f2b88bc28
Add pytorch rccl/intra node all-reduce benchmark ( #1221 )
...
* Add gpt-fast pytorch all reduce benchmark script
* Update readme instructions
* Minor changes
2024-06-25 08:04:38 -07:00
Tim
f078db5998
Upload npkit_trace_analysis.py ( #1152 )
...
script for parsing json trace, generating heatmap, throughput series, etc.
2024-05-09 16:27:49 -04:00
gilbertlee-amd
4cb62f999a
Rail optimization for rings ( #1140 )
...
- Modifies the ring creation algorithm to be friendlier to rail-optimized topologies (should not affect classic fabric topologies)
2024-04-15 12:03:57 -06:00
corey-derochie-amd
503a472a25
Replaced ROCmSoftwarePlatform and RadeonOpenCompute links with ROCm links. ( #1125 )
2024-03-25 16:29:13 -06:00
akolliasAMD
16d7f372b7
Npkit updates ( #1084 )
...
* changed warmup runs to be opt-in
2024-02-15 07:48:45 -07:00
akolliasAMD
c71bae1608
npkit trace script now syncs the on average difference per rank ( #981 )
2023-11-28 11:03:55 -07:00
Pedram Alizadeh
3f6c2b9b32
Adding a script that will download/compile/run TransferBench/RCCL/UCX/RCCL-tests/RCCL-Unittests/hip-mpi-testsuite ( #895 )
2023-09-27 12:44:36 -04:00
Wenkai Du
7044599575
Add new model support ( #847 )
...
* Add new model support
* Update new rings
2023-08-10 17:14:51 -07:00
Wenkai Du
a7fcd58a97
Enable gfx94x ( #808 ) ( #816 )
...
(cherry picked from commit 94da229a7788d74685d1591a4e75a8341de64f41)
2023-07-21 07:31:27 -07:00
akolliasAMD
9cdac774ea
Wall clock update and npkit trace script Update ( #771 )
...
* changed builtin clock to wall_clock64
* updated npkit_Trace_generator to the new version of npkit
2023-06-07 17:47:10 -06:00
gilbertlee-amd
20b567caac
Updating NOTICES.txt and LICENSE.txt ( #770 )
2023-06-07 09:45:03 -06:00
akolliasAMD
2b1efa9e9a
added time results on npkit generator ( #749 )
2023-05-30 12:57:25 -06:00
akolliasAMD
c88475462b
added modified npkit_trace_generator.py to scripts ( #738 )
...
* added modified npkit_trace_generator.py to scripts
2023-05-09 10:11:35 -06:00
Wenkai Du
ef499c4810
Add another Rome model ( #553 )
...
* Add another Rome model
* Add option to force enable intranet on single node
* Limit p2p channels to number of ranks
* Refine p2p channels handling
2022-05-31 11:31:30 -07:00
Wenkai Du
283dc86a73
Refine and add new Rome models ( #548 )
2022-05-17 08:23:59 -07:00
Wenkai Du
2151c79d14
Add new Rome model ( #536 )
2022-04-13 11:45:40 -07:00
Wenkai Du
ba4c165bf3
Add new Rome model ( #535 )
2022-04-12 13:27:32 -07:00
Wenkai Du
cd17cf6dce
Update Rome model matching and add new models ( #516 )
...
* Update Rome model matching and add new models
* Add missing file
* Models update
2022-03-21 10:54:40 -07:00
Wenkai Du
f8d0775a6f
Add another Rome model ( #483 )
2022-01-05 09:26:31 -08:00
Wenkai Du
0331e39f81
Update Rome model matching ( #461 )
...
* Update Rome model matching
* Add another Rome model
* Automatically setup NET GDR level from model
2021-11-05 08:53:47 -07:00
Wenkai Du
2249a1d9d3
Add more Rome models ( #434 )
...
* Add more Rome models
* Update models and tuning
* Update tuning
2021-10-12 08:23:20 -07:00
Wenkai Du
e0053311c0
Add another Rome model ( #431 )
2021-10-06 08:17:12 -07:00
Wenkai Du
5c8380ff5b
Implement NIC identification and remapping ( #420 )
...
* Add 1H16P GPU model
* Implement NIC identification and remapping
* Revert "Sort IB devices based on device name (#413)"
This reverts commit 2d0ed8dff6.
* Fix permute and check order
* Correction on IB speed reporting
* Revert "Allow user to link layer with RCCL_IB_HCA_SKIP_LINK_LAYER (#361)"
This reverts commit caf5c9992a.
2021-08-24 09:42:04 -07:00
Wenkai Du
5f15ed6e3e
Add gfx908 VM model ( #418 )
2021-08-10 08:55:11 -07:00
Wenkai Du
961922ea02
Add option to enable multiple SAT in SHARP ( #380 )
...
* Add option to enable multiple SAT in SHARP
* Extend number of NICs to 16
2021-06-03 19:45:18 -07:00
Wenkai Du
4c83adb75c
Update Rome models matching ( #376 )
2021-05-25 10:12:40 -07:00
Wenkai Du
1fe031402a
Add gfx90a target ( #344 )
...
* Add gfx90a target
* Support gfx90a topology
Co-authored-by: Eiden Yoshida <eiden.yoshida@amd.com>
2021-04-14 09:29:00 -06:00
Wenkai Du
d87dc7c2e8
collnet: support multiple NICs ( #335 )
2021-03-25 20:59:32 -07:00
Wenkai Du
1d6244b18d
Enable collnet in RCCL ( #333 )
...
* Enable CollNet and use different number of channels
* topo_expl: enable collnet
2021-03-19 12:58:13 -07:00
Wenkai Du
6dfdfef98f
Add gfx908 Rome 4 NICs model
2021-02-06 00:19:47 +00:00
Wenkai Du
373a108516
Fix Rome PCIe 2 node topology generation ( #310 )
2020-12-15 17:16:17 -08:00
Wenkai Du
975b14dffa
Add Rome model and improve search ( #305 )
2020-11-17 14:55:06 -08:00
Wenkai Du
dfa3c41ede
Add more Rome models ( #292 )
2020-10-30 21:26:04 -07:00
Wenkai Du
33babcb5e2
Update Rome single node models ( #277 )
2020-10-13 13:33:09 -07:00
Wenkai Du
ae008fd2db
Rework Rome detection and add multiple network ports models ( #274 )
...
* Rework Rome detection and add multiple network ports models
* Remove unused opCount in p2p transport
2020-10-07 13:37:36 -07:00
lijietang
bbe233f8c1
Add rccl bw test script in tools ( #255 )
2020-09-11 16:59:03 +08:00
Wenkai Du
c5cbece6d0
Increase minimal channels for gfx908 ( #259 )
2020-08-26 11:40:11 -07:00
Wenkai Du
391bbf3f1e
Add NPS4 support on some models ( #256 )
...
* Add NPS4 support on some models
* Add XML models
2020-08-19 11:03:20 -07:00
Wenkai Du
a51e4071e3
Add another Rome model ( #249 )
...
* Add another Rome model
* Add gfx908 4P3L models and support
* Revert "Use cached value for detecting GDR support only once"
This reverts commit 67c8e72ce3.
* Skip using ibverb for GPU direct RDMA detection
* Fine tune one Rome model
2020-08-17 10:51:02 -07:00
Wenkai Du
09ef75656a
Add more Rome 4P2H models
2020-08-06 18:20:02 +00:00
Wenkai Du
e7a10aa0e4
Topology tuning for 4P2H on Rome ( #242 )
...
* Topology tuning for 4P2H on Rome
* Use ncclTopoIdToIndex
2020-07-27 11:53:57 -07:00
Wenkai Du
d5f90e19b5
Add 8P6L multi-node models ( #239 )
2020-07-21 14:10:36 -07:00
Wenkai Du
b3c9852634
Give preference to path with more XGMI connections
2020-05-14 15:33:16 -07:00
Wenkai Du
32388d60a9
topo_expl: add a few more single node models
2020-03-02 11:43:03 -08:00
Wenkai Du
498d5029ad
Add topology visualizer tool
2020-02-26 15:23:34 -08:00