Wenkai Du
addbf4bd90
rccl-prim-test: minor update ( #718 )
2023-04-03 07:30:04 -07:00
Ziyue Yang
e3b2342f39
MSCCL: Improve executor and integrate scheduler ( #694 )
...
* MSCCL: improve executor and add scheduler for testing
* Use external scheduler
* Fix cmake error
* Address comments
* Fix thread safe issue
* Make MSCCL lifecycle APIs thread safe
* Make MSCCL internal scheduler aware of topology hint
* Revise error message
2023-03-14 14:34:25 -07:00
Wenkai Du
e1cb45ff22
Merge remote-tracking branch 'nccl/master' into HEAD
2023-02-04 01:44:43 +00:00
Wenkai Du
a0dd8e0b84
topo_expl: fix broken build by adding hipify steps ( #670 )
2023-01-06 07:29:40 -08:00
Ziyue Yang
adafc0f759
Add MSCCL Support ( #658 )
...
* Add MSCCL support
* Add alignment and message size checking
* Fix nRanks checking, in-place and out-of-place tests and group call handling
* Fix hipGraph unit test
* Change MSCCL init warning to INFO
* Revise license info
2022-12-12 15:51:04 -08:00
gilbertlee-amd
faed69f9fc
Graph unit tests ( #656 )
...
* Adding hipGraph unit tests
2022-12-01 10:28:42 -07:00
Wenkai Du
94ad7f6f51
Update tuning table and fix topo_expl
2022-11-07 18:24:24 +00:00
Wenkai Du
4f0e223db4
Merge remote-tracking branch 'nccl/master' into develop
2022-10-20 15:41:29 +00:00
Wenkai Du
fc554a2428
topo_expl: fix compilation error ( #639 )
2022-10-19 14:19:50 -07:00
gilbertlee-amd
10dbd2a452
Fixing formatting for copyright ( #638 )
2022-10-19 13:43:21 -06:00
gilbertlee-amd
ebb8b5bf63
Updating files for missing licenses ( #637 )
2022-10-14 13:49:16 -06:00
gilbertlee-amd
bd7d589446
Removing TransferBench from tools ( #632 )
...
Point to new TransferBench repo
2022-09-30 11:53:32 -06:00
Wen-Heng (Jack) Chung
84054c3b30
Tweak unroll factors.
2022-09-22 13:03:04 -05:00
Gilbert Lee
009e79623f
Merge branch 'develop' into 2.13.4
2022-09-09 23:07:04 +00:00
gilbertlee-amd
dd56135a9a
Updating stream caching ( #614 )
...
- Adding non-captured hipStream for use in setup
2022-09-09 16:30:15 -06:00
gilbertlee-amd
65d78e9a1d
GraphBench ( #613 )
...
Adding simple GraphBench tool for comparing RCCL hipGraph performance
2022-09-09 12:12:25 -06:00
Wenkai Du
a79d9e3586
Merge remote-tracking branch 'nccl/master' into develop
2022-09-09 16:05:38 +00:00
akolliasAMD
06bce9d0c9
added stream synch after hipMemset ( #609 )
2022-08-30 16:18:37 -06:00
arvindcheru
2cb2f9493a
HIP Path default updated to ROCM_PATH (reorg path) ( #592 )
...
Updated default path for hip to ROCM_PATH (/opt/rocm instead of /opt/rocm/hip) as per new/current structure.
2022-08-04 13:38:41 -04:00
Edgar
0336ffdf70
Introduce multi-rank support per device.
...
This is a single commit of the source code changes required to
introduce support for multiple ranks per device.
A new interface (ncclCommRankInitMulti) has to be used to make use of
this new feature.
2022-06-10 14:23:12 +00:00
Wenkai Du
ef499c4810
Add another Rome model ( #553 )
...
* Add another Rome model
* Add option to force enable intranet on single node
* Limit p2p channels to number of ranks
* Refine p2p channels handling
2022-05-31 11:31:30 -07:00
Wenkai Du
c5b77121f0
Update Rome model ( #552 )
2022-05-26 09:59:23 -07:00
akolliasAMD
98f0809a39
Added creation of new tree and added switch for using treesplit for specific cases ( #551 )
2022-05-25 18:55:14 -04:00
Wenkai Du
283dc86a73
Refine and add new Rome models ( #548 )
2022-05-17 08:23:59 -07:00
gilbertlee-amd
685bcea127
[TransferBench] Syncing with TransferBench v1.02 ( #541 )
2022-04-27 20:43:24 -06:00
Wenkai Du
063da25563
topo_expl: fix build and add tuning support ( #539 )
2022-04-26 15:40:07 -07:00
Wenkai Du
d28e1cb44f
Merge remote-tracking branch 'nccl/master' into develop
2022-04-18 11:15:25 -07:00
Wenkai Du
2151c79d14
Add new Rome model ( #536 )
2022-04-13 11:45:40 -07:00
Wenkai Du
ba4c165bf3
Add new Rome model ( #535 )
2022-04-12 13:27:32 -07:00
gilbertlee-amd
def6832287
Transfer bench single stream mode ( #531 )
...
- Adding single stream mode
- Removing some unused env vars
- Adding output to CSV mode for p2p benchmark, topology listing modes
2022-04-08 15:20:55 -06:00
Wenkai Du
bbe780ca6c
Support multiple tuning tables ( #522 )
...
* Support multiple tuning tables
* [UnitTests] Skip managed memory testing
2022-03-31 17:09:21 -07:00
gilbertlee-amd
2d558c9abc
Adding explicit request for coarse-grained host memory due to changes in hipHostMalloc ( #517 )
2022-03-25 13:05:07 -06:00
Wenkai Du
cd17cf6dce
Update Rome model matching and add new models ( #516 )
...
* Update Rome model matching and add new models
* Add missing file
* Models update
2022-03-21 10:54:40 -07:00
Ziyue Yang
b569c0a1db
Add Pivot AllToAll algorithm for Rome model ( #503 )
...
* add a2a pivot interface
* remove debug info
* address comments
* fix bug
* remove custom script
* address comments
* fix bug
2022-02-20 21:09:47 -08:00
gilbertlee-amd
f3c2cafd9d
[TransferBench] Fix for cases with subsets of configured numa nodes ( #495 )
2022-02-07 12:16:19 -07:00
gilbertlee-amd
84d5fce7dd
TransferBench: Adding ability to reindex GPUs based on PCIe address ( #494 )
2022-02-02 08:51:41 -07:00
Wenkai Du
598c6fdded
Update Rome models ( #491 )
2022-01-14 10:03:30 -08:00
Wenkai Du
369c021992
topo_expl: update for 2.11.4 ( #490 )
...
* topo_expl: update for 2.11.4
* topo_expl: revert a few logging changes
2022-01-13 13:33:07 -08:00
gilbertlee-amd
2530a2f084
[TransferBench] Updating for 2.11.4. Decoupling from RCCL kernel ( #485 )
2022-01-05 16:33:25 -07:00
Wenkai Du
4234a638b5
Merge pull request #482 from ROCmSoftwarePlatform/2.11.4
...
Sync up with 2.11.4
2022-01-05 09:31:51 -08:00
Wenkai Du
f8d0775a6f
Add another Rome model ( #483 )
2022-01-05 09:26:31 -08:00
Wenkai Du
434ecb0e1f
Merge remote-tracking branch 'origin/develop' into 2.11.4
2022-01-03 09:54:16 -08:00
gilbertlee-amd
1157c2edfe
[TransferBench] Adding more preset benchmarks to filter read mode, cpu vs gpu pairs ( #477 )
2021-11-24 18:05:37 -07:00
Wenkai Du
3a919c1f49
Merge remote-tracking branch 'nccl/master' into develop
2021-11-11 14:22:12 -08:00
gilbertlee-amd
1c7ef1b790
[TransferBench] Adding #CUs / RRLW mode to p2p benchmark ( #464 )
2021-11-08 14:36:04 -07:00
Wenkai Du
0331e39f81
Update Rome model matching ( #461 )
...
* Update Rome model matching
* Add another Rome model
* Automatically setup NET GDR level from model
2021-11-05 08:53:47 -07:00
Wenkai Du
14a184eb67
Query XGMI link count through rocm_smi_lib API ( #442 )
2021-10-26 10:30:20 -07:00
gilbertlee-amd
18246fc191
[TransferBench] Changing default per block multiple to 256B, adding BLOCK_BYTES env var ( #446 )
2021-10-25 11:23:29 -06:00
gilbertlee-amd
550d732d6c
TransferBench p2p benchmark mode ( #444 )
...
* [TransferBench] Adding a p2p benchmark mode
* [TransferBench] Switching to using single sync mode by default (USE_SINGLE_SYNC=1)
2021-10-21 15:28:16 -06:00
gilbertlee-amd
f6b7ac693e
[TransferBench] Adding comment echoing to help distinguish tests ( #438 )
2021-10-13 14:56:57 -06:00