Commit Graph

184 Commits

Author SHA1 Message Date
Wenkai Du addbf4bd90 rccl-prim-test: minor update (#718) 2023-04-03 07:30:04 -07:00
Ziyue Yang e3b2342f39 MSCCL: Improve executor and integrate scheduler (#694)
* MSCCL: improve executor and add scheduler for testing

* Use external scheduler

* Fix cmake error

* Address comments

* Fix thread safe issue

* Make MSCCL lifecycle APIs thread safe

* Make MSCCL internal scheduler aware of topology hint

* Revise error message
2023-03-14 14:34:25 -07:00
Wenkai Du e1cb45ff22 Merge remote-tracking branch 'nccl/master' into HEAD 2023-02-04 01:44:43 +00:00
Wenkai Du a0dd8e0b84 topo_expl: fix broken build by adding hipify steps (#670) 2023-01-06 07:29:40 -08:00
Ziyue Yang adafc0f759 Add MSCCL Support (#658)
* Add MSCCL support

* Add alignment and message size checking

* Fix nRanks checking, in-place and out-of-place tests and group call handling

* Fix hipGraph unit test

* Change MSCCL init warning to INFO

* Revise license info
2022-12-12 15:51:04 -08:00
gilbertlee-amd faed69f9fc Graph unit tests (#656)
* Adding hipGraph unit tests
2022-12-01 10:28:42 -07:00
Wenkai Du 94ad7f6f51 Update tuning table and fix topo_expl 2022-11-07 18:24:24 +00:00
Wenkai Du 4f0e223db4 Merge remote-tracking branch 'nccl/master' into develop 2022-10-20 15:41:29 +00:00
Wenkai Du fc554a2428 topo_expl: fix compilation error (#639) 2022-10-19 14:19:50 -07:00
gilbertlee-amd 10dbd2a452 Fixing formatting for copywrite (#638) 2022-10-19 13:43:21 -06:00
gilbertlee-amd ebb8b5bf63 Updating files for missing licenses (#637) 2022-10-14 13:49:16 -06:00
gilbertlee-amd bd7d589446 Removing TransferBench from tools (#632)
Point to new TransferBench repo
2022-09-30 11:53:32 -06:00
Wen-Heng (Jack) Chung 84054c3b30 Tweak unroll factors. 2022-09-22 13:03:04 -05:00
Gilbert Lee 009e79623f Merge branch 'develop' into 2.13.4 2022-09-09 23:07:04 +00:00
gilbertlee-amd dd56135a9a Updating stream caching (#614)
- Adding non-captured hipStream for use in setup
2022-09-09 16:30:15 -06:00
gilbertlee-amd 65d78e9a1d GraphBench (#613)
Adding simple GraphBench tool for comparing RCCL hipGraph performance
2022-09-09 12:12:25 -06:00
Wenkai Du a79d9e3586 Merge remote-tracking branch 'nccl/master' into develop 2022-09-09 16:05:38 +00:00
akolliasAMD 06bce9d0c9 added stream synch after hipMemset (#609) 2022-08-30 16:18:37 -06:00
arvindcheru 2cb2f9493a HIP Path default updated to ROCM_PATH (reorg path) (#592)
Updated default path for hip to ROCM_PATH (/opt/rocm instead of /opt/rocm/hip) as per new/current structure.
2022-08-04 13:38:41 -04:00
Edgar 0336ffdf70 Introduce multi-rank support per device.
This is a single commit of the source code changes required to
introduce support for multiple ranks per device.
A new interface (ncclCommRankInitMulti) has to be used to make use of
this new feature.
2022-06-10 14:23:12 +00:00
Wenkai Du ef499c4810 Add another Rome model (#553)
* Add another Rome model

* Add option to force enable intranet on single node

* Limit p2p channels to number of ranks

* Refine p2p channels handling
2022-05-31 11:31:30 -07:00
Wenkai Du c5b77121f0 Update Rome model (#552) 2022-05-26 09:59:23 -07:00
akolliasAMD 98f0809a39 Added creation of new tree and added switch for using treesplit for specific cases (#551) 2022-05-25 18:55:14 -04:00
Wenkai Du 283dc86a73 Refine and add new Rome models (#548) 2022-05-17 08:23:59 -07:00
gilbertlee-amd 685bcea127 [TransferBench] Syncing with TransferBench v1.02 (#541) 2022-04-27 20:43:24 -06:00
Wenkai Du 063da25563 topo_expl: fix build and add tuning support (#539) 2022-04-26 15:40:07 -07:00
Wenkai Du d28e1cb44f Merge remote-tracking branch 'nccl/master' into develop 2022-04-18 11:15:25 -07:00
Wenkai Du 2151c79d14 Add new Rome model (#536) 2022-04-13 11:45:40 -07:00
Wenkai Du ba4c165bf3 Add new Rome model (#535) 2022-04-12 13:27:32 -07:00
gilbertlee-amd def6832287 Transfer bench single stream mode (#531)
- Adding single stream mode
- Removing some unused env vars
- Adding output to CSV mode for p2p benchmark, topology listing modes
2022-04-08 15:20:55 -06:00
Wenkai Du bbe780ca6c Support multiple tuning tables (#522)
* Support multiple tuning tables

* [UnitTests] Skip managed memory testing
2022-03-31 17:09:21 -07:00
gilbertlee-amd 2d558c9abc Adding explicit request for coarse-grained host memory due to changes in HipHostMalloc (#517) 2022-03-25 13:05:07 -06:00
Wenkai Du cd17cf6dce Update Rome model matching and add new models (#516)
* Update Rome model matching and add new models

* Add missing file

* Models update
2022-03-21 10:54:40 -07:00
Ziyue Yang b569c0a1db Add Pivot AllToAll algorithm for Rome model (#503)
* add a2a pivot interface

* remove debug info

* address comments

* fix bug

* remove custom script

* address comments

* fix bug
2022-02-20 21:09:47 -08:00
gilbertlee-amd f3c2cafd9d [TransferBench] Fix for cases with subsets of configured numa nodes (#495) 2022-02-07 12:16:19 -07:00
gilbertlee-amd 84d5fce7dd TransferBench: Adding ability to reindex GPUs based on PCIe address (#494) 2022-02-02 08:51:41 -07:00
Wenkai Du 598c6fdded Update Rome models (#491) 2022-01-14 10:03:30 -08:00
Wenkai Du 369c021992 topo_expl: update for 2.11.4 (#490)
* topo_expl: update for 2.11.4

* topo_expl: revert a few logging changes
2022-01-13 13:33:07 -08:00
gilbertlee-amd 2530a2f084 [TransferBench] Updating for 2.11.4. Decoupling from RCCL kernel (#485) 2022-01-05 16:33:25 -07:00
Wenkai Du 4234a638b5 Merge pull request #482 from ROCmSoftwarePlatform/2.11.4
Sync up with 2.11.4
2022-01-05 09:31:51 -08:00
Wenkai Du f8d0775a6f Add another Rome model (#483) 2022-01-05 09:26:31 -08:00
Wenkai Du 434ecb0e1f Merge remote-tracking branch 'origin/develop' into 2.11.4 2022-01-03 09:54:16 -08:00
gilbertlee-amd 1157c2edfe [TransferBench] Adding more preset benchmarks to filter read mode, cpu vs gpu pairs (#477) 2021-11-24 18:05:37 -07:00
Wenkai Du 3a919c1f49 Merge remote-tracking branch 'nccl/master' into develop 2021-11-11 14:22:12 -08:00
gilbertlee-amd 1c7ef1b790 [TransferBench] Adding #CUs / RRLW mode to p2p benchmark (#464) 2021-11-08 14:36:04 -07:00
Wenkai Du 0331e39f81 Update Rome model matching (#461)
* Update Rome model matching

* Add another Rome model

* Automatically setup NET GDR level from model
2021-11-05 08:53:47 -07:00
Wenkai Du 14a184eb67 Query XGMI link count through rocm_smi_lib API (#442) 2021-10-26 10:30:20 -07:00
gilbertlee-amd 18246fc191 [TransferBench] Changing default per block multiple to 256B, adding BLOCK_BYTES env var (#446) 2021-10-25 11:23:29 -06:00
gilbertlee-amd 550d732d6c TransferBench p2p benchmark mode (#444)
* [TransferBench] Adding a p2p benchmark mode
* [TransferBench] Switching to using single sync mode by default (USE_SINGLE_SYNC=1)
2021-10-21 15:28:16 -06:00
gilbertlee-amd f6b7ac693e [TransferBench] Adding comment echoing to help distinguish tests (#438) 2021-10-13 14:56:57 -06:00