Commit Graph

854 Commits

Author SHA1 Message Date
Wenkai Du c2e9ada40b Repurpose profiling implementation to simple timestamps tracing (#600)
[ROCm/rccl commit: 14b8ff153f]
2022-08-18 15:34:46 -07:00
Wenkai Du 6c3f1366e8 Add XGMI sys type and clean up detection code (#597)
[ROCm/rccl commit: f5c0b243a8]
2022-08-12 09:52:29 -07:00
Ziyue Yang 478d8312b8 Improve alignment and tuning for Pivot A2A algorithm (#593)
* Improve alignment and tuning for Pivot A2A algorithm

* enable pivot a2a by default

[ROCm/rccl commit: f6b9686482]
2022-08-05 19:40:19 -07:00
gilbertlee-amd e3b832f4ce Disable clique AllReduce UnitTest (#595)
[ROCm/rccl commit: dae11c2aca]
2022-08-04 18:30:00 -06:00
gilbertlee-amd b350916a6e Fixing CMake to avoid unnecessary git_version relinking (#594)
[ROCm/rccl commit: 9ed9cd0e31]
2022-08-04 18:03:59 -06:00
arvindcheru a44be6655d HIP Path default updated to ROCM_PATH (reorg path) (#592)
Updated default path for hip to ROCM_PATH (/opt/rocm instead of /opt/rocm/hip) as per new/current structure.

[ROCm/rccl commit: 2cb2f9493a]
2022-08-04 13:38:41 -04:00
akolliasAMD 6fb5c5d5e3 minor latency tuning (#591)
* minor tuning for tree ll

[ROCm/rccl commit: 4cecdc9be5]
2022-08-03 15:07:44 -06:00
Wenkai Du f70830d629 Revert "Use nontemporal in slow path and add XGMI sys type (#575)" (#590)
This reverts commit e04bba619a.

[ROCm/rccl commit: 9089e68a99]
2022-08-02 09:31:53 -07:00
Wenkai Du 7e124d5b83 Add nccl_net.h to librccl-dev package (#589)
[ROCm/rccl commit: e2cb95a390]
2022-07-29 13:39:49 -07:00
akolliasAMD d5ca0be51f Fixed issue with attomicEXCH creating errors on multi node runs (#587)
[ROCm/rccl commit: 254208e7dd]
2022-07-22 11:32:49 -06:00
akolliasAMD fd99ca19f5 updated alltoallV test to reflect how send counts are done in perf tests (#586)
[ROCm/rccl commit: 686dbc8bc6]
2022-07-21 14:59:34 -06:00
akolliasAMD 18d9fd1b8f Removing redundant LOAD and STORE on primitives plus adding some atomics (#585)
[ROCm/rccl commit: 451c287aa6]
2022-07-21 13:04:57 -06:00
Hubert Lu 088d62ff58 Merge pull request #580 from hubertlu-tw/develop
Enhancement of RCCL logging information for topology-aware utilities

[ROCm/rccl commit: 6dd090917a]
2022-07-15 15:16:37 -07:00
Edgar Gabriel 24f2071206 Merge pull request #584 from edgargabriel/topic/signal-backtrace
intercept SIGUSR2 in RCCL

[ROCm/rccl commit: 58437544f8]
2022-07-15 11:31:19 -05:00
Edgar Gabriel a9e0333dba intercept SIGUSR2 in RCCL
add support for intercepting SIGUSR2 in RCCL. This signal will
not terminate the execution of the application, but print the stacktrace
of the process that the signal was sent to instead.


[ROCm/rccl commit: 2b1d5d3bc1]
2022-07-15 16:28:46 +00:00
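
The behaviour described in the commit above (SIGUSR2 prints a stack trace and the process keeps running) can be illustrated with a minimal Python sketch. This is only an analogy on a Unix system and not RCCL's C++ implementation; the function name `install_sigusr2_tracer` and its `stream` parameter are hypothetical.

```python
import signal
import sys
import traceback


def install_sigusr2_tracer(stream=None):
    """Register a SIGUSR2 handler that prints the current stack and returns.

    Hypothetical helper for illustration; returns the previous handler.
    """

    def _print_stack(signum, frame):
        out = stream if stream is not None else sys.stderr
        out.write(f"Received signal {signum}; stack trace:\n")
        # Print the stack of the code that was interrupted by the signal.
        traceback.print_stack(frame, file=out)
        # Returning normally lets the interrupted code continue, so the
        # signal does not terminate the process.

    return signal.signal(signal.SIGUSR2, _print_stack)


if __name__ == "__main__":
    install_sigusr2_tracer()
    # Stand-in for an external `kill -USR2 <pid>`.
    signal.raise_signal(signal.SIGUSR2)
    print("still running")
```

Because the handler returns instead of exiting, the signal can be sent repeatedly to inspect where a long-running process currently is.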
akolliasAMD 24d9d1c37a Merge pull request #583 from yzygitzh/ziyyang/ll-fix
Remove redundant LOAD/STORE usage in LL initialization

[ROCm/rccl commit: da31537ec7]
2022-07-14 08:51:39 -06:00
Ziyue Yang e1aae026bf Remove redundant LOAD/STORE usage in LL initialization
[ROCm/rccl commit: 77c2bef952]
2022-07-14 00:40:36 +00:00
akolliasAMD d2df866925 Merge pull request #582 from akolliasAMD/readmeUpdate
updated readme to reflect the newer tests

[ROCm/rccl commit: 873c13b47a]
2022-07-13 12:28:30 -06:00
akolliasAMD 2a1d472a20 updated readme to reflect the newer tests
[ROCm/rccl commit: 5950942738]
2022-07-13 16:08:28 +00:00
Wenkai Du 9a9d9cb29b README.md: add CMAKE_PREFIX_PATH to build steps (#581)
[ROCm/rccl commit: 314da5a485]
2022-07-12 11:32:07 -07:00
hubertlu-tw e13eb2eab9 Enhancement of RCCL logging information for topology-aware utilities
[ROCm/rccl commit: a1842df858]
2022-07-11 19:01:10 +00:00
Wenkai Du c129677fe0 Skip HDP cache flush for gfx90a (#578)
* Skip HDP cache flush for gfx90a

* Remove extra debug print

[ROCm/rccl commit: 8c3c8b78c0]
2022-07-08 10:13:32 -07:00
Wenkai Du 659cd52d5c Add more constraints to enable GDR (#579)
* Add more constraints to enable GDR

* Revert deleted line

[ROCm/rccl commit: aa0d7ca882]
2022-07-08 09:52:27 -07:00
Yifan Xiong bf15ad1d72 Reduce AlltoAll port usage in send/recv proxy (#577)
* Reduce AlltoAll port usage when connecting proxy

Reuse socket ports when connecting proxies in AlltoAll.

Existing port usage in AlltoAll is O(n) for recv and O(n) for send,
reusing socket ports in server or client side will make one of them
O(1), reusing both will reduce the total port usage to O(1) and enables
AlltoAll in >64 MI200 nodes.

* Update changelog accordingly

Update changelog accordingly.

[ROCm/rccl commit: 80f53cc171]
2022-07-07 16:15:52 -07:00
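
The port-usage argument in the commit above can be sketched as a small counting model. It assumes, purely for illustration, that each rank talks to every other rank and that a non-reused side needs one port per peer; the function `ports_per_rank` and its parameters are hypothetical and do not correspond to RCCL's proxy code.

```python
def ports_per_rank(n_ranks, reuse_recv=False, reuse_send=False):
    """Count socket ports one rank needs to connect to all of its peers.

    Illustrative model only: one port per peer on a side without reuse,
    a single shared port on a side with reuse.
    """
    peers = max(n_ranks - 1, 0)
    recv_ports = min(1, peers) if reuse_recv else peers  # O(1) vs O(n)
    send_ports = min(1, peers) if reuse_send else peers  # O(1) vs O(n)
    return recv_ports + send_ports


if __name__ == "__main__":
    for n in (8, 64, 512):
        print(
            n,
            ports_per_rank(n),                      # no reuse: grows with n
            ports_per_rank(n, reuse_recv=True),     # one side reused
            ports_per_rank(n, True, True),          # both reused: constant
        )
```

In this model, reusing ports on only one side still leaves a term that grows with the number of ranks, while reusing both sides keeps the total constant, which matches the O(n) versus O(1) reasoning in the commit message.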
Wenkai Du 4b99cef680 Revert "Adding the missing roc:: namespace (#570)" (#576)
This reverts commit fc340decf4.

[ROCm/rccl commit: 2e65881a79]
2022-07-06 10:07:35 -07:00
Wenkai Du e04bba619a Use nontemporal in slow path and add XGMI sys type (#575)
* Use nontemporal in slow path and add XGMI sys type

* Clean up XGMI detection

[ROCm/rccl commit: b250c01cbe]
2022-07-06 07:58:41 -07:00
Wenkai Du 2f4aea93e0 Fix GPU to NIC mapping in tree (#573)
* Fix GPU to NIC mapping in tree

* Update tuning table

[ROCm/rccl commit: 00af1f64e9]
2022-07-03 20:52:52 -07:00
gilbertlee-amd cb5ae7224e Adding git hash info to version output line (#572)
[ROCm/rccl commit: a89a9966aa]
2022-06-28 16:42:51 -06:00
Dmitry Mikushin fc340decf4 Adding the missing roc:: namespace (#570)
* Adding the missing roc:: namespace, effectively changing the value of RCCL_LIBRARY from rccl to roc::rccl.
The important difference is that rccl is treated as a symbolic "-lrccl" by linker (and fail the linking
due to a missing library search path), while roc::rccl is a target name, which can resolve into an absolute
library path.

Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

* Adding a changelog entry

* minor updates to wording

* missing period

Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>

[ROCm/rccl commit: d5bea2cfaa]
2022-06-27 11:44:43 -06:00
Wenkai Du 915a9d3934 Do not set NET GDR level automatically (#571)
[ROCm/rccl commit: 9a285b5e1d]
2022-06-23 16:28:28 -07:00
Wenkai Du 784b12bf75 Use different atomics to check flags in kernel (#568)
[ROCm/rccl commit: c3bb9e70d0]
2022-06-23 09:16:41 -07:00
akolliasAMD f5b5160820 Merge pull request #569 from akolliasAMD/disableMultiRankTest
moved default number of max ranks per gpu to 1

[ROCm/rccl commit: 06f05300fe]
2022-06-22 15:52:06 -04:00
akolliasAMD dcf46e84e0 moved default number of max ranks per gpu to 1
[ROCm/rccl commit: 8b9291eb47]
2022-06-22 17:37:49 +00:00
Ziyue Yang 2b418b5dee Add Feature - Add NPKit Support in RCCL (#564)
* apply npkit

* fix bug

* add npkit in readme

[ROCm/rccl commit: 6e93fafdc3]
2022-06-20 14:30:19 -07:00
Wenkai Du 0fb000932f Change default nchannels per peer (#563)
[ROCm/rccl commit: f274c865c1]
2022-06-13 06:39:05 -07:00
arvindcheru 9c0e790eb5 [CMake] GNU Install Dir Enhancements (#557)
* sd321110 (GNUInstall Dir) enhancements

[ROCm/rccl commit: a1fe1adf1c]
2022-06-10 18:51:51 -04:00
Edgar Gabriel 5099922936 Merge pull request #561 from edgargabriel/multi-rank-devel
Multi rank devel

[ROCm/rccl commit: 45e611dffd]
2022-06-10 11:19:20 -05:00
Edgar f7ef619ba7 extending the unit-tests for multi-rank support
[ROCm/rccl commit: a87d61db2b]
2022-06-10 14:23:19 +00:00
Edgar 8953f5b5ca Introduce multi-rank support per device.
This is a single commit of the source code changes required to
introduce support for multiple ranks per device.
A new interface (ncclCommRankInitMulti) has to be used to make use of
this new feature.


[ROCm/rccl commit: 0336ffdf70]
2022-06-10 14:23:12 +00:00
Wenkai Du 11a6cdd52f Fix P2P scheduling (#560)
[ROCm/rccl commit: 5cb2aca3d9]
2022-06-06 13:32:28 -07:00
Wenkai Du f2dbc77afe Enable timing profile option (#558)
[ROCm/rccl commit: 7a6c6927ae]
2022-06-03 07:05:13 -07:00
akolliasAMD 49adbcc5c7 Merge pull request #556 from akolliasAMD/ROCmSoftwarePlatform/2.12.12
Sync up with NCCL 2.12.12

[ROCm/rccl commit: 2f9663379d]
2022-06-02 12:09:49 -04:00
Aristotelis 0b55e01ef3 Merge remote-tracking branch 'ncclRepo/master' into develop
[ROCm/rccl commit: e0864e7093]
2022-06-02 15:27:24 +00:00
Wenkai Du 1e36b432f1 Revert chunksteps changes (#555)
[ROCm/rccl commit: eef812bed7]
2022-05-31 14:45:51 -07:00
Wenkai Du 5becf1669f Add another Rome model (#553)
* Add another Rome model

* Add option to force enable intranet on single node

* Limit p2p channels to number of ranks

* Refine p2p channels handling

[ROCm/rccl commit: ef499c4810]
2022-05-31 11:31:30 -07:00
akolliasAMD a03ab8e752 code cleanup (#554)
[ROCm/rccl commit: a0a686e74c]
2022-05-31 09:59:36 -04:00
Wenkai Du 2c125ce6ed Update Rome model (#552)
[ROCm/rccl commit: c5b77121f0]
2022-05-26 09:59:23 -07:00
akolliasAMD 22dc8bd246 Added creation of new tree and added switch for using treesplit for specific cases (#551)
[ROCm/rccl commit: 98f0809a39]
2022-05-25 18:55:14 -04:00
gilbertlee-amd a2a4888497 Moving opt-in custom signal handler from UnitTests into RCCL (#550)
* Enable via RCCL_ENABLE_SIGNALHANDLER=1

[ROCm/rccl commit: 700b473211]
2022-05-20 09:56:38 -06:00
Wenkai Du 86e8797602 Add switch for pivot alltoall kernel (#549)
[ROCm/rccl commit: 6707a270b1]
2022-05-17 18:14:04 -07:00