Commit Graph

977 commits

Author SHA1 Message Date
gilbertlee-amd 80ed608a9d Multi stream unit test (#693)
* Adding multi-stream support to unit tests
2023-02-23 13:28:50 -07:00
Wenkai Du d601c4909c Merge pull request #685 from ROCmSoftwarePlatform/2.16.5
Sync up to NCCL 2.16.5
2023-02-22 10:29:02 -08:00
gilbertlee-amd f63d3b1978 Adding UnitTest timing summary (UT_SHOW_TIMING) (#692) 2023-02-22 08:57:13 -07:00
akolliasAMD d119c0886e UnitTests: made reduceScatter run a smaller amount of tests (#691) 2023-02-21 16:21:24 -07:00
Wenkai Du 86e7b71234 Fix P2P scheduling (#690) 2023-02-21 07:49:54 -08:00
gilbertlee-amd a640c6983f Unit test fail check (#689)
* Adding fall-through on unit test failure

* Workaround for hipGraph validity check issue
2023-02-18 08:50:46 -08:00
Wenkai Du 1c166046a2 Add back __syncthreads() in barrier and adjust stack size (#688) 2023-02-18 08:50:31 -08:00
Ziyue Yang f4bf47f325 NPKit: improve clock calibration and fix GPU clock API (#683)
* Improve clock calibration in NPKit

* Improve gfx macro

* Fix macro
2023-02-17 12:26:57 -07:00
Wenkai Du aee7b42bb8 Merge remote-tracking branch 'nccl/master' into HEAD 2023-02-14 17:14:13 -08:00
Wenkai Du f7a456122c Remove workaround and use indirect function call (#684) 2023-02-14 13:59:48 -08:00
Wenkai Du 9461a43168 Merge pull request #681 from wenkaidu/gfx9
Add HIP event optimization and remove special code for gfx90a
2023-02-13 08:04:59 -08:00
Pedram Alizadeh f525b8e1e6 Adding -pthread flag into CMakeLists.txt (#682)
Adding -pthread flag for linking issues into CMakeLists.txt
2023-02-10 17:22:30 -05:00
Pedram Alizadeh e05560ea82 Updated RCCL introduction at docs/source/library.rst (#680)
* Updated RCCL introduction at docs/source/library.rst

* space after period

---------

Co-authored-by: Saad Rahim <44449863+saadrahim@users.noreply.github.com>
2023-02-10 17:20:54 -05:00
Wenkai Du 39534e8724 Add HIP event optimization and remove special code for gfx90a 2023-02-10 16:46:01 +00:00
gilbertlee-amd df46645ff8 Switching to relaxed capture for unit tests (#679) 2023-02-08 11:28:58 -07:00
Pedram Alizadeh 0df82bd8a3 Update library.rst (#678) 2023-02-06 16:28:59 -05:00
Wenkai Du 11567e5157 Merge pull request #671 from ROCmSoftwarePlatform/2.16.2
Sync up with NCCL 2.16.2
2023-02-04 06:54:30 -08:00
Wenkai Du e1cb45ff22 Merge remote-tracking branch 'nccl/master' into HEAD 2023-02-04 01:44:43 +00:00
Pedram Alizadeh fddb5e6be8 UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) (#674) 2023-02-03 17:36:30 -05:00
Sylvain Jeaugey f3d5166783 2.16.5-1
Add support for 400Gbit NDR network adapters (CX7)
Handle EINTR in socket poll() function
Add NCCL_PROGRESS_APPENDOP_FREQ to control op append overhead
Resource cleanup fixes
Fix double free in case of init failure
Fix crash in ncclCommAbort
Revert AMD speed commit
2023-02-02 12:52:47 -08:00
Pedram Alizadeh fbe52b6caa removed the wrapper script so that the old name is no longer referenced (#676) 2023-01-31 11:11:02 -05:00
Eiden Yoshida 513bc6912a CI: decrease precheckin and extended test timeouts (#675) 2023-01-21 19:32:21 -07:00
Rashika Kheria 93840e7476 Fix maximum handle size for NCCL Net v4 API
NCCL Net v4 supports a maximum handle size of 64 bytes whereas the
ext-net example header files set it for NCCL Net v3. Since,
`aws-ofi-nccl` plugin plans to follow the example header files, fix it
here.

Signed-off-by: Rashika Kheria <rashika@amazon.com>
2023-01-18 13:31:57 +01:00
Wenkai Du a0dd8e0b84 topo_expl: fix broken build by adding hipify steps (#670) 2023-01-06 07:29:40 -08:00
gilbertlee-amd 3e8ab4e46e Updating CHANGELOG (#669) 2022-12-19 09:36:57 -07:00
Wenkai Du 2288e9ae80 Switch to hipLaunchHostFunc for HIP graph (#667) 2022-12-15 10:16:46 -08:00
akolliasAMD 24aa8bd802 added a different way for getting device count, by running it in a child process (#665) 2022-12-14 16:10:14 -07:00
Pedram Alizadeh 54a3da04eb Revert "UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) (#662)" (#666)
This reverts commit 8250092367.
2022-12-14 11:28:40 -05:00
Pedram Alizadeh 1a8cd9791e Merge pull request #664 from PedramAlizadeh/rccl-UnitTests_name_change
Changed the name of UnitTests to rccl-UnitTests (wrapper executable i…
2022-12-13 17:16:22 -05:00
PedramAlizadeh 45872d170f Changed the name of UnitTests to rccl-UnitTests (wrapper executable included). 2022-12-13 21:45:57 +00:00
Pedram Alizadeh 8250092367 UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) (#662) 2022-12-13 16:05:09 -05:00
Ziyue Yang adafc0f759 Add MSCCL Support (#658)
* Add MSCCL support

* Add alignment and message size checking

* Fix nRanks checking, in-place and out-of-place tests and group call handling

* Fix hipGraph unit test

* Change MSCCL init warning to INFO

* Revise license info
2022-12-12 15:51:04 -08:00
Wenkai Du b953544a59 Fix typo in detecting Intel platforms (#661) 2022-12-07 13:36:11 -08:00
akolliasAMD eca623df07 decreased warp size for gfx110x (#655) 2022-12-01 12:19:21 -07:00
gilbertlee-amd faed69f9fc Graph unit tests (#656)
* Adding hipGraph unit tests
2022-12-01 10:28:42 -07:00
Wenkai Du aebed537a5 Reduce linking time through more parallel jobs (#657) 2022-11-30 16:06:03 -08:00
Wenkai Du fb9938cffa Query DMABuf support through HSA runtime API (#654) 2022-11-30 08:53:03 -08:00
Sylvain Jeaugey 28189e2df8 2.16.2-1
Add support for CUDA 12.0, drop Kepler (sm_35).
Support for H100 features.
Make socket code more robust and protected. Solves #555.
Improve performance on large CUDA graphs, reducing dependencies.
Reduce inter-socket bandwidth on AMD CPUs to favor better paths.
Various fixes to ncclCommAbort.
Make service thread polling resistant to EINTR.
Compile with profiling API by default.
Extend NVTX instrumentation with call arguments.
2022-11-30 02:31:59 -08:00
Wenkai Du 9594bbee3b Adjust P2P channels on Intel platform (#653) 2022-11-29 13:57:10 -08:00
akolliasAMD 11862f67de removed cmake HIP_CLANG_PATCH_LEVEL check (#652)
* removed HIP_CLANG_PATCH_LEVEL check
2022-11-29 09:48:59 -07:00
Wenkai Du 67d9327f52 Merge pull request #651 from wenkaidu/nccl_sync
Sync up with NCCL
2022-11-28 17:33:59 -08:00
Wenkai Du bf03a48289 Merge remote-tracking branch 'nccl/master' into HEAD 2022-11-28 09:46:16 -08:00
gilbertlee-amd 36ac8107bd Update CHANGELOG up to ROCm 5.4 (#649)
* Update CHANGELOG for ROCm 5.4.0
2022-11-23 09:40:19 -07:00
Sylvain Jeaugey 614b49f0de Fix google-fastsocket plugin build 2022-11-22 02:13:13 -08:00
Sylvain Jeaugey 55b1d8ab98 Add documentation for NCCL NET plugins
Also repurpose dummy plugin as example, including headers and
compat layers from v6 to v2.
2022-11-22 02:12:53 -08:00
Wenkai Du 57764f8152 Fix incorrect rocm-smi ID conversion (#648) 2022-11-21 19:44:39 -08:00
Wenkai Du 9cb72a3d0f Fix collective trace timestamp format (#647) 2022-11-21 08:11:12 -08:00
Wenkai Du cf3c32a626 Fix typo in previous hipify change (#645) 2022-11-15 11:51:47 -08:00
Wenkai Du b4f6eee9b4 Merge pull request #643 from ROCmSoftwarePlatform/2.15.5
Sync up with NCCL 2.15.5
2022-11-15 08:40:59 -08:00
Wenkai Du 562dd87036 Move hipify to cmake stage
Add minimal ROCm/HIP version requirements for Graph support
2022-11-14 18:10:45 +00:00