akolliasAMD
f4858e14b2
rearranged how the min and max functions are part of msccl ( #1025 )
...
* rearranged how the min and max functions are part of msccl
* added more coverage on in place graph tests
2023-12-21 08:58:33 -07:00
akolliasAMD
d8dc282eeb
AllReduceTests, fixed the number of roots ( #925 )
2023-10-20 10:25:11 -06:00
Bertan Dogancay
c1f57a7041
Modify All-To-All doc ( #896 )
...
* Modify All-To-All doc
* Update nccl.h.in
* update unit-tests
---------
Co-authored-by: gilbertlee-amd <44450918+gilbertlee-amd@users.noreply.github.com>
2023-09-27 12:45:21 -04:00
Bertan Dogancay
0a01dc2f19
Add 0-byte test for send/recv ( #865 )
2023-08-29 09:14:18 -06:00
Bertan Dogancay
9d11cd092f
Add ncclCommSplit test ( #852 )
...
Add ncclSplitCommTest
2023-08-25 16:26:45 -06:00
gilbertlee-amd
a5a25bdff7
Removing unnecessary chrpath check for unit tests ( #811 )
2023-07-20 10:28:04 -06:00
Wenkai Du
ce6a2ffac8
Merge pull request #782 from ROCmSoftwarePlatform/2.18.3
...
Sync up with NCCL 2.18.3
2023-06-29 15:04:16 -07:00
akolliasAMD
cf8cfa88a8
Re-enabled graph tests ( #736 )
...
* enabled graph tests
* joined multi and single process CI testing
2023-06-29 08:08:17 -06:00
gilbertlee-amd
f7c553edad
Report unit test environment variable values as part of output ( #789 )
2023-06-29 07:13:05 -06:00
Wenkai Du
abd0615351
Merge remote-tracking branch 'nccl/master' into develop
2023-06-26 22:51:56 +00:00
Pedram Alizadeh
520f15e61b
resolving the pthread-gtest linking issue for rccl-UnitTests ( #768 )
2023-06-06 14:21:40 -04:00
gilbertlee-amd
777d8747a5
Refactoring CMakeFiles ( #755 )
2023-05-25 16:08:54 -06:00
Pedram Alizadeh
53c1c38f0e
Disabled hipgraph tests! ( #725 )
2023-04-13 17:42:05 -04:00
akolliasAMD
2ce7d971e5
lessened the amount of child processes to active ones ( #720 )
2023-04-11 08:59:56 -06:00
gilbertlee-amd
27e0cb43c2
Unit test performance refactor ( #700 )
...
* Refactoring unit tests to improve performance
* Spawning child processes during InitComms instead of on TestBed construction
* Temporarily disabling graph unit tests
2023-04-06 12:28:53 -06:00
gilbertlee-amd
00c3d8d850
Adding interactive mode for unit tests (UT_INTERACTIVE) ( #715 )
2023-03-21 10:58:24 -06:00
akolliasAMD
9a0d4a07a6
Test Fixes ( #710 )
...
* splitting CI tests in running SP first and MP second
* set device before hipStreamSynchronize on tests
2023-03-21 08:48:39 -06:00
Ziyue Yang
e3b2342f39
MSCCL: Improve executor and integrate scheduler ( #694 )
...
* MSCCL: improve executor and add scheduler for testing
* Use external scheduler
* Fix cmake error
* Address comments
* Fix thread safe issue
* Make MSCCL lifecycle APIs thread safe
* Make MSCCL internal scheduler aware of topology hint
* Revise error message
2023-03-14 14:34:25 -07:00
gilbertlee-amd
80ed608a9d
Multi stream unit test ( #693 )
...
* Adding multi-stream support to unit tests
2023-02-23 13:28:50 -07:00
gilbertlee-amd
f63d3b1978
Adding UnitTest timing summary (UT_SHOW_TIMING) ( #692 )
2023-02-22 08:57:13 -07:00
akolliasAMD
d119c0886e
UnitTests: made reduceScatter run a smaller amount of tests ( #691 )
2023-02-21 16:21:24 -07:00
gilbertlee-amd
a640c6983f
Unit test fail check ( #689 )
...
* Adding fall-through on unit test failure
* Workaround for hipGraph validity check issue
2023-02-18 08:50:46 -08:00
gilbertlee-amd
df46645ff8
Switching to relaxed capture for unit tests ( #679 )
2023-02-08 11:28:58 -07:00
Pedram Alizadeh
fddb5e6be8
UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) ( #674 )
2023-02-03 17:36:30 -05:00
Pedram Alizadeh
fbe52b6caa
removed the wrapper script so that the old name is no longer referenced ( #676 )
2023-01-31 11:11:02 -05:00
akolliasAMD
24aa8bd802
added a different way for getting device count, by running it in a child process ( #665 )
2022-12-14 16:10:14 -07:00
Pedram Alizadeh
54a3da04eb
Revert "UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) ( #662 )" ( #666 )
...
This reverts commit 8250092367.
2022-12-14 11:28:40 -05:00
PedramAlizadeh
45872d170f
Changed the name of UnitTests to rccl-UnitTests (wrapper executable included).
2022-12-13 21:45:57 +00:00
Pedram Alizadeh
8250092367
UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) ( #662 )
2022-12-13 16:05:09 -05:00
Ziyue Yang
adafc0f759
Add MSCCL Support ( #658 )
...
* Add MSCCL support
* Add alignment and message size checking
* Fix nRanks checking, in-place and out-of-place tests and group call handling
* Fix hipGraph unit test
* Change MSCCL init warning to INFO
* Revise license info
2022-12-12 15:51:04 -08:00
gilbertlee-amd
faed69f9fc
Graph unit tests ( #656 )
...
* Adding hipGraph unit tests
2022-12-01 10:28:42 -07:00
Ranjith Ramakrishnan
b397cb16ea
Correct hsa header path for new directory layout
2022-11-04 09:52:16 -07:00
raramakr
b32f38126d
Merge pull request #635 from raramakr/swdev
...
Correct include and library path for new directory layout
2022-10-14 15:48:44 -07:00
gilbertlee-amd
ebb8b5bf63
Updating files for missing licenses ( #637 )
2022-10-14 13:49:16 -06:00
Ranjith Ramakrishnan
cf4e963aaf
Correct include and library path for new directory layout
...
Use actual header files and libraries, rather than using wrapper header files and library softlinks
2022-10-14 01:32:04 -07:00
akolliasAMD
06bce9d0c9
added stream synch after hipMemset ( #609 )
2022-08-30 16:18:37 -06:00
Edgar Gabriel
f6e00dec13
introduce support for ncclFloat16/half in UT
2022-08-24 15:28:24 +00:00
gilbertlee-amd
dae11c2aca
Disable clique AllReduce UnitTest ( #595 )
2022-08-04 18:30:00 -06:00
akolliasAMD
686dbc8bc6
updated alltoallV test to reflect how send counts are done in perf tests ( #586 )
2022-07-21 14:59:34 -06:00
akolliasAMD
8b9291eb47
moved default number of max ranks per gpu to 1
2022-06-22 17:37:49 +00:00
Edgar
a87d61db2b
extending the unit-tests for multi-rank support
2022-06-10 14:23:19 +00:00
gilbertlee-amd
700b473211
Moving opt-in custom signal handler from UnitTests into RCCL ( #550 )
...
* Enable via RCCL_ENABLE_SIGNALHANDLER=1
2022-05-20 09:56:38 -06:00
Edgar
2bf6d254b6
add a signal handler and backtrace
...
Tweak the signal handler and force non-release build
Increase ulimit locked memory value
Update the signal handler to use bfd symbol resolution.
Include configure logic to find bfd functions.
Optionally add C++ function name demangling
2022-04-25 10:48:17 -04:00
Liam Wrubleski
a8f1e61f48
Packages for test and benchmark executables on all supported OSes using CPack. ( #512 )
2022-03-21 15:04:14 -06:00
akolliasAMD
65ea3d80db
Added alltoallv test and optional args variable on collective args ( #514 )
...
* Added alltoallv test and optional args variable on collective args
2022-03-18 13:55:11 -04:00
Nirmal Unnikrishnan
676a4737c1
File reorganization as per the new defined standard
...
The header files will be in /opt/rocm-xxx/include/rccl
Libraries and cmake will be in /opt/rocm-xxx/lib folder.
Added wrappers for header files using rocm-cmake functions for backward compatibility.
2022-03-08 17:32:02 +00:00
gilbertlee-amd
0687940b84
Changing initialization method for UnitTests ( #510 )
2022-03-07 09:22:55 -07:00
gilbertlee-amd
699dc30f05
[UnitTests] Check process mask for custom tests ( #507 )
2022-03-02 17:24:14 -07:00
akolliasAMD
ff54e79799
Added Unit test for nccl send recv ( #506 )
...
Added Send Receive test that tests through all pairs
2022-03-02 15:50:16 -05:00
gilbertlee-amd
29ad0f5fbe
Unit test refactor ( #500 )
...
Refactoring and consolidating single-process / multi-process unit testing
2022-02-25 08:59:07 -07:00