Wenkai Du
5ee7a08994
Warm up both out-of-place and in-place collectives (#51)
2023-10-16 12:13:50 -07:00
Wenkai Du
652a24d38d
Fix merge error
2023-06-14 20:26:33 +00:00
Wenkai Du
bb0f15d407
Merge remote-tracking branch 'nccl/master' into develop
2023-06-14 08:21:02 -07:00
Wenkai Du
469225bcaf
Merge remote-tracking branch 'origin/master' into develop
2023-06-14 08:01:50 -07:00
Pedram Alizadeh
d16d1fb16b
fixing the error message for mpirun when number of requested GPUs exceeds the limits (#37)
2023-04-27 14:06:17 -04:00
Pedram Alizadeh
e856fa720f
Revert "fixing the error message for mpirun when number of requested GPUs exceeds the limits (#33)" (#36)
This reverts commit e146460810.
2023-04-25 13:44:43 -04:00
Pedram Alizadeh
e146460810
fixing the error message for mpirun when number of requested GPUs exceeds the limits (#33)
2023-04-03 11:37:13 -04:00
David Addison
24fcf64ed1
Call cudaFreeHost() on wrongPerGpu not cudaFree()
2022-11-22 11:18:37 -08:00
David Addison
3bd2bd292b
Add fflush(stdout) before perf output
2022-11-22 11:16:47 -08:00
akolliasAMD
9d3a53dfa3
added std::max to avoid buffer overflow in printing (#25)
2022-11-01 11:34:55 -06:00
Edgar Gabriel
8a754f15ad
fix a missing endif statement
error introduced with the web merge-resolution tool :-(
2022-10-25 16:31:57 +00:00
Edgar Gabriel
4d7cd871c1
Merge branch 'develop' into topic/v2.13.4-sync
2022-10-21 17:12:45 -05:00
Wenkai Du
9a89c300b6
Allow more precise measurements of single operation (#20)
2022-10-21 22:07:41 +00:00
Edgar Gabriel
641e93e99c
make rccl-test compile again.
all files compile now.
mpi tests also pass
2022-10-21 22:07:33 +00:00
Edgar Gabriel
3ae371cce7
Merge remote-tracking branch 'nccl-tests/master' into topic/v2.13.4-sync
2022-10-14 16:02:54 -05:00
Wenkai Du
d22281cb3f
Allow more precise measurements of single operation (#20)
2022-10-12 17:28:04 -07:00
Sylvain Jeaugey
d313d20a26
Update NCCL tests
2022-09-23 01:13:29 -07:00
David Addison
afa4c56b6a
Fix an issue with the last commit when data checking is disabled
2022-09-07 11:23:49 -07:00
David Addison
a0a14911ee
Display N/A for error count in AlltoAll in-place test
AlltoAll does not support in-place buffers
2022-09-06 13:17:15 -07:00
John Bachan
51af5572bf
Resync with NCCL 2.13
* Added "verifiable", a suite of kernels for generating and verifying reduction
input and output arrays in a bit-precise way.
* Data corruption errors now reported in number of wrong elements instead of max
deviation.
* Use ncclGetLastError.
* Don't run hypercube on non-power-of-2 rank counts.
* Fix to hypercube data verification.
* Use "thread local" as the default CUDA capture mode.
* Replaced pthread_yield() with sched_yield().
* Bugfix to the cpu-side barrier/allreduce implementations.
2022-08-22 17:51:06 -07:00
Wenkai Du
45ec598ac4
Fix typo from previous merge
2022-08-12 14:42:17 +00:00
gilbertlee-amd
f6f3c44a7a
Enabling hipGraph codepath for future support (#18)
2022-08-09 16:45:27 -06:00
Wenkai Du
9025051bbb
Fix missing error checking for AllocateBuffs due to merge (#17)
2022-08-09 11:04:38 -07:00
Edgar
0500f2f132
implementation of multi-rank support in rccl-tests.
2022-06-10 14:54:10 -04:00
Edgar
5cd2374edb
create branch up-to-date with rccl-test
2022-06-10 12:41:56 -04:00
Wenkai Du
6156759a40
Print GPU's full PCI bus ID
2022-04-06 16:46:17 +00:00
Wenkai Du
602b745ff4
Add missing hipStreamDestroy at test exit
2021-11-16 07:50:18 -08:00
Wenkai Du
8b35847d36
Use rccl_bfloat16 class
2021-09-23 16:39:11 -07:00
Wenkai Du
dc1ad4853d
Fix divide by zero error
2021-09-22 08:43:01 -07:00
Wenkai Du
213abee002
Merge remote-tracking branch 'nccl/master' into develop
2021-09-20 14:01:22 -07:00
David Addison
f773748b46
Resync with NCCL 2.11
New operator: mulsum
New test: gather
2021-09-17 09:02:45 -07:00
Wenkai Du
2d9be62621
Merge remote-tracking branch 'nccl/master'
2021-07-15 13:54:43 -07:00
David Addison
1f8f541686
Add CUDA graph support only for CUDA 11.3 and later builds
Fixes #90
2021-07-13 10:47:47 -07:00
Wenkai Du
9f8ddadcdf
Merge remote-tracking branch 'nccl/master' into develop
2021-07-13 08:11:44 -07:00
David Addison
b9f90d12a9
Removed MPI_SUPPORT conditional compilation of average flag
2021-07-12 11:43:57 -07:00
David Addison
547e119d35
Fix issues with MPI_Allreduce and multi-threaded tests
2021-07-08 16:42:40 -07:00
David Addison
f476f4a17a
Merge branch 'bfloat16'
2021-07-06 10:20:32 -07:00
David Addison
1dfc76eccc
Added new option to report average iteration time
2021-06-30 19:36:07 -07:00
David Addison
1ae8cdc315
Resync with changes in gitlab-master code
2021-06-30 13:16:04 -07:00
David Addison
e55ad3796d
Added support for CUDA graph capture/replay (-G)
2021-06-28 14:19:45 -07:00
David Addison
cde7e769c1
Add support for ncclAvg operation
2021-06-28 09:41:58 -07:00
Greg Inozemtsev
c4de829d91
Cleanup argument error handling and messages
Add error checking for minbytes and maxbytes arguments
Also accept lowercase literals when parsing size arguments and print errors and usage on stderr.
2021-06-04 21:47:40 +00:00
David Addison
e37545e491
Add support for new datatype: bfloat16
2021-03-15 17:13:35 -07:00
Wenkai Du
39086cdc0a
Revert "Allow call ncclCommAbort on Ctrl+C"
This reverts commit 23c374475f.
2021-02-03 21:16:18 -05:00
David Addison
7677f3f608
Do not allocate memory for expected buffer if checking disabled
This allows the tests to be run with larger buffers
2021-01-20 17:08:40 -08:00
Wenkai Du
e5f1482efb
Add test code that can print info and reset input/output buffers
2021-01-04 16:51:16 -05:00
Wenkai Du
3117033150
Add support for testing memory allocated with hipMallocManaged
2020-12-15 22:05:50 -05:00
Wenkai Du
58dcd35af2
Add alltoallv test
2020-09-01 18:55:25 +00:00
Wenkai Du
3d63a84d97
Add cumask option
2020-08-21 21:34:55 +00:00
Wenkai Du
5361dd8177
Merge remote-tracking branch 'nccl/master' into HEAD
2020-07-06 17:54:31 +00:00