mberenjk
eb65dadfc5
replacing rccl_bfloat16 with hip_bfloat16 (#70)
...
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com>
2024-04-23 17:00:20 -05:00
Nilesh M Negi
990f88cbaa
Amend use of CUSTOM_RCCL_LIB to avoid build error (#71)
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>
2024-04-12 12:01:32 -05:00
mberenjk
3f7f7859bf
adding git version to rccl-tests (#69)
...
Co-authored-by: mberenjk <mberenjk@amd.com>
2024-03-28 14:03:59 -05:00
akolliasAMD
91609be0ef
Revert "adding git version to rccl-test (#66)"
...
This reverts commit a31679775c.
2024-03-22 10:21:37 -06:00
mberenjk
a31679775c
adding git version to rccl-test (#66)
...
* adding git version to rccl-test
---------
Co-authored-by: mberenjk <mberenjk@banff-cyxtera-s74-2.ctr.dcgpu>
2024-03-20 10:04:12 -05:00
Andy li
e447c17382
update the fp8 header file name (#65)
...
* update the fp8 header name
2024-03-08 10:02:40 -08:00
Andy li
21e59fb283
Enable fp8 support (#63)
...
* initial checkin
* rename the fp8 datatype name
* update based on cr comments
* resolve the build issue
* resolve fp8 compatibility issue
* fix minor bug and catch up to reflect latest develop branch change
* add fp8 + operator support
* update fp8 header file
* resolve merge issue from develop branch
2024-03-07 16:54:41 -08:00
Bertan Dogancay
88cf7dbf45
Add hipify steps prior to build (#62)
...
* Add hipify steps prior to build
2024-03-05 09:47:18 -07:00
Wenkai Du
621dde544d
Merge remote-tracking branch 'nccl-tests/master' into HEAD
2024-03-01 18:34:44 +00:00
Wenkai Du
7715a0cf1f
Fix typo in rank assignment (#59)
2024-02-15 12:04:38 -08:00
David Addison
c6afef0b6f
Added missing MPI_Comm_free() call before MPI_Finalize()
2024-02-05 08:53:54 -08:00
Nusrat Islam
a2bec5d2f6
Add option to disable out-of-place
2024-01-04 16:43:50 -06:00
Lauren Wrubleski
e1a816b869
Offload arch linking (#54)
...
* Update CMakeLists.txt
* Update CMakeLists.txt
* Link rccl_common object against hip::device
Previously the tests were compiled with `--amdgpu-target` to compile for multiple architectures. Because rccl_common was not compiled against those architectures, this didn't work. Linking it against hip::device automatically links against all architectures in `AMDGPU_TARGETS`, and the same holds for the test executables.
2023-12-05 19:20:46 -06:00
Wenkai Du
5ee7a08994
Warm up both out-of-place and in-place collectives (#51)
2023-10-16 12:13:50 -07:00
David Addison
1292b25553
Added an MPI_Barrier() call after MPI_Bcast() for HCOLL issue
2023-10-12 16:53:32 -07:00
David Addison
6c46206a47
Make the -c option be a datacheck iteration count parameter
...
Default is 1
2023-09-13 14:03:38 -07:00
arvindcheru
a6593375bc
Update Makefile - HIPCC Path Updated to latest (#45)
2023-08-04 19:33:39 -04:00
Wenkai Du
fcd0888d53
Remove hardcoded number of GPUs limit for alltoallv (#41)
2023-06-18 18:07:29 -07:00
Wenkai Du
652a24d38d
Fix merge error
2023-06-14 20:26:33 +00:00
Wenkai Du
bb0f15d407
Merge remote-tracking branch 'nccl/master' into develop
2023-06-14 08:21:02 -07:00
Wenkai Du
469225bcaf
Merge remote-tracking branch 'origin/master' into develop
2023-06-14 08:01:50 -07:00
Pedram Alizadeh
d16d1fb16b
fixing the error message for mpirun when number of requested GPUs exceeds the limits (#37)
2023-04-27 14:06:17 -04:00
Pedram Alizadeh
e856fa720f
Revert "fixing the error message for mpirun when number of requested GPUs exceeds the limits (#33)" (#36)
...
This reverts commit e146460810.
2023-04-25 13:44:43 -04:00
Pedram Alizadeh
e146460810
fixing the error message for mpirun when number of requested GPUs exceeds the limits (#33)
2023-04-03 11:37:13 -04:00
alan.souza
7ccda3c97b
fix handling of variable NVCC. Permit overriding the variable using environment variables
2023-03-25 16:56:16 -03:00
Pedram Alizadeh
255750b094
Adding -pthread flag for linking issues into CMakeLists.txt and src/Makefile (#31)
2023-03-02 11:05:25 -05:00
Pedram Alizadeh
5275aa5715
Adding -pthread flag for linking issues into src/Makefile (#30)
...
* Adding -pthread flag for linking issues into src/Makefile
* Adding -pthread flag for linking issues into CMakeLists.txt
2023-02-24 21:39:04 -05:00
David Addison
0b4c4cb99f
Add boot_id to the hostname hash due to collisions on Azure
...
Fixes #60
2022-12-12 01:16:46 -08:00
Jithin Jose
0aeba157db
Use DJB2a hash algorithm in getHostHash()
2022-12-12 01:16:38 -08:00
David Addison
24fcf64ed1
Call cudaFreeHost() on wrongPerGpu not cudaFree()
2022-11-22 11:18:37 -08:00
David Addison
3bd2bd292b
Add fflush(stdout) before perf output
2022-11-22 11:16:47 -08:00
akolliasAMD
9d3a53dfa3
added std::max to avoid buffer overflow in printing (#25)
2022-11-01 11:34:55 -06:00
Edgar Gabriel
377b28e5fb
make cmake stage also pass in CI
...
the subdir entry is not actually required for the compilation.
2022-10-31 22:07:15 +00:00
Edgar Gabriel
9c9746739a
add the rccl/lib directory to the link path
2022-10-31 19:01:22 +00:00
Edgar Gabriel
8a754f15ad
fix a missing endif statement
...
error introduced with the web merger-resolution tool :-(
2022-10-25 16:31:57 +00:00
Edgar Gabriel
4d7cd871c1
Merge branch 'develop' into topic/v2.13.4-sync
2022-10-21 17:12:45 -05:00
Wenkai Du
9a89c300b6
Allow more precise measurements of single operation (#20)
2022-10-21 22:07:41 +00:00
Edgar Gabriel
641e93e99c
make rccl-test compile again.
...
all files compile now.
mpi tests also pass
2022-10-21 22:07:33 +00:00
Edgar Gabriel
3ae371cce7
Merge remote-tracking branch 'nccl-tests/master' into topic/v2.13.4-sync
2022-10-14 16:02:54 -05:00
Wenkai Du
d22281cb3f
Allow more precise measurements of single operation (#20)
2022-10-12 17:28:04 -07:00
Sylvain Jeaugey
365b92a1ea
Fix build on RHEL7 with GCC 4.8
...
Add -std=c++11 to CXXFLAGS.
Fixes #116.
2022-10-12 01:24:14 -07:00
akolliasAMD
3fbd3280ce
removed hypercube from Makefile (#19)
2022-09-29 15:36:39 -06:00
Sylvain Jeaugey
d313d20a26
Update NCCL tests
2022-09-23 01:13:29 -07:00
David Addison
749573f2d6
Fix preprocessor version check for ncclGetLastError()
...
ncclGetLastError() was added in NCCL 2.13.0
2022-09-07 16:10:41 -07:00
David Addison
afa4c56b6a
Fix an issue with the last commit when data checking is disabled
2022-09-07 11:23:49 -07:00
David Addison
a0a14911ee
Display N/A for error count in AlltoAll in-place test
...
AlltoAll does not support in-place buffers
2022-09-06 13:17:15 -07:00
John Bachan
51af5572bf
Resync with NCCL 2.13
...
* Added "verifiable", a suite of kernels for generating and verifying reduction
input and output arrays in a bit-precise way.
* Data corruption errors now reported in number of wrong elements instead of max
deviation.
* Use ncclGetLastError.
* Don't run hypercube on non-powers of 2 ranks.
* Fix to hypercube data verification.
* Use "thread local" as the default CUDA capture mode.
* Replaced pthread_yield -> sched_yield()
* Bugfix to the cpu-side barrier/allreduce implementations.
2022-08-22 17:51:06 -07:00
Wenkai Du
45ec598ac4
Fix typo from previous merge
2022-08-12 14:42:17 +00:00
gilbertlee-amd
f6f3c44a7a
Enabling hipGraph codepath for future support (#18)
2022-08-09 16:45:27 -06:00
Wenkai Du
9025051bbb
Fix missing error checking for AllocateBuffs due to merge (#17)
2022-08-09 11:04:38 -07:00