Wenkai Du
5ee7a08994
Warm up both out-of-place and in-place collectives (#51)
2023-10-16 12:13:50 -07:00
arvindcheru
a6593375bc
Update Makefile - HIPCC Path Updated to latest (#45)
2023-08-04 19:33:39 -04:00
Edgar Gabriel
591ec7777b
Merge pull request #43 from edgargabriel/topic/sles-mpi-autodetect
...
search SLES install paths for MPI
2023-07-25 10:37:40 -05:00
Edgar Gabriel
6048078be2
search SLES install paths for MPI
2023-07-24 12:02:44 -07:00
Wenkai Du
fcd0888d53
Remove hardcoded number of GPUs limit for alltoallv (#41)
2023-06-18 18:07:29 -07:00
Wenkai Du
6f6c7f8cdd
Merge pull request #40 from ROCmSoftwarePlatform/fix_merge
...
Fix merge error
2023-06-15 07:35:35 -07:00
Wenkai Du
652a24d38d
Fix merge error
2023-06-14 20:26:33 +00:00
Wenkai Du
8ca93a6ddd
Merge pull request #39 from ROCmSoftwarePlatform/develop_merge
...
Merge with latest nccl-tests
2023-06-14 11:29:09 -07:00
Wenkai Du
bb0f15d407
Merge remote-tracking branch 'nccl/master' into develop
2023-06-14 08:21:02 -07:00
Wenkai Du
d5201418a9
Merge pull request #38 from ROCmSoftwarePlatform/develop_merge
...
Merge master branch into develop
2023-06-14 08:12:30 -07:00
Wenkai Du
469225bcaf
Merge remote-tracking branch 'origin/master' into develop
2023-06-14 08:01:50 -07:00
Sylvain Jeaugey
1a5f551ffd
Merge pull request #146 from yangxingwu/master
...
makefile: remove extra space
2023-06-06 11:58:24 +02:00
yangxingwu
52ea1b2148
makefile: remove extra space
2023-06-06 09:47:50 +00:00
Pedram Alizadeh
d16d1fb16b
fixing the error message for mpirun when number of requested GPUs exceeds the limits (#37)
2023-04-27 14:06:17 -04:00
Pedram Alizadeh
e856fa720f
Revert "fixing the error message for mpirun when number of requested GPUs exceeds the limits (#33)" (#36)
...
This reverts commit e146460810.
2023-04-25 13:44:43 -04:00
Pedram Alizadeh
e146460810
fixing the error message for mpirun when number of requested GPUs exceeds the limits (#33)
2023-04-03 11:37:13 -04:00
Sylvain Jeaugey
e98ef24bc0
Merge pull request #135 from aavbsouza/fix_nvcc_variable_handling
...
fix handling of variable NVCC.
2023-03-27 11:14:10 +02:00
alan.souza
7ccda3c97b
fix handling of variable NVCC. Permit overriding the variable using environment variables
2023-03-25 16:56:16 -03:00
David Addison
e76e36e9a9
Merge pull request #134 from flx42/patch-1
...
Update README.md to fix -i default increment value.
2023-03-23 09:53:15 -07:00
Felix Abecassis
17d0a42d5a
Update README.md
2023-03-23 09:05:41 -07:00
Edgar Gabriel
0fc25d5b61
Merge pull request #32 from edgargabriel/topic/mpi-auto-compile
...
revamp cmake MPI detection
2023-03-03 07:55:52 -06:00
Edgar Gabriel
bdf58b1656
revamp cmake MPI detection
...
we honor user-requested MPI installations via MPI_PATH first,
then check for MPICH and Open MPI in the default
Ubuntu and RHEL installation directories.
2023-03-02 19:40:13 +00:00
Pedram Alizadeh
255750b094
Adding -pthread flag for linking issues into CMakeLists.txt and src/Makefile (#31)
2023-03-02 11:05:25 -05:00
Pedram Alizadeh
5275aa5715
Adding -pthread flag for linking issues into src/Makefile (#30)
...
* Adding -pthread flag for linking issues into src/Makefile
* Adding -pthread flag for linking issues into CMakeLists.txt
2023-02-24 21:39:04 -05:00
Edgar Gabriel
453e72972b
Merge pull request #28 from edgargabriel/topic/mpi-auto-compile
...
auto-detect and enable MPI
2023-02-23 12:35:59 -06:00
Edgar Gabriel
2b2f23f42d
auto-detect and enable MPI
2023-02-23 18:27:08 +00:00
Sylvain Jeaugey
2cbb968101
Update README.md
...
Improve MPI example to avoid confusion of number of processes / total number of GPUs.
https://github.com/NVIDIA/nccl-tests/issues/54#issuecomment-1212023369
2023-01-03 08:47:43 +01:00
David Addison
0b4c4cb99f
Add boot_id to the hostname hash due to collisions on Azure
...
Fixes #60
2022-12-12 01:16:46 -08:00
Jithin Jose
0aeba157db
Use DJB2a hash algorithm in getHostHash()
2022-12-12 01:16:38 -08:00
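For context on the hash change above, here is a minimal sketch of the DJB2a string hash that the commit title names. The helper name `djb2a`, the `std::string` parameter, and the 64-bit result width are assumptions made for illustration; the actual body of `getHostHash()` is not shown in this log.

```cpp
#include <cstdint>
#include <string>

// Illustrative DJB2a hash (hypothetical helper, not the repository's code).
// Starts from 5381 and, for each byte, computes hash = (hash * 33) XOR byte.
std::uint64_t djb2a(const std::string& s) {
    std::uint64_t hash = 5381;
    for (unsigned char c : s) {
        hash = ((hash << 5) + hash) ^ c;  // (hash << 5) + hash == hash * 33
    }
    return hash;
}
```

Under this sketch, appending a per-boot identifier to the hostname before hashing, as the later boot_id commit describes, would make two machines with the same hostname produce different inputs and therefore, most likely, different hashes.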
Edgar Gabriel
b3f0716190
Merge pull request #27 from edgargabriel/topic/half_prod_fix
...
fix algorithm assigning values in testsuite
2022-12-01 08:11:52 -06:00
Edgar Gabriel
e9f5be184c
fix algorithm assigning values in testsuite
...
avoid a division by zero which seems to only occur for op=prod and
datatype=half, since the maximum exponent is small (15) and can exceed
the number of ranks.
2022-11-30 23:01:46 +00:00
David Addison
24fcf64ed1
Call cudaFreeHost() on wrongPerGpu not cudaFree()
2022-11-22 11:18:37 -08:00
David Addison
3bd2bd292b
Add fflush(stdout) before perf output
2022-11-22 11:16:47 -08:00
akolliasAMD
9d3a53dfa3
added std::max to avoid buffer overflow in printing (#25)
2022-11-01 11:34:55 -06:00
Edgar Gabriel
a8c920ca7a
Merge pull request #24 from edgargabriel/pr/cmake-fix
...
make cmake stage also pass in CI
2022-11-01 09:39:22 -05:00
Edgar Gabriel
377b28e5fb
make cmake stage also pass in CI
...
the subdir entry is not actually required for the compilation.
2022-10-31 22:07:15 +00:00
Edgar Gabriel
a80fbba12b
Merge pull request #23 from edgargabriel/pr/link-fix
...
add the rccl/lib directory to the link path
2022-10-31 15:54:55 -05:00
Edgar Gabriel
9c9746739a
add the rccl/lib directory to the link path
2022-10-31 19:01:22 +00:00
Edgar Gabriel
fb0d339c1b
Merge pull request #22 from edgargabriel/pr/compile-fix
...
fix a missing endif statement
2022-10-25 12:19:25 -05:00
Edgar Gabriel
8a754f15ad
fix a missing endif statement
...
error introduced with the web merge-resolution tool :-(
2022-10-25 16:31:57 +00:00
Edgar Gabriel
84e8be8e65
Merge pull request #21 from ROCmSoftwarePlatform/topic/v2.13.4-sync
...
Topic/v2.13.4 sync
2022-10-21 17:17:27 -05:00
Edgar Gabriel
4d7cd871c1
Merge branch 'develop' into topic/v2.13.4-sync
2022-10-21 17:12:45 -05:00
Wenkai Du
9a89c300b6
Allow more precise measurements of single operation (#20)
2022-10-21 22:07:41 +00:00
Edgar Gabriel
641e93e99c
make rccl-test compile again.
...
all files compile now.
mpi tests also pass
2022-10-21 22:07:33 +00:00
Edgar Gabriel
3ae371cce7
Merge remote-tracking branch 'nccl-tests/master' into topic/v2.13.4-sync
2022-10-14 16:02:54 -05:00
Wenkai Du
d22281cb3f
Allow more precise measurements of single operation (#20)
2022-10-12 17:28:04 -07:00
Sylvain Jeaugey
365b92a1ea
Fix build on RHEL7 with GCC 4.8
...
Add -std=c++11 to CXXFLAGS.
Fixes #116.
2022-10-12 01:24:14 -07:00
akolliasAMD
3fbd3280ce
removed hypercube from Makefile (#19)
2022-09-29 15:36:39 -06:00
Sylvain Jeaugey
d313d20a26
Update NCCL tests
2022-09-23 01:13:29 -07:00
David Addison
749573f2d6
Fix preprocessor version check for ncclGetLastError()
...
ncclGetLastError() was added in NCCL 2.13.0
2022-09-07 16:10:41 -07:00