Commit Graph

59 Commits

Author SHA1 Message Date
yangxingwu cba8bfd093 makefile: remove extra space
[ROCm/rccl-tests commit: 52ea1b2148]
2023-06-06 09:47:50 +00:00
alan.souza 4fd5ceeba8 fix handling of variable NVCC. Permit overriding the variable using environment variables
[ROCm/rccl-tests commit: 7ccda3c97b]
2023-03-25 16:56:16 -03:00
Felix Abecassis b3db782c3f Update README.md
[ROCm/rccl-tests commit: 17d0a42d5a]
2023-03-23 09:05:41 -07:00
Sylvain Jeaugey b70cac2b33 Update README.md
Improve MPI example to avoid confusion of number of processes / total number of GPUs.

https://github.com/NVIDIA/nccl-tests/issues/54#issuecomment-1212023369

[ROCm/rccl-tests commit: 2cbb968101]
2023-01-03 08:47:43 +01:00
David Addison 129a1b4b78 Add boot_id to the hostname hash due to collisions on Azure
Fixes #60


[ROCm/rccl-tests commit: 0b4c4cb99f]
2022-12-12 01:16:46 -08:00
Jithin Jose 5ba670d551 Use DJB2a hash algorithm in getHostHash()
[ROCm/rccl-tests commit: 0aeba157db]
2022-12-12 01:16:38 -08:00
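The two entries above concern host hashing. As a minimal sketch of the DJB2a scheme named in the second one (start from 5381, then multiply by 33 and XOR in each byte), assuming a hypothetical helper `djb2aHash` rather than the repository's actual `getHostHash()`:

```cpp
#include <cstdint>
#include <string>

// Illustrative only: the name and signature are assumptions, not the
// repository's getHostHash(). DJB2a: hash = (hash * 33) XOR byte.
std::uint64_t djb2aHash(const std::string& text) {
  std::uint64_t hash = 5381;
  for (unsigned char c : text) {
    hash = ((hash << 5) + hash) ^ c;  // (hash * 33) ^ c
  }
  return hash;
}
```

Under this sketch, appending a per-boot identifier to the hostname before hashing would be one way to separate machines that share a hostname, which appears to be the intent of the boot_id commit.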
David Addison 6313530fcc Call cudaFreeHost() on wrongPerGpu not cudaFree()
[ROCm/rccl-tests commit: 24fcf64ed1]
2022-11-22 11:18:37 -08:00
David Addison 04b5c40b1c Add fflush(stdout) before perf output
[ROCm/rccl-tests commit: 3bd2bd292b]
2022-11-22 11:16:47 -08:00
Sylvain Jeaugey c0e3f4d443 Fix build on RHEL7 with GCC 4.8
Add -std=c++11 to CXXFLAGS.
Fixes #116.


[ROCm/rccl-tests commit: 365b92a1ea]
2022-10-12 01:24:14 -07:00
Sylvain Jeaugey fdaa88710b Update NCCL tests
[ROCm/rccl-tests commit: d313d20a26]
2022-09-23 01:13:29 -07:00
David Addison 35ee4ec3eb Fix preprocessor version check for ncclGetLastError()
ncclGetLastError() was added in NCCL 2.13.0


[ROCm/rccl-tests commit: 749573f2d6]
2022-09-07 16:10:41 -07:00
David Addison a43863e1a7 Fix an issue with the last commit when data checking is disabled
[ROCm/rccl-tests commit: afa4c56b6a]
2022-09-07 11:23:49 -07:00
David Addison 59ed17798f Display N/A for error count in AlltoAll in-place test
AlltoAll does not support in-place buffers


[ROCm/rccl-tests commit: a0a14911ee]
2022-09-06 13:17:15 -07:00
John Bachan 70b6c0f5e5 Changed top-level Makefile behavior so that BUILDDIR is interpreted
as relative to the top-level directory. This is done by abspath'ing it before
passing it to the subdirectory Makefiles.

The old behavior had two cases: with and without BUILDDIR being set by
the user. With BUILDDIR not set, the build dir would be named "build"
in the top-level directory. If BUILDDIR was set, then the build dir
would be placed at "src/${BUILDDIR}".

The new behavior is simpler: if BUILDDIR is not set then it defaults
to "build", and the directory holding the final build is always at just
"${BUILDDIR}" in the top level.


[ROCm/rccl-tests commit: bc5f7cfb0a]
2022-08-23 10:08:49 -07:00
John Bachan b5d746b58e Resync with NCCL 2.13
* Added "verifiable", a suite of kernels for generating and verifying reduction
  input and output arrays in a bit-precise way.
* Data corruption errors now reported in number of wrong elements instead of max
  deviation.
* Use ncclGetLastError.
* Don't run hypercube on non-powers of 2 ranks.
* Fix to hypercube data verification.
* Use "thread local" as the default CUDA capture mode.
* Replaced pthread_yield -> sched_yield()
* Bugfix to the cpu-side barrier/allreduce implementations.


[ROCm/rccl-tests commit: 51af5572bf]
2022-08-22 17:51:06 -07:00
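The hypercube item in the entry above depends on telling whether the rank count is a power of two. A minimal sketch of such a check, assuming a hypothetical helper `isPowerOfTwo` (the repository's own test may be written differently):

```cpp
// Illustrative helper, not taken from the repository.
// True when nranks is positive and has exactly one bit set.
bool isPowerOfTwo(int nranks) {
  return nranks > 0 && (nranks & (nranks - 1)) == 0;
}
```

A test could call this once at startup and skip the hypercube run when it returns false.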
David Addison dd8563b279 Add option to statically link cudart
Build with CUDARTLIB=cudart_static to remove dynamic linkage

Also removed unused curand and nvToolsExt dependencies

BUG 95


[ROCm/rccl-tests commit: de3ddbe261]
2021-11-10 10:02:41 -08:00
David Addison ad9aac78df Add MPI_IBM build option
[ROCm/rccl-tests commit: 7130fa6096]
2021-10-25 16:30:57 -07:00
David Addison 56ff821802 Resync with NCCL 2.11
New operator: mulsum
New test: gather


[ROCm/rccl-tests commit: f773748b46]
2021-09-17 09:02:45 -07:00
David Addison f81f5baaed Add CUDA graph support only for CUDA 11.3 and later builds
Fixes #90


[ROCm/rccl-tests commit: 1f8f541686]
2021-07-13 10:47:47 -07:00
David Addison 6719794fc8 Removed MPI_SUPPORT conditional compilation of average flag
[ROCm/rccl-tests commit: b9f90d12a9]
2021-07-12 11:43:57 -07:00
David Addison d3061dc2a9 Fix issues with MPI_Allreduce and multi-threaded tests
[ROCm/rccl-tests commit: 547e119d35]
2021-07-08 16:42:40 -07:00
David Addison ea6eec9e80 Updated with new command line arguments
[ROCm/rccl-tests commit: 11cff17a04]
2021-07-06 16:27:45 -07:00
David Addison 230983c84e Merge branch 'bfloat16'
[ROCm/rccl-tests commit: f476f4a17a]
2021-07-06 10:20:32 -07:00
David Addison a23cffe28a Added new option to report average iteration time
[ROCm/rccl-tests commit: 1dfc76eccc]
2021-06-30 19:36:07 -07:00
David Addison 1044cd1f32 Resync with changes in gitlab-master code
[ROCm/rccl-tests commit: 1ae8cdc315]
2021-06-30 13:16:04 -07:00
David Addison d30e35f150 Added new tests: scatter, sendrecv, hypercube
[ROCm/rccl-tests commit: 9dae3d3a37]
2021-06-28 16:49:10 -07:00
David Addison e73e5a239b Added support for CUDA graph capture/replay (-G)
[ROCm/rccl-tests commit: e55ad3796d]
2021-06-28 14:19:45 -07:00
David Addison 20b63cf465 Fixed formatting for bfloat16 support
[ROCm/rccl-tests commit: 526eacadf7]
2021-06-28 10:12:34 -07:00
David Addison a41268e26e Add support for ncclAvg operation
[ROCm/rccl-tests commit: cde7e769c1]
2021-06-28 09:41:58 -07:00
Greg Inozemtsev 45c28c6c36 Cleanup argument error handling and messages
Add error checking for minbytes and maxbytes arguments

Also accept lowercase literals when parsing size arguments and print errors and usage on stderr.


[ROCm/rccl-tests commit: c4de829d91]
2021-06-04 21:47:40 +00:00
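The entry above describes validating size arguments and accepting lowercase suffixes. A rough sketch of that kind of parser, assuming a hypothetical function `parseSizeArg`, binary units (1K = 1024), and -1 as the error value; none of these details are confirmed by the log:

```cpp
#include <cctype>
#include <cstdlib>

// Illustrative parser, not the repository's code. Accepts "64", "64K",
// "64m", "2G" and so on; returns -1 for malformed input.
// Assumption: suffixes are powers of 1024. Overflow is not guarded.
long long parseSizeArg(const char* text) {
  if (!std::isdigit(static_cast<unsigned char>(text[0]))) return -1;
  char* end = nullptr;
  long long value = std::strtoll(text, &end, 10);
  if (end == text) return -1;
  long long unit = 1;
  switch (std::tolower(static_cast<unsigned char>(*end))) {
    case '\0': break;                       // plain byte count
    case 'k': unit = 1LL << 10; ++end; break;
    case 'm': unit = 1LL << 20; ++end; break;
    case 'g': unit = 1LL << 30; ++end; break;
    default: return -1;                     // unknown suffix
  }
  if (*end != '\0') return -1;              // trailing characters
  return value * unit;
}
```

Lowercasing the suffix before the switch is what lets "64m" and "64M" resolve to the same byte count.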
Sylvain Jeaugey 05f0ab10e6 Update PERFORMANCE.md
[ROCm/rccl-tests commit: e12c35d84b]
2021-05-27 09:12:52 -07:00
David Addison 882c60210b Add support for new datatype: bfloat16
[ROCm/rccl-tests commit: e37545e491]
2021-03-15 17:13:35 -07:00
David Addison c62bde3272 Do not allocate memory for expected buffer if checking disabled
This allows the tests to be run with larger buffers


[ROCm/rccl-tests commit: 7677f3f608]
2021-01-20 17:08:40 -08:00
David Addison 819d6ce228 Add boot_id to the hostname hash due to collisions on Azure
Fixes #60


[ROCm/rccl-tests commit: ae1ce98e69]
2021-01-04 11:38:45 -08:00
Jithin Jose f770d161f3 Use DJB2a hash algorithm in getHostHash()
[ROCm/rccl-tests commit: da67a81c8e]
2020-12-18 10:12:54 -08:00
Luke Yeager 8b83a414c5 Fix typo in src/Makefile
[ROCm/rccl-tests commit: afdaf59b3b]
2020-06-24 14:39:22 -07:00
Sylvain Jeaugey 0624d2cede Add gencode for CUDA11
[ROCm/rccl-tests commit: b2603a2e85]
2020-06-23 18:16:46 -07:00
Sylvain Jeaugey 12d86bd58f Change all_gather/reduce_scatter algbw to match the documentation.
Fix #45 : All_gather and reduce_scatter algorithm bandwidth was
computed as time/count*(nranks-1) which is not consistent with the
way we compute it for other collectives.

This change makes algbw higher; busbw is unchanged.


[ROCm/rccl-tests commit: ec1b5e22e6]
2020-06-19 10:42:19 -07:00
Sylvain Jeaugey fcaaf2c4a1 Fix #47 : compilation error on NCCL<2.7
Return an error when trying to run alltoall test when compiled
against NCCL<2.7.


[ROCm/rccl-tests commit: 07ac716c1a]
2020-06-18 15:02:51 -07:00
Sylvain Jeaugey cf70df2498 Merge pull request #46 from NVIDIA/p2p
Add alltoall perf test

[ROCm/rccl-tests commit: a7b304dde5]
2020-06-17 10:45:29 -07:00
Luke Yeager 3a6293b748 Fix some memory leaks
[ROCm/rccl-tests commit: af4fa0f4cf]
2020-06-17 10:44:32 -07:00
Sylvain Jeaugey 0dfae3da28 Remove sm_30
[ROCm/rccl-tests commit: 7a833631b2]
2020-06-15 08:54:21 -07:00
Sylvain Jeaugey e260c673fe Fix #43 : Add .gitignore for build dir
[ROCm/rccl-tests commit: ba924dac95]
2020-06-03 15:10:38 -07:00
Sylvain Jeaugey c633de20d6 Add alltoall perf test
[ROCm/rccl-tests commit: 119a0ecf60]
2020-03-17 12:00:19 -07:00
Wei Zhang c76094c704 Add -L$(MPI_HOME)/lib64 to NVLDFLAGS
In some cases, the MPI library is not in $(MPI_HOME)/lib but
in $(MPI_HOME)/lib64, for example on RedHat-like Linux systems
(CentOS, Amazon Linux) when MPI is installed by yum or rpm.

Under such circumstances, the current Makefile causes a build failure.
This patch addresses the issue by adding -L$(MPI_HOME)/lib64 to
NVLDFLAGS in src/Makefile.

Signed-off-by: Wei Zhang <wzam@amazon.com>


[ROCm/rccl-tests commit: 0f173234bb]
2019-12-16 16:18:22 -08:00
Sylvain Jeaugey 23326c8d34 Update README.md
Checks are now fully local, no need to disable them at scale.

[ROCm/rccl-tests commit: a2af1d959d]
2019-10-10 10:51:05 -07:00
Sylvain Jeaugey 6e12e2d665 Update README.md
[ROCm/rccl-tests commit: ca7a565236]
2019-08-16 09:06:28 -07:00
David Addison 18902f40a7 Resync all tests with test code from NCCL 2.4
Major rework to merge most of the changes from the NCCL internal
tests into the public ones

Added "-m <agg_iters>" operation aggregation option.
Data integrity checking is now much more performant at scale.
Startup times at scale are improved.
Test latency units are now displayed in usec.


[ROCm/rccl-tests commit: cbe7f65400]
2019-04-05 13:42:15 -07:00
Sylvain Jeaugey 2b951dc7dd Added a clarification about AllGather and ReduceScatter sizes, since NCCL uses the size per rank.
[ROCm/rccl-tests commit: dcf818955f]
2018-08-17 14:58:44 -07:00
Sylvain Jeaugey ee3da4e0b7 Clarification
[ROCm/rccl-tests commit: eb4c43ff3d]
2018-01-30 09:17:29 -08:00