Commit Graph

17 Commits

Author SHA1 Message Date
nileshnegi 8d887aad0d Merge remote-tracking branch 'nccl-tests/master' into develop
[ROCm/rccl-tests commit: 5625599dda]
2025-04-21 19:46:10 -05:00
mberenjk efa2d204b2 removing FP8 product from allReduce test cases (#97)
* removing FP8 product from allReduce test cases

---------

Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com>


[ROCm/rccl-tests commit: 77ae744c18]
2025-01-06 14:05:38 -06:00
John Bachan 69b9a05e71 Fixes to all tests that divide buffers by nranks so that they trim buffer sizes to be multiples of 16 bytes.
This ensures non-pow2 ranks have buffer addresses aligned suitably for performance.


[ROCm/rccl-tests commit: 29f4114f02]
2024-12-18 11:20:28 -08:00
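
The buffer trimming described in commit 69b9a05e71 can be illustrated with a small sketch. This is not the code from that commit. The names `per_rank_bytes` and `rank_offsets` are hypothetical, and the sketch assumes the total buffer is split evenly across ranks and that its base address is already 16-byte aligned. It only shows the arithmetic: each rank's share is rounded down to a multiple of 16 bytes, so every rank's offset into the buffer stays 16-byte aligned even when the rank count is not a power of two.

```python
ALIGN_BYTES = 16  # alignment named in the commit message


def per_rank_bytes(total_bytes: int, nranks: int, align: int = ALIGN_BYTES) -> int:
    """Hypothetical helper: split total_bytes evenly across nranks,
    then trim each share down to a multiple of `align` bytes."""
    if nranks <= 0:
        raise ValueError("nranks must be positive")
    share = total_bytes // nranks   # untrimmed per-rank share
    return share - (share % align)  # round down to the alignment


def rank_offsets(total_bytes: int, nranks: int, align: int = ALIGN_BYTES) -> list:
    """Hypothetical helper: byte offset of each rank's chunk in the shared buffer."""
    chunk = per_rank_bytes(total_bytes, nranks, align)
    return [rank * chunk for rank in range(nranks)]
```

With 3 ranks and a 1000-byte buffer, the untrimmed share is 333 bytes, which would put rank 1 at offset 333. Trimming to 320 bytes places the ranks at offsets 0, 320, and 640, all multiples of 16.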
Bertan Dogancay 882a96f5cb Add hipify steps prior to build (#62)
* Add hipify steps prior to build

[ROCm/rccl-tests commit: 88cf7dbf45]
2024-03-05 09:47:18 -07:00
Wenkai Du b49f6da1ec Merge remote-tracking branch 'nccl-tests/master' into HEAD
[ROCm/rccl-tests commit: 621dde544d]
2024-03-01 18:34:44 +00:00
Edgar Gabriel 08f9435e5a Merge remote-tracking branch 'nccl-tests/master' into topic/v2.13.4-sync
[ROCm/rccl-tests commit: 3ae371cce7]
2022-10-14 16:02:54 -05:00
Sylvain Jeaugey fdaa88710b Update NCCL tests
[ROCm/rccl-tests commit: d313d20a26]
2022-09-23 01:13:29 -07:00
John Bachan b5d746b58e Resync with NCCL 2.13
* Added "verifiable", a suite of kernels for generating and verifying reduction
  input and output arrays in a bit-precise way.
* Data corruption errors now reported in number of wrong elements instead of max
  deviation.
* Use ncclGetLastError.
* Don't run hypercube on non-powers of 2 ranks.
* Fix to hypercube data verification.
* Use "thread local" as the default CUDA capture mode.
* Replaced pthread_yield -> sched_yield()
* Bugfix to the cpu-side barrier/allreduce implementations.


[ROCm/rccl-tests commit: 51af5572bf]
2022-08-22 17:51:06 -07:00
Edgar dad6d819d0 implementation of multi-rank support in rccl-tests.
[ROCm/rccl-tests commit: 0500f2f132]
2022-06-10 14:54:10 -04:00
Wenkai Du 06f4ccd9d2 Merge remote-tracking branch 'nccl/master' into develop
[ROCm/rccl-tests commit: 9f8ddadcdf]
2021-07-13 08:11:44 -07:00
David Addison 20b63cf465 Fixed formatting for bfloat16 support
[ROCm/rccl-tests commit: 526eacadf7]
2021-06-28 10:12:34 -07:00
David Addison a41268e26e Add support for ncclAvg operation
[ROCm/rccl-tests commit: cde7e769c1]
2021-06-28 09:41:58 -07:00
Wenkai Du 8ff34620fb workaround weak symbol issue
hcc prints "error: alias must point to a defined variable or function"


[ROCm/rccl-tests commit: 4474fe168d]
2019-04-18 10:34:55 -07:00
Stanley Tsang aac7cfb64f Adding AMD copyright notices
[ROCm/rccl-tests commit: 71e663e62d]
2019-04-10 15:28:40 -07:00
Wenkai Du 3c8cfb2d6e hipify nccl-tests to become rccl-tests
[ROCm/rccl-tests commit: a15f771cb2]
2019-04-10 13:43:58 -07:00
David Addison 18902f40a7 Resync all tests with test code from NCCL 2.4
Major rework to merge most of the changes from the NCCL internal
tests into the public ones

Added "-m <agg_iters>" operation aggregation option.
Data integrity checking is now much more performant at scale.
Startup times at scale are improved.
Test latency units are now displayed in usec.


[ROCm/rccl-tests commit: cbe7f65400]
2019-04-05 13:42:15 -07:00
Sylvain Jeaugey 4cb47ccb21 Initial commit
[ROCm/rccl-tests commit: b188a15299]
2017-08-08 16:18:34 -07:00