26 Commits

Author SHA1 Message Date
Marzieh Berenjkoub d7293281f3 Merge remote-tracking branch 'nccl/master' into develop
[ROCm/rccl commit: 858b4e76eb]
2026-01-20 13:04:02 -06:00
Atul Kulkarni e4aef19511 Added new unit tests for AllReduce with Bias API (#2036)
* Added new unit tests for AllReduce with Bias API

* Address review comments

[ROCm/rccl commit: 7c12b0b76b]
2025-12-03 17:37:34 -06:00
corey-derochie-amd af1c448ed1 Changed TestBedChild to avoid hang if the call fails (#1875)
Changed `TestBedChild` protocol to send the result code before the return value to avoid hanging if the call fails. Switched `TestBedChild::GetUniqueId` to use this.

[ROCm/rccl commit: b88c134874]
2025-08-23 00:17:34 -05:00
Tim e346e19065 Adjustment for UT Sendrecv (#1400)
Enabled UT sendrecv to the same rank and refactored the UBR call

[ROCm/rccl commit: fd9924cfe7]
2024-10-30 15:13:53 -04:00
Tim 3261e2a5fd Adding User Buffer Registration support for Unit test (#1199)
* Adding UBR support for UT SendRecv

Signed-off-by: Tim Hu <timhu102@amd.com>

* Update test/common/TestBedChild.cpp

Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com>

---------

Signed-off-by: Tim Hu <timhu102@amd.com>
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com>

[ROCm/rccl commit: a4793286c7]
2024-07-30 13:39:25 -04:00
corey-derochie-amd 37bf54b8f8 Enable multi-threading for MSCCL (#1203)
MSCCL can now run in a multi-threaded configuration. To test in the unit tests, added the ENABLE_OPENMP compile definition flag and the --openmp-test-enable flag to the unit test build script. To activate, set the environment variables UT_MULTITHREADED=1 and UT_PROCESS_MASK=1. Set Jenkins to use this mode.

[ROCm/rccl commit: 0c36d571ea]
2024-07-04 09:34:38 -06:00
Bertan Dogancay dea5e83940 [UT] Start supporting multiple group calls and graphs (#1151)
* Start supporting multiple group calls UT

[ROCm/rccl commit: 0ec41f1386]
2024-04-25 11:11:16 -06:00
Tim 0343d9ccac Relaxing default timeout limit, add error log (#1052)
Signed-off-by: Tim Hu <timhu102@amd.com>

[ROCm/rccl commit: 05850e89f2]
2024-01-18 15:09:08 -05:00
Tim 245e757b26 Adding timeout functionality/EnvVar to TestBed (#1044)
* Adding timeout functionality/EnvVar to TestBed
* updating timeout unit to microseconds

Signed-off-by: Tim Hu <timhu102@amd.com>

[ROCm/rccl commit: 9c0ef11ac7]
2024-01-17 11:33:01 -05:00
Wenkai Du f98715baea Merge remote-tracking branch 'nccl/master' into develop
[ROCm/rccl commit: abd0615351]
2023-06-26 22:51:56 +00:00
gilbertlee-amd ff2c1c5d0f Unit test performance refactor (#700)
* Refactoring unit tests to improve performance
* Spawning child processes during InitComms instead of on TestBed construction
* Temporarily disabling graph unit tests

[ROCm/rccl commit: 27e0cb43c2]
2023-04-06 12:28:53 -06:00
gilbertlee-amd b859549866 Adding interactive mode for unit tests (UT_INTERACTIVE) (#715)
[ROCm/rccl commit: 00c3d8d850]
2023-03-21 10:58:24 -06:00
akolliasAMD 9daf0bc3d1 Test Fixes (#710)
* Splitting CI tests to run SP first and MP second
* Set device before hipStreamSynchronize in tests

[ROCm/rccl commit: 9a0d4a07a6]
2023-03-21 08:48:39 -06:00
gilbertlee-amd 0da1d6a6cd Multi stream unit test (#693)
* Adding multi-stream support to unit tests

[ROCm/rccl commit: 80ed608a9d]
2023-02-23 13:28:50 -07:00
gilbertlee-amd e5795cf101 Switching to relaxed capture for unit tests (#679)
[ROCm/rccl commit: df46645ff8]
2023-02-08 11:28:58 -07:00
Pedram Alizadeh f7982e9bed UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) (#674)
[ROCm/rccl commit: fddb5e6be8]
2023-02-03 17:36:30 -05:00
Pedram Alizadeh a85f71a421 Revert "UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) (#662)" (#666)
This reverts commit f29aa66d4f.

[ROCm/rccl commit: 54a3da04eb]
2022-12-14 11:28:40 -05:00
Pedram Alizadeh f29aa66d4f UnitTest: add test cases for 2.14 API (ncclCommInitRankConfig and ncclCommFinalize for non-blocking communicator) (#662)
[ROCm/rccl commit: 8250092367]
2022-12-13 16:05:09 -05:00
gilbertlee-amd 5871811d34 Graph unit tests (#656)
* Adding hipGraph unit tests

[ROCm/rccl commit: faed69f9fc]
2022-12-01 10:28:42 -07:00
Edgar f7ef619ba7 extending the unit-tests for multi-rank support
[ROCm/rccl commit: a87d61db2b]
2022-06-10 14:23:19 +00:00
gilbertlee-amd a2a4888497 Moving opt-in custom signal handler from UnitTests into RCCL (#550)
* Enable via RCCL_ENABLE_SIGNALHANDLER=1

[ROCm/rccl commit: 700b473211]
2022-05-20 09:56:38 -06:00
Edgar 1bfc5d06f8 add a signal handler and backtrace
Tweak the signal handler and force non-release build
Increase ulimit locked memory value
Update the signal handler to use bfd symbol resolution.
Include configure logic to find bfd functions.
Add optional C++ function name demangling


[ROCm/rccl commit: 2bf6d254b6]
2022-04-25 10:48:17 -04:00
akolliasAMD 3493750b6b Added alltoallv test and optional args variable on collective args (#514)
* Added alltoallv test and optional args variable on collective args

[ROCm/rccl commit: 65ea3d80db]
2022-03-18 13:55:11 -04:00
gilbertlee-amd 8f7ec04f37 Changing initialization method for UnitTests (#510)
[ROCm/rccl commit: 0687940b84]
2022-03-07 09:22:55 -07:00
akolliasAMD 2419a950fe Added Unit test for nccl send recv (#506)
Added Send Receive test that tests through all pairs

[ROCm/rccl commit: ff54e79799]
2022-03-02 15:50:16 -05:00
gilbertlee-amd a182076a0e Unit test refactor (#500)
Refactoring and consolidating single-process / multi-process unit testing

[ROCm/rccl commit: 29ad0f5fbe]
2022-02-25 08:59:07 -07:00