3 Commits

Author SHA1 Message Date
gilbertlee-amd f4a12be69b Clique tuning upgrade (#352)
* Enabling clique for any XGMI-connected topology, adding tuning
* Updating CHANGELOG for clique tuning
* Re-working clique barrier system to work on multi-process / multi-gpu

[ROCm/rccl commit: 9d7232c091]
2021-05-06 09:50:07 -06:00
Stanley Tsang f152c8d160 Update MP UT to support arbitrary # of GPUs; multiple bugfixes (#16)
* Fixing temp file creation/deletion for Clique kernel mode.

* Refactoring of MP unit tests; include bugfixes and general support for any number of GPUs

* GroupCall MP UT properly quits when too many devices specified

* MP UT will programmatically set NCCL_COMM_ID if not specified; updated install script

[ROCm/rccl commit: d00b7d17bd]
2021-02-05 16:49:25 -08:00
Stanley Tsang d7ed44eb9a Adding multiprocess unit tests (#312)
Adding multiprocess unit tests for collectives.

To run: NCCL_COMM_ID=$HOSTNAME:12345 build/release/test/UnitTestsMultiProcess

[ROCm/rccl commit: d3fa257682]
2021-01-15 16:34:36 -07:00