Commit Graph

764 Commits

Author SHA1 Message Date
Wenkai Du cd17cf6dce Update Rome model matching and add new models (#516)
* Update Rome model matching and add new models

* Add missing file

* Models update
2022-03-21 10:54:40 -07:00
akolliasAMD 65ea3d80db Added alltoallv test and optional args variable on collective args (#514)
* Added alltoallv test and optional args variable on collective args
2022-03-18 13:55:11 -04:00
nunnikri a04da71647 Merge pull request #511 from nunnikri/develop
File reorganization as per the new defined standard
2022-03-10 08:39:29 -08:00
Nirmal Unnikrishnan 115461cc04 File reorganization with backward compatibility
Updated the header file location and export path
2022-03-10 01:28:41 +00:00
Nirmal Unnikrishnan 676a4737c1 File reorganization as per the new defined standard
The header files will be in /opt/rocm-xxx/include/rccl
Libraries and cmake will be in /opt/rocm-xxx/lib folder.
Added wrappers for header files using rocm-cmake functions for backward compatibility.
2022-03-08 17:32:02 +00:00
gilbertlee-amd 0687940b84 Changing initialization method for UnitTests (#510) 2022-03-07 09:22:55 -07:00
Wenkai Du d6d6af710e Force ring algorithm on single node (#509) 2022-03-04 10:29:02 -08:00
gilbertlee-amd b634b2f1c2 Adding NCCL_DEBUG=INFO for CI runs (#508) 2022-03-03 18:04:28 -07:00
gilbertlee-amd 699dc30f05 [UnitTests] Check process mask for custom tests (#507) 2022-03-02 17:24:14 -07:00
akolliasAMD ff54e79799 Added Unit test for nccl send recv (#506)
Added Send Receive test that tests through all pairs
2022-03-02 15:50:16 -05:00
gilbertlee-amd 29ad0f5fbe Unit test refactor (#500)
Refactoring and consolidating single-process / multi-process unit testing
2022-02-25 08:59:07 -07:00
Ziyue Yang b569c0a1db Add Pivot AllToAll algorithm for Rome model (#503)
* add a2a pivot interface

* remove debug info

* address comments

* fix bug

* remove custom script

* address comments

* fix bug
2022-02-20 21:09:47 -08:00
Wenkai Du 94e0dc8bfd Allow additional options to be passed in through model's definition (#501) 2022-02-17 08:28:58 -08:00
Wenkai Du 02096c9936 Add another Rome model (#497) 2022-02-12 10:30:16 -08:00
gilbertlee-amd f3c2cafd9d [TransferBench] Fix for cases with subsets of configured numa nodes (#495) 2022-02-07 12:16:19 -07:00
gilbertlee-amd 84d5fce7dd TransferBench: Adding ability to reindex GPUs based on PCIe address (#494) 2022-02-02 08:51:41 -07:00
Wenkai Du 400df49dbe Generate proper b-tree with non-repeating channels (#493) 2022-01-19 15:09:17 -08:00
Wenkai Du e94720fe3b Merge pull request #492 from wenkaidu/develop
Sync up with NCCL
2022-01-18 12:55:05 -08:00
Stanley Tsang b002ef1aaf Search for GTest 1.11; fix detection of GTest after install (#476) 2022-01-17 15:18:01 -07:00
Wenkai Du 973d0111db Merge remote-tracking branch 'nccl/master' into develop 2022-01-17 13:34:36 -08:00
Roopa Malavally 336ff4fe5f Add files via upload 2022-01-16 22:19:54 -08:00
Roopa Malavally f467fb57e8 Delete classification-maptest.xml 2022-01-16 22:19:33 -08:00
Roopa Malavally 62ef66b656 Add files via upload 2022-01-16 08:12:03 -08:00
Roopa Malavally 39cf27222c Add files via upload 2022-01-16 08:11:40 -08:00
Wenkai Du 598c6fdded Update Rome models (#491) 2022-01-14 10:03:30 -08:00
Wenkai Du 369c021992 topo_expl: update for 2.11.4 (#490)
* topo_expl: update for 2.11.4

* topo_expl: revert a few logging changes
2022-01-13 13:33:07 -08:00
Wenkai Du 6ff7690cb5 Use noinline and a few other fixes (#489)
* Use noinline and a few other fixes

* Tune collectives
2022-01-11 16:51:06 -08:00
gilbertlee-amd 7add135529 Updating CHANGELOG.md (#488) 2022-01-10 11:34:31 -07:00
Wenkai Du 3669e12432 Use hipGraph instead of cudaGraph (#487) 2022-01-10 08:26:01 -08:00
Wenkai Du 565fbeb5e9 Tune collectives for 2.11.4 (#486) 2022-01-10 08:25:47 -08:00
gilbertlee-amd 2530a2f084 [TransferBench] Updating for 2.11.4. Decoupling from RCCL kernel (#485) 2022-01-05 16:33:25 -07:00
Wenkai Du 6268b87c16 Unit tests: fix number of GPU detection (#484) 2022-01-05 15:06:12 -08:00
Wenkai Du 4234a638b5 Merge pull request #482 from ROCmSoftwarePlatform/2.11.4
Sync up with 2.11.4
2022-01-05 09:31:51 -08:00
Wenkai Du f8d0775a6f Add another Rome model (#483) 2022-01-05 09:26:31 -08:00
Wenkai Du 434ecb0e1f Merge remote-tracking branch 'origin/develop' into 2.11.4 2022-01-03 09:54:16 -08:00
Stanley Tsang bbbb35ceec Fixing setting of GPUs to 2 when 1 or less GPUs on system for unit tests (#481) 2021-12-09 11:04:31 -07:00
Chang Lan c5790b3672 Build fastsocket plugin from ext-net 2021-12-09 08:41:05 +01:00
Ke Wen c88c9f873f Add env NCCL_NET_DISABLE_INTRA
Disable NET transport for intra-node communication by setting the env to 1
It provides an option to error out instead of falling back to NET when superior intra-node transports (P2P and SHM) are unavailable
2021-12-08 16:28:19 +01:00
Wenkai Du a94b953bcc Update Rome model (#479) 2021-12-03 08:24:51 -08:00
Wenkai Du 8a08a2f579 Update tuning parameters (#478) 2021-11-30 08:51:11 -08:00
gilbertlee-amd 1157c2edfe [TransferBench] Adding more preset benchmarks to filter read mode, cpu vs gpu pairs (#477) 2021-11-24 18:05:37 -07:00
gilbertlee-amd 539de1216f Minor cppcheck fixes, adding suppression file (#475)
* Minor cppcheck fixes, adding suppression file
2021-11-24 10:23:59 -07:00
Wenkai Du e9bf01fb7e Determine fine grained memory availability at RCCL bootstrapping (#471) 2021-11-19 08:12:53 -08:00
Chris Jones 8cf7325d69 Perform busIdToInt64 on the stack.
I noticed when I enabled `NCCL_DEBUG_SUBSYS=ALLOC` that this function is
called thousands of times, making the log output unintelligible.
Fortunately, this function can be implemented without heap allocations.
2021-11-19 09:35:55 +01:00
Stanley Tsang 7b8b54955b Set ROCM_PATH CMake variable in install script (#470)
* Fixing cmake_install_prefix search to include /opt/rocm-xxxx

* Removing all hard references to /opt/rocm with ROCM_PATH

* Setting ROCM_PATH CMake variable in install script
2021-11-18 14:44:19 -07:00
Wenkai Du 03a830293c gtest: dynamically generate tests based on test machine's GPU count (#467)
* gtest: dynamically generate tests based on test machine's GPU count

* Adjust test element size and bfloat16 threshold for up to 16 GPUs
2021-11-16 10:28:26 -08:00
Stanley Tsang a6dba6b9dd Remove hardcoded references to /opt/rocm when using chrpath (#469)
* Fixing cmake_install_prefix search to include /opt/rocm-xxxx

* Removing all hard references to /opt/rocm with ROCM_PATH
2021-11-15 15:00:55 -07:00
Wenkai Du 3a919c1f49 Merge remote-tracking branch 'nccl/master' into develop 2021-11-11 14:22:12 -08:00
Wenkai Du e05de8fd26 Remove extra work element copy (#465) 2021-11-09 13:52:03 -08:00
gilbertlee-amd 1c7ef1b790 [TransferBench] Adding #CUs / RRLW mode to p2p benchmark (#464) 2021-11-08 14:36:04 -07:00