Wenkai Du
e2042ccf8a
Fix broken profiling build (#263)
2020-09-02 15:39:52 -07:00
Wenkai Du
4180e6409e
Fix incorrect threads split in sendrecv (#261)
2020-08-31 17:33:22 -07:00
Wenkai Du
c5cbece6d0
Increase minimal channels for gfx908 (#259)
2020-08-26 11:40:11 -07:00
Wenkai Du
b0919dc46c
Only use software barrier for synchronization (#258)
2020-08-25 13:16:34 -07:00
Wenkai Du
391bbf3f1e
Add NPS4 support on some models (#256)
...
* Add NPS4 support on some models
* Add XML models
2020-08-19 11:03:20 -07:00
gilbertlee-amd
ec9af40fcd
Upgrading various TransferBench features (#257)
2020-08-19 09:47:19 -06:00
Wenkai Du
a51e4071e3
Add another Rome model (#249)
...
* Add another Rome model
* Add gfx908 4P3L models and support
* Revert "Use cached value for detecting GDR support only once"
This reverts commit 67c8e72ce3.
* Skip using ibverb for GPU direct RDMA detection
* Fine tune one Rome model
2020-08-17 10:51:02 -07:00
gilbertlee-amd
c985478133
Fixes to make TransferBench compile for hipclang (#254)
2020-08-13 12:25:28 -06:00
saadrahim
6d8e19929c
Adding gfx908 to CI (#253)
2020-08-13 11:07:33 -06:00
Wenkai Du
7e3d8a31cc
Collect gcnArch and hipDeviceArch_t in XML (#252)
2020-08-12 15:48:38 -07:00
saadrahim
50af2e9b66
Cleaning up CI code by removing overrides (#251)
2020-08-12 12:38:10 -06:00
Wenkai Du
066223333d
Merge pull request #248 from wenkaidu/2.7.8
...
2.7.8
2020-08-11 08:20:37 -07:00
Wenkai Du
7e3f841fab
Merge remote-tracking branch 'nccl/master' into 2.7.8
2020-08-10 16:11:00 +00:00
Wenkai Du
3c46cb8ad4
Merge pull request #247 from wenkaidu/rome
...
Additional Rome models support
2020-08-07 10:56:12 -07:00
MurtadhaAldallal
390c63cf0d
Update rccl_prim_test.cpp (#246)
...
Adding doublelocalcopy operation and freeing buffer memory at end.
DoubleLocalCopy Patch Added
2020-08-07 08:20:14 -07:00
Wenkai Du
09ef75656a
Add more Rome 4P2H models
2020-08-06 18:20:02 +00:00
Stanley Tsang
c5d4d9eb76
Adding static library building option. (#244)
...
* Adding static library building option.
* Disabling running tests for static build
* Removing static packaging in CI
Co-authored-by: Saad Rahim <saad.rahim@amd.com>
2020-08-06 11:19:43 -06:00
saadrahim
0dc019e35f
Download GTest if not found in system (#237)
...
Co-authored-by: Stanley Tsang <stanley.tsang@amd.com>
2020-08-06 09:36:58 -06:00
Jack Snyder
de49a77074
Setting type when gpu sub node is discovered
2020-08-05 13:39:23 -07:00
Sylvain Jeaugey
3d63f89068
Merge pull request #364 from badgerious/net-class
...
Add GPUs and NICs based on XML sub tags instead of PCI class.
2020-08-05 12:52:38 -07:00
Eric Badger
700c0e0f24
Don't require NIC devices to have specific PCI class
...
If a PCI node is the parent of a NIC, treat it as such, regardless of
the PCI class code for the device. This allows non-traditional devices
to act as NICs via the net plugin mechanism.
For consistency, treat GPUs similarly.
2020-08-05 12:46:29 -07:00
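The rule described in this commit (classify a PCI node by what it parents, not by its PCI class code) can be sketched as follows. This is an illustrative sketch only, not the NCCL implementation: the XML layout, the attribute names, and the helper names `classify_pci_node` and `classify_all` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical topology fragment. The second device has a class code that is
# not a network controller, but it still parents a <nic> sub tag.
SAMPLE = """
<system>
  <pci busid="0000:01:00.0" class="0x030200"><gpu dev="0"/></pci>
  <pci busid="0000:02:00.0" class="0x120000"><nic><net name="custom0"/></nic></pci>
  <pci busid="0000:03:00.0" class="0x060400"/>
</system>
"""


def classify_pci_node(pci):
    """Decide the node type from its child tags; the class attribute is ignored."""
    if pci.find("gpu") is not None:
        return "GPU"
    if pci.find("nic") is not None:
        return "NIC"
    return "PCI"


def classify_all(xml_text):
    """Map each PCI bus ID in the document to its inferred node type."""
    root = ET.fromstring(xml_text.strip())
    return {pci.get("busid"): classify_pci_node(pci) for pci in root.iter("pci")}


if __name__ == "__main__":
    for busid, kind in classify_all(SAMPLE).items():
        print(busid, kind)
```

Because the class code never enters the decision, a device exposed through a net plugin is treated as a NIC as long as the XML places a NIC sub tag under it.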
Wenkai Du
5b03132ace
Allow setup ring through NCCL_RINGS to facilitate testing
2020-08-04 21:07:00 +00:00
Wenkai Du
d1e20b4c5e
Improve 4P2H topology on Rome (#243)
...
1. Use bi-directional rings
2. GPU search is sorted by PCI device ID to get consistent results
2020-07-28 14:21:44 -07:00
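Point 2 above can be illustrated with a small sketch: sorting GPUs by a numeric key before the search makes the result independent of discovery order. The sketch assumes "PCI device ID" refers to the domain:bus:device.function address; the record layout and the names `bus_id_key` and `sorted_gpus` are hypothetical and not taken from RCCL.

```python
def bus_id_key(bus_id):
    """Turn 'dddd:bb:dd.f' into a tuple of integers so ordering is numeric."""
    domain, bus, devfn = bus_id.split(":")
    dev, fn = devfn.split(".")
    return (int(domain, 16), int(bus, 16), int(dev, 16), int(fn, 16))


def sorted_gpus(gpus):
    """Return the GPU records in ascending PCI address order."""
    return sorted(gpus, key=lambda gpu: bus_id_key(gpu["busid"]))


if __name__ == "__main__":
    discovered = [
        {"busid": "0000:c3:00.0"},
        {"busid": "0000:03:00.0"},
        {"busid": "0000:43:00.0"},
    ]
    for gpu in sorted_gpus(discovered):
        print(gpu["busid"])
```

Parsing the fields as hexadecimal integers, instead of comparing the raw strings, keeps the order stable even when addresses differ in letter case.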
David Addison
033d799524
2.7.8-1
...
Fix collective mismatch error when using ncclSend/ncclRecv
2020-07-27 16:34:09 -07:00
Wenkai Du
e7a10aa0e4
Topology tuning for 4P2H on Rome (#242)
...
* Topology tuning for 4P2H on Rome
* Use ncclTopoIdToIndex
2020-07-27 11:53:57 -07:00
Wenkai Du
8d5fb920b6
ib-test: support multiple channels (#241)
2020-07-27 11:03:12 -07:00
Sourav Chakraborty
fe3d520601
Merge pull request #240 from ROCmSoftwarePlatform/sourav/topo-expl-1
...
simplify model definitions in topo expl
2020-07-22 12:35:17 -05:00
Sourav Chakraborty
2475daafee
add 4 node 8P6L 1 NIC 2nd Hive model
2020-07-22 16:27:15 +00:00
Sourav Chakraborty
db55afb014
simplify model definitions in topo expl
2020-07-22 16:05:53 +00:00
Wenkai Du
d5f90e19b5
Add 8P6L multi-node models (#239)
2020-07-21 14:10:36 -07:00
Stanley Tsang
684f3e6af4
Adding better naming to unit tests for filtering; adding short and full unit test suites (#235)
2020-07-21 12:19:47 -06:00
Wenkai Du
35c5a7fe45
Fix RCCL build package name (#236)
2020-07-20 14:43:00 -07:00
saadrahim
99a491273f
Changing GTest inclusion in cmake to use find_package (#234)
...
* GTest is used via find_package. No longer downloaded in cmake.
* Adding error handling
2020-07-15 20:51:48 -06:00
saadrahim
7f93aa7e53
Changing dependency to hip-rocclr (#228)
2020-07-14 17:49:56 -06:00
Wenkai Du
ab787c767e
Change default channels duplication for chordal ring (#233)
2020-07-14 15:16:50 -07:00
gilbertlee-amd
f87ba17737
Removing UnitTest as install, removing unused env var (#231)
2020-07-10 09:30:28 -06:00
Wenkai Du
5215130168
Revert "Split primitive class to smaller structures" (#230)
...
This reverts commit 486fd436af.
2020-07-08 11:06:50 -07:00
Wenkai Du
1addf4f196
Match RCCL package name to API version (#229)
2020-07-07 13:30:39 -07:00
Riatre Foo
2d8601701d
Fix build action order
...
Add $(INCTARGETS) to build dependencies of %.o and $(DEVICELIB).
As there were no dep files during the first build, Make may kick off source
compilation before nccl.h got generated, which leads to occasional build
failures on systems with high core count. The build failure could be
reproduced reliably with a `sleep 5` in $(INCDIR)/nccl.h rule.
2020-07-07 10:20:51 -07:00
Stanley Tsang
9bd4c14603
Adding appropriate references in rccl-prim-test (#227)
...
Adding appropriate references to rccl-prim-test.
2020-07-06 10:15:03 -06:00
Wenkai Du
ecae1cd76a
Merge pull request #226 from wenkaidu/develop
...
Sync up to NCCL 2.7.6
2020-07-06 09:10:09 -07:00
Wenkai Du
da3b197d6c
Merge remote-tracking branch 'nccl/master' into develop
2020-07-01 16:51:25 -07:00
Wenkai Du
d3548cc474
topo_expl: each rank needs to have its own memory for graphs (#225)
2020-07-01 15:11:02 -07:00
Wenkai Du
a6be82f5ab
topo_expl: fix broken build (#224)
2020-06-30 11:11:23 -07:00
Wenkai Du
a144a85465
Merge pull request #223 from wenkaidu/sendrecv
...
Use separate threads for send and receive
2020-06-30 10:50:06 -07:00
Wenkai Du
8db0aa8f4c
gtest: extend testing up to 8 GPUs
2020-06-29 09:32:31 -07:00
Wenkai Du
964c4c2061
Merge sendrecv kernel from NCCL 2.7.3
...
This commit was cherry-picked and modified from
https://github.com/NVIDIA/nccl/commit/5949d96f36d050e59d05872f8bbffd2549318e95
2020-06-29 08:47:46 -07:00
Wenkai Du
b90735c935
Use separate threads for send and receive
2020-06-29 08:47:15 -07:00
Sylvain Jeaugey
1952325569
2.7.6-1
...
Fix crash when NVswitch is not visible inside a VM.
2020-06-26 16:35:54 -07:00
Sylvain Jeaugey
01afd20a77
2.7.5-1
...
Minor fixes for A100 platforms.
Add a WARN for invalid GroupEnd call.
2020-06-26 14:39:49 -07:00