gilbertlee-amd
a062c80298
[TransferBench] Displaying PCIe Bus ID ( #288 )
...
* Adding PCIe BusID per GPU in topology display
[ROCm/rccl commit: 61e1a71d14 ]
2020-10-21 16:13:36 -06:00
gilbertlee-amd
0282595de5
TransferBench Typo. Pinned host memory uses C not P ( #286 )
...
[ROCm/rccl commit: 769418c5c7 ]
2020-10-21 12:05:38 -06:00
saadrahim
5439649936
Adding sles15, centos7 and centos8 testing ( #283 )
...
[ROCm/rccl commit: e8177c9ee7 ]
2020-10-20 09:39:03 -06:00
Wenkai Du
1aae6b1344
Fix incorrect pointer checking for scatter and gather ( #285 )
...
[ROCm/rccl commit: dcad0ef7cb ]
2020-10-19 13:27:09 -07:00
gilbertlee-amd
f4b9a0d8e5
Removing unnecessary flags from CI ( #278 )
...
* Removing unnecessary flags from CI
* Re-adding HSA_FORCE_FINE_GRAIN_PCIE in CI
[ROCm/rccl commit: 9b3f762b68 ]
2020-10-19 13:08:24 -06:00
saadrahim
0465dffe6f
Updating copyright for documentation ( #282 )
...
[ROCm/rccl commit: 49aa6d7afe ]
2020-10-19 13:07:15 -06:00
Wenkai Du
d1781365d6
Merge pull request #279 from wenkaidu/nccl_sync
...
Sync up with latest NCCL master branch
[ROCm/rccl commit: a7deecb104 ]
2020-10-16 11:21:35 -07:00
Eiden Yoshida
f1dc3f1e86
Update sramecc and xnack to ANY ( #284 )
...
Co-authored-by: Tony <Tony.Tye@amd.com>
Co-authored-by: Wenkai Du <Wenkai.Du@amd.com>
[ROCm/rccl commit: 205b5507b4 ]
2020-10-16 00:25:18 -06:00
Wenkai Du
194135a40c
Merge remote-tracking branch 'nccl/master' into nccl_sync
...
[ROCm/rccl commit: c835d8263a ]
2020-10-15 18:42:38 -04:00
gilbertlee-amd
94437eef28
Revert "Initial support for clique-based kernels ( #276 )" ( #280 )
...
This reverts commit d68a532bc6.
[ROCm/rccl commit: 84a2541e01 ]
2020-10-15 11:30:18 -07:00
Sylvain Jeaugey
591ffd32fe
Fix affinity move
...
[ROCm/rccl commit: 0e14394c5f ]
2020-10-13 16:58:05 -07:00
Sylvain Jeaugey
5de6b6681d
Make sure proxy threads inherit the CPU affinity.
...
[ROCm/rccl commit: c6dbdb0084 ]
2020-10-13 16:37:52 -07:00
Wenkai Du
8b120c0508
Update Rome single node models ( #277 )
...
[ROCm/rccl commit: 33babcb5e2 ]
2020-10-13 13:33:09 -07:00
gilbertlee-amd
d68a532bc6
Initial support for clique-based kernels ( #276 )
...
* Initial support for clique-based kernels
[ROCm/rccl commit: 2b8184808d ]
2020-10-13 11:22:04 -06:00
Wenkai Du
41260bb948
Rework Rome detection and add multiple network ports models ( #274 )
...
* Rework Rome detection and add multiple network ports models
* Remove unused opCount in p2p transport
[ROCm/rccl commit: ae008fd2db ]
2020-10-07 13:37:36 -07:00
Wenkai Du
e12db6f2ab
Don't download GTest unless building unit test ( #275 )
...
[ROCm/rccl commit: 88a062342b ]
2020-10-02 15:25:40 -07:00
Wenkai Du
dbde26e681
Add Alltoallv RCCL kernel implementation ( #269 )
...
* Add alltoallv API and implementation
* Extend Rome P2P channel limit to multinode and alltoall kernels
* topo_expl: fix compilation and sync up with main
* gtest: use RCCL alltoallv API
* Code review changes
[ROCm/rccl commit: b871ea3c0c ]
2020-09-30 16:25:36 -07:00
nunnikri
256de55920
SWDEV-253325: Changing amdgpu-target to cuda-gpu-arch ( #268 )
...
[ROCm/rccl commit: aa985bfb7e ]
2020-09-25 15:44:56 -06:00
Stanley Tsang
67a8d86d78
Updating inline asm to not require explicit L1 cache invalidation ( #270 )
...
[ROCm/rccl commit: acca2ae20a ]
2020-09-25 13:46:26 -06:00
gilbertlee-amd
5ca117d7cd
New TransferBench features ( #273 )
...
* Upgrading TransferBench to support pinned CPU memory, expanding functionality, cleaning up env vars
[ROCm/rccl commit: ee262819a7 ]
2020-09-25 12:20:48 -06:00
gilbertlee-amd
0a9adc16f4
Changes to topology based on XGMI ( #272 )
...
* Alterations to topology search to improve XGMI-enabled nodes
[ROCm/rccl commit: 01bd2573db ]
2020-09-25 12:20:09 -06:00
Wenkai Du
7ba087e069
Ensure all ranks on same send/receive or alltoall kernel path ( #271 )
...
[ROCm/rccl commit: 44fcde7835 ]
2020-09-24 08:25:04 -07:00
Wenkai Du
37f7eec6b7
Change network plugin name to librccl-net.so ( #266 )
...
[ROCm/rccl commit: d871fceb54 ]
2020-09-18 13:23:30 -07:00
Wenkai Du
2e5e4a6bde
Merge pull request #267 from wenkaidu/p2p
...
Limit P2P channels on Rome
[ROCm/rccl commit: 45a8f09e97 ]
2020-09-18 11:35:35 -07:00
Wenkai Du
f0a303664e
Limit P2P channels on Rome
...
[ROCm/rccl commit: 42955f5f4f ]
2020-09-17 17:20:32 -07:00
lijietang
f6b08ca547
Add rccl bw test script in tools ( #255 )
...
[ROCm/rccl commit: bbe233f8c1 ]
2020-09-11 16:59:03 +08:00
Stanley Tsang
209133fadf
Adding the ability to force install dependencies (namely gtest); gtest library installation fix for centos ( #265 )
...
* Adding the ability to force install dependencies (namely gtest); gtest library installation fix for centos
* Removing potentially unnecessary dependencies from install script
[ROCm/rccl commit: 8c90aefb6d ]
2020-09-10 17:27:22 -06:00
Wenkai Du
a3402d6aeb
Merge pull request #262 from wenkaidu/alignment
...
Make data alignment requirements match the ISA manual
[ROCm/rccl commit: 60819dcf8d ]
2020-09-08 10:40:42 -07:00
Stanley Tsang
818b44e27d
Adding XNACK flags. ( #264 )
...
* Adding XNACK flags.
[ROCm/rccl commit: f2e5db7bf7 ]
2020-09-08 11:36:30 -06:00
Aaron Enye Shi
0a3a397481
Add RCCL Static Lib Creation with -fgpu-rdc
...
RCCL uses -fgpu-rdc to compile its source objects. When linking
the RCCL static library, the link and archive step must go through
hipcc and use the flag --emit-static-lib. When compiling
UnitTests, librccl.a must be consumed through -l and -L.
[ROCm/rccl commit: 958b213428 ]
2020-09-03 11:25:41 -04:00
Wenkai Du
09639a5d54
Fix broken profiling build ( #263 )
...
[ROCm/rccl commit: e2042ccf8a ]
2020-09-02 15:39:52 -07:00
Wenkai Du
81bf52ddee
gtest: add alltoallv test
...
[ROCm/rccl commit: b163a8898f ]
2020-09-02 21:28:32 +00:00
Wenkai Du
cfa1228504
Make data alignment requirements match the ISA manual
...
From https://developer.amd.com/wp-content/resources/Vega_Shader_ISA.pdf
8.1.7. Alignment
For Dword or larger reads or writes, the two LSBs of the byte-address
are ignored, thus forcing Dword alignment.
[ROCm/rccl commit: 4751992231 ]
2020-09-01 21:21:58 +00:00
Wenkai Du
778ab61097
Fix incorrect threads split in sendrecv ( #261 )
...
[ROCm/rccl commit: 4180e6409e ]
2020-08-31 17:33:22 -07:00
Wenkai Du
03bb6bcb54
Increase minimal channels for gfx908 ( #259 )
...
[ROCm/rccl commit: c5cbece6d0 ]
2020-08-26 11:40:11 -07:00
Wenkai Du
0898fea746
Only use software barrier for synchronization ( #258 )
...
[ROCm/rccl commit: b0919dc46c ]
2020-08-25 13:16:34 -07:00
Wenkai Du
5f49a0e088
Add NPS4 support on some models ( #256 )
...
* Add NPS4 support on some models
* Add XML models
[ROCm/rccl commit: 391bbf3f1e ]
2020-08-19 11:03:20 -07:00
gilbertlee-amd
3e4ddd065b
Upgrading various TransferBench features ( #257 )
...
[ROCm/rccl commit: ec9af40fcd ]
2020-08-19 09:47:19 -06:00
Wenkai Du
3d5fb8142e
Add another Rome model ( #249 )
...
* Add another Rome model
* Add gfx908 4P3L models and support
* Revert "Use cached value for detecting GDR support only once"
This reverts commit 0108a1219d.
* Skip using ibverb for GPU direct RDMA detection
* Fine tune one Rome model
[ROCm/rccl commit: a51e4071e3 ]
2020-08-17 10:51:02 -07:00
gilbertlee-amd
1a9b00a7fd
Fixes to make TransferBench compile for hipclang ( #254 )
...
[ROCm/rccl commit: c985478133 ]
2020-08-13 12:25:28 -06:00
saadrahim
67bb880b8b
Adding gfx908 to CI ( #253 )
...
[ROCm/rccl commit: 6d8e19929c ]
2020-08-13 11:07:33 -06:00
Wenkai Du
f242a2f0b0
Collect gcnArch and hipDeviceArch_t in XML ( #252 )
...
[ROCm/rccl commit: 7e3d8a31cc ]
2020-08-12 15:48:38 -07:00
saadrahim
f309fb5b29
Cleaning up CI code by removing overrides ( #251 )
...
[ROCm/rccl commit: 50af2e9b66 ]
2020-08-12 12:38:10 -06:00
Wenkai Du
e5ec2d94d5
Merge pull request #248 from wenkaidu/2.7.8
...
2.7.8
[ROCm/rccl commit: 066223333d ]
2020-08-11 08:20:37 -07:00
Wenkai Du
14ad6ff3b4
Merge remote-tracking branch 'nccl/master' into 2.7.8
...
[ROCm/rccl commit: 7e3f841fab ]
2020-08-10 16:11:00 +00:00
Wenkai Du
26c540abb8
Merge pull request #247 from wenkaidu/rome
...
Additional Rome models support
[ROCm/rccl commit: 3c46cb8ad4 ]
2020-08-07 10:56:12 -07:00
MurtadhaAldallal
f1373612b0
Update rccl_prim_test.cpp ( #246 )
...
Adding DoubleLocalCopy operation and freeing buffer memory at end.
DoubleLocalCopy patch added.
[ROCm/rccl commit: 390c63cf0d ]
2020-08-07 08:20:14 -07:00
Wenkai Du
c9815aaa36
Add more Rome 4P2H models
...
[ROCm/rccl commit: 09ef75656a ]
2020-08-06 18:20:02 +00:00
Stanley Tsang
bbc4b72ebe
Adding static library building option. ( #244 )
...
* Adding static library building option.
* Disabling running tests for static build
* Removing static packaging in CI
Co-authored-by: Saad Rahim <saad.rahim@amd.com>
[ROCm/rccl commit: c5d4d9eb76 ]
2020-08-06 11:19:43 -06:00
saadrahim
e5432857db
Download GTest if not found in system ( #237 )
...
Co-authored-by: Stanley Tsang <stanley.tsang@amd.com>
[ROCm/rccl commit: 0dc019e35f ]
2020-08-06 09:36:58 -06:00