rohit pathania
bc51b5bc28
Display each workgroup, links, and directions with throughputs
...
[ROCm/rccl commit: e5b13d69e5 ]
2019-08-30 13:28:23 +05:30
Wenkai Du
04004816ba
Merge pull request #130 from wenkaidu/p2p_fix
...
Allocate opCount in pinned host memory for P2P transport
[ROCm/rccl commit: 9c501fb8fb ]
2019-08-29 14:12:03 -07:00
Wenkai Du
daf2c4b200
Allocate opCount in pinned host memory for P2P transport
...
To avoid remote P2P read access when checking remote GPU's opCount
[ROCm/rccl commit: 8c975353ed ]
2019-08-29 10:22:09 -07:00
amdkila
ea0ce5c064
Merge pull request #128 from amdkila/hip-clang
...
Added hip-clang options to install script, and openmp/pthread flags
[ROCm/rccl commit: 259583cde6 ]
2019-08-27 16:23:40 -06:00
Wenkai Du
96cab1f5f5
Merge pull request #127 from wenkaidu/rdma
...
Set RDMA default to off state
[ROCm/rccl commit: a4ef5a3dd4 ]
2019-08-26 11:46:10 -07:00
Wenkai Du
4afd6818ba
Set RDMA default to off state
...
[ROCm/rccl commit: 0f16ad966a ]
2019-08-26 10:59:33 -07:00
saadrahim
e433b21b23
Updating versioning to follow rocm-cmake standard ( #126 )
...
[ROCm/rccl commit: 544d4fb704 ]
2019-08-23 16:33:38 -06:00
Akila Premachandra
94b33a7550
Added hip-clang options to install script, and openmp/pthread options to CMakeLists.txt
...
[ROCm/rccl commit: f48ae5c98d ]
2019-08-23 22:02:42 +00:00
Wenkai Du
4df1defc3b
Merge pull request #125 from wenkaidu/fix_nvml_id
...
Assign unused nvmlDev to avoid random number
[ROCm/rccl commit: 6759660529 ]
2019-08-19 09:08:13 -07:00
Wenkai Du
54608abf5c
Merge pull request #117 from rpathani/xgmi_bench
...
Modified the code to use RTC clock frequency based on gpu gcn id
[ROCm/rccl commit: ee5dec4467 ]
2019-08-19 08:59:34 -07:00
rpathani
c441f2ff9b
Update rccl_prim_test.cpp
...
[ROCm/rccl commit: 40e30b5168 ]
2019-08-19 12:44:11 +05:30
Wenkai Du
175bf8e29e
Merge pull request #124 from wenkaidu/upstream_sync
...
Upstream sync
[ROCm/rccl commit: a67ae11ce4 ]
2019-08-16 16:41:55 -07:00
Wenkai Du
04cd446d89
Assign unused nvmlDev to avoid random number
...
[ROCm/rccl commit: 86efdfc3b5 ]
2019-08-16 16:34:14 -07:00
Wenkai Du
60989a3fc9
Merge remote-tracking branch 'remotes/nccl/master' into HEAD
...
[ROCm/rccl commit: 7c38da0939 ]
2019-08-16 16:13:34 -07:00
Wenkai Du
1658acbd78
Merge pull request #123 from wenkaidu/tune_unroll
...
Tune AUTOUNROLL for better performance
[ROCm/rccl commit: 72a64e27f3 ]
2019-08-16 11:15:49 -07:00
Wenkai Du
7396d5c3ba
Tune AUTOUNROLL for better performance
...
Also remove all unused UNROLL defines
[ROCm/rccl commit: 1faededc03 ]
2019-08-16 10:34:53 -07:00
rpathani
eaa1cdb48c
Merge branch 'master' into xgmi_bench
...
[ROCm/rccl commit: deea20d49c ]
2019-08-16 10:56:56 +05:30
Wenkai Du
761a2d2274
Merge pull request #121 from mhbliao/hliao/master/swdev-200061
...
Fix build with hip-clang.
[ROCm/rccl commit: 50c2202fe9 ]
2019-08-15 12:40:46 -07:00
Michael LIAO
f4a240065f
Fix build with hip-clang.
...
- Add necessary function attribute for HIP programming model.
- Explicitly include hsa headers.
[ROCm/rccl commit: 9369f8d75d ]
2019-08-15 14:56:04 -04:00
Wenkai Du
c920272a9e
Merge pull request #122 from wenkaidu/tune_ll
...
Tune LL threshold for VEGA
[ROCm/rccl commit: 3f6662f837 ]
2019-08-15 10:33:17 -07:00
Wenkai Du
d4862fa605
Tune LL threshold for VEGA
...
Also move the abort check after SPINS_BEFORE_CHECK_ABORT, as in NCCL
[ROCm/rccl commit: 2223cccf15 ]
2019-08-15 09:16:11 -07:00
Wenkai Du
9f79f079f7
Merge pull request #120 from wenkaidu/rccl_2.4_update
...
RCCL 2.4 update
[ROCm/rccl commit: 9af66195db ]
2019-08-14 15:21:30 -07:00
Wenkai Du
93c44e96cb
Default to minimal 2 rings and improve LL loop
...
[ROCm/rccl commit: 4b77a16f3f ]
2019-08-14 14:12:56 -07:00
Wenkai Du
1feef99e7d
Remove duplicate line
...
[ROCm/rccl commit: 5782a8d857 ]
2019-08-14 13:22:43 -07:00
Wenkai Du
5971141b57
Merge remote-tracking branch 'remotes/rccl/master' into rccl_2.4_update
...
[ROCm/rccl commit: 6827b174c0 ]
2019-08-14 10:44:18 -07:00
Wenkai Du
6047487815
RCCL 2.4 update
...
[ROCm/rccl commit: f11c8f60cd ]
2019-08-14 10:42:35 -07:00
David Addison
221b65bee1
Merge branch 'lowintelligence-shm'
...
PR#196
[ROCm/rccl commit: ccb1298148 ]
2019-08-14 10:09:53 -07:00
David Addison
d57c0b0f92
Updated PR#196 to use a common hash function
...
[ROCm/rccl commit: fad079a8ae ]
2019-08-14 10:08:39 -07:00
David Addison
bb5b11fa23
Merge branch 'shm' of git://github.com/lowintelligence/nccl into lowintelligence-shm
...
[ROCm/rccl commit: 01d1836668 ]
2019-08-14 09:45:45 -07:00
rohit pathania
2dbcb62caf
Modified the code to use RTC clock frequency based on gpu gcn id
...
[ROCm/rccl commit: 65e2f5d87b ]
2019-08-14 12:55:12 +05:30
David Addison
c7957daee3
Make use of SO_REUSEPORT conditional
...
Fixes: #244
SO_REUSEPORT was introduced in Linux 3.9.
This change allows NCCL to compile against older releases.
The functionality is only required if the user is specifying
a NCCL bootstrap address via an environment variable.
[ROCm/rccl commit: 7f2b337e70 ]
2019-08-13 16:32:07 -07:00
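The conditional use of SO_REUSEPORT described in the entry above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `enableReuse` is hypothetical, and the only assumption is the usual pattern of guarding the option with a preprocessor check so the file still compiles against headers that predate it.

```cpp
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

// Hypothetical helper (not taken from NCCL or RCCL): enables address reuse on
// a socket, and enables port reuse only when the build headers define
// SO_REUSEPORT. Returns 0 on success, -1 if a setsockopt call fails.
int enableReuse(int fd) {
  int opt = 1;
  if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) != 0) {
    return -1;
  }
#if defined(SO_REUSEPORT)
  // Compiled in only when the headers provide the option.
  if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &opt, sizeof(opt)) != 0) {
    return -1;
  }
#endif
  return 0;
}
```

On older headers the guarded block simply disappears at compile time, so the build succeeds and only the port-reuse behavior is lost.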
rpathani
2185206508
Adding linkinfo and srcGPU to destGPU info ( #114 )
...
* Adding linkinfo and srcGPU to destGPU info
[ROCm/rccl commit: 40445c17d8 ]
2019-08-13 09:25:03 -07:00
rohit pathania
042261445d
Merge branch 'xgmi_bench' of https://github.com/rpathani/rccl into xgmi_bench
...
# Conflicts:
# tools/rccl-prim-test/rccl_prim_test.cpp
[ROCm/rccl commit: 0f74929dab ]
2019-08-13 11:36:56 +05:30
rohit pathania
86f6d95b06
Adding linkinfo and srcGPU to destGPU info
...
[ROCm/rccl commit: 3bbf924ff8 ]
2019-08-13 11:28:50 +05:30
Stanley Tsang
d01dc8cf70
Merge pull request #116 from stanleytsang-amd/master
...
Removing unnecessary device collective source files.
[ROCm/rccl commit: b3a57dbb33 ]
2019-08-12 18:26:02 -04:00
Stanley Tsang
de09bece99
Removing unnecessary device collective source files.
...
[ROCm/rccl commit: 3a61907182 ]
2019-08-12 18:23:23 +00:00
rohit pathania
95162665c7
Adding linkinfo and srcGPU to destGPU info
...
[ROCm/rccl commit: 5a2f74b8d0 ]
2019-08-09 12:44:06 +05:30
gilbertlee-amd
8645391260
Adding TransferBench tool ( #113 )
...
* Adding standalone TransferBench tool
[ROCm/rccl commit: b8cf48fc16 ]
2019-08-07 17:21:41 -06:00
Wenkai Du
abab7569f9
Merge pull request #112 from wenkaidu/hdp
...
Get HDP register address from hipDeviceGetAttribute API
[ROCm/rccl commit: f1c727d4ce ]
2019-08-05 14:27:19 -07:00
Wenkai Du
909e014b51
Get HDP register address from hipDeviceGetAttribute API
...
[ROCm/rccl commit: 84d3344796 ]
2019-08-05 14:14:09 -07:00
Wenkai Du
b540c55c9b
Merge pull request #108 from wenkaidu/xgmi_finegrain
...
Remove dependency on HSA_FORCE_FINE_GRAIN_PCIE flag for XGMI link
[ROCm/rccl commit: 4a9bdd8539 ]
2019-08-02 10:00:48 -07:00
Wenkai Du
fe2cb9f4cb
Merge pull request #110 from mhbliao/hliao/master/swdev-198268
...
Revise the previous fix to use the canonical path to HSA.
[ROCm/rccl commit: 315f792f83 ]
2019-08-01 12:46:25 -07:00
Michael LIAO
c14ef9f408
Revise the previous fix to use the canonical path to HSA.
...
- This fixes build failures under certain environments.
[ROCm/rccl commit: 4f2aa06688 ]
2019-08-01 14:50:44 -04:00
Wenkai Du
4d9eb5bd76
Merge pull request #107 from mhbliao/hliao/master/swdev-198268
...
Fix build with hip-clang
[ROCm/rccl commit: 9189279220 ]
2019-08-01 08:58:37 -07:00
Wenkai Du
2dcb42effd
Remove dependency on HSA_FORCE_FINE_GRAIN_PCIE flag for XGMI link
...
[ROCm/rccl commit: e7022e9196 ]
2019-08-01 04:26:37 +00:00
Michael LIAO
4b5bf9f227
Fix build with hip-clang
...
Two minor issues are solved:
+ Enclose the kernel function in parentheses, as hip-clang defines
`hipLaunchKernelGGL` as a macro.
+ Explicitly include <hsa.h>, which hip-clang requires.
[ROCm/rccl commit: 41310144f6 ]
2019-07-31 15:07:36 -04:00
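Why parentheses matter when a launch call is a macro can be shown with a small stand-alone sketch. It assumes the problem in the entry above was the usual one: a templated kernel name containing a comma is split into several macro arguments by the preprocessor. `LAUNCH`, `scaled`, and `runScaled` are hypothetical names for illustration; `LAUNCH` is not the real `hipLaunchKernelGGL` definition.

```cpp
// Hypothetical stand-in for a launch macro that takes a callable and one
// argument. This is NOT how hipLaunchKernelGGL is actually defined.
#define LAUNCH(kernel, arg) (kernel)(arg)

// A function template whose template-id contains a comma.
template <typename T, int N>
int scaled(T x) {
  return static_cast<int>(x) * N;
}

int runScaled() {
  // LAUNCH(scaled<int, 3>, 7) would not compile: the preprocessor splits on
  // the comma inside the angle brackets and sees three macro arguments.
  // Wrapping the template-id in parentheses keeps it as a single argument.
  return LAUNCH((scaled<int, 3>), 7);
}
```

The same wrapping is harmless when the launch call is an ordinary function rather than a macro, so it is a safe way to write code that builds under both.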
Cao Zongyan
d45a1180f7
Refine RPM package building spec file.
...
Add /sbin/ldconfig into RPM package install operations.
[ROCm/rccl commit: bfb3921519 ]
2019-07-31 10:36:22 -07:00
Wenkai Du
6688279075
Add gfx908 target ( #106 )
...
[ROCm/rccl commit: 1969e89003 ]
2019-07-30 13:56:45 -07:00
Wenkai Du
62e6e67e31
Remove extra "." from version string ( #104 )
...
[ROCm/rccl commit: 1fee6f9d50 ]
2019-07-25 15:25:02 -07:00
saadrahim
596e200499
Changing to rocm-cmake new style versioning ( #103 )
...
[ROCm/rccl commit: fdee095dd3 ]
2019-07-22 23:40:13 +00:00