Revision graph

243 revisions

Author SHA1 Message Date
Wenkai Du daf2c4b200 Allocate opCount in pinned host memory for P2P transport
To avoid remote P2P read access when checking remote GPU's opCount


[ROCm/rccl commit: 8c975353ed]
2019-08-29 10:22:09 -07:00
amdkila ea0ce5c064 Merge pull request #128 from amdkila/hip-clang
Added hip-clang options to install script, and openmp/pthread flags

[ROCm/rccl commit: 259583cde6]
2019-08-27 16:23:40 -06:00
Wenkai Du 96cab1f5f5 Merge pull request #127 from wenkaidu/rdma
Set RDMA default to off state

[ROCm/rccl commit: a4ef5a3dd4]
2019-08-26 11:46:10 -07:00
Wenkai Du 4afd6818ba Set RDMA default to off state
[ROCm/rccl commit: 0f16ad966a]
2019-08-26 10:59:33 -07:00
saadrahim e433b21b23 Updating versioning to follow rocm-cmake standard (#126)
[ROCm/rccl commit: 544d4fb704]
2019-08-23 16:33:38 -06:00
Akila Premachandra 94b33a7550 Added hip-clang options to install script, and openmp/pthread options to CMakeLists.txt
[ROCm/rccl commit: f48ae5c98d]
2019-08-23 22:02:42 +00:00
Wenkai Du 4df1defc3b Merge pull request #125 from wenkaidu/fix_nvml_id
Assign unused nvmlDev to avoid random number

[ROCm/rccl commit: 6759660529]
2019-08-19 09:08:13 -07:00
Wenkai Du 54608abf5c Merge pull request #117 from rpathani/xgmi_bench
Modified the code to use RTC clock frequency based on gpu gcn id

[ROCm/rccl commit: ee5dec4467]
2019-08-19 08:59:34 -07:00
rpathani c441f2ff9b Update rccl_prim_test.cpp
[ROCm/rccl commit: 40e30b5168]
2019-08-19 12:44:11 +05:30
Wenkai Du 175bf8e29e Merge pull request #124 from wenkaidu/upstream_sync
Upstream sync

[ROCm/rccl commit: a67ae11ce4]
2019-08-16 16:41:55 -07:00
Wenkai Du 04cd446d89 Assign unused nvmlDev to avoid random number
[ROCm/rccl commit: 86efdfc3b5]
2019-08-16 16:34:14 -07:00
Wenkai Du 60989a3fc9 Merge remote-tracking branch 'remotes/nccl/master' into HEAD
[ROCm/rccl commit: 7c38da0939]
2019-08-16 16:13:34 -07:00
Wenkai Du 1658acbd78 Merge pull request #123 from wenkaidu/tune_unroll
Tune AUTOUNROLL for better performance

[ROCm/rccl commit: 72a64e27f3]
2019-08-16 11:15:49 -07:00
Wenkai Du 7396d5c3ba Tune AUTOUNROLL for better performance
Also remove all unused UNROLL defines


[ROCm/rccl commit: 1faededc03]
2019-08-16 10:34:53 -07:00
rpathani eaa1cdb48c Merge branch 'master' into xgmi_bench
[ROCm/rccl commit: deea20d49c]
2019-08-16 10:56:56 +05:30
Wenkai Du 761a2d2274 Merge pull request #121 from mhbliao/hliao/master/swdev-200061
Fix build with hip-clang.

[ROCm/rccl commit: 50c2202fe9]
2019-08-15 12:40:46 -07:00
Michael LIAO f4a240065f Fix build with hip-clang.
- Add necessary function attribute for HIP programming model.
- Explicitly include hsa headers.


[ROCm/rccl commit: 9369f8d75d]
2019-08-15 14:56:04 -04:00
Wenkai Du c920272a9e Merge pull request #122 from wenkaidu/tune_ll
Tune LL threshold for VEGA

[ROCm/rccl commit: 3f6662f837]
2019-08-15 10:33:17 -07:00
Wenkai Du d4862fa605 Tune LL threshold for VEGA
Also move abort check after SPINS_BEFORE_CHECK_ABORT, as in NCCL


[ROCm/rccl commit: 2223cccf15]
2019-08-15 09:16:11 -07:00
Wenkai Du 9f79f079f7 Merge pull request #120 from wenkaidu/rccl_2.4_update
RCCL 2.4 update

[ROCm/rccl commit: 9af66195db]
2019-08-14 15:21:30 -07:00
Wenkai Du 93c44e96cb Default to minimal 2 rings and improve LL loop
[ROCm/rccl commit: 4b77a16f3f]
2019-08-14 14:12:56 -07:00
Wenkai Du 1feef99e7d Remove duplicate line
[ROCm/rccl commit: 5782a8d857]
2019-08-14 13:22:43 -07:00
Wenkai Du 5971141b57 Merge remote-tracking branch 'remotes/rccl/master' into rccl_2.4_update
[ROCm/rccl commit: 6827b174c0]
2019-08-14 10:44:18 -07:00
Wenkai Du 6047487815 RCCL 2.4 update
[ROCm/rccl commit: f11c8f60cd]
2019-08-14 10:42:35 -07:00
David Addison 221b65bee1 Merge branch 'lowintelligence-shm'
PR#196


[ROCm/rccl commit: ccb1298148]
2019-08-14 10:09:53 -07:00
David Addison d57c0b0f92 Updated PR#196 to use a common hash function
[ROCm/rccl commit: fad079a8ae]
2019-08-14 10:08:39 -07:00
David Addison bb5b11fa23 Merge branch 'shm' of git://github.com/lowintelligence/nccl into lowintelligence-shm
[ROCm/rccl commit: 01d1836668]
2019-08-14 09:45:45 -07:00
rohit pathania 2dbcb62caf Modified the code to use RTC clock frequency based on gpu gcn id
[ROCm/rccl commit: 65e2f5d87b]
2019-08-14 12:55:12 +05:30
David Addison c7957daee3 Make use of SO_REUSEPORT conditional
Fixes: #244

SO_REUSEPORT was introduced in Linux 3.9.
This change allows NCCL to compile against older releases.

The functionality is only required if the user is specifying
a NCCL bootstrap address via an environment variable.


[ROCm/rccl commit: 7f2b337e70]
2019-08-13 16:32:07 -07:00
rpathani 2185206508 Adding linkinfo and srcGPU to destGPU info (#114)
* Adding linkinfo and srcGPU to destGPU info

[ROCm/rccl commit: 40445c17d8]
2019-08-13 09:25:03 -07:00
rohit pathania 042261445d Merge branch 'xgmi_bench' of https://github.com/rpathani/rccl into xgmi_bench
# Conflicts:
#	tools/rccl-prim-test/rccl_prim_test.cpp


[ROCm/rccl commit: 0f74929dab]
2019-08-13 11:36:56 +05:30
rohit pathania 86f6d95b06 Adding linkinfo and srcGPU to destGPU info
[ROCm/rccl commit: 3bbf924ff8]
2019-08-13 11:28:50 +05:30
Stanley Tsang d01dc8cf70 Merge pull request #116 from stanleytsang-amd/master
Removing unnecessary device collective source files.

[ROCm/rccl commit: b3a57dbb33]
2019-08-12 18:26:02 -04:00
Stanley Tsang de09bece99 Removing unnecessary device collective source files.
[ROCm/rccl commit: 3a61907182]
2019-08-12 18:23:23 +00:00
rohit pathania 95162665c7 Adding linkinfo and srcGPU to destGPU info
[ROCm/rccl commit: 5a2f74b8d0]
2019-08-09 12:44:06 +05:30
gilbertlee-amd 8645391260 Adding TransferBench tool (#113)
* Adding standalone TransferBench tool

[ROCm/rccl commit: b8cf48fc16]
2019-08-07 17:21:41 -06:00
Wenkai Du abab7569f9 Merge pull request #112 from wenkaidu/hdp
Get HDP register address from hipDeviceGetAttribute API

[ROCm/rccl commit: f1c727d4ce]
2019-08-05 14:27:19 -07:00
Wenkai Du 909e014b51 Get HDP register address from hipDeviceGetAttribute API
[ROCm/rccl commit: 84d3344796]
2019-08-05 14:14:09 -07:00
Wenkai Du b540c55c9b Merge pull request #108 from wenkaidu/xgmi_finegrain
Remove dependency to HSA_FORCE_FINE_GRAIN_PCIE flag for XGMI link

[ROCm/rccl commit: 4a9bdd8539]
2019-08-02 10:00:48 -07:00
Wenkai Du fe2cb9f4cb Merge pull request #110 from mhbliao/hliao/master/swdev-198268
Revise the previous fix to use the canonical path to HSA.

[ROCm/rccl commit: 315f792f83]
2019-08-01 12:46:25 -07:00
Michael LIAO c14ef9f408 Revise the previous fix to use the canonical path to HSA.
- This fixes the build failures under certain environments.


[ROCm/rccl commit: 4f2aa06688]
2019-08-01 14:50:44 -04:00
Wenkai Du 4d9eb5bd76 Merge pull request #107 from mhbliao/hliao/master/swdev-198268
Fix build with hip-clang

[ROCm/rccl commit: 9189279220]
2019-08-01 08:58:37 -07:00
Wenkai Du 2dcb42effd Remove dependency to HSA_FORCE_FINE_GRAIN_PCIE flag for XGMI link
[ROCm/rccl commit: e7022e9196]
2019-08-01 04:26:37 +00:00
Michael LIAO 4b5bf9f227 Fix build with hip-clang
Two minor issues are solved:
+ Enclose the kernel function in parentheses, as hip-clang defines
  `hipLaunchKernelGGL` as a macro.
+ Need to explicitly include <hsa.h> for hip-clang.


[ROCm/rccl commit: 41310144f6]
2019-07-31 15:07:36 -04:00
Cao Zongyan d45a1180f7 Refine RPM package building spec file.
Add /sbin/ldconfig into RPM package install operations.


[ROCm/rccl commit: bfb3921519]
2019-07-31 10:36:22 -07:00
Wenkai Du 6688279075 Add gfx908 target (#106)
[ROCm/rccl commit: 1969e89003]
2019-07-30 13:56:45 -07:00
Wenkai Du 62e6e67e31 Remove extra "." from version string (#104)
[ROCm/rccl commit: 1fee6f9d50]
2019-07-25 15:25:02 -07:00
saadrahim 596e200499 Changing to rocm-cmake new style versioning (#103)
[ROCm/rccl commit: fdee095dd3]
2019-07-22 23:40:13 +00:00
Wenkai Du d7f25d5be7 Use hipExtLaunchMultiKernelMultiDevice API (#100)
Depends on HIP version with this pull request:
https://github.com/ROCm-Developer-Tools/HIP/pull/1232

[ROCm/rccl commit: 0522041fac]
2019-07-18 09:02:37 -07:00
Ke Wen a66ab68630 Fix NIC distances for 11+ NICs
[ROCm/rccl commit: 4d579e51cc]
2019-07-17 06:32:33 -07:00