Commit Graph

245 commits

Author SHA1 Message Date
rohit pathania bc51b5bc28 display each workgroup ,links and directions with throughputs
[ROCm/rccl commit: e5b13d69e5]
2019-08-30 13:28:23 +05:30
Wenkai Du 04004816ba Merge pull request #130 from wenkaidu/p2p_fix
Allocate opCount in pinned host memory for P2P transport

[ROCm/rccl commit: 9c501fb8fb]
2019-08-29 14:12:03 -07:00
Wenkai Du daf2c4b200 Allocate opCount in pinned host memory for P2P transport
To avoid remote P2P read access when checking remote GPU's opCount


[ROCm/rccl commit: 8c975353ed]
2019-08-29 10:22:09 -07:00
amdkila ea0ce5c064 Merge pull request #128 from amdkila/hip-clang
Added hip-clang options to install script, and openmp/pthread flags

[ROCm/rccl commit: 259583cde6]
2019-08-27 16:23:40 -06:00
Wenkai Du 96cab1f5f5 Merge pull request #127 from wenkaidu/rdma
Set RDMA default to off state

[ROCm/rccl commit: a4ef5a3dd4]
2019-08-26 11:46:10 -07:00
Wenkai Du 4afd6818ba Set RDMA default to off state
[ROCm/rccl commit: 0f16ad966a]
2019-08-26 10:59:33 -07:00
saadrahim e433b21b23 Updating versioning to follow rocm-cmake standard (#126)
[ROCm/rccl commit: 544d4fb704]
2019-08-23 16:33:38 -06:00
Akila Premachandra 94b33a7550 Added hip-clang options to install script, and openmp/pthread options to CMakeLists.txt
[ROCm/rccl commit: f48ae5c98d]
2019-08-23 22:02:42 +00:00
Wenkai Du 4df1defc3b Merge pull request #125 from wenkaidu/fix_nvml_id
Assign unused nmvlDev to avoid random number

[ROCm/rccl commit: 6759660529]
2019-08-19 09:08:13 -07:00
Wenkai Du 54608abf5c Merge pull request #117 from rpathani/xgmi_bench
Modified the code to use RTC clock frequency based on gpu gcn id

[ROCm/rccl commit: ee5dec4467]
2019-08-19 08:59:34 -07:00
rpathani c441f2ff9b Update rccl_prim_test.cpp
[ROCm/rccl commit: 40e30b5168]
2019-08-19 12:44:11 +05:30
Wenkai Du 175bf8e29e Merge pull request #124 from wenkaidu/upstream_sync
Upstream sync

[ROCm/rccl commit: a67ae11ce4]
2019-08-16 16:41:55 -07:00
Wenkai Du 04cd446d89 Assign unused nmvlDev to avoid random number
[ROCm/rccl commit: 86efdfc3b5]
2019-08-16 16:34:14 -07:00
Wenkai Du 60989a3fc9 Merge remote-tracking branch 'remotes/nccl/master' into HEAD
[ROCm/rccl commit: 7c38da0939]
2019-08-16 16:13:34 -07:00
Wenkai Du 1658acbd78 Merge pull request #123 from wenkaidu/tune_unroll
Tune AUTOUNROLL for better performance

[ROCm/rccl commit: 72a64e27f3]
2019-08-16 11:15:49 -07:00
Wenkai Du 7396d5c3ba Tune AUTOUNROLL for better performance
Also remove all unused UNROLL defines


[ROCm/rccl commit: 1faededc03]
2019-08-16 10:34:53 -07:00
rpathani eaa1cdb48c Merge branch 'master' into xgmi_bench
[ROCm/rccl commit: deea20d49c]
2019-08-16 10:56:56 +05:30
Wenkai Du 761a2d2274 Merge pull request #121 from mhbliao/hliao/master/swdev-200061
Fix build with hip-clang.

[ROCm/rccl commit: 50c2202fe9]
2019-08-15 12:40:46 -07:00
Michael LIAO f4a240065f Fix build with hip-clang.
- Add necessary function attribute for HIP programming model.
- Explicitly include hsa headers.


[ROCm/rccl commit: 9369f8d75d]
2019-08-15 14:56:04 -04:00
Wenkai Du c920272a9e Merge pull request #122 from wenkaidu/tune_ll
Tune LL threshold for VEGA

[ROCm/rccl commit: 3f6662f837]
2019-08-15 10:33:17 -07:00
Wenkai Du d4862fa605 Tune LL threshold for VEGA
Also move abort check after SPINS_BEFORE_CHECK_ABORT as NCCL


[ROCm/rccl commit: 2223cccf15]
2019-08-15 09:16:11 -07:00
Wenkai Du 9f79f079f7 Merge pull request #120 from wenkaidu/rccl_2.4_update
RCCL 2.4 update

[ROCm/rccl commit: 9af66195db]
2019-08-14 15:21:30 -07:00
Wenkai Du 93c44e96cb Default to minimal 2 rings and improve LL loop
[ROCm/rccl commit: 4b77a16f3f]
2019-08-14 14:12:56 -07:00
Wenkai Du 1feef99e7d Remove duplicate line
[ROCm/rccl commit: 5782a8d857]
2019-08-14 13:22:43 -07:00
Wenkai Du 5971141b57 Merge remote-tracking branch 'remotes/rccl/master' into rccl_2.4_update
[ROCm/rccl commit: 6827b174c0]
2019-08-14 10:44:18 -07:00
Wenkai Du 6047487815 RCCL 2.4 update
[ROCm/rccl commit: f11c8f60cd]
2019-08-14 10:42:35 -07:00
David Addison 221b65bee1 Merge branch 'lowintelligence-shm'
PR#196


[ROCm/rccl commit: ccb1298148]
2019-08-14 10:09:53 -07:00
David Addison d57c0b0f92 Updated PR#196 to use a common hash function
[ROCm/rccl commit: fad079a8ae]
2019-08-14 10:08:39 -07:00
David Addison bb5b11fa23 Merge branch 'shm' of git://github.com/lowintelligence/nccl into lowintelligence-shm
[ROCm/rccl commit: 01d1836668]
2019-08-14 09:45:45 -07:00
rohit pathania 2dbcb62caf Modified the code to use RTC clock frequency based on gpu gcn id
[ROCm/rccl commit: 65e2f5d87b]
2019-08-14 12:55:12 +05:30
David Addison c7957daee3 Make use of SO_REUSEPORT conditional
Fixes: #244

SO_REUSEPORT was introduced in Linux 3.9 and later.
This change allows NCCL to compile against older releases.

The functionality is only required if the user is specifying
a NCCL bootstrap address via an environment variable.


[ROCm/rccl commit: 7f2b337e70]
2019-08-13 16:32:07 -07:00
rpathani 2185206508 Adding linkinfo and srcGPU to destGPU info (#114)
* Adding linkinfo and srcGPU to destGPU info

[ROCm/rccl commit: 40445c17d8]
2019-08-13 09:25:03 -07:00
rohit pathania 042261445d Merge branch 'xgmi_bench' of https://github.com/rpathani/rccl into xgmi_bench
# Conflicts:
#	tools/rccl-prim-test/rccl_prim_test.cpp


[ROCm/rccl commit: 0f74929dab]
2019-08-13 11:36:56 +05:30
rohit pathania 86f6d95b06 Adding linkinfo and srcGPU to destGPU info
[ROCm/rccl commit: 3bbf924ff8]
2019-08-13 11:28:50 +05:30
Stanley Tsang d01dc8cf70 Merge pull request #116 from stanleytsang-amd/master
Removing unnecessary device collective source files.

[ROCm/rccl commit: b3a57dbb33]
2019-08-12 18:26:02 -04:00
Stanley Tsang de09bece99 Removing unnecessary device collective source files.
[ROCm/rccl commit: 3a61907182]
2019-08-12 18:23:23 +00:00
rohit pathania 95162665c7 Adding linkinfo and srcGPU to destGPU info
[ROCm/rccl commit: 5a2f74b8d0]
2019-08-09 12:44:06 +05:30
gilbertlee-amd 8645391260 Adding TransferBench tool (#113)
* Adding standalone TransferBench tool

[ROCm/rccl commit: b8cf48fc16]
2019-08-07 17:21:41 -06:00
Wenkai Du abab7569f9 Merge pull request #112 from wenkaidu/hdp
Get HDP register address from hipDeviceGetAttribute API

[ROCm/rccl commit: f1c727d4ce]
2019-08-05 14:27:19 -07:00
Wenkai Du 909e014b51 Get HDP register address from hipDeviceGetAttribute API
[ROCm/rccl commit: 84d3344796]
2019-08-05 14:14:09 -07:00
Wenkai Du b540c55c9b Merge pull request #108 from wenkaidu/xgmi_finegrain
Remove dependency to HSA_FORCE_FINE_GRAIN_PCIE flag for XGMI link

[ROCm/rccl commit: 4a9bdd8539]
2019-08-02 10:00:48 -07:00
Wenkai Du fe2cb9f4cb Merge pull request #110 from mhbliao/hliao/master/swdev-198268
Revise the previous fix to use the canonical path to HSA.

[ROCm/rccl commit: 315f792f83]
2019-08-01 12:46:25 -07:00
Michael LIAO c14ef9f408 Revise the previous fix to use the canonical path to HSA.
- This fixes the build failures under certain environments.


[ROCm/rccl commit: 4f2aa06688]
2019-08-01 14:50:44 -04:00
Wenkai Du 4d9eb5bd76 Merge pull request #107 from mhbliao/hliao/master/swdev-198268
Fix build with hip-clang

[ROCm/rccl commit: 9189279220]
2019-08-01 08:58:37 -07:00
Wenkai Du 2dcb42effd Remove dependency to HSA_FORCE_FINE_GRAIN_PCIE flag for XGMI link
[ROCm/rccl commit: e7022e9196]
2019-08-01 04:26:37 +00:00
Michael LIAO 4b5bf9f227 Fix build with hip-clang
Two minor issues are solved:
+ Enclose the kernel function with parenthesis as hip-clang defines
  `hipLaunchKernelGGL` as macro.
+ Need to explicitly include <hsa.h> for hip-clang.


[ROCm/rccl commit: 41310144f6]
2019-07-31 15:07:36 -04:00
Cao Zongyan d45a1180f7 Refine RPM package building spec file.
Add /sbin/ldconfig into RPM package install operations.


[ROCm/rccl commit: bfb3921519]
2019-07-31 10:36:22 -07:00
Wenkai Du 6688279075 Add gfx908 target (#106)
[ROCm/rccl commit: 1969e89003]
2019-07-30 13:56:45 -07:00
Wenkai Du 62e6e67e31 Remove extra "." from version string (#104)
[ROCm/rccl commit: 1fee6f9d50]
2019-07-25 15:25:02 -07:00
saadrahim 596e200499 Changing to rocm-cmake new style versioning (#103)
[ROCm/rccl commit: fdee095dd3]
2019-07-22 23:40:13 +00:00