Stanley Tsang
dc403e0ca2
Making hip-clang the default compiler; documentation update (#216)
...
* Making hip-clang the default compiler; documentation update
* Adding back --hip-clang to install.sh as a silent option for CI
2020-06-04 11:58:27 -06:00
Wenkai Du
2a4514772c
Merge pull request #214 from wenkaidu/gdr
...
Use cached value for detecting GDR support only once
2020-05-22 13:36:23 -07:00
Wenkai Du
67c8e72ce3
Use cached value for detecting GDR support only once
2020-05-22 17:19:10 +00:00
Wenkai Du
957be85944
Merge pull request #212 from wenkaidu/version
...
Report HIP version in logs
2020-05-20 16:25:54 -07:00
Wenkai Du
e41ab173cf
Report HIP version in logs
2020-05-20 18:15:32 +00:00
Wenkai Du
af703877cf
Merge pull request #210 from wenkaidu/unroll
...
Revert "Tuning the inline and unroll to reduce the scratch usage"
2020-05-15 15:35:27 -07:00
Wenkai Du
ca493a6b51
Revert "Tuning the inline and unroll to reduce the scratch usage"
...
This reverts commit eec319038e.
2020-05-15 14:15:40 -07:00
Wenkai Du
c245f1507e
Merge pull request #209 from wenkaidu/hip-clang
...
Rename files which only diffs in extension
2020-05-15 13:51:12 -07:00
Wenkai Du
706de76046
Merge pull request #208 from wenkaidu/perf_xgmi
...
Give preference to path with more XGMI connections
2020-05-15 10:07:22 -07:00
Wenkai Du
e7b36304c8
Rename files which only diffs in extension
2020-05-15 09:16:32 -07:00
Wenkai Du
ca4987e5fb
Merge pull request #207 from wenkaidu/hip-clang
...
rccl-prim-test: add flags when calling hipExtLaunchMultiKernelMultiDe…
2020-05-14 18:31:56 -07:00
Wenkai Du
b3c9852634
Give preference to path with more XGMI connections
2020-05-14 15:33:16 -07:00
Wenkai Du
f1058b6353
rccl-prim-test: add flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang
2020-05-12 23:54:07 +00:00
Stanley Tsang
66a9f11910
Merge pull request #206 from stanleytsang-amd/develop
...
Updating RCCL documentation
2020-05-12 17:24:40 -06:00
Stanley Tsang
787ac13486
Restoring doxygen documentation to nccl.h.in.
2020-05-12 22:03:31 +00:00
Stanley Tsang
b59b9d328b
Updating README and readthedocs documentation.
2020-05-12 20:11:49 +00:00
Wenkai Du
52752aba6e
Merge pull request #205 from wenkaidu/bf16
...
Update rccl_bfloat16.h to match rocBLAS
2020-05-11 09:55:06 -07:00
Wenkai Du
d5a07a7b5c
Update rccl_bfloat16.h to match rocBLAS
2020-05-08 22:48:07 +00:00
Wenkai Du
94d16c0f0a
Merge pull request #204 from wenkaidu/launch_flags
...
Set flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang
2020-05-08 11:20:50 -07:00
Wenkai Du
24ea2ef6dd
Set flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang
2020-05-08 15:57:14 +00:00
Saad Rahim
33c23fdcda
Merge remote-tracking branch 'upstream/master' into develop
2020-04-29 16:12:37 -07:00
saadrahim
308e96877e
Refactoring packaging (#193)
2020-04-29 16:24:21 -06:00
Wenkai Du
914b6ca27c
Merge pull request #199 from wenkaidu/para_jobs
...
Enable parallel jobs for hip-clang build
2020-04-29 13:49:48 -07:00
saadrahim
65390f9872
Junit test storage call corrected (#197)
...
* Focus testing on Centos for now
* storing junit
* Reducing test suite to Ubuntu
2020-04-29 14:22:03 -06:00
Wenkai Du
3f471ab5b1
Enable parallel jobs for hip-clang build
2020-04-29 17:58:16 +00:00
saadrahim
6b1d70b03b
Adding NCCL_DEBUG=INFO Logging to CI (#196)
2020-04-27 15:12:15 -06:00
Wenkai Du
f7c27c6c9f
Merge pull request #195 from wenkaidu/sync_nccl
...
Sync up with NCCL
2020-04-27 11:45:05 -07:00
Wenkai Du
5743c6b7d2
topo_expl: fix build error
2020-04-27 17:17:05 +00:00
Wenkai Du
c4edc257b0
Merge remote-tracking branch 'nccl/master' into HEAD
2020-04-27 17:16:54 +00:00
Wenkai Du
cf5070f6c0
Merge pull request #194 from wenkaidu/search
...
Fix incorrect next device ID in PCI ordered search
2020-04-27 09:54:09 -07:00
Wenkai Du
edb49ed2d5
Fix incorrect next device ID in PCI ordered search
2020-04-25 01:01:13 +00:00
saadrahim
cc66dd46e9
Enabling CI Testing Again (#192)
...
Adding CI support based on AMD internal CI refactor.
2020-04-24 10:36:57 -06:00
Gilbert Lee
339bf9ff19
Adding option to re-use streams instead of re-creating per topology
2020-04-23 15:53:40 +00:00
Sylvain Jeaugey
f36540f55a
Fix crash when only a subset of GPUs are visible within a container.
...
Fixes #326.
2020-04-17 10:03:14 -07:00
Sylvain Jeaugey
23a9fbb788
Improve robustness of PCI detection
...
Fallback to default values when class/speed is unknown.
2020-04-16 14:27:50 -07:00
Wenkai Du
c017f6e900
Merge pull request #191 from wenkaidu/gfx803
...
Revert "Temporary disable 0x803 target due to build error"
2020-04-16 09:27:24 -07:00
aokomoriuta
a783484ab5
Fix wrong variable name "slice" to "chunk"
...
https://github.com/NVIDIA/nccl/issues/287
2020-04-14 19:00:51 -07:00
Wenkai Du
5170bd1c02
Revert "Temporary disable 0x803 target due to build error"
...
This reverts commit cd7ab1425b.
2020-04-14 16:58:41 +00:00
Wenkai Du
3ac98e7d39
Merge pull request #188 from wenkaidu/prim_test
...
rccl-prim-test: auto-detect rings in 4P and 8P configurations
2020-04-14 09:52:49 -07:00
Wenkai Du
ef7064ba9b
rccl-prim-test: auto-detect rings in 4P and 8P configurations
2020-04-10 18:17:21 +00:00
Sylvain Jeaugey
b5b6c6acdd
Fix bug #307: wrong NIC selection on the reduction tree.
...
The reduction tree (tree up) was inverting the NICs to use,
causing performance issue in cases where we are using different
NICs on a given channel.
2020-04-09 17:14:07 -07:00
Aaron Enye Shi
fa52d4f0aa
Merge pull request #187 from aaronenyeshi/fix-hip-vdi-hsa-ext
...
Fix HIP-Clang build with HSA headers
2020-04-03 19:06:38 -04:00
Aaron Enye Shi
a95090d981
Fix HIP-Clang build with HSA headers
...
HIP-Clang does not include these HSA headers, and they need to be explicitly added in RCCL.
2020-04-03 17:58:23 -04:00
Wenkai Du
3cbe5c8a40
Merge pull request #186 from wenkaidu/v2.6.4
...
Merge with NCCL 2.6.4
2020-04-02 10:42:01 -07:00
Wenkai Du
6f54b23503
topo_expl: update to 2.6
2020-04-01 13:37:08 -07:00
Wenkai Du
fa36fd9ef9
Merge remote-tracking branch 'nccl/master' into v2.6.4_merge
2020-04-01 13:35:12 -07:00
Sylvain Jeaugey
533e3702cf
Merge pull request #314 from NVIDIA/v2.6
...
2.6.4-1
2020-03-26 17:31:24 -07:00
Sylvain Jeaugey
b221128eca
2.6.4-1
...
Add support for network collectives.
Add support for XML topology dump/injection.
Add text values for GDR and P2P Levels, including "NVL".
Add speed detection for PCI, Infiniband and Ethernet cards.
Add CPU detection for ARM and AMD CPUs.
Add support for adaptive routing on Infiniband.
Change NET plugin API to v3 : merge PCI path and GPU pointer
capability into a single structure and add other properties.
2020-03-20 14:58:36 -07:00
Rashika Kheria
6c61492eba
Check return code for Flush operation
...
Current NCCL code does not abort for failed Flush operations by
underlying network. This may compromise data integrity.
Signed-off-by: Rashika Kheria <rashika@amazon.com>
2020-03-16 20:40:59 -07:00
Wenkai Du
ebc823e603
rccl-prim-test: add all-to-all benchmark (#185)
...
For gfx908, support simple detection of ring topology.
Call ReduceOrCopyMulti directly from kernel.
Also simplify code by removing kernel start synchronization option
which has no effect on throughput measurements.
2020-03-16 10:00:54 -07:00