Wenkai Du
cfa97eccd3
Add IB/RDMA unit test
2020-06-16 18:29:17 +00:00
Wenkai Du
95b8f70d15
Limit network profiling support to simple protocol and avoid overflow
2020-06-15 20:51:36 +00:00
Wenkai Du
7484e53ff7
Rework network proxy profiling
2020-06-13 03:13:58 +00:00
Wenkai Du
b257676f30
Reduce RCCL kernel count as we don't pass first coll in argument
2020-06-12 21:30:04 +00:00
Wenkai Du
a6d621176c
Sender rank's opCount may be ahead by one if it finishes earlier
2020-06-12 03:39:45 +00:00
Wenkai Du
fee1a20b74
gtest: add scatter, gather and all to all unit tests
2020-06-09 17:44:15 -07:00
Wenkai Du
e98891d039
Log NUMA node of RDMA host buffer allocation
2020-06-09 17:44:15 -07:00
Wenkai Du
812543104d
Add network proxy profiling support
2020-06-09 17:44:15 -07:00
Wenkai Du
c9aa11928a
Calculate and use total wait cycles for RCCL profiling
2020-06-09 17:44:15 -07:00
Wenkai Du
e80e29573c
Add gather, scatter and alltoall collectives
...
Introduce three new APIs:
ncclResult_t ncclGather(const void* sendbuff, void* recvbuff, size_t sendcount,
    ncclDataType_t datatype, int root, ncclComm_t comm, hipStream_t stream);
ncclResult_t ncclScatter(const void* sendbuff, void* recvbuff, size_t recvcount,
    ncclDataType_t datatype, int root, ncclComm_t comm, hipStream_t stream);
ncclResult_t ncclAllToAll(const void* sendbuff, void* recvbuff, size_t count,
    ncclDataType_t datatype, ncclComm_t comm, hipStream_t stream);
Only out-of-place operation is supported.
The preprocessor symbol RCCL_GATHER_SCATTER=1 indicates API availability.
By default the APIs launch the RCCL kernel implementation, which can be disabled with
RCCL_ALLTOALL_KERNEL_DISABLE=1. In that case the APIs use a wrapper around ncclSend and ncclRecv.
2020-06-09 17:44:08 -07:00
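For readers of the gather, scatter and all-to-all commit above, the following is a minimal host-only sketch of the data movement these collectives are conventionally understood to perform. It is not RCCL code and calls no RCCL or HIP function. The names Buffers, modelGather, modelScatter and modelAllToAll are hypothetical, and the chunk layout (rank r's data at offset r * count) is an assumption based on the usual MPI-style definitions, not something stated in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference model: one std::vector<int> per rank stands in for
// that rank's device buffer. Separate input and output vectors mirror the
// "out-of-place only" restriction mentioned in the commit message.
using Buffers = std::vector<std::vector<int>>;

// Gather (assumed layout): the root ends up with nranks * count elements,
// where rank r's contribution sits at offset r * count.
std::vector<int> modelGather(const Buffers& send, std::size_t count) {
  std::vector<int> rootRecv(send.size() * count);
  for (std::size_t r = 0; r < send.size(); ++r) {
    assert(send[r].size() >= count);
    for (std::size_t i = 0; i < count; ++i) {
      rootRecv[r * count + i] = send[r][i];
    }
  }
  return rootRecv;
}

// Scatter (assumed layout): the root holds nranks * count elements and
// rank r receives the slice [r * count, (r + 1) * count).
Buffers modelScatter(const std::vector<int>& rootSend, std::size_t nranks,
                     std::size_t count) {
  assert(rootSend.size() >= nranks * count);
  Buffers recv(nranks, std::vector<int>(count));
  for (std::size_t r = 0; r < nranks; ++r) {
    for (std::size_t i = 0; i < count; ++i) {
      recv[r][i] = rootSend[r * count + i];
    }
  }
  return recv;
}

// AllToAll (assumed layout): every rank sends chunk j of its buffer to rank j,
// so rank r's receive chunk j is rank j's send chunk r.
Buffers modelAllToAll(const Buffers& send, std::size_t count) {
  const std::size_t nranks = send.size();
  Buffers recv(nranks, std::vector<int>(nranks * count));
  for (std::size_t r = 0; r < nranks; ++r) {
    for (std::size_t j = 0; j < nranks; ++j) {
      assert(send[j].size() >= nranks * count);
      for (std::size_t i = 0; i < count; ++i) {
        recv[r][j * count + i] = send[j][r * count + i];
      }
    }
  }
  return recv;
}
```

With two ranks and count = 2, send buffers {0, 1, 2, 3} and {10, 11, 12, 13} would become receive buffers {0, 1, 10, 11} and {2, 3, 12, 13} under this model. The same per-peer chunk exchange is what a wrapper built on point-to-point send and receive calls would have to perform, although the actual fallback implementation is not shown in the commit message.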
Wenkai Du
71ec3e09df
topo_expl: update to 2.7
2020-06-09 17:40:24 -07:00
Wenkai Du
26a0fd2517
Merge remote-tracking branch 'nccl/master' into develop
2020-06-09 17:40:11 -07:00
Sylvain Jeaugey
5949d96f36
2.7.3-1
...
Add support for A100 GPU and related platforms.
Add support for CUDA 11.
Add support for send/receive operations (beta).
2020-06-08 09:31:44 -07:00
saadrahim
87db65f22d
Fixing CI as install.sh script should not install dependencies without user request (#217)
2020-06-05 11:04:03 -06:00
Stanley Tsang
dc403e0ca2
Making hip-clang the default compiler; documentation update (#216)
...
* Making hip-clang the default compiler; documentation update
* Adding back --hip-clang to install.sh as a silent option for CI
2020-06-04 11:58:27 -06:00
Wenkai Du
2a4514772c
Merge pull request #214 from wenkaidu/gdr
...
Use cached value for detecting GDR support only once
2020-05-22 13:36:23 -07:00
Wenkai Du
67c8e72ce3
Use cached value for detecting GDR support only once
2020-05-22 17:19:10 +00:00
Wenkai Du
957be85944
Merge pull request #212 from wenkaidu/version
...
Report HIP version in logs
2020-05-20 16:25:54 -07:00
Wenkai Du
e41ab173cf
Report HIP version in logs
2020-05-20 18:15:32 +00:00
Wenkai Du
af703877cf
Merge pull request #210 from wenkaidu/unroll
...
Revert "Tuning the inline and unroll to reduce the scratch usage"
2020-05-15 15:35:27 -07:00
Wenkai Du
ca493a6b51
Revert "Tuning the inline and unroll to reduce the scratch usage"
...
This reverts commit eec319038e.
2020-05-15 14:15:40 -07:00
Wenkai Du
c245f1507e
Merge pull request #209 from wenkaidu/hip-clang
...
Rename files which differ only in extension
2020-05-15 13:51:12 -07:00
Wenkai Du
706de76046
Merge pull request #208 from wenkaidu/perf_xgmi
...
Give preference to path with more XGMI connections
2020-05-15 10:07:22 -07:00
Wenkai Du
e7b36304c8
Rename files which differ only in extension
2020-05-15 09:16:32 -07:00
Wenkai Du
ca4987e5fb
Merge pull request #207 from wenkaidu/hip-clang
...
rccl-prim-test: add flags when calling hipExtLaunchMultiKernelMultiDe…
2020-05-14 18:31:56 -07:00
Wenkai Du
b3c9852634
Give preference to path with more XGMI connections
2020-05-14 15:33:16 -07:00
Wenkai Du
f1058b6353
rccl-prim-test: add flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang
2020-05-12 23:54:07 +00:00
Stanley Tsang
66a9f11910
Merge pull request #206 from stanleytsang-amd/develop
...
Updating RCCL documentation
2020-05-12 17:24:40 -06:00
Stanley Tsang
787ac13486
Restoring doxygen documentation to nccl.h.in.
2020-05-12 22:03:31 +00:00
Stanley Tsang
b59b9d328b
Updating README and readthedocs documentation.
2020-05-12 20:11:49 +00:00
Wenkai Du
52752aba6e
Merge pull request #205 from wenkaidu/bf16
...
Update rccl_bfloat16.h to match rocBLAS
2020-05-11 09:55:06 -07:00
Wenkai Du
d5a07a7b5c
Update rccl_bfloat16.h to match rocBLAS
2020-05-08 22:48:07 +00:00
Wenkai Du
94d16c0f0a
Merge pull request #204 from wenkaidu/launch_flags
...
Set flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang
2020-05-08 11:20:50 -07:00
Wenkai Du
24ea2ef6dd
Set flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang
2020-05-08 15:57:14 +00:00
Saad Rahim
33c23fdcda
Merge remote-tracking branch 'upstream/master' into develop
2020-04-29 16:12:37 -07:00
saadrahim
308e96877e
Refactoring packaging (#193)
2020-04-29 16:24:21 -06:00
Wenkai Du
914b6ca27c
Merge pull request #199 from wenkaidu/para_jobs
...
Enable parallel jobs for hip-clang build
2020-04-29 13:49:48 -07:00
saadrahim
65390f9872
Junit test storage call corrected (#197)
...
* Focus testing on Centos for now
* storing junit
* Reducing test suite to Ubuntu
2020-04-29 14:22:03 -06:00
Wenkai Du
3f471ab5b1
Enable parallel jobs for hip-clang build
2020-04-29 17:58:16 +00:00
saadrahim
6b1d70b03b
Adding NCCL_DEBUG=INFO Logging to CI (#196)
2020-04-27 15:12:15 -06:00
Wenkai Du
f7c27c6c9f
Merge pull request #195 from wenkaidu/sync_nccl
...
Sync up with NCCL
2020-04-27 11:45:05 -07:00
Wenkai Du
5743c6b7d2
topo_expl: fix build error
2020-04-27 17:17:05 +00:00
Wenkai Du
c4edc257b0
Merge remote-tracking branch 'nccl/master' into HEAD
2020-04-27 17:16:54 +00:00
Wenkai Du
cf5070f6c0
Merge pull request #194 from wenkaidu/search
...
Fix incorrect next device ID in PCI ordered search
2020-04-27 09:54:09 -07:00
Wenkai Du
edb49ed2d5
Fix incorrect next device ID in PCI ordered search
2020-04-25 01:01:13 +00:00
saadrahim
cc66dd46e9
Enabling CI Testing Again (#192)
...
Adding CI support based on AMD internal CI refactor.
2020-04-24 10:36:57 -06:00
Gilbert Lee
339bf9ff19
Adding option to re-use streams instead of re-creating per topology
2020-04-23 15:53:40 +00:00
Sylvain Jeaugey
f36540f55a
Fix crash when only a subset of GPUs are visible within a container.
...
Fixes #326.
2020-04-17 10:03:14 -07:00
Sylvain Jeaugey
23a9fbb788
Improve robustness of PCI detection
...
Fallback to default values when class/speed is unknown.
2020-04-16 14:27:50 -07:00
Wenkai Du
c017f6e900
Merge pull request #191 from wenkaidu/gfx803
...
Revert "Temporary disable 0x803 target due to build error"
2020-04-16 09:27:24 -07:00