Commit Graph

416 Commits

Author SHA1 Message Date
Wenkai Du cfa97eccd3 Add IB/RDMA unit test 2020-06-16 18:29:17 +00:00
Wenkai Du 95b8f70d15 Limit network profiling support to simple protocol and avoid overflow 2020-06-15 20:51:36 +00:00
Wenkai Du 7484e53ff7 Rework network proxy profiling 2020-06-13 03:13:58 +00:00
Wenkai Du b257676f30 Reduce RCCL kernel count as we don't pass first coll in argument 2020-06-12 21:30:04 +00:00
Wenkai Du a6d621176c Sender rank's opCount maybe ahead by one if it finishes earlier 2020-06-12 03:39:45 +00:00
Wenkai Du fee1a20b74 gtest: add scatter, gather and all to all unit tests 2020-06-09 17:44:15 -07:00
Wenkai Du e98891d039 Log NUMA node of RDMA host buffer allocation 2020-06-09 17:44:15 -07:00
Wenkai Du 812543104d Add network proxy profiling support 2020-06-09 17:44:15 -07:00
Wenkai Du c9aa11928a Calculate and use total wait cycles for RCCL profiling 2020-06-09 17:44:15 -07:00
Wenkai Du e80e29573c Add gather, scatter and alltoall collectives
Introducing 3 new APIs:
ncclResult_t  ncclGather(const void* sendbuff, void* recvbuff, size_t sendcount,
    ncclDataType_t datatype, int root, ncclComm_t comm, hipStream_t stream);
ncclResult_t  ncclScatter(const void* sendbuff, void* recvbuff,
    size_t recvcount, ncclDataType_t datatype, int root, ncclComm_t comm,
    hipStream_t stream);
ncclResult_t  ncclAllToAll(const void* sendbuff, void* recvbuff, size_t count,
    ncclDataType_t datatype, ncclComm_t comm, hipStream_t stream);

Only out-of-place operation is supported.
Preprocessor symbol RCCL_GATHER_SCATTER=1 indicates API availability.
By default the APIs launch the RCCL kernel implementation, which can be disabled by
RCCL_ALLTOALL_KERNEL_DISABLE=1. Then the APIs use a wrapper around ncclSend and ncclRecv.
2020-06-09 17:44:08 -07:00
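
The data movement behind ncclAllToAll in the commit above can be illustrated with a host-only sketch. It is not RCCL code and calls no RCCL or HIP API; it assumes the usual all-to-all layout, in which each rank's send buffer holds one block of `count` elements per peer and block j from rank i lands as block i on rank j. The name `simulateAllToAll` and the use of `int` elements are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only model of an out-of-place all-to-all exchange (illustrative, not RCCL).
// send[r] is rank r's send buffer: nranks blocks of `count` elements each.
// The result recv[r] is rank r's receive buffer with the same shape.
std::vector<std::vector<int>> simulateAllToAll(
    const std::vector<std::vector<int>>& send, std::size_t count) {
  const std::size_t nranks = send.size();
  std::vector<std::vector<int>> recv(nranks, std::vector<int>(nranks * count));
  for (std::size_t src = 0; src < nranks; ++src) {
    // Every rank must provide exactly one block per peer.
    assert(send[src].size() == nranks * count);
    for (std::size_t dst = 0; dst < nranks; ++dst) {
      for (std::size_t i = 0; i < count; ++i) {
        // Block `dst` of rank `src` becomes block `src` of rank `dst`.
        recv[dst][src * count + i] = send[src][dst * count + i];
      }
    }
  }
  return recv;
}
```

With two ranks and `count = 2`, send buffers `{0, 1, 2, 3}` and `{10, 11, 12, 13}` yield receive buffers `{0, 1, 10, 11}` and `{2, 3, 12, 13}`. Because the result is written to separate buffers, the sketch mirrors the out-of-place restriction stated in the commit message.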
Wenkai Du 71ec3e09df tpol_expl: update to 2.7 2020-06-09 17:40:24 -07:00
Wenkai Du 26a0fd2517 Merge remote-tracking branch 'nccl/master' into develop 2020-06-09 17:40:11 -07:00
Sylvain Jeaugey 5949d96f36 2.7.3-1
Add support for A100 GPU and related platforms.
Add support for CUDA 11.
Add support for send/receive operations (beta).
2020-06-08 09:31:44 -07:00
saadrahim 87db65f22d Fixing CI as install.sh script should not install dependencies without user request (#217) 2020-06-05 11:04:03 -06:00
Stanley Tsang dc403e0ca2 Making hip-clang the default compiler; documentation update (#216)
* Making hip-clang the default compiler; documentation update

* Adding back --hip-clang to install.sh as a silent option for CI
2020-06-04 11:58:27 -06:00
Wenkai Du 2a4514772c Merge pull request #214 from wenkaidu/gdr
Use cached value for detecting GDR support only once
2020-05-22 13:36:23 -07:00
Wenkai Du 67c8e72ce3 Use cached value for detecting GDR support only once 2020-05-22 17:19:10 +00:00
Wenkai Du 957be85944 Merge pull request #212 from wenkaidu/version
Report HIP version in logs
2020-05-20 16:25:54 -07:00
Wenkai Du e41ab173cf Report HIP version in logs 2020-05-20 18:15:32 +00:00
Wenkai Du af703877cf Merge pull request #210 from wenkaidu/unroll
Revert "Tuning the inline and unroll to reduce the scratch usage"
2020-05-15 15:35:27 -07:00
Wenkai Du ca493a6b51 Revert "Tuning the inline and unroll to reduce the scratch usage"
This reverts commit eec319038e.
2020-05-15 14:15:40 -07:00
Wenkai Du c245f1507e Merge pull request #209 from wenkaidu/hip-clang
Rename files which only diffs in extension
2020-05-15 13:51:12 -07:00
Wenkai Du 706de76046 Merge pull request #208 from wenkaidu/perf_xgmi
Give preference to path with more XGMI connections
2020-05-15 10:07:22 -07:00
Wenkai Du e7b36304c8 Rename files which only diffs in extension 2020-05-15 09:16:32 -07:00
Wenkai Du ca4987e5fb Merge pull request #207 from wenkaidu/hip-clang
rccl-prim-test: add flags when calling hipExtLaunchMultiKernelMultiDe…
2020-05-14 18:31:56 -07:00
Wenkai Du b3c9852634 Give preference to path with more XGMI connections 2020-05-14 15:33:16 -07:00
Wenkai Du f1058b6353 rccl-prim-test: add flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang 2020-05-12 23:54:07 +00:00
Stanley Tsang 66a9f11910 Merge pull request #206 from stanleytsang-amd/develop
Updating RCCL documentation
2020-05-12 17:24:40 -06:00
Stanley Tsang 787ac13486 Restoring doxygen documentation to nccl.h.in. 2020-05-12 22:03:31 +00:00
Stanley Tsang b59b9d328b Updating README and readthedocs documentation. 2020-05-12 20:11:49 +00:00
Wenkai Du 52752aba6e Merge pull request #205 from wenkaidu/bf16
Update rccl_bfloat16.h to match rocBLAS
2020-05-11 09:55:06 -07:00
Wenkai Du d5a07a7b5c Update rccl_bfloat16.h to match rocBLAS 2020-05-08 22:48:07 +00:00
Wenkai Du 94d16c0f0a Merge pull request #204 from wenkaidu/launch_flags
Set flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang
2020-05-08 11:20:50 -07:00
Wenkai Du 24ea2ef6dd Set flags when calling hipExtLaunchMultiKernelMultiDevice in hip-clang 2020-05-08 15:57:14 +00:00
Saad Rahim 33c23fdcda Merge remote-tracking branch 'upstream/master' into develop 2020-04-29 16:12:37 -07:00
saadrahim 308e96877e Refactoring packaging (#193) 2020-04-29 16:24:21 -06:00
Wenkai Du 914b6ca27c Merge pull request #199 from wenkaidu/para_jobs
Enable parallel jobs for hip-clang build
2020-04-29 13:49:48 -07:00
saadrahim 65390f9872 Junit test storage call corrected (#197)
* Focus testing on Centos for now

* storing junit

* Reducing test suite to Ubuntu
2020-04-29 14:22:03 -06:00
Wenkai Du 3f471ab5b1 Enable parallel jobs for hip-clang build 2020-04-29 17:58:16 +00:00
saadrahim 6b1d70b03b Adding NCCL_DEBUG=INFO Logging to CI (#196) 2020-04-27 15:12:15 -06:00
Wenkai Du f7c27c6c9f Merge pull request #195 from wenkaidu/sync_nccl
Sync up with NCCL
2020-04-27 11:45:05 -07:00
Wenkai Du 5743c6b7d2 topo_expl: fix build error 2020-04-27 17:17:05 +00:00
Wenkai Du c4edc257b0 Merge remote-tracking branch 'nccl/master' into HEAD 2020-04-27 17:16:54 +00:00
Wenkai Du cf5070f6c0 Merge pull request #194 from wenkaidu/search
Fix incorrect next device ID in PCI ordered search
2020-04-27 09:54:09 -07:00
Wenkai Du edb49ed2d5 Fix incorrect next device ID in PCI ordered search 2020-04-25 01:01:13 +00:00
saadrahim cc66dd46e9 Enabling CI Testing Again (#192)
Adding CI support based on AMD internal CI refactor.
2020-04-24 10:36:57 -06:00
Gilbert Lee 339bf9ff19 Adding option to re-use streams instead of re-creating per topology 2020-04-23 15:53:40 +00:00
Sylvain Jeaugey f36540f55a Fix crash when only a subset of GPUs are visible within a container.
Fixes #326.
2020-04-17 10:03:14 -07:00
Sylvain Jeaugey 23a9fbb788 Improve robustness of PCI detection
Fallback to default values when class/speed is unknown.
2020-04-16 14:27:50 -07:00
Wenkai Du c017f6e900 Merge pull request #191 from wenkaidu/gfx803
Revert "Temporary disable 0x803 target due to build error"
2020-04-16 09:27:24 -07:00