Commit Graph

33 Commits

Author SHA1 Message Date
Stanley Tsang c5d4d9eb76 Adding static library building option. (#244)
* Adding static library building option.

* Disabling running tests for static build

* Removing static packaging in CI

Co-authored-by: Saad Rahim <saad.rahim@amd.com>
2020-08-06 11:19:43 -06:00
saadrahim 0dc019e35f Download GTest if not found in system (#237)
Co-authored-by: Stanley Tsang <stanley.tsang@amd.com>
2020-08-06 09:36:58 -06:00
Wenkai Du 35c5a7fe45 Fix RCCL build package name (#236) 2020-07-20 14:43:00 -07:00
saadrahim 7f93aa7e53 Changing dependency to hip-rocclr (#228) 2020-07-14 17:49:56 -06:00
Wenkai Du 1addf4f196 Match RCCL package name to API version (#229) 2020-07-07 13:30:39 -07:00
Wenkai Du 84f8ba3bb0 Revert use posix_memalign for network buffer allocation on host memory (#222) 2020-06-24 11:25:55 -07:00
Wenkai Du 0eb19a563a Use posix_memalign for network buffer allocation on host memory (#221)
* Use posix_memalign for network buffer allocation on host memory

* ib-test: add ability to specify run iterations

* ib-test: define iterations as multiple of default cycles

* Add checking to posix_memalign return value
2020-06-22 13:06:25 -07:00
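The last bullet of this commit mentions checking the return value of posix_memalign. The following is a minimal sketch of that pattern, not the code from the commit: the helper name `alloc_aligned_host_buffer` and the choice to zero the buffer are assumptions made for illustration. The point it shows is that posix_memalign reports failure through its return value (an error number), not through errno or a NULL result alone.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from RCCL): allocate a zeroed host buffer
 * whose address is a multiple of `alignment` bytes.
 * Returns 0 on success, or the error number returned by posix_memalign
 * (EINVAL for an invalid alignment, ENOMEM when memory is exhausted). */
static int alloc_aligned_host_buffer(void **out, size_t alignment, size_t size)
{
    void *ptr = NULL;

    assert(out != NULL);

    /* posix_memalign returns the error code directly; it does not set errno. */
    int rc = posix_memalign(&ptr, alignment, size);
    if (rc != 0) {
        *out = NULL;
        return rc;
    }

    memset(ptr, 0, size);
    *out = ptr;
    return 0;
}
```

A caller would typically pass the page size or another power-of-two alignment that is a multiple of `sizeof(void *)`, and release the buffer with `free`.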
Wenkai Du e98891d039 Log NUMA node of RDMA host buffer allocation 2020-06-09 17:44:15 -07:00
Wenkai Du e80e29573c Add gather, scatter and alltoall collectives
Introducing 3 new APIs:
ncclResult_t  ncclGather(const void* sendbuff, void* recvbuff, size_t sendcount,
    ncclDataType_t datatype, int root, ncclComm_t comm, hipStream_t stream);
ncclResult_t  ncclScatter(const void* sendbuff, void* recvbuff,
    size_t recvcount, ncclDataType_t datatype, int root, ncclComm_t comm,
    hipStream_t stream);
ncclResult_t  ncclAllToAll(const void* sendbuff, void* recvbuff, size_t count,
    ncclDataType_t datatype, ncclComm_t comm, hipStream_t stream);

Only out-of-place operation is supported.
Preprocessor symbol RCCL_GATHER_SCATTER=1 indicates API availability.
By default the APIs launch the RCCL kernel implementation, which can be disabled by setting
RCCL_ALLTOALL_KERNEL_DISABLE=1. In that case the APIs use a wrapper around ncclSend and ncclRecv.
2020-06-09 17:44:08 -07:00
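The commit message above does not spell out the buffer layout of the three collectives. The host-only sketch below models the data movement under the assumption that they follow the conventional MPI-style layout: rank r's chunk sits at offset r * count elements in the gathered or scattered buffer, and all-to-all exchanges chunk d of rank s with chunk s of rank d. The `model_*` functions are hypothetical illustrations that copy between plain host arrays. They are not RCCL APIs and do not call into RCCL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Host-only model of the assumed semantics. sendbufs[r] and recvbufs[r]
 * stand in for rank r's buffers; all operations are out of place. */

/* Gather: the root receives `count` elements from every rank, ordered by rank. */
static void model_gather(int nranks, size_t count, size_t elem_size, int root,
                         const void *const *sendbufs, void *const *recvbufs)
{
    size_t chunk = count * elem_size;
    assert(root >= 0 && root < nranks);
    for (int r = 0; r < nranks; ++r)
        memcpy((char *)recvbufs[root] + (size_t)r * chunk, sendbufs[r], chunk);
}

/* Scatter: rank r receives the r-th chunk of the root's send buffer. */
static void model_scatter(int nranks, size_t count, size_t elem_size, int root,
                          const void *const *sendbufs, void *const *recvbufs)
{
    size_t chunk = count * elem_size;
    assert(root >= 0 && root < nranks);
    for (int r = 0; r < nranks; ++r)
        memcpy(recvbufs[r], (const char *)sendbufs[root] + (size_t)r * chunk, chunk);
}

/* All-to-all: chunk `dst` of rank `src` lands in chunk `src` of rank `dst`.
 * This is the same pairing a send/receive based wrapper would perform. */
static void model_alltoall(int nranks, size_t count, size_t elem_size,
                           const void *const *sendbufs, void *const *recvbufs)
{
    size_t chunk = count * elem_size;
    for (int src = 0; src < nranks; ++src)
        for (int dst = 0; dst < nranks; ++dst)
            memcpy((char *)recvbufs[dst] + (size_t)src * chunk,
                   (const char *)sendbufs[src] + (size_t)dst * chunk, chunk);
}
```

Under this assumed layout, the root's receive buffer for gather, the root's send buffer for scatter, and both buffers for all-to-all hold nranks * count elements.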
Wenkai Du 26a0fd2517 Merge remote-tracking branch 'nccl/master' into develop 2020-06-09 17:40:11 -07:00
Wenkai Du e7b36304c8 Rename files which differ only in extension 2020-05-15 09:16:32 -07:00
Wenkai Du 3f471ab5b1 Enable parallel jobs for hip-clang build 2020-04-29 17:58:16 +00:00
Wenkai Du 5170bd1c02 Revert "Temporary disable 0x803 target due to build error"
This reverts commit cd7ab1425b.
2020-04-14 16:58:41 +00:00
Wenkai Du fa36fd9ef9 Merge remote-tracking branch 'nccl/master' into v2.6.4_merge 2020-04-01 13:35:12 -07:00
amdkila b9fb0cd808 set hip::host and hip::device and remove some deprecated targets (#184) 2020-03-05 13:36:55 -07:00
Wenkai Du 3d092f32b8 Bump up HCC version for -hc-function-calls switch 2020-02-11 19:37:13 +00:00
Stanley Tsang 20fa04d9b6 Updating copyright notices for 2020. 2020-01-29 15:28:08 -08:00
Wenkai Du fe6d012eb0 Merge remote-tracking branch 'remotes/rccl/master' into rccl_2.5.6_cleanup 2020-01-29 15:28:03 -08:00
Wenkai Du 1e55645d97 Misc fixes and improvements for 2.5.6
1. Fix RCCL unit test
2. Add ROME detection and tuning
3. Change default P2P level
4. Fix search algorithm for XGMI
5. Remove explicit channel duplication with implicit by using half of link speed
6. Add collective trace support
7. Correct Intel Skylake CPU detection and bandwidth
8. Fix topo connect function
9. Disable GDR read and remove unreachable code
10. Disable LL128 kernels
11. Add tuning parameters
12. Use original clock64() implementation which returns RTC counter value
13. Print out timestamp of collective trace
14. Do not use struct ncclColl in kernel launch parameter
15. Fix abort handling and add tracing
17. Add __launch_bounds__ to kernel functions
18. Remove unused abortCount
19. Unset default MIN_NRINGS and MIN_NCHANNELS
20. Do not allocate shared memory when not using LL128 kernels
21. Correct time print out in tuning log
2020-01-29 15:27:05 -08:00
paulfreddy 15c917244d Changes for multiple ROCm installation (#164)
* Changes for multiple ROCm installation

   1. Set version to 2.10.1
   2. Add CMAKE_INSTALL_PREFIX to necessary places
   3. Cleanup, fix rpath, use prefix in install.sh

* Changes for multiple ROCm installation

   1. Set soversion to match release version
   2. Add CMAKE_INSTALL_PREFIX to necessary places
   3. Cleanup, fix rpath, use prefix in install.sh

* Changes for multiple ROCm installation

1. Set soversion to match release version
2. Add CMAKE_INSTALL_PREFIX to necessary places
3. Cleanup, fix rpath, use prefix in install.sh
2020-01-08 21:28:16 -08:00
saadrahim 0092b35132 Package fix (#161)
* Fixing RHEL dependency on rocm-dev
2019-12-06 16:06:50 -07:00
saadrahim bd59b6f880 Changing package dependency to rocm-dev (#160) 2019-12-06 14:00:25 -07:00
Wenkai Du 6648c81dc6 Merge remote-tracking branch 'remotes/nccl/master' into rccl_2.5.6 2019-12-03 15:42:04 -08:00
Wenkai Du 5e109ed400 Add bfloat16 support in RCCL
Preprocessor symbol RCCL_BFLOAT16 is used as feature indicator
2019-11-18 13:45:53 -08:00
Wenkai Du cd7ab1425b Temporary disable 0x803 target due to build error 2019-11-14 11:17:41 -08:00
Siu Chi Chan 08ba92f1b0 Bump up HCC version for -hc-function-calls switch 2019-11-12 14:16:35 -05:00
Wenkai Du 9be7ae8f0d Merge pull request #140 from scchan/rocm210_hc_function_calls
add -hc-function-calls switch back for HCC ROCm 2.10
2019-10-28 09:56:47 -07:00
Michael LIAO ec10a5cf14 [cmake] Allow GPU targets to be parameterized with AMDGPU_TARGETS. 2019-10-25 13:55:27 -04:00
Siu Chi Chan d779eae1d0 add -hc-function-calls switch back for HCC ROCm 2.10 2019-10-21 18:00:02 -04:00
Siu Chi Chan b87ef4f152 detect the hcc version and conditionally add the -hc-function-calls switch 2019-10-03 13:25:25 -04:00
Gilbert Lee 6232985e34 Re-adding gfx908 target 2019-09-13 16:57:34 +00:00
saadrahim 544d4fb704 Updating versioning to follow rocm-cmake standard (#126) 2019-08-23 16:33:38 -06:00
Wenkai Du f11c8f60cd RCCL 2.4 update 2019-08-14 10:42:35 -07:00