Bertan Dogancay
fcb0b2da3f
[Replayer] Add validation ( #1387 )
...
* Add validation to rccl_replayer
[ROCm/rccl commit: cfecce790f ]
2024-10-22 10:41:08 -04:00
dependabot[bot]
64aead445c
Bump rocm-docs-core from 1.8.2 to 1.8.3 in /docs/sphinx ( #1385 )
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core ) from 1.8.2 to 1.8.3.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases )
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md )
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.2...v1.8.3 )
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[ROCm/rccl commit: 4685d3c546 ]
2024-10-21 10:05:58 -06:00
Bertan Dogancay
57710c1183
Dynamically select unroll factor to build for when targeting local arch ( #1371 )
...
* Dynamically select unroll factor to build for when targeting local arch only
[ROCm/rccl commit: 373f113524 ]
2024-10-21 10:53:11 -04:00
Wenkai Du
5ee84e0353
Increase CQ size to 3*MAX_REQUESTS ( #1374 )
...
* Increase CQ size to 3*MAX_REQUESTS
Suggested by Rukhsana Ansari <rukhsana.ansari@broadcom.com >
* Reword comments based on feedback from Rukhsana
[ROCm/rccl commit: 7c077db307 ]
2024-10-18 11:01:03 -07:00
akolliasAMD
ad2c8c3eb8
added atomic acquire for gfx12 on prims_simple ( #1382 )
...
[ROCm/rccl commit: af5678641d ]
2024-10-18 11:26:38 -06:00
Jeffrey Novotny
a5cc8edd9b
Add missing metadata information ( #1381 )
...
[ROCm/rccl commit: 4822fd47ca ]
2024-10-16 13:26:12 -04:00
Sean Karlage
3eda60a031
static: Enable true rccl static library build ( #1379 )
...
* static: Enable true rccl static library build
RCCL compiles with `-fgpu-rdc`, which requires a specialized link command in order to produce a true static library.
When "linking" with `amdclang++`, you need to use `--emit-static-lib` and `--hip-link` to get a static library with all GPU code generated. Subsequent links with binaries do not need any special flags to generate GPU code.
Building a static library:
```
$ cmake -DROCM_PATH=$ROCM_PATH -DCMAKE_PREFIX_PATH=$ROCM_PATH -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=off -DCMAKE_POSITION_INDEPENDENT_CODE=on -DAMDGPU_TARGETS=gfx942 -DCMAKE_CXX_COMPILER=$ROCM_PATH/lib/llvm/bin/amdclang++ -DCMAKE_C_COMPILER=$ROCM_PATH/lib/llvm/bin/amdclang .. 2>&1 | tee -a /tmp/build.txt
-- Could NOT find GTest (missing: GTEST_LIBRARY GTEST_INCLUDE_DIR GTEST_MAIN_LIBRARY) (Required is at least version "1.11")
-- Checking for ROCm support for GPU targets: gfx942
-- Compiling for gfx942
-- Could NOT find GTest (missing: GTEST_LIBRARY GTEST_INCLUDE_DIR GTEST_MAIN_LIBRARY) (Required is at least version "1.11")
-- ROCM_PATH found: /opt/rocm
-- Compiling with amdclang++
-- HIP compiler: clang
-- HIP runtime: rocclr
-- amdclang++ executable: /opt/rocm/llvm/bin/amdclang++
-- amdclang++ version: 18.0.0git
-- hipconfig executable: /opt/rocm/bin/hipconfig
-- amdclang++ HIP version: 6.2.41133
-- ROCm version: 6.2.0
...
$ make -j 32
[ 0%] Updating git_version.cpp if necessary
-- Updating git_version.cpp
[ 0%] Built target git_version_check
[ 0%] Hipifying src/transport/shm.cc -> /home/skarlage/local/rccl/build/hipify/src/transport/shm.cc
[ 0%] Hipifying src/bootstrap.cc -> /home/skarlage/local/rccl/build/hipify/src/bootstrap.cc
[ 0%] Hipifying src/channel.cc -> /home/skarlage/local/rccl/build/hipify/src/channel.cc
[ 1%] Hipifying src/device/all_reduce.h -> /home/skarlage/local/rccl/build/hipify/src/device/all_reduce.h
[ 1%] Hipifying src/device/broadcast.h -> /home/skarlage/local/rccl/build/hipify/src/device/broadcast.h
[ 1%] Hipifying src/device/all_gather.h -> /home/skarlage/local/rccl/build/hipify/src/device/all_gather.h
[ 1%] Hipifying src/device/common.cu -> /home/skarlage/local/rccl/build/hipify/src/device/common.cu.cpp
[ 1%] Hipifying src/debug.cc -> /home/skarlage/local/rccl/build/hipify/src/debug.cc
[ 1%] Hipifying src/device/alltoall_pivot.h -> /home/skarlage/local/rccl/build/hipify/src/device/alltoall_pivot.h
[ 1%] Hipifying src/device/network/unpack/unpack.h -> /home/skarlage/local/rccl/build/hipify/src/device/network/unpack/unpack.h
[ 4%] Hipifying src/collectives.cc -> /home/skarlage/local/rccl/build/hipify/src/collectives.cc
[ 4%] Hipifying src/device/msccl_kernel_impl.h -> /home/skarlage/local/rccl/build/hipify/src/device/msccl_kernel_impl.h
[ 4%] Hipifying src/device/network/unpack/unpack_defs.h -> /home/skarlage/local/rccl/build/hipify/src/device/network/unpack/unpack_defs.h
[ 4%] Hipifying src/device/op128.h -> /home/skarlage/local/rccl/build/hipify/src/device/op128.h
[ 4%] Hipifying src/device/onerank.cu -> /home/skarlage/local/rccl/build/hipify/src/device/onerank.cu.cpp
[ 4%] Hipifying src/device/common.h -> /home/skarlage/local/rccl/build/hipify/src/device/common.h
[ 6%] Hipifying src/device/prims_ll.h -> /home/skarlage/local/rccl/build/hipify/src/device/prims_ll.h
[ 6%] Hipifying src/device/primitives.h -> /home/skarlage/local/rccl/build/hipify/src/device/primitives.h
[ 6%] Hipifying src/device/prims_ll128.h -> /home/skarlage/local/rccl/build/hipify/src/device/prims_ll128.h
[ 6%] Hipifying src/device/reduce.h -> /home/skarlage/local/rccl/build/hipify/src/device/reduce.h
[ 7%] Hipifying src/device/common_kernel.h -> /home/skarlage/local/rccl/build/hipify/src/device/common_kernel.h
[ 7%] Hipifying src/device/reduce_scatter.h -> /home/skarlage/local/rccl/build/hipify/src/device/reduce_scatter.h
[ 7%] Hipifying src/device/sendrecv.h -> /home/skarlage/local/rccl/build/hipify/src/device/sendrecv.h
[ 7%] Hipifying src/device/prims_simple.h -> /home/skarlage/local/rccl/build/hipify/src/device/prims_simple.h
[ 7%] Hipifying src/enqueue.cc -> /home/skarlage/local/rccl/build/hipify/src/enqueue.cc
[ 7%] Hipifying src/device/reduce_kernel.h -> /home/skarlage/local/rccl/build/hipify/src/device/reduce_kernel.h
[ 7%] Hipifying src/graph/connect.cc -> /home/skarlage/local/rccl/build/hipify/src/graph/connect.cc
[ 7%] Hipifying src/graph/rings.h -> /home/skarlage/local/rccl/build/hipify/src/graph/rings.h
[ 8%] Hipifying src/graph/rings.cc -> /home/skarlage/local/rccl/build/hipify/src/graph/rings.cc
[ 8%] Hipifying src/graph/rome_models.cc -> /home/skarlage/local/rccl/build/hipify/src/graph/rome_models.cc
[ 8%] Hipifying src/graph/rome_models.h -> /home/skarlage/local/rccl/build/hipify/src/graph/rome_models.h
[ 8%] Hipifying src/graph/paths.cc -> /home/skarlage/local/rccl/build/hipify/src/graph/paths.cc
[ 9%] Hipifying src/graph/search.cc -> /home/skarlage/local/rccl/build/hipify/src/graph/search.cc
[ 9%] Hipifying src/graph/topo.cc -> /home/skarlage/local/rccl/build/hipify/src/graph/topo.cc
...
[100%] Linking CXX static library librccl.a
Elapsed time: 270 s. (time), 0.00046 s. (clock)
Elapsed time: 0 s. (time), 0.000342 s. (clock)
[100%] Built target rccl
```
Static rccl exists:
```
$ file librccl.a
librccl.a: current ar archive
```
* Fix up tests Cmake for static builds
We also need to fix up the tests CMakeLists.txt to:
* Remove the unused `BUILD_STATIC` option
* Use `SHARED_LIBS` as a definition of whether we're building static or not.
[ROCm/rccl commit: bdf9544c81 ]
2024-10-16 06:58:50 -07:00
Wenkai Du
bd0cdf5a50
Add back missing net flush ( #1376 )
...
[ROCm/rccl commit: c8d3543d3f ]
2024-10-15 08:12:26 -07:00
Wenkai Du
5f8571dcbc
msccl: disable 1-shot xmls ( #1375 )
...
MSCCL 1-shot xmls may cause different output values on different ranks.
Disabling them for now to avoid undefined behavior in applications.
[ROCm/rccl commit: 62d10fdc25 ]
2024-10-14 15:10:53 -07:00
Wenkai Du
9ad1fe571b
Temporarily disable MSCCL all gather XMLs due to UT failure ( #1373 )
...
[ROCm/rccl commit: a680e329e6 ]
2024-10-12 08:43:16 -07:00
Wenkai Du
09acdb6b49
Allow zero byte sendrecv in alltoallv ( #1349 )
...
* Allow zero byte sendrecv in alltoallv
* Fix previous merge error
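As a rough illustration of what "zero byte sendrecv in alltoallv" means, the sketch below models alltoallv as one send/recv pair per (source, destination) rank pair and lets a pair carry zero elements. It is a pure-Python model under stated assumptions, not RCCL's implementation; the function name `alltoallv_model` and its argument layout are hypothetical.

```python
def alltoallv_model(send_bufs, send_counts, send_displs, recv_counts, recv_displs):
    """Hypothetical model: one send/recv pair per (src, dst); counts may be zero."""
    nranks = len(send_bufs)
    # Size each receive buffer to cover the furthest element any peer writes.
    recv_bufs = []
    for r in range(nranks):
        size = max((recv_displs[r][p] + recv_counts[r][p] for p in range(nranks)), default=0)
        recv_bufs.append([None] * size)

    for src in range(nranks):
        for dst in range(nranks):
            count = send_counts[src][dst]
            if count != recv_counts[dst][src]:
                raise ValueError(f"count mismatch between rank {src} and rank {dst}")
            s = send_displs[src][dst]
            d = recv_displs[dst][src]
            # A zero count is legal: both slices are empty and nothing moves.
            recv_bufs[dst][d:d + count] = send_bufs[src][s:s + count]
    return recv_bufs


# Rank 0 sends two elements to rank 1 and nothing to itself;
# rank 1 sends nothing to rank 0 and one element to itself.
result = alltoallv_model(
    send_bufs=[[10, 11], [20]],
    send_counts=[[0, 2], [0, 1]],
    send_displs=[[0, 0], [0, 0]],
    recv_counts=[[0, 0], [2, 1]],
    recv_displs=[[0, 0], [0, 2]],
)
```

In this model, rank 0 takes part in the exchange even though it receives nothing, which is the case the commit title describes as allowed.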
[ROCm/rccl commit: 821d2e1f30 ]
2024-10-11 10:40:32 -07:00
Wenkai Du
4cd1b3a9f5
Improve model matching for GPUs with alltoall XGMI connection ( #1372 )
...
[ROCm/rccl commit: 5c367a21d0 ]
2024-10-11 09:53:14 -07:00
Arm Patinyasakdikul
ef54dd7cbc
Increase default number of channels for MI300A in multi-node scenario. ( #1366 )
...
This commit raises the default number of channels for MI300A from 8 up to 24.
This brings multi-node performance up to the expected level.
[ROCm/rccl commit: 133ea201cf ]
2024-10-11 11:37:48 -05:00
Wenkai Du
1b988c1b31
Fix crash when PXN is enabled on some platforms ( #1369 )
...
[ROCm/rccl commit: b55b6be0cb ]
2024-10-11 09:02:59 -07:00
Nusrat Islam
5545392913
ext-src: Fix compiler warnings for MSCCLPP integration ( #1368 )
...
[ROCm/rccl commit: 6160603d4c ]
2024-10-10 08:20:02 -05:00
Nilesh M Negi
912e9f4b61
[BUILD] Simplify CMake args for building MSCCLPP ( #1363 )
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
[ROCm/rccl commit: 364a6c2130 ]
2024-10-09 23:52:04 -05:00
Nilesh M Negi
04d9a98c8e
[BUILD] Require use of Python3 interpreter ( #1367 )
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
[ROCm/rccl commit: 41a2c02773 ]
2024-10-09 22:36:50 -05:00
Nusrat Islam
f61053dcba
Add a custom allreduce algorithm in MSCCLPP for cpx mode ( #1362 )
...
* cmake: remove mscclpp patch after build is complete
To enable mscclpp in CPX mode, the patch cpx.patch must be applied.
The patch can be removed once the build is done, which helps the
next build run cleanly.
* Use read-based mscclpp allreduce from rccl
By default, MSCCLPP uses remote write in the allreduce kernel for
large (> 1MB) messages. This PR adds an allreduce kernel that uses
remote read. To use it, set the environment variable
MSCCLPP_READ_ALLRED=1.
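The selection described above can be sketched as follows. This is only an illustration of the stated behaviour, not the code in the PR: the function name, the returned labels, the choice of 2^20 bytes for "1MB", and the exact string comparison on the variable are all assumptions.

```python
import os

# Assumption: "1MB" is taken as 2**20 bytes; the commit message does not say.
LARGE_MESSAGE_BYTES = 1 << 20


def pick_allreduce_path(nbytes, env=None):
    """Hypothetical helper: which allreduce path a message of nbytes would take."""
    env = os.environ if env is None else env
    if nbytes <= LARGE_MESSAGE_BYTES:
        # The commit message only describes large messages; smaller ones are unchanged.
        return "small-message path"
    if env.get("MSCCLPP_READ_ALLRED") == "1":
        return "remote-read"
    return "remote-write"
```

Passing `env` explicitly keeps the sketch easy to check without touching the real process environment, for example `pick_allreduce_path(4 << 20, {"MSCCLPP_READ_ALLRED": "1"})`.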
[ROCm/rccl commit: 4d68751ce1 ]
2024-10-08 14:42:12 -05:00
corey-derochie-amd
35d98330f2
Only set minNchannels if we are actually using MSCCL, checked using comm->mscclCompatible. ( #1337 )
...
[ROCm/rccl commit: c11f6b1531 ]
2024-10-08 10:20:55 -06:00
akolliasAMD
949fdd027b
disabled wbinvl1 for gfx9x on ll128 ( #1365 )
...
[ROCm/rccl commit: bc519fd733 ]
2024-10-08 08:43:29 -06:00
Nilesh M Negi
cd29f1e22f
[TRANSPORT] Add RCCL_FORCE_ENABLE_GDRDMA for debugging ( #1356 )
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
[ROCm/rccl commit: 8ad76f8d10 ]
2024-10-06 18:43:49 -05:00
akolliasAMD
9c4ac4cae5
Regression timing fix ( #1361 )
...
* Removed testbed initialization on standalone tests
* .jenkins re-enabled all tests
[ROCm/rccl commit: 7fb9189760 ]
2024-10-03 10:41:26 -06:00
Bertan Dogancay
974c13cd62
[BUILD] Move code generation to python from CMake ( #1360 )
...
* Use generate.py for func generation
* Convert AddUnroll.cmake to bash
[ROCm/rccl commit: 2dd10c8f17 ]
2024-10-03 10:21:19 -04:00
dependabot[bot]
152738dcc9
Bump rocm-docs-core from 1.7.2 to 1.8.2 in /docs/sphinx ( #1348 )
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core ) from 1.7.2 to 1.8.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases )
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md )
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.7.2...v1.8.2 )
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[ROCm/rccl commit: 038517b169 ]
2024-10-02 16:33:26 -06:00
Bertan Dogancay
b915c3c154
Merge pull request #1358 from BertanDogancay/nccl-2.21-sync
...
[ROCm/rccl commit: 833b185a2d ]
2024-10-02 18:21:06 -04:00
Nusrat Islam
1f7945286c
Enable MSCCLPP use in CPX mode ( #1355 )
...
This PR enables the use of MSCCLPP in CPX mode for 8 GPUs.
[ROCm/rccl commit: d13f9c44f5 ]
2024-10-02 11:52:04 -05:00
BertanDogancay
9059445acb
Merge remote-tracking branch 'nccl/master' into develop
...
[ROCm/rccl commit: 84081064a0 ]
2024-10-02 09:31:25 -05:00
Wenkai Du
74aa13afbe
Add another Rome model ( #1354 )
...
[ROCm/rccl commit: e453f1ced9 ]
2024-10-01 17:41:27 -05:00
Ziyue Yang
cf980e9b9c
Fix size matching in MSCCL ( #1318 )
...
[ROCm/rccl commit: 7830af5844 ]
2024-10-01 13:32:41 -07:00
Nilesh M Negi
efc500d2ff
[CI] Temporarily disable RCCL UT Standalone.RegressionTiming in CI ( #1350 )
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
[ROCm/rccl commit: 8b3ed8f104 ]
2024-09-27 14:08:36 -05:00
corey-derochie-amd
d5a2245a40
Checkout submodules with shallow depth ( #1353 )
...
* Make submodules shallow
* Updated README for the shallow checkout changes.
[ROCm/rccl commit: 7231808c58 ]
2024-09-27 11:07:16 -06:00
spolifroni-amd
dd884f00c0
Merge pull request #1345 from ROCm/spolifroni-amd/update-changelog
...
Updated 6.2.1 changelog so that it reflects what's in the 6.2.1 RN
[ROCm/rccl commit: 06a0ddb3b4 ]
2024-09-27 10:15:30 -04:00
Mustafa Abduljabbar
ef6d75b3ee
MSCCL Multithreaded regression root cause fix ( #1347 )
...
* Make sure the target device is used for MSCCL
* Enable single process mode by default to use MSCCL in MT
* Create a per-rank state when GPUs share a thread
[ROCm/rccl commit: 03a3ef3c34 ]
2024-09-25 15:24:25 -04:00
Nilesh M Negi
21a3b242bf
[TRANSPORT] GDRDMA enablement for linux kernel 6.4.0 or newer ( #1328 )
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
[ROCm/rccl commit: 105ff1611f ]
2024-09-25 11:29:52 -05:00
Tim
94ac752578
Remove 0 size UBR ( #1346 )
...
`ncclCommRegister`, which UBR requires, calls the IB dmabuf `regMr` directly, and that call rejects zero-size messages.
[ROCm/rccl commit: 40e93ebc29 ]
2024-09-24 18:16:51 -04:00
Nilesh M Negi
56bc01cb83
[BUILD] Enable MSCCL++ for gfx942 variants ( #1344 )
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
[ROCm/rccl commit: 3c61e934f2 ]
2024-09-23 19:05:49 -05:00
Sandra Polifroni
53478f138e
Updated the information for 6.2.1 in the changelog so that it reflects what's in the 6.2.1 release notes
...
[ROCm/rccl commit: 7f87b0cd85 ]
2024-09-23 14:27:58 -04:00
Nilesh M Negi
60ee54839c
Add Dockerfile to build rccl and rccl-tests ( #1011 )
...
* [BUILD] Add Dockerfile for RCCL and RCCL-Tests
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
* Update docker/Dockerfile.ubuntu
Typo for LD_LIBRARY_PATH
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com >
* Update docker/Dockerfile.ubuntu
use `-b` for `git clone` instead of additional `git checkout`
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com >
* Update docker/Dockerfile.ubuntu
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
---------
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com >
[ROCm/rccl commit: 707377b3cd ]
2024-09-22 03:53:16 -05:00
Mustafa Abduljabbar
13f6bbde57
Fix MSCCLPP seg-fault when RCCL_MSCCL_ENABLE_SINGLE_PROCESS is enabled ( #1338 )
...
Removing unnecessary changes.
rename unique hosts function
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com >
use updated function name
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com >
Missed one instance of `mscclIsMultithreadedComm`.
[ROCm/rccl commit: 2fe1e9f7db ]
2024-09-20 11:22:05 -05:00
gilbertlee-amd
d4094525c8
Fixing install.sh to properly accept spaces in ONLY_FUNCS ( #1339 )
...
[ROCm/rccl commit: 575afee5de ]
2024-09-18 17:25:36 -06:00
corey-derochie-amd
cf48e57bd9
Moved mscclpp_ncclGetUniqueId call into ncclCommInitRankFunc ( #1332 )
...
* Moved call to `mscclpp_ncclGetUniqueId` into `ncclCommInitRankFunc` to avoid setting up transport early in environments where MSCCL++ isn't valid.
* Checking `mscclEnabled` for the process and the topology to gate MSCCL++.
* Allowed `mscclForceEnable` to enable MSCCL++.
[ROCm/rccl commit: 853a0586b4 ]
2024-09-16 16:41:40 -06:00
dependabot[bot]
31b576b969
Bump rocm-docs-core from 1.7.1 to 1.7.2 in /docs/sphinx ( #1306 )
...
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core ) from 1.7.1 to 1.7.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases )
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md )
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.7.1...v1.7.2 )
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[ROCm/rccl commit: ad94c651ad ]
2024-09-13 16:57:45 -06:00
Mustafa Abduljabbar
bfa7940541
RCCL Tuner Plugin Docs
...
[ROCm/rccl commit: 05c7b7e69b ]
2024-09-12 13:43:45 -05:00
corey-derochie-amd
c8f4dedfd1
Added nlohmann/json:v3.11.3 as a submodule in ext-src and passed its path into the mscclpp build to avoid downloading the package at build time. ( #1330 )
...
[ROCm/rccl commit: b3b0ffdbf3 ]
2024-09-11 16:54:26 -06:00
corey-derochie-amd
9ffd893c5a
Re-enabled MSCCL++ ( #1325 )
...
* Added restrictions around calling MSCCL++ collectives (#1281 )
* Added restriction to non-zero 32-byte multiple message sizes to MSCCL++ AllGather.
* Renamed and refactored some mscclpp types.
* Only transmit the MSCCL++ unique id for non-split comm init. For splitting comm, it has already been transmitted. Instead, save the MSCCL++ communicator in child communicators when calling `ncclCommSplit`. Only destroy MSCCL++ communicators when no RCCL communicators remain that use it. Also improved trace logging.
* Disable MSCCL++ when using managed memory buffers as it isn't supported.
* Added datatype and op constraints for MSCCL++ AllReduce.
* Added documentation on MSCCL++ restrictions to the README.
* [BUILD] Support custom CMake flags in MSCCLPP (#1275 )
* [BUILD] Support custom CMAKE_PREFIX_PATH in MSCCLPP
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
* [BUILD] CMake flags to support build-id in MSCCLPP
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
* [BUILD] Fix CMake warnings in MSCCLPP build
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
* Wrapped all cmake arguments passed to mscclpp to remove empty arguments and properly format them.
---------
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
Co-authored-by: Corey Derochie <corey.derochie@amd.com >
* Link to libmscclpp_nccl statically (#1282 )
* Switched mscclpp_nccl to static linking. Added a build step to rename the NCCL API functions.
* Undid separation of building libmscclpp_nccl from building librccl with MSCCL++ integration. With a static build, it's either fully enabled or fully disabled.
* `nm` isn't always available in docker containers due to being stripped down. Removed use of `nm` in `cmake` and hard-coded the output into mscclpp_nccl_syms.txt.
* Removed IBVerbs dependency for integrating with MSCCL++ (#1313 )
* Renamed `RCCL_ENABLE_MSCCLPP` to `RCCL_MSCCLPP_ENABLE` to conform to MSCCL. Set `RCCL_MSCCLPP_ENABLE` to 1 by default if `ENABLE_MSCCLPP` is defined, or 0 otherwise. Added a log warning if `RCCL_MSCCLPP_ENABLE` is set to 1 but `ENABLE_MSCCLPP` is not defined. (#1294 )
* Include mscclpp as a git submodule (#1314 )
* Added the desired mscclpp commit as a git submodule.
* Added step to automatically checkout the mscclpp submodule if it isn't already present, in case the user forgot to clone recursively.
* Added instruction to README to clone using --recurse-submodules to get the mscclpp submodule.
* Enabled MSCCL++ feature build.
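Two of the gates listed above lend themselves to a short sketch: the `RCCL_MSCCLPP_ENABLE` default, and the AllGather message-size restriction. The code below is a model of the behaviour as described in this commit message, not RCCL's source. The function names are hypothetical, and returning False after the warning is an assumption (the message only says a warning is logged).

```python
def mscclpp_enabled(env, built_with_mscclpp, warn=print):
    """Hypothetical model of the RCCL_MSCCLPP_ENABLE default described above.

    built_with_mscclpp stands in for ENABLE_MSCCLPP being defined at build time.
    """
    default = "1" if built_with_mscclpp else "0"
    requested = env.get("RCCL_MSCCLPP_ENABLE", default) == "1"
    if requested and not built_with_mscclpp:
        warn("RCCL_MSCCLPP_ENABLE is set to 1 but ENABLE_MSCCLPP is not defined")
        return False  # assumption: the feature stays off when it was not built
    return requested


def allgather_size_ok(nbytes):
    """MSCCL++ AllGather restriction: non-zero multiple of 32 bytes."""
    return nbytes > 0 and nbytes % 32 == 0
```

For example, `mscclpp_enabled({}, built_with_mscclpp=True)` follows the build-time default, while `allgather_size_ok(48)` is False because 48 is not a multiple of 32.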
---------
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
Co-authored-by: Nilesh M Negi <Nilesh.Negi@amd.com >
[ROCm/rccl commit: 736a705875 ]
2024-09-11 09:55:16 -06:00
saurabhAMD
e3b39ab309
Making variable names consistent in EnvVars.cpp ( #1327 )
...
* Making variable names consistent in EnvVars.cpp
[ROCm/rccl commit: 4856309413 ]
2024-09-11 09:23:31 -05:00
mberenjk
78e0b3fe9e
replacing nccl/cuda related part of the api_trace.h with rccl/hip ( #1326 )
...
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com >
[ROCm/rccl commit: 4ceb672179 ]
2024-09-10 11:05:14 -05:00
saurabhAMD
fdaef9dd82
Enabling Unit Tests for CPX mode ( #1324 )
...
* Unit Tests for RCCL in CPX mode
* Let a user argument override the pow2gpus value set by CPX mode
* Adding comment for UT_POW2_GPUS
* Additional comment on why using pow2gpus for cpx mode.
[ROCm/rccl commit: 289a80c4e9 ]
2024-09-09 10:12:33 -05:00
dependabot[bot]
7873e551b1
Bump cryptography from 42.0.7 to 43.0.1 in /docs/sphinx ( #1317 )
...
Bumps [cryptography](https://github.com/pyca/cryptography ) from 42.0.7 to 43.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst )
- [Commits](https://github.com/pyca/cryptography/compare/42.0.7...43.0.1 )
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[ROCm/rccl commit: c85ac2bd1c ]
2024-09-06 14:28:54 -06:00
Tim
144a54f178
Merge pull request #1320 from AtlantaPepsi/UT_cpx_hotfix
...
Temporary patch for unit tests in cpx mode
[ROCm/rccl commit: 8169cf1dfd ]
2024-09-06 12:07:03 -04:00