Bertan Dogancay
f35777e9b0
improve compilation time and create timetrace plot ( #773 )
...
* improve compilation time and create time-trace plot
* set default value for nproc
2023-06-14 09:17:51 -06:00
Bertan Dogancay
b89c5e0632
Enable --fast ( #774 )
2023-06-13 13:40:40 -06:00
akolliasAMD
9cdac774ea
Wall clock update and npkit trace script Update ( #771 )
...
* changed builtin clock to wall_clock64
* updated npkit_trace_generator to the new version of npkit
2023-06-07 17:47:10 -06:00
Sam Wu
c3f47853bd
Update Read the Docs, documentation, and dependabot ( #772 )
...
* update documentation
add version number to documentation
rename .sphinx/.doxygen to sphinx/doxygen
enable htmlzip, pdf, epub formats when publishing on Read the Docs
* add noCI label for dependabot PRs
since RTD CI is separate from math lib CI
* update rocm-docs-core to v0.13.4
* update README with link to rocm.docs.amd.com
2023-06-07 15:31:58 -06:00
dependabot[bot]
5f1f0142ac
Bump cryptography from 40.0.2 to 41.0.0 in /docs/.sphinx ( #766 )
...
Bumps [cryptography](https://github.com/pyca/cryptography) from 40.0.2 to 41.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/40.0.2...41.0.0)
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-07 10:27:56 -06:00
dependabot[bot]
5e8654a0b0
Bump requests from 2.28.2 to 2.31.0 in /docs/.sphinx ( #747 )
...
Bumps [requests](https://github.com/psf/requests) from 2.28.2 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.28.2...v2.31.0)
---
updated-dependencies:
- dependency-name: requests
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-07 10:27:36 -06:00
gilbertlee-amd
20b567caac
Updating NOTICES.txt and LICENSE.txt ( #770 )
2023-06-07 09:45:03 -06:00
Cory Bloor
b1a65afd58
Fix build on additional architectures ( #740 )
...
* Fix build on additional architectures
Instead of directly wrapping a platform-specific operation with a
preprocessor check against a gfx macro, it can be more flexible to
check a macro that can be overridden by the user. The gfx macro can then
just provide the default value for the macro, resulting in the same
default behaviour as if the gfx macro was checked directly but with
more control at build-time.
For example, to build rccl without using buffer_wbinvl1_vol on
gfx902, but still use the default on other archs, a user could
export CXXFLAGS='-Xarch_gfx902 -DRCCL_USE_WBINVL1_VOL=0' before
configuring the build. This flexibility isn't always necessary, but
it's nicer to have it and not need it than to need it and not have it.
* Define WARP_SIZE using warpSize builtin
2023-06-06 16:45:50 -06:00
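The overridable-macro pattern described in the "Fix build on additional architectures" commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: only the name `RCCL_USE_WBINVL1_VOL` comes from the commit message. The macro `EXAMPLE_GFX_WITHOUT_WBINVL1_VOL` is a hypothetical stand-in for a real gfx architecture check, the helper `invalidate_instruction` is invented for the example, and the sketch assumes that a value of 1 means "use the instruction".

```cpp
#include <string>

// A user-supplied -DRCCL_USE_WBINVL1_VOL=<0|1> always wins. Only when the
// user has not defined the macro does the architecture check pick a default.
#ifndef RCCL_USE_WBINVL1_VOL
#  if defined(EXAMPLE_GFX_WITHOUT_WBINVL1_VOL)  // hypothetical stand-in for a gfx macro
#    define RCCL_USE_WBINVL1_VOL 0
#  else
#    define RCCL_USE_WBINVL1_VOL 1
#  endif
#endif

// Hypothetical helper: reports which code path the build selected. Real code
// would emit the instruction (or a fallback) here instead of returning a name.
inline std::string invalidate_instruction() {
#if RCCL_USE_WBINVL1_VOL
  return "buffer_wbinvl1_vol";
#else
  return "fallback";
#endif
}
```

Code below the `#ifndef` block tests only `RCCL_USE_WBINVL1_VOL`, never the architecture macro directly, so the default behaviour is unchanged while a per-architecture override stays possible at build time.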
dependabot[bot]
a029d34628
Bump rocm-docs-core from 0.11.0 to 0.13.3 in /docs/.sphinx ( #765 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.11.0 to 0.13.3.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.11.0...v0.13.3)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-06 16:45:15 -06:00
Pedram Alizadeh
520f15e61b
resolving the pthread-gtest linking issue for rccl-UnitTests ( #768 )
2023-06-06 14:21:40 -04:00
Wenkai Du
3af90902c8
Add NCCL_NCHANNELS_PER_PEER override ( #767 )
...
Also fix topo_expl build issue
2023-06-06 08:41:38 -07:00
Bertan Dogancay
d52b6c0d24
add DMA_BUF support ( #763 )
...
* add DMA_BUF support
* remove unused libraries in src/init.cc
* change NCCL_ALL to NCCL_INIT
* remove extra pointer functions in transport/net.cc
2023-06-01 12:46:42 -06:00
gilbertlee-amd
c62aebe882
Removing init_nvtx.cc from source list ( #762 )
2023-05-31 14:44:55 -06:00
Wenkai Du
5a38ff192b
Rework barrier and event code ( #761 )
...
* Rework barrier and event code
* Switch to inline asm
2023-05-31 13:36:51 -07:00
Nusrat Islam
7519ecb476
Merge pull request #757 from nusislam/unroll
...
device: change unroll factor
2023-05-30 14:19:32 -05:00
akolliasAMD
2b1efa9e9a
added time results on npkit generator ( #749 )
2023-05-30 12:57:25 -06:00
gilbertlee-amd
777d8747a5
Refactoring CMakeFiles ( #755 )
2023-05-25 16:08:54 -06:00
Nusrat Islam
4d1cfb17c8
device: change unroll factor
...
The default value of unroll factor is 2. Changing the unroll
factor to 4 provides better performance for most of the collectives.
2023-05-25 15:42:35 -05:00
akolliasAMD
58db1cb96d
updated install script to enable all of npkit ( #754 )
2023-05-24 14:44:01 -06:00
Ziyue Yang
7d6e7bcd7d
revert npkit ( #748 )
2023-05-24 07:41:05 -07:00
Ziyue Yang
ed252c30f4
Limit MSCCL reduce unrolling to pow-2 cases to shrink kernel size ( #746 )
2023-05-19 11:46:36 -07:00
Ziyue Yang
11676267b5
fix min, max and avg ( #745 )
2023-05-18 11:02:59 -07:00
Sam Wu
edb2ee1a44
Update documentation requirements ( #743 )
...
Co-authored-by: samjwu <samjwu@users.noreply.github.com>
2023-05-18 10:52:35 -06:00
Wen-Heng (Jack) Chung
eba4e9e100
Merge pull request #742 from whchung/skip_done_event_msccl
...
Allow skipping doneEvent inside MSCCL.
2023-05-18 10:17:20 -05:00
Wenkai Du
403cda6322
Fix merge error ( #744 )
2023-05-18 08:09:27 -07:00
Wen-Heng (Jack) Chung
ca4a1dfd67
Address review feedback and make the flag disabled by default.
2023-05-17 17:50:25 +00:00
Wen-Heng (Jack) Chung
12dba425de
Skip doneEvent inside MSCCL by default.
...
Added an RCCL_MSCCL_ENABLE_DONE_EVENT env var, set to 0 by default.
The env var controls whether to use doneEvent when invoking MSCCL
kernels.
Skipping doneEvent would cause the firmware to skip L2 cache flush,
resulting in overall performance improvement.
2023-05-17 16:49:42 +00:00
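The default-off behaviour of `RCCL_MSCCL_ENABLE_DONE_EVENT` described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: only the variable name and its default of 0 come from the commit message, while both helper functions are hypothetical and the library may well read the variable through its own parameter machinery.

```cpp
#include <cstdlib>

// Hypothetical helper: interprets the raw environment value.
// An unset or non-numeric value falls back to 0, i.e. doneEvent is skipped.
inline bool parse_done_event_flag(const char* raw) {
  if (raw == nullptr) return false;
  return std::strtol(raw, nullptr, 10) != 0;
}

// Hypothetical helper: true only when the user explicitly opts back in,
// for example by exporting RCCL_MSCCL_ENABLE_DONE_EVENT=1.
inline bool msccl_done_event_enabled() {
  return parse_done_event_flag(std::getenv("RCCL_MSCCL_ENABLE_DONE_EVENT"));
}
```

Keeping the parsing separate from the `getenv` call makes the default easy to check without touching the process environment.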
Wenkai Du
4ca7742c61
Revert "Ensure memory copy integrity during transport setup ( #731 )" ( #741 )
...
* Revert "Ensure memory copy integrity during transport setup (#731)"
This reverts commit 36e453c61e.
Add stream synchronization in ncclStrongStreamRelease.
* Use event record and wait
2023-05-16 10:34:47 -07:00
akolliasAMD
c88475462b
added modified npkit_trace_generator.py to scripts ( #738 )
...
* added modified npkit_trace_generator.py to scripts
2023-05-09 10:11:35 -06:00
Wenkai Du
8bb3340fcb
Skip checking of some settings in Cray OS ( #739 )
2023-05-09 07:59:56 -07:00
Wenkai Du
b6542c9b82
Merge pull request #735 from wenkaidu/nvls
...
Remove references to NVLS functions
2023-05-05 15:09:58 -07:00
Wenkai Du
897745a266
Remove references to NVLS functions
2023-05-05 07:55:20 -07:00
Wenkai Du
e21be4acaf
Merge pull request #732 from ROCmSoftwarePlatform/2.17.1
...
Sync up to NCCL 2.17.1
2023-05-02 08:30:36 -07:00
Wenkai Du
53a1f91857
Merge remote-tracking branch 'nccl/master' into develop
2023-04-25 15:38:32 -07:00
Wenkai Du
36e453c61e
Ensure memory copy integrity during transport setup ( #731 )
2023-04-25 14:41:43 -07:00
dependabot[bot]
79df139f0b
Bump rocm-docs-core from 0.2.0 to 0.5.0 in /docs/.sphinx ( #728 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.2.0 to 0.5.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/commits/v0.5.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-16 18:20:17 -06:00
Saad Rahim
a78ff46861
Standardizing documentation homepage message ( #726 )
2023-04-16 18:14:56 -06:00
David Addison
9b7d5edbfc
Merge pull request #822 from KaimingOuyang/github/pytorch-hang-fix
...
Shutdown socket before close in ncclSocketClose()
2023-04-14 19:52:45 -07:00
Wenkai Du
4b09ffba43
msccl: print stack and memory usage ( #723 )
...
* msccl: print stack and memory usage
* Update number of kernels calculation
2023-04-14 14:59:03 -07:00
Pedram Alizadeh
53c1c38f0e
Disabled hipgraph tests! ( #725 )
2023-04-13 17:42:05 -04:00
Kaiming Ouyang
006b6bc7dc
Add a comment to shutdown() in ncclSocketClose
2023-04-13 09:13:44 -07:00
Kaiming Ouyang
367e9b61c3
Shutdown socket before close in ncclSocketClose()
2023-04-13 09:11:52 -07:00
Ziyue Yang
7289c05146
MSCCL: Fix memcpy bug ( #721 )
2023-04-11 14:46:53 -07:00
Sam Wu
dc149a9fbd
pin rocm-docs-core and add dependabot config ( #722 )
2023-04-11 10:01:24 -06:00
akolliasAMD
2ce7d971e5
reduced the number of child processes to only the active ones ( #720 )
2023-04-11 08:59:56 -06:00
gilbertlee-amd
27e0cb43c2
Unit test performance refactor ( #700 )
...
* Refactoring unit tests to improve performance
* Spawning child processes during InitComms instead of on TestBed construction
* Temporarily disabling graph unit tests
2023-04-06 12:28:53 -06:00
akolliasAMD
9fe5a349f1
added npkit_enable on CI tests ( #698 )
2023-04-05 08:05:23 -06:00
Wenkai Du
addbf4bd90
rccl-prim-test: minor update ( #718 )
2023-04-03 07:30:04 -07:00
Ziyue Yang
c8e33b1232
fix msccl stream usage ( #717 )
2023-03-24 10:59:36 -07:00
akolliasAMD
caf7a3d47d
fixed jenkins LD_LIBRARY_PATH ( #714 )
2023-03-21 11:05:43 -06:00