Commit Graph

1036 commits

Author SHA1 Message Date
Bertan Dogancay f35777e9b0 improve compilation time and create timetrace plot (#773)
* improve compilation time and create time-trace plot

* set default value for nproc
2023-06-14 09:17:51 -06:00
Bertan Dogancay b89c5e0632 Enable --fast (#774) 2023-06-13 13:40:40 -06:00
akolliasAMD 9cdac774ea Wall clock update and npkit trace script Update (#771)
* changed builtin clock to wall_clock64
* updated npkit_Trace_generator to the new version of npkit
2023-06-07 17:47:10 -06:00
Sam Wu c3f47853bd Update Read the Docs, documentation, and dependabot (#772)
* update documentation

add version number to documentation

rename .sphinx/.doxygen to sphinx/doxygen

enable htmlzip, pdf, epub formats when publishing on Read the Docs

* add noCI label for dependabot PRs

since RTD CI is separate from math lib CI

* update rocm-docs-core to v0.13.4

* update README with link to rocm.docs.amd.com
2023-06-07 15:31:58 -06:00
dependabot[bot] 5f1f0142ac Bump cryptography from 40.0.2 to 41.0.0 in /docs/.sphinx (#766)
Bumps [cryptography](https://github.com/pyca/cryptography) from 40.0.2 to 41.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/40.0.2...41.0.0)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-07 10:27:56 -06:00
dependabot[bot] 5e8654a0b0 Bump requests from 2.28.2 to 2.31.0 in /docs/.sphinx (#747)
Bumps [requests](https://github.com/psf/requests) from 2.28.2 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.28.2...v2.31.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-07 10:27:36 -06:00
gilbertlee-amd 20b567caac Updating NOTICES.txt and LICENSE.txt (#770) 2023-06-07 09:45:03 -06:00
Cory Bloor b1a65afd58 Fix build on additional architectures (#740)
* Fix build on additional architectures

Instead of directly wrapping a platform-specific operation with a
preprocessor check against a gfx macro, it can be more flexible to
check a macro that can be overridden by the user. The gfx macro can then
just provide the default value for the macro, resulting in the same
default behaviour as if the gfx macro was checked directly but with
more control at build-time.

For example, to build rccl without using buffer_wbinvl1_vol on
gfx902, but still use the default on other archs, a user could
export CXXFLAGS='-Xarch_gfx902 -DRCCL_USE_WBINVL1_VOL=1' before
configuring the build. This flexibility isn't always necessary, but
it's nicer to have it and not need it than to need it and not have it.

* Define WARP_SIZE using warpSize builtin
2023-06-06 16:45:50 -06:00
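The override pattern described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `RCCL_USE_WBINVL1_VOL` is the macro named in the message, while `DEMO_ARCH_WITHOUT_WBINVL1_VOL`, `kUseWbinvl1Vol`, `invalidateStrategy()` and the default values are hypothetical stand-ins.

```cpp
// Sketch of a user-overridable feature macro with an architecture-based default.
// Assumption: DEMO_ARCH_WITHOUT_WBINVL1_VOL is a hypothetical placeholder for a
// real gfx architecture macro; the defaults chosen here are illustrative only.

// Only pick a default when the user has not already defined the macro,
// for example through a -D option passed in CXXFLAGS.
#ifndef RCCL_USE_WBINVL1_VOL
#  if defined(DEMO_ARCH_WITHOUT_WBINVL1_VOL)
#    define RCCL_USE_WBINVL1_VOL 0
#  else
#    define RCCL_USE_WBINVL1_VOL 1
#  endif
#endif

// The rest of the code checks the feature macro, never the architecture macro.
constexpr bool kUseWbinvl1Vol = (RCCL_USE_WBINVL1_VOL != 0);

// Hypothetical helper that reports which code path was selected at compile time.
constexpr const char* invalidateStrategy() {
  return kUseWbinvl1Vol ? "buffer_wbinvl1_vol" : "fallback";
}
```

Because the architecture check only supplies a default, a build that never defines the macro behaves exactly as if the architecture macro had been tested directly, while a build that defines it explicitly takes precedence.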
dependabot[bot] a029d34628 Bump rocm-docs-core from 0.11.0 to 0.13.3 in /docs/.sphinx (#765)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.11.0 to 0.13.3.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.11.0...v0.13.3)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-06 16:45:15 -06:00
Pedram Alizadeh 520f15e61b resolving the pthread-gtest linking issue for rccl-UnitTests (#768) 2023-06-06 14:21:40 -04:00
Wenkai Du 3af90902c8 Add NCCL_NCHANNELS_PER_PEER override (#767)
Also fix topol_expl build issue
2023-06-06 08:41:38 -07:00
Bertan Dogancay d52b6c0d24 add DMA_BUF support (#763)
* add DMA_BUF support

* remove unused libraries in src/init.cc

* change NCCL_ALL to NCCL_INIT

* remove extra pointer functions in transport/net.cc
2023-06-01 12:46:42 -06:00
gilbertlee-amd c62aebe882 Removing init_nvtx.cc from source list (#762) 2023-05-31 14:44:55 -06:00
Wenkai Du 5a38ff192b Rework barrier and event code (#761)
* Rework barrier and event code

* Switch to inline asm
2023-05-31 13:36:51 -07:00
Nusrat Islam 7519ecb476 Merge pull request #757 from nusislam/unroll
device: change unroll factor
2023-05-30 14:19:32 -05:00
akolliasAMD 2b1efa9e9a added time results on npkit generator (#749) 2023-05-30 12:57:25 -06:00
gilbertlee-amd 777d8747a5 Refactoring CMakeFiles (#755) 2023-05-25 16:08:54 -06:00
Nusrat Islam 4d1cfb17c8 device: change unroll factor
The default value of unroll factor is 2. Changing the unroll
factor to 4 provides better performance for most of the collectives.
2023-05-25 15:42:35 -05:00
akolliasAMD 58db1cb96d updated install script to enable all of npkit (#754) 2023-05-24 14:44:01 -06:00
Ziyue Yang 7d6e7bcd7d revert npkit (#748) 2023-05-24 07:41:05 -07:00
Ziyue Yang ed252c30f4 Limit MSCCL reduce unrolling to pow-2 cases to shrink kernel size (#746) 2023-05-19 11:46:36 -07:00
Ziyue Yang 11676267b5 fix min, max and avg (#745) 2023-05-18 11:02:59 -07:00
Sam Wu edb2ee1a44 Update documentation requirements (#743)
Co-authored-by: samjwu <samjwu@users.noreply.github.com>
2023-05-18 10:52:35 -06:00
Wen-Heng (Jack) Chung eba4e9e100 Merge pull request #742 from whchung/skip_done_event_msccl
Allow skipping doneEvent inside MSCCL.
2023-05-18 10:17:20 -05:00
Wenkai Du 403cda6322 Fix merge error (#744) 2023-05-18 08:09:27 -07:00
Wen-Heng (Jack) Chung ca4a1dfd67 Address review feedback and make the flag disabled by default. 2023-05-17 17:50:25 +00:00
Wen-Heng (Jack) Chung 12dba425de Skip doneEvent inside MSCCL by default.
Added an RCCL_MSCCL_ENABLE_DONE_EVENT env var, set to 0 by default.

The env var is to control whether to use doneEvent when invoking MSCCL
kernels.

Skipping doneEvent would cause the firmware to skip L2 cache flush,
resulting in overall performance improvement.
2023-05-17 16:49:42 +00:00
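A minimal sketch of how an opt-in environment variable of this kind could be read, assuming that an unset or unparsable value is treated as 0. The helper name `doneEventEnabled` is hypothetical, and RCCL's own parameter handling may differ.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: returns true only when the variable holds a non-zero
// integer. Unset, empty, or non-numeric values are treated as the default 0.
inline bool doneEventEnabled(const char* name = "RCCL_MSCCL_ENABLE_DONE_EVENT") {
  assert(name != nullptr);
  const char* value = std::getenv(name);
  if (value == nullptr || *value == '\0') {
    return false;  // default: doneEvent is skipped
  }
  char* end = nullptr;
  const long parsed = std::strtol(value, &end, 10);
  if (end == value) {
    return false;  // no digits were consumed
  }
  return parsed != 0;
}
```

With this sketch, a launch that sets `RCCL_MSCCL_ENABLE_DONE_EVENT=1` in its environment would opt back in to using doneEvent, and every other launch would keep the default.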
Wenkai Du 4ca7742c61 Revert "Ensure memory copy integrity during transport setup (#731)" (#741)
* Revert "Ensure memory copy integrity during transport setup (#731)"

This reverts commit 36e453c61e.

Add stream synchronization in ncclStrongStreamRelease.

* Use event record and wait
2023-05-16 10:34:47 -07:00
akolliasAMD c88475462b added modified npkit_trace_generator.py to scripts (#738)
* added modified npkit_trace_generator.py to scripts
2023-05-09 10:11:35 -06:00
Wenkai Du 8bb3340fcb Skip checking of some settings in Cray OS (#739) 2023-05-09 07:59:56 -07:00
Wenkai Du b6542c9b82 Merge pull request #735 from wenkaidu/nvls
Remove references to NVLS functions
2023-05-05 15:09:58 -07:00
Wenkai Du 897745a266 Remove references to NVLS functions 2023-05-05 07:55:20 -07:00
Wenkai Du e21be4acaf Merge pull request #732 from ROCmSoftwarePlatform/2.17.1
Sync up to NCCL 2.17.1
2023-05-02 08:30:36 -07:00
Wenkai Du 53a1f91857 Merge remote-tracking branch 'nccl/master' into develop 2023-04-25 15:38:32 -07:00
Wenkai Du 36e453c61e Ensure memory copy integrity during transport setup (#731) 2023-04-25 14:41:43 -07:00
dependabot[bot] 79df139f0b Bump rocm-docs-core from 0.2.0 to 0.5.0 in /docs/.sphinx (#728)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.2.0 to 0.5.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/commits/v0.5.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-16 18:20:17 -06:00
Saad Rahim a78ff46861 Standardizing documentation homepage message (#726) 2023-04-16 18:14:56 -06:00
David Addison 9b7d5edbfc Merge pull request #822 from KaimingOuyang/github/pytorch-hang-fix
Shutdown socket before close in ncclSocketClose()
2023-04-14 19:52:45 -07:00
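The shutdown-before-close ordering named in this merge can be illustrated with plain POSIX calls. This is a sketch under assumptions, not the body of `ncclSocketClose()`: the helper names are hypothetical. `shutdown()` with `SHUT_RDWR` disables further sends and receives on the socket itself, whereas `close()` only releases the file descriptor.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: shut the connection down in both directions, then
// release the descriptor. Returns the result of close(), or -1 for a bad fd.
inline int closeSocketGracefully(int fd) {
  if (fd < 0) {
    return -1;
  }
  // shutdown() can fail (for example with ENOTCONN on an unconnected socket);
  // that is ignored here because the descriptor is closed either way.
  (void)shutdown(fd, SHUT_RDWR);
  return close(fd);
}

// Small demonstration on a local socket pair: after one end is shut down and
// closed, a read on the other end reports end-of-file (returns 0).
inline void demoPeerSeesEof() {
  int sv[2];
  const int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
  assert(rc == 0);
  (void)rc;
  closeSocketGracefully(sv[0]);
  char byte = 0;
  const ssize_t n = read(sv[1], &byte, 1);
  assert(n == 0);
  (void)n;
  close(sv[1]);
}
```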
Wenkai Du 4b09ffba43 msccl: print stack and memory usage (#723)
* msccl: print stack and memory usage

* Update number of kernels calculation
2023-04-14 14:59:03 -07:00
Pedram Alizadeh 53c1c38f0e Disabled hipgraph tests! (#725) 2023-04-13 17:42:05 -04:00
Kaiming Ouyang 006b6bc7dc Add a comment to shutdown() in ncclSocketClose 2023-04-13 09:13:44 -07:00
Kaiming Ouyang 367e9b61c3 Shutdown socket before close in ncclSocketClose() 2023-04-13 09:11:52 -07:00
Ziyue Yang 7289c05146 MSCCL: Fix memcpy bug (#721) 2023-04-11 14:46:53 -07:00
Sam Wu dc149a9fbd pin rocm-docs-core and add dependabot config (#722) 2023-04-11 10:01:24 -06:00
akolliasAMD 2ce7d971e5 reduced the number of child processes to active ones (#720) 2023-04-11 08:59:56 -06:00
gilbertlee-amd 27e0cb43c2 Unit test performance refactor (#700)
* Refactoring unit tests to improve performance
* Spawning child processes during InitComms instead of on TestBed construction
* Temporarily disabling graph unit tests
2023-04-06 12:28:53 -06:00
akolliasAMD 9fe5a349f1 added npkit_enable on CI tests (#698) 2023-04-05 08:05:23 -06:00
Wenkai Du addbf4bd90 rccl-prim-test: minor update (#718) 2023-04-03 07:30:04 -07:00
Ziyue Yang c8e33b1232 fix msccl stream usage (#717) 2023-03-24 10:59:36 -07:00
akolliasAMD caf7a3d47d fixed jenkins LD_LIBRARY_PATH (#714) 2023-03-21 11:05:43 -06:00