Commit Graph

1379 Commits

Author SHA1 Message Date
Wenkai Du 0ff5fc0bad npkit: add broadcast trace (#1166)
[ROCm/rccl commit: a0cef69110]
2024-05-07 14:00:16 -07:00
Pak Nin Lui df3d462dd9 Merge pull request #1167 from paklui/dmabuf
fix typo for DMABUF_ENABLE

[ROCm/rccl commit: 92a4fc6204]
2024-05-07 08:48:44 -07:00
dependabot[bot] 0d025525ad Bump jinja2 from 3.1.3 to 3.1.4 in /docs/sphinx (#1168)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.3...3.1.4)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rccl commit: eb562e7b22]
2024-05-06 15:35:34 -06:00
paklui dd8e937948 fix typo for DMABUF_ENABLE
[ROCm/rccl commit: 140b7dd40f]
2024-05-06 13:27:50 -07:00
Wenkai Du c782aba364 Bypass NVIDIA Ampere related tuning (#1165)
[ROCm/rccl commit: b513c3970a]
2024-05-03 17:57:16 -07:00
Wenkai Du 7c811a7582 Fix ignore NUMA not being observed for NICs during model matching (#1164)
[ROCm/rccl commit: bb58b1c258]
2024-05-03 16:42:07 -07:00
Wenkai Du 9638535690 Fix build error when roctracer-dev package is not installed (#1161)
[ROCm/rccl commit: 6f5a8ce1fb]
2024-05-01 13:55:09 -07:00
Wenkai Du 3906e992f8 MSCCL: add support for out-of-place all reduce (#1156)
[ROCm/rccl commit: 4e1b8c1cbb]
2024-04-28 19:49:09 -07:00
Wenkai Du 703014e960 Add back tree simple chunk size tuning (#1157)
[ROCm/rccl commit: cd6e840e0b]
2024-04-28 19:48:53 -07:00
Nilesh M Negi b99b89e7a2 [GRAPH] Reduce NCCL_TOPO_MAX_NODES to 64 (#1153)
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>

[ROCm/rccl commit: b90436d292]
2024-04-27 23:41:11 -05:00
Tim afeaa17475 Merge pull request #1158 from AtlantaPepsi/NPKit_fix
Prevent segfault from npkit-enabled rccl build

[ROCm/rccl commit: cc39e91c6f]
2024-04-26 12:44:04 -04:00
AtlantaPepsi 8cf28704ce prevent segfault from npkit-enabled rccl build
Signed-off-by: AtlantaPepsi <timhu102@amd.com>


[ROCm/rccl commit: 67246649ac]
2024-04-26 10:54:27 -05:00
Wenkai Du 3c94f98688 Revert "Use relaxed atomics for LL on GFX11 (#859)" (#1148)
This reverts commit 5983f0e371.

Use inline asm for 128b load on GFX11 for better performance.

[ROCm/rccl commit: f330b82985]
2024-04-26 07:49:55 -07:00
Bertan Dogancay dea5e83940 [UT] Start supporting multiple group calls and graphs (#1151)
* Start supporting multiple group calls UT

[ROCm/rccl commit: 0ec41f1386]
2024-04-25 11:11:16 -06:00
Shilei Tian 9a203f439c SWDEV-455705: Fix an UB that could lead to miscompilation (#1155)
[ROCm/rccl commit: efe99057b0]
2024-04-25 10:10:01 -07:00
Wenkai Du e494f29235 Replace __HIP_PLATFORM_HCC__ with __HIP_PLATFORM_AMD__ (#1154)
[ROCm/rccl commit: 9e0c9b4ed8]
2024-04-25 07:19:18 -07:00
Bertan Dogancay ed152c5b89 Update CHANGELOG.md for RCCL 2.20.5 (#1150)
[ROCm/rccl commit: dcc75797a1]
2024-04-24 09:07:49 -06:00
Bertan Dogancay 2ad3fee222 Merge pull request #1111 from BertanDogancay/2.20
2.20.5 Sync

[ROCm/rccl commit: 8753bec3ea]
2024-04-24 09:05:41 -06:00
BertanDogancay 36f9492cda Merge remote-tracking branch 'nccl/master' into develop
[ROCm/rccl commit: e1a835910e]
2024-04-23 13:34:00 -07:00
Wenkai Du 35f8d269f8 Use hipExtMallocWithFlags to allocate host memory on APU (#1149)
Also use SM60 as CUDA compatibility level.

[ROCm/rccl commit: 220066197a]
2024-04-17 16:56:38 -07:00
corey-derochie-amd 34fb1007a7 Updated CHANGELOG for next release (#1146)
* Updated CHANGELOG to release for ROCm 6.1.0 (#1142)

* Fixed missing CHANGELOG notes from ROCm 5.5 through unreleased 6.1 (#1141)

* Update CHANGELOG.md for ROCm release 5.5

(cherry picked from commit 83342e865445b233319466d4a620c1166ecaf181)

* Update CHANGELOG.md for ROCm 5.7.0

(cherry picked from commit a7c3b8dcb5cd0654f0a39cb3be4fdf7e8c820577)

* Added ROCm 6.0 and 6.1 CHANGELOG notes.

---------

Co-authored-by: gilbertlee-amd <44450918+gilbertlee-amd@users.noreply.github.com>
(cherry picked from commit 28a2b09304)

* Updated CHANGELOG to release for ROCm 6.1.0

* Removed empty sections from CHANGELOG in latest releases.

(cherry picked from commit 164c9553717f2c3bce86a372764ea73030dd5f72)

* Reverted ROCm 6.1.0 block to "Unreleased"

[ROCm/rccl commit: a14137c062]
2024-04-15 16:29:40 -06:00
corey-derochie-amd fa5d8d7a6b Created PR template for the rccl repo (#1118)
[ROCm/rccl commit: 8f471ba537]
2024-04-15 15:34:42 -06:00
gilbertlee-amd 422a7ffcbb Rail optimization for rings (#1140)
- Modifies the ring creation algorithm to be friendlier to rail-optimized topologies (should not affect classic fabric topologies)

[ROCm/rccl commit: 4cb62f999a]
2024-04-15 12:03:57 -06:00
Bertan Dogancay 8ddb74e3b1 Add unique files to source list (#1144)
[ROCm/rccl commit: 3caad91f32]
2024-04-15 09:46:53 -06:00
dependabot[bot] fb20f695ca Bump idna from 3.4 to 3.7 in /docs/sphinx (#1143)
Bumps [idna](https://github.com/kjd/idna) from 3.4 to 3.7.
- [Release notes](https://github.com/kjd/idna/releases)
- [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.rst)
- [Commits](https://github.com/kjd/idna/compare/v3.4...v3.7)

---
updated-dependencies:
- dependency-name: idna
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rccl commit: c50eaddc28]
2024-04-12 09:28:39 -06:00
corey-derochie-amd 28a2b09304 Fixed missing CHANGELOG notes from ROCm 5.5 through unreleased 6.1 (#1141)
* Update CHANGELOG.md for ROCm release 5.5

(cherry picked from commit 83342e865445b233319466d4a620c1166ecaf181)

* Update CHANGELOG.md for ROCm 5.7.0

(cherry picked from commit a7c3b8dcb5cd0654f0a39cb3be4fdf7e8c820577)

* Added ROCm 6.0 and 6.1 CHANGELOG notes.

---------

Co-authored-by: gilbertlee-amd <44450918+gilbertlee-amd@users.noreply.github.com>

[ROCm/rccl commit: 3361abe786]
2024-04-11 15:04:40 -06:00
mberenjk da835cff9c replacing rccl_bfloat16 with hip_bfloat16 (#1126)
Co-authored-by: mberenjk <mberenjk@amd.com>

[ROCm/rccl commit: 428837ffe4]
2024-04-11 11:30:37 -05:00
dependabot[bot] 165d51b255 Bump rocm-docs-core from 0.38.0 to 0.38.1 in /docs/sphinx (#1139)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.0 to 0.38.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.0...v0.38.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rccl commit: d3899c0581]
2024-04-11 09:32:54 -06:00
arvindcheru 2c0284885a Update Depends with correct HIP Runtime package name (#1130)
[ROCm/rccl commit: c1b8eab8e1]
2024-04-09 19:27:07 -04:00
Wenkai Du 99c7fc29ba NPKit: doubling size of event buffers following MAXCHANNELS change (#1135)
[ROCm/rccl commit: 0ce68f21d4]
2024-04-09 08:02:58 -07:00
Wenkai Du 0941d6bc6e Fix buffer overflow when parsing kernel cmdline (#1133)
[ROCm/rccl commit: 137571fa01]
2024-04-08 11:12:20 -07:00
gilbertlee-amd 62b9f0d3a7 [topo_expl] Adding -n option to override number of nodes (#1134)
[ROCm/rccl commit: 93982533d7]
2024-04-04 15:11:47 -06:00
Wenkai Du 890fafc2f7 rccl_prim_test: increase max number of workgroups and test iterations (#1132)
[ROCm/rccl commit: e8c76fd806]
2024-04-03 11:29:21 -07:00
dependabot[bot] d6aba883d4 Bump rocm-docs-core from 0.37.0 to 0.38.0 in /docs/sphinx (#1127)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.38.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.38.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rccl commit: d0d1bfdeda]
2024-03-27 11:24:30 -06:00
arvindcheru a285fda3a1 Static Build update - Moved all cmake install() to rocm-cmake APIs, static build update (#1123)
[ROCm/rccl commit: c0a51dc84b]
2024-03-26 11:11:09 -04:00
corey-derochie-amd 62a6a07d49 Replaced ROCmSoftwarePlatform and RadeonOpenCompute links with ROCm links. (#1125)
[ROCm/rccl commit: 503a472a25]
2024-03-25 16:29:13 -06:00
corey-derochie-amd 19897f8d90 Fixes the copyright comment block on each of topo_expl/models/*.xml. The format was not valid XML. (#1124)
[ROCm/rccl commit: 9eefc68cb5]
2024-03-25 16:21:17 -06:00
Wenkai Du 43bbee4dcc Remove hipEventDisableSystemFence (#1122)
There is no indication that disabling system fence has any latency improvement.
Removing it per recommendation from HIP.

[ROCm/rccl commit: 5976f757dd]
2024-03-25 08:01:57 -07:00
Pedram Alizadeh 61f89d680d msccl algorithms tuning for alltoall on MI300 (#1120)
Co-authored-by: PedramAlizadeh <amd@pmohamma.com>

[ROCm/rccl commit: c2fc1d6809]
2024-03-21 20:35:29 -04:00
corey-derochie-amd 9c2a57259d Added @corey-derochie-amd as a code owner (to rocm-documentation) (#1119)
[ROCm/rccl commit: 606d3e6b6e]
2024-03-21 14:56:05 -06:00
dependabot[bot] d956fe9cbd Bump rocm-docs-core from 0.36.0 to 0.37.0 in /docs/sphinx (#1117)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rccl commit: cb80586fb9]
2024-03-20 09:25:14 -06:00
jbachan b492ab6313 Merge pull request #1217 from crazy-JiangDongHua/bugfix_undo_plan
Bug in plan enqueue logic where plans could be silently not launched for some communicators. Triggered when both are true:
1. Multiple communicators per ncclGroup.
2. Communicators within a group have different plan counts.
3. Intra-process launch barrier disabled.

[ROCm/rccl commit: 6dd51f15bf]
2024-03-18 10:12:26 -07:00
Nilesh M Negi f93831cf6a BUILD: Enable RCCL static build (#1114)
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>

[ROCm/rccl commit: 53fad75001]
2024-03-15 12:18:18 -05:00
srawat 7c8cf72d35 refactor RCCL (#1112)
* refactor RCCL

* rccl updates

* Update index.rst

* refactor

* Update what-is-rccl.rst

[ROCm/rccl commit: 45ee5734dd]
2024-03-15 14:14:47 +05:30
Pedram Alizadeh 17b9546da9 msccl algorithms tuning for allgather on MI300 (#1110)
[ROCm/rccl commit: 50f22e8317]
2024-03-14 12:18:26 -04:00
dependabot[bot] 7e22922051 Bump rocm-docs-core from 0.35.1 to 0.36.0 in /docs/sphinx (#1109)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rccl commit: 0867562b18]
2024-03-12 09:38:20 -06:00
FrankJ 894d3459a5 [bugfix]save undo plans in some case
[ROCm/rccl commit: 9ef920a77b]
2024-03-12 00:00:16 +08:00
Andy li e373bd44bf Enable fp8 support (#1101)
* initial checkin

* resolve cr comments

* resolve the build issue

* fix the data correctness issue

* update fp8 header file and update the unit test for fp8 support

* remove fp16 from fp8 headers

* fix ut issue and catch up the latest code from develop

* update according to cr comments

* update ut according to cr comments

* update num floats for each SumPostDiv from 4 to 6

* update fp8 header file name

* fix the typo

[ROCm/rccl commit: 6777e65c1d]
2024-03-08 15:17:53 -08:00
Wenkai Du 2354601589 Improve debug messages of memory allocations (#1107)
[ROCm/rccl commit: ff951e607d]
2024-03-08 10:55:10 -08:00
Wenkai Du c2eff3ecd9 topo_expl: 2.19.4 update and fix build error (#1098)
[ROCm/rccl commit: d2224fd3e1]
2024-03-07 08:52:50 -08:00