dependabot[bot]
c50eaddc28
Bump idna from 3.4 to 3.7 in /docs/sphinx ( #1143 )
...
Bumps [idna](https://github.com/kjd/idna) from 3.4 to 3.7.
- [Release notes](https://github.com/kjd/idna/releases)
- [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.rst)
- [Commits](https://github.com/kjd/idna/compare/v3.4...v3.7)
---
updated-dependencies:
- dependency-name: idna
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-12 09:28:39 -06:00
corey-derochie-amd
3361abe786
Fixed missing CHANGELOG notes from ROCm 5.5 through unreleased 6.1 ( #1141 )
...
* Update CHANGELOG.md for ROCm release 5.5
(cherry picked from commit 975327be45f2313dc7249f9c54ad90870e833a4a)
* Update CHANGELOG.md for ROCm 5.7.0
(cherry picked from commit ac8db8d8e0853f1783c10e2858f6c3b86e4d27cb)
* Added ROCm 6.0 and 6.1 CHANGELOG notes.
---------
Co-authored-by: gilbertlee-amd <44450918+gilbertlee-amd@users.noreply.github.com>
2024-04-11 15:04:40 -06:00
mberenjk
428837ffe4
replacing rccl_bfloat16 with hip_bfloat16 ( #1126 )
...
Co-authored-by: mberenjk <mberenjk@amd.com>
2024-04-11 11:30:37 -05:00
dependabot[bot]
d3899c0581
Bump rocm-docs-core from 0.38.0 to 0.38.1 in /docs/sphinx ( #1139 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.0 to 0.38.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.0...v0.38.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-11 09:32:54 -06:00
arvindcheru
c1b8eab8e1
Update Depends with correct HIP Runtime package name ( #1130 )
2024-04-09 19:27:07 -04:00
Wenkai Du
0ce68f21d4
NPKit: doubling size of event buffers following MAXCHANNELS change ( #1135 )
2024-04-09 08:02:58 -07:00
Wenkai Du
137571fa01
Fix buffer overflow when parsing kernel cmdline ( #1133 )
2024-04-08 11:12:20 -07:00
gilbertlee-amd
93982533d7
[topo_expl] Adding -n option to override number of nodes ( #1134 )
2024-04-04 15:11:47 -06:00
Wenkai Du
e8c76fd806
rccl_prim_test: increase max number of workgroups and test iterations ( #1132 )
2024-04-03 11:29:21 -07:00
dependabot[bot]
d0d1bfdeda
Bump rocm-docs-core from 0.37.0 to 0.38.0 in /docs/sphinx ( #1127 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.38.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.38.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-27 11:24:30 -06:00
arvindcheru
c0a51dc84b
Static build update: moved all cmake install() calls to rocm-cmake APIs ( #1123 )
2024-03-26 11:11:09 -04:00
corey-derochie-amd
503a472a25
Replaced ROCmSoftwarePlatform and RadeonOpenCompute links with ROCm links. ( #1125 )
2024-03-25 16:29:13 -06:00
corey-derochie-amd
9eefc68cb5
Fixes the copyright comment block on each of topo_expl/models/*.xml. The format was not valid XML. ( #1124 )
2024-03-25 16:21:17 -06:00
Wenkai Du
5976f757dd
Remove hipEventDisableSystemFence ( #1122 )
...
There is no indication that disabling system fence has any latency improvement.
Removing it per recommendation from HIP.
2024-03-25 08:01:57 -07:00
Pedram Alizadeh
c2fc1d6809
msccl algorithms tuning for alltoall on MI300 ( #1120 )
...
Co-authored-by: PedramAlizadeh <amd@pmohamma.com>
2024-03-21 20:35:29 -04:00
corey-derochie-amd
606d3e6b6e
Added @corey-derochie-amd as a code owner (to rocm-documentation) ( #1119 )
2024-03-21 14:56:05 -06:00
dependabot[bot]
cb80586fb9
Bump rocm-docs-core from 0.36.0 to 0.37.0 in /docs/sphinx ( #1117 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-20 09:25:14 -06:00
Nilesh M Negi
53fad75001
BUILD: Enable RCCL static build ( #1114 )
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>
2024-03-15 12:18:18 -05:00
srawat
45ee5734dd
refactor RCCL ( #1112 )
...
* refactor RCCL
* rccl updates
* Update index.rst
* refactor
* Update what-is-rccl.rst
2024-03-15 14:14:47 +05:30
Pedram Alizadeh
50f22e8317
msccl algorithms tuning for allgather on MI300 ( #1110 )
2024-03-14 12:18:26 -04:00
dependabot[bot]
0867562b18
Bump rocm-docs-core from 0.35.1 to 0.36.0 in /docs/sphinx ( #1109 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-12 09:38:20 -06:00
Andy li
6777e65c1d
Enable fp8 support ( #1101 )
...
* initial checkin
* resolve cr comments
* resolve the build issue
* fix the data correctness issue
* update fp8 header file and update the unit test for fp8 support
* remove fp16 from fp8 headers
* fix ut issue and catch up the latest code from develop
* update according to cr comments
* update ut according to cr comments
* update num floats for each SumPostDiv from 4 to 6
* update fp8 header file name
* fix the typo
2024-03-08 15:17:53 -08:00
Wenkai Du
ff951e607d
Improve debug messages of memory allocations ( #1107 )
2024-03-08 10:55:10 -08:00
Wenkai Du
d2224fd3e1
topo_expl: 2.19.4 update and fix build error ( #1098 )
2024-03-07 08:52:50 -08:00
Wenkai Du
77615cce28
msccl: fix scratch memory allocation after API change ( #1103 )
2024-03-06 11:11:04 -08:00
dependabot[bot]
1f7b6e18d7
Bump rocm-docs-core from 0.35.0 to 0.35.1 in /docs/sphinx ( #1100 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-06 11:15:33 -07:00
yhuiYH
12441e8f6c
Merge pull request #1099 from ROCm/LisaDelaney-patch-1
...
link fix
2024-03-05 13:54:04 -05:00
Lisa
a032cb9eeb
link fix
2024-03-05 09:01:10 -07:00
Bertan Dogancay
a279e7f32d
Fix bug when configuring for only LL128 ( #1097 )
2024-03-01 18:09:39 -07:00
Wenkai Du
cbd955627e
Add support for using contiguous memory for GPU direct RDMA ( #1096 )
...
Enabled by env var RCCL_NET_CONTIGUOUS_MEM=1
2024-02-29 10:06:43 -08:00
Wenkai Du
df98a6957d
Add another Rome model ( #1095 )
2024-02-28 10:46:05 -08:00
Bertan Dogancay
b617aecc31
Implement ROCTX ( #1094 )
...
* Implement roctx
2024-02-27 15:46:15 -07:00
dependabot[bot]
dae6df6d16
Bump rocm-docs-core from 0.34.2 to 0.35.0 in /docs/sphinx ( #1092 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-26 16:57:14 -07:00
dependabot[bot]
beb1e487ad
Bump cryptography from 42.0.2 to 42.0.4 in /docs/sphinx ( #1090 )
...
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.2 to 42.0.4.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.2...42.0.4)
---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-26 16:47:14 -07:00
Tim
0d06b0f1de
Adding FP16 cases to unit tests ( #1093 )
...
Signed-off-by: Tim Hu <timhu102@amd.com>
2024-02-26 12:08:04 -05:00
Wenkai Du
74f9e5db64
Add new GPU model ( #1080 )
2024-02-23 12:19:42 -08:00
Wenkai Du
c5ab37211b
Update RCCL/MSCCL work FIFO depth to 256K ( #1091 )
2024-02-21 17:15:11 -08:00
Bertan Dogancay
b275ed0b56
LL128 check if all XGMI ( #1089 )
2024-02-21 09:41:40 -07:00
Pedram Alizadeh
5a0f9990a9
msccl algorithms tuning for allreduce on MI300 ( #1088 )
2024-02-21 11:31:56 -05:00
dependabot[bot]
b7e3f1da14
Bump cryptography from 42.0.0 to 42.0.2 in /docs/sphinx ( #1087 )
...
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0 to 42.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.0...42.0.2)
---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-20 15:03:10 -07:00
dependabot[bot]
7e47a77339
Bump rocm-docs-core from 0.34.0 to 0.34.2 in /docs/sphinx ( #1086 )
...
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-16 11:21:27 -07:00
Bertan Dogancay
2fb12a9358
Merge pull request #1079 from BertanDogancay/2.19.4-sync
...
2.19.4 Sync
2024-02-16 09:50:11 -07:00
BertanDogancay
b098120c40
Increase max stack size when ll128 enabled
2024-02-15 15:56:59 -08:00
akolliasAMD
bac57421c7
Allow bus id to be null ( #1085 )
...
* Allow bus id to be null
2024-02-15 16:36:51 -07:00
BertanDogancay
6f3310605c
Disable unsupported ld/st instructions
2024-02-15 13:58:16 -08:00
BertanDogancay
76f83f95ab
Merge remote-tracking branch 'rccl/develop' into 2.19.4
2024-02-15 13:37:14 -08:00
akolliasAMD
16d7f372b7
Npkit updates ( #1084 )
...
* removed default warmup runs; they are now opt-in
2024-02-15 07:48:45 -07:00
Wenkai Du
51003c9980
Use native half without conversion ( #1083 )
2024-02-13 16:57:34 -08:00
Wenkai Du
1f0af90206
Fix undefined symbol when nvtx is not enabled ( #1082 )
2024-02-13 14:03:43 -08:00
Bertan Dogancay
dc2d486ba0
Add stack size UT ( #1081 )
...
* Add stack size UT
2024-02-12 17:56:15 -07:00