Wenkai Du
|
df98a6957d
|
Add another Rome model (#1095)
|
2024-02-28 10:46:05 -08:00 |
|
Bertan Dogancay
|
b617aecc31
|
Implement ROCTX (#1094)
* Implement roctx
|
2024-02-27 15:46:15 -07:00 |
|
dependabot[bot]
|
dae6df6d16
|
Bump rocm-docs-core from 0.34.2 to 0.35.0 in /docs/sphinx (#1092)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-02-26 16:57:14 -07:00 |
|
dependabot[bot]
|
beb1e487ad
|
Bump cryptography from 42.0.2 to 42.0.4 in /docs/sphinx (#1090)
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.2 to 42.0.4.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.2...42.0.4)
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-02-26 16:47:14 -07:00 |
|
Tim
|
0d06b0f1de
|
Adding FP16 cases to unit tests (#1093)
Signed-off-by: Tim Hu <timhu102@amd.com>
|
2024-02-26 12:08:04 -05:00 |
|
Wenkai Du
|
74f9e5db64
|
Add new GPU model (#1080)
|
2024-02-23 12:19:42 -08:00 |
|
Wenkai Du
|
c5ab37211b
|
Update RCCL/MSCCL work FIFO depth to 256K (#1091)
|
2024-02-21 17:15:11 -08:00 |
|
Bertan Dogancay
|
b275ed0b56
|
LL128 check if all XGMI (#1089)
|
2024-02-21 09:41:40 -07:00 |
|
Pedram Alizadeh
|
5a0f9990a9
|
msccl algorithms tuning for allreduce on MI300 (#1088)
|
2024-02-21 11:31:56 -05:00 |
|
dependabot[bot]
|
b7e3f1da14
|
Bump cryptography from 42.0.0 to 42.0.2 in /docs/sphinx (#1087)
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0 to 42.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.0...42.0.2)
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-02-20 15:03:10 -07:00 |
|
dependabot[bot]
|
7e47a77339
|
Bump rocm-docs-core from 0.34.0 to 0.34.2 in /docs/sphinx (#1086)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-02-16 11:21:27 -07:00 |
|
Bertan Dogancay
|
2fb12a9358
|
Merge pull request #1079 from BertanDogancay/2.19.4-sync
2.19.4 Sync
|
2024-02-16 09:50:11 -07:00 |
|
BertanDogancay
|
b098120c40
|
Increase max stack size when ll128 enabled
|
2024-02-15 15:56:59 -08:00 |
|
akolliasAMD
|
bac57421c7
|
Allow bus id to be null (#1085)
* Allow bus id to be null
|
2024-02-15 16:36:51 -07:00 |
|
BertanDogancay
|
6f3310605c
|
Disable unsupported ld/st instructions
|
2024-02-15 13:58:16 -08:00 |
|
BertanDogancay
|
76f83f95ab
|
Merge remote-tracking branch 'rccl/develop' into 2.19.4
|
2024-02-15 13:37:14 -08:00 |
|
akolliasAMD
|
16d7f372b7
|
Npkit updates (#1084)
* Made warmup runs opt-in
|
2024-02-15 07:48:45 -07:00 |
|
Wenkai Du
|
51003c9980
|
Use native half without conversion (#1083)
|
2024-02-13 16:57:34 -08:00 |
|
Wenkai Du
|
1f0af90206
|
Fix undefined symbol when nvtx is not enabled (#1082)
|
2024-02-13 14:03:43 -08:00 |
|
Bertan Dogancay
|
dc2d486ba0
|
Add stack size UT (#1081)
* Add stack size UT
|
2024-02-12 17:56:15 -07:00 |
|
BertanDogancay
|
32cca51894
|
Fix docs
|
2024-02-11 22:32:55 -08:00 |
|
Wenkai Du
|
d999d9ad21
|
Merge remote-tracking branch 'rccl/develop' into 2.19.4
|
2024-02-09 11:31:03 -06:00 |
|
Wenkai Du
|
5669b0d7b6
|
2.18.5 fix (#1077)
* Revert "Revert "2.18.5-1""
This reverts commit 767fde8210.
* Fix initial net device value
|
2024-02-09 09:18:38 -08:00 |
|
dependabot[bot]
|
3e505a991c
|
Bump rocm-docs-core from 0.33.2 to 0.34.0 in /docs/sphinx (#1078)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-02-09 10:12:07 -07:00 |
|
Bertan Dogancay
|
8a442faa12
|
Nvtx support (#1076)
* NVTX support
|
2024-02-08 14:08:24 -07:00 |
|
Wenkai Du
|
5257c753c5
|
msccl: use relaxed atomics on scratch buffer (#1075)
|
2024-02-08 12:09:56 -08:00 |
|
dependabot[bot]
|
be45f0effd
|
Bump rocm-docs-core from 0.33.1 to 0.33.2 in /docs/sphinx (#1073)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-02-08 09:26:47 -07:00 |
|
Wenkai Du
|
704c9ef0d1
|
Doubling P2P channels per peer on single node gfx94x only (#1074)
|
2024-02-07 14:05:57 -08:00 |
|
dependabot[bot]
|
a9214032fc
|
Bump rocm-docs-core from 0.33.0 to 0.33.1 in /docs/sphinx (#1071)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-02-06 16:00:30 -07:00 |
|
dependabot[bot]
|
ca007ddad3
|
Bump cryptography from 41.0.6 to 42.0.0 in /docs/sphinx (#1070)
Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.6 to 42.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/41.0.6...42.0.0)
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-02-06 15:59:52 -07:00 |
|
Wenkai Du
|
1d989f6524
|
Doubling P2P channels per peer on single node only (#1069)
|
2024-02-02 12:41:00 -08:00 |
|
Wenkai Du
|
e64324a64a
|
Merge remote-tracking branch 'rccl/develop' into HEAD
|
2024-02-01 12:17:09 -06:00 |
|
Nilesh M Negi
|
2458f158b1
|
Enable kernarg preloading for ROCm 6.1 (#1068)
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com>
|
2024-02-01 12:14:04 -06:00 |
|
BertanDogancay
|
12ac20ade5
|
Revert re-usage of connect and listen ports
|
2024-02-01 10:03:13 -08:00 |
|
BertanDogancay
|
00fdb1ef51
|
Clean up
|
2024-01-31 17:27:15 -08:00 |
|
BertanDogancay
|
da85abab54
|
Fix stack size
|
2024-01-31 17:09:07 -08:00 |
|
Wenkai Du
|
95f87232c4
|
Fix transport merge
|
2024-01-31 17:35:12 -06:00 |
|
Wenkai Du
|
d1575a1622
|
topo_expl: 2.19 update
|
2024-01-31 16:11:14 -06:00 |
|
Wenkai Du
|
1a134b283b
|
Merge remote-tracking branch 'rccl/develop' into 2.19.4
|
2024-01-31 11:53:10 -06:00 |
|
BertanDogancay
|
9ff53eeeae
|
Merge remote-tracking branch 'nccl/master' into develop
|
2024-01-30 14:43:43 -08:00 |
|
Bertan Dogancay
|
01b359027b
|
Include common.h in enqueue.cc instead (#1067)
|
2024-01-30 08:24:22 -08:00 |
|
Wenkai Du
|
f7550d83b8
|
msccl: ensure memory coherence after data receive (#1062)
|
2024-01-30 08:22:50 -08:00 |
|
dependabot[bot]
|
8949a28502
|
Bump rocm-docs-core from 0.31.0 to 0.33.0 in /docs/sphinx (#1065)
Bumps [rocm-docs-core](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0)
---
updated-dependencies:
- dependency-name: rocm-docs-core
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-01-29 17:10:16 -07:00 |
|
Bertan Dogancay
|
d75c364864
|
Do not use LL128 when disabled (#1066)
|
2024-01-29 14:08:59 -07:00 |
|
BertanDogancay
|
31ec5d5cb0
|
correct data type
|
2024-01-28 19:55:19 -08:00 |
|
Shilei Tian
|
ba9f7917ba
|
Add a constructor for PtrUnion in case it is not initialized explicitly (#1064)
|
2024-01-26 08:00:27 -08:00 |
|
Pedram Alizadeh
|
ccfb35fa6d
|
modifying the tuning table to improve the performance of allreduce for 8MB and 16MB for single-node MI300X (#1063)
|
2024-01-26 09:05:53 -05:00 |
|
Wenkai Du
|
be8ef4367f
|
colltrace: fix dropped trace messages (#1059)
* colltrace: fix dropped trace messages
* Remove extra space
|
2024-01-25 13:31:53 -08:00 |
|
Wenkai Du
|
ffde530af5
|
Increase P2P channels per peer (#1060)
|
2024-01-25 11:21:58 -08:00 |
|
Sam Wu
|
7d6da4c66b
|
Add codeowners for documentation (#1061)
* Add codeowners for documentation
* Update CODEOWNERS
---------
Co-authored-by: samjwu <samjwu@users.noreply.github.com>
|
2024-01-25 09:33:28 -07:00 |
|