13 Commits

Author SHA1 Message Date
Corey Derochie f221a1ae08 Updated troubleshooting-rccl.rst to change rocm-smi to amd-smi (#2028)
* Updated troubleshooting-rccl.rst to change rocm-smi to amd-smi

* Added `amd-smi static --driver`

* Update docs/how-to/troubleshooting-rccl.rst

Co-authored-by: Nilesh M Negi <Nilesh.Negi@amd.com>

---------

Co-authored-by: Nilesh M Negi <Nilesh.Negi@amd.com>

[ROCm/rccl commit: f942810959]
2025-12-23 21:22:11 +05:30
Artem Kuzmitckii 0c7e116b31 Reverse logic of context tracking enablement from #1927 (#1971)
In this commit, context tracking is disabled by default and can be enabled via
`RCCL_ENABLE_CONTEXT_TRACKING=1` for both architectures (CDNA, RDNA).
Original PR https://github.com/ROCm/rccl/pull/1927

[ROCm/rccl commit: 00a42c80f3]
2025-10-09 10:24:09 +02:00
David DeBonis 32b3a82956 Adding usage tip for ignore cpu affinity (#1948)
* Adding usage tip for ignore cpu affinity

* Update docs/how-to/rccl-usage-tips.rst

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Update docs/how-to/rccl-usage-tips.rst

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

[ROCm/rccl commit: d23d18f423]
2025-09-29 10:11:21 -06:00
Artem Kuzmitckii 722b0cd579 Revert disabling of context tracking for Radeon (#1927)
* Revert disabling of context tracking for Radeon

Original commit df3b7e47
 `Disable context tracking for the current version. (#1839)`

* Add env variable for disabling of context tracking for Radeon

`export NCCL_DISABLE_CONTEXT_TRACKING=1` to force-disable context tracking

* Update docs/how-to/rccl-usage-tips.rst

Fix grammar, thanks @amd-jnovotny

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

* Rename NCCL_DISABLE_CONTEXT_TRACKING -> RCCL_DISABLE_CONTEXT_TRACKING

* Revert changes in includes and rename util function

---------

Co-authored-by: Jeffrey Novotny <jnovotny@amd.com>

[ROCm/rccl commit: 07925ec027]
2025-09-27 15:19:50 -04:00
Arm Patinyasakdikul 32e80aedc0 Update plugin to look for librccl-net.so. (#1768)
[ROCm/rccl commit: 71c788d4d7]
2025-06-26 16:59:38 -05:00
Jeffrey Novotny fb1fdef8e2 Fix broken link to RCCL Replayer GitHub info (#1655)
[ROCm/rccl commit: df778b4ea1]
2025-04-23 14:17:31 -04:00
Istvan Kiss 858fa4e65d Add documentation for NPS4 and CPX partition modes (#1555)
[ROCm/rccl commit: 28ab8603d2]
2025-03-31 09:25:25 -06:00
BertanDogancay 1b000665df Merge remote-tracking branch 'nccl/master' into develop
[ROCm/rccl commit: 36343be84f]
2025-01-23 12:08:46 -06:00
Jeffrey Novotny 7c220660ba Change kernel reference to use new terminology (#1462)
[ROCm/rccl commit: 2934bf6fc6]
2024-12-16 13:34:18 -05:00
Jeffrey Novotny d7498b88a5 Refactor how to docs and formatting fixes (#1444)
[ROCm/rccl commit: 9aa5b9f02e]
2024-12-10 08:47:24 -05:00
Jeffrey Novotny 531476dacf Add RCCL debugging guide (#1420)
* Add RCCL debugging guide

* Changes from external review

* More edits from internal review

* Additional edits

* Minor correction

* More changes after external review

* Integrate index and ToC changes with incoming merge changes

* Integrate feedback from management review

* Minor edits from the internal review

[ROCm/rccl commit: 6d34fb7632]
2024-12-06 13:25:58 -05:00
Jeffrey Novotny 1d1e17b3c9 Refactor RCCL install guide into several pages (#1427)
* Refactor RCCL install guide into several pages

* Changes from code review and new docker guide

* Add missing entries to ToC

* Minor fixes

* Fix help strings

* Edits after review and remove extra white space

[ROCm/rccl commit: bf7c130631]
2024-11-27 15:34:26 -05:00
randyh62 0f98c58804 what-is-rccl (#1312)
* what-is-rccl

* create Installation instructions from README

* update README link

* Add using-nccl

* Add note about docs

* correct doc path

* sources to source

* correct docs link

[ROCm/rccl commit: 391c7ea070]
2024-09-05 06:54:48 -07:00