Add documentation describing ROCPROFSYS_USE_RCCLP (#110)

* Add documentation describing ROCPROFSYS_USE_RCCLP

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update wordlist

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CHANGELOGS.md

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
This commit is contained in:
systems-assistant[bot]
2025-08-13 18:01:18 -04:00
committed by GitHub
parent 80b7e6baee
commit dd37d215fd
4 changed files with 19 additions and 0 deletions
+2
@@ -30,6 +30,8 @@ ppc
proc
proto
Pthreads
RCCL
RCCLP
rocDecode
rocdecode
ROCprofiler
+1
@@ -18,6 +18,7 @@ Full documentation for ROCm Systems Profiler is available at [https://rocm.docs.
- Replaced ROCm SMI backend with AMD SMI backend for collecting GPU metrics.
- ROCprofiler-SDK is now used to trace RCCL API and collect communication counters.
- Use the setting `ROCPROFSYS_USE_RCCLP = ON` to enable profiling and tracing of RCCL application data.
- Updated the Dyninst submodule to v13.0.
- Set the default value of `ROCPROFSYS_SAMPLING_CPUS` to `none`.
Binary file not shown (size: 34 KiB).

+16
@@ -225,6 +225,22 @@ and memory copy operations submitted. With the
``ROCPROFSYS_ROCM_GROUP_BY_QUEUE=ON`` setting, the trace will display HSA queues
to which these kernel and memory operations were submitted.

ROCPROFSYS_USE_RCCLP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use the setting ``ROCPROFSYS_USE_RCCLP = ON`` to enable profiling and tracing of the
ROCm Communication Collectives Library (RCCL, pronounced "Rickle"). When this setting is enabled,
ROCm Systems Profiler traces the RCCL API calls and collects performance metrics related to collective operations.
The image below shows an example of a Perfetto trace with RCCL communication data and API tracing enabled:

.. image:: ../data/rccl-comm-recv.png
   :alt: Perfetto tracks with RCCL Communication Data and API tracing

.. note::

   There is a known issue that causes the application to exit with an error. However, the trace data can still be found in the output directory.
   This issue is being tracked internally.

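
As a minimal sketch of turning the setting on, the example below assumes it is supplied as an environment variable before the application is launched. The application name ``./my-rccl-app`` is hypothetical, and launching through ``rocprof-sys-sample`` is an assumption; substitute whichever ROCm Systems Profiler launcher and application your workflow already uses.

.. code-block:: shell

   # Enable RCCL profiling and tracing for this shell session
   export ROCPROFSYS_USE_RCCLP=ON

   # Hypothetical launch: ./my-rccl-app stands in for your RCCL application
   rocprof-sys-sample -- ./my-rccl-app

Because of the known issue noted above, the application may exit with an error; check the output directory for the trace data afterwards.
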
Exploring GPU Metrics
---------------------