Documentation update for VCN, JPEG, rocDecode and rocJPEG feature (#109)

* Documentation update for VCN, JPEG, rocDecode and rocJPEG feature

Update documents to include the new tracks for tracing VCN and JPEG
activity.
Update the rocDecode and rocJPEG tracing enabled using ROCprofiler-SDK.
Update headings to the perfetto output images.

* Add a few more lines about domain values.

* Add missing words to the dictionary
Sajina PK
2025-03-06 18:03:33 -05:00
committed by GitHub
parent eb0a969a9c
commit 2222ce9b83
6 files changed, with 66 additions and 1 deletion
+3
@@ -38,6 +38,9 @@ pid
polymorphism
ppc
proc
rocDecode
rocJPEG
roctx
rpath
rvalues
sdk
+6
@@ -2,6 +2,12 @@
Full documentation for ROCm Systems Profiler is available at [https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/).
## ROCm Systems Profiler 1.1.0 for ROCm 6.5
### Added
- Added profiling and metric collection capabilities for VCN engine activity, JPEG engine activity and API tracing for rocDecode, rocJPEG and VA-APIs.
## ROCm Systems Profiler 0.1.1 for ROCm 6.3.2
### Resolved issues
+4
@@ -62,11 +62,15 @@ The documentation source files reside in the [`/docs`](/docs) folder of this rep
- HIP kernel tracing
- HSA API tracing
- HSA operation tracing
- rocDecode API tracing
- rocJPEG API tracing
- System-level sampling (via rocm-smi)
- Memory usage
- Power usage
- Temperature
- Utilization
- VCN Utilization
- JPEG Utilization
### CPU Metrics
+3
@@ -52,6 +52,8 @@ GPU metrics
* HIP kernel tracing
* HSA API tracing
* HSA operation tracing
* rocDecode API tracing
* rocJPEG API tracing
* System-level sampling (via rocm-smi)
* Memory usage
@@ -59,6 +61,7 @@ GPU metrics
* Temperature
* Utilization
* VCN activity
* JPEG activity
CPU metrics
========================================
+40 -1
@@ -217,6 +217,44 @@ The following example:
ROCPROFSYS_ROCM_EVENTS = GPUBusy SQ_WAVES:device=0 SQ_INSTS_VALU:device=1
Exploring GPU Metrics
---------------------
ROCm Systems Profiler supports GPU metrics collection, sampling, and API tracing via `ROCprofiler-SDK <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/index.html>`_ and `ROCm-SMI <https://rocm.docs.amd.com/projects/rocm_smi_lib/en/latest/>`_.
ROCprofiler-SDK supports application tracing, which gives a high-level view of GPU application execution, and kernel profiling, which exposes low-level hardware details from the performance counters.
The ROCm-SMI library offers a unified tool for managing, monitoring, and retrieving information about the system's drivers and GPUs.
Sampling of GPU metrics such as utilization, temperature, power consumption, and memory usage can be configured with ``ROCPROFSYS_ROCM_SMI_METRICS``.
The ``ROCPROFSYS_USE_ROCM_SMI`` setting should be enabled for GPU metric collection.
For example, the following is a valid configuration:
.. code-block:: shell
ROCPROFSYS_ROCM_SMI_METRICS=busy,temp,power,vcn_activity,mem_usage
Supported values for ``ROCPROFSYS_ROCM_SMI_METRICS`` are: ``busy``, ``temp``, ``power``, ``vcn_activity``, ``mem_usage``, ``jpeg_activity``.
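Because a misspelled metric name is easy to miss, a small pre-flight check can help. The following is a minimal sketch, not part of ROCm Systems Profiler: ``validate_smi_metrics`` is a hypothetical helper that compares a comma-separated list against the supported values listed above, and the value ``true`` for ``ROCPROFSYS_USE_ROCM_SMI`` is an assumption about how boolean settings are spelled.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with rocprof-sys): returns 0 when every
# entry of a comma-separated list is one of the documented metric names.
validate_smi_metrics() {
    local supported=(busy temp power vcn_activity mem_usage jpeg_activity)
    local requested metric name known status=0
    # Split the argument on commas into an array.
    IFS=',' read -r -a requested <<< "$1"
    for metric in "${requested[@]}"; do
        known=0
        for name in "${supported[@]}"; do
            if [[ "$metric" == "$name" ]]; then known=1; break; fi
        done
        if (( known == 0 )); then
            echo "unsupported metric: $metric" >&2
            status=1
        fi
    done
    return "$status"
}

metrics="busy,temp,power,vcn_activity,jpeg_activity"
if validate_smi_metrics "$metrics"; then
    # Assumption: "true" is an accepted spelling for enabling the setting.
    export ROCPROFSYS_USE_ROCM_SMI=true
    export ROCPROFSYS_ROCM_SMI_METRICS="$metrics"
fi
```

The check is purely local string matching; it does not query the installed ROCm-SMI library, so it only guards against typos relative to the list in this document.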
API tracing is configured with the ``ROCPROFSYS_ROCM_DOMAINS`` setting. The domains are used to filter the events that are captured during profiling.
Supported values for this setting are those supported by ROCprofiler-SDK, as returned by the ``get_callback_tracing_names()`` and ``get_buffer_tracing_names()`` APIs. See the `ROCprofiler-SDK developer API documentation <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/_doxygen/html/namespacerocprofiler_1_1sdk.html>`_ to learn more about ROCprofiler-SDK APIs.
Use the following command to view the available domains:
.. code-block:: shell
rocprof-sys-avail -bd -r ROCM_DOMAINS
.. note::
Some settings enable tracing for multiple domains. For example, ``hip_api`` enables both ``hip_runtime_api`` and ``hip_compiler_api``,
and ``hsa_api`` enables all HSA domains: ``hsa_core_api``, ``hsa_amd_ext_api``, ``hsa_image_ext_api``, and ``hsa_finalize_ext_api``.
The setting ``marker_api`` or ``roctx`` can be used to enable the roctx marker API tracing.
For example, the following is a valid configuration:
.. code-block:: shell
ROCPROFSYS_ROCM_DOMAINS=hip_runtime_api,kernel_dispatch,memory_copy,rocdecode_api,rocjpeg_api
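When the domain list grows, it can be easier to maintain as a shell array and join it at the end. The sketch below is only an illustration: the array name ``domains`` is arbitrary, and the domain names are the ones from the configuration above.

```shell
#!/usr/bin/env bash
# Keep one domain per array entry so individual domains are easy to
# comment out or add while experimenting.
domains=(
    hip_runtime_api
    kernel_dispatch
    memory_copy
    rocdecode_api   # rocDecode API tracing
    rocjpeg_api     # rocJPEG API tracing
)

# Join the array with commas inside a subshell so IFS is not changed
# for the rest of the script.
ROCPROFSYS_ROCM_DOMAINS=$(IFS=','; echo "${domains[*]}")
export ROCPROFSYS_ROCM_DOMAINS

echo "ROCPROFSYS_ROCM_DOMAINS=${ROCPROFSYS_ROCM_DOMAINS}"
```

The exported variable is then picked up by whichever profiler command is launched from the same shell; the names can be checked against the output of ``rocprof-sys-avail -bd -r ROCM_DOMAINS``.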
rocprof-sys-avail examples
-----------------------------------
@@ -474,7 +512,8 @@ Viewing components
| sampling_gpu_power | GPU Power Usage via ROCm-SMI. Derived fro... |
| sampling_gpu_temp | GPU Temperature via ROCm-SMI. Derived fro... |
| sampling_gpu_busy | GPU Utilization (% busy) via ROCm-SMI. De... |
| sampling_vcn_busy | GPU VCN Utilization (% activity) via ROCm... |
| sampling_gpu_vcn | GPU VCN Utilization (% activity) via ROCm... |
| sampling_gpu_jpeg | GPU JPEG Utilization (% activity) via ROCm.. |
| sampling_gpu_memory_usage | GPU Memory Usage via ROCm-SMI. Derived fr... |
|-----------------------------------|----------------------------------------------|
+10
Ver ficheiro
@@ -366,22 +366,32 @@ absolute path, then all ``ROCPROFSYS_OUTPUT_PATH`` and similar
settings are ignored. Visit `ui.perfetto.dev <https://ui.perfetto.dev>`_ and open
this file.
**Figure 1:** Visualization of a performance graph in Perfetto
.. image:: ../data/rocprof-sys-perfetto.png
:alt: Visualization of a performance graph in Perfetto
:width: 800
**Figure 2:** Visualization of ROCm data in Perfetto
.. image:: ../data/rocprof-sys-rocm.png
:alt: Visualization of ROCm data in Perfetto
:width: 800
**Figure 3:** Visualization of ROCm flow data in Perfetto
.. image:: ../data/rocprof-sys-rocm-flow.png
:alt: Visualization of ROCm flow data in Perfetto
:width: 800
**Figure 4:** Visualization of ROCm API calls in Perfetto
.. image:: ../data/rocprof-sys-user-api.png
:alt: Visualization of ROCm API calls in Perfetto
:width: 800
**Figure 5:** Visualization of ROCm GPU metrics in Perfetto
.. image:: ../data/rocprof-sys-gpu-metrics.png
:alt: Visualization of ROCm GPU metrics in Perfetto
:width: 800