Documentation update for VCN, JPEG, rocDecode and rocJPEG feature (#109)
* Documentation update for VCN, JPEG, rocDecode and rocJPEG feature

  Update documents to include the new tracks for tracing VCN and JPEG activity.
  Update the description of rocDecode and rocJPEG tracing, which is enabled through ROCprofiler-SDK.
  Update headings for the Perfetto output images.

* Add a few more lines about domain values.

* Add missing words to the dictionary
committed by GitHub
parent eb0a969a9c
commit 2222ce9b83
@@ -38,6 +38,9 @@ pid
polymorphism
ppc
proc
rocDecode
rocJPEG
roctx
rpath
rvalues
sdk

@@ -2,6 +2,12 @@

Full documentation for ROCm Systems Profiler is available at [https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/).

## ROCm Systems Profiler 1.1.0 for ROCm 6.5

### Added

- Added profiling and metric collection capabilities for VCN engine activity, JPEG engine activity and API tracing for rocDecode, rocJPEG and VA-APIs.

## ROCm Systems Profiler 0.1.1 for ROCm 6.3.2

### Resolved issues

@@ -62,11 +62,15 @@ The documentation source files reside in the [`/docs`](/docs) folder of this rep
- HIP kernel tracing
- HSA API tracing
- HSA operation tracing
- rocDecode API tracing
- rocJPEG API tracing
- System-level sampling (via rocm-smi)
  - Memory usage
  - Power usage
  - Temperature
  - Utilization
  - VCN Utilization
  - JPEG Utilization

### CPU Metrics

@@ -52,6 +52,8 @@ GPU metrics
* HIP kernel tracing
* HSA API tracing
* HSA operation tracing
* rocDecode API tracing
* rocJPEG API tracing
* System-level sampling (via rocm-smi)

  * Memory usage
@@ -59,6 +61,7 @@ GPU metrics
  * Temperature
  * Utilization
  * VCN activity
  * JPEG activity

CPU metrics
========================================

@@ -217,6 +217,44 @@ The following example:

ROCPROFSYS_ROCM_EVENTS = GPUBusy SQ_WAVES:device=0 SQ_INSTS_VALU:device=1

Exploring GPU Metrics
---------------------

ROCm Systems Profiler supports GPU metrics collection, sampling, and API tracing through `ROCprofiler-SDK <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/index.html>`_ and `ROCm-SMI <https://rocm.docs.amd.com/projects/rocm_smi_lib/en/latest/>`_.
ROCprofiler-SDK supports application tracing, which gives a high-level picture of GPU application execution, and kernel profiling, which provides low-level hardware details from the performance counters.
The ROCm-SMI library offers a unified tool for managing, monitoring, and retrieving information about the system's drivers and GPUs.

Sampling of GPU metrics such as utilization, temperature, power consumption, and memory usage is configured with ``ROCPROFSYS_ROCM_SMI_METRICS``.
The ``ROCPROFSYS_USE_ROCM_SMI`` setting should be enabled for GPU metric collection.

For example, the following is a valid configuration:

.. code-block:: shell

   ROCPROFSYS_ROCM_SMI_METRICS=busy,temp,power,vcn_activity,mem_usage

Supported values for ``ROCPROFSYS_ROCM_SMI_METRICS`` are: ``busy``, ``temp``, ``power``, ``vcn_activity``, ``mem_usage``, ``jpeg_activity``.
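
As a minimal sketch of how the settings above might be combined in a launch script, the following checks a metric list against the supported values before exporting it. The helper ``validate_smi_metrics`` is hypothetical and not part of ROCm Systems Profiler, and ``ON`` is assumed to be an accepted boolean value for ``ROCPROFSYS_USE_ROCM_SMI``.

```shell
#!/bin/sh
# Hypothetical helper (not part of rocprof-sys): validates a comma-separated
# metric list against the supported values listed in the text above.
SUPPORTED_SMI_METRICS="busy temp power vcn_activity mem_usage jpeg_activity"

validate_smi_metrics() {
    # $1: comma-separated metric list, for example "busy,temp"
    old_ifs=$IFS
    IFS=,
    set -f                      # disable globbing while splitting on commas
    for metric in $1; do
        case " $SUPPORTED_SMI_METRICS " in
            *" $metric "*) ;;   # metric is in the supported list
            *)
                IFS=$old_ifs
                set +f
                echo "unsupported metric: $metric"
                return 1
                ;;
        esac
    done
    IFS=$old_ifs
    set +f
    echo "ok"
}

metrics="busy,temp,power,vcn_activity,jpeg_activity"
if validate_smi_metrics "$metrics" >/dev/null; then
    # Assumption: ON is accepted as a boolean value for this setting.
    export ROCPROFSYS_USE_ROCM_SMI=ON
    export ROCPROFSYS_ROCM_SMI_METRICS="$metrics"
fi
```

Validating before exporting catches typos early, such as ``vcn_busy`` in place of ``vcn_activity``. How the profiler itself treats an unrecognized metric name is not stated in the text above.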

API tracing is configured with the ``ROCPROFSYS_ROCM_DOMAINS`` setting. Domains filter the events that are captured during profiling.
Supported values for this setting are the domains supported by ROCprofiler-SDK, as returned by the ``get_callback_tracing_names()`` and ``get_buffer_tracing_names()`` APIs. See the `ROCprofiler-SDK developer API documentation <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/_doxygen/html/namespacerocprofiler_1_1sdk.html>`_ to learn more about ROCprofiler-SDK APIs.
Use the following command to view the available domains:

.. code-block:: shell

   rocprof-sys-avail -bd -r ROCM_DOMAINS

.. note::

   Some settings enable tracing for multiple domains. For example, ``hip_api`` enables both ``hip_runtime_api`` and ``hip_compiler_api``,
   and ``hsa_api`` enables all HSA domains: ``hsa_core_api``, ``hsa_amd_ext_api``, ``hsa_image_ext_api``, and ``hsa_finalize_ext_api``.
   Either ``marker_api`` or ``roctx`` can be used to enable roctx marker API tracing.

For example, the following is a valid configuration:

.. code-block:: shell

   ROCPROFSYS_ROCM_DOMAINS=hip_runtime_api,kernel_dispatch,memory_copy,rocdecode_api,rocjpeg_api
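
The grouping described in the note can be illustrated with a short sketch that expands the umbrella settings into the individual domains they enable. The function ``expand_domains`` is hypothetical and only mirrors the note. The profiler performs its own expansion, so this is useful mainly for seeing what a given value implies. The HSA image extension domain is assumed to be spelled ``hsa_image_ext_api``.

```shell
#!/bin/sh
# Hypothetical helper (not part of rocprof-sys): expands umbrella domain
# settings into the individual domains named in the note above.
expand_domains() {
    # $1: comma-separated domain list, for example "hip_api,kernel_dispatch"
    result=""
    old_ifs=$IFS
    IFS=,
    set -f                      # disable globbing while splitting on commas
    for domain in $1; do
        case "$domain" in
            hip_api)
                expanded="hip_runtime_api,hip_compiler_api" ;;
            hsa_api)
                # Assumption: the image extension domain is hsa_image_ext_api.
                expanded="hsa_core_api,hsa_amd_ext_api,hsa_image_ext_api,hsa_finalize_ext_api" ;;
            *)
                expanded="$domain" ;;   # pass other domains through unchanged
        esac
        result="${result:+$result,}$expanded"
    done
    IFS=$old_ifs
    set +f
    echo "$result"
}

# Print what an umbrella-based setting would cover.
expand_domains "hip_api,kernel_dispatch,rocdecode_api"
```

Running ``rocprof-sys-avail -bd -r ROCM_DOMAINS`` remains the authoritative way to list the domains a given installation accepts.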

rocprof-sys-avail examples
-----------------------------------

@@ -474,7 +512,8 @@ Viewing components
| sampling_gpu_power                | GPU Power Usage via ROCm-SMI. Derived fro... |
| sampling_gpu_temp                 | GPU Temperature via ROCm-SMI. Derived fro... |
| sampling_gpu_busy                 | GPU Utilization (% busy) via ROCm-SMI. De... |
| sampling_vcn_busy                 | GPU VCN Utilization (% activity) via ROCm... |
| sampling_gpu_vcn                  | GPU VCN Utilization (% activity) via ROCm... |
| sampling_gpu_jpeg                 | GPU JPEG Utilization (% activity) via ROCm.. |
| sampling_gpu_memory_usage         | GPU Memory Usage via ROCm-SMI. Derived fr... |
|-----------------------------------|----------------------------------------------|

@@ -366,22 +366,32 @@ absolute path, then all ``ROCPROFSYS_OUTPUT_PATH`` and similar
settings are ignored. Visit `ui.perfetto.dev <https://ui.perfetto.dev>`_ and open
this file.

**Figure 1:** Visualization of a performance graph in Perfetto

.. image:: ../data/rocprof-sys-perfetto.png
   :alt: Visualization of a performance graph in Perfetto
   :width: 800

**Figure 2:** Visualization of ROCm data in Perfetto

.. image:: ../data/rocprof-sys-rocm.png
   :alt: Visualization of ROCm data in Perfetto
   :width: 800

**Figure 3:** Visualization of ROCm flow data in Perfetto

.. image:: ../data/rocprof-sys-rocm-flow.png
   :alt: Visualization of ROCm flow data in Perfetto
   :width: 800

**Figure 4:** Visualization of ROCm API calls in Perfetto

.. image:: ../data/rocprof-sys-user-api.png
   :alt: Visualization of ROCm API calls in Perfetto
   :width: 800

**Figure 5:** Visualization of ROCm GPU metrics in Perfetto

.. image:: ../data/rocprof-sys-gpu-metrics.png
   :alt: Visualization of ROCm GPU metrics in Perfetto
   :width: 800