
.. meta::
   :description: ROCm Systems Profiler VCN and JPEG activity sampling and tracing
   :keywords: rocprof-sys, rocprofiler-systems, ROCm, tips, how to, profiler, tracking, VCN, JPEG, rocDecode, rocjpeg, AMD

********************************************
VCN and JPEG activity sampling and tracing
********************************************

`ROCm Systems Profiler <https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler-systems>`_ supports
sampling of Video Core Next (VCN) and JPEG engine activity. It lets you gather key performance metrics for
VCN utilization and understand engine usage through visualization. You can use this information
to optimize media and video workloads. It also supports tracing of `rocDecode
<https://rocm.docs.amd.com/projects/rocDecode/en/latest/>`_ APIs, `rocJPEG
<https://rocm.docs.amd.com/projects/rocJPEG/en/latest/>`_ APIs, and the
Video Acceleration APIs (VA-APIs). Tracing these APIs provides insight into how the different
components of video encoding and decoding workloads interact with the VCN engine.


Sampling support
=================

Sampling of VCN and JPEG engine activity is supported by leveraging `AMD SMI <https://rocm.docs.amd.com/projects/amdsmi/en/latest/>`_, which provides the interface for GPU metric collection.

1. Set the ``ROCPROFSYS_USE_AMD_SMI`` environment variable to enable GPU metric collection:

   .. code-block:: shell

      export ROCPROFSYS_USE_AMD_SMI=true

2. Update the ``ROCPROFSYS_AMD_SMI_METRICS`` variable to collect the VCN and JPEG activity metrics. The default value is:

   .. code-block:: shell

      ROCPROFSYS_AMD_SMI_METRICS=busy,temp,power,mem_usage

   To include VCN and JPEG activity metrics, update it to:

   .. code-block:: shell

      ROCPROFSYS_AMD_SMI_METRICS=busy,temp,power,mem_usage,vcn_activity,jpeg_activity

   Alternatively, you can use the following to collect all available GPU metrics:

   .. code-block:: shell

      ROCPROFSYS_AMD_SMI_METRICS=all

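
If ``ROCPROFSYS_AMD_SMI_METRICS`` is already set in your shell, you might prefer to extend it rather than overwrite it. The following is a minimal sketch, assuming a POSIX shell. ``append_unique`` is a hypothetical helper written for this example and is not part of ROCm Systems Profiler. It appends ``vcn_activity`` and ``jpeg_activity`` only when they are missing from the list.

.. code-block:: shell

   # append_unique LIST ITEM...: print the comma-separated LIST with each ITEM
   # appended once. Hypothetical helper for illustration; not a rocprof-sys command.
   append_unique() {
       list="$1"
       shift
       for item in "$@"; do
           case ",${list}," in
               *",${item},"*) ;;                        # already listed, skip
               *) list="${list:+${list},}${item}" ;;    # append, comma only if non-empty
           esac
       done
       printf '%s\n' "${list}"
   }

   # Fall back to the documented default when the variable is unset or empty.
   current="${ROCPROFSYS_AMD_SMI_METRICS:-busy,temp,power,mem_usage}"

   export ROCPROFSYS_USE_AMD_SMI=true
   export ROCPROFSYS_AMD_SMI_METRICS="$(append_unique "${current}" vcn_activity jpeg_activity)"
   echo "${ROCPROFSYS_AMD_SMI_METRICS}"

If the variable is already set to ``all``, the extra entries are unnecessary, so this sketch is mainly useful for explicit metric lists.
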

API tracing support
=====================

Tracing of rocDecode and rocJPEG APIs is supported by leveraging `ROCprofiler-SDK <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/index.html>`_,
which provides runtime-independent APIs for tracing the runtime calls and asynchronous activities associated with decoder activity and workloads on the VCN and JPEG engines.

To enable tracing for the rocDecode and rocJPEG APIs, update the ``ROCPROFSYS_ROCM_DOMAINS`` variable. The default value is:

.. code-block:: shell

   ROCPROFSYS_ROCM_DOMAINS=hip_runtime_api,marker_api,kernel_dispatch,memory_copy,scratch_memory,page_migration

Add ``rocdecode_api`` and ``rocjpeg_api`` to include tracing for rocDecode and rocJPEG APIs:

.. code-block:: shell

   ROCPROFSYS_ROCM_DOMAINS=hip_runtime_api,marker_api,kernel_dispatch,memory_copy,scratch_memory,page_migration,rocdecode_api,rocjpeg_api

.. note::

   By default, enabling ``rocdecode_api`` or ``rocjpeg_api`` also enables VA-API tracing.

To explore all supported tracing domains, use the command:

.. code-block:: shell

   rocprof-sys-avail -bd -r ROCM_DOMAINS

For more details on the APIs, refer to `ROCprofiler-SDK Developer Docs <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/_doxygen/rocprofiler-sdk/html/>`_.

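
Before a long profiling run, it can help to confirm that the domain list in your shell actually contains the media domains. The following is a minimal sketch, assuming a POSIX shell. ``missing_domains`` is a hypothetical helper written for this example, not a ROCm Systems Profiler command. It only inspects the comma-separated string and does not query the profiler.

.. code-block:: shell

   # missing_domains LIST DOMAIN...: print each DOMAIN that is absent from the
   # comma-separated LIST, one per line. Hypothetical helper for illustration.
   missing_domains() {
       list="$1"
       shift
       for domain in "$@"; do
           case ",${list}," in
               *",${domain},"*) ;;                   # present, nothing to report
               *) printf '%s\n' "${domain}" ;;       # absent, report it
           esac
       done
   }

   # Use the current value, or the documented default when the variable is unset.
   default_domains="hip_runtime_api,marker_api,kernel_dispatch,memory_copy,scratch_memory,page_migration"
   current="${ROCPROFSYS_ROCM_DOMAINS:-${default_domains}}"

   missing="$(missing_domains "${current}" rocdecode_api rocjpeg_api)"
   if [ -n "${missing}" ]; then
       echo "Domains not enabled yet:" ${missing}
   fi

With the default value, the check reports both ``rocdecode_api`` and ``rocjpeg_api`` as not enabled.
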

Using rocDecode and rocJPEG samples
================================================

For testing purposes, you can use the `rocDecode samples <https://github.com/ROCm/rocDecode?tab=readme-ov-file#using-sample-application>`_
and `rocJPEG samples <https://github.com/ROCm/rocJPEG?tab=readme-ov-file#using-sample-application>`_.
To generate sufficient load on the VCN and JPEG engines, you can use the following samples:

For video decoding:

- `Video decode batch <https://github.com/ROCm/rocDecode/tree/develop/samples/videoDecodeBatch>`_
- `Video decode performance <https://github.com/ROCm/rocDecode/tree/develop/samples/videoDecodePerf>`_

For JPEG decoding:

- `JPEG decode batched <https://github.com/ROCm/rocJPEG/tree/develop/samples/jpegDecodeBatched>`_
- `JPEG decode perf <https://github.com/ROCm/rocJPEG/tree/develop/samples/jpegDecodePerf>`_

After completing the build steps mentioned in the sample documentation, proceed with the following steps:

1. Source the ROCm Systems Profiler environment using:

   .. code-block:: shell

      source /opt/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh

   Alternatively, if you are using modules, use:

   .. code-block:: shell

      module use /opt/rocprofiler-systems/share/modulefiles

2. Generate and configure the profiler config file.

   .. code-block:: shell

      rocprof-sys-avail -G $HOME/.rocprofsys.cfg -F txt
      export ROCPROFSYS_CONFIG_FILE=$HOME/.rocprofsys.cfg

   Edit ``.rocprofsys.cfg`` with the following settings:

   .. code-block:: shell

      ROCPROFSYS_USE_AMD_SMI = true
      ROCPROFSYS_AMD_SMI_METRICS = busy,temp,power,mem_usage,vcn_activity,jpeg_activity
      ROCPROFSYS_ROCM_DOMAINS = hip_runtime_api,marker_api,kernel_dispatch,memory_copy,scratch_memory,page_migration,rocdecode_api,rocjpeg_api

3. Profile the rocDecode sample.

   .. code-block:: shell

      rocprof-sys-sample -PTHD -- ./videodecodebatch -i /opt/rocm/share/rocdecode/video/

   .. note::

      If the ``rocdecode-dev`` package is installed, the sample videos are located in ``/opt/rocm/share/rocdecode/video`` by default.

   At the end of the run, a message similar to the following appears::

      [rocprofiler-systems][964294][perfetto]> Outputting '/home/demo/rocprofsys-videodecodebatch-output/2025-04-25_15.52/perfetto-trace-964294.proto'
      (2792.91 KB / 2.79 MB / 0.00 GB)... Done

   To view the generated ``.proto`` file in the browser, open the
   `Perfetto UI page <https://ui.perfetto.dev/>`_. Then, click
   ``Open trace file`` and select the ``.proto`` file. The browser displays a visualization similar to the following:

   .. image:: ../data/rocprof-sys-vcn-activity.png
      :alt: Visualization of a performance graph in Perfetto with VCN Activity tracks

   .. image:: ../data/rocprof-sys-rocdecode.png
      :alt: Visualization of a performance graph in Perfetto with rocDecode and VA-API traces

4. To profile the rocJPEG sample, use:

   .. code-block:: shell

      rocprof-sys-sample -v 2 -PTHD -- ./jpegdecodeperf -i /opt/rocm/share/rocjpeg/image/

   .. note::

      If the ``rocjpeg-dev`` package is installed, the sample images are located in the
      ``/opt/rocm/share/rocjpeg/image/`` directory.
      Duplicate the images to generate enough workload to see activity in the trace.

   .. image:: ../data/rocprof-sys-jpeg-activity.png
      :alt: Visualization of a performance graph in Perfetto with JPEG Activity tracks
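
One way to duplicate the sample images for the rocJPEG run is to copy them several times into a scratch directory and pass that directory to ``-i``. The following is a minimal sketch, assuming a POSIX shell. ``duplicate_images``, the copy count, and the ``$HOME/jpeg-load`` directory are hypothetical choices made for this example and are not part of rocJPEG or ROCm Systems Profiler.

.. code-block:: shell

   # duplicate_images SRC DST COPIES: copy every regular file in SRC into DST
   # COPIES times, prefixing each copy with its index so names stay unique and
   # file extensions are preserved. Hypothetical helper for illustration.
   duplicate_images() {
       src="$1"
       dst="$2"
       copies="$3"
       mkdir -p "${dst}" || return 1
       i=1
       while [ "${i}" -le "${copies}" ]; do
           for file in "${src}"/*; do
               [ -f "${file}" ] || continue          # skip directories and empty globs
               cp "${file}" "${dst}/${i}_$(basename "${file}")"
           done
           i=$((i + 1))
       done
   }

   # Example usage (paths and copy count are assumptions):
   # duplicate_images /opt/rocm/share/rocjpeg/image "$HOME/jpeg-load" 20
   # rocprof-sys-sample -v 2 -PTHD -- ./jpegdecodeperf -i "$HOME/jpeg-load"

How many copies are enough depends on the GPU and the sampling interval, so treat the count as a starting point and adjust it until JPEG activity is visible in the trace.
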