diff --git a/.wordlist.txt b/.wordlist.txt index 705ade97ef..ae9070776b 100644 --- a/.wordlist.txt +++ b/.wordlist.txt @@ -38,6 +38,9 @@ pid polymorphism ppc proc +rocDecode +rocJPEG +roctx rpath rvalues sdk diff --git a/CHANGELOG.md b/CHANGELOG.md index bb56075c76..df413bd1c8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,12 @@ Full documentation for ROCm Systems Profiler is available at [https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/). +## ROCm Systems Profiler 1.1.0 for ROCm 6.5 + +### Added + +- Added profiling and metric collection capabilities for VCN engine activity, JPEG engine activity and API tracing for rocDecode, rocJPEG and VA-APIs. + ## ROCm Systems Profiler 0.1.1 for ROCm 6.3.2 ### Resolved issues diff --git a/README.md b/README.md index fed53600e8..2d99d84527 100755 --- a/README.md +++ b/README.md @@ -62,11 +62,15 @@ The documentation source files reside in the [`/docs`](/docs) folder of this rep - HIP kernel tracing - HSA API tracing - HSA operation tracing +- rocDecode API tracing +- rocJPEG API tracing - System-level sampling (via rocm-smi) - Memory usage - Power usage - Temperature - Utilization + - VCN Utilization + - JPEG Utilization ### CPU Metrics diff --git a/docs/conceptual/rocprof-sys-feature-set.rst b/docs/conceptual/rocprof-sys-feature-set.rst index fed24b55b3..5d4c264b3c 100644 --- a/docs/conceptual/rocprof-sys-feature-set.rst +++ b/docs/conceptual/rocprof-sys-feature-set.rst @@ -52,6 +52,8 @@ GPU metrics * HIP kernel tracing * HSA API tracing * HSA operation tracing +* rocDecode API tracing +* rocJPEG API tracing * System-level sampling (via rocm-smi) * Memory usage @@ -59,6 +61,7 @@ GPU metrics * Temperature * Utilization * VCN activity + * JPEG activity CPU metrics ======================================== diff --git a/docs/how-to/configuring-runtime-options.rst b/docs/how-to/configuring-runtime-options.rst index 9694582148..a8352b3bef 
100644 --- a/docs/how-to/configuring-runtime-options.rst +++ b/docs/how-to/configuring-runtime-options.rst @@ -217,6 +217,44 @@ The following example: ROCPROFSYS_ROCM_EVENTS = GPUBusy SQ_WAVES:device=0 SQ_INSTS_VALU:device=1 +Exploring GPU Metrics +--------------------- + +ROCm Systems Profiler supports GPU metrics collection, sampling, and API tracing via `ROCprofiler-SDK `_ and `ROCm-SMI `_. +ROCprofiler-SDK supports application tracing, which gives a high-level view of GPU application execution, and kernel profiling, which exposes low-level hardware details from the performance counters. +The ROCm-SMI library offers a unified tool for managing, monitoring, and retrieving information about the system's drivers and GPUs. + +Sampling of GPU metrics such as utilization, temperature, power consumption, and memory usage is configured with ``ROCPROFSYS_ROCM_SMI_METRICS``. +The ``ROCPROFSYS_USE_ROCM_SMI`` setting should be enabled for GPU metric collection. + +For example, the following is a valid configuration: + +.. code-block:: shell + + ROCPROFSYS_ROCM_SMI_METRICS=busy,temp,power,vcn_activity,mem_usage + +Supported values for ``ROCPROFSYS_ROCM_SMI_METRICS`` are: ``busy``, ``temp``, ``power``, ``vcn_activity``, ``mem_usage``, ``jpeg_activity``. + +API tracing is configured with the ``ROCPROFSYS_ROCM_DOMAINS`` setting. The domains are used to filter the events that are captured during profiling. +Supported values for this setting are those supported by ROCprofiler-SDK, which are returned by the ``get_callback_tracing_names()`` and ``get_buffer_tracing_names()`` APIs. See the `ROCprofiler-SDK developer API documentation `_ to learn more about ROCprofiler-SDK APIs. +Use the following command to view the available domains: + +.. code-block:: shell + + rocprof-sys-avail -bd -r ROCM_DOMAINS + +.. note:: + +Some settings can enable tracing for multiple domains, such as ``hip_api``, which enables both ``hip_runtime_api`` and ``hip_compiler_api``. 
+Similarly, ``hsa_api`` enables all HSA domains: ``hsa_core_api``, ``hsa_amd_ext_api``, ``hsa_image_ext_api``, and ``hsa_finalize_ext_api``. +The ``marker_api`` or ``roctx`` setting enables ROCTx marker API tracing. + +For example, the following is a valid configuration: + +.. code-block:: shell + + ROCPROFSYS_ROCM_DOMAINS=hip_runtime_api,kernel_dispatch,memory_copy,rocdecode_api,rocjpeg_api + rocprof-sys-avail examples ----------------------------------- @@ -474,7 +512,8 @@ Viewing components | sampling_gpu_power | GPU Power Usage via ROCm-SMI. Derived fro... | | sampling_gpu_temp | GPU Temperature via ROCm-SMI. Derived fro... | | sampling_gpu_busy | GPU Utilization (% busy) via ROCm-SMI. De... | - | sampling_vcn_busy | GPU VCN Utilization (% activity) via ROCm... | + | sampling_gpu_vcn | GPU VCN Utilization (% activity) via ROCm... | + | sampling_gpu_jpeg | GPU JPEG Utilization (% activity) via ROCm.. | | sampling_gpu_memory_usage | GPU Memory Usage via ROCm-SMI. Derived fr... | |-----------------------------------|----------------------------------------------| diff --git a/docs/how-to/understanding-rocprof-sys-output.rst b/docs/how-to/understanding-rocprof-sys-output.rst index 73e27fe354..878effefab 100644 --- a/docs/how-to/understanding-rocprof-sys-output.rst +++ b/docs/how-to/understanding-rocprof-sys-output.rst @@ -366,22 +366,32 @@ absolute path, then all ``ROCPROFSYS_OUTPUT_PATH`` and similar settings are ignored. Visit `ui.perfetto.dev `_ and open this file. +**Figure 1:** Visualization of a performance graph in Perfetto + .. image:: ../data/rocprof-sys-perfetto.png :alt: Visualization of a performance graph in Perfetto :width: 800 +**Figure 2:** Visualization of ROCm data in Perfetto + .. image:: ../data/rocprof-sys-rocm.png :alt: Visualization of ROCm data in Perfetto :width: 800 +**Figure 3:** Visualization of ROCm flow data in Perfetto + .. 
image:: ../data/rocprof-sys-rocm-flow.png :alt: Visualization of ROCm flow data in Perfetto :width: 800 +**Figure 4:** Visualization of ROCm API calls in Perfetto + .. image:: ../data/rocprof-sys-user-api.png :alt: Visualization of ROCm API calls in Perfetto :width: 800 +**Figure 5:** Visualization of ROCm GPU metrics in Perfetto + .. image:: ../data/rocprof-sys-gpu-metrics.png :alt: Visualization of ROCm GPU metrics in Perfetto :width: 800