Files
rocm-systems/projects/rocprofiler-sdk
Trowbridge, Ian 74efdad1aa rocJPEG API Tracing (#73)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Support for rocJPEG API Trace

* Added newline to rocjpeg_version.h

* json-tool code added, initial test/bin commit

* Formatting

* Resolved rocjpeg bin test compilation errors

* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed

* Formatting and compilation errors

* Minor fixes

* Copyright year update and minor fixes

* Doc update fix

* Added rocjpeg csv file in data

* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes

* Added rocdecode and rocjpeg to CI

* Removed rocdecode and rocjpeg from CI and added back build tests option

* Updated Cmake Files

* Added rocDecode and rocJPEG to CI

* Remove cmake line added in error

* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes

* Added find_package for test

* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path

* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests

* Resolve merge conflicts and formatting

* Added regex find and replace instead of include for CI

* VAAPI package causing errors on Vega20

* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved

* Removed workflows regex

* Formatting and minor test modification

* Modified test for vega20

* Update rocDecode and rocJPEG cmake and tests

* Changelog

* Fix merge conflict

* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing

* Removed if found statements, replaced with TARGET:EXISTS

* Skip json file for rocjpeg and rocdecode tests if not supported

* Add os import

---------

Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 31fe8858d1]
2025-02-21 13:43:49 -08:00
..
2025-02-21 13:43:49 -08:00
2025-02-21 13:43:49 -08:00
2025-02-21 13:43:49 -08:00
2025-02-21 13:43:49 -08:00
2023-08-24 19:19:48 -05:00
2025-02-04 04:05:38 -06:00
2025-01-22 11:34:21 -06:00
2025-02-21 13:43:49 -08:00
2025-01-22 19:11:20 -06:00
2025-02-04 04:05:38 -06:00
2025-02-13 08:51:10 -06:00

ROCprofiler-SDK: Application Profiling, Tracing, and Performance Analysis

Important

We are phasing out development and support for roctracer/rocprofiler/rocprof/rocprofv2 in favor of rocprofiler-sdk/rocprofv3 in upcoming ROCm releases. Going forward, only critical defect fixes will be addressed for older versions of profiling tools and libraries. We encourage all users to upgrade to the latest version, rocprofiler-sdk library and rocprofv3 tool, to ensure continued support and access to new features.

Overview

ROCProfiler-SDK is AMDs new and improved tooling infrastructure, providing a hardware-specific low-level performance analysis interface for profiling and tracing GPU compute applications. To see what's changed Click Here

Note

The published documentation is available at ROCprofiler-SDK documentation in an organized, easy-to-read format, with search and a table of contents. The documentation source files reside in the rocprofiler-sdk/source/docs folder of this repository. As with all ROCm projects, the documentation is open source. For more information on contributing to the documentation, see Contribute to ROCm documentation.

GPU Metrics

  • GPU hardware counters
  • HIP API tracing
  • HIP kernel tracing
  • HSA API tracing
  • HSA operation tracing
  • Marker(ROCTx) tracing
  • PC Sampling (Beta)
  • RCCL tracing
  • Kokkoks tracing

Tool Support

rocprofv3 is the command line tool built using the rocprofiler-sdk library and shipped with the ROCm stack. To see details on the command line options of rocprofv3, please see rocprofv3 user guide Click Here

Documentation

We make use of doxygen to generate API documentation automatically. The generated document can be found in the following path:

<ROCM_PATH>/share/html/rocprofiler-sdk

ROCM_PATH by default is /opt/rocm It can be set by the user in different locations if needed.

Build and Installation

git clone https://github.com/ROCm/rocprofiler-sdk.git rocprofiler-sdk-source
cmake                                         \
      -B rocprofiler-sdk-build                \
      -D ROCPROFILER_BUILD_TESTS=ON           \
      -D ROCPROFILER_BUILD_SAMPLES=ON         \
      -D CMAKE_INSTALL_PREFIX=/opt/rocm       \
       rocprofiler-sdk-source

cmake --build rocprofiler-sdk-build --target all --parallel 8

To install ROCprofiler, run:

cmake --build rocprofiler-sdk-build --target install

Please see the detailed section on build and installation here: Click Here

Support

Please report in the Github Issues.

Limitations

  • Individual XCC mode is not supported.

  • By default, PC sampling API is disabled. To use PC sampling. Setting the ROCPROFILER_PC_SAMPLING_BETA_ENABLED environment variable grants access to the PC Sampling experimental beta feature. This feature is still under development and may not be completely stable.

    • Risk Acknowledgment: By activating this environment variable, you acknowledge and accept the following potential risks:

      • Hardware Freeze: This beta feature could cause your hardware to freeze unexpectedly.
      • Need for Cold Restart: In the event of a hardware freeze, you may need to perform a cold restart (turning the hardware off and on) to restore normal operations. Please use this beta feature cautiously. It may affect your system's stability and performance. Proceed at your own risk.
    • At this point, We do not recommend stress-testing the beta implementation.

    • Correlation IDs provided by the PC sampling service are verified only for HIP API calls.

    • Timestamps in PC sampling records might not be 100% accurate.

    • Using PC sampling on multi-threaded applications might fail with HSA_STATUS_ERROR_EXCEPTION.Furthermore, if three or more threads launch operations to the same agent, and if PC sampling is enabled, the HSA_STATUS_ERROR_EXCEPTION might appear.

  • gfx11 and gfx12 requires a stable power state for counter collection. This includes Radeon 7000 GPUs.

    # For device <N>. Use 'rocm-smi' or 'amd-smi monitor' to see device number.
    sudo amd-smi set -g <N> -l stable_std
    # After profiling, set power state back to 'auto'
    sudo amd-smi set -g <N> -l auto
    

    The gfx version can be found via amd-smi static --asic -g <N> in the TARGET_GRAPHICS_VERSION field:

    $ amd-smi static -a -g 2
    GPU: 2
        ASIC:
            MARKET_NAME: Navi 33 [Radeon Pro W7500]
            VENDOR_ID: 0x1002
            VENDOR_NAME: Advanced Micro Devices Inc. [AMD/ATI]
            SUBVENDOR_ID: 0x1002
            DEVICE_ID: 0x7489
            SUBSYSTEM_ID: 0x0e0d
            REV_ID: 0x00
            ASIC_SERIAL: N/A
            OAM_ID: N/A
            NUM_COMPUTE_UNITS: 28
            TARGET_GRAPHICS_VERSION: gfx1102
    

Warning

The latest mainline version of AQLprofile can be found at https://repo.radeon.com/rocm/misc/aqlprofile/. However, it's important to note that updates to the public AQLProfile may not occur as frequently as updates to the rocprofiler-sdk. This discrepancy could lead to a potential mismatch between the AQLprofile binary and the rocprofiler-sdk source.