Madsen, Jonathan 7afedc63be [rocprofv3] SQLite3 database output (rocpd) support + rocprofiler-sdk-rocpd (#403)
* [rocprofv3] rocpd SQLite3 database output support

* Move counters xml and yaml to source/share/rocprofiler-sdk

- more representative of install hierarchy

* Add share/rocprofiler-sdk/rocpd SQL files

* Experimental rocprofiler-sdk SQL API

* rocprofv3 default output format is rocpd

* Fix rocpd event ids for counter collection w/o kernel dispatch

* Remove fktable entries from rocpd_tables.sql

* Fix rocpd schema path

* Fix install component for roctx python bindings

* rocprofiler-sdk-rocpd

- create include/rocprofiler-sdk-rocpd
- create rocprofiler-sdk-rocpd library, package, etc.
- default all "guid" fields to "{{guid}}" in tables
- remove "{{view_uuid}}" support (always unused)

* Migrate rocprofv3 to use rocprofiler-sdk-rocpd

* Fix missing foreign key reference

* Revert change

* Fix cmake comment

* Fix maybe-uninitialized compiler warning

* Fix maybe-uninitialized compiler warning

* Add logging to rocpd_sql_load_schema

* Improve string sanitization when inserting json strings

* Initialize rocpd logging on rocprofiler-sdk-rocpd library load

* Revert lib/output/generatePerfetto.cpp changes

* [temporary] Tweak rocprofv3-test-list-avail-trace-execute test log level

* Update get_install_path for lib/rocprofiler-sdk-rocpd/sql.cpp

- try to resolve issues on RHEL/SLES for dladdr

* Update lib/common/logging.cpp

- enable environ overrides

* dlsym for rocpd_sql_load_schema

* Make dl_info.dli_fname lexically normal

* Implement node_info alternatives if /etc/machine-id does not exist

* Misc include fixes

* SHA256 and UUIDv7 support

* Implement UUIDv7 in generateRocpd.cpp

* Support push/pop environment variables

* Minor tweak

* Fix glog segfaults when unsetting glog env

* Updated CHANGELOG

* Updates tests/pytest-packages

- rocpd_reader.py: RocpdReader

* Update tests / marker_views.sql

- add test_rocpd_data

* Update rocpd_tables.sql

- Use AUTOINCREMENT
- insert "uuid" and "guid" into rocpd_metadata

* Minor updates to generateRocpd.cpp

- don't quote GUID
- use sqlite3_open_v2
- use sqlite3_close_v2

* Update execute_raw_sql_statements_impl

- uses sqlite3_last_insert_rowid for autoincrement

* Update SQL deferred_transaction

- CI check for nullptr to connection

* Apply suggestions from code review

Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>

* Code review updates

- formatting
- replace if with switch
- remove loop for {{uuid}}

* Fix pmc_groups handling in rocprofv3

* Address code review feedback

- Include rocm_version in rocprofv3 version info
- Note `--version` option for `rocprofv3` in CHANGELOG.md
- remove commented out code

* Fix packaging dependencies

* Fix install package step of CI workflow

* Fix install package step of CI workflow

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
2025-05-30 00:13:19 -05:00

ROCprofiler-SDK: Application Profiling, Tracing, and Performance Analysis

Important

We are phasing out development and support for ROCTracer, ROCprofiler, rocprof, and rocprofv2 in favour of ROCprofiler-SDK and rocprofv3 in upcoming ROCm releases. Starting with the ROCm 6.4 release, only critical defect fixes will be addressed for older versions of the profiling tools and libraries. We encourage all users to upgrade to the latest version of the ROCprofiler-SDK library and the rocprofv3 tool to ensure continued support and access to new features.

Please note that we anticipate the end of life for ROCprofiler V1/V2 and ROCTracer within nine months after the ROCm 7.0 release, aligning with Q1 2026.

Overview

ROCprofiler-SDK is AMD's new and improved tooling infrastructure, providing a hardware-specific, low-level performance analysis interface for profiling and tracing GPU compute applications. To see what's changed, Click Here.

Note

The published documentation is available at ROCprofiler-SDK documentation in an organized, easy-to-read format, with search and a table of contents. The documentation source files reside in the rocprofiler-sdk/source/docs folder of this repository. As with all ROCm projects, the documentation is open source. For more information on contributing to the documentation, see Contribute to ROCm documentation.

GPU Metrics

  • GPU hardware counters
  • Dispatch Counter Collection
  • Device Counter Collection
  • PC Sampling (Host Trap)
  • Thread trace and ROCprof trace decoder (SQTT, ATT)

API Trace Support

  • HIP API tracing
  • HSA API tracing
  • Marker (ROCTx) tracing
  • Memory copy tracing
  • Memory allocation tracing
  • Page Migration Event tracing
  • Scratch Memory tracing
  • RCCL API tracing
  • rocDecode API tracing
  • rocJPEG API tracing

Parallelism API Support

  • HIP
  • HSA
  • MPI
  • Kokkos-Tools (KokkosP)
  • OpenMP-Tools (OMPT)

Tool Support

rocprofv3 is the command-line tool built on the rocprofiler-sdk library and shipped with the ROCm stack. For details on the rocprofv3 command-line options, see the rocprofv3 user guide: Click Here

Documentation

We use Doxygen to generate the API documentation automatically. The generated documentation is installed at the following path:

<ROCM_PATH>/share/html/rocprofiler-sdk

ROCM_PATH is /opt/rocm by default. Users can set it to a different location if needed.
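
As a minimal sketch of the path rule above, the following shell function resolves the documentation directory from ROCM_PATH and falls back to /opt/rocm when the variable is unset or empty. The function name rocprofiler_sdk_docs_dir and the helper variable rocm_root are hypothetical, chosen only for illustration; they are not part of ROCprofiler-SDK.

```shell
# Hypothetical helper: print the directory that holds the generated API docs.
rocprofiler_sdk_docs_dir() {
    # Use ROCM_PATH when set and non-empty, otherwise the default /opt/rocm.
    rocm_root="${ROCM_PATH:-/opt/rocm}"
    # Strip one trailing slash so the joined path does not contain "//".
    printf '%s\n' "${rocm_root%/}/share/html/rocprofiler-sdk"
}

rocprofiler_sdk_docs_dir
```

With ROCM_PATH unset, the function prints the default location under /opt/rocm; exporting ROCM_PATH before the call switches it to a custom install prefix.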

Build and Installation

git clone https://github.com/ROCm/rocprofiler-sdk.git rocprofiler-sdk-source
cmake                                         \
      -B rocprofiler-sdk-build                \
      -D ROCPROFILER_BUILD_TESTS=ON           \
      -D ROCPROFILER_BUILD_SAMPLES=ON         \
      -D CMAKE_INSTALL_PREFIX=/opt/rocm       \
       rocprofiler-sdk-source

cmake --build rocprofiler-sdk-build --target all --parallel 8

To install ROCprofiler-SDK, run:

cmake --build rocprofiler-sdk-build --target install

For details, see the section on build and installation: Click Here

Support

Please report problems through GitHub Issues or send an email to dl.ROCm-Profiler.support@amd.com

Limitations

  • Individual XCC mode is not supported.

  • By default, the PC sampling API is disabled. To use PC sampling, set the ROCPROFILER_PC_SAMPLING_BETA_ENABLED environment variable, which grants access to the PC sampling experimental beta feature. This feature is still under development and may not be completely stable.

    • Risk Acknowledgment: By activating this environment variable, you acknowledge and accept the following potential risks:

      • Hardware Freeze: This beta feature could cause your hardware to freeze unexpectedly.
      • Need for Cold Restart: In the event of a hardware freeze, you may need to perform a cold restart (turning the hardware off and on) to restore normal operations. Please use this beta feature cautiously. It may affect your system's stability and performance. Proceed at your own risk.
    • At this point, we do not recommend stress-testing the beta implementation.

    • Correlation IDs provided by the PC sampling service are verified only for HIP API calls.

    • Timestamps in PC sampling records might not be 100% accurate.

    • Using PC sampling on multi-threaded applications might fail with HSA_STATUS_ERROR_EXCEPTION. Furthermore, if three or more threads launch operations to the same agent while PC sampling is enabled, HSA_STATUS_ERROR_EXCEPTION might appear.

  • gfx10, gfx11, and gfx12 require a stable power state for counter collection. This includes Radeon 7000 GPUs.

    # For device <N>. Use 'rocm-smi' or 'amd-smi monitor' to see device number.
    sudo amd-smi set -g <N> -l stable_std
    # After profiling, set power state back to 'auto'
    sudo amd-smi set -g <N> -l auto
    

    The gfx version can be found via amd-smi static --asic -g <N> in the TARGET_GRAPHICS_VERSION field:

    $ amd-smi static -a -g 2
    GPU: 2
        ASIC:
            MARKET_NAME: Navi 33 [Radeon Pro W7500]
            VENDOR_ID: 0x1002
            VENDOR_NAME: Advanced Micro Devices Inc. [AMD/ATI]
            SUBVENDOR_ID: 0x1002
            DEVICE_ID: 0x7489
            SUBSYSTEM_ID: 0x0e0d
            REV_ID: 0x00
            ASIC_SERIAL: N/A
            OAM_ID: N/A
            NUM_COMPUTE_UNITS: 28
            TARGET_GRAPHICS_VERSION: gfx1102
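
The check described in the last limitation can be sketched in shell. This is an illustration under stated assumptions, not an official tool: it assumes the amd-smi output has the `KEY: value` layout shown above, and the function names gfx_version_from_asic_info and needs_stable_power_state are hypothetical.

```shell
# Hypothetical helper: read `amd-smi static --asic` output on stdin and
# print the value of the TARGET_GRAPHICS_VERSION field (for example gfx1102).
gfx_version_from_asic_info() {
    awk -F': *' '/TARGET_GRAPHICS_VERSION/ {
        sub(/[ \t\r]+$/, "", $2)   # drop trailing whitespace, if any
        print $2
        exit
    }'
}

# Hypothetical helper: succeed (exit status 0) when the given gfx version
# belongs to the gfx10, gfx11, or gfx12 families named in the limitation.
needs_stable_power_state() {
    case "$1" in
        gfx10*|gfx11*|gfx12*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage on a machine with amd-smi installed (device 2 as above):
#   gfx="$(amd-smi static --asic -g 2 | gfx_version_from_asic_info)"
#   if needs_stable_power_state "$gfx"; then
#       sudo amd-smi set -g 2 -l stable_std
#   fi
```

The usage lines are left as comments because they need amd-smi and root privileges; remember to set the power state back to auto after profiling.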
    

Warning

The latest mainline version of AQLprofile can be found at https://repo.radeon.com/rocm/misc/aqlprofile/. However, updates to the public AQLprofile may not occur as frequently as updates to rocprofiler-sdk, which could lead to a mismatch between the AQLprofile binary and the rocprofiler-sdk source.
