Jonathan R. Madsen 16d535ef48 rocprofv3 OTF2 Output Support (#995)
* CMake support for OTF2 library

* Preliminary OTF2 generation implementation

* Completed OTF2 Support

- HSA API
- HIP API
- Marker API
- Async Memory Copies
- Kernel Dispatch

* Update lib/rocprofiler-sdk-tool/generateOTF2.cpp

- fix location type for dispatches

* Testing for OTF2 output

* Add OTF2 to requirements.txt

* Update lib/rocprofiler-sdk-tool/generateOTF2.cpp

- fix getting kernel name

* OTF2 testing with rocprofv3/tracing-hip-in-libraries

* Format external/otf2/CMakeLists.txt

* Update external/otf2/CMakeLists.txt

- guard CMP0135 for cmake < 3.24

* Update lib/rocprofiler-sdk-tool/generateOTF2.cpp

- fix duplicate string ref issue

* Update lib/rocprofiler-sdk-tool/generateOTF2.cpp

- fix header includes

* Update CI workflow

- sudo install pypi requirements for core-rpm for $HOME/.local installs

* Update pytest_utils/otf2_reader.py

- modifications for reading trace

* Update pytest_utils/otf2_reader.py

- misc cleanup

* Update CI workflow

- fix installer artifact naming

* Update pytest_utils/otf2_reader.py

- handle slightly overlapping kernel timestamps for MI300

* OTF2 attributes for category

* Testing with OTF2Reader category attributes

* Fix memory leak in OTF2 generation

- leaking OTF2_AttributeList
2024-07-30 19:57:19 -05:00

ROCprofiler-SDK: Application Profiling, Tracing, and Performance Analysis

Note

rocprofiler-sdk is currently a beta release and is subject to change in future releases.

Overview

ROCprofiler-SDK is AMD's new and improved tooling infrastructure, providing a hardware-specific low-level performance analysis interface for profiling and tracing GPU compute applications. To see what has changed, Click Here

GPU Metrics

  • GPU hardware counters
  • HIP API tracing
  • HIP kernel tracing
  • HSA API tracing
  • HSA operation tracing
  • Marker (ROCTx) tracing
  • PC Sampling (Beta)

Tool Support

rocprofv3 is the command-line tool built on the rocprofiler-sdk library and shipped with the ROCm stack. For details on its command-line options, see the rocprofv3 user guide: Click Here
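As a rough illustration of how rocprofv3 might be invoked, the sketch below wraps a single tracing run in a small helper. The helper name trace_app is hypothetical, and the flags --hip-trace, --kernel-trace, and --output-format otf2 are assumptions based on the tracing features listed above and the OTF2 output support; consult the rocprofv3 user guide for the authoritative option list.

```shell
# Hypothetical helper: trace HIP API calls and kernel dispatches of an application.
# ASSUMPTION: rocprofv3 accepts --hip-trace, --kernel-trace and --output-format otf2;
# verify against the rocprofv3 user guide for your ROCm release.
trace_app() {
    # Fail early with a clear message if the tool is not on PATH.
    if ! command -v rocprofv3 >/dev/null 2>&1; then
        echo "rocprofv3 not found; add <ROCM_PATH>/bin to PATH" >&2
        return 127
    fi
    # Everything after "--" is the application command line being profiled.
    rocprofv3 --hip-trace --kernel-trace --output-format otf2 -- "$@"
}
```

For example, `trace_app ./my_hip_app --size 1024` would profile a (hypothetical) application binary named my_hip_app with its own arguments passed through unchanged.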

Documentation

API documentation is generated automatically with Doxygen. The generated documentation can be found at the following path:

<ROCM_PATH>/share/html/rocprofiler-sdk

ROCM_PATH defaults to /opt/rocm. It can be set to a different location if needed.
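The path resolution described above can be sketched in shell as follows. The function name rocprof_doc_dir is hypothetical; it only builds the path string from ROCM_PATH (or an explicit prefix) and does not check that the documentation is actually installed.

```shell
# Hypothetical helper: print the directory that should hold the generated API docs.
# Uses the first argument if given, else $ROCM_PATH, else the default /opt/rocm.
rocprof_doc_dir() {
    prefix="${1:-${ROCM_PATH:-/opt/rocm}}"
    printf '%s\n' "${prefix}/share/html/rocprofiler-sdk"
}

# Report whether the documentation directory exists for the current environment.
doc_dir="$(rocprof_doc_dir)"
if [ -d "${doc_dir}" ]; then
    echo "API documentation: ${doc_dir}"
else
    echo "No documentation found at ${doc_dir}; check ROCM_PATH" >&2
fi
```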

Build and Installation

git clone https://github.com/ROCm/rocprofiler-sdk.git rocprofiler-sdk-source
cmake                                         \
      -B rocprofiler-sdk-build                \
      -D ROCPROFILER_BUILD_TESTS=ON           \
      -D ROCPROFILER_BUILD_SAMPLES=ON         \
      -D CMAKE_INSTALL_PREFIX=/opt/rocm       \
       rocprofiler-sdk-source

cmake --build rocprofiler-sdk-build --target all --parallel 8

To install ROCprofiler-SDK, run:

cmake --build rocprofiler-sdk-build --target install

For detailed build and installation instructions, see: Click Here

Support

Please report issues through GitHub Issues.

Limitations

  • Individual XCC mode is not supported.

  • By default, the PC sampling API is disabled. To use PC sampling, set the ROCPROFILER_PC_SAMPLING_BETA_ENABLED environment variable, which grants access to the experimental PC sampling beta feature. This feature is still under development and may not be completely stable.

    • Risk Acknowledgment: By activating this environment variable, you acknowledge and accept the following potential risks:
      • Hardware Freeze: This beta feature could cause your hardware to freeze unexpectedly.
      • Need for Cold Restart: In the event of a hardware freeze, you may need to perform a cold restart (turning the hardware off and on) to restore normal operations. Please use this beta feature cautiously. It may affect your system's stability and performance. Proceed at your own risk.
  • At this point, we do not recommend stress-testing the beta implementation.

  • Correlation IDs provided by the PC sampling service are verified only for HIP API calls.

  • Timestamps in PC sampling records might not be 100% accurate.

  • Using PC sampling on multi-threaded applications might fail with HSA_STATUS_ERROR_EXCEPTION. Furthermore, if three or more threads launch operations to the same agent while PC sampling is enabled, HSA_STATUS_ERROR_EXCEPTION might appear.
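Given the risks listed above, one cautious way to opt in to the PC sampling beta is to set the variable for a single command rather than exporting it for the whole session. The sketch below is a minimal illustration; the helper name with_pc_sampling_beta is hypothetical, and the value 1 is an assumption, since this README names only the variable and not the value it expects.

```shell
# Hypothetical helper: run one command with the PC sampling beta enabled.
# ASSUMPTION: the value "1" enables the feature; the README does not state the value.
with_pc_sampling_beta() {
    # env scopes the variable to the child process, so the calling shell is untouched.
    env ROCPROFILER_PC_SAMPLING_BETA_ENABLED=1 "$@"
}
```

Because the variable is scoped to the child process, later commands in the same shell run with PC sampling still disabled unless they are wrapped the same way.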

Warning

The latest mainline version of AQLprofile can be found at https://repo.radeon.com/rocm/misc/aqlprofile/. However, the public AQLprofile may not be updated as frequently as rocprofiler-sdk, which could lead to a mismatch between the AQLprofile binary and the rocprofiler-sdk source.
