Saurabh Verma 946385d0ff Reverts #1379 and properly migrates the docs (#1381)
Reverts #1379 and properly migrates the docs

---------

Co-authored-by: Matt Williams <matt.williams@amd.com>
2025-10-15 10:48:27 -04:00

.. meta::
   :description: AQLprofile is an open source library that enables advanced GPU profiling and tracing on AMD platforms.
   :keywords: AQLprofile, ROCm, tool, Instinct, accelerator, AMD

What is AQLprofile?
===================
The Architected Queuing Language profiling library (AQLprofile) is an
open source library that enables advanced GPU profiling and tracing on
AMD platforms. It works in conjunction with
`ROCprofiler-SDK <https://github.com/ROCm/rocprofiler-sdk>`__ to
support profiling methods such as `performance counters
(PMC) <https://rocm.docs.amd.com/projects/aqlprofile/en/latest/examples/pmc-workflow.html>`__ and `SQ thread trace
(SQTT) <https://rocm.docs.amd.com/projects/aqlprofile/en/latest/examples/sqtt-workflow.html>`__. AQLprofile provides the
foundational mechanisms for constructing AQL packets and managing
profiling operations across multiple AMD GPU architecture families. The
development of AQLprofile is aligned with ROCprofiler-SDK, ensuring
compatibility and feature support for new GPU architectures and
profiling requirements.

AQLprofile builds on concepts from the Heterogeneous System Architecture
(HSA) and AQL, which define the foundations for GPU command
processing and profiling on AMD platforms. For more information, see:

- `HSA Platform System Architecture
  Specification <http://hsafoundation.com/wp-content/uploads/2021/02/HSA-SysArch-1.2.pdf>`__
- `HSA Runtime Programmer's Reference
  Specification <http://hsafoundation.com/wp-content/uploads/2021/02/HSA-Runtime-1.2.pdf>`__
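
To make the AQL packet concept concrete, the following sketch packs a
64-byte kernel dispatch packet using the field order and header bit
positions described in the HSA runtime specification. It is only an
illustration of the packet format and is not AQLprofile code: the helper
names ``make_header`` and ``pack_kernel_dispatch`` are hypothetical, and
the addresses passed in are placeholders.

```python
import struct

# Values taken from the HSA runtime specification.
HSA_PACKET_TYPE_KERNEL_DISPATCH = 2
HSA_FENCE_SCOPE_SYSTEM = 2

# Bit offsets of the sub-fields inside the 16-bit packet header.
HEADER_TYPE_SHIFT = 0
HEADER_BARRIER_SHIFT = 8
HEADER_ACQUIRE_SHIFT = 9
HEADER_RELEASE_SHIFT = 11

# Little-endian layout: six 16-bit fields, five 32-bit fields,
# four 64-bit fields, for a total of 64 bytes.
DISPATCH_FORMAT = "<6H5I4Q"


def make_header(packet_type, barrier, acquire_scope, release_scope):
    """Combine the header sub-fields into one 16-bit value (hypothetical helper)."""
    return (
        (packet_type << HEADER_TYPE_SHIFT)
        | (int(barrier) << HEADER_BARRIER_SHIFT)
        | (acquire_scope << HEADER_ACQUIRE_SHIFT)
        | (release_scope << HEADER_RELEASE_SHIFT)
    )


def pack_kernel_dispatch(grid, workgroup, kernel_object, kernarg_address,
                         completion_signal=0, private_bytes=0, group_bytes=0):
    """Pack a 3-D kernel dispatch packet into 64 bytes (hypothetical helper)."""
    header = make_header(HSA_PACKET_TYPE_KERNEL_DISPATCH, True,
                         HSA_FENCE_SCOPE_SYSTEM, HSA_FENCE_SCOPE_SYSTEM)
    setup = 3  # number of grid dimensions, stored in the low bits of "setup"
    return struct.pack(
        DISPATCH_FORMAT,
        header, setup,
        workgroup[0], workgroup[1], workgroup[2], 0,   # 0 = reserved field
        grid[0], grid[1], grid[2],
        private_bytes, group_bytes,
        kernel_object, kernarg_address, 0,             # 0 = reserved field
        completion_signal,
    )


# Placeholder addresses, for illustration only.
packet = pack_kernel_dispatch(grid=(1024, 1, 1), workgroup=(64, 1, 1),
                              kernel_object=0x1000, kernarg_address=0x2000)
print(len(packet))
```

Profiling packets placed around a dispatch like this one use the same
64-byte slot size, which is why they can share a queue with the kernel
packets they measure.
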

Features
--------

- Generation of profiling AQL packets for GPU workloads.
- Support for performance counters (PMC) and SQ thread traces (SQTT).
- Support for the GFX9, GFX10XX, GFX11XX, and GFX12XX architecture families.
- Verbose tracing and error logging.
- Thread trace binary data generated by AQLprofile can be decoded using
  `rocprof-trace-decoder <https://github.com/ROCm/rocprof-trace-decoder/releases>`__.

Who should use this library?
----------------------------

- **End users**: If you want to profile AMD GPUs, use
  `ROCprofiler-SDK <https://github.com/ROCm/rocprofiler-sdk>`__ or
  tools that depend on it. You do *not* need to use AQLprofile
  directly.
- **Developers/integrators**: If you're building profiling tools,
  custom workflows, or need to extend profiling capabilities, you may
  use AQLprofile directly as a backend.

How does AQLprofile fit into the ROCm profiling stack?
------------------------------------------------------

Here's the typical workflow:

Application → ROCprofiler-SDK ⇄ **AQLprofile** ⇄ ROCprofiler-SDK → HSA/ROCR/KFD → AMD GPU hardware

- **AQLprofile** generates profiling command packets (AQL/PM4) tailored to the GPU architecture. It doesn't interact with hardware or drivers directly. It only produces the packets and buffer requirements requested by ``ROCprofiler-SDK``.
- **ROCprofiler-SDK** provides a higher-level API and user-facing tools, using AQLprofile internally. It manages profiling sessions, submits packets to the GPU via `ROCr <https://rocm.docs.amd.com/projects/rocr_debug_agent/en/latest/index.html>`_/HSA/KFD, and collects results.
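
The division of responsibilities described above can be sketched as a
small toy model. Nothing here is the real AQLprofile or ROCprofiler-SDK
API: ``PacketBundle``, ``build_counter_packets``, and
``ProfilingSession`` are hypothetical names, the packet payloads are
placeholder byte strings, and the 8-bytes-per-counter buffer size is an
assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PacketBundle:
    """What the packet-building layer returns: commands plus memory needs."""
    start_packet: bytes
    stop_packet: bytes
    output_buffer_bytes: int


def build_counter_packets(arch_family, counters):
    """Stand-in for the AQLprofile role: pure computation, no hardware access."""
    supported = {"GFX9", "GFX10XX", "GFX11XX", "GFX12XX"}
    if arch_family not in supported:
        raise ValueError(f"unsupported architecture family: {arch_family}")
    tag = arch_family.encode()
    return PacketBundle(
        start_packet=b"START:" + tag,             # placeholder payload
        stop_packet=b"STOP:" + tag,               # placeholder payload
        output_buffer_bytes=8 * len(counters),    # assumed: 8 bytes per counter
    )


@dataclass
class ProfilingSession:
    """Stand-in for the ROCprofiler-SDK role: owns the buffers and the queue."""
    submitted: list = field(default_factory=list)

    def profile(self, arch_family, counters, kernel_packet):
        bundle = build_counter_packets(arch_family, counters)
        # The session, not the packet builder, allocates the output buffer.
        output = bytearray(bundle.output_buffer_bytes)
        # The session submits the packets, bracketing the kernel dispatch.
        self.submitted += [bundle.start_packet, kernel_packet, bundle.stop_packet]
        return output


session = ProfilingSession()
results = session.profile("GFX9", ["COUNTER_A", "COUNTER_B"], b"KERNEL")
print(len(results), session.submitted)
```

The point of the split is that the packet builder stays a pure function
of the architecture family and the requested counters, while allocation,
submission, and result collection stay in the session layer.
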