Files

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

349 строки
17 KiB
ReStructuredText
Исходник Постоянная ссылка Обычный вид История

2024-07-29 17:23:36 -04:00
.. meta::
:description: ROCm Systems Profiler Python profiling documentation and reference
:keywords: rocprof-sys, rocprofiler-systems, Omnitrace, ROCm, Python, profiling Python, profiler, tracking, visualization, tool, Instinct, accelerator, AMD
2024-07-29 17:23:36 -04:00
****************************************************
Profiling Python scripts
****************************************************
`ROCm Systems Profiler <https://github.com/ROCm/rocm-systems/tree/develop/projects/rocprofiler-systems>`_ supports profiling Python code at the
2024-07-29 17:23:36 -04:00
source level and the script level.
Python support is enabled via the ``ROCPROFSYS_USE_PYTHON`` and the
``ROCPROFSYS_PYTHON_VERSIONS="<MAJOR>.<MINOR>`` CMake options.
Alternatively, to build multiple Python versions, use
``ROCPROFSYS_PYTHON_VERSIONS="<MAJOR>.<MINOR>;[<MAJOR>.<MINOR>]"``,
and ``ROCPROFSYS_PYTHON_ROOT_DIRS="/path/to/version;[/path/to/version]"`` instead of ``ROCPROFSYS_PYTHON_VERSION``.
When building multiple Python versions, the length of the ``ROCPROFSYS_PYTHON_VERSIONS``
and ``ROCPROFSYS_PYTHON_ROOT_DIRS`` lists must
2024-07-29 17:23:36 -04:00
be the same size.
.. note::
When using ROCm Systems Profiler with Python programs, the Python interpreter major and minor version (e.g. 3.7)
2024-07-29 17:23:36 -04:00
must match the interpreter major and minor version
used when compiling the Python bindings. When building ROCm Systems Profiler,
the shared object file ``libpyrocprofsys.<IMPL>-<VERSION>-<ARCH>-<OS>-<ABI>.so`` is generated
where ``IMPL`` is the Python implementation, ``VERSION`` is the major and minor
2024-07-29 17:23:36 -04:00
version, ``ARCH`` is the architecture,
``OS`` is the operating system, and ``ABI`` is the application binary interface,
for example, ``libpyrocprofsys.cpython-38-x86_64-linux-gnu.so``.
2024-07-29 17:23:36 -04:00
2025-02-19 18:38:42 -05:00
.. note::
Direct Perfetto output (using ``--trace`` or ``ROCPROFSYS_USE_TRACE=ON``) has limited support for Artificial Intelligence (AI) and Machine Learning (ML) workloads.
Data from child threads is not captured. Instead, use ROCPD (``ROCPROFSYS_USE_ROCPD=ON``) as the output type.
For more information, see the :ref:`rocprof_sys_rocpd_output` section.
2025-02-19 18:38:42 -05:00
2024-12-13 15:59:07 -05:00
Getting started
2024-07-29 17:23:36 -04:00
========================================
The ROCm Systems Profiler Python package is installed in ``lib/pythonX.Y/site-packages/rocprofsys``.
To ensure the Python interpreter can find the ROCm Systems Profiler package,
2024-07-29 17:23:36 -04:00
add this path to the ``PYTHONPATH`` environment variable, as in the following example:
.. code-block:: shell
export PYTHONPATH=/opt/rocprofiler-systems/lib/python3.8/site-packages:${PYTHONPATH}
2024-07-29 17:23:36 -04:00
Both the ``share/rocprofiler-systems/setup-env.sh`` script and the module file in
``share/modulefiles/rocprofiler-systems`` automatically handle the prefixing of the ``PYTHONPATH``
2024-07-29 17:23:36 -04:00
environment variable.
.. note::
Profiling PyTorch and other AI workloads might fail because it is unable to find the libraries in the default linker path. As a workaround, you need to explicitly add the library path to ``LD_LIBRARY_PATH``. For example, when using PyTorch with Python 3.10, add the following to the environment:
.. code-block:: shell
export LD_LIBRARY_PATH=:/opt/venv/lib/python3.10/site-packages/torch/lib:$LD_LIBRARY_PATH
Running ROCm Systems Profiler on a Python script
2024-12-13 15:59:07 -05:00
================================================
2024-07-29 17:23:36 -04:00
ROCm Systems Profiler provides an ``rocprof-sys-python`` helper bash script which
2024-07-29 17:23:36 -04:00
ensures ``PYTHONPATH`` is properly set and the correct Python interpreter is used.
This means the following commands are effectively equivalent:
.. code-block:: shell
rocprof-sys-python --help
2024-07-29 17:23:36 -04:00
and
.. code-block:: shell
export PYTHONPATH=/opt/rocprofiler-systems/lib/python3.8/site-packages:${PYTHONPATH}
python3.8 -m rocprofsys --help
2024-07-29 17:23:36 -04:00
.. note::
``rocprof-sys-python`` and ``python -m rocprofsys`` use the same command-line syntax
as the other ``rocprof-sys`` executables (``rocprof-sys-python <ROCPROFSYS_ARGS> -- <SCRIPT> <SCRIPT_ARGS>``)
2024-07-29 17:23:36 -04:00
and has similar options.
Command line options
-----------------------------------
Use ``rocprof-sys-python --help`` to view the available options:
2024-07-29 17:23:36 -04:00
.. code-block:: shell
usage: rocprof-sys [-h] [-v VERBOSITY] [-b] [-c FILE] [-s FILE] [-F [BOOL]] [--label [{args,file,line} [{args,file,line} ...]]] [-I FUNC [FUNC ...]] [-E FUNC [FUNC ...]] [-R FUNC [FUNC ...]] [-MI FILE [FILE ...]] [-ME FILE [FILE ...]] [-MR FILE [FILE ...]] [--trace-c [BOOL]]
2024-07-29 17:23:36 -04:00
optional arguments:
-h, --help show this help message and exit
-v VERBOSITY, --verbosity VERBOSITY
Logging verbosity
-b, --builtin Put 'profile' in the builtins. Use '@profile' to decorate a single function, or 'with profile:' to profile a single section of code.
-c FILE, --config FILE
ROCm Systems Profiler configuration file
2024-07-29 17:23:36 -04:00
-s FILE, --setup FILE
Code to execute before the code to profile
-F [BOOL], --full-filepath [BOOL]
Encode the full function filename (instead of basename)
--label [{args,file,line} [{args,file,line} ...]]
Encode the function arguments, filename, and/or line number into the profiling function label
-I FUNC [FUNC ...], --function-include FUNC [FUNC ...]
Include any entries with these function names
-E FUNC [FUNC ...], --function-exclude FUNC [FUNC ...]
Filter out any entries with these function names
-R FUNC [FUNC ...], --function-restrict FUNC [FUNC ...]
Select only entries with these function names
-MI FILE [FILE ...], --module-include FILE [FILE ...]
Include any entries from these files
-ME FILE [FILE ...], --module-exclude FILE [FILE ...]
Filter out any entries from these files
-MR FILE [FILE ...], --module-restrict FILE [FILE ...]
Select only entries from these files
--trace-c [BOOL] Enable profiling C functions
usage: python3 -m rocprofsys <ROCPROFSYS_ARGS> -- <SCRIPT> <SCRIPT_ARGS>
2024-07-29 17:23:36 -04:00
.. note::
The ``--trace-c`` option does not incorporate ROCm Systems Profiler's dynamic instrumentation support.
2024-07-29 17:23:36 -04:00
It only enables profiling the underlying C function call within the Python interpreter.
Selective instrumentation
-----------------------------------
Similar to the ``rocprof-sys-instrument`` executable, command-line options exist for restricting,
2024-07-29 17:23:36 -04:00
including, and excluding certain functions and modules, for example, ``--function-exclude "^__init__$"``.
Alternatively, add the ``@profile`` decorator to the primary function of interest
2024-07-29 17:23:36 -04:00
in your program and use the ``-b`` / ``--builtin`` command-line option to narrow the scope of the
instrumentation to this function and its children.
Consider the following Python code (``example.py``):
.. code-block:: python
import sys
def fib(n):
return n if n < 2 else (fib(n - 1) + fib(n - 2))
def inefficient(n):
a = 0
for i in range(n):
a += i
for j in range(n):
a += j
return a
def run(n):
return fib(n) + inefficient(n)
if __name__ == "__main__":
run(20)
2025-02-19 18:38:42 -05:00
Running ``rocprof-sys-python -- ./example.py`` with ``ROCPROFSYS_PROFILE=ON`` and
``ROCPROFSYS_TIMEMORY_COMPONENTS=trip_count`` produces the following:
2024-07-29 17:23:36 -04:00
.. code-block:: shell
|-------------------------------------------------------------------------------------------|
| COUNTS NUMBER OF INVOCATIONS |
|-------------------------------------------------------------------------------------------|
| LABEL | COUNT | DEPTH | METRIC | SUM |
|---------------------------------------------------|--------|--------|------------|--------|
| |0>>> run | 1 | 0 | trip_count | 1 |
| |0>>> |_fib | 1 | 1 | trip_count | 1 |
| |0>>> |_fib | 2 | 2 | trip_count | 2 |
| |0>>> |_fib | 4 | 3 | trip_count | 4 |
| |0>>> |_fib | 8 | 4 | trip_count | 8 |
| |0>>> |_fib | 16 | 5 | trip_count | 16 |
| |0>>> |_fib | 32 | 6 | trip_count | 32 |
| |0>>> |_fib | 64 | 7 | trip_count | 64 |
| |0>>> |_fib | 128 | 8 | trip_count | 128 |
| |0>>> |_fib | 256 | 9 | trip_count | 256 |
| |0>>> |_fib | 512 | 10 | trip_count | 512 |
| |0>>> |_fib | 1024 | 11 | trip_count | 1024 |
| |0>>> |_fib | 2026 | 12 | trip_count | 2026 |
| |0>>> |_fib | 3632 | 13 | trip_count | 3632 |
| |0>>> |_fib | 5020 | 14 | trip_count | 5020 |
| |0>>> |_fib | 4760 | 15 | trip_count | 4760 |
| |0>>> |_fib | 2942 | 16 | trip_count | 2942 |
| |0>>> |_fib | 1152 | 17 | trip_count | 1152 |
| |0>>> |_fib | 274 | 18 | trip_count | 274 |
| |0>>> |_fib | 36 | 19 | trip_count | 36 |
| |0>>> |_fib | 2 | 20 | trip_count | 2 |
| |0>>> |_inefficient | 1 | 1 | trip_count | 1 |
|-------------------------------------------------------------------------------------------|
If the ``inefficient`` function is decorated with ``@profile`` as follows:
.. code-block:: python
@profile
def inefficient(n):
# ...
And then run using the command ``rocprof-sys-python -b -- ./example.py``, ROCm Systems Profiler produces this output:
2024-07-29 17:23:36 -04:00
.. code-block:: shell
|-----------------------------------------------------------|
| COUNTS NUMBER OF INVOCATIONS |
|-----------------------------------------------------------|
| LABEL | COUNT | DEPTH | METRIC | SUM |
|-------------------|--------|--------|------------|--------|
| |0>>> inefficient | 1 | 0 | trip_count | 1 |
|-----------------------------------------------------------|
ROCm Systems Profiler Python source instrumentation
2024-12-13 15:59:07 -05:00
===================================================
2024-07-29 17:23:36 -04:00
Starting with the unmodified ``example.py`` script above, import the ``rocprofsys`` module:
2024-07-29 17:23:36 -04:00
.. code-block:: python
import sys
import rocprofsys # import rocprofsys
2024-07-29 17:23:36 -04:00
def fib(n):
# ... etc. ...
Next, add ``@rocprofsys.profile()`` to the ``run`` function:
2024-07-29 17:23:36 -04:00
.. code-block:: python
@rocprofsys.profile()
2024-07-29 17:23:36 -04:00
def run(n):
# ...
Alternatively, use ``rocprofsys.profile()`` as a context-manager around ``run(20)``:
2024-07-29 17:23:36 -04:00
.. code-block:: python
if __name__ == "__main__":
with rocprofsys.profile():
2024-07-29 17:23:36 -04:00
run(20)
The results for both of the source-level instrumentation modes are identical to the
2025-02-19 18:38:42 -05:00
original ``rocprof-sys-python -- ./example.py`` results:
2024-07-29 17:23:36 -04:00
.. code-block:: shell
|-------------------------------------------------------------------------------------------|
| COUNTS NUMBER OF INVOCATIONS |
|-------------------------------------------------------------------------------------------|
| LABEL | COUNT | DEPTH | METRIC | SUM |
|---------------------------------------------------|--------|--------|------------|--------|
| |0>>> run | 1 | 0 | trip_count | 1 |
| |0>>> |_fib | 1 | 1 | trip_count | 1 |
| |0>>> |_fib | 2 | 2 | trip_count | 2 |
| |0>>> |_fib | 4 | 3 | trip_count | 4 |
| |0>>> |_fib | 8 | 4 | trip_count | 8 |
| |0>>> |_fib | 16 | 5 | trip_count | 16 |
| |0>>> |_fib | 32 | 6 | trip_count | 32 |
| |0>>> |_fib | 64 | 7 | trip_count | 64 |
| |0>>> |_fib | 128 | 8 | trip_count | 128 |
| |0>>> |_fib | 256 | 9 | trip_count | 256 |
| |0>>> |_fib | 512 | 10 | trip_count | 512 |
| |0>>> |_fib | 1024 | 11 | trip_count | 1024 |
| |0>>> |_fib | 2026 | 12 | trip_count | 2026 |
| |0>>> |_fib | 3632 | 13 | trip_count | 3632 |
| |0>>> |_fib | 5020 | 14 | trip_count | 5020 |
| |0>>> |_fib | 4760 | 15 | trip_count | 4760 |
| |0>>> |_fib | 2942 | 16 | trip_count | 2942 |
| |0>>> |_fib | 1152 | 17 | trip_count | 1152 |
| |0>>> |_fib | 274 | 18 | trip_count | 274 |
| |0>>> |_fib | 36 | 19 | trip_count | 36 |
| |0>>> |_fib | 2 | 20 | trip_count | 2 |
| |0>>> |_inefficient | 1 | 1 | trip_count | 1 |
|-------------------------------------------------------------------------------------------|
.. note::
When ``rocprof-sys-python`` is used without built-ins, the profiling results can be cluttered by the
2024-07-29 17:23:36 -04:00
numerous functions called when more complex modules are imported, such as ``import numpy``.
ROCm Systems Profiler Python source instrumentation configuration
2024-12-13 15:59:07 -05:00
-----------------------------------------------------------------
2024-07-29 17:23:36 -04:00
Within the Python source code, the profiler can be configured by directly
modifying the ``rocprof-sys.profiler.config`` data fields.
2024-07-29 17:23:36 -04:00
.. code-block:: python
import sys
def fib(n):
return n if n < 2 else (fib(n - 1) + fib(n - 2))
def inefficient(n):
a = 0
for i in range(n):
a += i
for j in range(n):
a += j
return a
def run(n):
return fib(n) + inefficient(n)
if __name__ == "__main__":
from rocprofsys.profiler import config
from rocprofsys import profile
2024-07-29 17:23:36 -04:00
config.include_args = True
config.include_filename = False
config.include_line = False
config.restrict_functions += ["fib", "run"]
with profile():
run(5)
Executing this script produces the following:
.. code-block:: shell
|------------------------------------------------------------------|
| COUNTS NUMBER OF INVOCATIONS |
|------------------------------------------------------------------|
| LABEL | COUNT | DEPTH | METRIC | SUM |
|--------------------------|--------|--------|------------|--------|
| |0>>> run(n=5) | 1 | 0 | trip_count | 1 |
| |0>>> |_fib(n=5) | 1 | 1 | trip_count | 1 |
| |0>>> |_fib(n=4) | 1 | 2 | trip_count | 1 |
| |0>>> |_fib(n=3) | 1 | 3 | trip_count | 1 |
| |0>>> |_fib(n=2) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 5 | trip_count | 1 |
| |0>>> |_fib(n=0) | 1 | 5 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=2) | 1 | 3 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=0) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=3) | 1 | 2 | trip_count | 1 |
| |0>>> |_fib(n=2) | 1 | 3 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=0) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 3 | trip_count | 1 |
|------------------------------------------------------------------|