.. meta::
   :description: Omnitrace documentation and reference
   :keywords: Omnitrace, ROCm, profiler, tracking, visualization, tool, Instinct, accelerator, AMD

****************************************************
Profiling Python scripts
****************************************************

`Omnitrace <https://github.com/ROCm/omnitrace>`_ supports profiling Python code at the
source level and the script level.
Python support is enabled via the ``OMNITRACE_USE_PYTHON`` and
``OMNITRACE_PYTHON_VERSION="<MAJOR>.<MINOR>"`` CMake options.
Alternatively, to build multiple Python versions, use
``OMNITRACE_PYTHON_VERSIONS="<MAJOR>.<MINOR>;[<MAJOR>.<MINOR>]"``
and ``OMNITRACE_PYTHON_ROOT_DIRS="/path/to/version;[/path/to/version]"`` instead of ``OMNITRACE_PYTHON_VERSION``.
When building multiple Python versions, the ``OMNITRACE_PYTHON_VERSIONS``
and ``OMNITRACE_PYTHON_ROOT_DIRS`` lists must
be the same length.

.. note::

   When using Omnitrace with Python programs, the Python interpreter major and minor version (e.g. 3.7)
   must match the interpreter major and minor version
   used when compiling the Python bindings. When building Omnitrace,
   the shared object file ``libpyomnitrace.<IMPL>-<VERSION>-<ARCH>-<OS>-<ABI>.so`` is generated
   where ``IMPL`` is the Python implementation, ``VERSION`` is the major and minor
   version, ``ARCH`` is the architecture,
   ``OS`` is the operating system, and ``ABI`` is the application binary interface,
   for example, ``libpyomnitrace.cpython-38-x86_64-linux-gnu.so``.

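To check which binding file name a given interpreter would look for, the standard-library
``sysconfig`` module reports the extension-module suffix it uses. The following is a minimal
sketch, not part of Omnitrace: the helper name ``expected_binding_name`` is hypothetical, and it
assumes the suffix reported by the interpreter is the one used in the generated file name.

.. code-block:: python

   import sysconfig


   def expected_binding_name(prefix="libpyomnitrace"):
       """Return the binding file name this interpreter is assumed to require."""
       # EXT_SUFFIX looks like ".cpython-38-x86_64-linux-gnu.so" on Linux CPython 3.8
       suffix = sysconfig.get_config_var("EXT_SUFFIX")
       return f"{prefix}{suffix}"


   print(expected_binding_name())

If the printed name does not match any ``libpyomnitrace.*.so`` file in the installation, the
interpreter version probably differs from the one used to build the bindings.
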
Getting Started
========================================

The Omnitrace Python package is installed in ``lib/pythonX.Y/site-packages/omnitrace``.
To ensure the Python interpreter can find the Omnitrace package,
add this path to the ``PYTHONPATH`` environment variable, as in the following example:

.. code-block:: shell

   export PYTHONPATH=/opt/omnitrace/lib/python3.8/site-packages:${PYTHONPATH}

Both the ``share/omnitrace/setup-env.sh`` script and the module file in
``share/modulefiles/omnitrace`` automatically handle the prefixing of the ``PYTHONPATH``
environment variable.

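One way to confirm that the interpreter can locate the package after setting ``PYTHONPATH`` is to
ask the import system without actually importing it. This is a minimal sketch using only the
standard library; the helper name ``omnitrace_location`` is hypothetical.

.. code-block:: python

   import importlib.util


   def omnitrace_location():
       """Return the file the ``omnitrace`` package would load from, or None."""
       spec = importlib.util.find_spec("omnitrace")
       return None if spec is None else spec.origin


   location = omnitrace_location()
   print(location or "omnitrace not found; check PYTHONPATH")
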
Running Omnitrace on a Python script
========================================

Omnitrace provides an ``omnitrace-python`` helper bash script which
ensures ``PYTHONPATH`` is properly set and the correct Python interpreter is used.
This means the following commands are effectively equivalent:

.. code-block:: shell

   omnitrace-python --help

and

.. code-block:: shell

   export PYTHONPATH=/opt/omnitrace/lib/python3.8/site-packages:${PYTHONPATH}
   python3.8 -m omnitrace --help

.. note::

   ``omnitrace-python`` and ``python -m omnitrace`` use the same command-line syntax
   as the other ``omnitrace`` executables (``omnitrace-python <OMNITRACE_ARGS> -- <SCRIPT> <SCRIPT_ARGS>``)
   and have similar options.

Command line options
-----------------------------------

Use ``omnitrace-python --help`` to view the available options:

.. code-block:: shell

   usage: omnitrace [-h] [-v VERBOSITY] [-b] [-c FILE] [-s FILE] [-F [BOOL]] [--label [{args,file,line} [{args,file,line} ...]]] [-I FUNC [FUNC ...]] [-E FUNC [FUNC ...]] [-R FUNC [FUNC ...]] [-MI FILE [FILE ...]] [-ME FILE [FILE ...]] [-MR FILE [FILE ...]] [--trace-c [BOOL]]

   optional arguments:
     -h, --help            show this help message and exit
     -v VERBOSITY, --verbosity VERBOSITY
                           Logging verbosity
     -b, --builtin         Put 'profile' in the builtins. Use '@profile' to decorate a single function, or 'with profile:' to profile a single section of code.
     -c FILE, --config FILE
                           OmniTrace configuration file
     -s FILE, --setup FILE
                           Code to execute before the code to profile
     -F [BOOL], --full-filepath [BOOL]
                           Encode the full function filename (instead of basename)
     --label [{args,file,line} [{args,file,line} ...]]
                           Encode the function arguments, filename, and/or line number into the profiling function label
     -I FUNC [FUNC ...], --function-include FUNC [FUNC ...]
                           Include any entries with these function names
     -E FUNC [FUNC ...], --function-exclude FUNC [FUNC ...]
                           Filter out any entries with these function names
     -R FUNC [FUNC ...], --function-restrict FUNC [FUNC ...]
                           Select only entries with these function names
     -MI FILE [FILE ...], --module-include FILE [FILE ...]
                           Include any entries from these files
     -ME FILE [FILE ...], --module-exclude FILE [FILE ...]
                           Filter out any entries from these files
     -MR FILE [FILE ...], --module-restrict FILE [FILE ...]
                           Select only entries from these files
     --trace-c [BOOL]      Enable profiling C functions

   usage: python3 -m omnitrace <OMNITRACE_ARGS> -- <SCRIPT> <SCRIPT_ARGS>

.. note::

   The ``--trace-c`` option does not incorporate Omnitrace's dynamic instrumentation support.
   It only enables profiling the underlying C function call within the Python interpreter.

Selective instrumentation
-----------------------------------

Similar to the ``omnitrace-instrument`` executable, command-line options exist for restricting,
including, and excluding certain functions and modules, for example, ``--function-exclude "^__init__$"``.
Alternatively, add the ``@profile`` decorator to the primary function of interest
in your program and use the ``-b`` / ``--builtin`` command-line option to narrow the scope of the
instrumentation to this function and its children.

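The ``"^__init__$"`` example suggests that the filter options take regular expressions. Before
passing a pattern on the command line, it can help to preview which names it selects using
Python's ``re`` module. This is only a sketch: the helper ``preview_filter`` is hypothetical, and
it assumes search-style matching, which may differ from how Omnitrace applies the pattern.

.. code-block:: python

   import re


   def preview_filter(pattern, names):
       """Return the names that the regular expression matches anywhere."""
       regex = re.compile(pattern)
       return [name for name in names if regex.search(name)]


   names = ["__init__", "__init__subclass", "run", "fib", "inefficient"]

   # The anchors ^ and $ restrict the match to the exact name
   print(preview_filter(r"^__init__$", names))  # ['__init__']

   # Alternation selects several functions with one pattern
   print(preview_filter(r"fib|run", names))  # ['run', 'fib']
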
Consider the following Python code (``example.py``):

.. code-block:: python

   import sys

   def fib(n):
       return n if n < 2 else (fib(n - 1) + fib(n - 2))


   def inefficient(n):
       a = 0
       for i in range(n):
           a += i
           for j in range(n):
               a += j
       return a


   def run(n):
       return fib(n) + inefficient(n)


   if __name__ == "__main__":
       run(20)

Running ``omnitrace-python ./example.py`` with ``OMNITRACE_PROFILE=ON`` and
``OMNITRACE_TIMEMORY_COMPONENTS=trip_count`` produces the following:

.. code-block:: shell

   |-------------------------------------------------------------------------------------------|
   |                               COUNTS NUMBER OF INVOCATIONS                                |
   |-------------------------------------------------------------------------------------------|
   |                       LABEL                       | COUNT  | DEPTH  |   METRIC   |  SUM   |
   |---------------------------------------------------|--------|--------|------------|--------|
   | |0>>> run                                         |      1 |      0 | trip_count |      1 |
   | |0>>> |_fib                                       |      1 |      1 | trip_count |      1 |
   | |0>>>   |_fib                                     |      2 |      2 | trip_count |      2 |
   | |0>>>     |_fib                                   |      4 |      3 | trip_count |      4 |
   | |0>>>       |_fib                                 |      8 |      4 | trip_count |      8 |
   | |0>>>         |_fib                               |     16 |      5 | trip_count |     16 |
   | |0>>>           |_fib                             |     32 |      6 | trip_count |     32 |
   | |0>>>             |_fib                           |     64 |      7 | trip_count |     64 |
   | |0>>>               |_fib                         |    128 |      8 | trip_count |    128 |
   | |0>>>                 |_fib                       |    256 |      9 | trip_count |    256 |
   | |0>>>                   |_fib                     |    512 |     10 | trip_count |    512 |
   | |0>>>                     |_fib                   |   1024 |     11 | trip_count |   1024 |
   | |0>>>                       |_fib                 |   2026 |     12 | trip_count |   2026 |
   | |0>>>                         |_fib               |   3632 |     13 | trip_count |   3632 |
   | |0>>>                           |_fib             |   5020 |     14 | trip_count |   5020 |
   | |0>>>                             |_fib           |   4760 |     15 | trip_count |   4760 |
   | |0>>>                               |_fib         |   2942 |     16 | trip_count |   2942 |
   | |0>>>                                 |_fib       |   1152 |     17 | trip_count |   1152 |
   | |0>>>                                   |_fib     |    274 |     18 | trip_count |    274 |
   | |0>>>                                     |_fib   |     36 |     19 | trip_count |     36 |
   | |0>>>                                       |_fib |      2 |     20 | trip_count |      2 |
   | |0>>> |_inefficient                               |      1 |      1 | trip_count |      1 |
   |-------------------------------------------------------------------------------------------|

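The ``COUNT`` column in the table above is the number of ``fib`` invocations at each call depth.
The following plain-Python sketch reproduces those counts without Omnitrace by passing the depth
explicitly. The helper name ``count_calls_by_depth`` is hypothetical and exists only for
illustration; depth 1 corresponds to the call made from ``run``.

.. code-block:: python

   from collections import Counter


   def count_calls_by_depth(n, depth=1, counts=None):
       """Tally how many times fib would be entered at each call depth."""
       if counts is None:
           counts = Counter()
       counts[depth] += 1
       # Mirror fib's recursion: only arguments >= 2 recurse further
       if n >= 2:
           count_calls_by_depth(n - 1, depth + 1, counts)
           count_calls_by_depth(n - 2, depth + 1, counts)
       return counts


   counts = count_calls_by_depth(20)
   for depth in sorted(counts):
       print(depth, counts[depth])

The counts double at each level up to depth 11 and then shrink as more branches reach ``n < 2``.
For example, depth 12 has 2026 calls and depth 20 has 2.
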
If the ``inefficient`` function is decorated with ``@profile`` as follows:

.. code-block:: python

   @profile
   def inefficient(n):
       # ...

and the script is run using the command ``omnitrace-python -b -- ./example.py``, Omnitrace produces this output:

.. code-block:: shell

   |-----------------------------------------------------------|
   |               COUNTS NUMBER OF INVOCATIONS                |
   |-----------------------------------------------------------|
   |       LABEL       | COUNT  | DEPTH  |   METRIC   |  SUM   |
   |-------------------|--------|--------|------------|--------|
   | |0>>> inefficient |      1 |      0 | trip_count |      1 |
   |-----------------------------------------------------------|

Omnitrace Python source instrumentation
========================================

Starting with the unmodified ``example.py`` script above, import the ``omnitrace`` module:

.. code-block:: python

   import sys
   import omnitrace  # import omnitrace

   def fib(n):
       # ... etc. ...

Next, add ``@omnitrace.profile()`` to the ``run`` function:

.. code-block:: python

   @omnitrace.profile()
   def run(n):
       # ...

Alternatively, use ``omnitrace.profile()`` as a context-manager around ``run(20)``:

.. code-block:: python

   if __name__ == "__main__":
       with omnitrace.profile():
           run(20)

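A script instrumented this way fails with ``ImportError`` on systems where the ``omnitrace``
package is not installed. One possible workaround, shown here as a sketch and not as an Omnitrace
feature, is to fall back to a no-op stand-in that supports both the decorator and the
context-manager forms. The stand-in class is an assumption of this example and is built on the
standard-library ``contextlib.ContextDecorator``.

.. code-block:: python

   import contextlib

   try:
       from omnitrace import profile
   except ImportError:

       class profile(contextlib.ContextDecorator):
           """No-op replacement used only when omnitrace is unavailable."""

           def __enter__(self):
               return self

           def __exit__(self, *exc_info):
               # Returning False lets any exception propagate unchanged
               return False


   @profile()
   def run(n):
       return sum(range(n))


   with profile():
       result = run(5)

   print(result)
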
The results for both of the source-level instrumentation modes are identical to the
original ``omnitrace-python ./example.py`` results:

.. code-block:: shell

   |-------------------------------------------------------------------------------------------|
   |                               COUNTS NUMBER OF INVOCATIONS                                |
   |-------------------------------------------------------------------------------------------|
   |                       LABEL                       | COUNT  | DEPTH  |   METRIC   |  SUM   |
   |---------------------------------------------------|--------|--------|------------|--------|
   | |0>>> run                                         |      1 |      0 | trip_count |      1 |
   | |0>>> |_fib                                       |      1 |      1 | trip_count |      1 |
   | |0>>>   |_fib                                     |      2 |      2 | trip_count |      2 |
   | |0>>>     |_fib                                   |      4 |      3 | trip_count |      4 |
   | |0>>>       |_fib                                 |      8 |      4 | trip_count |      8 |
   | |0>>>         |_fib                               |     16 |      5 | trip_count |     16 |
   | |0>>>           |_fib                             |     32 |      6 | trip_count |     32 |
   | |0>>>             |_fib                           |     64 |      7 | trip_count |     64 |
   | |0>>>               |_fib                         |    128 |      8 | trip_count |    128 |
   | |0>>>                 |_fib                       |    256 |      9 | trip_count |    256 |
   | |0>>>                   |_fib                     |    512 |     10 | trip_count |    512 |
   | |0>>>                     |_fib                   |   1024 |     11 | trip_count |   1024 |
   | |0>>>                       |_fib                 |   2026 |     12 | trip_count |   2026 |
   | |0>>>                         |_fib               |   3632 |     13 | trip_count |   3632 |
   | |0>>>                           |_fib             |   5020 |     14 | trip_count |   5020 |
   | |0>>>                             |_fib           |   4760 |     15 | trip_count |   4760 |
   | |0>>>                               |_fib         |   2942 |     16 | trip_count |   2942 |
   | |0>>>                                 |_fib       |   1152 |     17 | trip_count |   1152 |
   | |0>>>                                   |_fib     |    274 |     18 | trip_count |    274 |
   | |0>>>                                     |_fib   |     36 |     19 | trip_count |     36 |
   | |0>>>                                       |_fib |      2 |     20 | trip_count |      2 |
   | |0>>> |_inefficient                               |      1 |      1 | trip_count |      1 |
   |-------------------------------------------------------------------------------------------|

.. note::

   When ``omnitrace-python`` is used without built-ins, the profiling results can be cluttered by the
   numerous functions called when more complex modules are imported, such as ``import numpy``.

Omnitrace Python source instrumentation configuration
-------------------------------------------------------------

Within the Python source code, the profiler can be configured by directly
modifying the ``omnitrace.profiler.config`` data fields.

.. code-block:: python

   import sys

   def fib(n):
       return n if n < 2 else (fib(n - 1) + fib(n - 2))


   def inefficient(n):
       a = 0
       for i in range(n):
           a += i
           for j in range(n):
               a += j
       return a


   def run(n):
       return fib(n) + inefficient(n)


   if __name__ == "__main__":
       from omnitrace.profiler import config
       from omnitrace import profile

       config.include_args = True
       config.include_filename = False
       config.include_line = False
       config.restrict_functions += ["fib", "run"]

       with profile():
           run(5)

Executing this script produces the following:

.. code-block:: shell

   |------------------------------------------------------------------|
   |                   COUNTS NUMBER OF INVOCATIONS                   |
   |------------------------------------------------------------------|
   |          LABEL           | COUNT  | DEPTH  |   METRIC   |  SUM   |
   |--------------------------|--------|--------|------------|--------|
   | |0>>> run(n=5)           |      1 |      0 | trip_count |      1 |
   | |0>>> |_fib(n=5)         |      1 |      1 | trip_count |      1 |
   | |0>>>   |_fib(n=4)       |      1 |      2 | trip_count |      1 |
   | |0>>>     |_fib(n=3)     |      1 |      3 | trip_count |      1 |
   | |0>>>       |_fib(n=2)   |      1 |      4 | trip_count |      1 |
   | |0>>>         |_fib(n=1) |      1 |      5 | trip_count |      1 |
   | |0>>>         |_fib(n=0) |      1 |      5 | trip_count |      1 |
   | |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
   | |0>>>     |_fib(n=2)     |      1 |      3 | trip_count |      1 |
   | |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
   | |0>>>       |_fib(n=0)   |      1 |      4 | trip_count |      1 |
   | |0>>>   |_fib(n=3)       |      1 |      2 | trip_count |      1 |
   | |0>>>     |_fib(n=2)     |      1 |      3 | trip_count |      1 |
   | |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
   | |0>>>       |_fib(n=0)   |      1 |      4 | trip_count |      1 |
   | |0>>>     |_fib(n=1)     |      1 |      3 | trip_count |      1 |
   |------------------------------------------------------------------|