dfaa4dc9c5
* Add Sphinx and Read the Docs configs
* Add documentation workflow configurations
* Changed macros verbprintf and verbprintf_bare so they write to stdout… (#346)
Flush stdout when listing keys + bump verbose level for GPU count
* Removing static version asserts. (#347)
It is causing failures on our internal builds
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Check for an empty vector before popping (#350)
Protect from possible seg. fault
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Add release links to installation.md (#351)
* Initial infrastructure rework for Omnitrace refactoring and a rewrite of the What is file
* Add files in conceptual section, along with images and infrastructure changes.
* Formatting and style fixes for files in conceptual directory
* Add quick start install guide and fix spelling errors in other files
* Add install document and fix code tags. Infrastructure changes
* Add two how-to guides along with infra changes and spelling fixes
* Add two new how to files and fix errors in the last commit
* Fix spelling mistakes
* Add new how to file on causal profiling and infra changes.
* Add how to file on interpreting Omnitrace output, fixes, and images
* Add remaining how-to guides and reference materials along with fixes and infrastructure
* Add YouTube file and fix spelling and formatting
* Fix a few loose ends and add link to license page
* Add Sphinx and Doxygen infrastructure and some additional corrections
* Update rocm-docs-core
* Fix Doxyfile
* Fix path to API header files
* Run doxysphinx in conf.py
* Add back custom css for doxygen
* Remove doxygenlayout
* Add api to toc
* Update Doxyfile
Generate from source .in
* Proofreading edits and other changes
* Add .gitignore for Doxygen and remove deprecated words and typos
* Fix one additional typo
* Turn off dot
* Update doxyfile strip from path
* Workflow, submodules, and thread info Updates (#352)
* Update CI workflows
- use node20 workflow packages
* Update tests/source/CMakeLists.txt
- Use OMNITRACE_TRACE and OMNITRACE_PROFILE instead of perfetto/timemory
* Update timemory submodule
- argparse: requires -> required
- parse callbacks
* Update thread_info.cpp
- fix causal::delay::get_local usage
* Update timemory submodule
* Update kokkos submodule
- release 3.7.02
* Revert opensuse.yml and ubuntu-bionic.yml to use node16 workflows
* Update docs.yml
* ROCm 6.1 Installers (#349)
* Add ROCm 6.1 to packages
* Bump version to 1.11.3
* Add 6.1 support to the docker build support.
Simplified this by adding 6.* to case statements, now that repo links have been standardized.
* Update timemory submodule (#354)
- fix argparse::argument::required template deduction
* Build omnitrace-rt library (#355)
* Build omnitrace-rt library
- Explicitly build dyninstAPI_RT as omnitrace-rt so that the SONAME in the ELF is omnitrace-rt instead of dyninstAPI_RT
- Create symbolic link lib/omnitrace/libdyninstAPI_RT.so which points to lib/libomnitrace-rt.so
- Simplify build tree location of libomnitrace-rt.so since it is ../lib from the bin directory even in the build tree
- Update dyninst submodule with minor tweaks to dyninstAPI_RT/CMakeLists.txt
* Update source/lib/omnitrace-rt/cmake/platform.cmake
* Use ftpmirror.gnu.org instead of ftp.gnu.org
- in timemory and dyninst submodules
- minor .clang-tidy tweak
* Executables append omnitrace library directory to LD_LIBRARY_PATH (#356)
- omnitrace-run, omnitrace-sample, and omnitrace-causal now automatically append the LD_LIBRARY_PATH with the directory containing the omnitrace libraries
- this helps ensure that binary rewritten exes can resolve omnitrace-rt library location
* Fix a few typos and formatting issues
* Additional fixes and minor formatting changes.
* More fixes and minor formatting changes.
* Complete second proofreading with fixes and minor formatting changes.
* Make changes to table of contents and disable linting
* Update links in the README doc to reflect the new structure.
* Align intro on the Omnitrace index page with the first paragraph of the what-is page
* Changes and edits based on review comments
* Additional changes and edits based on external review
* Additional updates and changes from the external review of Omnitrace
* Additional changes based on the external review
* New round of edits based on the external review
* Additional edits based on the external review
* Changes to address comments from the internal review
* Correction to the RHEL SELinux note in the troubleshooting guide
* One additional change to the development guide code example
* Move troubleshooting to post-install of install.rst and other minor edits.
* Remove troubleshooting page and modify new post-install troubleshooting section on install.rst
* Refactor the how Omnitrace works page into separate topics and redo infrastructure
* API ToC changes
* Additional API and ToC changes
* Back out API and ToC changes and update requirements.txt
* Additional API and ToC changes
* Add commit for signing purposes
* Add ElfUtils and BinUtils Download URL Overrides (#358)
* Add CMake CACHE Variable ElfUtils_DOWNLOAD_URL
Used to override the default URL to download ElfUtils from.
Useful for internal builds
Also, include a mirror to fallback to if the override URL fails.
* Update timemory submodule
Updating to include the BINUTIL_DOWNLOAD_URL override cmake
variable.
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Remove Ubuntu 18.04 and SUSE 15.2
* Update checkout action to v4
* Add `docs/**` to `paths-ignore`
Document location is being refactored.
* Modified submodules dyninst and timemory. (#361)
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Peter Jun Park <peter.park@amd.com>
Co-authored-by: ajanicijamd <Aleksandar.Janicijevic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
[ROCm/rocprofiler-systems commit: 0689797736]
335 lines
16 KiB
ReStructuredText
.. meta::
   :description: Omnitrace documentation and reference
   :keywords: Omnitrace, ROCm, profiler, tracking, visualization, tool, Instinct, accelerator, AMD

****************************************************
Profiling Python scripts
****************************************************

`Omnitrace <https://github.com/ROCm/omnitrace>`_ supports profiling Python code at the
source level and the script level.
Python support is enabled via the ``OMNITRACE_USE_PYTHON`` and
``OMNITRACE_PYTHON_VERSION=<MAJOR>.<MINOR>`` CMake options.
Alternatively, to build for multiple Python versions, use
``OMNITRACE_PYTHON_VERSIONS="<MAJOR>.<MINOR>;[<MAJOR>.<MINOR>]"``
and ``OMNITRACE_PYTHON_ROOT_DIRS="/path/to/version;[/path/to/version]"`` instead of ``OMNITRACE_PYTHON_VERSION``.
When building for multiple Python versions, the ``OMNITRACE_PYTHON_VERSIONS``
and ``OMNITRACE_PYTHON_ROOT_DIRS`` lists must have the same length.

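The same-length rule for the two lists can be checked before invoking CMake. The following is a minimal sketch: the helper name ``python_cmake_args`` and the example paths are hypothetical, and only the option names come from the text above.

```python
def python_cmake_args(versions, root_dirs):
    """Build -D flags for a multi-version Python build (hypothetical helper)."""
    if len(versions) != len(root_dirs):
        raise ValueError("versions and root_dirs must have the same length")
    return [
        "-DOMNITRACE_USE_PYTHON=ON",
        # CMake lists are semicolon-separated strings.
        "-DOMNITRACE_PYTHON_VERSIONS=" + ";".join(versions),
        "-DOMNITRACE_PYTHON_ROOT_DIRS=" + ";".join(root_dirs),
    ]


# The paths below are placeholders, not real install locations.
args = python_cmake_args(["3.8", "3.9"], ["/path/to/py38", "/path/to/py39"])
print(" ".join(args))
```

Quote each argument when passing it through a shell, because the semicolons would otherwise be interpreted as command separators.
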
.. note::

   When using Omnitrace with Python programs, the Python interpreter major and minor version (e.g. 3.7)
   must match the interpreter major and minor version
   used when compiling the Python bindings. When building Omnitrace,
   the shared object file ``libpyomnitrace.<IMPL>-<VERSION>-<ARCH>-<OS>-<ABI>.so`` is generated
   where ``IMPL`` is the Python implementation, ``VERSION`` is the major and minor
   version, ``ARCH`` is the architecture,
   ``OS`` is the operating system, and ``ABI`` is the application binary interface,
   for example, ``libpyomnitrace.cpython-38-x86_64-linux-gnu.so``.

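To see which bindings file a given interpreter would need, the interpreter can report its own extension-module suffix. This sketch assumes the file name is ``libpyomnitrace`` followed by that suffix, which matches the example in the note but is not confirmed for every platform.

```python
import sys
import sysconfig

# EXT_SUFFIX is the suffix CPython uses for compiled extension modules,
# for example ".cpython-38-x86_64-linux-gnu.so" on 64-bit Linux with Python 3.8.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")

# Assumption: the bindings file is "libpyomnitrace" followed by this suffix.
expected_library = f"libpyomnitrace{ext_suffix}"

print(f"Interpreter version: {sys.version_info.major}.{sys.version_info.minor}")
print(f"Expected bindings file: {expected_library}")
```

If no file with this name exists in the Omnitrace installation, the bindings were probably built for a different interpreter version.
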
Getting Started
========================================

The Omnitrace Python package is installed in ``lib/pythonX.Y/site-packages/omnitrace``.
To ensure the Python interpreter can find the Omnitrace package,
add this path to the ``PYTHONPATH`` environment variable, as in the following example:

.. code-block:: shell

   export PYTHONPATH=/opt/omnitrace/lib/python3.8/site-packages:${PYTHONPATH}

Both the ``share/omnitrace/setup-env.sh`` script and the module file in
``share/modulefiles/omnitrace`` automatically handle the prefixing of the ``PYTHONPATH``
environment variable.

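Whether the package is actually visible can be checked from inside the interpreter. The sketch below assumes the ``/opt/omnitrace`` prefix used in the example above; the helper name ``omnitrace_site_packages`` is hypothetical.

```python
import importlib.util
import sys


def omnitrace_site_packages(prefix="/opt/omnitrace"):
    """Return the site-packages path matching the running interpreter."""
    major, minor = sys.version_info[:2]
    return f"{prefix}/lib/python{major}.{minor}/site-packages"


path = omnitrace_site_packages()
if path not in sys.path:
    # For the current process, this is similar to prefixing PYTHONPATH.
    sys.path.insert(0, path)

# find_spec returns None when the package cannot be located.
found = importlib.util.find_spec("omnitrace") is not None
print(f"{path}: omnitrace importable = {found}")
```
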
Running Omnitrace on a Python script
========================================

Omnitrace provides an ``omnitrace-python`` helper bash script that
ensures ``PYTHONPATH`` is properly set and the correct Python interpreter is used.
This means the following commands are effectively equivalent:

.. code-block:: shell

   omnitrace-python --help

and

.. code-block:: shell

   export PYTHONPATH=/opt/omnitrace/lib/python3.8/site-packages:${PYTHONPATH}
   python3.8 -m omnitrace --help

.. note::

   ``omnitrace-python`` and ``python -m omnitrace`` use the same command-line syntax
   as the other ``omnitrace`` executables (``omnitrace-python <OMNITRACE_ARGS> -- <SCRIPT> <SCRIPT_ARGS>``)
   and have similar options.

Command line options
-----------------------------------

Use ``omnitrace-python --help`` to view the available options:

.. code-block:: shell

   usage: omnitrace [-h] [-v VERBOSITY] [-b] [-c FILE] [-s FILE] [-F [BOOL]] [--label [{args,file,line} [{args,file,line} ...]]] [-I FUNC [FUNC ...]] [-E FUNC [FUNC ...]] [-R FUNC [FUNC ...]] [-MI FILE [FILE ...]] [-ME FILE [FILE ...]] [-MR FILE [FILE ...]] [--trace-c [BOOL]]

   optional arguments:
     -h, --help            show this help message and exit
     -v VERBOSITY, --verbosity VERBOSITY
                           Logging verbosity
     -b, --builtin         Put 'profile' in the builtins. Use '@profile' to decorate a single function, or 'with profile:' to profile a single section of code.
     -c FILE, --config FILE
                           OmniTrace configuration file
     -s FILE, --setup FILE
                           Code to execute before the code to profile
     -F [BOOL], --full-filepath [BOOL]
                           Encode the full function filename (instead of basename)
     --label [{args,file,line} [{args,file,line} ...]]
                           Encode the function arguments, filename, and/or line number into the profiling function label
     -I FUNC [FUNC ...], --function-include FUNC [FUNC ...]
                           Include any entries with these function names
     -E FUNC [FUNC ...], --function-exclude FUNC [FUNC ...]
                           Filter out any entries with these function names
     -R FUNC [FUNC ...], --function-restrict FUNC [FUNC ...]
                           Select only entries with these function names
     -MI FILE [FILE ...], --module-include FILE [FILE ...]
                           Include any entries from these files
     -ME FILE [FILE ...], --module-exclude FILE [FILE ...]
                           Filter out any entries from these files
     -MR FILE [FILE ...], --module-restrict FILE [FILE ...]
                           Select only entries from these files
     --trace-c [BOOL]      Enable profiling C functions

   usage: python3 -m omnitrace <OMNITRACE_ARGS> -- <SCRIPT> <SCRIPT_ARGS>

.. note::

   The ``--trace-c`` option does not incorporate Omnitrace's dynamic instrumentation support.
   It only enables profiling the underlying C function call within the Python interpreter.

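When launching profiling runs from another Python program, the ``<OMNITRACE_ARGS> -- <SCRIPT> <SCRIPT_ARGS>`` layout can be assembled as an argument list. This is a minimal sketch: the helper name ``omnitrace_command`` is hypothetical, and the script argument ``20`` is only a placeholder.

```python
import shlex
import sys


def omnitrace_command(omnitrace_args, script, script_args=()):
    """Assemble `python -m omnitrace <OMNITRACE_ARGS> -- <SCRIPT> <SCRIPT_ARGS>`."""
    return [
        sys.executable, "-m", "omnitrace",
        *omnitrace_args,
        "--",  # everything after this belongs to the script
        script,
        *script_args,
    ]


cmd = omnitrace_command(["-b"], "./example.py", ["20"])
print(shlex.join(cmd))
# To launch it, pass the list to subprocess.run(cmd, check=True).
```

Using ``sys.executable`` keeps the interpreter version the same as the one running the launcher, which matters because the bindings are version specific.
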
Selective instrumentation
-----------------------------------

Similar to the ``omnitrace-instrument`` executable, command-line options exist for restricting,
including, and excluding certain functions and modules, for example, ``--function-exclude "^__init__$"``.
Alternatively, add the ``@profile`` decorator to the primary function of interest
in your program and use the ``-b`` / ``--builtin`` command-line option to narrow the scope of the
instrumentation to this function and its children.

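The ``^`` and ``$`` anchors in the example pattern matter. The sketch below uses Python's ``re`` module to show which names an anchored and an unanchored pattern would exclude. It assumes the filters are regular expressions searched against each function name, which the example pattern suggests but this page does not state; the function names are made up for illustration.

```python
import re

function_names = ["__init__", "init", "__init__subclass", "run"]

# Anchored: matches only the exact name "__init__".
anchored = re.compile(r"^__init__$")
kept = [name for name in function_names if not anchored.search(name)]
print(kept)  # ['init', '__init__subclass', 'run']

# Unanchored: matches any name containing "init".
loose = re.compile(r"init")
kept_loose = [name for name in function_names if not loose.search(name)]
print(kept_loose)  # ['run']
```
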
Consider the following Python code (``example.py``):

.. code-block:: python

   import sys

   def fib(n):
       return n if n < 2 else (fib(n - 1) + fib(n - 2))


   def inefficient(n):
       a = 0
       for i in range(n):
           a += i
           for j in range(n):
               a += j
       return a


   def run(n):
       return fib(n) + inefficient(n)


   if __name__ == "__main__":
       run(20)

Running ``omnitrace-python ./example.py`` with ``OMNITRACE_PROFILE=ON`` and
``OMNITRACE_TIMEMORY_COMPONENTS=trip_count`` produces the following:

.. code-block:: shell

   |-------------------------------------------------------------------------------------------|
   |                               COUNTS NUMBER OF INVOCATIONS                                |
   |-------------------------------------------------------------------------------------------|
   |                       LABEL                       | COUNT  | DEPTH  |   METRIC   |  SUM   |
   |---------------------------------------------------|--------|--------|------------|--------|
   | |0>>> run                                         |      1 |      0 | trip_count |      1 |
   | |0>>> |_fib                                       |      1 |      1 | trip_count |      1 |
   | |0>>>   |_fib                                     |      2 |      2 | trip_count |      2 |
   | |0>>>     |_fib                                   |      4 |      3 | trip_count |      4 |
   | |0>>>       |_fib                                 |      8 |      4 | trip_count |      8 |
   | |0>>>         |_fib                               |     16 |      5 | trip_count |     16 |
   | |0>>>           |_fib                             |     32 |      6 | trip_count |     32 |
   | |0>>>             |_fib                           |     64 |      7 | trip_count |     64 |
   | |0>>>               |_fib                         |    128 |      8 | trip_count |    128 |
   | |0>>>                 |_fib                       |    256 |      9 | trip_count |    256 |
   | |0>>>                   |_fib                     |    512 |     10 | trip_count |    512 |
   | |0>>>                     |_fib                   |   1024 |     11 | trip_count |   1024 |
   | |0>>>                       |_fib                 |   2026 |     12 | trip_count |   2026 |
   | |0>>>                         |_fib               |   3632 |     13 | trip_count |   3632 |
   | |0>>>                           |_fib             |   5020 |     14 | trip_count |   5020 |
   | |0>>>                             |_fib           |   4760 |     15 | trip_count |   4760 |
   | |0>>>                               |_fib         |   2942 |     16 | trip_count |   2942 |
   | |0>>>                                 |_fib       |   1152 |     17 | trip_count |   1152 |
   | |0>>>                                   |_fib     |    274 |     18 | trip_count |    274 |
   | |0>>>                                     |_fib   |     36 |     19 | trip_count |     36 |
   | |0>>>                                       |_fib |      2 |     20 | trip_count |      2 |
   | |0>>> |_inefficient                               |      1 |      1 | trip_count |      1 |
   |-------------------------------------------------------------------------------------------|

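The COUNT column for the ``fib`` rows can be cross-checked without a profiler by counting calls per recursion depth. This is a plain-Python sketch, independent of Omnitrace; the helper name ``count_calls_by_depth`` is hypothetical, and depth 1 corresponds to the ``fib(20)`` call made from ``run``.

```python
from collections import Counter


def count_calls_by_depth(n):
    """Count how many fib calls occur at each recursion depth."""
    counts = Counter()

    def fib(k, depth):
        counts[depth] += 1
        if k < 2:
            return k
        return fib(k - 1, depth + 1) + fib(k - 2, depth + 1)

    fib(n, 1)
    return counts


counts = count_calls_by_depth(20)
for depth in sorted(counts):
    print(f"depth {depth:2d}: {counts[depth]:5d} calls")
print("total:", sum(counts.values()))
```

The counts double at each level until the shorter branches of the recursion start to bottom out, which is why the values fall below powers of two from depth 12 onward.
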
Suppose the ``inefficient`` function is decorated with ``@profile`` as follows:

.. code-block:: python

   @profile
   def inefficient(n):
       # ...

Running the script with ``omnitrace-python -b -- ./example.py`` then produces this output:

.. code-block:: shell

   |-----------------------------------------------------------|
   |               COUNTS NUMBER OF INVOCATIONS                |
   |-----------------------------------------------------------|
   |       LABEL       | COUNT  | DEPTH  |   METRIC   |  SUM   |
   |-------------------|--------|--------|------------|--------|
   | |0>>> inefficient |      1 |      0 | trip_count |      1 |
   |-----------------------------------------------------------|

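A script decorated with a bare ``@profile`` raises ``NameError`` when it is run without the ``-b`` option, because ``profile`` is then not defined. One common workaround is a no-op fallback, sketched below. It assumes that ``-b`` exposes ``profile`` through the ``builtins`` module, as the option's help text indicates; the fallback supports decorator use only, not ``with profile:``.

```python
import builtins

# Use the injected `profile` when present; otherwise fall back to a
# decorator that returns the function unchanged.
profile = getattr(builtins, "profile", lambda func: func)


@profile
def inefficient(n):
    a = 0
    for i in range(n):
        a += i
        for j in range(n):
            a += j
    return a


print(inefficient(3))  # 12
```
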
Omnitrace Python source instrumentation
========================================

Starting with the unmodified ``example.py`` script above, import the ``omnitrace`` module:

.. code-block:: python

   import sys
   import omnitrace  # import omnitrace

   def fib(n):
       # ... etc. ...

Next, add ``@omnitrace.profile()`` to the ``run`` function:

.. code-block:: python

   @omnitrace.profile()
   def run(n):
       # ...

Alternatively, use ``omnitrace.profile()`` as a context-manager around ``run(20)``:

.. code-block:: python

   if __name__ == "__main__":
       with omnitrace.profile():
           run(20)

The results for both of the source-level instrumentation modes are identical to the
original ``omnitrace-python ./example.py`` results:

.. code-block:: shell

   |-------------------------------------------------------------------------------------------|
   |                               COUNTS NUMBER OF INVOCATIONS                                |
   |-------------------------------------------------------------------------------------------|
   |                       LABEL                       | COUNT  | DEPTH  |   METRIC   |  SUM   |
   |---------------------------------------------------|--------|--------|------------|--------|
   | |0>>> run                                         |      1 |      0 | trip_count |      1 |
   | |0>>> |_fib                                       |      1 |      1 | trip_count |      1 |
   | |0>>>   |_fib                                     |      2 |      2 | trip_count |      2 |
   | |0>>>     |_fib                                   |      4 |      3 | trip_count |      4 |
   | |0>>>       |_fib                                 |      8 |      4 | trip_count |      8 |
   | |0>>>         |_fib                               |     16 |      5 | trip_count |     16 |
   | |0>>>           |_fib                             |     32 |      6 | trip_count |     32 |
   | |0>>>             |_fib                           |     64 |      7 | trip_count |     64 |
   | |0>>>               |_fib                         |    128 |      8 | trip_count |    128 |
   | |0>>>                 |_fib                       |    256 |      9 | trip_count |    256 |
   | |0>>>                   |_fib                     |    512 |     10 | trip_count |    512 |
   | |0>>>                     |_fib                   |   1024 |     11 | trip_count |   1024 |
   | |0>>>                       |_fib                 |   2026 |     12 | trip_count |   2026 |
   | |0>>>                         |_fib               |   3632 |     13 | trip_count |   3632 |
   | |0>>>                           |_fib             |   5020 |     14 | trip_count |   5020 |
   | |0>>>                             |_fib           |   4760 |     15 | trip_count |   4760 |
   | |0>>>                               |_fib         |   2942 |     16 | trip_count |   2942 |
   | |0>>>                                 |_fib       |   1152 |     17 | trip_count |   1152 |
   | |0>>>                                   |_fib     |    274 |     18 | trip_count |    274 |
   | |0>>>                                     |_fib   |     36 |     19 | trip_count |     36 |
   | |0>>>                                       |_fib |      2 |     20 | trip_count |      2 |
   | |0>>> |_inefficient                               |      1 |      1 | trip_count |      1 |
   |-------------------------------------------------------------------------------------------|

.. note::

   When ``omnitrace-python`` is used without built-ins, the profiling results can be cluttered by the
   numerous functions called when more complex modules are imported, such as ``import numpy``.

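A script that uses source-level instrumentation fails at import time on machines where the ``omnitrace`` package is not installed. The sketch below keeps such a script runnable by substituting a no-op stand-in built on the standard ``contextlib.ContextDecorator``. The stand-in is an illustration only and says nothing about how Omnitrace implements ``profile``.

```python
from contextlib import ContextDecorator

try:
    from omnitrace import profile  # real profiler, if the package is importable
except ImportError:

    class profile(ContextDecorator):
        """No-op stand-in usable as ``@profile()`` or ``with profile():``."""

        def __enter__(self):
            return self

        def __exit__(self, *exc_info):
            return False  # never suppress exceptions


def fib(n):
    return n if n < 2 else (fib(n - 1) + fib(n - 2))


@profile()
def run(n):
    return fib(n)


decorated_result = run(10)

with profile():
    context_result = fib(10)

print(decorated_result, context_result)
```
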
Omnitrace Python source instrumentation configuration
-------------------------------------------------------------

Within the Python source code, the profiler can be configured by directly
modifying the ``omnitrace.profiler.config`` data fields.

.. code-block:: python

   import sys

   def fib(n):
       return n if n < 2 else (fib(n - 1) + fib(n - 2))


   def inefficient(n):
       a = 0
       for i in range(n):
           a += i
           for j in range(n):
               a += j
       return a


   def run(n):
       return fib(n) + inefficient(n)


   if __name__ == "__main__":
       from omnitrace.profiler import config
       from omnitrace import profile

       config.include_args = True
       config.include_filename = False
       config.include_line = False
       config.restrict_functions += ["fib", "run"]

       with profile():
           run(5)

Executing this script produces the following:

.. code-block:: shell

   |------------------------------------------------------------------|
   |                   COUNTS NUMBER OF INVOCATIONS                   |
   |------------------------------------------------------------------|
   |          LABEL           | COUNT  | DEPTH  |   METRIC   |  SUM   |
   |--------------------------|--------|--------|------------|--------|
   | |0>>> run(n=5)           |      1 |      0 | trip_count |      1 |
   | |0>>> |_fib(n=5)         |      1 |      1 | trip_count |      1 |
   | |0>>>   |_fib(n=4)       |      1 |      2 | trip_count |      1 |
   | |0>>>     |_fib(n=3)     |      1 |      3 | trip_count |      1 |
   | |0>>>       |_fib(n=2)   |      1 |      4 | trip_count |      1 |
   | |0>>>         |_fib(n=1) |      1 |      5 | trip_count |      1 |
   | |0>>>         |_fib(n=0) |      1 |      5 | trip_count |      1 |
   | |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
   | |0>>>     |_fib(n=2)     |      1 |      3 | trip_count |      1 |
   | |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
   | |0>>>       |_fib(n=0)   |      1 |      4 | trip_count |      1 |
   | |0>>>   |_fib(n=3)       |      1 |      2 | trip_count |      1 |
   | |0>>>     |_fib(n=2)     |      1 |      3 | trip_count |      1 |
   | |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
   | |0>>>       |_fib(n=0)   |      1 |      4 | trip_count |      1 |
   | |0>>>     |_fib(n=1)     |      1 |      3 | trip_count |      1 |
   |------------------------------------------------------------------|

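The LABEL and DEPTH columns above can be reproduced with the standard ``sys.setprofile`` hook, which helps when checking what a restriction list should yield. This is a sketch for illustration only: the helper ``trace_calls`` is hypothetical, and Omnitrace's own mechanism may differ.

```python
import sys


def fib(n):
    return n if n < 2 else (fib(n - 1) + fib(n - 2))


def inefficient(n):
    return sum(i + sum(range(n)) for i in range(n))


def run(n):
    return fib(n) + inefficient(n)


def trace_calls(func, *args, restrict=("fib", "run")):
    """Return (label, depth) pairs for calls to the restricted functions."""
    records = []
    depth = 0

    def hook(frame, event, arg):
        nonlocal depth
        code = frame.f_code
        if code.co_name not in restrict:
            return  # ignore everything outside the restriction list
        if event == "call":
            names = code.co_varnames[: code.co_argcount]
            params = ", ".join(f"{k}={frame.f_locals[k]!r}" for k in names)
            records.append((f"{code.co_name}({params})", depth))
            depth += 1
        elif event == "return":
            depth -= 1

    sys.setprofile(hook)
    try:
        func(*args)
    finally:
        sys.setprofile(None)  # always remove the hook
    return records


records = trace_calls(run, 5)
for label, depth in records:
    print("  " * depth + label)
```

Because ``inefficient`` is not in the restriction list, it does not appear in the output, mirroring the effect of ``config.restrict_functions`` in the table above.
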