* Config updates
  - See PR #69 for details
  - change type of OMNITRACE_DL_VERBOSE
  - add "deprecated" category to OMNITRACE_ROCM_SMI_DEVICES
  - reduce size of perfetto shared memory size hint
  - deprecate OMNITRACE_OUTPUT_FILE in favor of OMNITRACE_PERFETTO_FILE
  - set papi event choices
  - read config file after reading command line
  - fix update of OMNITRACE_DL_VERBOSE
  - mark several settings as hidden
  - timemory update support hidden attribute for settings
  - rework get_perfetto_output_filename()
  - Hide settings from not available backends
* Rework omnitrace-avail to support dumping configurations
* Overwrite query, tests, output flag
  - Support using -O flag when dumping config
  - Support checking before overwriting existing config
  - Support --force to overwrite existing config
  - Fix get_component_info not including omnitrace components
  - Testing for dumping config
* Update documentation on omnitrace-avail
* Fix issue with timemory format + "/__w/"
* Update output prefix keys docs
* Rename --dump-config to --generate-config
* Hide MPI related options
  - OMNITRACE_PERFETTO_COMBINE_TRACES and OMNITRACE_COLLAPSE_PROCESSES are hidden w/o MPI support
Omnitrace: Application Profiling, Tracing, and Analysis
Omnitrace is an AMD open source research project and is not supported as part of the ROCm software stack.
Documentation
The full documentation for omnitrace is available at amdresearch.github.io/omnitrace.
Quick Start
Omnitrace Settings
Generate an omnitrace configuration file using omnitrace-avail -D omnitrace.cfg. Optionally, use omnitrace-avail -D omnitrace.cfg --all for
a verbose configuration file with descriptions, categories, etc. Modify the configuration file as desired, e.g. enable
perfetto, timemory, sampling, and process-level sampling by default
and tweak some sampling default values:
# ...
OMNITRACE_USE_PERFETTO = true
OMNITRACE_USE_TIMEMORY = true
OMNITRACE_USE_SAMPLING = true
OMNITRACE_USE_PROCESS_SAMPLING = true
# ...
OMNITRACE_SAMPLING_FREQ = 50
OMNITRACE_SAMPLING_CPUS = all
OMNITRACE_SAMPLING_GPUS = $env:HIP_VISIBLE_DEVICES
Once the configuration file is adjusted to your preferences, either export the path to this file via OMNITRACE_CONFIG_FILE=/path/to/omnitrace.cfg
or place this file in ${HOME}/.omnitrace.cfg to ensure these values are always read as the default. If you wish to change any of these settings,
you can override them via environment variables or by specifying an alternative OMNITRACE_CONFIG_FILE.
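As a minimal sketch of the workflow described above, the following writes a small configuration file, points omnitrace at it, and then overrides one value from the environment. The scratch directory and the override value of 100 are illustrative assumptions; the setting names are the ones shown in the configuration excerpt.

```shell
# Write a small config to a scratch directory (hypothetical location).
cfg_dir="$(mktemp -d)"
cat > "${cfg_dir}/omnitrace.cfg" <<'EOF'
OMNITRACE_USE_PERFETTO = true
OMNITRACE_USE_SAMPLING = true
OMNITRACE_SAMPLING_FREQ = 50
EOF

# Point omnitrace at this file for the current shell session.
export OMNITRACE_CONFIG_FILE="${cfg_dir}/omnitrace.cfg"

# One-off override: a value set in the environment replaces the one
# read from the config file (100 is an arbitrary example value).
export OMNITRACE_SAMPLING_FREQ=100
```

Copying the file to ${HOME}/.omnitrace.cfg instead of exporting OMNITRACE_CONFIG_FILE makes the same values the default for every session.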
Omnitrace Executable
The omnitrace executable is used to instrument an existing binary.
omnitrace --help
omnitrace <omnitrace-options> -- <exe-or-library> <exe-options>
Binary Rewrite
Rewrite the text section of an executable or library with instrumentation:
omnitrace -o app.inst -- /path/to/app
In binary rewrite mode, if you also want instrumentation in the linked libraries, you must also rewrite those libraries.
Example of rewriting the functions starting with "hip" with instrumentation in the amdhip64 library:
mkdir -p ./lib
omnitrace -R '^hip' -o ./lib/libamdhip64.so.4 -- /opt/rocm/lib/libamdhip64.so.4
export LD_LIBRARY_PATH=${PWD}/lib:${LD_LIBRARY_PATH}
Verify via ldd that your executable will load the instrumented library. If you built your executable with an RPATH to the original library's directory, then prefixing LD_LIBRARY_PATH will have no effect.
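One way to do this check is to filter the ldd output for the library in question. The helper below is a sketch: resolved_lib is a hypothetical name, not an omnitrace tool, and it assumes the usual "name => path (address)" layout of ldd output.

```shell
# resolved_lib is a hypothetical helper, not part of omnitrace.
# It reads ldd output on stdin and prints the path the dynamic loader
# resolved for any library whose name matches the pattern in $1.
resolved_lib() {
    awk -v pat="$1" '$1 ~ pat && $2 == "=>" { print $3 }'
}

# Usage (assumes ./app.inst exists and links against libamdhip64):
#   ldd ./app.inst | resolved_lib libamdhip64
# The printed path should point into ${PWD}/lib. If it still points to
# /opt/rocm/lib, inspect the embedded search paths with:
#   readelf -d ./app.inst | grep -E 'RPATH|RUNPATH'
```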
Once you have rewritten your executable and/or libraries with instrumentation, run the instrumented executable, or the executable that loads the instrumented libraries, as you normally would, e.g.:
./app.inst
If you want to change the default value of certain settings in a binary rewrite, use the --env option. This omnitrace option
sets the environment variable to the given value only when it is not already set, so a value exported in the runtime environment
still takes precedence. E.g. the default value of OMNITRACE_PERFETTO_BUFFER_SIZE_KB is 1024000 KB (approximately 1 GiB):
# buffer size defaults to 1024000
omnitrace -o app.inst -- /path/to/app
./app.inst
Passing --env OMNITRACE_PERFETTO_BUFFER_SIZE_KB=5120000 will change the default value in app.inst to 5120000 KB (approximately 5 GiB):
# defaults to 5 GiB buffer size
omnitrace -o app.inst --env OMNITRACE_PERFETTO_BUFFER_SIZE_KB=5120000 -- /path/to/app
./app.inst
# override default 5 GiB buffer size to 200 MB
export OMNITRACE_PERFETTO_BUFFER_SIZE_KB=200000
./app.inst
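The "set a default without overriding" behavior is analogous to the shell's default-value parameter expansion. The sketch below is only an analogy in plain shell, not how omnitrace implements --env; apply_embedded_default is a hypothetical name.

```shell
# apply_embedded_default is a hypothetical helper that mimics the
# documented --env semantics: use 5120000 only when the variable is
# not already present in the environment.
apply_embedded_default() {
    # ${VAR=value} assigns only if VAR is unset; an existing value is kept.
    : "${OMNITRACE_PERFETTO_BUFFER_SIZE_KB=5120000}"
    export OMNITRACE_PERFETTO_BUFFER_SIZE_KB
}

# Case 1: nothing set by the user, so the embedded default applies.
unset OMNITRACE_PERFETTO_BUFFER_SIZE_KB
apply_embedded_default
echo "unset case: ${OMNITRACE_PERFETTO_BUFFER_SIZE_KB}"

# Case 2: the user exported a value first, so it is left untouched.
export OMNITRACE_PERFETTO_BUFFER_SIZE_KB=200000
apply_embedded_default
echo "exported case: ${OMNITRACE_PERFETTO_BUFFER_SIZE_KB}"
```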
Runtime Instrumentation
Runtime instrumentation will not only instrument the text section of the executable but also the text sections of the
linked libraries. Thus, it may be useful to exclude those libraries via the -ME (module exclude) regex option
or exclude specific functions with the -E regex option.
omnitrace -- /path/to/app
omnitrace -ME '^(libhsa-runtime64|libz\.so)' -- /path/to/app
omnitrace -E 'rocr::atomic|rocr::core|rocr::HSA' -- /path/to/app
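Before a long instrumentation run, it can help to try an exclude pattern against a few candidate names. The sketch below uses grep -E for this, which assumes the pattern behaves the same under POSIX extended regex as under omnitrace's own regex engine; that holds for simple patterns like this one but is not guaranteed in general. kept_modules is a hypothetical helper and the library names are examples. Note that inside single quotes a single backslash is enough to escape the dot.

```shell
# kept_modules is a hypothetical helper, not part of omnitrace.
# It reads module names on stdin and prints those that the exclude
# pattern in $1 does NOT match, i.e. the modules left for instrumentation.
kept_modules() {
    # grep exits non-zero when every line is excluded; treat that as success.
    grep -Ev -- "$1" || true
}

printf '%s\n' libhsa-runtime64.so.1 libz.so.1 libamdhip64.so.4 |
    kept_modules '^(libhsa-runtime64|libz\.so)'
# prints: libamdhip64.so.4
```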
Visualizing Perfetto Results
Visit ui.perfetto.dev in your browser and open up the .proto file(s) created by omnitrace.
Merging the traces from rocprof and omnitrace
This section requires installing Julia.
Installing Julia
Julia is available via Linux package managers or may be available via a module. Debian-based distributions such as Ubuntu can run (as a super-user):
apt-get install julia
Once Julia is installed, install the necessary packages (this operation only needs to be performed once):
julia -e 'using Pkg; for name in ["JSON", "DataFrames", "Dates", "CSV", "Chain", "PrettyTables"]; Pkg.add(name); end'
Using rocprof externally for tracing is deprecated. The current version has built-in support for recording the GPU activity and HIP API calls. If you want to use an external rocprof, either configure CMake with -DOMNITRACE_USE_ROCTRACER=OFF or explicitly set OMNITRACE_ROCTRACER_ENABLED=OFF in the environment.
Use the omnitrace-merge.jl Julia script to merge rocprof and perfetto traces.
export OMNITRACE_USE_ROCTRACER=OFF
rocprof --hip-trace --roctx-trace --stats ./app.inst
omnitrace-merge.jl results.json omnitrace-app.inst-output/2021-09-02_01.03_PM/*.proto
Use Perfetto tracing with System Backend
Enable traced and perfetto in the background:
pkill traced
traced --background
perfetto --out ./omnitrace-perfetto.proto --txt -c ${OMNITRACE_ROOT}/share/omnitrace.cfg --background
Configure omnitrace to use the perfetto system backend:
export OMNITRACE_PERFETTO_BACKEND=system
And finally, execute your instrumented application. Either the binary rewritten application:
omnitrace -o ./myapp.inst -- ./myapp
./myapp.inst
Or with runtime instrumentation:
omnitrace -- ./myapp
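Because the system backend depends on several executables being available, a small pre-flight check can fail early with a clear message. This is a sketch only: require_tools is a hypothetical helper, not something omnitrace or perfetto provides.

```shell
# require_tools is a hypothetical helper, not part of omnitrace.
# It returns non-zero if any named executable is missing from PATH.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: only start the system backend when everything is available.
#   require_tools traced perfetto omnitrace && traced --background
```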



