Docs images [skip ci] (#55)
* Added images of perfetto in docs
* README images + updates
[ROCm/rocprofiler-systems commit: ae2ea090fb]
This commit is contained in:
committed by
GitHub
parent
14d8998ba0
commit
facd23b7bb
@@ -1,81 +1,32 @@
|
||||
# omnitrace: application tracing with static/dynamic binary instrumentation
|
||||
# Omnitrace: Application Profiling, Tracing, and Analysis
|
||||
|
||||
[](https://github.com/AMDResearch/omnitrace/actions/workflows/ubuntu-bionic.yml)
|
||||
[](https://github.com/AMDResearch/omnitrace/actions/workflows/ubuntu-focal-external.yml)
|
||||
[](https://github.com/AMDResearch/omnitrace/actions/workflows/ubuntu-focal.yml)
|
||||
[](https://github.com/AMDResearch/omnitrace/actions/workflows/ubuntu-focal-external-rocm.yml)
|
||||
|
||||
> ***[Omnitrace](https://github.com/AMDResearch/omnitrace) is an AMD research project and should***
|
||||
> ***not be treated as an official part of the ROCm software stack.***
|
||||
> ***[Omnitrace](https://github.com/AMDResearch/omnitrace) is an AMD open source research project and is not supported as part of the ROCm software stack.***
|
||||
|
||||
The documentation for omnitrace is available at [amdresearch.github.io/omnitrace](https://amdresearch.github.io/omnitrace/).
|
||||
## Documentation
|
||||
|
||||
## Using Omnitrace Executable
|
||||
The full documentation for [omnitrace](https://github.com/AMDResearch/omnitrace) is available at [amdresearch.github.io/omnitrace](https://amdresearch.github.io/omnitrace/).
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Omnitrace Settings
|
||||
|
||||
`omnitrace-avail -Sd` will provide a list of all the possible omnitrace settings, their current value, and a description of the setting
|
||||
when running an instrumented binary.
|
||||
|
||||
### Omnitrace Executable
|
||||
|
||||
The `omnitrace` executable is used to instrument an existing binary.
|
||||
|
||||
```shell
|
||||
omnitrace --help
|
||||
omnitrace <omnitrace-options> -- <exe-or-library> <exe-options>
|
||||
```
|
||||
|
||||
## Omnitrace Settings
|
||||
|
||||
`omnitrace-avail -Sd` will provide a list of all the possible omnitrace settings, their current value, and a description of the setting.
|
||||
|
||||
> ***Some settings may only affect the timemory backend.***
|
||||
|
||||
These settings can be set via environment variables or placed in a config file and specified via `OMNITRACE_CONFIG_FILE=/path/to/config/file`. The config file
|
||||
can be a text, JSON, or XML file. Some of the most relevant settings are provided below:
|
||||
|
||||
| Environment Variable | Default Value | Description |
|
||||
|--------------------------------------------|--------------------------|------------------------------------------------------------------------------------------------------------------|
|
||||
| `OMNITRACE_USE_PERFETTO` | `false` | Enable perfetto backend |
|
||||
| `OMNITRACE_USE_PID` | `true` | Enable tagging filenames with process identifier (either MPI rank or pid) |
|
||||
| `OMNITRACE_USE_ROCTRACER` | `true` | Enable ROCM tracing |
|
||||
| `OMNITRACE_USE_SAMPLING` | `true` | Enable statistical sampling of call-stack |
|
||||
| `OMNITRACE_USE_TIMEMORY` | `false` | Enable timemory backend |
|
||||
| `OMNITRACE_BACKEND` | `inprocess` | Specify the perfetto backend to activate. Options are: 'inprocess', 'system', or 'all' |
|
||||
| `OMNITRACE_BUFFER_SIZE_KB` | `1024000` | Size of perfetto buffer (in KB) |
|
||||
| `OMNITRACE_COUT_OUTPUT` | `false` | Write output to stdout |
|
||||
| `OMNITRACE_CRITICAL_TRACE` | `false` | Enable generation of the critical trace |
|
||||
| `OMNITRACE_CRITICAL_TRACE_BUFFER_COUNT` | `2000` | Number of critical trace records to store in thread-local memory before submitting to shared buffer |
|
||||
| `OMNITRACE_CRITICAL_TRACE_COUNT` | `0` | Number of critical trace to export (0 == all) |
|
||||
| `OMNITRACE_CRITICAL_TRACE_DEBUG` | `false` | Enable debugging for critical trace |
|
||||
| `OMNITRACE_CRITICAL_TRACE_NUM_THREADS` | `8` | Number of threads to use when generating the critical trace |
|
||||
| `OMNITRACE_CRITICAL_TRACE_PER_ROW` | `0` | How many critical traces per row in perfetto (0 == all in one row) |
|
||||
| `OMNITRACE_CRITICAL_TRACE_SERIALIZE_NAMES` | `false` | Include names in serialization of critical trace (mainly for debugging) |
|
||||
| `OMNITRACE_DIFF_OUTPUT` | `false` | Generate a difference output vs. a pre-existing output (see also: TIMEMORY_INPUT_PATH and TIMEMORY_INPUT_PREFIX) |
|
||||
| `OMNITRACE_FLAT_SAMPLING` | `false` | Ignore hierarchy in all statistical sampling entries |
|
||||
| `OMNITRACE_INSTRUMENTATION_INTERVAL` | `1` | Instrumentation only takes measurements once every N function calls (not statistical) |
|
||||
| `OMNITRACE_JSON_OUTPUT` | `true` | Write json output files |
|
||||
| `OMNITRACE_MEMORY_PRECISION` | `-1` | Set the precision for components with 'is_memory_category' type-trait |
|
||||
| `OMNITRACE_MEMORY_SCIENTIFIC` | `false` | Set the numerical reporting format for components with 'is_memory_category' type-trait |
|
||||
| `OMNITRACE_MEMORY_UNITS` | `""` | Set the units for components with 'uses_memory_units' type-trait |
|
||||
| `OMNITRACE_OUTPUT_FILE` | `""` | Perfetto filename |
|
||||
| `OMNITRACE_OUTPUT_PATH` | `omnitrace-{EXE}-output` | Explicitly specify the output folder for results |
|
||||
| `OMNITRACE_OUTPUT_PREFIX` | `""` | Explicitly specify a prefix for all output files |
|
||||
| `OMNITRACE_PRECISION` | `-1` | Set the global output precision for components |
|
||||
| `OMNITRACE_ROCTRACER_FLAT_PROFILE` | `false` | Ignore hierarchy in all kernels entries with timemory backend |
|
||||
| `OMNITRACE_ROCTRACER_HSA_ACTIVITY` | `false` | Enable HSA activity tracing support |
|
||||
| `OMNITRACE_ROCTRACER_HSA_API` | `false` | Enable HSA API tracing support |
|
||||
| `OMNITRACE_ROCTRACER_HSA_API_TYPES` | `""` | HSA API type to collect |
|
||||
| `OMNITRACE_ROCTRACER_TIMELINE_PROFILE` | `false` | Create unique entries for every kernel with timemory backend |
|
||||
| `OMNITRACE_SAMPLING_DELAY` | `1e-06` | Number of seconds to delay activating the statistical sampling |
|
||||
| `OMNITRACE_SAMPLING_FREQ` | `10` | Number of software interrupts per second when OMNITRACE_USE_SAMPLING=ON |
|
||||
| `OMNITRACE_SCIENTIFIC` | `false` | Set the global numerical reporting to scientific format |
|
||||
| `OMNITRACE_SETTINGS_DESC` | `false` | Provide descriptions when printing settings |
|
||||
| `OMNITRACE_SHMEM_SIZE_HINT_KB` | `40960` | Hint for shared-memory buffer size in perfetto (in KB) |
|
||||
| `OMNITRACE_TEXT_OUTPUT` | `true` | Write text output files |
|
||||
| `OMNITRACE_TIMELINE_SAMPLING` | `false` | Create unique entries for every sample when statistical sampling is enabled |
|
||||
| `OMNITRACE_TIMEMORY_COMPONENTS` | `wall_clock` | List of components to collect via timemory (see omnitrace-avail) |
|
||||
| `OMNITRACE_TIME_FORMAT` | `%F_%I.%M_%p` | Customize the folder generation when TIMEMORY_TIME_OUTPUT is enabled (see also: strftime) |
|
||||
| `OMNITRACE_TIME_OUTPUT` | `true` | Output data to subfolder w/ a timestamp (see also: TIMEMORY_TIME_FORMAT) |
|
||||
| `OMNITRACE_TIMING_PRECISION` | `6` | Set the precision for components with 'is_timing_category' type-trait |
|
||||
| `OMNITRACE_TIMING_SCIENTIFIC` | `false` | Set the numerical reporting format for components with 'is_timing_category' type-trait |
|
||||
| `OMNITRACE_TIMING_UNITS` | `""` | Set the units for components with 'uses_timing_units' type-trait |
|
||||
| `OMNITRACE_TREE_OUTPUT` | `true` | Write hierarchical json output files |
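
As a minimal sketch of how the settings in the table above might be applied through environment variables (the sampling frequency of 50 is an illustrative value, not a recommendation from the original README):

```shell
# Enable both backends explicitly (variable names taken from the table above).
export OMNITRACE_USE_PERFETTO=ON
export OMNITRACE_USE_TIMEMORY=ON

# Illustrative value: sample the call-stack 50 times per second instead of the default 10.
export OMNITRACE_SAMPLING_FREQ=50

# Show which OMNITRACE_* variables are set in the current shell.
env | grep '^OMNITRACE_' | sort

# If omnitrace-avail is installed, list all settings with their descriptions.
if command -v omnitrace-avail >/dev/null 2>&1; then
    omnitrace-avail -Sd
fi
```

The same variables could presumably be placed in the file referenced by `OMNITRACE_CONFIG_FILE`; the exact file syntax is not shown here and should be checked against the full documentation.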
|
||||
|
||||
### Example Omnitrace Instrumentation
|
||||
|
||||
#### Binary Rewrite
|
||||
|
||||
Rewrite the text section of an executable or library with instrumentation:
|
||||
@@ -130,7 +81,8 @@ export OMNITRACE_BUFFER_SIZE_KB=200000
|
||||
#### Runtime Instrumentation
|
||||
|
||||
Runtime instrumentation will not only instrument the text section of the executable but also the text sections of the
|
||||
linked libraries. Thus, it may be useful to exclude those libraries via the `-ME` (module exclude) regex option.
|
||||
linked libraries. Thus, it may be useful to exclude those libraries via the `-ME` (module exclude) regex option
|
||||
or exclude specific functions with the `-E` regex option.
|
||||
|
||||
```shell
|
||||
omnitrace -- /path/to/app
|
||||
@@ -138,37 +90,17 @@ omnitrace -ME '^(libhsa-runtime64|libz\\.so)' -- /path/to/app
|
||||
omnitrace -E 'rocr::atomic|rocr::core|rocr::HSA' -- /path/to/app
|
||||
```
|
||||
|
||||
## Miscellaneous Features and Caveats
|
||||
### Visualizing Perfetto Results
|
||||
|
||||
- You may need to increase the default perfetto buffer size (1 GiB) to capture all the information
|
||||
- E.g. `export OMNITRACE_BUFFER_SIZE_KB=10240000` increases the buffer size to 10 GiB
|
||||
- The omnitrace library has various settings which can be configured via environment variables; you can
|
||||
configure these settings to custom defaults with the omnitrace command-line tool via the `--env` option
|
||||
- E.g. to default to a buffer size of 5 GB, use `--env OMNITRACE_BUFFER_SIZE_KB=5120000`
|
||||
- This is particularly useful in binary rewrite mode
|
||||
- Perfetto tooling is enabled by default
|
||||
- Timemory tooling is disabled by default
|
||||
- Enabling or disabling one of the aforementioned tools without specifying the state of the other sets the other to the inverse state, e.g.
|
||||
- `OMNITRACE_USE_PERFETTO=OFF` yields the same result as `OMNITRACE_USE_TIMEMORY=ON`
|
||||
- `OMNITRACE_USE_PERFETTO=ON` yields the same result as `OMNITRACE_USE_TIMEMORY=OFF`
|
||||
- In order to enable _both_ timemory and perfetto, set both `OMNITRACE_USE_TIMEMORY=ON` and `OMNITRACE_USE_PERFETTO=ON`
|
||||
- Setting `OMNITRACE_USE_TIMEMORY=OFF` and `OMNITRACE_USE_PERFETTO=OFF` will disable all instrumentation but call-stack sampling (`OMNITRACE_USE_SAMPLING=ON`) is still available.
|
||||
- Use `omnitrace-avail -S` to view the various settings for timemory
|
||||
- Set `OMNITRACE_COMPONENTS="<comma-delimited-list-of-component-name>"` to control which components timemory collects
|
||||
- The list of components and their descriptions can be viewed via `omnitrace-avail -Cd`
|
||||
- The list of components and their string identifiers can be viewed via `omnitrace-avail -Cbs`
|
||||
- You can filter any `omnitrace-avail` results via `-r <regex> -hl`
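
The buffer-size examples in the list above follow the convention 1 GB = 1024000 KB. A small sketch of that arithmetic, where `buffer_gb` is a hypothetical helper variable and not an omnitrace setting:

```shell
# Hypothetical helper variable: desired perfetto buffer size in GB.
buffer_gb=5

# Convert to KB using the convention from the examples above (1 GB = 1024000 KB).
OMNITRACE_BUFFER_SIZE_KB=$((buffer_gb * 1024000))
export OMNITRACE_BUFFER_SIZE_KB

echo "OMNITRACE_BUFFER_SIZE_KB=${OMNITRACE_BUFFER_SIZE_KB}"
```

For 5 GB this gives the same value, `5120000`, that the `--env` example above passes on the command line.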
|
||||
Visit [ui.perfetto.dev](https://ui.perfetto.dev) in your browser and open up the `.proto` file(s) created by omnitrace.
|
||||
|
||||
## Omnitrace Output
|
||||

|
||||
|
||||
`omnitrace` will create an output directory named `omnitrace-<EXE_NAME>-output`, e.g. if your executable
|
||||
is named `app.inst`, the output directory will be `omnitrace-app.inst-output`. If
|
||||
`OMNITRACE_TIME_OUTPUT=ON` (the default when perfetto is enabled), there will be a subdirectory with the date and time,
|
||||
e.g. `2021-09-02_01.03_PM`. Within this directory, all perfetto files will be named `perfetto-trace.<PID>.proto` or
|
||||
when `OMNITRACE_USE_MPI=ON`, `perfetto-trace.<RANK>.proto` (assuming omnitrace was built with MPI support).
|
||||

|
||||
|
||||
You can explicitly control the output path and naming scheme of the files via the `OMNITRACE_OUTPUT_FILE` environment
|
||||
variable. The special character sequences `%pid%` and `%rank%` will be replaced with the PID or MPI rank, respectively.
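
To make the naming scheme above concrete, the following sketch only builds the expected path as a string; it does not run omnitrace. It assumes the default `OMNITRACE_TIME_FORMAT` of `%F_%I.%M_%p` and uses the current shell's PID as a stand-in for the PID of the instrumented process:

```shell
# Name of the instrumented executable (example name from the text above).
exe_name="app.inst"

# Output directory: omnitrace-<EXE_NAME>-output
output_dir="omnitrace-${exe_name}-output"

# Timestamped subdirectory, assuming the default time format %F_%I.%M_%p.
stamp="$(date '+%F_%I.%M_%p')"

# Stand-in for the PID of the instrumented process (here: this shell's PID).
pid=$$

echo "${output_dir}/${stamp}/perfetto-trace.${pid}.proto"
```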
|
||||

|
||||
|
||||

|
||||
|
||||
## Merging the traces from rocprof and omnitrace
|
||||
|
||||
@@ -196,7 +128,7 @@ julia -e 'using Pkg; for name in ["JSON", "DataFrames", "Dates", "CSV", "Chain",
|
||||
Use the `omnitrace-merge.jl` Julia script to merge rocprof and perfetto traces.
|
||||
|
||||
```shell
|
||||
export OMNITRACE_ROCTRACER_ENABLED=OFF
|
||||
export OMNITRACE_USE_ROCTRACER=OFF
|
||||
rocprof --hip-trace --roctx-trace --stats ./app.inst
|
||||
omnitrace-merge.jl results.json omnitrace-app.inst-output/2021-09-02_01.03_PM/*.proto
|
||||
```
|
||||
@@ -214,7 +146,7 @@ perfetto --out ./htrace.out --txt -c ${OMNITRACE_ROOT}/share/roctrace.cfg
|
||||
then in the window running the application, configure the omnitrace instrumentation to use the system backend:
|
||||
|
||||
```shell
|
||||
export OMNITRACE_BACKEND_SYSTEM=1
|
||||
export OMNITRACE_BACKEND=system
|
||||
```
|
||||
|
||||
for the merge use the `htrace.out`:
|
||||
|
||||
@@ -6,8 +6,7 @@
|
||||
:maxdepth: 4
|
||||
```
|
||||
|
||||
> ***[Omnitrace](https://github.com/AMDResearch/omnitrace) is an AMD research project and should***
|
||||
> ***not be treated as an official part of the ROCm software stack.***
|
||||
> ***[Omnitrace](https://github.com/AMDResearch/omnitrace) is an AMD open source research project and is not supported as part of the ROCm software stack.***
|
||||
|
||||
[Browse Omnitrace source code on Github](https://github.com/AMDResearch/omnitrace)
|
||||
|
||||
|
||||
Binary data
Binary file not shown.
|
After Width: | Height: | Size: 313 KiB |
Binary data
Binary file not shown.
|
After Width: | Height: | Size: 195 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 230 KiB |
Binary data
Binary file not shown.
|
After Width: | Height: | Size: 277 KiB |
@@ -220,7 +220,15 @@ set `OMNITRACE_OUTPUT_PREFIX="%argt%-"` and let omnitrace cleanly organize the o
|
||||
## Perfetto Output
|
||||
|
||||
Use the `OMNITRACE_OUTPUT_FILE` setting to specify a specific location. If this is an absolute path, then all `OMNITRACE_OUTPUT_PATH`, etc.
|
||||
settings will be ignored.
|
||||
settings will be ignored. Visit [ui.perfetto.dev](https://ui.perfetto.dev) and open this file.
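
A minimal sketch of setting an absolute output location, where the path `/tmp/omnitrace-example/perfetto-trace.proto` is purely hypothetical:

```shell
# Hypothetical absolute path for the perfetto trace file.
export OMNITRACE_OUTPUT_FILE="/tmp/omnitrace-example/perfetto-trace.proto"

# Check whether the path is absolute; per the text above, an absolute path
# means OMNITRACE_OUTPUT_PATH and related settings are ignored.
case "$OMNITRACE_OUTPUT_FILE" in
    /*) echo "absolute path: OMNITRACE_OUTPUT_PATH and related settings are ignored" ;;
    *)  echo "not an absolute path" ;;
esac
```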
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
## Timemory Output
|
||||
|
||||
|
||||