# Features

```eval_rst
.. toctree::
   :glob:
   :maxdepth: 4
```

## Overview

[Omnitrace](https://github.com/AMDResearch/omnitrace) is designed to be highly extensible. Internally, it leverages the
[timemory performance analysis toolkit](https://github.com/NERSC/timemory) to
manage extensions, resources, data, etc.

### Data Collection Modes

- Dynamic instrumentation
  - Runtime instrumentation
    - Instrument executable and shared libraries at runtime
  - Binary rewriting
    - Generate a new executable and/or library with instrumentation built-in
- Statistical sampling
  - Periodic software interrupts per-thread
- Background thread sampling
  - Record process and system-level values while an application executes
- Critical trace generation
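
To make the background thread sampling mode above more concrete, here is a minimal Python sketch of the general idea: a helper thread wakes at a fixed interval and records process-level values while the main thread runs the application. This is not Omnitrace's implementation. The `BackgroundSampler` class, the `workload` function, and the choice of recorded values are all hypothetical, and the `resource` module is only available on Unix-like systems.

```python
import resource
import threading
import time


class BackgroundSampler:
    """Hypothetical helper: samples process-level values from a background thread."""

    def __init__(self, interval=0.01):
        self.interval = interval  # seconds between samples
        self.samples = []         # (timestamp, process CPU seconds, peak RSS)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _record(self):
        usage = resource.getrusage(resource.RUSAGE_SELF)
        # ru_maxrss is reported in KiB on Linux and in bytes on macOS
        self.samples.append((time.monotonic(), time.process_time(), usage.ru_maxrss))

    def _run(self):
        # Event.wait() returns True once stop has been requested, ending the loop
        while not self._stop.wait(self.interval):
            self._record()

    def __enter__(self):
        self._record()  # sample once at the start
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        self._record()  # sample once at the end
        return False


def workload(n=200_000):
    """Stand-in for the application being measured."""
    return sum(i * i for i in range(n))


with BackgroundSampler(interval=0.005) as sampler:
    result = workload()

print(f"collected {len(sampler.samples)} samples")
```

Because the sampler runs on its own thread, the workload needs no modification. The trade-off is that values are only observed at the sampling interval, so short-lived spikes between samples can be missed.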

### Data Analysis

- Critical trace generation (beta)
- Support for

### Parallelism API Support

- Built-in MPI support
- Kokkos-Tools support

### GPU Metrics

- HIP API tracing
- ROCM HSA API tracing
- Kernel runtime tracing
- System-level sampling (via rocm-smi)
  - Memory usage
  - Power usage
  - Temperature
  - Utilization

### CPU Metrics

- CPU hardware counters sampling and profiles
- CPU frequency sampling
- Various timing metrics
  - Wall time
  - CPU time (process and/or thread)
  - CPU utilization (process and/or thread)
  - User CPU time
  - Kernel CPU time
- Various memory metrics
  - High-water mark (sampling and profiles)
  - Memory page allocation
  - Virtual memory usage
- Network statistics
- I/O metrics
- ... many more
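
As a rough illustration of how the timing metrics listed above relate to one another, the following Python sketch measures wall time, process and thread CPU time, user and kernel CPU time, and CPU utilization around a single function call. It uses only the Python standard library and is not how Omnitrace collects these values. The `measure` and `workload` names are hypothetical, and CPU utilization is assumed here to mean process CPU time divided by wall time.

```python
import os
import time


def measure(fn, *args):
    """Hypothetical helper: run fn(*args) and return (result, timing metrics)."""
    wall_start = time.perf_counter()   # wall clock
    proc_start = time.process_time()   # user + kernel CPU time of the process
    thread_start = time.thread_time()  # user + kernel CPU time of this thread
    times_start = os.times()           # separate user and kernel CPU times

    result = fn(*args)

    times_end = os.times()
    thread_cpu = time.thread_time() - thread_start
    proc_cpu = time.process_time() - proc_start
    wall = time.perf_counter() - wall_start

    metrics = {
        "wall_s": wall,
        "process_cpu_s": proc_cpu,
        "thread_cpu_s": thread_cpu,
        "user_cpu_s": times_end.user - times_start.user,
        "kernel_cpu_s": times_end.system - times_start.system,
        # Assumed definition: share of wall time the process spent on a CPU
        "cpu_utilization_pct": 100.0 * proc_cpu / wall if wall > 0 else 0.0,
    }
    return result, metrics


def workload(n=200_000):
    """Stand-in for the code region being measured."""
    return sum(i * i for i in range(n))


result, metrics = measure(workload)
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

For a single-threaded, CPU-bound region the utilization is typically close to 100%. Values well below that suggest time spent waiting (I/O, sleeping, locks), while values above 100% indicate work done on multiple threads.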

### Third-party API support

- OpenMP-Tools (OMPT)
- TAU
- LIKWID
- Caliper
- CrayPAT
- VTune
- NVTX
- ROCTX