ab8894082b107ff68147eafce70109411bcd0016
7 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
4ed5f3e67b |
rocprofiler_iterate_info workaround + omnitrace-avail update (#270)
* rocprofiler_iterate_info workaround + omnitrace-avail update
- provides workaround for rocprofiler_iterate_info behavior change in ROCm 5.4.0-3
- update timemory submodule with argparse tweaks
- updates hsa_rsrc_factory.{hpp,cpp}
- colorized log in omnitrace-avail
- Bump version to 1.9.2
* Fix empty_base inheritance
- timemory's component::empty_base inherits from concepts::component so direct inheritance was removed
* Fix OMNITRACE_HIP_VERSION_COMPAT_STRING
- defined as "" when OMNITRACE_HIP_VERSION_MAJOR==0
* new defines + extra info
- define OMNITRACE_LIBRARY_ARCH (via CMAKE_LIBRARY_ARCHITECTURE)
- define OMNITRACE_SYSTEM_NAME (via CMAKE_SYSTEM_NAME)
- define OMNITRACE_SYSTEM_PROCESSOR (via CMAKE_SYSTEM_PROCESSOR)
- define OMNITRACE_SYSTEM_VERSION (via CMAKE_SYSTEM_VERSION)
- define OMNITRACE_COMPILER_ID (via CMAKE_CXX_COMPILER_ID)
- define OMNITRACE_COMPILER_VERSION (via CMAKE_CXX_COMPILER_VERSION)
- include this info in metadata
- include subset of this info in --version for bin tools
- tweak to perfetto verbose messages
|
||
|
|
279a8e0952 |
Roctracer perfetto flow fixes (#267)
* testing label updates
- automatically add "gpu", "roctracer", "rocm-smi", and "rocprofiler" test labels when appropriate
* Bump version to v1.9.1
* roctracer and config updates
- fix perfetto::Flow
- use roctracer correlation ID instead of critical trace correlation ID
- renamed ambiguous _cid, _parent_cid, _corr_id variables to _crit_cid, _parent_crit_cid, _roct_cid
- use atomic_{mutex,lock} instead of STL mutex/lock
- support for individual perfetto annotations for HIP API args
- OMNITRACE_PERFETTO_COMPACT_ROCTRACER_ANNOTATIONS option for controlling compact vs. individual perfetto annotations for HIP API args
* Update timemory submodule
- argparser updates
- help prints to std::cout by default now
- supports setting custom ostream
* cmake formatting
* config::get_setting_value updates
- config::get_setting_value returns std::optional instead of std::pair<bool, Tp>
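The move from `std::pair<bool, Tp>` to `std::optional` can be illustrated with a minimal sketch. The `settings_map` alias and this free-function form of `get_setting_value` are hypothetical stand-ins for illustration, not OmniTrace's actual `config` API.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a settings store (not OmniTrace's real type).
using settings_map = std::map<std::string, std::string>;

// Returns the stored value when the setting exists, std::nullopt otherwise.
// With std::pair<bool, Tp>, callers had to check .first before trusting
// .second; std::optional makes the "not found" state explicit.
std::optional<std::string>
get_setting_value(const settings_map& settings, const std::string& name)
{
    auto itr = settings.find(name);
    if(itr == settings.end()) return std::nullopt;
    return itr->second;
}
```

A caller can then write `get_setting_value(settings, name).value_or(fallback)` instead of branching on a separate boolean.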
|
||
|
|
abe35de43a |
omnitrace-run executable - required for running binary writes (#257)
* omnitrace-run exe
- ensure LD_PRELOAD for libomnitrace-dl.so
- convert config options into command-line options
* Update timemory submodule
- updates to tsettings
- updates to argparser
* common environment update
- throw error if get_env<bool> has empty string
* config updates
- minor tweaks to categories of settings
* core lib update
- add argparse for common handling of argument parsers
* omnitrace-sample update
- fix handling of --trace-file (OMNITRACE_PERFETTO_FILE)
* omnitrace-run update
- updated to use omnitrace::argparse functions
* Tests for omnitrace-run
* argparse core update
- remove choices for --cpu-events and --gpu-events
* remove some debugging prints
* fix timemory include in argparse.cpp
* always provide --hsa-interrupt option
* Update source/lib/core/argparse.cpp
- fix pedantic warning
* Update testing
- remove testing args that may not be there in some builds
* roctracer/pthread_create fix
- disable roctracer_data when roctracer not enabled
* omnitrace-causal tweak
* omnitrace-instrument: module_function tweak
- allow DEFAULT_MODULE and LIBRARY_MODULE
* common environment update
- support get_env for enums
* core: config update
- Add "mode" category to OMNITRACE_MODE
* Update timemory submodule
- remove debug print statement
* omnitrace-sample tweak
- change var init
* omnitrace-run testing update
- use --help instead of -?
* core: common.hpp
- tweak header include style
* core: argparser update
- add_ld_preload func
- launcher and command member variables in parser_data
- support launcher
* omnitrace-run update
- clean up and reworked
* libomnitrace-dl updates
- require LD_PRELOAD with binary rewrite
- dl::InstrumentMode
- dl::get_instrumented()
- verify_instrumented_preloaded()
- omnitrace_set_instrumented(int)
- relocated omnitrace_main from main.c to dl.cpp
- omnitrace_set_env does not dlopen libomnitrace
- omnitrace_set_main(func_ptr) [internal API]
- OMNITRACE_HIDDEN_API -> OMNITRACE_INTERNAL_API
* Update testing to new LD_PRELOAD requirements
* omnitrace-instrument updates
- adhere to LD_PRELOAD requirements
- invoke omnitrace_set_instrumented
- binary rewrite does not instrument main
- binary rewrite does not instrument call to omnitrace_init
- runtime instr does not instrument main
- runtime instr does not instrument call to omnitrace_init
* Bump to v1.9.0
- LD_PRELOAD requirement necessitates minor version increment
* common: environment
- fix ambiguous get_env calls
* omnitrace-instrument update
- fix issue with temporaries
* omnitrace-instrument and libomnitrace-dl updates
- runtime instrumentation does not work if libomnitrace-dl is preloaded
* libomnitrace-dl and libpyomnitrace updates
- define dl::InstrumentMode in dl.hpp
- handle instrumentation via setprofile libpyomnitrace
- do not push trace in omnitrace_init
* omnitrace-instrument and libomnitrace-dl updates
- move header to dl subdirectory
- omnitrace::omnitrace-headers include omnitrace-dl folder
- use InstrumentMode in omnitrace-instrument
* Update workflows and scripts
- Use omnitrace-run on instrumented exes
* Update docs
- add omnitrace-run to examples of running binary rewritten exes
|
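The "throw error if get_env<bool> has empty string" item can be sketched as follows. This is an illustration only: `parse_bool_env` is a hypothetical helper, and the list of spellings treated as false is an assumption, not the behavior of the actual `get_env<bool>` implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <initializer_list>
#include <stdexcept>
#include <string>

// Hypothetical helper: interpret the text of a boolean environment variable.
// An empty value is rejected instead of being silently read as true or false.
bool
parse_bool_env(const std::string& name, std::string value)
{
    if(value.empty())
        throw std::runtime_error("environment variable '" + name +
                                 "' is set to an empty string");

    // Compare case-insensitively.
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Assumed spellings for "false"; any other non-empty value counts as true.
    for(const char* falsy : { "0", "off", "false", "no", "n", "f" })
        if(value == falsy) return false;
    return true;
}
```

Rejecting the empty string up front turns an ambiguous setting such as `OMNITRACE_USE_PID=` into an immediate, visible error.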
||
|
|
846301bcaf |
Address and thread sanitizer fixes (#250)
* Address and thread sanitizer fixes
- Fix compilation with clang
- Tweak perfetto copy to build tree
- Added suppression files to scripts
- fix LD_PRELOAD support in omnitrace-causal and omnitrace-sample
- use spin_mutex and spin_lock from timemory instead of atomic_mutex and atomic_lock
- state uses atomic
- fix some memory leaks
- tweak testing
- mpi tests do not use preload
- increase timeout when using sanitizers
- add env LD_PRELOAD when using sanitizers
* Tweak perfetto build
* Update timemory submodule
* Update version to 1.8.1
* Update omnitrace-leak.supp
* Update timemory submodule
- fixed spin_mutex implementation
* Remove previously added addr_space->allowTraps(instr_traps)
- this appears to cause errors during binary rewrite
* causal testing updates
- relaxed causal validation on CI systems (to account for hyperthreading decreasing prediction)
- improved impact calculation
- other general improvements to validate-causal-json.py
* Improve fork handling for perfetto
- numerous updates changing perfetto:: to ::perfetto::
- added perfetto_fwd.hpp
* Updated fork example
- user API for validation that stopping/starting perfetto is valid
* Misc fixes to perfetto + fork support
- tweak regions in fork example
- handle disabling tmp files
- get rid of stop/start with perfetto before/after fork
- fixed sampling support during fork
- tweak env of fork test
* Fix find_package in build-tree
* Fix buildtree export
* Fix buildtree export
* Restructured ConfigInstall before adding examples
* Guard against creating tmp file in sampling when disabled
* Fix buildtree package
* formatting
* exit handlers on child processes
- quick exit to avoid perfetto cleanup
* Further tweaking of causal tests for reliability
- enable PROCESSOR_AFFINITY
- decrease to 5 iterations
* Further tweaking of causal tests for reliability
- disable PROCESSOR_AFFINITY for fast func e2e tests
- enabling affinity results in (valid) speedup predictions greater than zero
* Fixes to fork handling
- use pthread_atfork for redundancy if fork_gotcha fails
* cmake formatting
* Fix fork init settings + install components
- remove dl from PROJECT_BUILD_TARGETS
* Testing tweaks
- fix mpi-binary-rewrite-run regex when OMNITRACE_VERBOSE set > 1 in env
- increase causal e2e iterations to 8
* Fix "Test User API"
- test-find-package.sh included dl component
* Further tweaks to causal validation
- further considerations of variance
|
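The `pthread_atfork` fallback mentioned above can be sketched as follows. This is a minimal illustration of the POSIX call, assuming a single child-side handler; the flag and function names are hypothetical and do not reflect OmniTrace's actual handlers.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>

// Hypothetical flag recording that the current process is a forked child.
static std::atomic<bool> g_is_child_process{ false };

// Runs in the child process right after fork() returns there.
extern "C" void
mark_child_process()
{
    g_is_child_process.store(true);
}

// Registers only a child-side handler; the prepare and parent handlers are
// left unset. Returns 0 on success, as pthread_atfork does.
int
register_fork_handlers()
{
    return pthread_atfork(nullptr, nullptr, &mark_child_process);
}
```

Because the handler is invoked by the C library on every `fork()`, it still fires when a wrapper around `fork` was not installed or was bypassed.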
||
|
|
32b15fe7b7 |
Handle fork in target application (#191)
* Always print PID in log messages
* omnitrace-dl updates
- omnitrace_preload does not call omnitrace_init or omnitrace_init_tooling
- omnitrace_preload calls omnitrace_set_mpi only if OMNITRACE_USE_MPI or OMNITRACE_USE_MPIP is true in the env; calling it otherwise either overrides OMNITRACE_USE_PID (when true) or disables mpip from initialization (when false), and the MPI init can be caught later and override OMNITRACE_USE_PID
* config updates
- set_setting_value sets user update type
- remove volatile from get_settings_configured
- don't override settings::default_process_suffix
- don't kill process in omnitrace_exit_action
- set_state ignores updating state if >= State::Finalized
* Handle state > State::Finalized
* fork gotcha updates
- unsets LD_PRELOAD
- sets OMNITRACE_ROOT_PROCESS
- sets OMNITRACE_CHILD_PROCESS
* libomnitrace library.cpp updates
- basic_bundle for fini metrics
- handle finalization from child process
* sampling updates
- sampling::shutdown handles when child process
* Add example and test using fork
* Update run-ci script to support not submitting
* Tweak test envs
* Update build flags when codecov enabled
* remove unnecessary includes of sampling header
* Replace mpi copy/fini static lambda with free-funcs
* Update codecov job
* Fix OMPT segfaults after finalization
* Miscellaneous updates after rebase
* fixes for causal profiling
* revert some run-ci.sh changes
* Disable storing env in sampling::shutdown
* formatting fix
* Update timemory submodule
- fixed occasional synchronization issues with allocator offloading
- exclude protozero:: from internal samples
* improve root/child process detection
- avoid omnitrace_finalize in MPI when child process
- revert some testing tweaks
|
||
|
|
0da62c980e |
Binary instrumentation: more robust exclusion of functions used internally (#238)
## Overview
This PR attempts to increase the stability of binary rewrite and runtime instrumentation.
### Improved protection against self-instrumentation
Using ~~the binary analysis capabilities added from #229~~ the Dyninst SymtabAPI, OmniTrace now does a much better job of avoiding instrumentation of functions which are internally called by OmniTrace:
- The `omnitrace` executable searches for and parses the symbols of various libraries which are known to cause problems when instrumented
  - GNU libraries which are common to nearly every library, e.g., `"libc.so.6"`, `"libdl.so.2"`, etc., and thus are outside the scope of the user's optimization efforts
  - Libraries which OmniTrace depends on for functionality, e.g., `"libunwind.so"`, `"libgotcha.so"`, `"libroctracer64.so"`, etc.
- OmniTrace skips instrumenting any `module_function` instance when its member `module_name` or `function_name` variable matches the library name, source file, or function name found for that symbol (unless the user explicitly requests that it be eligible for instrumentation)
- Note: the parsing of the "internal" libraries may result in longer instrumentation time and higher memory usage. Please file an issue if either of these is found to be excessive.
### Function filters based on linkage and visibility
Added options to restrict instrumentation to certain linkage types (e.g. avoid instrumenting weak symbols) and visibility types (e.g. avoid instrumenting hidden symbols).
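A linkage and visibility filter of this kind could look like the sketch below. The enumerator lists follow the types accepted by `--linkage` and `--visibility`, but `symbol_info`, `is_eligible`, and the rule that an empty set means "no restriction" are assumptions made for illustration, not OmniTrace's implementation.

```cpp
#include <cassert>
#include <set>
#include <string>

// Trailing underscores avoid clashing with the C++ keywords default/protected.
enum class linkage { unknown, global, local, weak, unique };
enum class visibility { unknown, default_, hidden, protected_, internal };

// Hypothetical description of one function symbol.
struct symbol_info
{
    std::string name;
    linkage     link;
    visibility  vis;
};

// A symbol is eligible when each filter is either empty (assumed to mean
// "no restriction") or contains the symbol's linkage/visibility.
bool
is_eligible(const symbol_info& sym, const std::set<linkage>& allowed_linkage,
            const std::set<visibility>& allowed_visibility)
{
    bool linkage_ok = allowed_linkage.empty() || allowed_linkage.count(sym.link) > 0;
    bool visibility_ok =
        allowed_visibility.empty() || allowed_visibility.count(sym.vis) > 0;
    return linkage_ok && visibility_ok;
}
```

For example, passing `{ linkage::global, linkage::local }` as the linkage filter excludes weak symbols while leaving visibility unrestricted.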
### Function filters based on instructions
In the past, some applications instrumented by Dyninst would fail with a trap signal after instrumentation (e.g. #147). In several cases, this occurred whenever certain instructions were present in the function, so an option was added to exclude functions whose instructions match one or more regex patterns.
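The idea of an instruction-based exclusion can be sketched with `std::regex`. This is an illustration under assumptions: the function name is hypothetical, instructions are represented as already-disassembled strings, and any mnemonics used with it are placeholders rather than the instructions that caused the reported failures.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Returns true when any instruction string matches any exclusion pattern,
// meaning the containing function should be skipped during instrumentation.
bool
has_excluded_instruction(const std::vector<std::string>& instructions,
                         const std::vector<std::regex>&  patterns)
{
    for(const auto& instr : instructions)
        for(const auto& pattern : patterns)
            if(std::regex_search(instr, pattern)) return true;
    return false;
}
```

Since a single match is enough to exclude the function, the search stops at the first hit.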
## Details
- generates list of "internal" libraries and attempts to find the first match via:
  - the library is already open, e.g. `dlopen(<libname>, RTLD_LAZY | RTLD_NOLOAD)`
  - searching for the library in `LD_LIBRARY_PATH`
  - searching for the library in `OMNITRACE_ROCM_PATH`, `ROCM_PATH`
  - searching the folders from `/sbin/ldconfig -p`
  - searching for the library in common places such as `/usr/local/lib`
- provides new `--linkage` command line option to restrict instrumentation to functions with particular type(s) of linkage
  - Linkage types: `unknown`, `global`, `local`, `weak`, `unique`
- provides new `--visibility` command line option to restrict instrumentation to functions with particular type(s) of visibility
  - Visibility types: `unknown`, `default`, `hidden`, `protected`, `internal`
- provides new `--internal-module-include` and `--internal-function-include` command line regex options to bypass automatic exclusion from instrumentation
- provides new `--internal-library-append` command line option to specify a library should be considered internal
- provides new `--internal-library-remove` command line option to specify a library should not be considered internal
- provides new `--instruction-exclude` command line regex option to exclude functions which contain matching instructions
- provides new `--internal-library-deps` command line option to treat libraries linked to internal libraries as internal libraries
  - generally, this will only be helpful during runtime instrumentation when OmniTrace is built with an external dyninst library which is dynamically linked to boost libraries and the application is using the same boost libraries
- relaxed restrictions in `module_function::is_module_constrained()`
- relaxed restrictions in `module_function::is_routine_constrained()`
- added a few miscellaneous nullptr checks
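The directory-based part of the "first match" lookup for internal libraries can be sketched as follows. It covers only the ordered search over directory lists; the `dlopen` probe and the `ldconfig` query are omitted, and `split_path_list`, `find_first_match`, and the `exists` callback are hypothetical names, not OmniTrace functions.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Split a colon-separated list such as LD_LIBRARY_PATH into directories,
// skipping empty entries.
std::vector<std::string>
split_path_list(const std::string& paths)
{
    std::vector<std::string> dirs;
    std::stringstream        ss{ paths };
    std::string              dir;
    while(std::getline(ss, dir, ':'))
        if(!dir.empty()) dirs.push_back(dir);
    return dirs;
}

// Try each directory group in priority order (e.g. LD_LIBRARY_PATH first,
// then ROCM_PATH, then common system folders) and return the first
// "<dir>/<libname>" for which `exists` reports true.
std::optional<std::string>
find_first_match(const std::string&                               libname,
                 const std::vector<std::vector<std::string>>&     dir_groups,
                 const std::function<bool(const std::string&)>&   exists)
{
    for(const auto& group : dir_groups)
        for(const auto& dir : group)
        {
            auto candidate = dir + "/" + libname;
            if(exists(candidate)) return candidate;
        }
    return std::nullopt;
}
```

Passing the existence check as a callback keeps the search order testable without touching the filesystem; a real caller might supply a wrapper around a file-existence query.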
## Miscellaneous
- Fix `LD_PRELOAD` + `OMNITRACE_DL_VERBOSE=3` issue
- Adds a sampling offload verbose message
- Improves MPI send-recv.cpp example error message
- Minor tweaks to binary library
  - `binary::get_linked_path` returns `std::optional<string>`
  - renamed `binary::symbol::read_bfd` to `binary::symbol::read_bfd_line_info`
  - `binary::get_binary_info` has param options for reading line info and including undefined symbols
- fixed another edge case instance of resource deadlock during first call to configure_settings
- improved the error log printing in `omnitrace` (does not print repeated messages)
* fix OMNITRACE_DL_VERBOSE=3 + preload issue
- join needs to handle nullptr
* sampling offload verbose message
* mpi-send-recv error message
* binary updates
- get_linked_path returns std::optional<string>
- get_binary_info accepts include_undef flag
- renamed symbol::read_bfd to symbol::read_bfd_line_info
- get_binary_info has param options for reading line info and including undefined symbols
* config updates (initialization)
- fixed another instance of resource deadlock during first call to configure_settings
* Testing fix for HIP w/o rocprofiler support
- disable rocprofiler tests when HIP enabled but OMNITRACE_USE_ROCPROFILER=OFF
* omnitrace exe: insert_instr nullptr check
* omnitrace exe: new method for determining internal constraints
- added internal-libs.cpp
- using binary::get_binary_info on various known libs used by omnitrace
- any matching func/file from symbols found in known internal libs are excluded
- relaxed restrictions in is_module_constrained
- relaxed restrictions in is_routine_constrained
- added a few safety checks
* internal libs append/remove
- options to change which libs are considered internal libraries
* omnitrace exe instruction exclude
- regex option for excluding functions containing specific instructions
* fix is_internal_constrained
* binary link map verbose message
* support constraints on linkage and visibility of symbols
* misc fixes
- fix compiler error for Ubuntu Jammy + GCC 12
- dlopen + libtbbmalloc_proxy appears to be causing issues on OpenSUSE
* Performance details + MT
- multithread processing internal info
- report timing info
* Defer parsing internal data
- wait until after address space is created
* Performance improvement finding for get_symtab_function
* fix data race in get_binary_info
* remove set_default for linkage and visibility argparse
* Parse internal libs with Dyninst::Symtab instead of binary reader
- conflicting versions of libraries for binary analysis causes problems
- expanded whole function restrictions
- expanded module_function::is_routine_constrained regex
* internal lib updates
- include memory usage info
- option to read libraries linked against internal libs: --internal-library-deps
- defer parsing internal libs data to when processing modules
|
||
|
|
e7d3125459 |
restructure libomnitrace + tasking and omnitrace-causal updates (#237)
* restructured libomnitrace
- this is necessary to incorporate some of the binary analysis capabilities into omnitrace exe
- created libomnitrace-core (static)
- created libomnitrace-binary (static)
- created libomnitrace (static)
- omnitrace-avail links to libomnitrace.a
- omnitrace-critical-trace links to libomnitrace.a
- tweaked the testing
- reduced verbosity on some of MPI tests
- excluded trace-time-window from tests on Ubuntu 18.04
- reduced causal e2e iterations
- minor tweak to tasking
- manually create `PTL::UserTaskQueue` instance instead of relying on `PTL::ThreadPool` to create it
* Update formatting workflow
- source formatting uses ubuntu-22.04
- check-includes doesn't generate false positive for 'include "timemory.hpp"'
* omnitrace-causal --generate-configs
- fix config generation in omnitrace causal
- add test for omnitrace-causal + generating configs
* Fix omnitrace-object-library build
- accidentally included rocm sources in non-rocm builds
* Fix rocm compilation w/o rocprofiler
* update timemory submodule with mpi_get warning messages
* sampling offload file updates
- more verbose messages
- disable offload before stopping
* testing updates
- increase causal e2e iterations to 12
- increase lock_environment verbose to 2 (for sampling offload messages)
- fix return for omnitrace_add_validation_test
|