Revision graph

15 Revisions

Author SHA1 Message Date
Jonathan R. Madsen 0da62c980e Binary instrumentation: more robust exclusion of functions used internally (#238)
## Overview

This PR attempts to increase the stability of binary rewrite and runtime instrumentation.

### Improved protection against self-instrumentation

Using ~~the binary analysis capabilities added from #229~~ the Dyninst SymtabAPI, OmniTrace now does a much better job of avoiding instrumentation of functions which are internally called by OmniTrace:

- The `omnitrace` executable searches for and parses the symbols of various libraries which are known to cause problems when instrumented
  - GNU libraries which are common to nearly every library, e.g., `"libc.so.6"`, `"libdl.so.2"`, etc., and thus are outside the scope of the user's optimization efforts
  - Libraries which OmniTrace depends on for functionality, e.g. `"libunwind.so"`, `"libgotcha.so"`, `"libroctracer64.so"`, etc.
    - OmniTrace skips instrumenting any `module_function` instance when its member `module_name` or `function_name` variable matches the library name, source file, or function name found for that symbol (unless the user explicitly requests that it be eligible for instrumentation)
- Note: the parsing of the "internal" libraries may result in longer instrumentation time and higher memory usage. Please file an issue if either of these is found to be excessive.
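
The exclusion rule described above can be sketched roughly as follows. This is a minimal illustration, not the OmniTrace implementation: the `module_function` struct here is a simplified stand-in with only the two members named in the text, and `internal_symbols`, the signature of `is_internal_constrained`, and the `force_include` flag are assumptions made for the example.

```cpp
#include <set>
#include <string>

// Simplified stand-in for OmniTrace's module_function (assumption: only the
// two members mentioned in the text are modeled).
struct module_function
{
    std::string module_name;
    std::string function_name;
};

// Hypothetical container for the names gathered from the "internal" libraries.
struct internal_symbols
{
    std::set<std::string> libraries;  // e.g. "libc.so.6"
    std::set<std::string> files;      // source files found for the symbols
    std::set<std::string> functions;  // function names found for the symbols
};

// Returns true when the function should be skipped. force_include models the
// user explicitly requesting that the function stay eligible.
inline bool
is_internal_constrained(const module_function&  mf,
                        const internal_symbols& internal,
                        bool                    force_include = false)
{
    if(force_include) return false;
    return internal.libraries.count(mf.module_name) > 0 ||
           internal.files.count(mf.module_name) > 0 ||
           internal.functions.count(mf.function_name) > 0;
}
```

Exact-match lookups in sorted sets keep the check cheap per function; the real tool may well use different containers or matching rules.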

### Function filters based on linkage and visibility

Added options to restrict instrumentation to certain linkage types (e.g. avoid instrumenting weak symbols) and visibility types (e.g. avoid instrumenting hidden symbols).
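
As a rough sketch of what such a filter might look like, the snippet below models the linkage and visibility types listed under Details as enums and accepts a symbol only if both of its attributes are allowed. The `symbol_filter` type and the "empty set means no restriction" convention are assumptions for illustration, not taken from the OmniTrace sources.

```cpp
#include <set>

// Type names follow the lists under "Details"; trailing underscores avoid
// C++ keywords.
enum class linkage { unknown, global, local, weak, unique };
enum class visibility { unknown, default_, hidden, protected_, internal };

// Hypothetical filter: an empty set means "no restriction" (assumption).
struct symbol_filter
{
    std::set<linkage>    linkages;
    std::set<visibility> visibilities;

    bool accepts(linkage l, visibility v) const
    {
        const bool link_ok = linkages.empty() || linkages.count(l) > 0;
        const bool vis_ok  = visibilities.empty() || visibilities.count(v) > 0;
        return link_ok && vis_ok;
    }
};
```

For example, a filter whose `linkages` set holds only `global` and `local` would reject weak symbols regardless of their visibility.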

### Function filters based on instructions

In the past, some applications instrumented by Dyninst would fail with a trap signal after instrumentation (e.g. #147). In several cases, this occurred whenever certain instructions were present in a function, so an option was added to exclude functions based on one or more regex patterns.
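
A minimal sketch of the idea, assuming each function's instructions are available as disassembled strings: a function is excluded as soon as any instruction matches any pattern. The helper name `has_excluded_instruction` and the use of `std::regex_search` (partial match) are assumptions; the PR does not state which instructions were problematic or how matching is anchored.

```cpp
#include <regex>
#include <string>
#include <vector>

// True if any instruction string matches any of the exclusion patterns.
// Uses regex_search, so a pattern may match anywhere in the instruction
// unless it is anchored explicitly.
inline bool
has_excluded_instruction(const std::vector<std::string>& instructions,
                         const std::vector<std::regex>&  patterns)
{
    for(const auto& instr : instructions)
        for(const auto& re : patterns)
            if(std::regex_search(instr, re)) return true;
    return false;
}
```

A caller would compile the user-supplied patterns once and then run this check per candidate function before inserting any instrumentation.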

## Details

- generates list of "internal" libraries and attempts to find the first match via:
  - the library is already open, e.g. `dlopen(<libname>, RTLD_LAZY | RTLD_NOLOAD)`
  - searching for the library in `LD_LIBRARY_PATH`
  - searching for the library in `OMNITRACE_ROCM_PATH`, `ROCM_PATH`
  - searching the folders from `/sbin/ldconfig -p`
  - searching for the library in common places such as `/usr/local/lib`
- provides new `--linkage` command line option to restrict instrumentation to functions with particular type(s) of linkage
  - Linkage types: `unknown`, `global`, `local`, `weak`, `unique`
- provides new `--visibility` command line option to restrict instrumentation to functions with particular type(s) of visibility 
  - Visibility types: `unknown`, `default`, `hidden`, `protected`, `internal` 
- provides new `--internal-module-include` and `--internal-function-include` command line regex options to bypass automatic exclusion from instrumentation
- provides new `--internal-library-append` command line option to specify a library should be considered internal
- provides new `--internal-library-remove` command line option to specify a library should not be considered internal
- provides new `--instruction-exclude` command line regex option to exclude functions which contain matching instructions
- provides new `--internal-library-deps` command line option to treat libraries linked to internal libraries as internal libraries
  - generally, this will only be helpful during runtime instrumentation when OmniTrace is built with an external dyninst library which is dynamically linked to boost libraries and the application is using the same boost libraries
- relaxed restrictions in `module_function::is_module_constrained()`
- relaxed restrictions in `module_function::is_routine_constrained()`
- added a few miscellaneous nullptr checks

## Miscellaneous

- Fix `LD_PRELOAD` + `OMNITRACE_DL_VERBOSE=3` issue
- Adds a sampling offload verbose message
- Improves MPI send-recv.cpp example error message
- Minor tweaks to binary library
  - `binary::get_linked_path` returns `std::optional<string>`
  - renamed `binary::symbol::read_bfd` to `binary::symbol::read_bfd_line_info`
  - `binary::get_binary_info` has param options for reading line info and including undefined symbols
- fixed another edge case instance of resource deadlock during first call to configure_settings
- improved the error log printing in `omnitrace` (does not print repeated messages)

* fix OMNITRACE_DL_VERBOSE=3 + preload issue

- join needs to handle nullptr

* sampling offload verbose message

* mpi-send-recv error message

* binary updates

- get_linked_path returns std::optional<string>
- get_binary_info accepts include_undef flag
- renamed symbol::read_bfd to symbol::read_bfd_line_info
- get_binary_info has param options for reading line info and including undefined symbols

* config updates (initialization)

- fixed another instance of resource deadlock during first call to configure_settings

* Testing fix for HIP w/o rocprofiler support

- disable rocprofiler tests when HIP enabled but OMNITRACE_USE_ROCPROFILER=OFF

* omnitrace exe: insert_instr nullptr check

* omnitrace exe: new method for determining internal constraints

- added internal-libs.cpp
- using binary::get_binary_info on various known libs used by omnitrace
- any matching func/file from symbols found in known internal libs are excluded
- relaxed restrictions in is_module_constrained
- relaxed restrictions in is_routine_constrained
- added a few safety checks

* internal libs append/remove

- options to change which libs are considered internal libraries

* omnitrace exe instruction exclude

- regex option for excluding functions containing specific instructions

* fix is_internal_constrained

* binary link map verbose message

* support constraints on linkage and visibility of symbols

* misc fixes

- fix compiler error for Ubuntu Jammy + GCC 12
- dlopen + libtbbmalloc_proxy appears to be causing issues on OpenSUSE

* Performance details + MT

- multithread processing internal info
- report timing info

* Defer parsing internal data

- wait until after address space is created

* Performance improvement finding for get_symtab_function

* fix data race in get_binary_info

* remove set_default for linkage and visibility argparse

* Parse internal libs with Dyninst::Symtab instead of binary reader

- conflicting versions of libraries for binary analysis cause problems
- expanded whole function restrictions
- expanded module_function::is_routine_constrained regex

* internal lib updates

- include memory usage info
- option to read libraries linked against internal libs: --internal-library-deps
- defer parsing internal libs data to when processing modules
2023-02-07 03:39:10 -06:00
Mészáros Gergely 1b8f09aa2d instrumentation: include functions with specific calls (#202)
* instrumentation: include functions with specific calls

Add the option `--caller-include <regex>` or environment variable
`OMNITRACE_REGEX_CALLER_INCLUDE` to instrument functions which contain
call to a set of functions, E.g. `--caller-include foo` instruments any
function which calls `foo`.
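
To illustrate the selection rule, here is a small sketch that picks every function whose callee list contains a name matching the regex. The `function_info` struct and `select_callers` helper are hypothetical; how the tool actually obtains call targets is not described in this commit message.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical summary of a function: its name and the functions it calls.
struct function_info
{
    std::string              name;
    std::vector<std::string> callees;
};

// Return the names of all functions that call at least one function whose
// name matches caller_include (partial match via regex_search).
inline std::vector<std::string>
select_callers(const std::vector<function_info>& funcs,
               const std::regex&                 caller_include)
{
    std::vector<std::string> selected;
    for(const auto& f : funcs)
    {
        for(const auto& callee : f.callees)
        {
            if(std::regex_search(callee, caller_include))
            {
                selected.emplace_back(f.name);
                break;  // one matching call is enough
            }
        }
    }
    return selected;
}
```

Note that the match is on the callee, while the function that gets selected is the caller.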

* Serialize caller include information

* Add test for caller include

* Tweak to the caller include test

- tweak environment
- tweak pass regexes

* Set rewrite caller example to debug, to avoid optimizing out the call expressions that it relies on.

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2022-11-11 02:32:57 -06:00
Jonathan R. Madsen 4e3527f0ed Resolve warnings/errors with extra warnings (#171) 2022-09-28 14:28:32 -05:00
Jonathan R. Madsen 90ff7188f8 Crusher hackathon updates (#164)
- improved error handling in dyninst
- improved error handling in omnitrace exe
- new logging facility for omnitrace exe
- improved backtraces
- disable concurrent kernels in rocprofiler
- updates `setup-env.sh` and modulefile
  - set `omnitrace_ROOT`
  - set `HSA_TOOLS_LIB` if roctracer or rocprofiler enabled
  - set `ROCP_TOOL_LIB` if rocprofiler enabled
  - closes #163 
- No longer make setting `HSA_ENABLE_INTERRUPT=0` the default
  - this has performance implications
  - this was set to work around a bug in ROCR which caused an ioctl call in ROCm to hang when interrupted. But it was only interrupted when realtime sampling was enabled since the CPU-clock doesn't increment when waiting
  - This bug should be fixed in ROCm 5.3
  - omnitrace no longer activates a realtime sampler by default when sampling, thus this bug is no longer encountered unless the user explicitly triggers realtime sampling
2022-09-21 13:58:14 -05:00
Jonathan R. Madsen bb400d5d61 Remove unused funcs + messages for excluding system lib (#133)
- instead of filtering system library opaquely, generate info messages
- remove unused are_file_include_exclude_lists_empty()
- remove unused module_constraint free func
- remove unused routine_constraint free func
2022-08-08 08:37:51 -05:00
Jonathan R. Madsen 45be03906a RCCL support (#93)
* Initial support for RCCL

* OMNITRACE_USE_RCCLP + sampling tweaks

- also OMNITRACE_SAMPLING_KEEP_INTERNAL option
- minor modifications to sampling to use keep internal option + discard funlockfile

* Update docker and workflows to download RCCL

* Update CPack DEB with rocprofiler dependency

* Rework rccl into library and library/components folder

- add tpls/rccl/rccl/rccl.h

* Fix timemory includes

* rcclp inline definitions when disabled

* Tweaks to ubuntu-focal-external-rocm

- disable ompt
- enable building testing

* Tweaks to ubuntu-focal-external-rocm

- ctest exclude

* Tweak ubuntu-focal.yml

- remove source /.../setup-env.sh, replace with $GITHUB_ENV

* Fix ubuntu-focal-rocm + OMPI + root

* Improved rocm-smi error handling

- Recover from rocm-smi errors
- Disabling rocm-smi after recovering from errors
- Werror in developer mode
- Remove State::DelayedInit
- Add State::Disabled

* formatting

* Fix merge of OMNITRACE_SAMPLING_KEEP_INTERNAL

* Update RCCL include directory

- depending on the ROCm version, the include is either <rccl/rccl.h> or <rccl.h>

* RCCL Testing

- updated tests to use configuration files
- many tests generate a configuration file
- tests now have GPU option
- enable ncclCommCount, disable ncclGetVersion
- add testing for RCCLP via rccl-tests
- working directory of tests is PROJECT_BINARY_DIR
- add nccl/rccl functions to get_whole_function_names
- some clang compiler fixes

* Handle RCCL include w/o HIP

* RCCL requires HIP

* Update OMNITRACE_SAMPLING_CPUS for testing

* Update tests/CMakeLists.txt

* Debug settings

* Install MPI even when USE_MPI=OFF

* exclude printf

* skip mpi tests w/o USE_MPI or USE_MPI_HEADERS

* update ubuntu rocm workflow

* Fix configure env step for ubuntu rocm
2022-07-25 12:16:11 -05:00
Jonathan R. Madsen 99da25ea80 exit gotcha + remove DelayedInit state + rocm-smi + cleanup (#110)
* exit gotcha + remove DelayedInit state + cleanup

- exit gotcha which wraps exit, quick_exit, abort
- minor refactor of mpi gotchas
- removed some redundant code in omnitrace_finalized_hidden
- exclude instrumenting functions starting with dlopen and dlsym
- exclude instrumenting exit, quick_exit, and abort functions
- update timemory submodule with support for new gotcha_invoker with (gotcha_data, <function pointer>, args...)

* Improved rocm_smi error handling
2022-07-24 22:09:32 -05:00
Jonathan R. Madsen d04cbe862e fix omnitrace print-* with libraries (#94)
* fix omnitrace print-* with libraries

* timemory submodule update

* Update workflows to use ./bin/omnitrace instead of ./omnitrace

* cmake format

* update timemory submodule

- fix ODR violations in utility/procfs

* cmake updates

- uniform find_package for all ROCm-based libraries

* tweak transpose example

- throw exception instead of std::exit

* Inspect cmdv name before assuming not exe

- some ELF execs "think" they are libraries so only assume rewrite + simulate + all-functions if filename looks like library
- adds some test for --print-available -- <library>
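
One possible reading of "filename looks like library" is sketched below: the base name starts with `lib` and contains `.so`. This heuristic and the name `looks_like_library` are assumptions for illustration only; the commit message does not spell out the actual check. The length guard mirrors the kind of short-name edge case a prefix test has to handle.

```cpp
#include <string_view>

// Hypothetical heuristic: "libNAME.so[.version]" is treated as a library.
inline bool
looks_like_library(std::string_view path)
{
    // Strip any leading directories to get the base name.
    const auto pos  = path.find_last_of('/');
    const auto name = (pos == std::string_view::npos) ? path : path.substr(pos + 1);

    // Guard the prefix test so names shorter than 3 characters are handled.
    const bool has_lib_prefix = name.size() >= 3 && name.substr(0, 3) == "lib";
    const bool has_so_suffix  = name.find(".so") != std::string_view::npos;
    return has_lib_prefix && has_so_suffix;
}
```

Checking the file name rather than the ELF type sidesteps the case where an executable is reported as a shared object.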

* Fix _has_lib_prefix when command is < 3

* Updates and reverts to omnitrace exe

- update module_function operator< and operator==
- add function_signature operator<
- refactor module_function ctor
- revert some previous changes w.r.t. simulate and include_unninstr

* Fix source/bin/tests to use same output dir as tests

* cmake format

* Segfault mitigation + refactor + modify function iteration

- refactor module_function ctor to avoid segfaults
- string_t -> std::string
- replace std::string with std::string_view in some places
- get_name(module_t*)
- get_name(procedure_t*)
- disable using both app_modules and app_functions
- new option: --parse-all-modules to iterate over app_modules
- removed some unused code w.r.t. debug info

* Disable module_function address range for uninstrumentable functions

* Refactored getting file/line info and init/fini

- use dyninst insertInitCallback and insertFiniCallback if main not found
- fixed all issues with segmentation faults in --simulate --all-functions

* revert changes to Findrocprofiler.cmake
2022-07-21 01:15:41 -05:00
Jonathan R. Madsen a640fbdb29 Fix loop-level instrumentation + more (#32)
- fix loop-level instrumentation
- support loop instrumentation w/o debug symbols via loop number
- improve module_function messages
- serialize num_basic_blocks
- serialize num_outer_loops
- serialize is_num_instructions_constrained
- serialize is_loop_num_instructions_constrained
- updated transpose example to use uniform_int_distribution
- added transpose loop test
- added fail regexes for tests which enable loop instrumentation
- use module->getFullName in get_loop_file_line_info
- use module->getFullName in get_func_file_line_info
- use module->getFullName in get_basic_block_file_line_info
2022-06-10 06:57:50 -05:00
Jonathan R. Madsen 6491ce7808 omnitrace function exclude updates (#5)
- These functions cause weird call-stack behavior when instrumented
    - rocr::image::ImageRuntime::CreateImageManager
    - rocr::AMD::GpuAgent::GetInfo
    - rocr::HSA::hsa_agent_get_info
- These functions cause out-of-order call-stacks when KokkosP is enabled
    - Kokkos::Profiling::*
2022-05-24 19:26:12 -05:00
Jonathan R. Madsen 134b33320d Code coverage updates (#50)
* code coverage updates

- python support
- refactored source

* remove code_coverage::operator+ and operator+=

* impl/coverage.hpp
2022-05-08 01:40:56 -05:00
Jonathan R. Madsen 791375bb24 Code Coverage Support (#46)
* Code-coverage support

* Examples update

- code-coverage example
- tweak transpose and parallel-overhead

* Coverage output + testing

- config::get_setting value(...)
- REGULAR_EXPRESSION -> REGEX in cmake func args
- coverage.hpp header
- coverage JSON
- coverage tests

* cmake formatting

* Library instrumentation w/o main + more

- fixed library instrumentation w/o main
- use TIMEMORY_PROJECT_NAME in output messages
- removed '--driver' option from omnitrace exe
- support coverage in trace mode
- OMNITRACE_KOKKOS_KERNEL_LOGGER
- support multiple calls to omnitrace_set_env after init if already called
- support multiple calls to omnitrace_set_mpi after init if same args
- support multiple calls to omnitrace_init if same mode
- unique_ptr_t for thread_data which calls finalize when thread_data is destroyed
- tweaked openmp tests
- improved finalization

* Replace CI --output-on-failure with -V

* Fix to OMNITRACE_DL_INVOKE

* omnitrace-exe and testing updates

- omnitrace::omnitrace-timemory interface library
- support for configs in omnitrace exe
- print-{available,instrumented,...} opts no longer exit w/o --simulate
- all tests apply --print-instrumented functions
- tweaked coverage tests
- print-* options print instructions not address range

* Remove OMNITRACE_DEBUG_FINALIZE=ON from CI

* Python cmake tweaks

* Tweak test ordering

* Upload CI artifacts if fail or success

* CI Python tweaks

- Use OMNITRACE_PYTHON_PREFIX and OMNITRACE_PYTHON_ENVS

* CI ELFUTILS_DOWNLOAD_VERSION

* test tweaks

- labels and more coverage tests

* tweak to omnitrace --config handling

* Update module/function constraint handling + PP

- tweak pre-processor definition handling
- removed free-standing module_constraint
- remove free-standing routine_constraint
- remove module_name.find("omnitrace") module constraint
- fully handle the output path of omnitrace *-instr files
- get_use_code_coverage config option
- print-coverage option
- coverage_module_functions

* use github.job not github.name

* Re-enable HSA_ENABLE_INTERRUPT

- remove coverage address report
2022-04-25 17:00:52 -05:00
Jonathan R. Madsen afa3edebab Python support (#37)
* Initial python support

* Add python testing

* Increase timeout for bin tests

* cmake-format

* Valid build types + testing + formatting + more

- Enforce valid build types
- Fix to numpy install
- Increase testing timeout
- Fix to cmake format glob
- Fix to backtrace verbose

* Disable stripping libraries by default

* omnitrace exe updates

- new '--print-instructions' option
- changed format of instructions in JSON
- remove no-save-fpr tests

* Default to strip libraries when release build
2022-04-05 00:24:34 -05:00
Jonathan R. Madsen 945f541965 Documentation + Miscellaneous Fixes (#36)
* Added documentation markdown source

* Replaced AARInternal with AMDResearch in URLs

* Renamed cpack artifact names

* Fix to testing and lulesh submodule checkout

* Docker updates

* CMake and CPack

- force CMAKE_INSTALL_LIBDIR to lib
- CPACK_DEBIAN_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- CPACK_RPM_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- Tweak LIBOMP_LIBRARY find in examples/openmp
- Tweak setup-env.sh.in

* Partial update of README

- status badges
- docs link
- removed install info (covered by docs)

* OMNITRACE_SAMPLING_CPUS setting

- enables control over which CPUs are sampled for frequency

* omnitrace exe updates

- exclude transaction clone, virtual thunk, non-virtual thunk
- module_function::start_address
- module_function::instructions
- verbosity > 0 encodes instructions into JSON

* Miscellaneous fixes

- relocate setup-env.sh.in
- add modulefile.in
- Updated README.md and source/docs/about.md
- cmake fix for libomp
- fix license in miscellaneous places
- dl.hpp and dl.cpp

* Update timemory and dyninst submodules

- timemory signals updates
- dyninst Movement-adhoc updates

* cmake format
2022-04-04 15:27:38 -05:00
Jonathan R. Madsen d80752bc69 User API + reorganized lib folders (#30)
* User API + reorganized lib folders

- omnitrace_user_start_trace
- omnitrace_user_stop_trace
- omnitrace_user_start_thread_trace
- omnitrace_user_stop_thread_trace
- omnitrace_user_push_region
- omnitrace_user_pop_region

* New OpenMP examples/tests

* Fix to KokkosP

* OMPT support

- fixed omnitrace instrumenting reporting
- common invoke improvements
- component::user_region

* exclude kmp_threadprivate_

* Separate omnitrace into multiple files

* PTL and timemory submodule updates

* Active guards + USE_OMPT guards in omnitrace-dl

* Tweak transpose default iterations

* omnitrace-precommit build target

* Omnitrace exe restructuring pt 2

- Never instrument functions with fewer than 4 instructions
- Never instrument ompt_start_tool or nanosleep
- module_function serializes heuristics
- removed hash stuff from omnitrace
- removed instr_procedures lambda
- WAITPID_DEBUG_MESSAGE
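
The two "never instrument" rules in this list could be combined along these lines. This is a sketch under stated assumptions: the helper name `is_instrumentable` and its signature are hypothetical, and only the threshold of 4 instructions and the two function names come from the commit message.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Returns false for functions that are too small or explicitly blocked.
inline bool
is_instrumentable(const std::string& name, std::size_t num_instructions)
{
    // Names taken from the commit message above.
    static const std::set<std::string> never{ "ompt_start_tool", "nanosleep" };
    // Functions with fewer than 4 instructions are never instrumented.
    constexpr std::size_t min_instructions = 4;
    return num_instructions >= min_instructions && never.count(name) == 0;
}
```

The size check is a cheap first filter, since very small functions tend to cost more to instrument than they are worth measuring.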

* set_state, "_hidden" fix, CI exceptions, backtrace fix

- set_state function
- fixed "_hidden" from appearing in print macros using __FUNCTION__
- OMNITRACE_CI_THROW
- more CI checks in library
- fixed backtrace init value sample issue being ignored

* Tweaks to OMPT tests

* cmake-formatting

* Removed debug output from backtrace processing

* Fix warnings and verbosity

* omnitrace-dl fix for libomp

* omnitrace-avail fixes

- remove second omnitrace_init_library call
- fix -r option not working

* Additional testing

- source/bin/tests
- tests for omnitrace-exe
- tests for omnitrace-avail

* cmake-format

* Reduce runtime of openmp-lu

* Update openmp-lu and tests timeout

* openmp-lu and CI tweaks

- decrease iterations
- OMP_NUM_THREADS=2
- install clang and libomp-dev in linux-ci
- fix data-files in linux-ci
2022-03-07 20:40:48 -06:00