* instrumentation: include functions with specific calls
Add the option `--caller-include <regex>` (or the environment variable
`OMNITRACE_REGEX_CALLER_INCLUDE`) to instrument functions which contain
a call to any of a set of functions. E.g., `--caller-include foo`
instruments any function which calls `foo`.
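The selection rule described above can be sketched roughly as follows. This is an illustration only, not omnitrace's implementation: the call graph, the function names, and the helper `select_callers` are hypothetical, and the sketch assumes search (not full-match) regex semantics.

```python
import re


def select_callers(call_graph, pattern):
    """Return the functions that call at least one callee matching `pattern`.

    call_graph: dict mapping a function name to the names it calls
    (hypothetical input; a real tool would derive this from the binary).
    """
    regex = re.compile(pattern)
    return sorted(
        caller
        for caller, callees in call_graph.items()
        if any(regex.search(callee) for callee in callees)
    )


# Hypothetical call graph, for illustration only.
call_graph = {
    "main": ["setup", "run"],
    "run": ["foo", "bar"],
    "setup": ["bar"],
    "foo": [],
}

# Analogous to `--caller-include foo`: only `run` calls `foo`.
selected = select_callers(call_graph, "foo")
print(selected)
```

Note that `foo` itself is not selected: the rule picks callers of the matching function, not the matching function.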
* Serialize caller include information
* Add test for caller include
* Tweak to the caller include test
- tweak environment
- tweak pass regexes
* Set rewrite caller example to debug, to avoid optimizing out the call
expressions that it relies on.
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
- improved error handling in dyninst
- improved error handling in omnitrace exe
- new logging facility for omnitrace exe
- improved backtraces
- disable concurrent kernels in rocprofiler
- updates `setup-env.sh` and modulefile
- set `omnitrace_ROOT`
- set `HSA_TOOLS_LIB` if roctracer or rocprofiler enabled
- set `ROCP_TOOL_LIB` if rocprofiler enabled
- closes #163
- No longer set `HSA_ENABLE_INTERRUPT=0` by default
- this has performance implications
- this was set to work around a bug in ROCR which caused an ioctl call in
ROCm to hang when interrupted. The call was only interrupted when realtime
sampling was enabled, since the CPU-clock doesn't increment when waiting
- This bug should be fixed in ROCm 5.3
- omnitrace no longer activates a realtime sampler by default when
sampling, thus this bug is no longer encountered unless the user
explicitly triggers realtime sampling
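The remark that the CPU-clock does not increment while waiting can be illustrated with a small, generic sketch (plain Python, unrelated to omnitrace's sampler): a blocked process accumulates wall-clock time but almost no CPU time, so a timer driven by the CPU clock would not fire during the wait, while a realtime timer would. The helper `measure_sleep` is hypothetical.

```python
import time


def measure_sleep(seconds):
    """Return (wall_elapsed, cpu_elapsed) measured across a blocking sleep."""
    wall_start = time.monotonic()
    cpu_start = time.process_time()
    time.sleep(seconds)  # the process is blocked here, not computing
    wall_elapsed = time.monotonic() - wall_start
    cpu_elapsed = time.process_time() - cpu_start
    return wall_elapsed, cpu_elapsed


wall, cpu = measure_sleep(0.05)
print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```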
* Initial support for RCCL
* OMNITRACE_USE_RCCLP + sampling tweaks
- also OMNITRACE_SAMPLING_KEEP_INTERNAL option
- minor modifications to sampling to use keep internal option + discard funlockfile
* Update docker and workflows to download RCCL
* Update CPack DEB with rocprofiler dependency
* Rework rccl into library and library/components folder
- add tpls/rccl/rccl/rccl.h
* Fix timemory includes
* rcclp inline definitions when disabled
* Tweaks to ubuntu-focal-external-rocm
- disable ompt
- enable building testing
* Tweaks to ubuntu-focal-external-rocm
- ctest exclude
* Tweak ubuntu-focal.yml
- remove source /.../setup-env.sh, replace with $GITHUB_ENV
* Fix ubuntu-focal-rocm + OMPI + root
* Improved rocm-smi error handling
- Recover from rocm-smi errors
- Disable rocm-smi after recovering from errors
- Werror in developer mode
- Remove State::DelayedInit
- Add State::Disabled
* formatting
* Fix merge of OMNITRACE_SAMPLING_KEEP_INTERNAL
* Update RCCL include directory
- depending on the ROCm version, the header is either <rccl/rccl.h> or <rccl.h>
* RCCL Testing
- updated tests to use configuration files
- many tests generate a configuration file
- tests now have GPU option
- enable ncclCommCount, disable ncclGetVersion
- add testing for RCCLP via rccl-tests
- working directory of tests is PROJECT_BINARY_DIR
- add nccl/rccl functions to get_whole_function_names
- some clang compiler fixes
* Handle RCCL include w/o HIP
* RCCL requires HIP
* Update OMNITRACE_SAMPLING_CPUS for testing
* Update tests/CMakeLists.txt
* Debug settings
* Install MPI even when USE_MPI=OFF
* exclude printf
* skip mpi tests w/o USE_MPI or USE_MPI_HEADERS
* update ubuntu rocm workflow
* Fix configure env step for ubuntu rocm
* exit gotcha + remove DelayedInit state + cleanup
- exit gotcha which wraps exit, quick_exit, abort
- minor refactor of mpi gotchas
- removed some redundant code in omnitrace_finalized_hidden
- exclude instrumenting functions starting with dlopen and dlsym
- exclude instrumenting exit, quick_exit, and abort functions
- update timemory submodule with support for new gotcha_invoker with (gotcha_data, <function pointer>, args...)
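The general idea of wrapping exit-like functions can be sketched as below. This is a conceptual illustration in Python, not GOTCHA or omnitrace code: `wrap_with_finalize`, `fake_exit`, and the assumption that the wrapper runs a finalization step once before forwarding to the real function are all hypothetical.

```python
def wrap_with_finalize(real_fn, finalize):
    """Return a wrapper that runs `finalize` once, then calls `real_fn`."""
    state = {"finalized": False}

    def wrapper(*args, **kwargs):
        if not state["finalized"]:
            state["finalized"] = True
            finalize()  # assumed cleanup step, e.g. flushing collected data
        return real_fn(*args, **kwargs)

    return wrapper


events = []


def fake_exit(code):
    """Stand-in for exit/quick_exit/abort so the sketch does not terminate."""
    events.append(f"exit({code})")


wrapped_exit = wrap_with_finalize(fake_exit, lambda: events.append("finalize"))
wrapped_exit(0)
print(events)
```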
* Improved rocm_smi error handling
* fix omnitrace print-* with libraries
* timemory submodule update
* Update workflows to use ./bin/omnitrace instead of ./omnitrace
* cmake format
* update timemory submodule
- fix ODR violations in utility/procfs
* cmake updates
- uniform find_package for all ROCm-based libraries
* tweak transpose example
- throw exception instead of std::exit
* Inspect cmdv name before assuming not exe
- some ELF execs "think" they are libraries so only assume rewrite + simulate + all-functions if filename looks like library
- adds some tests for --print-available -- <library>
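A filename-based check of this kind might look like the sketch below. The criteria (a `lib` prefix plus a `.so` or `.a` suffix pattern) and the helper `looks_like_library` are assumptions made for illustration; the actual heuristic in the omnitrace exe is not described here.

```python
import os


def looks_like_library(path):
    """Guess from the filename alone whether `path` names a library.

    Assumed rule: basename starts with "lib" and is either a shared
    object (contains ".so") or a static archive (ends with ".a").
    """
    name = os.path.basename(path)
    return name.startswith("lib") and (".so" in name or name.endswith(".a"))


for candidate in ["/usr/lib/libfoo.so.1", "libbar.a", "/usr/bin/ls", "li"]:
    print(candidate, looks_like_library(candidate))
```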
* Fix _has_lib_prefix when command is shorter than 3 characters
* Updates and reverts to omnitrace exe
- update module_function operator< and operator==
- add function_signature operator<
- refactor module_function ctor
- revert some previous changes w.r.t. simulate and include_unninstr
* Fix source/bin/tests to use same output dir as tests
* cmake format
* Segfault mitigation + refactor + modify function iteration
- refactor module_function ctor to avoid segfaults
- string_t -> std::string
- replace std::string with std::string_view in some places
- get_name(module_t*)
- get_name(procedure_t*)
- disable using both app_modules and app_functions
- new option: --parse-all-modules to iterate over app_modules
- removed some unused code w.r.t. debug info
* Disable module_function address range for uninstrumentable functions
* Refactored getting file/line info and init/fini
- use dyninst insertInitCallback and insertFiniCallback if main not found
- fixed all issues with segmentation faults in --simulate --all-functions
* revert changes to Findrocprofiler.cmake
- fix loop-level instrumentation
- support loop instrumentation w/o debug symbols via loop number
- improve module_function messages
- serialize num_basic_blocks
- serialize num_outer_loops
- serialize is_num_instructions_constrained
- serialize is_loop_num_instructions_constrained
- updated transpose example to use uniform_int_distribution
- added transpose loop test
- added fail regexes for tests which enable loop instrumentation
- use module->getFullName in get_loop_file_line_info
- use module->getFullName in get_func_file_line_info
- use module->getFullName in get_basic_block_file_line_info
- These functions cause weird call-stack behavior when instrumented
- rocr::image::ImageRuntime::CreateImageManager
- rocr::AMD::GpuAgent::GetInfo
- rocr::HSA::hsa_agent_get_info
- These functions cause out-of-order call-stacks when KokkosP is enabled
- Kokkos::Profiling::*
* Code-coverage support
* Examples update
- code-coverage example
- tweak transpose and parallel-overhead
* Coverage output + testing
- config::get_setting value(...)
- REGULAR_EXPRESSION -> REGEX in cmake func args
- coverage.hpp header
- coverage JSON
- coverage tests
* cmake formatting
* Library instrumentation w/o main + more
- fixed library instrumentation w/o main
- use TIMEMORY_PROJECT_NAME in output messages
- removed '--driver' option from omnitrace exe
- support coverage in trace mode
- OMNITRACE_KOKKOS_KERNEL_LOGGER
- support multiple calls to omnitrace_set_env after init if already called
- support multiple calls to omnitrace_set_mpi after init if same args
- support multiple calls to omnitrace_init if same mode
- unique_ptr_t for thread_data which calls finalize when thread_data is destroyed
- tweaked openmp tests
- improved finalization
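The "multiple calls are allowed if the arguments match" behavior listed above can be sketched as an idempotent initializer. The class `Tracer`, its `init` method, and the choice to raise on a conflicting mode are hypothetical; the text above only states that repeated calls with the same mode are supported.

```python
class Tracer:
    """Toy stand-in for a library with a one-time initialization step."""

    def __init__(self):
        self.mode = None
        self.init_count = 0  # how many times real initialization ran

    def init(self, mode):
        """Initialize once; tolerate repeated calls with the same mode."""
        if self.mode is None:
            self.mode = mode
            self.init_count += 1  # real setup work would happen here
            return True
        if mode == self.mode:
            return False  # repeated call with the same mode is a no-op
        # Assumed behavior for a conflicting mode (not stated in the source).
        raise RuntimeError(f"already initialized with mode {self.mode!r}")


tracer = Tracer()
first = tracer.init("trace")
second = tracer.init("trace")
print(first, second, tracer.init_count)
```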
* Replace CI --output-on-failure with -V
* Fix to OMNITRACE_DL_INVOKE
* omnitrace-exe and testing updates
- omnitrace::omnitrace-timemory interface library
- support for configs in omnitrace exe
- print-{available,instrumented,...} opts no longer exit w/o --simulate
- all tests apply --print-instrumented functions
- tweaked coverage tests
- print-* options print instructions not address range
* Remove OMNITRACE_DEBUG_FINALIZE=ON from CI
* Python cmake tweaks
* Tweak test ordering
* Upload CI artifacts if fail or success
* CI Python tweaks
- Use OMNITRACE_PYTHON_PREFIX and OMNITRACE_PYTHON_ENVS
* CI ELFUTILS_DOWNLOAD_VERSION
* test tweaks
- labels and more coverage tests
* tweak to omnitrace --config handling
* Update module/function constraint handling + PP
- tweak pre-processor definition handling
- removed free-standing module_constraint
- remove free-standing routine_constraint
- remove module_name.find("omnitrace") module constraint
- fully handle the output path of omnitrace *-instr files
- get_use_code_coverage config option
- print-coverage option
- coverage_module_functions
* use github.job not github.name
* Re-enable HSA_ENABLE_INTERRUPT
- remove coverage address report