* Improved sampling performance
* Sampling tweaks
- backtrace::get() returns vector of string_view
- further performance improvements
- tweaked _use_label
- wrapped samples in perfetto "samples [omnitrace]" block
- samples in TID=0 in sampling mode are in separate thread row
* Fix empty HW counter desc + category for sampling
- fallback to to metric name if papi event info description not found
- add perfetto sampling category
* Limit the SIGALRM frequency
* HIP API perfetto args + updated perfetto categories
- Support for HIP API args field in perfetto
- PERFETTO_CATEGORIES -> OMNITRACE_PERFETTO_CATEGORIES
- Changed perfetto categories for several trace events and trace counters
- migrated several TRACE_EVENT_* to use omnitrace::tracing::{push,pop}_perfetto_ts(...)
* Tweaked category_region to encode the type of args as well as value
- Affects MPI args field in perfetto
* Improved testing in ubuntu-focal.yml
- "Test Install" step sources setup-env.sh
- "Test Install" step tests python support
- "Test Install" step tests reading ~/.omnitrace.cfg
- Avoid installing boost and tbb libs when building from submodule
* validate-perfetto-proto.py accepts -m / --categories
* Remove reference from category_region typeids
* Tweak opensuse action name
* Tweak the "Test Install" Step of ubuntu-focal
* Fix sampling counter time scales
- All perfetto trace events have "begin_ns" and "end_ns" debug fields
- data for thread start and end timestamp in pthread_create_gotcha
- discard samples outside of thread start and end timestamps
- rename "CPU User CPU Time" perfetto counter to "CPU User Time"
- rename "CPU Kernel CPU Time" perfetto counter to "CPU Kernel Time"
- ensure CPU system samples in perfetto are set to zero at end
- backtrace uses comp::wall_clock record() for timestamps (consistency)
- "Peak Memory Usage [Thread X] (S)" renamed to "Thread Peak Memory Usage [X] (S)"
- "Context Switches [Thread X] (S)" renamed to "Thread Context Switches Usage [X] (S)"
- "Page Faults [Thread X] (S)" renamed to "Thread Page Faults Usage [X] (S)"
- "<PAPI_DESC> [Thread X] (S)" renamed to "Thread <PAPI_DESC> [X] (S)"
- samples
* Fix includes
* Rework sampling trace counter names + new trace counters
- reformulate trace counter names for easier comparison (thread sampling)
- new process-level trace counters for context switches (thread sampling)
- new process-level trace counters for page faults (thread sampling)
- new process-level trace counters for CPU time (thread sampling)
- new thread-level trace counters for context switches (sampling)
- new thread-level trace counters for page faults (sampling)
* tweak header include in backtrace.cpp
* Parallel overhead example with locks
* Support tracing mutex locking + more
- support wrapping pthread_mutex_lock
- support wrapping pthread_mutex_unlock
- support wrapping pthread_mutex_trylock
- get_perfetto_combined_traces setting
- OMNITRACE_TRACE_THREAD_LOCKS option
- ThreadState
- critical trace includes queue id
- enabled/disabled settings in timemory
- fix OMNITRACE_TIMEMORY_COMPONENTS
- fix reading config
- fix setting categories
- applied ThreadState::Internal in various places
- utility::get_filled_array
- utility::get_reserved_vector
- utility::get_thread_index
- fork_gotcha messages about forks
- split out some pthread_gotcha functionality into pthread_create_gotcha
- handle queue id in roctracer callbacks
* Update timemory and PTL submodules
* Misc CMake updates
- Includes fix to omnitrace-static-lib{gcc,stdcxx}
* Misc cleanup to pthread_mutex_gotcha and backtrace
* Fix to duplicate field in module_function json
* Improvement to debug messages
* omnitrace-dl and common improvements
- tweak to delimit
- common::ignore message
- common::join quoting of strings
- omnitrace_set_env ignores if inited and active
- omnitrace_set_mpi ignores if inited and active
* nsync for transpose example
* Fix to thread_deleter<void> functor invoke
* Fix thread state and HIP stream enums
- Got rid of `source/lib/omnitrace/include` and `source/lib/omnitrace/src` and merged into `source/lib/omnitrace`
- Updated perfetto submodule to v25.0
- Updated papi submodule