219b2e988ebd119ebb844932f018e2ee6f00a672
26 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
9499e2f521 |
Remove Critical Trace Support (#327)
* Delete core critical-trace files * Update docs and README * Update workflows * Update testing * Update cmake * Remove critical trace usage in source code * Update source/docs/critical_trace.md - fix spelling * Formatting * Update bin/omnitrace-avail/avail.cpp - statically allocate shared pointers for timemory manager and hash id/aliases to prevent use-after-free errors |
||
|
|
77d52814e9 |
Fix omnitrace-avail component list (#328)
* Fix omnitrace-avail component list - remove omnitrace components from `omnitrace-avail -C` since these are no-ops in OMNITRACE_TIMEMORY_COMPONENTS * Fix omnitrace-avail-filter-wall-clock-available test |
||
|
|
5de4163d66 |
Deprecate OMNITRACE_USE_PERFETTO, OMNITRACE_USE_TIMEMORY (#306)
* Rename OMNITRACE_USE_PERFETTO to OMNITRACE_TRACE * Rename OMNITRACE_USE_TIMEMORY to OMNITRACE_PROFILE * Revert change to Perfetto.cmake * Fix formatting - clang-format-11 was complaining about formatting |
||
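The rename in #306 means older configurations may still set the deprecated names. A minimal sketch of how a wrapper script might honor both spellings, assuming the new name takes precedence and the old one only triggers a warning. `resolve_renamed_setting` is a hypothetical helper and the precedence rule is an assumption, neither is taken from omnitrace itself.

```python
import os
import warnings

# New setting name -> deprecated name, as listed in the commit message above.
RENAMED = {
    "OMNITRACE_TRACE": "OMNITRACE_USE_PERFETTO",
    "OMNITRACE_PROFILE": "OMNITRACE_USE_TIMEMORY",
}


def resolve_renamed_setting(name, env=None, default=None):
    """Hypothetical helper: look up `name`, falling back to its deprecated alias."""
    env = os.environ if env is None else env
    if name in env:
        # Assumption: the new name wins when both are set.
        return env[name]
    old = RENAMED.get(name)
    if old is not None and old in env:
        warnings.warn(f"{old} is deprecated; use {name} instead", DeprecationWarning)
        return env[old]
    return default
```

Calling `resolve_renamed_setting("OMNITRACE_TRACE")` reads the process environment; passing a plain dict as `env` makes the lookup easy to exercise in isolation.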
|
|
518c83e0f9 |
Dynamic expansion of thread data (#294)
* Tests for exceeding OMNITRACE_MAX_THREADS
- tests which exceed OMNITRACE_MAX_THREADS value for thread creation
* CMake Formatting.cmake update
- include source files in /tests/source directory
* Add unknown-hash= to OMNITRACE_ABORT_FAIL_REGEX
- fail if a timemory hash is not resolved to a name
* Tests for exceeding OMNITRACE_MAX_THREADS
- update
* omnitrace-sample update
- remove env disabling of critical-trace and process-sampling
* core library update
- make_unique in concepts.hpp
- add OMNITRACE_USE_ROCM_SMI to "process_sampling" category
- remove forced disabling of critical-trace in sampling mode
- parentheses for OMNITRACE_PREFER
- use tim::get_hash_id instead of tim::get_combined_hash_id
* core library update (containers)
- added aligned_static_vector.hpp
- similar to static_vector.hpp but attempts to align to cache line size
- alignment template parameter for stable_vector
- added missing aliases in static_vector
- consistent with aligned_static_vector aliases
* thread_info update
- track the peak number of threads created
- thread_info::get_peak_num_threads() returns the peak number of threads
* thread_data update
- generic thread_data inherits from base_thread_data
- thread_data reworked to support dynamic expansion
- base_thread_data updated to invoke private_instance() function
- thread_data<optional<T>> uses stable_vector aligned to cache line width
- thread_data<identity<T>> uses stable_vector aligned to cache line width
- thread_data for optional and identity provide private private_instance function + friend to base_thread_data
- component_bundle_cache<T> is now thread_data<component_bundle_cache_impl<T>>
* causal update
- thread_data<T>::instances -> thread_data<T>::instance(construct_on_thread{ ... })
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
- tim::get_combined_hash_id -> tim::get_hash_id
- update progress_bundle usage to new thread_data API
* backtrace/backtrace_metrics component update
- backtrace_metrics update
- update to new thread_data API
- add thread CPU time row in perfetto
- fix potential bug when rusage categories are disabled
- fix bug in operator-= not subtracting cpu time of rhs
- backtrace update
- skip all child call-stack below 'tim::openmp::' if sampling_keep_internal = false
* pthread_gotcha component update
- pthread_gotcha::shutdown() invokes pthread_create_gotcha::shutdown()
* pthread_create_gotcha component update
- minor tweak to {start,stop}_bundle functions: pass in thread id
- update to new thread_data API
- track native handles of internal threads
- implement system with pthread_kill to stop dangling bundles
* rocprofiler/roctracer component update
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
* critical trace (library) update
- update to new thread_data API
- tim::get_combined_hash_id -> tim::get_hash_id
* coverage update
- update to new thread_data API
* tasking update
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
* roctracer update
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
* rocm_smi update
- update to new thread_data API
* runtime.cpp update
- update to new thread_data API
* sampling.cpp update
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
* ompt.cpp update
- invoke pthread_gotcha::shutdown before invoking OMPT finalize function
- this prevents signals from being delivered to OpenMP threads
* tracing.hpp and tracing.cpp update
- replace get_timemory_hash_{ids,aliases} functions with copy_timemory_hash_ids function
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
- tim::get_combined_hash_id -> tim::get_hash_id
- improvements to + error checking in thread_init function
* library.cpp update
- move copying timemory hash id/aliases to tracing.cpp
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
* Update BuildSettings.cmake
- add -Wno-interference-size to suppress warning about use of std::hardware_destructive_interference_size
* Update fork example
- improve scheme for waiting on child processes via waitpid instead of wait
- support running main routine multiple times
- push/pop regions in child process
* Update lib/common/defines.h.in
- allow user to specify misc values via -D <name>=<value>
- OMNITRACE_CACHELINE_SIZE
- OMNITRACE_CACHELINE_SIZE_MIN
- OMNITRACE_ROCM_MAX_COUNTERS
- remove unused defines
- OMNITRACE_ROCM_LOOK_AHEAD
- OMNITRACE_MAX_ROCM_QUEUES
* Update rocprofiler.hpp
- OMNITRACE_MAX_ROCM_COUNTERS -> OMNITRACE_ROCM_MAX_COUNTERS
* Update aligned_static_vector
- set cacheline_align_v from max of OMNITRACE_CACHELINE_SIZE and OMNITRACE_CACHELINE_SIZE_MIN
* Update tracing.cpp
- acquire locks for updating main hash ids/aliases
- only propagate ids/aliases when finalizing
* Update pthread_create_gotcha.cpp
- make sure hash for "start_thread" exists on main thread
* Update causal end to end tests
- if OMNITRACE_BUILD_NUMBER is 1, set OMNITRACE_VERBOSE=0
|
||
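The core idea of #294 (per-thread storage that grows on demand, plus loops bounded by the number of threads actually seen rather than a compile-time maximum) can be illustrated with a small sketch. This is not the omnitrace C++ implementation: `ThreadRegistry` and `ThreadData` are hypothetical stand-ins for `thread_info` and `thread_data<T>`, and the registry simply counts threads created, which is a simplification of "peak number of threads".

```python
import threading


class ThreadRegistry:
    """Hypothetical stand-in for thread_info: hands out sequential ids and tracks the peak."""

    def __init__(self):
        self._lock = threading.Lock()
        self._peak = 0

    def register(self):
        # Assign the next id and bump the peak count atomically.
        with self._lock:
            tid = self._peak
            self._peak += 1
            return tid

    def peak_num_threads(self):
        with self._lock:
            return self._peak


class ThreadData:
    """Per-thread slots that expand on demand instead of being sized to a fixed maximum."""

    def __init__(self, factory):
        self._factory = factory
        self._slots = []
        self._lock = threading.Lock()

    def instance(self, tid):
        # Grow the slot list until `tid` fits, then construct the slot lazily.
        with self._lock:
            while len(self._slots) <= tid:
                self._slots.append(None)
            if self._slots[tid] is None:
                self._slots[tid] = self._factory()
            return self._slots[tid]

    def get(self, tid):
        # Read-only access: returns None for ids that were never constructed.
        with self._lock:
            return self._slots[tid] if tid < len(self._slots) else None
```

Post-processing code can then iterate `range(registry.peak_num_threads())` and skip `None` slots, which mirrors the "loop over max_supported_threads -> loop over thread_info::get_peak_num_threads()" change repeated throughout the commit.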
|
|
3e2fa69a14 |
CI timeout + line-info in releases (#279)
* Update perfetto args.gn.in - remove enable_perfetto_tools_trace_to_text (unused) * core timeout implementation - requires OMNITRACE_CI=ON - requires OMNITRACE_CI_TIMEOUT=<sec> - adds pthread_self and std::this_thread::get_id to thread info - pthread_create_gotcha stores native handles (pthread_self) * Testing updates - improve detection of segfault/failures with PASS_REGEX exists - add OMNITRACE_CI_TIMEOUT env variable to all tests * Line-info in releases - e.g. -g1 + more options to minimize size of debug info * Fix typo in config exit action message * OMNITRACE_UNLIKELY around debug/verbose messages * format fixes * Overflow tests + capability check * transpose example update - link to threads library * roctracer/rocprofiler update - in ROCm 5.5.0, cannot include rocprofiler.h and roctracer.h in same file due to conflicting enum defs - Moved HSA tracing setup/shutdown to component::roctracer * roctracer update - fix definition of roctracer::setup when disabled * Update fork example - detach threads on main PID - flush io outputs when printing info * Update overflow tests - pass regular expressions - overflow on PERF_COUNT_SW_CPU_CLOCK event * fork gotcha update - use getpid() instead of getppid() * update fork example - wait on threads calling fork * timeout update - wait on timeout thread to launch before proceeding |
||
|
|
9de3a6b0b4 |
Linux Perf Support + Causal Profiling Updates (#276)
* causal backtrace updates
- fix initial causal sampling period value
* causal delay updates
- tweak handling of sleep_for_overhead
* Fix experiment global scaling for prog pts
- results in drastically improved predictions
* pthread_mutex_gotcha updates
- disable all wrappers during causal profiling
* validate-causal-json.py updates
- support decimal stddev
- fix setting stddev from command-line
* causal perform_experiment_impl update
- handle start failing because finalizing
* deprecate causal::component::sample_rate
- appears to not help at all
* Rework sample info
* Increase causal unwind_depth
- use OMNITRACE_MAX_UNWIND_DEPTH
* validate-causal-json updates
- min experiments
- exclude reporting predictions with less than X experiments at a given speedup
- percent samples
- only print samples within X% of the peak (default: 95%)
* Update timemory submodule
- extensions to sampling for signals delivered via non-timer method
- e.g. via HW counter overflow
* dwarf_entry::operator< updates
- sort via file
* causal profiling docs updates
- info about backends
- info about installing/enabling perf
* config updates: causal backend
- CausalBackend enum
- OMNITRACE_CAUSAL_BACKEND: perf, timer, auto
- omnitrace-causal option: --backend
* debug update
- use spin_mutex instead of std::mutex
* address_range::contains update
- range from 0-100 contains range from 10-100 but was returning false because high was == 100 not < 100
* symbol::operator< update
- handle load address differences
* sampling updates (non-causal)
- update get_timer to get_trigger + dynamic_cast
* container::static_vector updates
- support construction from container::c_array
- update_size private member func for handling atomic m_size
* Move perf files
- moved library/causal/perf.{hpp,cpp} to library/perf.{hpp,cpp}
* causal example update
- created impl.hpp (forward decls)
- renamed {cpu,rng}_func_impl to {cpu,rng}_impl_func
- only create two threads which run N iterations instead of two threads each iteration
* Update timemory submodule
- updates to unwind::processed_entry
- updates to procfs::maps
* Updated causal documentation
- fixed line numbers changed by modifications to causal example
* omnitrace-causal exe updates
- set OMNITRACE_THREAD_POOL_SIZE to zero by default
* core/containers updates
- static_vector: provide data() member function
- c_array pop_front() and pop_back() member functions
* core: config and argparse updates + perf
- core/perf.{hpp,cpp}
- forward decl of enums
- config-related capabilities
- argparse: --sample-overflow
- renamed some config functions
- e.g. get_sampling_cpu_freq -> get_sampling_cputime_freq
- added config settings related to overflow sampling via perf
- added timer_sampling and overflow_sampling categories
* Update timemory submodule
- sampling allocator flushing
* binary updates
- lookup_ipaddr_entry
- use bfd_find_nearest_line instead of bfd_find_nearest_line_discriminator
- discriminators are not used
- explicit instantiations of inlined_symbol::serialize
* Bump VERSION to 1.10.0
* sampling and perf updates
- support overflow sampling via Linux Perf
- update perf namespace
- update perf::perf_event
- update record ctor: pointer instead of const ref
- update open member func: return optional string
- add m_batch_size member variable
- sampling updates
- support overflow sampling
- flush allocators
- increase buffer size from 1024 to 2048
- restructure post-processing in light of perf overflow support
- improve offload memory usage: only load buffers for thread
- load_offload_buffer(tid) uses thread-specific filepos
- component updates
- backtrace_metrics::operator-=
- backtrace_metrics::operator-
- backtrace::sample does not record for overflow signal
- callchain: perf overflow sample
* core updates
- component::sampling_percent does not report self + uses_percent_units
* causal updates
- tweak get_line_info
- overloads for set_current_selection (uint64_t, c_array, std::array)
- delay
- use sampling::pause/sampling::resume
- experiment
- experiment::sample derives from unwind::processed_entry
- experiment::samples is vector instead of set
- fixed samples
- overloads for is_selected (uint64_t, c_array, std::array)
- scaling factor defaults to 100 instead of 50
- serialize updates follow change to experiment::sample
- modify algorithm for increasing/decreasing experiment length
- sample_data
- use map<uintptr, uint64_t> instead of set<sample_data>
- get_samples returns vector<sample_data> instead of set<sample_data>
- sampling
- support overflow via Linux Perf
- update causal_offload_buffer
- flush sampling allocator
- backtrace
- overflow component
* libomnitrace-dl updates
- handle dl::InstrumentMode::PythonProfile
* testing updates (causal)
- causal line 155 -> causal line 100
- causal line 165 -> causal line 110
* formatting
* exit_gotcha updates
- exit_info for abort()
- message about non-zero exit code
* testing updates
- fail regex for causal tests
- validate-causal-json: >= min_experiments instead of > min_experiments
- handle OMNITRACE_DEBUG_SETTINGS in omnitrace_write_test_config
* causal sampling updates
- add new lines where appropriate
* causal data updates
- reorder diagnostic info when experiment fails to start
* binary updates
- symbol address range from address to address + symsize + 1
- add 1 based on debug info
* causal data updates
- sample_selection wait_ns defaults to 1,000 instead of 10,000
- sample_selection wait scaled by iteration number
- save_line_info_impl verbosity
- print latest_eligible_pc when experiment does not start
* causal sampling + component updates
- perf backend disables component::backtrace
- ensure get_sampling_(realtime|cputime|overflow)_signal do not malloc
* causal: remove period stats
* validate-causal-json update
- fix --help
* causal data updates
- improve eligible pc history reporting when experiment fails to start
* causal data updates
- fix compute_eligible_lines_impl
- eligible address ranges returning too many ranges
- occasionally, overwrite all *true* eligible address ranges
* causal data updates
- reduce scoped ranges to symbol ranges
- is_eligible_address() returns true contains (not just coarse)
- revert some sample_selection behavior
* binary address_multirange updates
- make coarse_range private
- fix operator+=(pair<coarse, uintptr_t>)
* causal example update
- fix nsync to default to once per iteration
* binary analysis updates
- tweak header file includes
* causal updates
- remove factoring in sleep_for_overhead
- invoke delay::process() even if experiment is not active
* causal data updates
- update latest_eligible_pc structure
* update omnitrace-install.py.in
- fix support for fedora
- /etc/os-release does not have ID_LIKE
- fallback to RHEL 8.7 if version not specified
* update omnitrace-install.py.in
- fix support for debian
- /etc/os-release does not have ID_LIKE
- version mapping
* Update documentation
- update docs on installation
* causal data and experiment updates
- data: reset_sample_selection
* causal set_current_selection debugging
- debug messages for failed e2e runs
* causal data and backtrace component updates
- data: set_current_selection returns the number of eligible addresses added
- backtrace: if cputime signal has selected zero IPs > 5x, then realtime signal starts contributing call-stacks
* core library updates
- move config::parse_numeric_range to utility namespace
- add core/utility.cpp
- support range:increment, e.g. 5-25:10 expands to '5 15 25' instead of '5 10 15 20 25'
* omnitrace-causal update
- end-to-end expands all speedups
- support range:increment in speedups
* causal backtrace updates
- remove select_ival (realtime signal always contributes when select_count == 0)
* containers: static_vector update
- explicit c_array constructor
- explicit std::array constructor
* causal data updates
- remove set_current_selection(uint64_t)
- remove set_current_selection(std::array)
- sample_selection increase default wait time
- report eligible PC candidates
- move reset_sample_selection to perform_experiment_impl
- decrease latest_eligible_pc array size
- set_current_selection does not guard for experiment::active
* core debug updates
- OMNITRACE_PRINT_COLOR macros
* causal data updates
- tweak to experiment never started message
* causal gotcha updates
- remove unused code
* critical trace updates
- remove unused code
* omnitrace-causal
- OMNITRACE_LAUNCHER
* causal data updates
- don't fail on end-to-end + omnitrace-causal
* causal backtrace updates
- reintroduce select_ival behavior
* causal data updates
- tweak verbose messages about number of PC candidates
* core mproc updates
- utilities for waiting on child PID and diagnosing status
- omnitrace::mproc::wait_pid
- omnitrace::mproc::diagnose_status
* omnitrace-run updates
- support --fork argument for executing via fork in current process + execvpe on child instead of execvpe in current process
* omnitrace-causal updates
- wait_pid and diagnose_status just call equivalent functions in omnitrace::mproc
* ubuntu-focal workflow update
- attempt to launch ubuntu-focal-codecov job with CAP_SYS_ADMIN and use perf backend
* tests reorg and updates
- remove binary-rewrite-sampling and runtime-instrument-sampling tests
- rename *-preload tests (which use omnitrace-sample exe) to *-sampling
- split tests/CMakeLists.txt into several tests/omnitrace-<category>-tests.cmake files
- tweak to causal-both-omni-func test
- add args: -n 2 -b timer
* update validate-causal-json.py
- better reasoning info for adjusting tolerance
- always apply tolerance adjustments in CI mode
* causal e2e tests update
- add "causal-e2e" label
- tweak params
- old: 80 12 432525 500000000
- new: 80 50 432525 100000000
- disable processor affinity for slow-func/line-100 tests
- artificially inflates some speedups with perf
* unblocking_gotcha updates
- overload operator() according to gotcha function index
* blocking_gotcha updates
- overload operator() according to gotcha function index
- fix bug where post-block functors (e.g. pthread_mutex_trylock) could throw an error if the lock is not acquired
* parse_numeric_range update
- support unordered_set
* config update
- OMNITRACE_DEBUG_{TIDS,PIDS} use parse_numeric_range
|
||
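The `range:increment` syntax added to `parse_numeric_range` in #276 ("5-25:10 expands to '5 15 25'") can be sketched as follows. This is an illustrative Python reimplementation, not the C++ function from omnitrace. The default increment of 1, the accepted separators, and the restriction to non-negative integers are assumptions made for the example.

```python
def parse_numeric_range(text, default_increment=1):
    """Expand 'N', 'A-B', and 'A-B:STEP' tokens into a flat list of ints.

    Tokens may be separated by commas or whitespace (assumption).
    Only non-negative integers are handled in this sketch.
    """
    values = []
    for token in text.replace(",", " ").split():
        # Split off an optional ':STEP' suffix.
        span, _, step = token.partition(":")
        increment = int(step) if step else default_increment
        if increment <= 0:
            raise ValueError(f"increment must be positive: {token!r}")
        low, sep, high = span.partition("-")
        if not sep:
            # Single value, no range to expand.
            values.append(int(low))
            continue
        # The upper bound is inclusive, so 5-25:10 yields 5, 15, 25.
        values.extend(range(int(low), int(high) + 1, increment))
    return values
```

For example, `parse_numeric_range("5-25:10")` gives `[5, 15, 25]`, while `parse_numeric_range("5-25:5")` gives `[5, 10, 15, 20, 25]`.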
|
|
b39a683eab |
omnitrace-avail updates (#272)
* omnitrace-avail updates - enables text wrapping for descriptions - reworks the HW counters display layout - added new column "Device" which has either "CPU" or "GPU" - support sorting HW counters alphabetically - fixed some minor csv issues - reorganize the order of the argparse arguments * Fix tests |
||
|
|
abe35de43a |
omnitrace-run executable - required for running binary writes (#257)
* omnitrace-run exe - ensure LD_PRELOAD for libomnitrace-dl.so - convert config options into command-line options * Update timemory submodule - updates to tsettings - updates to argparser * common environment update - throw error if get_env<bool> has empty string * config updates - minor tweaks to categories of settings * core lib update - add argparse for common handling of argument parsers * omnitrace-sample update - fix handling of --trace-file (OMNITRACE_PERFETTO_FILE) * omnitrace-run update - updated to use omnitrace::argparse functions * Tests for omnitrace-run * argparse core update - remove choices for --cpu-events and --gpu-events * remove some debugging prints * fix timemory include in argparse.cpp * always provide --hsa-interrupt option * Update source/lib/core/argparse.cpp - fix pedantic warning * Update testing - remove testing args that may not be there in some builds * roctracer/pthread_create fix - disable roctracer_data when roctracer not enabled * omnitrace-causal tweak * omnitrace-instrument: module_function tweak - allow DEFAULT_MODULE and LIBRARY_MODULE * common environment update - support get_env for enums * core: config update - Add "mode" category to OMNITRACE_MODE * Update timemory submodule - remove debug print statement * omnitrace-sample tweak - change var init * omnitrace-run testing update - use --help instead of -? 
* core: common.hpp - tweak header include style * core: argparser update - add_ld_preload func - launcher and command member variables in parser_data - support launcher * omnitrace-run update - clean up and reworked * libomnitrace-dl updates - require LD_PRELOAD with binary rewrite - dl::InstrumentMode - dl::get_instrumented() - verify_instrumented_preloaded() - omnitrace_set_instrumented(int) - relocated omnitrace_main from main.c to dl.cpp - omnitrace_set_env does not dlopen libomnitrace - omnitrace_set_main(func_ptr) [internal API] - OMNITRACE_HIDDEN_API -> OMNITRACE_INTERNAL_API * Update testing to new LD_PRELOAD requirements * omnitrace-instrument updates - adhere to LD_PRELOAD requirements - invoke omnitrace_set_instrumented - binary rewrite does not instrument main - binary rewrite does not instrument call to omnitrace_init - runtime instr does not instrument main - runtime instr does not instrument call to omnitrace_init * Bump to v1.9.0 - LD_PRELOAD requirement necessitates minor version increment * common: environment - fix ambiguous get_env calls * omnitrace-instrument update - fix issue with temporaries * omnitrace-instrument and libomnitrace-dl updates - runtime instrumentation does not work if libomnitrace-dl is preloaded * libomnitrace-dl and libpyomnitrace updates - define dl::InstrumentMode in dl.hpp - handle instrumentation via setprofile libpyomnitrace - do not push trace in omnitrace_init * omnitrace-instrument and libomnitrace-dl updates - move header to dl subdirectory - omnitrace::omnitrace-headers include omnitrace-dl folder - use InstrumentMode in omnitrace-instrument * Update workflows and scripts - Use omnitrace-run on instrumented exes * Update docs - add omnitrace-run to examples of running binary rewritten exes |
||
|
|
ab0e5d9b44 |
omnitrace -> omnitrace-instrument (#256)
* omnitrace-exe -> omnitrace-instrument - Renamed omnitrace executable to omnitrace-instrument - Provided dummy omnitrace exe which forwards onto omnitrace-instrument - updated all docs to reflect the name change of the executable - however, it is possible some were missed * Update dyninst submodule - correctly handle BOOST_LINK_STATIC in DyninstBoost.cmake * Disable IPO for omnitrace-instrument |
||
|
|
1688a027d8 |
Add RedHat CI and release packaging (#251)
- additional miscellaneous tweaks to workflows and docker scripts, e.g. install perfetto python bindings - improves the stability of MPI finalization - reduces some debug messages within timemory when `OMNITRACE_DEBUG=ON` - fixes issue found in RHEL where libunwind is using mutex and omnitrace was not treating this as an internal mutex call - this may have been affecting the causal profiling slightly (tests seem a bit more stable now) - fix data race in timemory * Add RedHat CI and release packaging - additional miscellaneous tweaks to workflows and docker scripts, e.g. install perfetto python bindings * Fix URL for ROCm packages in redhat workflow * Fix dnf --enable-repo for ROCm perl packages * Dockerfile.rhel and redhat.yml updates - Fix dnf repo for ROCm PERL packages - Disable python in CI (interpreter segfaults) - Exclude parallel-overhead-locks tests due to inclusion of internal locks - This needs to be remedied in the future * Exclude _dl_relocate_static_pie from instrumentation * Testing updates - OMNITRACE_SAMPLING_KEEP_INTERNAL=OFF for parallel-overhead-locks * Fix redhat workflow * redhat.yml update - remove if condition on config/build/test step * Update timemory submodule - tweaks to verbosity messages * Set thread state before unw_step - on Redhat, unw_step calls mutex * Update timemory submodule - verbosity changes - gotcha uses spin_lock/spin_mutex * Remove using gsplit-dwarf unless OMNITRACE_BUILD_NUMBER > 2 * Re-enable parallel-overhead-locks tests in redhat workflow * Always disable timemory manager metadata auto output * testing updates - tweak parallel-overhead-locks-timemory to higher instruction count min - OMNITRACE_SAMPLING_KEEP_INTERNAL=OFF for parallel-overhead-locks-perfetto * Update timemory submodule - quiet realpath queries * omnitrace exe updates - detect text files - improved bin/lib locating * cmake format * test-install.sh and redhat workflow updates - handle testing when ls is script - re-enable python testing on redhat workflow 
- invoke test-install.sh in redhat workflow * Misc guards for finalization * omnitrace-exe, testing updates - test-install.sh: LS_EXEC -> LS_NAME - handle /usr/bin/ls being script in source/bin/tests - improve locating the binary * Fix mpi_gotcha compile error * omnitrace-exe updates - improve file locating * formatting * Misc fixes - remove -static-libstdc++ for RHEL packaging (rocky-linux doesn't distribute static lib) * omnitrace-exe paths * Replace realpath with absolute - using absolute path to symlink fixes issues with locating libdyninstAPI_RT at runtime * omnitrace exe updates - judicious use of realpath * Update timemory submodule - fix update main hash ids/aliases data race in merge * bin tests update - change working directory of omnitrace-exe-simulate-lib-basename * omnitrace exe updates - Update resolved exe/lib messaging * bin tests update - change working directory of omnitrace-exe-simulate-lib-basename |
||
|
|
9618ddefba |
Causal profiling (#229)
* Addition of basic structure
* Reworked categories
* More causal integration additions
* Causal implementation
* Update examples
* delete virtual_speedup files
* Update perfetto submodule to v31.0
* Update dyninst submodule
* Update timemory submodule
* ElfUtils build for libdw
* OMNITRACE_LIKELY and OMNITRACE_UNLIKELY
* Update common lib join
* Examples updates for causal profiling
* config updates with causal options
- OMNITRACE_CAUSAL_FIXED_LINE
- OMNITRACE_CAUSAL_FIXED_SPEEDUP
- OMNITRACE_CAUSAL_FILE
- OMNITRACE_CAUSAL_BINARY_SCOPE
- OMNITRACE_CAUSAL_SOURCE_SCOPE
- version info in banner
- support increments in parse_numeric_range
- fix occasional deadlock in first call to get_config
* PTL general task group
* Always include PID in debug/verbose messages
* Add blocking/unblocking gotchas to runtime init bundle
* CausalState
* thread_data updates
- generic component_bundle_cache
* Improve handling of causal in category_region
* components updates
- backtrace_causal component
- backtrace::get_data member func
- decrease ignore_depth in backtrace::sample(int)
- handle "omnitrace_main" in backtrace::filter_and_patch(...)
- tweak internal thread state scope for pthread_mutex_gotcha wrappers
* simplify tracing get_instrumentation_bundles usage
* sampling updates
- include backtrace_causal component
- disable backtrace_metrics if using causal and not using perfetto
- disable backtrace and backtrace_timestamp when using causal
- post_process_causal
* causal updates
- more checks in blocking_gotcha and unblocking_gotcha start/stop
- miscellaneous overhaul of data
- experiment update
* Remove virtual speedup
* libomnitrace code_object
* causal-profiling test
* libomnitrace library.cpp updates
- handle causal profiling
- fini_bundle
* Disable causal profiling by default
* Updated causal code and example
- example: three execution variants: cpu + rng, cpu, rng
- example: three instrumentation variants: none, omni, coz
- fix blocking gotcha credit
- rework perform_experiment_impl
- get_eligible_address_ranges
- compute_eligible_lines
- support fixed lines/speedups/functions
- update selected_entry to support function mode
- fix causal::delay
- experiment updates
* omnitrace_progress / omnitrace_user_progress
- with accompanying omnitrace_annotated_progress / omnitrace_user_annotated_progress
* Update timemory submodule
* CausalMode
- mode indicates whether causal predictions should be at line-level or function-level
* code_object, config, runtime, sampling, thread_data
- code_object: address_range
- code_object: basic::line_info serialize(), name(), hash()
- config updates
- two signals for causal sampling
- thread_data init fixes
* pthread updates
- pthread_create_gotcha processes delays
- pthread_mutex_gotcha does not wrap pthread_join in causal mode
* backtrace_causal update
- dynamic delay period stats
* main wrapper uses basename of argv[0]
* update elfio submodule
* perf support (currently unused)
* Fix experiment JSON serialization
- static_vector.hpp (unused)
* causal executable + config options updates
- omnitrace-causal exe simplifies running multiple causal configs
- changed the causal config option names
* Support both throughput and latency points
* process-causal-json.py script
- will be used later for testing
* stable_vector
* Rework thread_data
* Improve omnitrace-causal exe
- better verbosity handling
- correct diagnosis of status for child process
- execvpe when only one iteration (debugging)
* Update timemory submodule
* exe --version
- omnitrace, omnitrace-avail, and omnitrace-sample all support --version on command-line
* OMNITRACE_INTERNAL_API + OMNITRACE_{LIKELY,UNLIKELY}
* omnitrace-causal cmake format
* omnitrace config update
- OMNITRACE_CAUSAL_FILE_CLOBBER
* custom exception
- wraps STL exception and gets stacktrace during construction
* exit_gotcha supports _Exit
* use global construct_on_init + max threads
- add some safety when exceeding max # of threads
* update code_object binary filter
- exclude dyninst and tbbmalloc library
* containers: c_array, static_vector, stable_vector
- moved utility::c_array to container::c_array
- created static_vector: std::vector bound to std::array
- created stable_vector: vector with stable references
* grow thread_data when new thread created
* causal updates
- data: improve compute_eligible_lines to ignore lambdas
- data: use new thread_data
- delay: use new thread_data
- experiment: properly support latency points
- experiment: support file clobber
- experiment: ensure non-zero experiment time
- progress_point: use new thread_data
- backtrace_causal: use new thread_data
* Update causal-profiling tests
* fix omnitrace-causal backslash escaping
* process-causal-json script
* restructure causal implementation
- update verbose messages for omnitrace-causal diagnose_status
- migrated causal implementation in sampling.cpp to causal/sampling.cpp
- OMNITRACE_USE_CAUSAL does not require OMNITRACE_USE_SAMPLING
- added Mode::Causal
- causal sampling uses same signals as regular sampling
- moved tracing::thread_init to implementation file
- combined tracing::thread_init and tracing::thread_init_sampling
- added causal/components folder
- pthread_create_gotcha::wrapper_config
- omnitrace_preload checks OMNITRACE_USE_CAUSAL
- updates mode accordingly
* update timemory submodule
* update timemory submodule
* causal example updates
- causal for lulesh
* perf code + utility - helpers
- relocated causal perf code
- placement new when generating unique ptr trait for potentially allocating during sampling
- additions to utility header
- removed previously added helpers.hpp
* update timemory submodule
* Default env variables for omnitrace-causal
- activate OMNITRACE_USE_KOKKOSP, etc.
* update stable_vector and static_vector
- static vector can use atomic for size tracking for thread-safe situations
* update causal example header
- CAUSAL_PROGRESS_NAMED
- use CAUSAL_ prefix for some macros
* Tweak lulesh example
- use CAUSAL_PROGRESS instead of CAUSAL_BEGIN and CAUSAL_END
* omnitrace-sample support for causal mode
- set OMNITRACE_USE_SAMPLING to off when OMNITRACE_MODE=causal
* refactor and cleanup code_object
- scope filter
- fixes to address_range
* overhaul causal data + causal config options
- full support for function and line mode
- support static vector of instruction pointers
- improve line info mapping resolution
- remove thread-locality from miscellaneous functions where unnecessary
- causal options for {binary,source,function,fileline} exclusion
* causal experiment, sampling, and backtrace updates
- is_selected + unwind address array
- experiment warning about progress points
- increased buffer size for backtrace_causal sampler
- backtrace_causal only stores IP addresses instead of full unwind info
* category_region updates
- minor refactor
- local_category_region::mark
* Update causal tests
* Bump version to 1.8.0
* omnitrace-causal args + CLOBBER -> RESET
- renamed OMNITRACE_CAUSAL_FILE_CLOBBER to OMNITRACE_CAUSAL_FILE_RESET
- updated omnitrace-causal exe to support recently added configuration options
- other miscellaneous tweaks to data.cpp, experiment.cpp, and sampling.cpp
* Refactor causal and code_object
- code_object.hpp and code_object.cpp moved into binary folder
- causal components namespaced into omnitrace::causal::component
- moved sample_data out of backtrace_causal and into own file
- renamed backtrace_causal to causal::component::backtrace
* preload omnitrace_init + OMNITRACE_DEBUG_MARK
- env OMNITRACE_DEBUG_MARK
- fix omnitrace_init call when LD_PRELOAD-ing omnitrace
* Fix fileline support + line-info output names + experiment log
- line-info log files are prefixed with experiment name
- don't print experiment duration when E2E
- account for fileline scope in analysis
* KokkosP: OMNITRACE_KOKKOSP_NAME_LENGTH_MAX
- config option to limit the name of kokkos tool callbacks
- remove [kokkos] from KokkosP names
* Update causal example
- minor tweaks to decrease probability of overlapping regions in binary
* omnitrace-causal update
- prefix N / Ntot in environment printout
* Miscellaneous updates
- causal::finish_experimenting()
- OMNITRACE_CAUSAL_RANDOM_SEED
- KokkosP causal updates
- exclude some callbacks, make some callbacks unique, etc.
- address_range::operator+=(address_range)
- combine contiguous ranges in binary/analysis.cpp when file, func, line is same and address range is contiguous
- bfd_line_info reads inline info
- wait for perform_experiment_impl to complete
- causal::delay updates
- delay::process checks if experiment is active
- uses threading::get_id()
- experiment scales duration up for larger speedup experiments
- line info samples includes excluded lines
- sampler uses CLOCK_REALTIME
- blocking_gotcha updates
- is no longer fully static
- adds audit routine which sets the postblock value to zero if try/timed routine fails
- category::host was added to causal_throughput_categories_t
- pthread_create_gotcha sets new threads local parent delay
- was using internal value, now uses sequent value
* Causal improvements to KokkosP
* Updates to experiment time scaling
- use stats instead of just max
* binary/link_map.{hpp,cpp}
* update process-causal-json.py
* Folded fileline scope into source scope
* Update documentation
- Add documentation for causal profiling
- Replace 'Omnitrace' with 'OmniTrace' everywhere
* Update causal-helpers.cmake + omnitrace-testing.cmake
- split tests/CMakeLists.txt partially into omnitrace-testing.cmake
* omnitrace/causal.h
- OMNITRACE_CAUSAL_PROGRESS
- OMNITRACE_CAUSAL_PROGRESS_NAMED
- OMNITRACE_CAUSAL_BEGIN
- OMNITRACE_CAUSAL_END
* selected_entry + remove default filters for lambdas and operator()
- selected entry stores range and binary load address
* update process-causal-json.py
* format examples/lulesh/CMakeLists.txt
* causal-helpers find_package(Threads)
* OMNITRACE_KOKKOSP_KERNEL_LOGGER
- was OMNITRACE_KOKKOS_KERNEL_LOGGER
* quiet find of coz-profiler
* Fix rocm_smi exception handling
* Update timemory submodule (binutils)
- fix binutils compile error on some systems
- bump binutils to v2.40
* Fix miscellaneous tests
* OMNITRACE_KOKKOSP_PREFIX
* revert rocm_smi handling
* ElfUtils updates
- default to download version 0.188
- add -Wno-error=null-dereference due to GCC 12 compiler error
* Update causal example
* Remove OMNITRACE_VERBOSE from global workflow envs
* Reliable causal test
* disable compilation of causal perf files
* Remove set_current_selection with unwind stack
* update timemory submodule
* fix for segfault on bionic
- locking in TLS dtor was causing segfault
* remove experiment::is_selected(unwind_stack_t)
* update default init of selected_entry
* Fix for when IP is not offset by load address
* Update CMakeLists.txt
* Miscellaneous updates
- OMNITRACE_WARNING_OR_CI_THROW
- OMNITRACE_REQUIRE
- OMNITRACE_PREFER
- fixed issues with no ASLR
- added load address variable and ipaddr() func to basic/bfd line info
- removed get_basic() from dwarf_line_info
- TIMEMORY_PREFER -> OMNITRACE_PREFER
- removed previously added binary_address and range variables from selected_entry
* Removed superfluous CausalState
* Additional causal tests (lulesh + kokkos)
* filter, prefer, analysis ASLR handling
- removed default filter on cold functions
- fixed OMNITRACE_PREFER
- fixed analysis ASLR handling
* Tweak line-info output
* Removed some superfluous code
- causal/delay
- causal/selected_entry
* Exclude main.cold in function mode
* Update validate-perfetto-proto.py
- account for occasional http errors
* Add sampling test disabling tmp files
* argparser for process-causal-json
- support validation
- support filtering
* Avoid pthread_{lock,unlock} in sampling offload
- use homemade atomic_mutex/atomic_lock since contention will be low and using pthread tools might trigger our wrappers
* Rename process-causal-json.py
- validate-causal-json.py
* rework omnitrace_add_causal_test
- capable of performing validation
- added validation tests
* Fix kokkosp_begin_deep_copy + causal
* Tweak address range in bfd_line_info::read_pc
* Tweak analysis and data IP handling
- look for gaps
* Disable scaling experiment time by speedup
* Revert change in max threads during CI
* binary updates
- significant overhaul of binary analysis implementation
- removed "basic_line_info" and "bfd_line_info" in lieu of "symbol" class
- symbol class has basic BFD info + vector of inlines + vector of dwarf info
* Updated causal to use new binary analysis
- Fix symbol.cpp includes
* Updated formatting target
- include *.cmake files
* Updated causal tests
- causal tests should be stable now
* Update timemory and dyninst submodules
- TPLs are stripped + built w/o debug info
* Increase tolerance for causal validation speedups
- higher speedups have more variance (increased to +/- 5 from 3)
* Support causal output for MPI
- i.e. tag with MPI rank
* omnitrace-causal launcher argument
* improve experiment sampling output
* causal data updates
- call compute lines once
- fixed filtered cached binary info
- debugging info when experiment fails to start
* Tweaked causal validation tests
* dwarf_entry ranges
* CI updates
- increase max threads to 64
* Tweak causal E2E validation tests
- more threads
- shorter thread runtime
- more iterations
* Fix shadowed variable
* fix symbol read_bfd last PC calculation
* fix maybe-uninitialized warning
* omnitrace-causal launcher update
- only inject "omnitrace-causal --" once
- throw error if no matches found
* Update causal profiling docs for launcher
* fix address range boundaries
|
||
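The "Avoid pthread_{lock,unlock} in sampling offload" item in the commit above describes replacing pthread locking with a homemade `atomic_mutex`/`atomic_lock`, because contention is low and the pthread calls could trigger the tool's own wrappers. A minimal sketch of that idea follows, assuming a plain spin lock built on `std::atomic_flag`. The class bodies and the `guarded_increment` helper are hypothetical illustrations, not the project's implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the "atomic_mutex" named in the commit message.
// It never calls pthread_mutex_* functions, so wrappers around those
// functions are not triggered by taking this lock.
class atomic_mutex
{
public:
    void lock() noexcept
    {
        // Spin until this thread is the one that flips the flag from clear to set.
        while(m_flag.test_and_set(std::memory_order_acquire))
            std::this_thread::yield();
    }

    // Returns true if the lock was acquired without waiting.
    bool try_lock() noexcept
    {
        return !m_flag.test_and_set(std::memory_order_acquire);
    }

    void unlock() noexcept { m_flag.clear(std::memory_order_release); }

private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};

// Hypothetical RAII guard, analogous to std::lock_guard.
class atomic_lock
{
public:
    explicit atomic_lock(atomic_mutex& m) noexcept
    : m_mutex{ m }
    {
        m_mutex.lock();
    }
    ~atomic_lock() { m_mutex.unlock(); }

    atomic_lock(const atomic_lock&) = delete;
    atomic_lock& operator=(const atomic_lock&) = delete;

private:
    atomic_mutex& m_mutex;
};

// Example critical section: the counter is only modified while the lock is held.
inline long guarded_increment(atomic_mutex& m, long& counter)
{
    atomic_lock guard{ m };
    return ++counter;
}
```

A spin lock like this is only a reasonable trade when the critical section is short and contention is rare, which is the situation the commit message describes.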
|
|
e1102a8ba4 |
CI and testing updates (#203)
* Python implementation of run-ci.sh
* Container workflow update
- retry failed container build to combat network failures
* cpack workflow update
- retry failed base container build to combat network failures
* General CI workflow updates
- retry failed "Install packages" step to combat network failures
* Miscellaneous linting fixes
* Formatting workflow update
- improve regex for source formatting
* format user.h
* Add new omnitrace-avail tests
* Make run-ci.py executable
* workflow retry fix
- timeout_seconds -> retry_wait_seconds
* Fix cmake formatting glob
* source formatting
* Handle PRs in run-ci.py
* Specify timeout_minutes in retry steps
* Remove remaining --cmake-args from workflows
* CI warnings about using MPICH headers
* Remove text=True from run-ci.py
- not capturing stdout/stderr so unnecessary
* Fix OpenSUSE step label
* Update omnitrace-avail-write-config tests
- use TWD (Test Working Directory) instead of PWD since PWD might not be build directory
* paths-ignore + workflow_dispatch
|
||
|
|
90ff7188f8 |
Crusher hackathon updates (#164)
- improved error handling in dyninst
- improved error handling in omnitrace exe
- new logging facility for omnitrace exe
- improved backtraces
- disable concurrent kernels in rocprofiler
- updates `setup-env.sh` and modulefile
- set `omnitrace_ROOT`
- set `HSA_TOOLS_LIB` if roctracer or rocprofiler enabled
- set `ROCP_TOOL_LIB` if rocprofiler enabled
- closes #163
- No longer make setting `HSA_ENABLE_INTERRUPT=0` the default
- this has performance implications
- this was set to work around a bug in ROCR which caused an ioctl call in ROCm to hang when interrupted. But it was only interrupted when realtime sampling was enabled since the CPU-clock doesn't increment when waiting
- This bug should be fixed in ROCm 5.3
- omnitrace no longer activates a realtime sampler by default when sampling, thus this bug is no longer encountered unless the user explicitly triggers realtime sampling
|
||
|
|
808ea7dfa7 |
Rework sampling and colorized logs (#140)
## Overview
This is a significant PR which has 3 very notable characteristics:
1. Omnitrace colorizes most of its logging
2. Completely reworked the sampling
- Samples now record the current instruction pointers instead of strings
- This _dramatically_ decreases the overhead of taking a sample
- The collection of metrics during a sample is split out into another component, enabling that data collection to be disabled -- which decreases the sampling overhead even further
- When both `OMNITRACE_SAMPLING_CPUTIME` and `OMNITRACE_SAMPLING_REALTIME` are ON:
- `OMNITRACE_SAMPLING_CPUTIME_FREQ` and `OMNITRACE_SAMPLING_REALTIME_FREQ` can be used to individually control the sampling frequency
- `OMNITRACE_SAMPLING_CPUTIME_DELAY` and `OMNITRACE_SAMPLING_REALTIME_DELAY` can be used to individually control the delay time before starting
- Now, omnitrace does not start a real-time sampler on the main thread unless `OMNITRACE_SAMPLING_REALTIME` is ON
- In the future, an `OMNITRACE_SAMPLING_TIDS` (and real-time, cpu-time variants) configuration variable(s) will allow you to select which threads will be sampled
3. Files produced by the `omnitrace` exe (`available-instr.txt`, `instrumented-instr.txt`, etc.) no longer have the `-instr` suffix and are placed in the `instrumentation/` subfolder, i.e. `available-instr.txt` -> `instrumentation/available.txt`
- This helped de-clutter the output folder
Most of the other edits were reorganization (e.g. internal namespace changes), cleanup, and splitting up functionality.
## Bug Fixes
This PR fixes a bug in the HSA callbacks which disabled sampling on child threads when an HSA API call was made.
## Details
- created thread_info struct for mapping different thread IDs
- reorganized file structure significantly
- added categories.hpp, concepts.hpp
- moved around name trait definitions
- moved all omnitrace components into `omnitrace::component` namespace
- there was a lot of inconsistency b/t using `tim::component` in some places and `omnitrace::component`
- added macros like OMNITRACE_DECLARE_COMPONENT in lieu of TIMEMORY_DECLARE_COMPONENT
- OMNITRACE_CRITICAL_TRACE_NUM_THREADS -> OMNITRACE_THREAD_POOL_SIZE
- roctracer and critical_trace use same thread pool
- critical_trace functions do not lock anymore bc of thread-local TaskGroup
- added `component::local_category_region` to support using `component::category_region` without explicitly passing in name
- removed `component::omnitrace` (unused)
- migrated KokkosP and OMPT to use `component::local_category_region`
- removed `component::user_region` as a result
- migrated omnitrace_{push,pop}_{trace,region}_hidden to use component::category_region
- removed `component::functors` as a result
- migrated some ppdefs
- `api::omnitrace` -> `project::omnitrace`
- `api::(...)` -> `category::(...)`
- improved recording the execution time of threads
- migrated this functionality out of pthread_create_gotcha and into thread_info
- moved mpi_gotcha, fork_gotcha, exit_gotcha, rcclp into omnitrace::component namespace
- split backtrace up into backtrace, backtrace_metrics, backtrace_timestamp components
- sampling.cpp handles setup and post-processing that was formerly in backtrace
- updated logging to use colors
- `OMNITRACE_COLORIZED_LOG` config variable
- updated docs on JSON output from timemory
- instrumentation info in instrumentation subfolder
- added testing for KokkosP entries
- added testing for ompt entries
- add_critical_trace function defined in critical_trace.hpp
- disable push_thread_state and pop_thread_state when thread state is Disabled or Completed
- add comp::page_rss to main bundle
- thread_data supports std::optional instead of std::unique_ptr
- thread_data supports tim::identity<T> to avoid unique_ptr or optional
- tracing::record_thread_start_time()
- tracing::push_timemory and tracing::pop_timemory are templated on CategoryT
- removed anonymous namespace from omnitrace::utility
- sampling backtrace stores instruction pointers instead of strings
- component::category_region updates
- handle disabled thread state
- handle finalized state
- fewer debug messages
- invoke thread_init()
- invoke thread_init_sampling()
- handle push/pop count based on category
- push/pop count only modified when used
- component::cpu_freq
- components/ensure_storage.hpp
- reworked the pthread_create replacement function
- updated parallel-overhead example to report # of times locked
- OMNITRACE_MAX_UNWIND_DEPTH build option
- update timemory submodule
|
||
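The sampling rework in the commit above says samples now record instruction pointers instead of strings, which reduces the cost of taking a sample. The sketch below illustrates that general pattern under stated assumptions: the hot path copies raw addresses into a fixed-size buffer, and name lookup is deferred to post-processing. The names `ip_sample`, `take_sample`, `symbol_table`, `resolve_ip`, and `resolve_sample` are hypothetical and are not taken from the omnitrace source.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical fixed capacity so that taking a sample never allocates.
constexpr std::size_t max_depth = 8;

struct ip_sample
{
    std::array<std::uintptr_t, max_depth> ips{};
    std::size_t                           depth = 0;
};

// Hot path: copy raw addresses only; no string formatting, no allocation.
inline ip_sample take_sample(const std::uintptr_t* stack, std::size_t n) noexcept
{
    ip_sample s{};
    s.depth = (n < max_depth) ? n : max_depth;
    for(std::size_t i = 0; i < s.depth; ++i)
        s.ips[i] = stack[i];
    return s;
}

// Hypothetical symbol table: function start address -> function name.
using symbol_table = std::map<std::uintptr_t, std::string>;

// Post-processing: map one address to the closest symbol starting at or before it.
inline std::string resolve_ip(const symbol_table& syms, std::uintptr_t ip)
{
    auto it = syms.upper_bound(ip);  // first symbol that starts after ip
    if(it == syms.begin()) return "??";
    return std::prev(it)->second;
}

// Post-processing: resolve every recorded address of a sample.
inline std::vector<std::string> resolve_sample(const symbol_table& syms,
                                               const ip_sample&    s)
{
    std::vector<std::string> names;
    names.reserve(s.depth);
    for(std::size_t i = 0; i < s.depth; ++i)
        names.push_back(resolve_ip(syms, s.ips[i]));
    return names;
}
```

The string work is paid once per distinct address after the run rather than once per sample while the application is executing.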
|
|
afa3df8523 |
Advanced category for configuration options (#125)
Adds advanced category
- advanced category hides less relevant configuration options
- omnitrace-avail has new '--advanced' option which shows these flags
- increase verbosity level to print issue with reading ppid children
- OMNITRACE_ROCTRACER_HSA_ACTIVITY defaults to ON
- OMNITRACE_ROCTRACER_HSA_API defaults to ON
|
||
|
|
7e31d9f450 |
ROCm environment fixes + workflow updates (#117)
* Improve dlopen of ROCm libraries + rocprofiler test
- Use PROJECT_BINARY_DIR in tests
- Added rocprofiler test
* Revert OMNITRACE_FORCE_ROCPROFILER_INIT
* omnitrace-avail --all test
* Fix ROCP_METRICS for ROCm 5.2.0
* Fix ROCP_METRICS for ROCm 5.2.0
* Restrict containers workflow to AMDResearch/omnitrace
* Bump version to 1.3.1
* Update cpack workflow
- generate release draft
- upload installers as release assets
* Test rocprofiler w/o roctracer enabled
* Fix formatting
* verbose message
|
||
|
|
d27f22ea37 |
Sampling use SIGRTMIN + N signals (#104)
* Use SIGRTMIN instead of SIGALRM for sampling
* Config options + fully working SIGRTMIN sampling
- OMNITRACE_SAMPLING_KEEP_INTERNAL config option
- OMNITRACE_PROCESS_SAMPLING_FREQ config option
- OMNITRACE_SAMPLING_REALTIME config option
- OMNITRACE_SAMPLING_CPUTIME config option
- OMNITRACE_SAMPLING_REALTIME_OFFSET config option
* Fix omnitrace-avail-regex-negation test
- OMNITRACE_PROCESS_SAMPLING_FREQ was causing failure
|
||
|
|
d04cbe862e |
fix omnitrace print-* with libraries (#94)
* fix omnitrace print-* with libraries
* timemory submodule update
* Update workflows to use ./bin/omnitrace instead of ./omnitrace
* cmake format
* update timemory submodule
- fix ODR violations in utility/procfs
* cmake updates
- uniform find_package for all ROCm-based libraries
* tweak transpose example
- throw exception instead of std::exit
* Inspect cmdv name before assuming not exe
- some ELF execs "think" they are libraries so only assume rewrite + simulate + all-functions if filename looks like library
- adds some test for --print-available -- <library>
* Fix _has_lib_prefix when command is < 3
* Updates and reverts to omnitrace exe
- update module_function operator< and operator==
- add function_signature operator<
- refactor module_function ctor
- revert some previous changes w.r.t. simulate and include_unninstr
* Fix source/bin/tests to use same output dir as tests
* cmake format
* Segfault mitigation + refactor + modify function iteration
- refactor module_function ctor to avoid segfaults
- string_t -> std::string
- replace std::string with std::string_view in some places
- get_name(module_t*)
- get_name(procedure_t*)
- disable using both app_modules and app_functions
- new option: --parse-all-modules to iterate over app_modules
- removed some unused code w.r.t. debug info
* Disable module_function address range for uninstrumentable functions
* Disable module_function address range for uninstrumentable functions
* Refactored getting file/line info and init/fini
- use dyninst insertInitCallback and insertFiniCallback if main not found
- fixed all issues with segmentation faults in --simulate --all-functions
* revert changes to Findrocprofiler.cmake
|
||
|
|
1877ebf47b |
omnitrace-avail generate config (#69)
* Config updates
- See PR #69 for details
- change type of OMNITRACE_DL_VERBOSE
- add "deprecated" category to OMNITRACE_ROCM_SMI_DEVICES
- reduce size of perfetto shared memory size hint
- deprecate OMNITRACE_OUTPUT_FILE in favor of OMNITRACE_PERFETTO_FILE
- set papi event choices
- read config file after reading command line
- fix update of OMNITRACE_DL_VERBOSE
- mark several settings as hidden
- timemory update support hidden attribute for settings
- rework get_perfetto_output_filename()
- Hide settings from not available backends
* Rework omnitrace-avail to support dumping configurations
* Overwrite query, tests, output flag
- Support using -O flag when dumping config
- Support checking before overwriting existing config
- Support --force to overwrite existing config
- Fix get_component_info not including omnitrace components
- Testing for dumping config
* Update documentation on omnitrace-avail
* Fix issue with timemory format + "/__w/"
* Update output prefix keys docs
* Rename --dump-config to --generate-config
* Hide MPI related options
- OMNITRACE_PERFETTO_COMBINE_TRACES and OMNITRACE_COLLAPSE_PROCESSES are hidden w/o MPI support
|
||
|
|
8837b744ca |
Fixes excluded-instr output, fini functions, tweaks MPI (#50)
- fixes population of excluded_module_functions
- omnitrace-compile-definitions have OMNITRACE_USE_MPI and OMNITRACE_USE_MPI_HEADERS
- Do not enable mpi support if no full or partial MPI support
- New option --all-functions
|
||
|
|
f93ddc1ee5 |
Fix category regex + new features (#25)
* Fix category regex + new features
- fixes issue with -R option
- Supports --csv option
- Supports --csv-separator option
- Signal handler to dump logs
- Tweak to component id strings display
- Support regex negation
* Tweak PASS_REGEX for new tests
|
||
|
|
791375bb24 |
Code Coverage Support (#46)
* Code-coverage support
* Examples update
- code-coverage example
- tweak transpose and parallel-overhead
* Coverage output + testing
- config::get_setting value(...)
- REGULAR_EXPRESSION -> REGEX in cmake func args
- coverage.hpp header
- coverage JSON
- coverage tests
* cmake formatting
* Library instrumentation w/o main + more
- fixed library instrumentation w/o main
- use TIMEMORY_PROJECT_NAME in output messages
- removed '--driver' option from omnitrace exe
- support coverage in trace mode
- OMNITRACE_KOKKOS_KERNEL_LOGGER
- support multiple calls to omnitrace_set_env after init if already called
- support multiple calls to omnitrace_set_mpi after init if same args
- support multiple calls to omnitrace_init if same mode
- unique_ptr_t for thread_data which calls finalize when thread_data is destroyed
- tweaked openmp tests
- improved finalization
* Replace CI --output-on-failure with -V
* Fix to OMNITRACE_DL_INVOKE
* omnitrace-exe and testing updates
- omnitrace::omnitrace-timemory interface library
- support for configs in omnitrace exe
- print-{available,instrumented,...} opts no longer exit w/o --simulate
- all tests apply --print-instrumented functions
- tweaked coverage tests
- print-* options print instructions not address range
* Remove OMNITRACE_DEBUG_FINALIZE=ON from CI
* Python cmake tweaks
* Tweak test ordering
* Upload CI artifacts if fail or success
* CI Python tweaks
- Use OMNITRACE_PYTHON_PREFIX and OMNITRACE_PYTHON_ENVS
* CI ELFULTILS_DOWNLOAD_VERSION
* test tweaks
- labels and more coverage tests
* tweak to omnitrace --config handling
* Update module/function constraint handling + PP
- tweak pre-processor definition handling
- removed free-standing module_constraint
- remove free-standing routine_constraint
- remove module_name.find("omnitrace") module constraint
- fully handle the output path of omnitrace *-instr files
- get_use_code_coverage config option
- print-coverage option
- coverage_module_functions
* use github.job not github.name
* Re-enable HSA_ENABLE_INTERRUPT
- remove coverage address report
|
||
|
|
afa3edebab |
Python support (#37)
* Initial python support
* Add python testing
* Increase timeout for bin tests
* cmake-format
* Valid build types + testing + formatting + more
- Enforce valid build types
- Fix to numpy install
- Increase testing timeout
- Fix to cmake format glob
- Fix to backtrace verbose
* Disable stripping libraries by default
* omnitrace exe updates
- new '--print-instructions' option
- changed format of instructions in JSON
- remove no-save-fpr tests
* Default to strip libraries when release build
|
||
|
|
945f541965 |
Documentation + Miscellaneous Fixes (#36)
* Added documentation markdown source
* Replaced AARInternal with AMDResearch in URLs
* Renamed cpack artifact names
* Fix to testing and lulesh submodule checkout
* Docker updates
* CMake and CPack
- force CMAKE_INSTALL_LIBDIR to lib
- CPACK_DEBIAN_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- CPACK_RPM_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- Tweak LIBOMP_LIBRARY find in examples/openmp
- Tweak setup-env.sh.in
* Partial update of README
- status badges
- docs link
- removed install info (covered by docs)
* OMNITRACE_SAMPLING_CPUS setting
- enables control over which CPUs are sampled for frequency
* omnitrace exe updates
- exclude transaction clone, virtual thunk, non-virtual thunk
- module_function::start_address
- module_function::instructions
- verbosity > 0 encodes instructions into JSON
* Miscellaneous fixes
- relocate setup-env.sh.in
- add modulefile.in
- Updated README.md and source/docs/about.md
- cmake fix for libomp
- fix license in miscellaneous places
- dl.hpp and dl.cpp
* Update timemory and dyninst submodules
- timemory signals updates
- dyninst Movement-adhoc updates
* cmake format
|
||
|
|
138d16d16a |
Split workflows + docker usage (#31)
* Split workflows + docker usage
* Fix omnitrace-ci-ubuntu-focal-external
* fix env
* Update path to action
* fix entrypoint
* Updated cancelling, disabled formatting
* fix entrypoint
* rework
* try using container
* relocate container
* fix image name
* shell expand
* external and external-rocm
* install libopenmpi-dev
* remove github.workspace
* github.workspace for rocm
* Update bionic, etc. + docker CI
* Remove self-hosted + bionic fix
* GIT_DISCOVERY_ACROSS_FILESYSTEM for bionic
* TIMEMORY_INSTALL_LIBRARIES + exe RPATH updates
- fix RPATH for omnitrace, omnitrace-avail, and omnitrace-critical-trace
* ubuntu bionic update
* bionic and focal-dyninst-package updates
* Disable lulesh MPI by default + timeouts
- increase openmp CG timeout
- decrease openmp CG runtime
|
||
|
|
d80752bc69 |
User API + reorganized lib folders (#30)
* User API + reorganized lib folders
- omnitrace_user_start_trace
- omnitrace_user_stop_trace
- omnitrace_user_start_thread_trace
- omnitrace_user_stop_thread_trace
- omnitrace_user_push_region
- omnitrace_user_pop_region
* New OpenMP examples/tests
* Fix to KokkosP
* OMPT support
- fixed omnitrace instrumenting reporting
- common invoke improvements
- component::user_region
* exclude kmp_threadprivate_
* Separate omnitrace into multiple files
* PTL and timemory submodule updates
* Active guards + USE_OMPT guards in omnitrace-dl
* Tweak transpose default iterations
* omnitrace-precommit build target
* Omnitrace exe restructuring pt 2
- Never instrument functions with less than 4 instructions
- Never instrument ompt_start_tool or nanosleep
- module_function serializes heuristics
- removed hash stuff from omnitrace
- removed instr_procedures lambda
- WAITPID_DEBUG_MESSAGE
* set_state, "_hidden" fix, CI exceptions, backtrace fix
- set_state function
- fixed "_hidden" from appearing in print macros using __FUNCTION__
- OMNITRACE_CI_THROW
- more CI checks in library
- fixed backtrace init value sample issue being ignored
* Tweaks to OMPT tests
* cmake-formatting
* Removed debug output from backtrace processing
* Fix warnings and verbosity
* omnitrace-dl fix for libomp
* omnitrace-avail fixes
- remove second omnitrace_init_library call
- fix -r option not working
* Additional testing
- source/bin/tests
- tests for omnitrace-exe
- tests for omnitrace-avail
* cmake-format
* Reduce runtime of openmp-lu
* Update openmp-lu and tests timeout
* openmp-lu and CI tweaks
- decrease iterations
- OMP_NUM_THREADS=2
- install clang and libomp-dev in linux-ci
- fix data-files in linux-ci
|