9618ddefba
* Addition of basic structure
* Reworked categories
* More causal integration additions
* Causal implementation
* Update examples
* delete virtual_speedup files
* Update perfetto submodule to v31.0
* Update dyninst submodule
* Update timemory submodule
* ElfUtils build for libdw
* OMNITRACE_LIKELY and OMNITRACE_UNLIKELY
* Update common lib join
* Examples updates for causal profiling
* config updates with causal options
- OMNITRACE_CAUSAL_FIXED_LINE
- OMNITRACE_CAUSAL_FIXED_SPEEDUP
- OMNITRACE_CAUSAL_FILE
- OMNITRACE_CAUSAL_BINARY_SCOPE
- OMNITRACE_CAUSAL_SOURCE_SCOPE
- version info in banner
- support increments in parse_numeric_range
- fix occasional deadlock in first call to get_config
* PTL general task group
* Always include PID in debug/verbose messages
* Add blocking/unblocking gotchas to runtime init bundle
* CausalState
* thread_data updates
- generic component_bundle_cache
* Improve handling of causal in category_region
* components updates
- backtrace_causal component
- backtrace::get_data member func
- decrease ignore_depth in backtrace::sample(int)
- handle "omnitrace_main" in backtrace::filter_and_patch(...)
- tweak internal thread state scope for pthread_mutex_gotcha wrappers
* simplify tracing get_instrumentation_bundles usage
* sampling updates
- include backtrace_causal component
- disable backtrace_metrics if using causal and not using perfetto
- disable backtrace and backtrace_timestamp when using causal
- post_process_causal
* causal updates
- more checks in blocking_gotcha and unblocking_gotcha start/stop
- miscellaneous overhaul of data
- experiment update
* Remove virtual speedup
* libomnitrace code_object
* causal-profiling test
* libomnitrace library.cpp updates
- handle causal profiling
- fini_bundle
* Disable causal profiling by default
* Updated causal code and example
- example: three execution variants: cpu + rng, cpu, rng
- example: three instrumentation variants: none, omni, coz
- fix blocking gotcha credit
- rework perform_experiment_impl
- get_eligible_address_ranges
- compute_eligible_lines
- support fixed lines/speedups/functions
- update selected_entry to support function mode
- fix causal::delay
- experiment updates
* omnitrace_progress / omnitrace_user_progress
- with accompanying omnitrace_annotated_progress / omnitrace_user_annotated_progress
* Update timemory submodule
* CausalMode
- mode indicates whether causal predictions should be at line-level or function-level
* code_object, config, runtime, sampling, thread_data
- code_object: address_range
- code_object: basic::line_info serialize(), name(), hash()
- config updates
- two signals for causal sampling
- thread_data init fixes
* pthread updates
- pthread_create_gotcha processes delays
- pthread_mutex_gotcha does not wrap pthread_join in causal mode
* backtrace_causal update
- dynamic delay period stats
* main wrapper uses basename of argv[0]
* update elfio submodule
* perf support (currently unused)
* Fix experiment JSON serialization
- static_vector.hpp (unused)
* causal executable + config options updates
- omnitrace-causal exe simplifies running multiple causal configs
- changed the causal config option names
* Support both throughput and latency points
* process-causal-json.py script
- will be used later for testing
* stable_vector
* Rework thread_data
* Improve omnitrace-causal exe
- better verbosity handling
- correct diagnosis of status for child process
- execvpe when only one iteration (debugging)
* Update timemory submodule
* exe --version
- omnitrace, omnitrace-avail, and omnitrace-sample all support --version on command-line
* OMNITRACE_INTERNAL_API + OMNITRACE_{LIKELY,UNLIKELY}
* omnitrace-causal cmake format
* omnitrace config update
- OMNITRACE_CAUSAL_FILE_CLOBBER
* custom exception
- wraps STL exception and gets stacktrace during construction
* exit_gotcha supports _Exit
* use global construct_on_init + max threads
- add some safety when exceeding max # of threads
* update code_object binary filter
- exclude dyninst and tbbmalloc library
* containers: c_array, static_vector, stable_vector
- moved utility::c_array to container::c_array
- created static_vector: std::vector bound to std::array
- created stable_vector: vector with stable references
* grow thread_data when new thread created
* causal updates
- data: improve compute_eligible_lines to ignore lambdas
- data: use new thread_data
- delay: use new thread_data
- experiment: properly support latency points
- experiment: support file clobber
- experiment: ensure non-zero experiment time
- progress_point: use new thread_data
- backtrace_causal: use new thread_data
* Update causal-profiling tests
* fix omnitrace-causal backslash escaping
* process-causal-json script
* restructure causal implementation
- update verbose messages for omnitrace-causal diagnose_status
- migrated causal implementation in sampling.cpp to causal/sampling.cpp
- OMNITRACE_USE_CAUSAL does not require OMNITRACE_USE_SAMPLING
- added Mode::Causal
- causal sampling uses same signals as regular sampling
- moved tracing::thread_init to implementation file
- combined tracing::thread_init and tracing::thread_init_sampling
- added causal/components folder
- pthread_create_gotcha::wrapper_config
- omnitrace_preload checks OMNITRACE_USE_CAUSAL
- updates mode accordingly
* update timemory submodule
* update timemory submodule
* causal example updates
- causal for lulesh
* perf code + utility - helpers
- relocated causal perf code
- placement new when generating unique ptr trait for potentially allocating during sampling
- additions to utility header
- removed previously added helpers.hpp
* update timemory submodule
* Default env variables for omnitrace-causal
- activate OMNITRACE_USE_KOKKOSP, etc.
* update stable_vector and static_vector
- static vector can use atomic for size tracking for thread-safe situations
* update causal example header
- CAUSAL_PROGRESS_NAMED
- use CAUSAL_ prefix for some macros
* Tweak lulesh example
- use CAUSAL_PROGRESS instead of CAUSAL_BEGIN and CAUSAL_END
* omnitrace-sample support for causal mode
- set OMNITRACE_USE_SAMPLING to off when OMNITRACE_MODE=causal
* refactor and cleanup code_object
- scope filter
- fixes to address_range
* overhaul causal data + causal config options
- full support for function and line mode
- support static vector of instruction pointers
- improve line info mapping resolution
- remove thread-locality from miscellaneous functions where unnecessary
- causal options for {binary,source,function,fileline} exclusion
* causal experiment, sampling, and backtrace updates
- is_selected + unwind address array
- experiment warning about progress points
- increased buffer size for backtrace_causal sampler
- backtrace_causal only stores IP addresses instead of full unwind info
* category_region updates
- minor refactor
- local_category_region::mark
* Update causal tests
* Bump version to 1.8.0
* omnitrace-causal args + CLOBBER -> RESET
- renamed OMNITRACE_CAUSAL_FILE_CLOBBER to OMNITRACE_CAUSAL_FILE_RESET
- updated omnitrace-causal exe to support recently added configuration options
- other miscellaneous tweaks to data.cpp, experiment.cpp, and sampling.cpp
* Refactor causal and code_object
- code_object.hpp and code_object.cpp moved into binary folder
- causal components namespaced into omnitrace::causal::component
- moved sample_data out of backtrace_causal and into own file
- renamed backtrace_causal to causal::component::backtrace
* preload omnitrace_init + OMNITRACE_DEBUG_MARK
- env OMNITRACE_DEBUG_MARK
- fix omnitrace_init call when LD_PRELOAD-ing omnitrace
* Fix fileline support + line-info output names + experiment log
- line-info log files are prefixed with experiment name
- don't print experiment duration when E2E
- account for fileline scope in analysis
* KokkosP: OMNITRACE_KOKKOSP_NAME_LENGTH_MAX
- config option to limit the name length of kokkos tool callbacks
- remove [kokkos] from KokkosP names
* Update causal example
- minor tweaks to decrease probability of overlapping regions in binary
* omnitrace-causal update
- prefix N / Ntot in environment printout
* Miscellaneous updates
- causal::finish_experimenting()
- OMNITRACE_CAUSAL_RANDOM_SEED
- KokkosP causal updates
- exclude some callbacks, make some callbacks unique, etc.
- address_range::operator+=(address_range)
- combine contiguous ranges in binary/analysis.cpp when file, func, line is same and address range is contiguous
- bfd_line_info reads inline info
- wait for perform_experiment_impl to complete
- causal::delay updates
- delay::process checks if experiment is active
- uses threading::get_id()
- experiment scales duration up for larger speedup experiments
- line info samples include excluded lines
- sampler uses CLOCK_REALTIME
- blocking_gotcha updates
- is no longer fully static
- adds audit routine which sets the postblock value to zero if try/timed routine fails
- category::host was added to causal_throughput_categories_t
- pthread_create_gotcha sets new thread's local parent delay
- was using internal value, now uses sequent value
* Causal improvements to KokkosP
* Updates to experiment time scaling
- use stats instead of just max
* binary/link_map.{hpp,cpp}
* update process-causal-json.py
* Folded fileline scope into source scope
* Update documentation
- Add documentation for causal profiling
- Replace 'Omnitrace' with 'OmniTrace' everywhere
* Update causal-helpers.cmake + omnitrace-testing.cmake
- split tests/CMakeLists.txt partially into omnitrace-testing.cmake
* omnitrace/causal.h
- OMNITRACE_CAUSAL_PROGRESS
- OMNITRACE_CAUSAL_PROGRESS_NAMED
- OMNITRACE_CAUSAL_BEGIN
- OMNITRACE_CAUSAL_END
* selected_entry + remove default filters for lambdas and operator()
- selected entry stores range and binary load address
* update process-causal-json.py
* format examples/lulesh/CMakeLists.txt
* causal-helpers find_package(Threads)
* OMNITRACE_KOKKOSP_KERNEL_LOGGER
- was OMNITRACE_KOKKOS_KERNEL_LOGGER
* quiet find of coz-profiler
* Fix rocm_smi exception handling
* Update timemory submodule (binutils)
- fix binutils compile error on some systems
- bump binutils to v2.40
* Fix miscellaneous tests
* OMNITRACE_KOKKOSP_PREFIX
* revert rocm_smi handling
* ElfUtils updates
- default to download version 0.188
- add -Wno-error=null-dereference due to GCC 12 compiler error
* Update causal example
* Remove OMNITRACE_VERBOSE from global workflow envs
* Reliable causal test
* disable compilation of causal perf files
* Remove set_current_selection with unwind stack
* update timemory submodule
* fix for segfault on bionic
- locking in TLS dtor was causing segfault
* remove experiment::is_selected(unwind_stack_t)
* update default init of selected_entry
* Fix for when IP is not offset by load address
* Update CMakeLists.txt
* Miscellaneous updates
- OMNITRACE_WARNING_OR_CI_THROW
- OMNITRACE_REQUIRE
- OMNITRACE_PREFER
- fixed issues with no ASLR
- added load address variable and ipaddr() func to basic/bfd line info
- removed get_basic() from dwarf_line_info
- TIMEMORY_PREFER -> OMNITRACE_PREFER
- removed previously added binary_address and range variables from selected_entry
* Removed superfluous CausalState
* Additional causal tests (lulesh + kokkos)
* filter, prefer, analysis ASLR handling
- removed default filter on cold functions
- fixed OMNITRACE_PREFER
- fixed analysis ASLR handling
* Tweak line-info output
* Removed some superfluous code
- causal/delay
- causal/selected_entry
* Exclude main.cold in function mode
* Update validate-perfetto-proto.py
- account for occasional http errors
* Add sampling test disabling tmp files
* argparser for process-causal-json
- support validation
- support filtering
* Avoid pthread_{lock,unlock} in sampling offload
- use homemade atomic_mutex/atomic_lock since contention will be low and using pthread tools might trigger our wrappers
* Rename process-causal-json.py
- validate-causal-json.py
* rework omnitrace_add_causal_test
- capable of performing validation
- added validation tests
* Fix kokkosp_begin_deep_copy + causal
* Tweak address range in bfd_line_info::read_pc
* Tweak analysis and data IP handling
- look for gaps
* Disable scaling experiment time by speedup
* Revert change in max threads during CI
* binary updates
- significant overhaul of binary analysis implementation
- removed "basic_line_info" and "bfd_line_info" in lieu of "symbol" class
- symbol class has basic BFD info + vector of inlines + vector of dwarf info
* Updated causal to use new binary analysis
- Fix symbol.cpp includes
* Updated formatting target
- include *.cmake files
* Updated causal tests
- causal tests should be stable now
* Update timemory and dyninst submodules
- TPLs are stripped + built w/o debug info
* Increase tolerance for causal validation speedups
- higher speedups have more variance (increased to +/- 5 from 3)
* Support causal output for MPI
- i.e. tag with MPI rank
* omnitrace-causal launcher argument
* improve experiment sampling output
* causal data updates
- call compute lines once
- fixed filtered cached binary info
- debugging info when experiment fails to start
* Tweaked causal validation tests
* dwarf_entry ranges
* CI updates
- increase max threads to 64
* Tweak causal E2E validation tests
- more threads
- shorter thread runtime
- more iterations
* Fix shadowed variable
* fix symbol read_bfd last PC calculation
* fix maybe-uninitialized warning
* omnitrace-causal launcher update
- only inject "omnitrace-causal --" once
- throw error if no matches found
* Update causal profiling docs for launcher
* fix address range boundaries
202 lines
6.5 KiB
C++
// MIT License
//
// Copyright (c) 2022 Advanced Micro Devices, Inc. All Rights Reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.

#include "library/binary/dwarf_entry.hpp"
#include "library/binary/fwd.hpp"
#include "library/timemory.hpp"
#include "library/utility.hpp"

#include <dwarf.h>
#include <elfutils/libdw.h>

// standard headers used directly in this file
#include <cstddef>
#include <cstdint>
#include <deque>
#include <tuple>
#include <vector>

namespace omnitrace
{
namespace binary
{
namespace
{
using utility::combine;

// Collect the address ranges covered by a compile-unit DIE
auto
get_dwarf_address_ranges(Dwarf_Die* _die)
{
    auto _ranges = std::vector<address_range>{};

    if(dwarf_tag(_die) != DW_TAG_compile_unit) return _ranges;

    Dwarf_Addr _low_pc  = 0;
    Dwarf_Addr _high_pc = 0;
    // dwarf_lowpc/dwarf_highpc return 0 on success; only record the range when
    // both succeed so that an uninitialized range is never stored
    if(dwarf_lowpc(_die, &_low_pc) == 0 && dwarf_highpc(_die, &_high_pc) == 0)
        _ranges.emplace_back(address_range{ _low_pc, _high_pc });

    Dwarf_Addr _base_addr = 0;
    ptrdiff_t  _offset    = 0;
    do
    {
        // placeholder entry which the next dwarf_ranges call fills in
        _ranges.emplace_back(address_range{ 0, 0 });
    } while((_offset = dwarf_ranges(_die, _offset, &_base_addr, &_ranges.back().low,
                                    &_ranges.back().high)) > 0);
    // will always have one extra
    _ranges.pop_back();

    return _ranges;
}

// Read the line table of a compile-unit DIE into a sequence of dwarf_entry
auto
get_dwarf_entry(Dwarf_Die* _die)
{
    auto _line_info = std::deque<dwarf_entry>{};

    if(dwarf_tag(_die) != DW_TAG_compile_unit) return _line_info;

    Dwarf_Lines* _lines     = nullptr;
    size_t       _num_lines = 0;
    if(dwarf_getsrclines(_die, &_lines, &_num_lines) == 0)
    {
        _line_info.resize(_num_lines);
        for(size_t j = 0; j < _num_lines; ++j)
        {
            auto& itr   = _line_info.at(j);
            auto* _line = dwarf_onesrcline(_lines, j);
            if(_line)
            {
                int       _lineno  = 0;
                uintptr_t _address = 0;
                dwarf_lineno(_line, &_lineno);
                dwarf_linecol(_line, &itr.col);
                dwarf_linebeginstatement(_line, &itr.begin_statement);
                dwarf_lineendsequence(_line, &itr.end_sequence);
                dwarf_lineblock(_line, &itr.line_block);
                dwarf_lineepiloguebegin(_line, &itr.epilogue_begin);
                dwarf_lineprologueend(_line, &itr.prologue_end);
                dwarf_lineisa(_line, &itr.isa);
                dwarf_linediscriminator(_line, &itr.discriminator);
                dwarf_lineaddr(_line, &_address);
                itr.address = address_range{ _address };
                if(_lineno > 0) itr.line = _lineno;
                const auto* _file = dwarf_linesrc(_line, nullptr, nullptr);
                // fall back to the name of the compile unit
                if(!_file) _file = dwarf_diename(_die);
                // both lookups can return null; leave the file empty in that case
                if(_file) itr.file = filepath::realpath(_file, nullptr, false);
            }
        }
    }

    return _line_info;
}
} // namespace

bool
dwarf_entry::operator<(const dwarf_entry& _rhs) const
{
    return std::tie(address, line, col, discriminator) <
           std::tie(_rhs.address, _rhs.line, _rhs.col, _rhs.discriminator);
}

bool
dwarf_entry::operator==(const dwarf_entry& _rhs) const
{
    return std::tie(address, line, col, discriminator, vliw_op_index, isa, file) ==
           std::tie(_rhs.address, _rhs.line, _rhs.col, _rhs.discriminator,
                    _rhs.vliw_op_index, _rhs.isa, _rhs.file);
}

bool
dwarf_entry::operator!=(const dwarf_entry& _rhs) const
{
    return !(*this == _rhs);
}

bool
dwarf_entry::is_valid() const
{
    return (*this != dwarf_entry{} && !file.empty());
}

std::deque<dwarf_entry>
dwarf_entry::process_dwarf(int _fd, std::vector<address_range>& _ranges)
{
    auto* _dwarf_v   = dwarf_begin(_fd, DWARF_C_READ);
    auto  _line_info = std::deque<dwarf_entry>{};

    size_t    cu_header_size = 0;
    Dwarf_Off cu_off         = 0;
    Dwarf_Off next_cu_off    = 0;
    for(; dwarf_nextcu(_dwarf_v, cu_off, &next_cu_off, &cu_header_size, nullptr, nullptr,
                       nullptr) == 0;
        cu_off = next_cu_off)
    {
        Dwarf_Off cu_die_off = cu_off + cu_header_size;
        Dwarf_Die cu_die;
        if(dwarf_offdie(_dwarf_v, cu_die_off, &cu_die) != nullptr)
        {
            Dwarf_Die* _die = &cu_die;
            if(dwarf_tag(_die) == DW_TAG_compile_unit)
            {
                combine(_line_info, get_dwarf_entry(_die));
                combine(_ranges, get_dwarf_address_ranges(_die));
            }
        }
    }

    dwarf_end(_dwarf_v);
    utility::filter_sort_unique(_line_info);
    utility::filter_sort_unique(_ranges);

    return _line_info;
}

template <typename ArchiveT>
void
dwarf_entry::serialize(ArchiveT& ar, const unsigned int)
{
#define OMNITRACE_SERIALIZE_MEMBER(MEMBER) ar(::tim::cereal::make_nvp(#MEMBER, MEMBER));

    OMNITRACE_SERIALIZE_MEMBER(file)
    OMNITRACE_SERIALIZE_MEMBER(line)
    OMNITRACE_SERIALIZE_MEMBER(col)
    OMNITRACE_SERIALIZE_MEMBER(address)
    OMNITRACE_SERIALIZE_MEMBER(discriminator)
    // OMNITRACE_SERIALIZE_MEMBER(begin_statement)
    // OMNITRACE_SERIALIZE_MEMBER(end_sequence)
    // OMNITRACE_SERIALIZE_MEMBER(line_block)
    // OMNITRACE_SERIALIZE_MEMBER(prologue_end)
    // OMNITRACE_SERIALIZE_MEMBER(epilogue_begin)
    // OMNITRACE_SERIALIZE_MEMBER(vliw_op_index)
    // OMNITRACE_SERIALIZE_MEMBER(isa)
}

template void
dwarf_entry::serialize<cereal::JSONInputArchive>(cereal::JSONInputArchive&,
                                                 const unsigned int);

template void
dwarf_entry::serialize<cereal::MinimalJSONOutputArchive>(
    cereal::MinimalJSONOutputArchive&, const unsigned int);

template void
dwarf_entry::serialize<cereal::PrettyJSONOutputArchive>(cereal::PrettyJSONOutputArchive&,
                                                        const unsigned int);
} // namespace binary
} // namespace omnitrace
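The comparison operators above order entries by a `std::tie` tuple, and `process_dwarf` then passes the combined results through `utility::filter_sort_unique`. That utility is not shown in this file, so the following is only a minimal sketch of the sort-then-deduplicate idea under stated assumptions: `entry`, `sort_unique`, and `dedup_demo` are hypothetical names made up for illustration, and the real helper may filter invalid entries as well.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <string>
#include <tuple>

// Hypothetical stand-in for dwarf_entry with only the fields the demo needs.
struct entry
{
    std::uintptr_t address       = 0;
    unsigned int   line          = 0;
    int            col           = 0;
    unsigned int   discriminator = 0;
    std::string    file          = {};

    // Lexicographic order: address first, then line, col, discriminator.
    bool operator<(const entry& rhs) const
    {
        return std::tie(address, line, col, discriminator) <
               std::tie(rhs.address, rhs.line, rhs.col, rhs.discriminator);
    }

    // Equality compares a wider tuple than the ordering (here: also the file).
    bool operator==(const entry& rhs) const
    {
        return std::tie(address, line, col, discriminator, file) ==
               std::tie(rhs.address, rhs.line, rhs.col, rhs.discriminator, rhs.file);
    }
};

// Assumed behavior, for illustration only: sort, then drop adjacent duplicates.
inline void
sort_unique(std::deque<entry>& data)
{
    std::sort(data.begin(), data.end());
    data.erase(std::unique(data.begin(), data.end()), data.end());
}

// Builds a small unsorted input containing one exact duplicate and cleans it.
inline std::deque<entry>
dedup_demo()
{
    auto data = std::deque<entry>{
        { 0x2000, 12, 3, 0, "a.cpp" },
        { 0x1000, 10, 1, 0, "a.cpp" },
        { 0x1000, 10, 1, 0, "a.cpp" },  // exact duplicate of the previous entry
        { 0x1000, 9, 5, 0, "a.cpp" },
    };
    sort_unique(data);
    return data;
}
```

Calling `dedup_demo()` leaves three entries ordered by address and then by line. Because `std::unique` only removes adjacent equal elements, the sort has to run first. Entries that compare equivalent under `operator<` but differ in a field that only `operator==` inspects would both be kept.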