Commit graph

4 Commits

Author SHA1 Message Date
Jonathan R. Madsen 3151dd3aeb Created push/pop system for whether sampling is enabled (#27)
- also permitted turning off sampling in sampling mode
- also fixed ambiguous rocm_smi namespace issue in roctracer
2022-02-25 05:33:59 -06:00
Jonathan R. Madsen 0d5c557552 Stability improvements (#26)
* omnitrace verbprintf and errprintf

* avail categories fix

* omnitrace-dl namespace

* OMNITRACE_CI macro / OMNITRACE_BUILD_CI option

- always enables asserts

* Roctracer improvements

- Reworked roctracer significantly
- Added categories to settings
- create_cpu_cid_entry
- handle clock_skew in roctracer
- fixed roctracer activity names
- hip_api_callback is "host"
- perfetto::Flow for GPU

* timemory submodule update

* Tweak to redirect

* Improved recursive guards

- functors component
- created "_hidden" variants of instrumentation funcs
  - omnitrace_* calls omnitrace_*_hidden
  - omnitrace-dl calls non-hidden
- omnitrace-dl now strongly protects against recursion
- omnitrace-dl now is standalone w.r.t. headers

* Stability fixes
- OMNITRACE_DEBUG_PUSH env variable
- fix to HSA_TOOLS_LIB in dl.cpp
- Fixed SFINAE warning in mpi_gotcha
- Handle 64, _l, _r extensions in whole function names

* cmake formatting

* Fix for last commit + push/pop count info

- don't instrument rocr::core::Signal::WaitAny
- don't instrument rocr::core::Runtime::AsyncEventsLoop
- fixed main not being popped in runtime instrument
- updated interval data reserve
- copy hash-ids and aliases onto main thread
- warn about unclosed regions
- removed guards in libomnitrace
- added error checks for incorrect push_count vs. pop_count
- fixed missing pop_timemory in last commit

* Finalization methodology updates

- added some more rocr:: functions to whole function names

* Add event_base_loop to whole functions

* Update VERSION to 0.1.0
2022-02-25 03:56:41 -06:00
Jonathan R. Madsen 145a6ae06f omnitrace-dl-library (#25)
* timemory submodule update

* Visibility, setting categories, and task-group protection

- OMNITRACE_VISIBILITY instead of TIMEMORY_VISIBILITY
- increased task group data-race protection
- add omnitrace categories to settings

* set component_apis type-trait

* omnitrace-dl-library implementation

- this library dlopen + dlsym's libomnitrace
- significantly reduces the instrumentation time

* omnitrace-avail categories

- suppress AVAILABLE column when --available

* omnitrace-exe update

- uses omnitrace-dl
- adds --print-excluded option
- removes --jump option
- comments out --stubs option
- removes --stdlib option
- support for C++ STL functions not in libstdc++
- tweak the --print-* outputs
- significantly refactors instrument_module and instrument_entity
- removes unused c_stdlib_module_constraint
- removes unused c_stdlib_function_constraint
- decreases get_whole_function_names() coverage

* library.cpp updates

- OMNITRACE_DEBUG -> OMNITRACE_DEBUG_F
- omnitrace_finalize sets state earlier
- omnitrace_finalize clears push/pop functors
- increased tasking shutdown safety

* fix critical-trace thread hierarchy

- signal handler calls omnitrace_finalize
- get_cpu_cid_stack supports parent tid
- interval data reserves
- omnitrace-avail serialization support for module_functions
- omnitrace --simulate option
- omnitrace --print-format option
- omnitrace --load-instr option
- omnitrace runtime-inst doesn't oneTimeCode
- updated regex
- expand get_whole_function_names()
- Test Install CI update

* fixes to last commit

- expand get_whole_function_names()
- ignore sig c modules
- kill process in signal handler

* Remove RTLD_DEEPBIND + more

- removed use of RTLD_DEEPBIND
  - causes dyninst segfaults
- fixed signal handling
- updated timemory submodule

* Build/link static timemory libraries

* omnitrace --{module,function}-restrict option

- Added restrict regex options
- Reworked handling of regex options
- Reworked reporting of module/function skipping
- Handle -o w/o file specified

* timemory-avail

- category views
- backtrace::sample checks state

* get_debug_sampling()
2022-02-23 06:59:32 -06:00
Jonathan R. Madsen b016c8929f Critical trace updates (#24)
* Source code restructuring

* Critical trace updates following restructuring

* thread_sampler, timestamps

- thread_sampler
- CPU frequency managed via thread_sampler
- rocm-smi managed via thread_sampler
- Use consistent timestamps for perfetto
- removed hsa_timer_t in favor of wall_clock::record()
- disable KokkosP by default
- re-enable critical-trace testing

* cmake-format

* Fix for defines.hpp.in

* Remove OMNITRACE_ROCM_SMI_FREQ

- thread_sampler freq is set via OMNITRACE_SAMPLING_FREQ w/ max of 1000

* Increase CI Install Dyninst timeout

* Debug macros + omnitrace_init_tooling + config

- new debug macros
- extern "C" omnitrace_init_tooling
- guard get_rocm_smi_devices

* Miscellaneous tweaks

- tweak to transpose
- critical_trace::Device::ANY
- perfetto "critical-trace" category
- OMNITRACE_VERBOSE usage

* Disable key and tid data for HIP API calls

- non-kernels are ignored in activity callback

* critical-trace exe updates

- fix perfetto generation
- improved logging
- improved readability

* timemory submodule update

- lulesh example cmake tweaks
2022-02-19 02:00:59 -06:00