Граф коммитов

14 Коммитов

Автор SHA1 Сообщение Дата
Jonathan R. Madsen a2dcc7381b Unified setup_environ b/t libomni and libomni-dl (#86)
- omnitrace::common::setup_environ provides a standardized exporting of relevant env variables
2022-07-18 01:52:21 -05:00
Jonathan R. Madsen 4208b5654c GPU HW Counters via rocprofiler (#84)
* Initial support for GPU hardware counters

* Update find modules for roctracer and rocprofiler

- /opt/rocm/{rocprofiler,roctracer} path is deprecated so tweak search procedure

* Improve ConfigCPack for MPI

* Update rocprofiler

- rocm_metrics()
- minor cleanup

* Update rocm find modules

* declare rocm_metrics + call in omnitrace-avail

* relocate omnitrace-launch-compiler

* REALPATH and find_modules

* Examples cmake (may drop)

* omnitrace-avail

- hw_counter categories
- init rocm

* setenv updates for rocprofiler in library.cpp and dl.cpp

* get_rocm_events config

* gpu::hip_device_count()

* rocm_metrics returns hardware_counters::info

* - relocated library/components/roctracer_callbacks.* to library/roctracer.*
- relocated library/components/rocprofiler.* to library/rocprofiler.*
- cleaned up rocprofiler.hpp
- added perfetto output of rocprofiler
- added timemory output of rocprofiler
- renamed omni.roctracer thread to roctracer.hip
- added roctracer.hsa thread name
- updated timemory submodule to support std::variant
- updated timemory submodule to support = in config value
- updated timemory submodule to support standalone storage
- updated timemory submodule to support new hw counter apis
- updated timemory submodule to prevent label/description caching in data_tracker

* update omnitrace-avail info_type generation

* Update timemory submodule

* rocprofiler component

* cmake formatting

* omnitrace-avail handle no GPUs

- Add -c command-line option for --categories
- support verbosity

* hsa_rsrc_factory throws exceptions

- throw exceptions to avoid aborting on HSA_STATUS_ERROR_NOT_INITIALIZED when advantageous
- removed duplicate specialization of is_available for component::rocprofiler

* rocprofiler symbols for when disabled

* Fix warning in omnitrace-avail

- std::stringstream from initializer list would use explicit constructor

* Fix finalization after settings are deleted

* Reorganized rocprofiler source

* Updated formatting

* Miscellaneous tweaks

- added using statements from timemory
- tweaked the main and thread bundle names
- fixed timemory header includes
2022-07-17 21:52:09 -05:00
Jonathan R. Madsen 1877ebf47b omnitrace-avail generate config (#69)
* Config updates

- See PR #69 for details

- change type of OMNITRACE_DL_VERBOSE
- add "deprecated" category to OMNITRACE_ROCM_SMI_DEVICES
- reduce size of perfetto shared memory size hint
- deprecate OMNITRACE_OUTPUT_FILE in favor of OMNITRACE_PERFETTO_FILE
- set papi event choices
- read config file after reading command line
- fix update of OMNITRACE_DL_VERBOSE
- mark several settings as hidden
- timemory update support hidden attribute for settings
- rework get_perfetto_output_filename()
- Hide settings from not available backends

* Rework omnitrace-avail to support dumping configurations

* Overwrite query, tests, output flag

- Support using -O flag when dumping config
- Support checking before overwriting existing config
- Support --force to overwrite existing config
- Fix get_component_info not including omnitrace components
- Testing for dumping config

* Update documentation on omnitrace-avail

* Fix issue with timemory format + "/__w/"

* Update output prefix keys docs

* Rename --dump-config to --generate-config

* Hide MPI related options

- OMNITRACE_PERFETTO_COMBINE_TRACES and OMNITRACE_COLLAPSE_PROCESSES are hidden w/o MPI support
2022-06-28 01:36:04 -05:00
Jonathan R. Madsen 5105e2c94f tracing NS + category region component + MPI args (#52)
tracing NS + category region component

- made library.cpp impl more broadly available
- support for perfetto args
- MPI wrappers encode args and return type
- new categories / perfetto categories
- omnitrace_library category -> libomnitrace
- omnitrace_dl_library -> libomnitrace-dl
2022-06-24 16:08:06 -05:00
Jonathan R. Madsen 27e4e82376 Deprecate omnitrace use thread sampling (#68)
* Deprecate OMNITRACE_USE_THREAD_SAMPLING

* Reworked config based on OMNITRACE_MODE

- config::set_default_setting_value(...)
- config::get_mode() is now dynamically deduced
- moved tweaking defaults from library.cpp to config::configure_mode_settings(...)
- timemory submodule update fixing vsetting issue

* runtime.md update

* revert accidental lambda name change

* Reintroduce (deprecated) OMNITRACE_ROCM_SMI_DEVICES

- add handle_deprecated_setting(...) for this deprecated setting
2022-06-24 15:03:15 -05:00
Jonathan R. Madsen a142b2029d Fix the main library stop routine for timemory (#39)
* Fix the main library stop routine for timemory

- the main pop_timemory function was popping too many calls
- this primarily affected recursive calls

* Lengthen the timeout for the Configure CMake step

* Fix python tests

- new validate-timemory-json.py script

* Documentation update

- Call-counts in timemory output examples in documentation were affected by the changes

* Fix the per-thread metrics during finalization

- pthread_create_mutex starts/stops the per-thread data
- removed unintentional continue statement

* Docs tweaks

* Fix lap counter on per-thread metrics
2022-06-13 15:57:44 -05:00
Jonathan R. Madsen 90ab7a89fc Fix sampling counter time scales (#33)
* Fix sampling counter time scales

- All perfetto trace events have "begin_ns" and "end_ns" debug fields
- data for thread start and end timestamp in pthread_create_gotcha
- discard samples outside of thread start and end timestamps
- rename "CPU User CPU Time" perfetto counter to "CPU User Time"
- rename "CPU Kernel CPU Time" perfetto counter to "CPU Kernel Time"
- ensure CPU system samples in perfetto are set to zero at end
- backtrace uses comp::wall_clock record() for timestamps (consistency)
- "Peak Memory Usage [Thread X] (S)" renamed to "Thread Peak Memory Usage [X] (S)"
- "Context Switches [Thread X] (S)" renamed to "Thread Context Switches Usage [X] (S)"
- "Page Faults [Thread X] (S)" renamed to "Thread Page Faults Usage [X] (S)"
- "<PAPI_DESC> [Thread X] (S)" renamed to "Thread <PAPI_DESC> [X] (S)"
- samples

* Fix includes
2022-06-10 08:35:39 -05:00
Jonathan R. Madsen 6af5b2a7e2 clang-tidy (#9)
- Fixed some clang-tidy warnings
- Fixed issue with omnitrace_launch_compiler + clang-tidy
2022-05-25 14:18:55 -05:00
Jonathan R. Madsen ee67748042 Fix for empty perfetto output (#7)
Fix to perfetto config

- Erroneously replaced data_sources config "track_event" with "omnitrace"
- Using "omnitrace" resulted in empty perfetto output files
2022-05-25 00:35:02 -05:00
Jonathan R. Madsen 353e8eeb69 Critical trace updates (#6)
* critical trace updates

- better handling of OMNITRACE_USE_PERFETTO in omnitrace-critical-trace exe
- changed some data types in `critical_trace::entry`
- added device ids to critical trace entries
- added process ids to critical trace entries
- added packing to critical trace entries

* Update timemory submodule
2022-05-24 19:25:54 -05:00
Jonathan R. Madsen c2b206ba28 Timemory procfs utilities (#60)
- Serialize memory maps
 - Utilize tim::utility::procfs::cpuinfo::freq in cpu_freqs.cpp
2022-05-19 16:07:11 -05:00
Jonathan R. Madsen 346f8cd0bc Option rename + minor fixes (#57)
- Set choices of OMNITRACE_BACKEND option
- rename OMNITRACE_SHMEM_SIZE_HINT_KB option
- rename OMNITRACE_BUFFER_SIZE_KB option
- rename OMNITRACE_COMBINE_PERFETTO_TRACES
- rename OMNITRACE_BACKEND option
- default to OMNITRACE_COLLAPSE_PROCESSES for combining perfetto traces
- OMNITRACE_PERFETTO_FILL_POLICY option
- fix unused variables due to constexpr in add_critical_trace
- rename perfetto config from "track_event" to "omnitrace"
- fix build-release.sh + python
- handle config file updating OMNITRACE_DL_VERBOSE in omnitrace-dl
- rename roctrace.cfg to omnitrace.cfg
- accept "on" and "off" for get_sampling_cpus()
2022-05-10 17:30:45 -05:00
Jonathan R. Madsen b208047741 Support for tracing mutex locking (#52)
* Parallel overhead example with locks

* Support tracing mutex locking + more

- support wrapping pthread_mutex_lock
- support wrapping pthread_mutex_unlock
- support wrapping pthread_mutex_trylock
- get_perfetto_combined_traces setting
- OMNITRACE_TRACE_THREAD_LOCKS option
- ThreadState
- critical trace includes queue id
- enabled/disabled settings in timemory
- fix OMNITRACE_TIMEMORY_COMPONENTS
- fix reading config
- fix setting categories
- applied ThreadState::Internal in various places
- utility::get_filled_array
- utility::get_reserved_vector
- utility::get_thread_index
- fork_gotcha messages about forks
- split out some pthread_gotcha functionality into pthread_create_gotcha
- handle queue id in roctracer callbacks

* Update timemory and PTL submodules

* Misc CMake updates

- Includes fix to omnitrace-static-lib{gcc,stdcxx}

* Misc cleanup to pthread_mutex_gotcha and backtrace

* Fix to duplicate field in module_function json

* Improvement to debug messages

* omnitrace-dl and common improvements

- tweak to delimit
- common::ignore message
- common::join quoting of strings
- omnitrace_set_env ignores if inited and active
- omnitrace_set_mpi ignores if inited and active

* nsync for transpose example

* Fix to thread_deleter<void> functor invoke

* Fix thread state and HIP stream enums
2022-05-08 04:40:10 -05:00
Jonathan R. Madsen 1f66e23fdd Reorganize source/lib/omnitrace (#51)
- Got rid of `source/lib/omnitrace/include` and `source/lib/omnitrace/src` and merged into `source/lib/omnitrace`
- Updated perfetto submodule to v25.0
- Updated papi submodule
2022-05-02 13:08:51 -05:00