38 Commits

Author SHA1 Message Date
Jonathan R. Madsen 642b6b95ca Support external (i.e. user-defined) trace annotations (#195)
* Support external (i.e. user-defined) trace annotations

- tweaked the python examples to be more balanced
- updated the user-api example to conform to user API changes
- moved the get/set for State and ThreadState to state.{hpp,cpp}
- introduced user-provided trace annotations
- added perfetto python category
- moved coverage impl files around
- created enumerations for mapping category enums to category types
- created enumerations for mapping annotation type enums to annotation values
- moved tracing::add_perfetto_annotations to tracing/annotation.hpp
- utility make_index_sequence_range
- libomnitrace-dl: omnitrace_push_category_region
- libomnitrace-dl: omnitrace_pop_category_region
- libomnitrace-user: omnitrace_user_push_annotated_region
- libomnitrace-user: omnitrace_user_pop_annotated_region
- libpyomnitrace: support extra annotations via annotate_trace config value
  - filename
  - line
  - last attempted instr in bytecode (lasti)
  - argcount
  - num local variables
  - stacksize
- omnitrace-python: -a / --annotate-traces option
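
The extra annotations listed for libpyomnitrace correspond to values that CPython exposes on frame and code objects. The sketch below only shows where such values can be read in plain Python; it is not the libpyomnitrace implementation, and the function name and dictionary keys are hypothetical.

```python
import sys


def frame_annotations(frame):
    """Collect the per-frame values named in the list above (illustrative only)."""
    code = frame.f_code
    return {
        "filename": code.co_filename,    # source file of the function
        "line": frame.f_lineno,          # current line number
        "lasti": frame.f_lasti,          # index of last attempted bytecode instruction
        "argcount": code.co_argcount,    # number of positional arguments
        "nlocals": code.co_nlocals,      # number of local variables
        "stacksize": code.co_stacksize,  # required evaluation stack size
    }


def example(a, b):
    c = a + b  # third local variable, after a and b
    return frame_annotations(sys._getframe())


info = example(1, 2)
print(info)
```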

* tweak ubuntu-focal workflow

* Fix installation of omnitrace-user headers

* ubuntu-focal-codecov workflow update

- Install texinfo

* Update timemory submodule
2022-11-11 07:31:14 -06:00
Jonathan R. Madsen b23b581563 Offload sampling data (#190)
- update timemory submodule
  - support for load/save of ring_buffers
  - new output keys, e.g. `%nid%`
  - sampling allocator offloading data
- writing sampling data to temporary file
- new advanced config option `OMNITRACE_USE_TEMPORARY_FILES`
- new advanced config option `OMNITRACE_TMPDIR`
- SIGINT signal (i.e. `Ctrl+C`) triggers backtrace + finalization
  - this behavior is common to other profilers
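
As a rough illustration of the two new advanced options, the sketch below builds an environment for a child process with temporary-file offloading enabled. The helper name is hypothetical, and the value `ON` is an assumption based on the ON/OFF wording used elsewhere in this log; the commit itself only names the options.

```python
import os
import tempfile


def offload_env(tmpdir, base_env=None):
    """Return a copy of the environment with sampling offload configured (sketch)."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMNITRACE_USE_TEMPORARY_FILES"] = "ON"  # assumed boolean spelling
    env["OMNITRACE_TMPDIR"] = tmpdir             # directory for the temporary files
    return env


scratch = tempfile.mkdtemp(prefix="omnitrace-")
env = offload_env(scratch, base_env={})
print(env)
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.run` when launching the profiled application.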

* update output.md docs

* Update omnitrace-avail output keys handling

* update writing metadata

* str format in perfetto_counter_track

* Fix fail regex for mpi-example

* config updates

- OMNITRACE_USE_TEMPORARY_FILES
- OMNITRACE_TMPDIR
- Enable finalization with SIGINT
- code supporting creation of temp files

* sampling offloading to temporary file

* Disable creation of empty temporary files when off
2022-10-31 22:23:10 -05:00
Jonathan R. Madsen 46b6db1a4c Submitting jobs to cdash (#124)
* Submitting jobs to cdash

* Fail on submit

* submit url env

* submit url env

* try passing submit url as arg

* fix submit url

* Updated default URL

* Add submissions for remaining ubuntu focal workflow jobs

* Replace g++ with gcc in dashboard build name

* Add --ctest-args to run-ci.sh

* Add cdash support for bionic, jammy, and opensuse workflows

* Decrease CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE

* OMNITRACE_BUILD_CODECOV option

* Support code coverage in CDash script

* CI dyninst built with debug info

* Update ci-containers

- cron schedule moved 4 hours later to UTC+5

* Update implementation of config::configure_signal_handler

- using lambdas failed to compile with codecov flags

* Add codecov job to ubuntu focal workflow

* Fix support for --ctest-args in run-ci script

* Fix ubuntu workflows

* Fix quotation handling in run-ci script

* git safe directory for codecov

* New MPI examples

* Remove --stop-on-failure

* dynamic_library update

- find_library_path checks procfs maps
- invoke find_library_path with no additional args to resolve to mapped file

* RCCLP uses dynamic_library

* check if file exists for memory_map_files metadata

* Testing updates

- include new mpi examples in tests
- fix test labels
- test critical-trace exe

* Update MPI C examples tests (needed arg)

* Remove try/catch block from critical-trace

* Fix sampling max wait when shutting down

* Fix test env for critical-trace

* Fix settings for critical-trace

- disable time output: data is deterministic
- disable PID suffixes: not multiprocess

* Update critical-trace ctest

* Update critical-trace exe

- throw error if input cannot be opened
- throw error if input has no data

* Update lulesh example with more kokkos tools usage

* Fix tasking issue with critical_trace and roctracer

- were not setting pools to active
- also sync before critical_trace::get_entries

* Increase verbosity of critical-trace tests

* Update code coverage tests

- skip code coverage + preload
- code-coverage python example and test

* Remove duplicate omnitrace.initialize function

* Skip python3.6 for ubuntu jammy

* Update MPI examples

- use MPI_Isend and MPI_Irecv
- explicitly use MPI_Bcast

* Update Formatting.cmake

- include C files in examples

* run-ci script does not check return of coverage

* mpi-allreduce link to libm

* Update ctest args in run-ci script

* Update dyninst submodule

- safety improvements in BinaryEdit::openResolvedLibraryName

* capture cmake error for ctest_coverage
2022-10-31 15:39:45 -05:00
Jonathan R. Madsen 271f851896 Disable HSA API and activity by default (#183)
- OMNITRACE_ROCTRACER_HSA_ACTIVITY is OFF by default
- OMNITRACE_ROCTRACER_HSA_API is OFF by default
- in real applications, this adds way too much tracing data to perfetto
2022-10-21 09:23:29 -05:00
Jonathan R. Madsen ede6007f9b Support for Ubuntu 22.04 and ROCm 5.3 (#48)
* Testing and CI support for Ubuntu 22.04

* Fixes for ROCm

- Jammy does not have ROCm installers

* Name, timeout, and python updates

- renamed ubuntu-jammy-external.yml to ubuntu-jammy.yml
- increased all 5 minute timeouts to 10 minutes
- include python 3.10 in testing

* Update dyninst to remove interposed definition of _r_debug

* Rebuild Dyninst + test install script

* Revert container change

* git safe directory

* pushd -> cd

* fix MPI include

* Fix testing step

* OMPI_ALLOW_RUN_AS_ROOT

* Test script changes

* Fix mismatched malloc / delete[]

* Jammy workflow tweaks

* CPack tweak for boost deb deps

* pthread_mutex_gotcha config returns when not enabled

* fix echoing config in CI

* USE_CLANG_OMP

- option to disable using LLVM OpenMP when building OpenMP test executables
- Jammy workflow sets USE_CLANG_OMP=OFF

* Dyninst submodule boost download

- updated containers workflow to include jammy
- updated workflow to use ci

* Updates to workflows + replace test-install.sh

- test-install.sh in this branch was replaced with one in main branch

* Expand jammy test-install.sh args

* Fix openmp-cg-sampling-duration test

* update timemory submodule

- use-after-free violation in popen::pclose

* revert some tweaks to sampling-duration test

* Fix env of test-install.sh

* cmake format

* jammy bash

* CPack install for jammy

* formatting workflow action version bump

* Update timemory submodule

- libunwind submodule via timemory sets SOVERSION to 99 to avoid ABI conflicts with v8

* Fix help menu for omnitrace-sample

* Support other boolean forms in test-install.sh

* Update docker files and build-docker.sh

- consolidated cases in build-docker.sh
- support rocm version of 0.0 (no rocm install)
- support rocm v5.3
- updated centos handling

* update opensuse actions/checkout version

* Tweaks to ubuntu-focal testing

- actions/checkout@v3
- use test-install script

* update cpack

- ubuntu 22.04
- rocm 5.3
- rename os matrix field to os-version
- remove CI_ROCM_VERSION (no longer necessary)
- remove default-rocm-version matrix field (no longer necessary)
- CentOS packaging

* fix argparsing and omnitrace-sample tests in install-tests.sh

* focal rocm test install workflow fix

* Fix omnitrace-sample build

* Dockerfile.centos + build-docker.sh updates

* Update actions/upload-artifact version

* Dockerfile.ubuntu: install rocm-device-libs

* Refactor cpack

* fix cpack if quotes

* Dockerfile.ubuntu rocm < 5 installs rocm-dev

* build-release.sh defaults to boost version 1.79.0
2022-10-17 12:54:26 -05:00
Jonathan R. Madsen a3439d5bf2 Trace thread config + paranoid level + preload (#176)
- OMNITRACE_TRACE_THREAD_BARRIERS config option
  - set to OFF to disable wrapping `pthread_barrier`
- OMNITRACE_TRACE_THREAD_JOIN config option
  - set to OFF to disable wrapping `pthread_join`
- allow PAPI with perf_event_paranoid at level 2
- default to no PAPI events
- setenv LD_PRELOAD to not include libomnitrace after preload
  - closes #175 
- bump version to 1.7.1
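
The LD_PRELOAD change can be pictured as filtering the preload list so that child processes do not preload the library again. The sketch below is a Python illustration of that idea, not omnitrace's own code; the function name and the substring match on the file name are assumptions.

```python
import re


def strip_preload(ld_preload, needle="libomnitrace"):
    """Drop LD_PRELOAD entries whose file name contains `needle` (sketch)."""
    # The dynamic linker accepts colons or spaces as separators.
    entries = [e for e in re.split(r"[:\s]+", ld_preload) if e]
    kept = [e for e in entries if needle not in e.rsplit("/", 1)[-1]]
    return ":".join(kept)


print(strip_preload("/opt/lib/libomnitrace-dl.so:/usr/lib/libfoo.so"))
```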
2022-10-06 19:11:08 -05:00
Jonathan R. Madsen 2a387f9099 Fix finalization segfaults (#174)
- update timemory submodule with fixes to papi components and signals
update
2022-10-04 00:00:05 -05:00
Jonathan R. Madsen 79a8f16646 omnitrace-sample (#169)
- `omnitrace-sample` executable which executes sampling (no instrumentation)
- fixes bug with OMPT ignoring value of `OMNITRACE_USE_OMPT`
- fixes some issues with sampling duration
- new `OMNITRACE_SAMPLING_INCLUDE_INLINES` configuration variable
- restricts process-sampling to 100 interrupts/sec when inheriting value from `OMNITRACE_SAMPLING_FREQ`
- `OMNITRACE_PROCESS_SAMPLING_FREQ` still supports up to 1000 interrupts/sec
- fixes bug with colorized log not truly being disabled in all instances
- adds tests for `omnitrace-sample`
- adds tests for sampling duration
- setting ROCP_TOOL_LIB to libomnitrace-dl throws error
  - rocprofiler does not configure correctly when this is done
- Quiet numa_gotcha warnings
- Fixed some shadowed variables
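
The two process-sampling limits described in the list above can be sketched as a small selection rule. This is an illustration under assumptions: the function and constant names are hypothetical, and clamping (rather than, say, raising an error) is a guess at the behavior.

```python
INHERITED_LIMIT = 100.0  # interrupts/sec when falling back to OMNITRACE_SAMPLING_FREQ
EXPLICIT_LIMIT = 1000.0  # interrupts/sec when OMNITRACE_PROCESS_SAMPLING_FREQ is set


def effective_process_freq(sampling_freq, process_freq=None):
    """Pick the process-sampling frequency from the two settings (sketch)."""
    if process_freq is None:
        # No explicit process-sampling value: inherit, but cap at the lower limit.
        return min(sampling_freq, INHERITED_LIMIT)
    return min(process_freq, EXPLICIT_LIMIT)


print(effective_process_freq(300.0))
print(effective_process_freq(300.0, 500.0))
```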
2022-09-30 10:47:07 -05:00
Jonathan R. Madsen 8f36620e29 Fix deadlocks during initialization (#167)
- More to come in later commit, below is just tidying some stuff up
  - clang-tidy
  - mpi_gotcha quiet about not finding funcs
  - update to new papi config
  - sampling block_samples / unblock_samples
    - disable calling component's sample functions within sampler
  - release doesn't strip library
  - remove HSA and ROCP env variables from modulefile / setup-env
- preliminary support for LD_PRELOAD usage
- default sampling rate is 300 interrupts / second
- fixes various deadlock issues at startup
2022-09-26 07:52:14 -05:00
Jonathan R. Madsen 2718596e5a Support tracing thread locks with perfetto (#143)
- remove sampling and roctracer flat/timeline options
  - unused/unnecessary clutter
- start pthread_gotcha before perfetto
- remove pthread_mutex_gotcha validate
- update timemory submodule with tid fix
2022-08-31 11:33:45 -05:00
Jonathan R. Madsen e67afd33eb Support sampling duration, sampling TIDs (#142)
- Sampling duration config values
  - OMNITRACE_SAMPLING_DURATION
  - OMNITRACE_PROCESS_SAMPLING_DURATION
  - Disables sampling after this time (in seconds) has elapsed 
- Sampling thread-id config values
  - OMNITRACE_SAMPLING_TIDS
  - OMNITRACE_SAMPLING_CPUTIME_TIDS
  - OMNITRACE_SAMPLING_REALTIME_TIDS
  - Allows user to select certain threads for sampling
- Miscellaneous
  - Tweaked the finalization verbosity messages
  - moved sampling-on-child-threads into runtime.hpp and runtime.cpp
  - fixed submodule dyninst header install
2022-08-31 06:29:19 -05:00
Jonathan R. Madsen 808ea7dfa7 Rework sampling and colorized logs (#140)
## Overview

This is a significant PR which has 3 very notable characteristics:

1. Omnitrace colorizes most of its logging
2. Completely reworked the sampling 
  - Samples now record the current instruction pointers instead of strings
    - This _dramatically_ decreases the overhead of taking a sample
  - The collection of metrics during a sample is split out into another component, enabling that data collection to be disabled -- which decreases the sampling overhead even further
  - When both `OMNITRACE_SAMPLING_CPUTIME` and `OMNITRACE_SAMPLING_REALTIME` are ON:
    - `OMNITRACE_SAMPLING_CPUTIME_FREQ` and `OMNITRACE_SAMPLING_REALTIME_FREQ` can be used to individually control the sampling frequency 
  - `OMNITRACE_SAMPLING_CPUTIME_DELAY` and `OMNITRACE_SAMPLING_REALTIME_DELAY` can be used to individually control the delay time before starting
  - Now, omnitrace does not start a real-time sampler on the main thread unless `OMNITRACE_SAMPLING_REALTIME` is ON
    - In the future, an `OMNITRACE_SAMPLING_TIDS` (and real-time, cpu-time variants) configuration variable(s) will allow you to select which threads will be sampled
3. Files produced by `omnitrace` exe -- `available-instr.txt`, `instrumented-instr.txt`, etc. -- no longer have the `-instr` suffix and are placed in the `instrumentation/` subfolder, i.e. `available-instr.txt` -> `instrumentation/available.txt`
  - This helped de-clutter the output folder
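
For scripts that still look for the old file names, the renaming in item 3 can be expressed as a small path mapping. The helper below is a hypothetical convenience, not part of omnitrace; it only encodes the `-instr` suffix removal and the `instrumentation/` subfolder described above.

```python
from pathlib import PurePosixPath


def new_instr_path(old_name):
    """Map an old `*-instr.txt` output name to its new location (sketch)."""
    p = PurePosixPath(old_name)
    stem = p.stem
    if stem.endswith("-instr"):
        stem = stem[: -len("-instr")]
    return str(p.parent / "instrumentation" / (stem + p.suffix))


print(new_instr_path("available-instr.txt"))
```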

Most of the other edits were reorganization (e.g. internal namespace changes), cleanup, and splitting up functionality.

## Bug Fixes

There is a bug fix with respect to the HSA callbacks which disabled sampling on child threads when an HSA API call was made

## Details

- created thread_info struct for mapping different thread IDs
- reorganized file structure significantly
- added categories.hpp, concepts.hpp
- moved around name trait definitions
- moved all omnitrace components into `omnitrace::component` namespace
  - there was a lot of inconsistency b/t using `tim::component` in some places and `omnitrace::component`
  - added macros like OMNITRACE_DECLARE_COMPONENT in lieu of TIMEMORY_DECLARE_COMPONENT
- OMNITRACE_CRITICAL_TRACE_NUM_THREADS -> OMNITRACE_THREAD_POOL_SIZE
- roctracer and critical_trace use same thread pool
- critical_trace functions do not lock anymore bc of thread-local TaskGroup
- added `component::local_category_region` to support using `component::category_region` without explicitly passing in name
- removed `component::omnitrace` (unused)
- migrated KokkosP and OMPT to use `component::local_category_region`
  - removed `component::user_region` as a result
- migrated omnitrace_{push,pop}_{trace,region}_hidden to use component::category_region
  - removed `component::functors` as a result
- migrated some ppdefs
- `api::omnitrace` -> `project::omnitrace`
- `api::(...)` -> `category::(...)`
- improved recording the execution time of threads
  - migrated this functionality out of pthread_create_gotcha and into thread_info
- moved mpi_gotcha, fork_gotcha, exit_gotcha, rcclp into omnitrace::component namespace
- split backtrace up into backtrace, backtrace_metrics, backtrace_timestamp components
- sampling.cpp handles setup and post-processing that was formerly in backtrace
- updated logging to use colors
- `OMNITRACE_COLORIZED_LOG` config variable
- updated docs on JSON output from timemory
- instrumentation info in instrumentation subfolder
- added testing for KokkosP entries
- added testing for ompt entries
- add_critical_trace function defined in critical_trace.hpp
- disable push_thread_state and pop_thread_state when thread state is Disabled or Completed
- add comp::page_rss to main bundle
- thread_data supports std::optional instead of std::unique_ptr
- thread_data supports tim::identity<T> to avoid unique_ptr or optional
- tracing::record_thread_start_time()
- tracing::push_timemory and tracing::pop_timemory are templated on CategoryT
- removed anonymous namespace from omnitrace::utility
- sampling backtrace stores instruction pointers instead of strings
- component::category_region updates
  - handle disabled thread state
  - handle finalized state
  - fewer debug messages
  - invoke thread_init()
  - invoke thread_init_sampling()
  - handle push/pop count based on category
  - push/pop count only modified when used
- component::cpu_freq
- components/ensure_storage.hpp
- reworked the pthread_create replacement function
- updated parallel-overhead example to report # of times locked
- OMNITRACE_MAX_UNWIND_DEPTH build option
- update timemory submodule
2022-08-31 01:24:31 -05:00
Jonathan R. Madsen 040da3fc6a Enable TRACE_THREAD_RW_LOCKS and TRACE_THREAD_SPIN_LOCKS by default (#136)
* Enable OMNITRACE_TRACE_THREAD_{RW,SPIN}_LOCKS by default

- updates timemory submodule with updated GOTCHA submodule
- GOTCHA library fix (defaults to not wrapping dlopen and dlsym) prevents deadlock

* Bump version to 1.4.0
2022-08-08 15:28:01 -05:00
Jonathan R. Madsen 34013bc539 OMNITRACE_TRACE_THREAD_SPIN_LOCKS config (#134)
- configuration setting to wrap pthread_spin_lock, pthread_spin_trylock, pthread_spin_unlock
2022-08-08 08:38:52 -05:00
Jonathan R. Madsen afa3df8523 Advanced category for configuration options (#125)
Adds advanced category

- advanced category hides less relevant configuration options
- omnitrace-avail has new '--advanced' option which shows these flags
- increase verbosity level to print issue with reading ppid children
- OMNITRACE_ROCTRACER_HSA_ACTIVITY defaults to ON
- OMNITRACE_ROCTRACER_HSA_API defaults to ON
2022-08-03 12:13:00 -05:00
Jonathan R. Madsen 45be03906a RCCL support (#93)
* Initial support for RCCL

* OMNITRACE_USE_RCCLP + sampling tweaks

- also OMNITRACE_SAMPLING_KEEP_INTERNAL option
- minor modifications to sampling to use keep internal option + discard funlockfile

* Update docker and workflows to download RCCL

* Update CPack DEB with rocprofiler dependency

* Rework rccl into library and library/components folder

- add tpls/rccl/rccl/rccl.h

* Fix timemory includes

* rcclp inline definitions when disabled

* Tweaks to ubuntu-focal-external-rocm

- disable ompt
- enable building testing

* Tweaks to ubuntu-focal-external-rocm

- ctest exclude

* Tweak ubuntu-focal.yml

- remove source /.../setup-env.sh, replace with $GITHUB_ENV

* Fix ubuntu-focal-rocm + OMPI + root

* Improved rocm-smi error handling

- Recover from rocm-smi errors
- Disabling rocm-smi after recovering from errors
- Werror in developer mode
- Remove State::DelayedInit
- Add State::Disabled

* formatting

* Fix merge of OMNITRACE_SAMPLING_KEEP_INTERNAL

* Update RCCL include directory

- based on ROCm version we need either <rccl/rccl.h> or <rccl.h>

* RCCL Testing

- updated tests to use configuration files
- many tests generate a configuration file
- tests now have GPU option
- enable ncclCommCount, disable ncclGetVersion
- add testing for RCCLP via rccl-tests
- working directory of tests is PROJECT_BINARY_DIR
- add nccl/rccl functions to get_whole_function_names
- some clang compiler fixes

* Handle RCCL include w/o HIP

* RCCL requires HIP

* Update OMNITRACE_SAMPLING_CPUS for testing

* Update tests/CMakeLists.txt

* Debug settings

* Install MPI even when USE_MPI=OFF

* exclude printf

* skip mpi tests w/o USE_MPI or USE_MPI_HEADERS

* update ubuntu rocm workflow

* Fix configure env step for ubuntu rocm
2022-07-25 12:16:11 -05:00
Jonathan R. Madsen 99da25ea80 exit gotcha + remove DelayedInit state + rocm-smi + cleanup (#110)
* exit gotcha + remove DelayedInit state + cleanup

- exit gotcha which wraps exit, quick_exit, abort
- minor refactor of mpi gotchas
- removed some redundant code in omnitrace_finalized_hidden
- exclude instrumenting functions starting with dlopen and dlsym
- exclude instrumenting exit, quick_exit, and abort functions
- update timemory submodule with support for new gotcha_invoker with (gotcha_data, <function pointer>, args...)
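
The two exclusion rules above (a prefix match for `dlopen`/`dlsym` and exact names for the exit functions) can be sketched as a predicate. This is a Python illustration with hypothetical names; the real check lives in omnitrace's instrumentation code and may differ in detail.

```python
EXCLUDED_EXACT = {"exit", "quick_exit", "abort"}
EXCLUDED_PREFIXES = ("dlopen", "dlsym")


def is_excluded(func_name):
    """Return True if a function should not be instrumented (sketch)."""
    return func_name in EXCLUDED_EXACT or func_name.startswith(EXCLUDED_PREFIXES)


print([name for name in ("main", "dlopen", "dlsym_impl", "abort") if is_excluded(name)])
```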

* Improved rocm_smi error handling
2022-07-24 22:09:32 -05:00
Jonathan R. Madsen bcdec188eb Fix reliability when KOKKOS_PROFILE_LIBRARY is set in env (#103)
* Fix reliability when KOKKOS_PROFILE_LIBRARY is set in env

- in certain situations, an exe using kokkos may be instrumented
- this will cause libomnitrace to be dlopened via libomnitrace-dl
- if KOKKOS_PROFILE_LIBRARY is set to libomnitrace and not libomnitrace-dl, you will end up with different instances of libomnitrace trying to collect data

* Set OMNITRACE_MAX_THREADS=32 in CI
2022-07-24 22:01:53 -05:00
Jonathan R. Madsen d27f22ea37 Sampling use SIGRTMIN + N signals (#104)
* Use SIGRTMIN instead of SIGALRM for sampling

* Config options + fully working SIGRTMIN sampling

- OMNITRACE_SAMPLING_KEEP_INTERNAL config option
- OMNITRACE_PROCESS_SAMPLING_FREQ config option
- OMNITRACE_SAMPLING_REALTIME config option
- OMNITRACE_SAMPLING_CPUTIME config option
- OMNITRACE_SAMPLING_REALTIME_OFFSET config option

* Fix omnitrace-avail-regex-negation test

- OMNITRACE_PROCESS_SAMPLING_FREQ was causing failure
2022-07-22 14:17:27 -05:00
Jonathan R. Madsen c006e542a5 Replaces OMNITRACE_CONDITIONAL_BASIC_PRINT with OMNITRACE_VERBOSE (#97)
Replaces OMNITRACE_CONDITIONAL_BASIC_PRINT with OMNITRACE_VERBOSE and similar where possible
2022-07-21 11:15:26 -05:00
Jonathan R. Madsen 16bd20121e Support for disabling perfetto categories (#72)
* Updates

- get_perfetto_categories() impl
- rework perfetto category usage

* TIMEMORY_DEFINE_NAME_TRAIT for cpu_freq::cpu_peak

* Add process_memory_hwm perfetto category
2022-07-18 08:25:48 -05:00
Jonathan R. Madsen d22725e830 Support ACTIVITY_DOMAIN_ROCTX (#87)
- New configuration variable: OMNITRACE_USE_ROCTX
- Enable support for roctxRangePushA, roctxRangePop, roctxRangeStartA, roctxRangeStop
2022-07-18 02:06:40 -05:00
Jonathan R. Madsen 4208b5654c GPU HW Counters via rocprofiler (#84)
* Initial support for GPU hardware counters

* Update find modules for roctracer and rocprofiler

- /opt/rocm/{rocprofiler,roctracer} path is deprecated so tweak search procedure

* Improve ConfigCPack for MPI

* Update rocprofiler

- rocm_metrics()
- minor cleanup

* Update rocm find modules

* declare rocm_metrics + call in omnitrace-avail

* relocate omnitrace-launch-compiler

* REALPATH and find_modules

* Examples cmake (may drop)

* omnitrace-avail

- hw_counter categories
- init rocm

* setenv updates for rocprofiler in library.cpp and dl.cpp

* get_rocm_events config

* gpu::hip_device_count()

* rocm_metrics returns hardware_counters::info

* - relocated library/components/roctracer_callbacks.* to library/roctracer.*
- relocated library/components/rocprofiler.* to library/rocprofiler.*
- cleaned up rocprofiler.hpp
- added perfetto output of rocprofiler
- added timemory output of rocprofiler
- renamed omni.roctracer thread to roctracer.hip
- added roctracer.hsa thread name
- updated timemory submodule to support std::variant
- updated timemory submodule to support = in config value
- updated timemory submodule to support standalone storage
- updated timemory submodule to support new hw counter apis
- updated timemory submodule to prevent label/description caching in data_tracker

* update omnitrace-avail info_type generation

* Update timemory submodule

* rocprofiler component

* cmake formatting

* omnitrace-avail handle no GPUs

- Add -c command-line option for --categories
- support verbosity

* hsa_rsrc_factory throws exceptions

- throw exceptions to avoid aborting on HSA_STATUS_ERROR_NOT_INITIALIZED when advantageous
- removed duplicate specialization of is_available for component::rocprofiler

* rocprofiler symbols for when disabled

* Fix warning in omnitrace-avail

- std::stringstream from initializer list would use explicit constructor

* Fix finalization after settings are deleted

* Reorganized rocprofiler source

* Updated formatting

* Miscellaneous tweaks

- added using statements from timemory
- tweaked the main and thread bundle names
- fixed timemory header includes
2022-07-17 21:52:09 -05:00
Jonathan R. Madsen e099c84640 pthread_rwlock deadlock fix (#82)
- found when using ROCm-enabled OpenMPI with rocHPL
  - when wrapping pthread_rwlock_rdlock, pthread_rwlock_wrlock, and pthread_rwlock_unlock, omnitrace has been found to deadlock for some unknown reason
- New configuration variable: OMNITRACE_TRACE_THREAD_RW_LOCKS which defaults to false
2022-07-11 20:59:57 -05:00
Jonathan R. Madsen f6b6be3ddc Fix empty OMNITRACE_CONFIG_FILE and suppressing config and parsing (#81)
* Fix empty OMNITRACE_CONFIG_FILE and suppressing config and parsing
2022-07-11 18:46:12 -05:00
Jonathan R. Madsen f82845388a Handle OMNITRACE_ENABLED + minor updates (#78)
- handle OMNITRACE_ENABLED=OFF by disabling everything
- use set_data to get wrappee in pthread_create_gotcha
- clear roctracer_data storage if roctracer not initialized
2022-06-29 19:39:55 -05:00
Jonathan R. Madsen 1877ebf47b omnitrace-avail generate config (#69)
* Config updates

- See PR #69 for details

- change type of OMNITRACE_DL_VERBOSE
- add "deprecated" category to OMNITRACE_ROCM_SMI_DEVICES
- reduce size of perfetto shared memory size hint
- deprecate OMNITRACE_OUTPUT_FILE in favor of OMNITRACE_PERFETTO_FILE
- set papi event choices
- read config file after reading command line
- fix update of OMNITRACE_DL_VERBOSE
- mark several settings as hidden
- timemory update support hidden attribute for settings
- rework get_perfetto_output_filename()
- Hide settings from not available backends

* Rework omnitrace-avail to support dumping configurations

* Overwrite query, tests, output flag

- Support using -O flag when dumping config
- Support checking before overwriting existing config
- Support --force to overwrite existing config
- Fix get_component_info not including omnitrace components
- Testing for dumping config

* Update documentation on omnitrace-avail

* Fix issue with timemory format + "/__w/"

* Update output prefix keys docs

* Rename --dump-config to --generate-config

* Hide MPI related options

- OMNITRACE_PERFETTO_COMBINE_TRACES and OMNITRACE_COLLAPSE_PROCESSES are hidden w/o MPI support
2022-06-28 01:36:04 -05:00
Jonathan R. Madsen efe1edd253 Fix PID resolution + OMNITRACE_VERSION + fix various configs (#71)
* Fix to pid via mpi_gotcha

* OMNITRACE_VERSION defines

* call perfetto on hsa_activity_callback thread

* Test label tweak

* Config fixes

- Change type of OMNITRACE_DL_VERBOSE
- Update OMNITRACE_DL_VERBOSE properly
- Add OMNITRACE_ROCM_SMI_DEVICES to deprecated group
- Set default_process_suffix

* metadata for OMNITRACE_VERSION and OMNITRACE_HIP_VERSION
2022-06-27 23:01:24 -05:00
Jonathan R. Madsen 5105e2c94f tracing NS + category region component + MPI args (#52)
tracing NS + category region component

- made library.cpp impl more broadly available
- support for perfetto args
- MPI wrappers encode args and return type
- new categories / perfetto categories
- omnitrace_library category -> libomnitrace
- omnitrace_dl_library -> libomnitrace-dl
2022-06-24 16:08:06 -05:00
Jonathan R. Madsen 27e4e82376 Deprecate omnitrace use thread sampling (#68)
* Deprecate OMNITRACE_USE_THREAD_SAMPLING

* Reworked config based on OMNITRACE_MODE

- config::set_default_setting_value(...)
- config::get_mode() is now dynamically deduced
- moved tweaking defaults from library.cpp to config::configure_mode_settings(...)
- timemory submodule update fixing vsetting issue

* runtime.md update

* revert accidental lambda name change

* Reintroduce (deprecated) OMNITRACE_ROCM_SMI_DEVICES

- add handle_deprecated_setting(...) for this deprecated setting
2022-06-24 15:03:15 -05:00
Jonathan R. Madsen 354bbf2a32 Rename OMNITRACE_ROCM_SMI_DEVICES to OMNITRACE_SAMPLING_GPUS (#58)
- support ranges in OMNITRACE_SAMPLING_GPUS
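
A range-aware device list might be parsed along the following lines. The commit only states that ranges are supported; the exact syntax assumed here (comma-separated entries with `a-b` ranges, e.g. `0-2,5`) and the function name are illustrative assumptions.

```python
def parse_index_ranges(spec):
    """Expand a spec such as "0-2,5" into a sorted list of indices (sketch)."""
    indices = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            indices.update(range(int(lo), int(hi) + 1))  # inclusive range
        else:
            indices.add(int(part))
    return sorted(indices)


print(parse_index_ranges("0-2,5"))
```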
2022-06-22 15:01:13 -05:00
Jonathan R. Madsen f27f062e88 Fixes OMNITRACE_SUPPRESS_CONFIG handling (#53)
Fixes OMNITRACE_SUPPRESS_CONFIG

- now, if set to ON, the config file will not be read
2022-06-20 00:49:57 -05:00
Jonathan R. Madsen 3ca81fd8c0 Support strict settings option in timemory + expanded config syntax (#31)
Support strict settings option in timemory

- timemory settings updates
  - strict config option
  - improved variable support
    - $env: lvalues
    - support for ${VARIABLE} syntax
    - support for variable expansion in substring
  - chained config files
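
The `${VARIABLE}` expansion inside a substring can be pictured with a short regex-based sketch. This is not timemory's implementation: the function name is hypothetical, the `$env:` form is not modeled, and leaving unknown variables untouched is an assumption.

```python
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(value, variables):
    """Replace each ${NAME} in `value` using `variables` (sketch)."""
    def lookup(match):
        name = match.group(1)
        # Unknown names are left as written (assumed behavior).
        return variables.get(name, match.group(0))

    return _VAR.sub(lookup, value)


print(expand("${OUT_DIR}/traces/${RUN}", {"OUT_DIR": "/data", "RUN": "r1"}))
```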
2022-06-08 16:58:06 -05:00
Jonathan R. Madsen 353e8eeb69 Critical trace updates (#6)
* critical trace updates

- better handling of OMNITRACE_USE_PERFETTO in omnitrace-critical-trace exe
- changed some data types in `critical_trace::entry`
- added device ids to critical trace entries
- added process ids to critical trace entries
- added packing to critical trace entries

* Update timemory submodule
2022-05-24 19:25:54 -05:00
Jonathan R. Madsen 8146426e8b Install perfetto tools option (#58)
* Install perfetto tools option

- E.g. traced, perfetto, etc.

* Fix copying of perfetto directory

* Require curl for installing perfetto tools

* Fix to locating tools/ninja
2022-05-11 15:05:09 -05:00
Jonathan R. Madsen 346f8cd0bc Option rename + minor fixes (#57)
- Set choices of OMNITRACE_BACKEND option
- rename OMNITRACE_SHMEM_SIZE_HINT_KB option
- rename OMNITRACE_BUFFER_SIZE_KB option
- rename OMNITRACE_COMBINE_PERFETTO_TRACES
- rename OMNITRACE_BACKEND option
- default to OMNITRACE_COLLAPSE_PROCESSES for combining perfetto traces
- OMNITRACE_PERFETTO_FILL_POLICY option
- fix unused variables due to constexpr in add_critical_trace
- rename perfetto config from "track_event" to "omnitrace"
- fix build-release.sh + python
- handle config file updating OMNITRACE_DL_VERBOSE in omnitrace-dl
- rename roctrace.cfg to omnitrace.cfg
- accept "on" and "off" for get_sampling_cpus()
2022-05-10 17:30:45 -05:00
Jonathan R. Madsen b208047741 Support for tracing mutex locking (#52)
* Parallel overhead example with locks

* Support tracing mutex locking + more

- support wrapping pthread_mutex_lock
- support wrapping pthread_mutex_unlock
- support wrapping pthread_mutex_trylock
- get_perfetto_combined_traces setting
- OMNITRACE_TRACE_THREAD_LOCKS option
- ThreadState
- critical trace includes queue id
- enabled/disabled settings in timemory
- fix OMNITRACE_TIMEMORY_COMPONENTS
- fix reading config
- fix setting categories
- applied ThreadState::Internal in various places
- utility::get_filled_array
- utility::get_reserved_vector
- utility::get_thread_index
- fork_gotcha messages about forks
- split out some pthread_gotcha functionality into pthread_create_gotcha
- handle queue id in roctracer callbacks

* Update timemory and PTL submodules

* Misc CMake updates

- Includes fix to omnitrace-static-lib{gcc,stdcxx}

* Misc cleanup to pthread_mutex_gotcha and backtrace

* Fix to duplicate field in module_function json

* Improvement to debug messages

* omnitrace-dl and common improvements

- tweak to delimit
- common::ignore message
- common::join quoting of strings
- omnitrace_set_env ignores if inited and active
- omnitrace_set_mpi ignores if inited and active

* nsync for transpose example

* Fix to thread_deleter<void> functor invoke

* Fix thread state and HIP stream enums
2022-05-08 04:40:10 -05:00
Jonathan R. Madsen 1f66e23fdd Reorganize source/lib/omnitrace (#51)
- Got rid of `source/lib/omnitrace/include` and `source/lib/omnitrace/src` and merged into `source/lib/omnitrace`
- Updated perfetto submodule to v25.0
- Updated papi submodule
2022-05-02 13:08:51 -05:00