Jonathan R. Madsen
08dc696e47
Fix for empty perfetto output ( #7 )
...
Fix to perfetto config
- Erroneously replaced data_sources config "track_event" with "omnitrace"
- Using "omnitrace" resulted in empty perfetto output files
[ROCm/rocprofiler-systems commit: ee67748042 ]
2022-05-25 00:35:02 -05:00
Jonathan R. Madsen
3f7ae0b01e
Documentation on metadata JSON file [skip ci] ( #8 )
...
- "CPU_FEATURES" entry is now a JSON array, not a string
- Adds examples of "memory_maps" and "memory_map_files"
- Provides a list overview of the contents
[ROCm/rocprofiler-systems commit: f9b3b28d34 ]
2022-05-24 23:03:40 -05:00
Jonathan R. Madsen
76f30b1c6d
omnitrace find_package support ( #3 )
...
* omnitrace find_package support
- Fix to INSTALL_DESTINATION for configure_package_config_file
- Fixes to ConfigInstall.cmake and omnitrace-config.cmake.in
* Test find_package
[ROCm/rocprofiler-systems commit: d2e635ed3c ]
2022-05-24 22:45:26 -05:00
Jonathan R. Madsen
e84ffb80de
omnitrace function exclude updates ( #5 )
...
- These functions cause weird call-stack behavior when instrumented
- rocr::image::ImageRuntime::CreateImageManager
- rocr::AMD::GpuAgent::GetInfo
- rocr::HSA::hsa_agent_get_info
- These functions cause out-of-order call-stacks when KokkosP is enabled
- Kokkos::Profiling::*
[ROCm/rocprofiler-systems commit: 6491ce7808 ]
2022-05-24 19:26:12 -05:00
Jonathan R. Madsen
43b257a03b
Critical trace updates ( #6 )
...
* critical trace updates
- better handling of OMNITRACE_USE_PERFETTO in omnitrace-critical-trace exe
- changed some data types in `critical_trace::entry`
- added device ids to critical trace entries
- added process ids to critical trace entries
- added packing to critical trace entries
* Update timemory submodule
[ROCm/rocprofiler-systems commit: 353e8eeb69 ]
2022-05-24 19:25:54 -05:00
Jonathan R. Madsen
0b75ce03a0
Minor updates for transpose, timemory submodule, roctracer, and omnitrace exe ( #4 )
...
* transpose usage message
* timemory submodule update
* roctracer updates
- Changes to verbosity of roctracer::shutdown
- protect_flush_activity prevents deadlock when error in callback
* Removed linking to timemory-cxx in omnitrace
- omnitrace exe does not link to `timemory-cxx` target
[ROCm/rocprofiler-systems commit: 5b2c27cccd ]
2022-05-24 18:35:33 -05:00
Jonathan R. Madsen
6a2377262b
Packaging test scripts + cpack fixes ( #2 )
...
[ROCm/rocprofiler-systems commit: f26b3b81d6 ]
2022-05-24 17:30:27 -05:00
Jonathan R. Madsen
9cf7d13388
cpack workflow fixes ( #64 )
...
- increase timeout
- exclude opensuse 15.2 + rocm 5.1
- combined extensions
[ROCm/rocprofiler-systems commit: ff7151dbf2 ]
2022-05-19 23:45:46 -05:00
Jonathan R. Madsen
feb2b45c0d
build-docker.sh: CI -> BUILD_CI ( #63 )
...
- docker run fix (remove -it argument)
[ROCm/rocprofiler-systems commit: 919dcf5456 ]
2022-05-19 21:16:20 -05:00
Jonathan R. Madsen
048d7cb856
Fix stray character in dockerfile.opensuse ( #62 )
...
[ROCm/rocprofiler-systems commit: c7e9627e75 ]
2022-05-19 16:12:45 -05:00
Jonathan R. Madsen
49b226a0ab
Fixes for roctracer_callbacks PP regions ( #59 )
...
- define OMNITRACE_HIP_VERSION
- fix for ROCm < 4.3
- fix for PP blocks based on HIP version
[ROCm/rocprofiler-systems commit: 506c26cf82 ]
2022-05-19 16:07:27 -05:00
Jonathan R. Madsen
cab75263f4
Timemory procfs utilities ( #60 )
...
- Serialize memory maps
- Utilize tim::utility::procfs::cpuinfo::freq in cpu_freqs.cpp
[ROCm/rocprofiler-systems commit: c2b206ba28 ]
2022-05-19 16:07:11 -05:00
Jonathan R. Madsen
358a3a7e36
Docker and build-release script updates [skip ci] ( #61 )
...
- Update CPack
[ROCm/rocprofiler-systems commit: 9cba1f80ba ]
2022-05-19 16:06:38 -05:00
Jonathan R. Madsen
27103d771b
Install perfetto tools option ( #58 )
...
* Install perfetto tools option
- E.g. traced, perfetto, etc.
* Fix copying of perfetto directory
* Require curl for installing perfetto tools
* Fix to locating tools/ninja
[ROCm/rocprofiler-systems commit: 8146426e8b ]
2022-05-11 15:05:09 -05:00
Jonathan R. Madsen
57ef312d26
Option rename + minor fixes ( #57 )
...
- Set choices of OMNITRACE_BACKEND option
- rename OMNITRACE_SHMEM_SIZE_HINT_KB option
- rename OMNITRACE_BUFFER_SIZE_KB option
- rename OMNITRACE_COMBINE_PERFETTO_TRACES
- rename OMNITRACE_BACKEND option
- default to OMNITRACE_COLLAPSE_PROCESSES for combining perfetto traces
- OMNITRACE_PERFETTO_FILL_POLICY option
- fix unused variables due to constexpr in add_critical_trace
- rename perfetto config from "track_event" to "omnitrace"
- fix build-release.sh + python
- handle config file updating OMNITRACE_DL_VERBOSE in omnitrace-dl
- rename roctrace.cfg to omnitrace.cfg
- accept "on" and "off" for get_sampling_cpus()
[ROCm/rocprofiler-systems commit: 346f8cd0bc ]
2022-05-10 17:30:45 -05:00
Jonathan R. Madsen
77721c2db5
Remove wikipedia links [skip ci] ( #56 )
...
[ROCm/rocprofiler-systems commit: ef202f3d86 ]
2022-05-10 13:16:04 -05:00
Jonathan R. Madsen
facd23b7bb
Docs images [skip ci] ( #55 )
...
* Added images of perfetto in docs
* README images + updates
[ROCm/rocprofiler-systems commit: ae2ea090fb ]
2022-05-08 07:57:09 -05:00
Jonathan R. Madsen
14d8998ba0
Fix $HOME/.omnitrace [skip ci] ( #54 )
...
[ROCm/rocprofiler-systems commit: e60fae5361 ]
2022-05-08 06:21:14 -05:00
Jonathan R. Madsen
0d5f0fb9cf
Support for tracing mutex locking ( #52 )
...
* Parallel overhead example with locks
* Support tracing mutex locking + more
- support wrapping pthread_mutex_lock
- support wrapping pthread_mutex_unlock
- support wrapping pthread_mutex_trylock
- get_perfetto_combined_traces setting
- OMNITRACE_TRACE_THREAD_LOCKS option
- ThreadState
- critical trace includes queue id
- enabled/disabled settings in timemory
- fix OMNITRACE_TIMEMORY_COMPONENTS
- fix reading config
- fix setting categories
- applied ThreadState::Internal in various places
- utility::get_filled_array
- utility::get_reserved_vector
- utility::get_thread_index
- fork_gotcha messages about forks
- split out some pthread_gotcha functionality into pthread_create_gotcha
- handle queue id in roctracer callbacks
* Update timemory and PTL submodules
* Misc CMake updates
- Includes fix to omnitrace-static-lib{gcc,stdcxx}
* Misc cleanup to pthread_mutex_gotcha and backtrace
* Fix to duplicate field in module_function json
* Improvement to debug messages
* omnitrace-dl and common improvements
- tweak to delimit
- common::ignore message
- common::join quoting of strings
- omnitrace_set_env ignores if inited and active
- omnitrace_set_mpi ignores if inited and active
* nsync for transpose example
* Fix to thread_deleter<void> functor invoke
* Fix thread state and HIP stream enums
[ROCm/rocprofiler-systems commit: b208047741 ]
2022-05-08 04:40:10 -05:00
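The mutex-lock tracing added in #52 can be illustrated with a minimal sketch. This is not the omnitrace implementation (which wraps the pthread functions via its gotcha components); it is an ordinary wrapper function, and the names `lock_stats`, `get_lock_stats`, and `traced_mutex_lock` are hypothetical.

```cpp
#include <chrono>
#include <cstdint>
#include <pthread.h>

// Hypothetical per-thread record of lock activity (not an omnitrace type)
struct lock_stats
{
    std::int64_t count   = 0;  // number of lock calls observed
    std::int64_t wait_ns = 0;  // total time spent inside pthread_mutex_lock
};

inline lock_stats&
get_lock_stats()
{
    static thread_local lock_stats _v{};
    return _v;
}

// Calls pthread_mutex_lock and records how long the call took.
// The return value of pthread_mutex_lock is passed through unchanged.
inline int
traced_mutex_lock(pthread_mutex_t* mutex)
{
    auto _start = std::chrono::steady_clock::now();
    int  _ret   = pthread_mutex_lock(mutex);
    auto _stop  = std::chrono::steady_clock::now();

    auto& _stats = get_lock_stats();
    ++_stats.count;
    _stats.wait_ns +=
        std::chrono::duration_cast<std::chrono::nanoseconds>(_stop - _start).count();
    return _ret;
}
```

A caller would substitute `traced_mutex_lock(&m)` for `pthread_mutex_lock(&m)`; a real tool intercepts the symbol instead so that application code does not change.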
Jonathan R. Madsen
0094a471fd
Update documentation ( #53 )
...
- updated info about OMNITRACE_USE_MPI
- removed wiki links
- info about metadata.json
- update HW counters and fix typos
- fix update-docs.sh
[ROCm/rocprofiler-systems commit: bab90baf0b ]
2022-05-08 02:51:35 -05:00
Jonathan R. Madsen
060da8159c
Code coverage updates ( #50 )
...
* code coverage updates
- python support
- refactored source
* remove code_coverage::operator+ and operator+=
* impl/coverage.hpp
[ROCm/rocprofiler-systems commit: 134b33320d ]
2022-05-08 01:40:56 -05:00
Jonathan R. Madsen
00315e1e2f
Reorganize source/lib/omnitrace ( #51 )
...
- Removed `source/lib/omnitrace/include` and `source/lib/omnitrace/src`; merged their contents into `source/lib/omnitrace`
- Updated perfetto submodule to v25.0
- Updated papi submodule
[ROCm/rocprofiler-systems commit: 1f66e23fdd ]
2022-05-02 13:08:51 -05:00
Jonathan R. Madsen
b3c5a6f048
perfetto mpi + mpi example ( #49 )
...
[ROCm/rocprofiler-systems commit: 6b7b6e46cf ]
2022-04-27 16:58:45 -05:00
Jonathan R. Madsen
2bb6fd0cfb
Misc updates ( #48 )
...
- reworked `add_critical_trace`
- `get_use_thread_sampling` / `"OMNITRACE_USE_THREAD_SAMPLING"` option
- `get_cpu_cid_stack_lock`
- reworked finalization messaging
- significant updates to pthread_gotcha
- shutdown stability
- `"start_thread"` entries
- `rocm_smi` stability
- roctracer_callbacks add critical trace entries on the callback thread
- reworked CPU CID initialization
- thread_sampler stability
[ROCm/rocprofiler-systems commit: 9b25d4b3b5 ]
2022-04-27 16:56:38 -05:00
Jonathan R. Madsen
d45e84b116
GOTCHA + Kokkos + tasking + more ( #47 )
...
* GOTCHA + Kokkos + tasking + more
- update gotcha with fix for dlsym(RTLD_NEXT, ...)
- support for standalone KOKKOS_PROFILE_LIBRARY
- remove extra flags for omnitrace-user
- roctracer and critical_trace namespaces in tasking
- generic tasking functions, e.g. join(), shutdown(), etc.
- omnitrace_init_tooling_hidden in api.hpp
- ompt.cpp uses OMNITRACE_USE_OMPT
- kokkosp uses user_region instead of omnitrace component
- re-enable recycling thread ids
- more generic _{push,pop}_perfetto functors
- fix for thread_data::instance(construct_on_init, ...)
- fix for omnitrace-headers interface target
- omnitrace_watch_for_change
[ROCm/rocprofiler-systems commit: 29220cba58 ]
2022-04-26 22:08:51 -05:00
Jonathan R. Madsen
72d0a7d08a
Code Coverage Support ( #46 )
...
* Code-coverage support
* Examples update
- code-coverage example
- tweak transpose and parallel-overhead
* Coverage output + testing
- config::get_setting value(...)
- REGULAR_EXPRESSION -> REGEX in cmake func args
- coverage.hpp header
- coverage JSON
- coverage tests
* cmake formatting
* Library instrumentation w/o main + more
- fixed library instrumentation w/o main
- use TIMEMORY_PROJECT_NAME in output messages
- removed '--driver' option from omnitrace exe
- support coverage in trace mode
- OMNITRACE_KOKKOS_KERNEL_LOGGER
- support multiple calls to omnitrace_set_env after init if already called
- support multiple calls to omnitrace_set_mpi after init if same args
- support multiple calls to omnitrace_init if same mode
- unique_ptr_t for thread_data which calls finalize when thread_data is destroyed
- tweaked openmp tests
- improved finalization
* Replace CI --output-on-failure with -V
* Fix to OMNITRACE_DL_INVOKE
* omnitrace-exe and testing updates
- omnitrace::omnitrace-timemory interface library
- support for configs in omnitrace exe
- print-{available,instrumented,...} opts no longer exit w/o --simulate
- all tests apply --print-instrumented functions
- tweaked coverage tests
- print-* options print instructions not address range
* Remove OMNITRACE_DEBUG_FINALIZE=ON from CI
* Python cmake tweaks
* Tweak test ordering
* Upload CI artifacts if fail or success
* CI Python tweaks
- Use OMNITRACE_PYTHON_PREFIX and OMNITRACE_PYTHON_ENVS
* CI ELFUTILS_DOWNLOAD_VERSION
* test tweaks
- labels and more coverage tests
* tweak to omnitrace --config handling
* Update module/function constraint handling + PP
- tweak pre-processor definition handling
- removed free-standing module_constraint
- remove free-standing routine_constraint
- remove module_name.find("omnitrace") module constraint
- fully handle the output path of omnitrace *-instr files
- get_use_code_coverage config option
- print-coverage option
- coverage_module_functions
* use github.job not github.name
* Re-enable HSA_ENABLE_INTERRUPT
- remove coverage address report
[ROCm/rocprofiler-systems commit: 791375bb24 ]
2022-04-25 17:00:52 -05:00
Jonathan R. Madsen
28ade7fbb9
Update CI to test multiple python versions ( #45 )
...
* Update CI to test multiple python versions
* Ensure numpy is installed
* Handle lulesh with cmake < 3.16
* Fix typo
* Bump minimum CMake version to 3.16
- CMake 3.15 has issue with PTL object library
* Tweak CI test output
[ROCm/rocprofiler-systems commit: 22eaa780ec ]
2022-04-22 03:05:07 -05:00
Jonathan R. Madsen
55fb69a57c
Miscellaneous fixes ( #44 )
...
* Miscellaneous fixes
- handle HSA OnLoad called during omnitrace-avail
- disable setting HSA_ENABLE_INTERRUPT when roctracer not used
- sampler max verbose
- fix roctracer get_clock_skew
- cleanup roctracer debug output
- update timemory submodule with fence
- simplify min-instructions vs. min-address-range specification
- exclude cxx regex updates
- disable HSA_TOOLS_LIB and HSA_ENABLE_INTERRUPT when no roctracer
* git safe.directory
[ROCm/rocprofiler-systems commit: 77703ef4f1 ]
2022-04-21 22:59:50 -05:00
Jonathan R. Madsen
b4b5acf0a6
omnitrace-compile-definitions (CMake) [skip ci] ( #43 )
...
[ROCm/rocprofiler-systems commit: cc9ce3a871 ]
2022-04-21 21:52:57 -05:00
Jonathan R. Madsen
a438000c21
Multiple python versions ( #42 )
...
* Support multiple Python versions in single build
* RPATH + Split up config into config and runtime
* pybind11 submodule
* Docker build updates
[ROCm/rocprofiler-systems commit: 4db6ba3d28 ]
2022-04-21 21:36:07 -05:00
Jonathan R. Madsen
681678ff11
Support for building PAPI via a submodule ( #41 )
...
* Enable building PAPI via submodule
* Miscellaneous fixes
- Use TIMEMORY_PAPI_ARRAY_SIZE in backtrace
- remove pthread_gotcha init from fork_gotcha::configure
- fix HSA OnLoad called before tooling init
* PAPI array size + PAPI.cmake updates
- updated timemory submodule with PAPI updates
- fix for backtrace _hw_cnt_labels
* Disable OMPT for focal
* format
[ROCm/rocprofiler-systems commit: d98e60a17f ]
2022-04-21 20:33:51 -05:00
Jonathan R. Madsen
317240ca1c
Setup and Nomenclature pages [skip ci] ( #40 )
...
[ROCm/rocprofiler-systems commit: e24c24dc56 ]
2022-04-12 00:49:55 -05:00
Jonathan R. Madsen
44c7c29b54
Workaround for dyninst bug with SIGTRAP ( #39 )
...
- on some systems (e.g. OLCF Crusher), dyninst has been observed to raise SIGTRAP (or SIGILL if DYNINST_SIGNAL_TRAMPOLINE_SIGILL is set in the environment)
- this fix adds an environment variable, OMNITRACE_IGNORE_DYNINST_TRAMPOLINE, which, when enabled, attempts to ignore that signal
[ROCm/rocprofiler-systems commit: d3c73a5860 ]
2022-04-05 20:46:17 -05:00
Jonathan R. Madsen
e7546b201a
Python updates ( #38 )
...
* silence SFINAE disabled for fork_gotcha
* Python updates
- Options for --{module,function}-include
- libpyomnitrace is_initialized and is_finalized
- source instrumentation auto init
- atexit finalization
- improved python testing
* Documentation Update
* Fix to 'cmake -E cat' not available < cmake v3.18
* Fix for inverse tests
* Update cancelling.yml
[ROCm/rocprofiler-systems commit: 593b3b69b8 ]
2022-04-05 20:40:27 -05:00
Jonathan R. Madsen
6daac0f60c
Python support ( #37 )
...
* Initial python support
* Add python testing
* Increase timeout for bin tests
* cmake-format
* Valid build types + testing + formatting + more
- Enforce valid build types
- Fix to numpy install
- Increase testing timeout
- Fix to cmake format glob
- Fix to backtrace verbose
* Disable stripping libraries by default
* omnitrace exe updates
- new '--print-instructions' option
- changed format of instructions in JSON
- remove no-save-fpr tests
* Default to strip libraries when release build
[ROCm/rocprofiler-systems commit: afa3edebab ]
2022-04-05 00:24:34 -05:00
Jonathan R. Madsen
127e30a4d7
Documentation + Miscellaneous Fixes ( #36 )
...
* Added documentation markdown source
* Replaced AARInternal with AMDResearch in URLs
* Renamed cpack artifact names
* Fix to testing and lulesh submodule checkout
* Docker updates
* CMake and CPack
- force CMAKE_INSTALL_LIBDIR to lib
- CPACK_DEBIAN_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- CPACK_RPM_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- Tweak LIBOMP_LIBRARY find in examples/openmp
- Tweak setup-env.sh.in
* Partial update of README
- status badges
- docs link
- removed install info (covered by docs)
* OMNITRACE_SAMPLING_CPUS setting
- enables control over which CPUs are sampled for frequency
* omnitrace exe updates
- exclude transaction clone, virtual thunk, non-virtual thunk
- module_function::start_address
- module_function::instructions
- verbosity > 0 encodes instructions into JSON
* Miscellaneous fixes
- relocate setup-env.sh.in
- add modulefile.in
- Updated README.md and source/docs/about.md
- cmake fix for libomp
- fix license in miscellaneous places
- dl.hpp and dl.cpp
* Update timemory and dyninst submodules
- timemory signals updates
- dyninst Movement-adhoc updates
* cmake format
[ROCm/rocprofiler-systems commit: 945f541965 ]
2022-04-04 15:27:38 -05:00
Jonathan R. Madsen
4ddb8405ac
cpack workflow for building installers ( #35 )
...
* cpack workflow for building installers
- ConfigCPack.cmake update
- STGZ and DEB + containers + test artifact
- DEBIAN_FRONTEND + set -v
- submodule fix
- actions checkout
- OMNITRACE_ROCM_VERSION + continue-on-error
- Change CPack generators + fix path to DEB
- separate configure, build, and package steps
- use cd instead of pushd
- FindROCmVersion + fix to cpack testing
- use ${ROCM_PATH}/.info/version for ROCm version info
- Tweaks for debian installer
- Packaging fixes
- Use CMAKE_SHARED_LIBRARY_SUFFIX instead of .so
- Split cpack.yml into 4 workflows
- Replace source with export in cpack
- Dyninst boost uses tar.gz instead of zip on Unix
* Fix to common join
* Update VERSION to 1.0.0
[ROCm/rocprofiler-systems commit: 5c4d5c394f ]
2022-03-27 22:52:36 -05:00
Jonathan R. Madsen
5f08854a3a
Relaxed module/function restrictions ( #33 )
...
* Relaxed module/function restrictions
* Updated tests
[ROCm/rocprofiler-systems commit: 4a18f55d34 ]
2022-03-23 00:28:25 -05:00
Jonathan R. Madsen
e8819abae1
Fixes for ROCM-SMI + MPI ( #34 )
...
[ROCm/rocprofiler-systems commit: f4e27d8aee ]
2022-03-23 00:28:13 -05:00
Jonathan R. Madsen
9206792846
User api updates ( #32 )
...
* Update invoke.hpp
* Update OMNITRACE_FUNCTION
* Update library debug messages
* ptl verbosity
* Update timemory submodule
* mpi_gotcha calls omnitrace_finalize_hidden
* omnitrace_{push,pop}_region returns error code
* omnitrace-user updates
- doxygen documentation
- omnitrace_get_user_callbacks
- omnitrace_user_error_string
- omnitrace-user functions return error codes
* Update user-api example
* Tweak to workflows and tests
* Fix for OMNITRACE_FUNCTION
- conditional impl if __GNUC__ < 9
* focal-external-rocm workflow update
[ROCm/rocprofiler-systems commit: f6241af5ee ]
2022-03-22 15:51:57 -05:00
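The error-code convention introduced in #32 (push/pop functions return a code, and a helper maps the code to a string) might look roughly like the sketch below. The functions here are self-contained stand-ins in a `sketch` namespace, and the error values and messages are assumptions, not the values defined by omnitrace-user.

```cpp
#include <string>
#include <vector>

namespace sketch
{
// Hypothetical error codes (the real omnitrace-user values may differ)
enum error_code : int
{
    success              = 0,
    error_invalid_name   = 1,
    error_unbalanced_pop = 2
};

// per-thread stack of currently open region names
inline std::vector<std::string>&
region_stack()
{
    static thread_local std::vector<std::string> _v{};
    return _v;
}

// opens a region; fails if no name is given
inline int
push_region(const char* name)
{
    if(name == nullptr) return error_invalid_name;
    region_stack().emplace_back(name);
    return success;
}

// closes a region; fails if it is not the most recently opened one
inline int
pop_region(const char* name)
{
    if(name == nullptr) return error_invalid_name;
    auto& _stack = region_stack();
    if(_stack.empty() || _stack.back() != name) return error_unbalanced_pop;
    _stack.pop_back();
    return success;
}

// maps an error code to a human-readable message
inline const char*
error_string(int ec)
{
    switch(ec)
    {
        case success: return "success";
        case error_invalid_name: return "invalid region name";
        case error_unbalanced_pop: return "pop does not match last push";
        default: return "unknown error";
    }
}
}  // namespace sketch
```

Returning a code rather than throwing keeps the interface usable from C and lets callers decide whether a mismatched pop is fatal.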
Jonathan R. Madsen
6b51dbccf8
Split workflows + docker usage ( #31 )
...
* Split workflows + docker usage
* Fix omnitrace-ci-ubuntu-focal-external
* fix env
* Update path to action
* fix entrypoint
* Updated cancelling, disabled formatting
* fix entrypoint
* rework
* try using container
* relocate container
* fix image name
* shell expand
* external and external-rocm
* install libopenmpi-dev
* remove github.workspace
* github.workspace for rocm
* Update bionic, etc. + docker CI
* Remove self-hosted + bionic fix
* GIT_DISCOVERY_ACROSS_FILESYSTEM for bionic
* TIMEMORY_INSTALL_LIBRARIES + exe RPATH updates
- fix RPATH for omnitrace, omnitrace-avail, and omnitrace-critical-trace
* ubuntu bionic update
* bionic and focal-dyninst-package updates
* Disable lulesh MPI by default + timeouts
- increase openmp CG timeout
- decrease openmp CG runtime
[ROCm/rocprofiler-systems commit: 138d16d16a ]
2022-03-22 12:30:07 -05:00
Jonathan R. Madsen
083035dd8b
User API + reorganized lib folders ( #30 )
...
* User API + reorganized lib folders
- omnitrace_user_start_trace
- omnitrace_user_stop_trace
- omnitrace_user_start_thread_trace
- omnitrace_user_stop_thread_trace
- omnitrace_user_push_region
- omnitrace_user_pop_region
* New OpenMP examples/tests
* Fix to KokkosP
* OMPT support
- fixed omnitrace instrumentation reporting
- common invoke improvements
- component::user_region
* exclude kmp_threadprivate_
* Separate omnitrace into multiple files
* PTL and timemory submodule updates
* Active guards + USE_OMPT guards in omnitrace-dl
* Tweak transpose default iterations
* omnitrace-precommit build target
* Omnitrace exe restructuring pt 2
- Never instrument functions with fewer than 4 instructions
- Never instrument ompt_start_tool or nanosleep
- module_function serializes heuristics
- removed hash stuff from omnitrace
- removed instr_procedures lambda
- WAITPID_DEBUG_MESSAGE
* set_state, "_hidden" fix, CI exceptions, backtrace fix
- set_state function
- fixed "_hidden" from appearing in print macros using __FUNCTION__
- OMNITRACE_CI_THROW
- more CI checks in library
- fixed backtrace init value sample issue being ignored
* Tweaks to OMPT tests
* cmake-formatting
* Removed debug output from backtrace processing
* Fix warnings and verbosity
* omnitrace-dl fix for libomp
* omnitrace-avail fixes
- remove second omnitrace_init_library call
- fix -r option not working
* Additional testing
- source/bin/tests
- tests for omnitrace-exe
- tests for omnitrace-avail
* cmake-format
* Reduce runtime of openmp-lu
* Update openmp-lu and tests timeout
* openmp-lu and CI tweaks
- decrease iterations
- OMP_NUM_THREADS=2
- install clang and libomp-dev in linux-ci
- fix data-files in linux-ci
[ROCm/rocprofiler-systems commit: d80752bc69 ]
2022-03-07 20:40:48 -06:00
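The two instrumentation rules listed under "Omnitrace exe restructuring pt 2" in #30 (skip functions with fewer than 4 instructions; never instrument `ompt_start_tool` or `nanosleep`) can be expressed as a simple predicate. The function name `should_instrument` and its signature are hypothetical; the actual omnitrace exe applies many more heuristics.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Returns true if a function passes the two rules from the commit message.
// `min_instructions` defaults to the threshold of 4 stated in the commit.
inline bool
should_instrument(const std::string& name, std::size_t num_instructions,
                  std::size_t min_instructions = 4)
{
    // functions that are never instrumented, regardless of size
    static const std::set<std::string> _excluded = { "ompt_start_tool", "nanosleep" };

    if(num_instructions < min_instructions) return false;
    return _excluded.count(name) == 0;
}
```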
Jonathan R. Madsen
a23bf28aaa
Fix compilation for ROCm 4.0 ( #29 )
...
[ROCm/rocprofiler-systems commit: 2acaa7aa9f ]
2022-03-07 13:16:41 -06:00
Jonathan R. Madsen
78ae7d1e37
Tweaks to docker scripts [skip ci] ( #28 )
...
[ROCm/rocprofiler-systems commit: 80e1a0d7e7 ]
2022-02-25 18:30:37 -06:00
Jonathan R. Madsen
1ad5529697
Created push/pop system for whether sampling is enabled ( #27 )
...
- also permitted turning off sampling in sampling mode
- also fixed ambiguous rocm_smi namespace issue in roctracer
[ROCm/rocprofiler-systems commit: 3151dd3aeb ]
2022-02-25 05:33:59 -06:00
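A push/pop system for an enabled flag, as described in #27, can be sketched with a small stack of saved states. The class name `sampling_state` and its interface are assumptions for illustration only.

```cpp
#include <vector>

// Tracks whether sampling is enabled and lets callers temporarily override it
class sampling_state
{
public:
    explicit sampling_state(bool initial)
    : m_enabled{ initial }
    {}

    bool enabled() const { return m_enabled; }

    // save the current value and switch to `value`
    void push(bool value)
    {
        m_saved.push_back(m_enabled);
        m_enabled = value;
    }

    // restore the value saved by the matching push; returns false if there is none
    bool pop()
    {
        if(m_saved.empty()) return false;
        m_enabled = m_saved.back();
        m_saved.pop_back();
        return true;
    }

private:
    bool              m_enabled;
    std::vector<bool> m_saved{};
};
```

Nested overrides unwind in reverse order, so an inner scope that disables sampling cannot accidentally re-enable it for an outer scope that had it off.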
Jonathan R. Madsen
2403bbde49
Stability improvements ( #26 )
...
* omnitrace verbprintf and errprintf
* avail categories fix
* omnitrace-dl namespace
* OMNITRACE_CI macro / OMNITRACE_BUILD_CI option
- always enables asserts
* Roctracer improvements
- Reworked roctracer significantly
- Added categories to settings
- create_cpu_cid_entry
- handle clock_skew in roctracer
- fixed roctracer activity names
- hip_api_callback is "host"
- perfetto::Flow for GPU
* timemory submodule update
* Tweak to redirect
* Improved recursive guards
- functors component
- created "_hidden" variants of instrumentation funcs
- omnitrace_* calls omnitrace_*_hidden
- omnitrace-dl calls non-hidden
- omnitrace-dl now strongly protects against recursion
- omnitrace-dl now is standalone w.r.t. headers
* Stability fixes
- OMNITRACE_DEBUG_PUSH env variable
- fix to HSA_TOOLS_LIB in dl.cpp
- Fixed SFINAE warning in mpi_gotcha
- Handle 64, _l, _r extensions in whole function names
* cmake formatting
* Fix for last commit + push/pop count info
- don't instrument rocr::core::Signal::WaitAny
- don't instrument rocr::core::Runtime::AsyncEventsLoop
- fixed main not being popped in runtime instrument
- updated interval data reserve
- copy hash-ids and aliases onto main thread
- warn about unclosed regions
- removed guards in libomnitrace
- added error checks for incorrect push_count vs. pop_count
- fixed missing pop_timemory in last commit
* Finalization methodology updates
- added some more rocr:: functions to whole function names
* Add event_base_loop to whole functions
* Update VERSION to 0.1.0
[ROCm/rocprofiler-systems commit: 0d5c557552 ]
2022-02-25 03:56:41 -06:00
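The recursion protection described in #26 (public functions forward to "_hidden" variants, with a guard that drops re-entrant calls) can be sketched as follows. All names live in a hypothetical `sketch` namespace, and the hidden function deliberately re-enters the public one to show the guard working.

```cpp
#include <cstdint>

namespace sketch
{
// RAII guard: takes ownership of the flag only if it was not already set
struct recursion_guard
{
    explicit recursion_guard(bool& flag)
    : m_flag{ flag }
    , m_owner{ !flag }
    {
        if(m_owner) m_flag = true;
    }
    ~recursion_guard()
    {
        if(m_owner) m_flag = false;
    }
    bool owns() const { return m_owner; }

private:
    bool& m_flag;
    bool  m_owner;
};

// number of calls that reached the hidden implementation on this thread
inline std::int64_t&
hidden_call_count()
{
    static thread_local std::int64_t _v = 0;
    return _v;
}

inline void push_region(const char* name);

// stand-in for a "_hidden" function; simulates re-entry into the public API
inline void
push_region_hidden(const char* name)
{
    ++hidden_call_count();
    if(name != nullptr) push_region(nullptr);
}

// public entry point: re-entrant calls on the same thread are dropped
inline void
push_region(const char* name)
{
    static thread_local bool _active = false;
    recursion_guard          _guard{ _active };
    if(!_guard.owns()) return;
    push_region_hidden(name);
}
}  // namespace sketch
```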
Jonathan R. Madsen
8b058902a2
omnitrace-dl-library ( #25 )
...
* timemory submodule update
* Visibility, setting categories, and task-group protection
- OMNITRACE_VISIBILITY instead of TIMEMORY_VISIBILITY
- increased task group data-race protection
- add omnitrace categories to settings
* set component_apis type-trait
* omnitrace-dl-library implementation
- this library dlopen + dlsym's libomnitrace
- significantly reduces the instrumentation time
* omnitrace-avail categories
- suppress AVAILABLE column when --available
* omnitrace-exe update
- uses omnitrace-dl
- adds --print-excluded option
- removes --jump option
- comments out --stubs option
- removes --stdlib option
- support for C++ STL functions not in libstdc++
- tweak the --print-* outputs
- significantly refactors instrument_module and instrument_entity
- removes unused c_stdlib_module_constraint
- removes unused c_stdlib_function_constraint
- decreases get_whole_function_names() coverage
* library.cpp updates
- OMNITRACE_DEBUG -> OMNITRACE_DEBUG_F
- omnitrace_finalize sets state earlier
- omnitrace_finalize clears push/pop functors
- increased tasking shutdown safety
* - fix critical-trace thread hierarchy
- signal handler calls omnitrace_finalize
- get_cpu_cid_stack supports parent tid
- interval data reserves
- omnitrace-avail serialization support for module_functions
- omnitrace --simulate option
- omnitrace --print-format option
- omnitrace --load-instr option
- omnitrace runtime-inst doesn't oneTimeCode
- updated regex
- expand get_whole_function_names()
- Test Install CI update
* fixes to last commit
- expand get_whole_function_names()
- ignore sig c modules
- kill process in signal handler
* Remove RTLD_DEEPBIND + more
- removed use of RTLD_DEEPBIND
- causes dyninst segfaults
- fixed signal handling
- updated timemory submodule
* Build/link static timemory libraries
* omnitrace --{module,function}-restrict option
- Added restrict regex options
- Reworked handling of regex options
- Reworked reporting of module/function skipping
- Handle -o w/o file specified
* timemory-avail
- category views
- backtrace::sample checks state
* get_debug_sampling()
[ROCm/rocprofiler-systems commit: 145a6ae06f ]
2022-02-23 06:59:32 -06:00
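The dlopen + dlsym approach mentioned for omnitrace-dl in #25 follows the standard POSIX pattern sketched below. The helper name `resolve_symbol` is hypothetical, and the flags shown are one reasonable choice, not necessarily those used by omnitrace-dl (the same commit notes that RTLD_DEEPBIND was removed).

```cpp
#include <dlfcn.h>
#include <string>

// Resolve `symbol` from the shared library `library` at runtime.
// Returns nullptr if either the library or the symbol cannot be found.
inline void*
resolve_symbol(const std::string& library, const std::string& symbol)
{
    // RTLD_LAZY defers binding; RTLD_LOCAL keeps the symbols private to this handle
    void* _handle = dlopen(library.c_str(), RTLD_LAZY | RTLD_LOCAL);
    if(_handle == nullptr) return nullptr;

    dlerror();  // clear any stale error state before the lookup
    // the handle is intentionally left open so the returned address stays valid
    return dlsym(_handle, symbol.c_str());
}
```

A thin library built this way only needs to resolve a handful of entry points, which keeps the surface that the instrumenter must load and analyze small.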
Jonathan R. Madsen
b99b153030
Critical trace updates ( #24 )
...
* Source code restructuring
* Critical trace updates following restructuring
* thread_sampler, timestamps
- thread_sampler
- CPU frequency managed via thread_sampler
- rocm-smi managed via thread_sampler
- Use consistent timestamps for perfetto
- removed hsa_timer_t in favor of wall_clock::record()
- disable KokkosP by default
- re-enable critical-trace testing
* cmake-format
* Fix for defines.hpp.in
* Remove OMNITRACE_ROCM_SMI_FREQ
- thread_sampler freq is set via OMNITRACE_SAMPLING_FREQ w/ max of 1000
* Increase CI Install Dyninst timeout
* Debug macros + omnitrace_init_tooling + config
- new debug macros
- extern "C" omnitrace_init_tooling
- guard get_rocm_smi_devices
* Miscellaneous tweaks
- tweak to transpose
- critical_trace::Device::ANY
- perfetto "critical-trace" category
- OMNITRACE_VERBOSE usage
* Disable key and tid data for HIP API calls
- non-kernels are ignored in activity callback
* critical-trace exe updates
- fix perfetto generation
- improved logging
- improved readability
* timemory submodule update
- lulesh example cmake tweaks
[ROCm/rocprofiler-systems commit: b016c8929f ]
2022-02-19 02:00:59 -06:00
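The cap on the sampling frequency noted in #24 (set via OMNITRACE_SAMPLING_FREQ with a maximum of 1000) amounts to reading an environment variable and clamping it. In this sketch the function name and the fallback default of 10.0 are assumptions; only the variable name and the maximum come from the commit message.

```cpp
#include <algorithm>
#include <cstdlib>

// Reads OMNITRACE_SAMPLING_FREQ and clamps it to `max_freq`.
// `default_freq` (an assumed value) is used when the variable is unset or invalid.
inline double
get_sampling_freq(double default_freq = 10.0, double max_freq = 1000.0)
{
    double _freq = default_freq;
    if(const char* _env = std::getenv("OMNITRACE_SAMPLING_FREQ"))
    {
        char*  _end    = nullptr;
        double _parsed = std::strtod(_env, &_end);
        // accept the value only if something was parsed and it is positive
        if(_end != _env && _parsed > 0.0) _freq = _parsed;
    }
    return std::min(_freq, max_freq);
}
```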
Jonathan R. Madsen
4ae26e2d08
rocm-smi and KokkosTools support ( #23 )
...
* renamed omnitrace_thread_data to thread_data
* initial implementation
* Numerous fixes and updates
- Updated timemory submodule
- Updated perfetto submodule (pulls in fixes for TRACE_EVENT)
- pthread_gotcha only after omnitrace_init_tooling
- omnitrace banner
- config settings for rocm-smi freq and devices
- critical_trace::get_entries
- OMNITRACE_BASIC_PRINT
- rocm_smi perfetto category
- redirect roctracer warnings for ROCm 4.5.0
- property specializations for rocm-smi components
- units fixes data_tracker types
- roctracer entries for pthread_create and start_thread
- omnitrace-avail defaults to settings, not components
- settings have conforming names
- settings warn about duplicates
- ptl named threads
- decreased max freq for sampler SIGALRM
- rocm-smi names thread
- rocm-smi avoids call to hipGetDeviceCount
- name roctracer activity callback threads
- fixed binary rewrite test output names
* Update lulesh example
- supports non-UVM GPU
* Lulesh tweaks + formatting
* KokkosP + Mode + Roctracer sampling deadlock fix
- kokkosp support
- omnitrace_init_library
- config::print_settings()
- config::get_mode()
- omnitrace::Mode
- omnitrace-avail improvements (removes settings)
- handle get_verbose() < 0
- disable dyninst InstrStackFrames by default
- handle perf_event_paranoid > 1 by disabling PAPI
- SIGALRM max freq to 5.0
- Name threads
- rocm-smi handles get_use_perfetto() and get_use_timemory()
- HSA_ENABLE_INTERRUPT=0 when roctracer + sampling (fixes deadlock)
* Tests, API renaming, roctracer
- disable renaming of thread 0
- verbprintf_bare
- enable dyninst merge tramp
- tweaked some omnitrace exe verbose levels
- reworked roctracer::setup and roctracer::shutdown
- rocm_smi::data::poll checks get_state()
- omnitrace_trace_finalize -> omnitrace_finalize
- omnitrace_trace_init -> omnitrace_init
- omnitrace_trace_set_env -> omnitrace_set_env
- omnitrace_trace_set_mpi -> omnitrace_set_mpi
- sampling mode does not disable timemory
- disable roctracer before shutting down rocm-smi
- lulesh tests w/ and w/o kokkosp
- lulesh tests for perfetto only
- with --dynamic-callsites --traps --allow-overlapping
- lulesh tests for timemory only
- with --stdlib --dynamic-callsites --traps --allow-overlapping
* Update timemory submodule
- fix for TIMEMORY_PROPERTY_SPECIALIZATION
* get_verbose() handling + timemory submodule update
- Findroctracer.cmake uses find_package(hsakmt)
* Stability fixes + rework roctracer + perfetto
- reworked roctracer start up
- critical_trace perfetto basic values
- perfetto sampling category
- sampler checks signals
- peak_rss in sampling
- pthread_gotcha::shutdown()
- rocm_smi::device_count()
- HSA_TOOLS_LIB is set
- HSA_ENABLE_INTERRUPT in omnitrace exe
- omnitrace exe verbosity level changes
- Avoid instrumenting Impl ns in Kokkos
- gpu::device_count prefers rocm_smi instead of hip
- ptl blocks signals
- fixed pthread_gotcha roctracer_data values
- removed runtime-instrument-sampling tests
- timemory submodule update
* cmake formatting
* timemory + roctracer updates
- fix timemory issue with papi_common
- fix timemory issue with units
- define roctracer::is_setup()
* Miscellaneous tweaks
- Disable sampling during runtime instrument
- Fixed warnings about dynamic callsites
- Fixed backtrace output when timemory disabled
- Test tweaks
* cmake-format
* omnitrace_target_compile_definitions
* timemory submodule update
* config, omnitrace, State, mpi_gotcha updates
- use OMNITRACE_THROW instead of direct throw
- is_attached()
- is_binary_rewrite()
- get_is_continuous_integration()
- get_debug_init()
- get_debug_finalize()
- max_thread_bookmarks default to 1
- State::Init
- app_thread oneTimeCode
- runtime instrumentation uses waitpid
- fixed init_names
- include main in MPI runs
- fixed sampling setup when disabled
- reworked mpi_gotcha
- disabled critical trace in transpose test
* cmake-format
* handle rocm_smi::device_count() exception
* CI timeouts
* Re-enable runtime-instrument + sampling
[ROCm/rocprofiler-systems commit: 39f17ae8b8 ]
2022-02-08 17:42:17 -06:00
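The "handle perf_event_paranoid > 1 by disabling PAPI" item in #23 implies a check of the Linux kernel setting before enabling hardware counters. A minimal sketch of such a check follows; the function names are hypothetical, and the fallback of 2 when the file cannot be read is an assumption.

```cpp
#include <fstream>
#include <string>

// Returns the integer stored in `path`, or `fallback` if it cannot be read
inline int
read_paranoid_level(const std::string& path = "/proc/sys/kernel/perf_event_paranoid",
                    int                fallback = 2)
{
    std::ifstream _ifs{ path };
    int           _value = fallback;
    if(_ifs && (_ifs >> _value)) return _value;
    return fallback;
}

// Per the commit message, levels above 1 lead to hardware counters being disabled
inline bool
hw_counters_permitted(int paranoid_level)
{
    return paranoid_level <= 1;
}
```

A caller would combine the two as `hw_counters_permitted(read_paranoid_level())` and skip hardware-counter setup when the result is false.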
Jonathan R. Madsen
b4a82711d1
Sampler improvements ( #22 )
...
* Sampler improvements
- roctracer_flush_activity
- papi_array in backtrace
- fixed sampler trait specializations
- split main_bundle into main and gotcha bundles
- cmake option display
* timemory update
* EINTR handling + debug_{pid,tid}
- sampler handles EINTR for sem_init and sem_destroy
- OMNITRACE_DEBUG_{TIDS,PIDS} env variables
* Increase waitForStatusChange
[ROCm/rocprofiler-systems commit: eccba14f00 ]
2022-01-27 21:31:08 -06:00
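The EINTR handling for sem_init and sem_destroy mentioned in #22 is typically a bounded retry loop. The sketch below shows that pattern on Linux; the helper names and the limit of 10 attempts are assumptions, not taken from the sampler source.

```cpp
#include <cerrno>
#include <semaphore.h>

// Calls `func` until it succeeds, fails with something other than EINTR,
// or `max_attempts` is reached. Returns the last return value of `func`.
template <typename FuncT>
int
retry_on_eintr(FuncT&& func, int max_attempts = 10)
{
    int _ret = -1;
    for(int i = 0; i < max_attempts; ++i)
    {
        errno = 0;
        _ret  = func();
        if(_ret == 0 || errno != EINTR) break;
    }
    return _ret;
}

// Hypothetical helpers wrapping the two semaphore calls named in the commit
inline int
init_semaphore(sem_t* sem, unsigned int value = 0)
{
    return retry_on_eintr([sem, value]() { return sem_init(sem, 0, value); });
}

inline int
destroy_semaphore(sem_t* sem)
{
    return retry_on_eintr([sem]() { return sem_destroy(sem); });
}
```

Bounding the number of attempts avoids spinning forever if signals arrive continuously, which matters in a sampler that is itself driven by timer signals.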