Convert a subset of the CTest tests to pytest for use in TheRock CI.
Create a new cmake flag `ROCPROFSYS_INSTALL_TESTING` to control test suite installation.
- pytest package will be installed to share/rocprofiler-systems/tests
- all compiled examples are put in share/rocprofiler-systems/examples
- all test relevant scripts are put in share/rocprofiler-systems/tests
- see README.md in share/rocprofiler-systems/tests
## Motivation
- __Reduced Code Duplication__: Version parsing logic moved from individual Dockerfiles to the central build script
- __Improved Edge Case Handling__: Better handling of ROCm versions with and without patch numbers (e.g., `6.2` vs `6.2.0`)
- __Easier Maintenance__: Future version-related changes only need to be made in one place
- __Cleaner Dockerfiles__: Simplified Dockerfiles focus on package installation rather than complex shell logic
- __Updated Platform Support__: Refreshed container matrix to reflect current platform/ROCm version combinations
- __Fix OpenSUSE Docker Generation__: OpenSUSE container generation fails due to a change to the `binutils-gold` package
- __Error Handling__: Fix bug where errors in docker image build were being masked, allowing workflow to pass anyway.
## Technical Details
- Updated `Dockerfile.opensuse` and `Dockerfile.opensuse.ci` docker files to remove `binutils-gold`
- Not needed since we build `binutils` with systems anyways
- Updated `rocprofiler-systems-containers.yml` to remove `pushd/popd` commands and just run the shell scripts
- There was a silent failure observed here, which I verified in this PR before adding the fix for openSUSE
- Refactor ROCm version parsing. Move this logic to the `build-docker.sh` script to reduce duplication.
- Fix bug that caused ROCm 7.0 to fail installation. The trailing `.0` was being trimmed.
- Fixed inconsistencies in `containers.yml` that led to invalid ROCm-OS_VERSION combinations.
- Formatting fixes
- Removed trailing whitespace
- Fix docker build warnings. Use `=` rather than a space when assigning an environment variable.
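
The version handling described in this PR (accepting both `6.2` and `6.2.0`, and keeping the trailing `.0` of `7.0`) can be illustrated with a minimal Python sketch. This is not the logic in `build-docker.sh`; the function names `split_rocm_version` and `major_minor`, and the choice to default a missing patch number to 0, are assumptions made for illustration.

```python
from typing import Tuple


def split_rocm_version(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR[.PATCH]' into integers (hypothetical helper).

    A missing patch component is assumed to mean 0, so '6.2' and '6.2.0'
    compare equal.
    """
    parts = version.strip().split(".")
    if not 2 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unsupported ROCm version string: {version!r}")
    major, minor = int(parts[0]), int(parts[1])
    patch = int(parts[2]) if len(parts) == 3 else 0
    return major, minor, patch


def major_minor(version: str) -> str:
    """Return 'MAJOR.MINOR' by parsing fields, not by trimming '.0' suffixes.

    Trimming a trailing '.0' textually would turn '7.0' into '7'; rebuilding
    the string from parsed fields avoids that class of bug.
    """
    major, minor, _ = split_rocm_version(version)
    return f"{major}.{minor}"
```

Parsing into fields first and formatting afterwards keeps the two-component and three-component spellings interchangeable in one place.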
ROCProfiler-Register/Systems/Compute: The license file was recently renamed from LICENSE to LICENSE.md, so the CMake install module and all other locations that reference it were updated to match.
- Update minimum_cmake_required to match version used in CI
- We should match the minimum version that we test against
- Ensure ".S" files are treated as assembly.
Update Dyninst submodule
Refactor the build scripts to build TBB, Boost, ElfUtils, and LibIberty, since the Dyninst build scripts no longer do so.
Workflows are now building Dyninst and its dependencies.
---------
Co-authored-by: marantic-amd <marantic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: 96df9b6d3e]
- Add cmake formatting rule for `rocprofiler-systems-custom-compilation`
- Resolve build failure observed with the `openmp-target` example when building in some environments
- Observed in our docker images
- Ensure `libomptarget-amdgpu-gfx*.bc` files are found
[ROCm/rocprofiler-systems commit: 96b31d92c3]
- Bringing in recent changes from rocm-6.4 branch (https://github.com/ROCm/rocprofiler-systems/pull/171)
- Add libdrm-devel to rhel and suse files
- Update ROCm installation method in Ubuntu file
- Add additional output to `test-release.sh` to catch failures due to a Python version not included
- Add Python 3.13 to Dockers
[ROCm/rocprofiler-systems commit: 83a9eb3d7c]
- Added a script that automatically merges multiprocess output into one file when multiple per-process proto files are written to the output folder
- Execute the merge multiprocess script from the rank 0 process
- Added the scripts folder path to env path, via setup-env.sh
- Installed merge_multiprocess_output.sh to /share/rocprofiler-systems/bin dir
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: 0263e951ff]
- Renames the CMake option "ROCPROFSYS_USE_HIP" to "ROCPROFSYS_USE_ROCM"
- Remove the "ROCPROFSYS_USE_ROCM_SMI" option. It is controlled with the "ROCPROFSYS_USE_ROCM" option instead.
- Runtime configuration can still toggle ROCPROFSYS_USE_ROCM_SMI to disable the sampling.
- Rename ROCPROFSYS_HIP_VERSION macro to ROCPROFSYS_ROCM_VERSION and remove blocks for `ROCPROFSYS_ROCM_VERSION < 60000`
- Remove ROCPROFSYS_USE_ROCTRACER and ROCPROFSYS_USE_ROCPROFILER
- Update test cases
- Update docker files and workflows to install cmake 3.21, which is required for the rocprofiler-sdk findPackage script.
- Removed rocm-6.2 from workflows due to a rocprofiler-sdk API change.
[ROCm/rocprofiler-systems commit: 88aa2d3cbe]
* Add installers for rocm-6.3 and rhel-9.5
* Updated the template "rocprof-sys-install.py.in".
Fixed the installer for the "rocm-x.y.z" style tags.
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: 7e2242414c]
- Porting from https://github.com/ROCm/omnitrace/pull/411
- Improve OMPT support
- Add OpenMP target example to testing
- Update Timemory submodule to use ROCm/Timemory rather than NERSC/Timemory
- Update `actions/upload-artifacts` to v4
- Standardize the `cmake_minimum_required` to 3.18.4 across workflows, project, and examples
- Updated Ubuntu 20.04 workflows
[ROCm/rocprofiler-systems commit: 7dce5926a7]
Updated OS test matrix to match ROCm 6.2.
Update build and CI docker files
Remove the "docs" workflow, because "read-the-docs" is now being used for ROCm documentation
[ROCm/rocprofiler-systems commit: b15c9e94fc]
The Omnitrace program is being renamed.
Full name: "ROCm Systems Profiler"
Package name: "rocprofiler-systems"
Binary / Library names: "rocprof-sys-*"
---------
Co-authored-by: Xuan Chen <xuchen@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: d07bf508a9]
* Update perfetto args.gn.in
- remove enable_perfetto_tools_trace_to_text (unused)
* core timeout implementation
- requires OMNITRACE_CI=ON
- requires OMNITRACE_CI_TIMEOUT=<sec>
- adds pthread_self and std::this_thread::get_id to thread info
- pthread_create_gotcha stores native handles (pthread_self)
* Testing updates
- improve detection of segfaults/failures when PASS_REGEX exists
- add OMNITRACE_CI_TIMEOUT env variable to all tests
* Line-info in releases
- e.g. -g1 + more options to minimize size of debug info
* Fix typo in config exit action message
* OMNITRACE_UNLIKELY around debug/verbose messages
* format fixes
* Overflow tests + capability check
* transpose example update
- link to threads library
* roctracer/rocprofiler update
- in ROCm 5.5.0, cannot include rocprofiler.h and roctracer.h in same file due to conflicting enum defs
- Moved HSA tracing setup/shutdown to component::roctracer
* roctracer update
- fix definition of roctracer::setup when disabled
* Update fork example
- detach threads on main PID
- flush io outputs when printing info
* Update overflow tests
- pass regular expressions
- overflow on PERF_COUNT_SW_CPU_CLOCK event
* fork gotcha update
- use getpid() instead of getppid()
* update fork example
- wait on threads calling fork
* timeout update
- wait on timeout thread to launch before proceeding
[ROCm/rocprofiler-systems commit: 3e2fa69a14]
* omnitrace-run exe
- ensure LD_PRELOAD for libomnitrace-dl.so
- convert config options into command-line options
* Update timemory submodule
- updates to tsettings
- updates to argparser
* common environment update
- throw error if get_env<bool> has empty string
* config updates
- minor tweaks to categories of settings
* core lib update
- add argparse for common handling of argument parsers
* omnitrace-sample update
- fix handling of --trace-file (OMNITRACE_PERFETTO_FILE)
* omnitrace-run update
- updated to use omnitrace::argparse functions
* Tests for omnitrace-run
* argparse core update
- remove choices for --cpu-events and --gpu-events
* remove some debugging prints
* fix timemory include in argparse.cpp
* always provide --hsa-interrupt option
* Update source/lib/core/argparse.cpp
- fix pedantic warning
* Update testing
- remove testing args that may not be there in some builds
* roctracer/pthread_create fix
- disable roctracer_data when roctracer not enabled
* omnitrace-causal tweak
* omnitrace-instrument: module_function tweak
- allow DEFAULT_MODULE and LIBRARY_MODULE
* common environment update
- support get_env for enums
* core: config update
- Add "mode" category to OMNITRACE_MODE
* Update timemory submodule
- remove debug print statement
* omnitrace-sample tweak
- change var init
* omnitrace-run testing update
- use --help instead of -?
* core: common.hpp
- tweak header include style
* core: argparser update
- add_ld_preload func
- launcher and command member variables in parser_data
- support launcher
* omnitrace-run update
- clean up and reworked
* libomnitrace-dl updates
- require LD_PRELOAD with binary rewrite
- dl::InstrumentMode
- dl::get_instrumented()
- verify_instrumented_preloaded()
- omnitrace_set_instrumented(int)
- relocated omnitrace_main from main.c to dl.cpp
- omnitrace_set_env does not dlopen libomnitrace
- omnitrace_set_main(func_ptr) [internal API]
- OMNITRACE_HIDDEN_API -> OMNITRACE_INTERNAL_API
* Update testing to new LD_PRELOAD requirements
* omnitrace-instrument updates
- adhere to LD_PRELOAD requirements
- invoke omnitrace_set_instrumented
- binary rewrite does not instrument main
- binary rewrite does not instrument call to omnitrace_init
- runtime instr does not instrument main
- runtime instr does not instrument call to omnitrace_init
* Bump to v1.9.0
- LD_PRELOAD requirement necessitates minor version increment
* common: environment
- fix ambiguous get_env calls
* omnitrace-instrument update
- fix issue with temporaries
* omnitrace-instrument and libomnitrace-dl updates
- runtime instrumentation does not work if libomnitrace-dl is preloaded
* libomnitrace-dl and libpyomnitrace updates
- define dl::InstrumentMode in dl.hpp
- handle instrumentation via setprofile libpyomnitrace
- do not push trace in omnitrace_init
* omnitrace-instrument and libomnitrace-dl updates
- move header to dl subdirectory
- omnitrace::omnitrace-headers include omnitrace-dl folder
- use InstrumentMode in omnitrace-instrument
* Update workflows and scripts
- Use omnitrace-run on instrumented exes
* Update docs
- add omnitrace-run to examples of running binary rewritten exes
[ROCm/rocprofiler-systems commit: abe35de43a]
* omnitrace-exe -> omnitrace-instrument
- Renamed omnitrace executable to omnitrace-instrument
- Provided dummy omnitrace exe which forwards onto omnitrace-instrument
- updated all docs to reflect the name change of the executable
- however, it is possible some were missed
* Update dyninst submodule
- correctly handle BOOST_LINK_STATIC in DyninstBoost.cmake
* Disable IPO for omnitrace-instrument
[ROCm/rocprofiler-systems commit: ab0e5d9b44]
* Fixes for Python 3.11
* Add python 3.11 to scripts
- also tweak the `to{upper,lower}` bash functions
* Fix PAPI RPM packaging in RedHat
- fix error from #!/usr/bin/python in papi_hl_output_writer.py
- requires either python2 or python3 instead of python
* cpack updates
- only generate STGZ for RedHat
- support `--generators` arg in build-release.sh
- support 7z, zip, and other zip generators
- fix build-release.sh with `--mpi`
- support setting CONDA_ROOT
* Support rhel/fedora/centos in omnitrace-install.py
* RedHat status badge
* Fix support for Python 3.11 + tweak ubuntu ci
- Remove installing clang and mpich in Ubuntu CI container
- Fallback on conda-forge for Python 3.11
- Enable entrypoint-rhel.sh for RHEL CI
- Pull latest container by default
* Update ElfUtils and PAPI builds
- quieter build output
- disable-nls for ElfUtils
- use -s flag for make
* Development Guide Docs
[ROCm/rocprofiler-systems commit: 83f9ed8696]
- additional miscellaneous tweaks to workflows and docker scripts, e.g. install perfetto python bindings
- improves the stability of MPI finalization
- reduces some debug messages within timemory when `OMNITRACE_DEBUG=ON`
- fixes issue found in RHEL where libunwind is using mutex and omnitrace was not treating this as an internal mutex call
- this may have been affecting the causal profiling slightly (tests seem a bit more stable now)
- fix data race in timemory
* Add RedHat CI and release packaging
- additional miscellaneous tweaks to workflows and docker scripts, e.g. install perfetto python bindings
* Fix URL for ROCm packages in redhat workflow
* Fix dnf --enable-repo for ROCm perl packages
* Dockerfile.rhel and redhat.yml updates
- Fix dnf repo for ROCm PERL packages
- Disable python in CI (interpreter segfaults)
- Exclude parallel-overhead-locks tests due to inclusion of internal locks
- This needs to be remedied in the future
* Exclude _dl_relocate_static_pie from instrumentation
* Testing updates
- OMNITRACE_SAMPLING_KEEP_INTERNAL=OFF for parallel-overhead-locks
* Fix redhat workflow
* redhat.yml update
- remove if condition on config/build/test step
* Update timemory submodule
- tweaks to verbosity messages
* Set thread state before unw_step
- on Redhat, unw_step calls mutex
* Update timemory submodule
- verbosity changes
- gotcha uses spin_lock/spin_mutex
* Remove using gsplit-dwarf unless OMNITRACE_BUILD_NUMBER > 2
* Re-enable parallel-overhead-locks tests in redhat workflow
* Always disable timemory manager metadata auto output
* testing updates
- tweak parallel-overhead-locks-timemory to higher instruction count min
- OMNITRACE_SAMPLING_KEEP_INTERNAL=OFF for parallel-overhead-locks-perfetto
* Update timemory submodule
- quiet realpath queries
* omnitrace exe updates
- detect text files
- improved bin/lib locating
* cmake format
* test-install.sh and redhat workflow updates
- handle testing when ls is script
- re-enable python testing on redhat workflow
- invoke test-install.sh in redhat workflow
* Misc guards for finalization
* omnitrace-exe, testing updates
- test-install.sh: LS_EXEC -> LS_NAME
- handle /usr/bin/ls being script in source/bin/tests
- improve locating the binary
* Fix mpi_gotcha compile error
* omnitrace-exe updates
- improve file locating
* formatting
* Misc fixes
- remove -static-libstdc++ for RHEL packaging (rocky-linux doesn't distribute static lib)
* omnitrace-exe paths
* Replace realpath with absolute
- using absolute path to symlink fixes issues with locating libdyninstAPI_RT at runtime
* omnitrace exe updates
- judicious use of realpath
* Update timemory submodule
- fix update main hash ids/aliases data race in merge
* bin tests update
- change working directory of omnitrace-exe-simulate-lib-basename
* omnitrace exe updates
- Update resolved exe/lib messaging
* bin tests update
- change working directory of omnitrace-exe-simulate-lib-basename
[ROCm/rocprofiler-systems commit: 1688a027d8]
* Address and thread sanitizer fixes
- Fix compilation with clang
- Tweak perfetto copy to build tree
- Added suppression files to scripts
- fix LD_PRELOAD support in omnitrace-causal and omnitrace-sample
- use spin_mutex and spin_lock from timemory instead of atomic_mutex and atomic_lock
- state uses atomic
- fix some memory leaks
- tweak testing
- mpi tests do not use preload
- increase timeout when using sanitizers
- add env LD_PRELOAD when using sanitizers
* Tweak perfetto build
* Update timemory submodule
* Update version to 1.8.1
* Update omnitrace-leak.supp
* Update timemory submodule
- fixed spin_mutex implementation
* Remove previously added addr_space->allowTraps(instr_traps)
- this appears to cause errors during binary rewrite
* causal testing updates
- relaxed causal validation on CI systems (to account for hyperthreading decreasing prediction)
- improved impact calculation
- other general improvements to validate-causal-json.py
* Improve fork handling for perfetto
- numerous updates changing perfetto:: to ::perfetto::
- added perfetto_fwd.hpp
* Updated fork example
- user API for validation that stopping/starting perfetto is valid
* Misc fixes to perfetto + fork support
- tweak regions in fork example
- handle disabling tmp files
- get rid of stop/start with perfetto before/after fork
- fixed sampling support during fork
- tweak env of fork test
* Fix find_package in build-tree
* Fix buildtree export
* Fix buildtree export
* Restructured ConfigInstall before adding examples
* Guard against creating tmp file in sampling when disabled
* Fix buildtree package
* formatting
* exit handlers on child processes
- quick exit to avoid perfetto cleanup
* Further tweaking of causal tests for reliability
- enable PROCESSOR_AFFINITY
- decrease to 5 iterations
* Further tweaking of causal tests for reliability
- disable PROCESSOR_AFFINITY for fast func e2e tests
- enabling affinity results in (valid) speedup predictions greater than zero
* Fixes to fork handling
- use pthread_atfork for redundancy if fork_gotcha fails
* cmake formatting
* Fix fork init settings + install components
- remove dl from PROJECT_BUILD_TARGETS
* Testing tweaks
- fix mpi-binary-rewrite-run regex when OMNITRACE_VERBOSE set > 1 in env
- increase causal e2e iterations to 8
* Fix "Test User API"
- test-find-package.sh included dl component
* Further tweaks to causal validation
- further considerations of variance
[ROCm/rocprofiler-systems commit: 846301bcaf]
* Always print PID in log messages
* omnitrace-dl updates
- omnitrace_preload does not call omnitrace_init or omnitrace_init_tooling
- omnitrace_preload calls omnitrace_set_mpi only if OMNITRACE_USE_MPI
or OMNITRACE_USE_MPIP is true in the env. It is not called otherwise,
because doing so either overrides OMNITRACE_USE_PID (when true) or
disables mpip initialization (when false); the MPI
init can be caught later and override OMNITRACE_USE_PID
* config updates
- set_setting_value sets user update type
- remove volatile from get_settings_configured
- don't override settings::default_process_suffix
- don't kill process in omnitrace_exit_action
- set_state ignores updating state if >= State::Finalized
* Handle state > State::Finalized
* fork gotcha updates
- unsets LD_PRELOAD
- sets OMNITRACE_ROOT_PROCESS
- sets OMNITRACE_CHILD_PROCESS
* libomnitrace library.cpp updates
- basic_bundle for fini metrics
- handle finalization from child process
* sampling updates
- sampling::shutdown handles when child process
* Add example and test using fork
* Update run-ci script to support not submitting
* Tweak test envs
* Update build flags when codecov enabled
* remove unnecessary includes of sampling header
* Replace mpi copy/fini static lambda with free-funcs
* Update codecov job
* Fix OMPT segfaults after finalization
* Miscellaneous updates after rebase
* fixes for causal profiling
* revert some run-ci.sh changes
* Disable storing env in sampling::shutdown
* formatting fix
* Update timemory submodule
- fixed occasional synchronization issues with allocator offloading
- exclude protozero:: from internal samples
* improve root/child process detection
- avoid omnitrace_finalize in MPI when child process
- revert some testing tweaks
[ROCm/rocprofiler-systems commit: 32b15fe7b7]
- The primary feature of this PR is the **addition of support for scoping the collection of tracing/profiling data into one or more time-based windows**
- Closes #222
- Closes #207
- Support for a real-clock time delay and/or a duration for tracing/profiling was added, *resembling the support for this feature during sampling and process-sampling*
- However, the above paradigm was enhanced for tracing
- Instead of one delay and/or one duration based on real time, ***tracing supports periodic and varying delays and durations and these delay+duration sets can be controlled with different clocks***
- At some point, this capability will be extended to sampling and process-sampling
- A secondary feature of this PR is the improved handling of categories (a by-product of the primary feature)
- For example, setting `OMNITRACE_ENABLE_CATEGORIES` to a specific set of categories previously only eliminated the disabled categories from the perfetto trace; now the selection is applied to timemory profiles too
- A new configuration variable `OMNITRACE_DISABLE_CATEGORIES` was added for when disabling only a handful of categories is easier
- There are quite a few miscellaneous modifications which pollute this PR a bit
## Multiple Tracing Windows
As noted above, tracing now supports specifying multiple delays and durations _and_ with different clocks. Consider the configuration below with two entries in the format `<DELAY>:<DURATION>:<REPEAT>:<CLOCK_TYPE>`:
```console
OMNITRACE_TRACE_PERIODS = 0.5:1.0:2:realtime 10.0:5.0:3:cputime
```
The above configuration defines:
1. `0.5:1.0:2:realtime`
- A delay of 0.5 seconds (real-time)
- Followed by a data collection duration of 1 second (real-time)
- This delay + duration is repeated 2x
- Summary: tracing data is collected for 2 out of the first 3 seconds of the application's execution
2. `10.0:5.0:3:cputime`
- A delay of 10 seconds (process _CPU-time_)
- Followed by a data collection duration of 5 seconds (process _CPU-time_)
- This delay + duration is repeated 3x
- Summary: tracing data is collected for a total of 15 seconds of process CPU-time in the ensuing 45 seconds of CPU-time during the application execution.
- Note: the elapsed CPU-time is the aggregate of the CPU-time consumed by all the threads in the process and should be scaled accordingly, e.g., 4 threads running constantly for 1 second of real-time is ~4 seconds of CPU time.
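
The arithmetic behind these summaries can be checked with a short sketch. It is an illustration only, not omnitrace's parser: the `TracePeriod` class and `parse_trace_periods` function are hypothetical names, entries are assumed to be whitespace-separated as in the example above, and only the full four-field form `<DELAY>:<DURATION>:<REPEAT>:<CLOCK_TYPE>` is handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TracePeriod:
    delay: float     # time to wait before each collection window
    duration: float  # length of each collection window
    repeat: int      # number of delay + duration cycles
    clock: str       # clock label, e.g. "realtime" or "cputime"

    @property
    def collected(self) -> float:
        """Total time during which data is collected."""
        return self.duration * self.repeat

    @property
    def span(self) -> float:
        """Total time covered by all delay + duration cycles."""
        return (self.delay + self.duration) * self.repeat


def parse_trace_periods(value: str) -> List[TracePeriod]:
    """Parse whitespace-separated DELAY:DURATION:REPEAT:CLOCK entries."""
    periods = []
    for entry in value.split():
        fields = entry.split(":")
        if len(fields) != 4:
            raise ValueError(f"expected 4 colon-separated fields in {entry!r}")
        delay, duration, repeat, clock = fields
        periods.append(
            TracePeriod(float(delay), float(duration), int(repeat), clock)
        )
    return periods


if __name__ == "__main__":
    for p in parse_trace_periods("0.5:1.0:2:realtime 10.0:5.0:3:cputime"):
        print(f"{p.clock}: collects {p.collected} s within a {p.span} s span")
```

Each entry's `collected` and `span` values are measured on that entry's own clock, so a `cputime` span does not translate directly into wall-clock seconds.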
## `omnitrace-sample` Changes
Formerly, the `--wait` and `--duration` command-line options applied only to the sampling delay and duration. The values of these options are now also applied to the tracing delay and duration. To retain the ability to control the sampling delay/duration without setting the tracing delay/duration, or vice versa, the `--sampling-wait`, `--sampling-duration`, `--trace-wait`, and `--trace-duration` options were added. `omnitrace-sample` also has new options for most of the new configuration options detailed below.
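
One way to picture how the general and specific options relate is the sketch below. It only reuses the option names from the paragraph above; it is not `omnitrace-sample`'s argument parser. The precedence rule (a specific option overrides the general one when both are given) and the use of floating-point seconds are assumptions, and `build_parser` and `resolve` are hypothetical names.

```python
import argparse
from typing import Dict, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser with the six delay/duration options."""
    parser = argparse.ArgumentParser(prog="sample-options-sketch")
    for name in ("wait", "duration"):
        parser.add_argument(f"--{name}", type=float, default=None)
        parser.add_argument(f"--sampling-{name}", type=float, default=None)
        parser.add_argument(f"--trace-{name}", type=float, default=None)
    return parser


def resolve(argv: Sequence[str]) -> Dict[str, Optional[float]]:
    """Resolve effective sampling/tracing values from the command line."""
    args = build_parser().parse_args(argv)

    def pick(specific: Optional[float], general: Optional[float]) -> Optional[float]:
        # Assumed precedence: the specific option wins over the general one.
        return specific if specific is not None else general

    return {
        "sampling_wait": pick(args.sampling_wait, args.wait),
        "sampling_duration": pick(args.sampling_duration, args.duration),
        "trace_wait": pick(args.trace_wait, args.wait),
        "trace_duration": pick(args.trace_duration, args.duration),
    }
```

Under these assumptions, `resolve(["--wait", "2", "--trace-wait", "5"])` gives sampling a 2-second delay and tracing a 5-second delay, while both durations stay unset.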
## New configuration options
| Option | Description |
| ------- | ----------- |
| `OMNITRACE_DISABLE_CATEGORIES` | Inverse of `OMNITRACE_ENABLE_CATEGORIES`: populates the list of all available categories and then removes the specified ones. |
| `OMNITRACE_TRACE_DELAY` | Single floating-point number specifying time to wait before starting data collection. Analogous to `OMNITRACE_SAMPLING_DELAY` and `OMNITRACE_PROCESS_SAMPLING_DELAY` |
| `OMNITRACE_TRACE_DURATION` | Single floating-point number specifying data collection duration. Analogous to `OMNITRACE_SAMPLING_DURATION` and `OMNITRACE_PROCESS_SAMPLING_DURATION` |
| `OMNITRACE_TRACE_PERIOD_CLOCK_ID` | Sets the default clock type for tracing delay/duration. Always applied to the two options above; can be overridden per entry in `OMNITRACE_TRACE_PERIODS`. Accepts `CLOCK_REALTIME`, `CLOCK_MONOTONIC`, `CLOCK_PROCESS_CPUTIME_ID`, `CLOCK_MONOTONIC_RAW`, `CLOCK_REALTIME_COARSE`, `CLOCK_MONOTONIC_COARSE`, `CLOCK_BOOTTIME`. See `man 2 clock_gettime` for details on the differences. |
| `OMNITRACE_TRACE_PERIODS` | More powerful version for specifying delay + duration. Supports formats: `<DELAY>`, `<DELAY>:<DURATION>`, `<DELAY>:<DURATION>:<REPEAT>`, and `<DELAY>:<DURATION>:<REPEAT>:<CLOCK_ID>`. |
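
The relationship between `OMNITRACE_ENABLE_CATEGORIES` and `OMNITRACE_DISABLE_CATEGORIES` described in the table can be sketched as plain set operations. This is an illustration, not the implementation in `config.cpp`: `resolve_categories` is a hypothetical name, and because the description above does not say how the two options interact when both are set, the sketch rejects that combination rather than guess.

```python
from typing import Iterable, Optional, Set


def resolve_categories(
    available: Iterable[str],
    enable: Optional[Iterable[str]] = None,
    disable: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Return the set of categories that remain active."""
    all_categories = set(available)
    if enable is not None and disable is not None:
        # Interaction of the two options is not specified in the source;
        # this sketch refuses the combination instead of assuming a rule.
        raise ValueError("specify either enable or disable, not both")
    if enable is not None:
        # Enable-list: keep only the requested categories that exist.
        return all_categories & set(enable)
    if disable is not None:
        # Disable-list: start from everything, then remove the requested ones.
        return all_categories - set(disable)
    return all_categories
```

The disable form is convenient when only a few of many categories should be dropped, since the caller does not have to enumerate everything that stays enabled.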
## Miscellaneous Changes
- Expanded `critical_trace_categories_t` to include tracing data from MPI, pthread, HIP, HSA, RCCL, NUMA, and Python.
- Added categories `thread_wall_time` and `thread_cpu_time` (derived from sampling)
- Read DWARF info for breakpoints
- Relocated some source code
- Reason: necessary to make `libomnitrace` a bit more modular. Eventually, a large chunk will be separated into `libomnitrace-core`, `libomnitrace-binary`, etc. in order to facilitate re-usability
- Relocated some functionality from `runtime.cpp` to `config.cpp`
- Relocated code using rocm-smi library to query number of devices to `gpu.cpp` (where the code for using HIP to query number of devices is)
- Relocated code for perfetto config and perfetto session out of tracing namespace to reside with other perfetto code
- `OMNITRACE_COLORIZED_LOG` configuration option renamed to `OMNITRACE_MONOCHROME`
- Backwards compatibility via a deprecated option was not retained here since the logic changed (i.e. true in former means false in latter)
- Replaced `TIMEMORY_DEFAULT_OBJECT` macro with `OMNITRACE_DEFAULT_OBJECT` macro
- Updated some code in roctracer to use `component::category_region` instead of explicitly using `tracing::` functions
- Updated `backtrace_metrics` to better support controlling their presence in the traces/profiles via categories
- Added support for `--print` in `validate-timemory-json.py`
- Generic `OMNITRACE_ADD_VALIDATION_TEST` CMake function
## Git Log
* OMNITRACE_DEFAULT_OBJECT
- replace TIMEMORY_DEFAULT_OBJECT with OMNITRACE_DEFAULT_OBJECT
* trace-time-window example + tests
- adds cmake OMNITRACE_ADD_VALIDATION_TEST function for testing
- validate-timemory-json.py now supports printing (-p)
- update to OMNITRACE_STRIP_TARGET
* Update timemory submodule
- detailed backtrace print /proc/<PID>/maps
- operation::push_node verbosity change
- storage::insert_hierarchy use emplace + at instead of operator[]
- concepts::is_type_listing
- argparse updates for start/end group
- argparse color fixes
* perfetto updates
- Remove OMNITRACE_CUSTOM_DATA_SOURCE CMake option
- move tracing::get_perfetto_config and tracing::get_perfetto_session to perfetto.cpp
* config and runtime updates
- OMNITRACE_DISABLE_CATEGORIES option
- get_enabled_categories() + get_disabled_categories()
- config impl handles populating them
- OMNITRACE_TRACE_DELAY option
- OMNITRACE_TRACE_DURATION option
- OMNITRACE_TRACE_PERIODS option
- {get,set}_signal_handler
- removes config.cpp link dependency for omnitrace_finalize
- get_realtime_signal() + get_cputime_signal() + get_sampling_signals()
- moved from runtime.cpp to config.cpp
* utility::convert
- helper function for converting string to a type
* pthread_create_gotcha + thread_info updates
- thread_index_data::as_string()
- tweak printing info about new thread / exited thread
* binary updates
- get_binary_info has arg to disable dwarf parsing
- binary_info contains vector of breakpoint addresses
- binary_info::filename() function
- binary::get_linked_path
- binary::get_link_map has args for dlopen mode
- symbol::read_dwarf -> symbol::read_dwarf_entries
- symbol::read_dwarf_breakpoints
* library updates + categories impl
- implement config::set_signal_handler
- categories.cpp for handling trace delays
- implement trace delay/duration/periods
* concepts + debug + defines
- tuple_element in concepts
- removed runtime header from debug header
- OMNITRACE_DEFAULT_COPY_MOVE
* gpu + rocm_smi
- moved rsmi_num_monitor_devices call to gpu.cpp
- gpu::rsmi_device_count()
* roctracer updates
- roctracer_bundle_t -> roctracer_hip_bundle_t
- use category_region instead of explicit tracing push/pop calls
* sampling + backtrace_metrics
- rework backtrace_metrics to support categories
* tracing updates
- category stack counters (i.e. push vs. pop counter) for profiling and tracing
- push_timemory and pop_timemory accept string_view instead of const char*
- tweaked the pop_timemory hash search
- {push,pop}_perfetto theoretically supports same invocations as for {push,pop}_perfetto_ts and {push,pop}_perfetto_track
- mark_perfetto, mark_perfetto_ts, mark_perfetto_track
* category_region update
- expanded the critical trace categories
- use category_push_disabled
- use category_pop_disabled
- use category_mark_disabled
* constraint implementation
- This provides generic functionality for constraining data collection within a window of time.
- E.g., delay, delay + duration, (delay + duration) * nrepeat
* COLORIZED_LOG -> MONOCHROME
* constraint + omnitrace-causal + omnitrace-sample updates
- support for using different clock IDs for constraints
- OMNITRACE_TRACE_PERIOD_CLOCK_ID option
- tweak to trace-time-window example
- tweak to trace-time-window tests
* Fix formatting
* Update time-window tests
- Fix detection of validation support for perfetto
- Using the --caller-include feature + runtime instrumentation on Ubuntu 18.04 and OpenSUSE 15.2 results in a segfault in the internals of Dyninst.
- For now, mark that these tests will fail
- Later, determine if updating Dyninst submodule fixes this problem
* Fix OMNITRACE_OUTPUT_PATH for all tests
- Provide absolute path instead of relative
* Tweak lambda for checking whether HW counters are enabled
- causing strange build errors on older GCC compilers
* Update dyninst submodule
- fix issues with using --caller-include for Ubuntu 18.04, OpenSUSE 15.x
* cmake formatting
* fix sampling compiler issue for GCC 8
* Tweak thread create message
* Increase causal validation iterations
[ROCm/rocprofiler-systems commit: 8feb6bf8b6]
- omnitrace-install.py will be uploaded as a release asset
- script simplifies selecting the correct installer script
[ROCm/rocprofiler-systems commit: 1f818054ce]
* roctracer: use multiple tracks for HIP streams
Use different perfetto tracks for each stream, and set the name of
these tracks to the stream pointer values. Setting the name like this
matches the args in the API traces.
This fixes overlapping work on multiple streams appearing as a call
stack.
* Fix -pedantic
* Run clang-format
* Add option to disable per stream tracks in perfetto
* Updated scheme for roctracer activity + general roctracer fixes
- Per-device tracks
- Handle HSA OPS in ROCm 5.3
- the changes in ROCm 5.3 were causing HSA ops to get discarded
- Default for OMNITRACE_ROCTRACER_DISCARD_INVALID is now zero
- i.e. default behavior is to flip beg_ns and end_ns when beg_ns > end_ns
* Flush perfetto at end of hip_activity_callback
- fixes unterminated regions
* GitHub Actions and run-ci script updates
- improve reliability
* Set OMNITRACE_TMPDIR in testing
- files in /tmp get occasionally deleted during CI
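
The `OMNITRACE_ROCTRACER_DISCARD_INVALID` behavior noted in this log (flip `beg_ns` and `end_ns` when `beg_ns > end_ns`, unless discarding is enabled) amounts to the following minimal sketch. The function name `normalize_interval` and its return convention are hypothetical; the actual roctracer callback code is not shown in this log.

```python
from typing import Optional, Tuple


def normalize_interval(
    beg_ns: int, end_ns: int, discard_invalid: bool = False
) -> Optional[Tuple[int, int]]:
    """Return an ordered (begin, end) pair, or None if the record is dropped.

    With discard_invalid=False (the new default described above), an
    inverted record is kept and its timestamps are swapped.
    """
    if beg_ns > end_ns:
        if discard_invalid:
            return None
        beg_ns, end_ns = end_ns, beg_ns
    return beg_ns, end_ns
```

Swapping instead of discarding keeps the activity visible in the trace, which matters when an upstream change causes many records to arrive with inverted timestamps.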
Co-authored-by: Gergely Meszaros <gergely@streamhpc.com>
[ROCm/rocprofiler-systems commit: 589a729702]
* CDash name prefix {{ repo_owner }}-{{ ref_name }}
- remove /merge from CI name
* disable using BFD when sampling_include_inlines is OFF
- this consumes a lot of memory
* Improve finalization of rocprofiler
* update timemory submodule
- disable OMPT thread begin/end callbacks
- support hierarchies in signal handlers
- update operation::pop_node debugging
- settings_update_type + setting_supported_data_types
- fixed parsing args in timemory_init
* Improve timemory build time
* Remove kokkosp restrictions for perfetto
* omnitrace exe signal handler update
- configure signal handlers before main to allow libomnitrace to override
* Backtrace and timemory submodule updates
- Use unwind::cache w/o inline info
- update timemory submodule
- unwind::cache updates
- filepath updates
- fix termination_signal_message
- fix vsettings::report_change
* Update dyninst submodule
- updates BinaryEdit::getResolvedLibraryPath
* update timemory submodule
- update CpuArch support
* Cleanup configure warnings
* Update examples cmake and workflows
- (Mostly) eliminate configuration warnings
* omnitrace exe updates
- pass environ to BPatch::processCreate
- avoid trailing ":" in DYNINST_REWRITER_PATHS
* Update dyninst submodule
- Add flags to DyninstOptimization.cmake
- Remove strtok from BinaryEdit::getResolvedLibraryPath
* examples/mpi CMakeLists.txt update
- STATUS message about missing MPI during CI, otherwise AUTHOR_WARNING
* Dev build and linker flags
- use -gsplit-dwarf when OMNITRACE_BUILD_DEVELOPER is ON
- disable when OMNITRACE_BUILD_NUMBER > 1
- OMNITRACE_BUILD_LINKER option
- add -fuse-ld=${OMNITRACE_BUILD_LINKER}
- omnitrace_add_cache_option function
* Update workflows to set OMNITRACE_BUILD_NUMBER
* Fix generator expressions for -fuse-ld
* Suppress some configuration warnings during CI
- helps to keep track of real warnings when they arise
* Update timemory and dyninst submodules with CMP0135
* Add -V flag to run-ci script
[ROCm/rocprofiler-systems commit: f147670a7a]
* Submitting jobs to cdash
* Fail on submit
* submit url env
* submit url env
* try passing submit url as arg
* fix submit url
* Updated default URL
* Add submissions for remaining ubuntu focal workflow jobs
* Replace g++ with gcc in dashboard build name
* Add --ctest-args to run-ci.sh
* Add cdash support for bionic, jammy, and opensuse workflows
* Decrease CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE
* OMNITRACE_BUILD_CODECOV option
* Support code coverage in CDash script
* CI dyninst built with debug info
* Update ci-containers
- cron schedule moved 4 hours later to UTC+5
* Update implementation of config::configure_signal_handler
- using lambdas failed to compile with codecov flags
* Add codecov job to ubuntu focal workflow
* Fix support for --ctest-args in run-ci script
* Fix ubuntu workflows
* Fix quotation handling in run-ci script
* git safe directory for codecov
* New MPI examples
* Remove --stop-on-failure
* dynamic_library update
- find_library_path checks procfs maps
- invoke find_library_path with no additional args to resolve to mapped file
* RCCLP uses dynamic_library
* check if file exists for memory_map_files metadata
* Testing updates
- include new mpi examples in tests
- fix test labels
- test critical-trace exe
* Update MPI C examples tests (needed arg)
* Remove try/catch block from critical-trace
* Fix sampling max wait when shutting down
* Fix test env for critical-trace
* Fix settings for critical-trace
- disable time output: data is deterministic
- disable PID suffixes: not multiprocess
* Update critical-trace ctest
* Update critical-trace exe
- throw error if input cannot be opened
- throw error if input has no data
* Update lulesh example with more kokkos tools usage
* Fix tasking issue with critical_trace and roctracer
- pools were not being set to active
- also sync before critical_trace::get_entries
* Increase verbosity of critical-trace tests
* Update code coverage tests
- skip code coverage + preload
- code-coverage python example and test
* Remove duplicate omnitrace.initialize function
* Skip python3.6 for ubuntu jammy
* Update MPI examples
- use MPI_Isend and MPI_Irecv
- explicitly use MPI_Bcast
* Update Formatting.cmake
- include C files in examples
* run-ci script does not check return of coverage
* mpi-allreduce link to libm
* Update ctest args in run-ci script
* Update dyninst submodule
- safety improvements in BinaryEdit::openResolvedLibraryName
* capture cmake error for ctest_coverage
[ROCm/rocprofiler-systems commit: 46b6db1a4c]
* Testing and CI support for Ubuntu 22.04
* Fixes for ROCm
- Jammy does not have ROCm installers
* Name, timeout, and python updates
- renamed ubuntu-jammy-external.yml to ubuntu-jammy.yml
- increased all 5 minute timeouts to 10 minutes
- include python 3.10 in testing
* Update dyninst to remove interposed definition of _r_debug
* Rebuild Dyninst + test install script
* Revert container change
* git safe directory
* pushd -> cd
* fix MPI include
* Fix testing step
* OMPI_ALLOW_RUN_AS_ROOT
* Test script changes
* Fix mismatched malloc / delete[]
* Jammy workflow tweaks
* CPack tweak for boost deb deps
* pthread_mutex_gotcha config returns when not enabled
* fix echoing config in CI
* USE_CLANG_OMP
- option to disable using LLVM OpenMP when building OpenMP test executables
- Jammy workflow sets USE_CLANG_OMP=OFF
* Dyninst submodule boost download
- updated containers workflow to include jammy
- updated workflow to use ci
* Updates to workflows + replace test-install.sh
- test-install.sh in this branch was replaced with one in main branch
* Expand jammy test-install.sh args
* Fix openmp-cg-sampling-duration test
* update timemory submodule
- use-after-free violation in popen::pclose
* revert some tweaks to sampling-duration test
* Fix env of test-install.sh
* cmake format
* jammy bash
* CPack install for jammy
* formatting workflow action version bump
* Update timemory submodule
- libunwind submodule via timemory sets SOVERSION to 99 to avoid ABI conflicts with v8
* Fix help menu for omnitrace-sample
* Support other boolean forms in test-install.sh
* Update docker files and build-docker.sh
- consolidated cases in build-docker.sh
- support rocm version of 0.0 (no rocm install)
- support rocm v5.3
- updated centos handling
* update opensuse actions/checkout version
* Tweaks to ubuntu-focal testing
- actions/checkout@v3
- use test-install script
* update cpack
- ubuntu 22.04
- rocm 5.3
- rename os matrix field to os-version
- remove CI_ROCM_VERSION (no longer necessary)
- remove default-rocm-version matrix field (no longer necessary)
- CentOS packaging
* fix argparsing and omnitrace-sample tests in install-tests.sh
* focal rocm test install workflow fix
* Fix omnitrace-sample build
* Dockerfile.centos + build-docker.sh updates
* Update actions/upload-artifact version
* Dockerfile.ubuntu: install rocm-device-libs
* Refactor cpack
* fix cpack if quotes
* Dockerfile.ubuntu rocm < 5 installs rocm-dev
* build-release.sh defaults to boost version 1.79.0
[ROCm/rocprofiler-systems commit: ede6007f9b]
- More to come in later commit, below is just tidying some stuff up
- clang-tidy
- mpi_gotcha quiet about not finding funcs
- update to new papi config
- sampling block_samples / unblock_samples
- disable calling component's sample functions within sampler
- release doesn't strip library
- remove HSA and ROCP env variables from modulefile / setup-env
- preliminary support for LD_PRELOAD usage
- default sampling rate is 300 interrupts / second
- fixes various deadlock issues at startup
[ROCm/rocprofiler-systems commit: 8f36620e29]
- Fix setup-env.sh
- Closes #149
- omnitrace exe color
- test-install.sh script
- if config variable is updated in config or env, include in generated config
- metadata for hsa, rocm, and ompt
- Closes #153
- Closes #154
[ROCm/rocprofiler-systems commit: 15e6e6d979]
- add OnLoad and OnUnload to omnitrace-dl
- disable global fence for kokkos profiling tools
- tweak omnitrace_strip_target to use wildcards
- added dl-gen.py script for generating dlopen bindings
- added support for kokkosp_request_tool_settings
- added support for kokkosp_dual_view_sync
- added support for kokkosp_dual_view_modify
[ROCm/rocprofiler-systems commit: ab395f86c4]
* omnitrace find_package support
- Fix to INSTALL_DESTINATION for configure_package_config_file
- Fixes to ConfigInstall.cmake and omnitrace-config.cmake.in
* Test find_package
[ROCm/rocprofiler-systems commit: d2e635ed3c]