Commit graph

285 Commits

Author SHA1 Message Date
David Galiffi 7bbca47ee8 Enable features to generate dependency list and fix RPM installation. (#343)
* CPACK: Enable features to generate dependency lists

For DEBIAN, enable CPACK_DEBIAN_PACKAGE_SHLIBDEPS to generate a package
dependency list.

For RPM, enable CPACK_RPM_PACKAGE_AUTOREQPROV to automatically generate
lists of shared libraries that this package requires and provides.

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Manually derive RPM provides list for libunwind.

[Why]
`dnf install` fails dependency resolution

[How]
Auto-generation did not add it to the "provides" list, despite it being
included in the package when OMNITRACE_BUILD_LIBUNWIND is enabled. So,
manually include these in the "CPACK_RPM_PACKAGE_PROVIDES" parameter.

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Fix cmake-format linting

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 129580b416]
2024-06-07 10:34:35 -05:00
David Galiffi 01f11ff7b4 Update CPack packaging for ROCm release support (#339)
* Update CPack packaging variables.

Look for ROCM_LIBPATH_VERSION environment variable to patch the
CPACK_PACKAGE_VERSION.
Add some status logs to output packaging information.

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Add CPack variables to "omnitrace_add_feature".

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 6bcd1d36cc]
2024-05-01 21:15:20 -05:00
Jonathan R. Madsen ab551e7eb7 Remove Critical Trace Support (#327)
* Delete core critical-trace files

* Update docs and README

* Update workflows

* Update testing

* Update cmake

* Remove critical trace usage in source code

* Update source/docs/critical_trace.md

- fix spelling

* Formatting

* Update bin/omnitrace-avail/avail.cpp

- statically allocate shared pointers for timemory manager and hash id/aliases to prevent use-after-free errors

[ROCm/rocprofiler-systems commit: 9499e2f521]
2024-04-23 09:35:44 -05:00
David Galiffi 2a42e3abf0 Updated links to point to the ROCm organization. (#337)
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: b81db80926]
2024-04-23 01:44:02 -05:00
David Galiffi 3a9db4dc7f Create CODEOWNERS (#338)
[ROCm/rocprofiler-systems commit: c29f8e8244]
2024-04-22 23:22:01 -05:00
Jonathan R. Madsen 3dad7a8c18 Installers for RHEL 8.8 and RHEL 8.9 (#334)
* Installers for RHEL 8.8 and RHEL 8.9

- RHEL 8.8: Supports ROCm 5.6, 5.7, 6.0
- RHEL 8.9: Supports ROCm 6.0

* Update build-docker.sh

- fix PERL_REPO for OpenSUSE

* Update some workflows to use Node.js 20

* Fix Dockerfile.opensuse*

[ROCm/rocprofiler-systems commit: 1df597e049]
2024-04-01 13:31:10 -05:00
David Galiffi 9ab37345b8 Fix link to ROCm install instructions (#333)
Now pointing to https://rocm.docs.amd.com/projects/install-on-linux/en/latest/.

[ROCm/rocprofiler-systems commit: b1e5c356aa]
2024-03-27 18:13:06 -05:00
Jonathan R. Madsen 25ff5e3891 OMNITRACE_ROCM_SMI_METRICS (#331)
* OMNITRACE_ROCM_SMI_METRICS

- configuration variable OMNITRACE_ROCM_SMI_METRICS for specifying which rocm-smi metrics to collect
- auto-disable metric collection when rsmi_dev_X_get returns RSMI_STATUS_NOT_SUPPORTED

* Bump version to 1.11.1

* Python formatting

* Update python/libpyomnitrace.cpp

- fix usage of substr (ignored return value)

* Update python/gui/source/gui.py

- Fix E721
  - do not compare types, for exact checks use `is` / `is not`, for instance checks use `isinstance()`
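
The E721 rule quoted above can be illustrated with a minimal sketch. The class and function names below are hypothetical and are not taken from `gui.py`; they only contrast an exact type check with an instance check.

```python
class Widget:
    """Hypothetical base class, used only for illustration."""


class Button(Widget):
    """Hypothetical subclass of Widget."""


def is_exactly_widget(obj):
    # Exact type check: compare type objects by identity (`is`), not `==`.
    return type(obj) is Widget


def is_widget(obj):
    # Instance check: also accepts subclasses of Widget.
    return isinstance(obj, Widget)
```

An exact check rejects a `Button`, while the `isinstance()` check accepts it, so the choice depends on whether subclasses should pass.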

[ROCm/rocprofiler-systems commit: 15127c0d43]
2024-02-08 07:06:23 -06:00
Jonathan R. Madsen 8c8caaa1d9 Fix omnitrace-avail component list (#328)
* Fix omnitrace-avail component list

- remove omnitrace components from `omnitrace-avail -C` since these are no-ops in OMNITRACE_TIMEMORY_COMPONENTS

* Fix omnitrace-avail-filter-wall-clock-available test

[ROCm/rocprofiler-systems commit: 77d52814e9]
2024-01-10 20:00:46 -06:00
Jonathan R. Madsen 06c47383cc Fix thread-limit bug in roctracer (#326)
Update roctracer.cpp

- fix call to hip_exec_activity_callbacks when there are more runtime threads than the compile-time max

[ROCm/rocprofiler-systems commit: edd6f57cf3]
2024-01-10 19:10:45 -06:00
Jonathan R. Madsen 1444bf4d85 ROCm 6.0 packaging (#325)
* Update build-docker.sh

- support rocm 6.0

* Update cpack workflow

- support rocm 6.0

* Update CI testing workflow paths-ignore

- changes to {docs,cpack,containers,formatting}.yml and docker do not require testing

* Update docker for OpenSUSE

- always use --non-interactive with zypper
- tweak to PERL_REPO when OS version >= 15.4

[ROCm/rocprofiler-systems commit: cfaace38a8]
2024-01-10 17:29:47 -06:00
Jonathan R. Madsen d4ac1ed7ea Fix cpack on SLES (#324)
Update lib/core/gpu.cpp

- use const std::array instead of constexpr std::array due to internal compiler errors on systems with older GCC compilers

[ROCm/rocprofiler-systems commit: adefde707c]
2024-01-10 09:38:09 -06:00
Ben Richard 857eee18c2 Deprecate OMNITRACE_USE_PERFETTO, OMNITRACE_USE_TIMEMORY (#306)
* Rename OMNITRACE_USE_PERFETTO to OMNITRACE_TRACE

* Rename OMNITRACE_USE_TIMEMORY to OMNITRACE_PROFILE

* Revert change to Perfetto.cmake

* Fix formatting

clang-format-11 was complaining about formatting

[ROCm/rocprofiler-systems commit: 5de4163d66]
2024-01-10 07:20:54 -06:00
Tal Ben-Nun 5e4d7f7f84 Add option to skip barrier marker events in traces (#320)
* Add option to skip barrier marker events in traces

* Formatting

* Apply review suggestions

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

* clang-format

* Formatting

---------

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-systems commit: 82cbe3f306]
2024-01-10 07:17:32 -06:00
Tal Ben-Nun e0ed9a6e52 Enable running setup-env from non-bash shells (#319)
Signed-off-by: Tal Ben-Nun <talbn@llnl.gov>

[ROCm/rocprofiler-systems commit: 5b42702299]
2024-01-10 07:17:08 -06:00
Jonathan R. Madsen 86946e95e2 HIP API backtraces (#323)
* Update lib/core/config.cpp

- Add OMNITRACE_ROCTRACER_HIP_API_BACKTRACE option

* Update lib/omnitrace/library/roctracer.cpp

- support perfetto debug annotation of backtrace in HIP API call

* Fix backtrace resolution and ordering in UI (#1)

* Fix backtrace resolution for non-omnitrace libraries

* Nicer Perfetto UI on long backtraces

* Make Perfetto annotation consistent

* clang-format

---------

Co-authored-by: Tal Ben-Nun <tbennun@users.noreply.github.com>

[ROCm/rocprofiler-systems commit: 608287ddad]
2024-01-10 06:15:32 -06:00
Jonathan R. Madsen b5bdba12e4 Roctracer flush activity fix + perfetto.cfg (#317)
* Fix roctracer_flush_activity

- invoke roctracer_flush_activity() before disabling domains

* create comp::roctracer::flush()

- real issue was the global state when roctracer_flush_activity() was called

* formatting

* Update lib/omnitrace/library/components/roctracer.hpp

- provide definition of comp::roctracer::flush when OMNITRACE_USE_ROCTRACER is not defined

* omnitrace.cfg -> perfetto.cfg

- rename provided perfetto config file (omnitrace.cfg) to perfetto.cfg to avoid confusion

* Update lib/core

- gpu.hpp: defines for OMNITRACE_USE_{HIP,ROCTRACER,ROCPROFILER,ROCM_SMI}
- gpu.cpp
  - include core/hip_runtime.hpp
  - fix serialization of hipDeviceProp_t
- add hip_runtime.hpp
  - ensure proper inclusion of hip_runtime.h
- add rccl.hpp
  - ensure proper inclusion of rccl.h

* Update lib/omnitrace/library

- rcclp.cpp
  - update includes for rccl
- roctracer.hpp
  - update includes for hip_runtime
- components/comm_data.hpp
  - update includes for rccl
- components/rcclp.hpp
  - update includes for rccl

* Update bin/omnitrace-avail/avail.cpp

- update includes for hip_runtime

* Update examples/rccl/CMakeLists.txt

- fix find_package for rccl when CI enabled

* Update CMakeLists.txt

- set cmake policy CMP0135 to NEW for cmake >= 3.24
  - Enable DOWNLOAD_EXTRACT_TIMESTAMP with ExternalProject_Add + URL download method

* Update timemory submodule

* Update pybind11 submodule

* Update pybind11 submodule

* Update lib/core/rccl.hpp

- include rccl.h only if OMNITRACE_USE_RCCL > 0

* Update lib/core/{gpu,hip_runtime}.hpp

* Update lib/core/gpu.cpp

- reintroduce some ppdefs

* Update lib/core/gpu.cpp

- fix ifdef on OMNITRACE_HIP_VERSION

* Update lib/core/gpu.cpp

- fix static assert for OMNITRACE_HIP_VERSION_MINOR when HIP version 4.x or older (unreliable minor versions)

* Update lib/core/gpu.cpp

- fix ifdef on OMNITRACE_HIP_VERSION

* Update lib/core/config.cpp

- disable OMNITRACE_PERFETTO_COMBINE_TRACES by default

* Update lib/core/perfetto.cpp

- if unable to open perfetto temp file, return the ReadTraceBlocking()

* Update lib/core/config.*

- flush tmpfile before closing

[ROCm/rocprofiler-systems commit: 7bc50f5a0a]
2024-01-10 05:02:22 -06:00
Ben Richard e75c591baa Fix MPI test failures (#322)
The CI test machines only have 2 MPI slots. MPI tests were failing
when requesting 4 CPUs. Update these tests to request 2 CPUs.

[ROCm/rocprofiler-systems commit: aeb346b6d6]
2024-01-09 09:19:06 -06:00
Jonathan R. Madsen a1b11b94f0 Dynamic expansion of thread data (#294)
* Tests for exceeding OMNITRACE_MAX_THREADS

- tests which exceed the OMNITRACE_MAX_THREADS value for thread creation

* CMake Formatting.cmake update

- include source files in /tests/source directory

* Add unknown-hash= to OMNITRACE_ABORT_FAIL_REGEX

- fail if a timemory hash is not resolved to a name

* Tests for exceeding OMNITRACE_MAX_THREADS

- update

* omnitrace-sample update

- remove env disabling of critical-trace and process-sampling

* core library update

- make_unique in concepts.hpp
- add OMNITRACE_USE_ROCM_SMI to "process_sampling" category
- remove forced disabling of critical-trace in sampling mode
- parentheses for OMNITRACE_PREFER
- use tim::get_hash_id instead of tim::get_combined_hash_id

* core library update (containers)

- added aligned_static_vector.hpp
  - similar to static_vector.hpp but attempts to align to cache line size
- alignment template parameter for stable_vector
- added missing aliases in static_vector
  - consistent with aligned_static_vector aliases

* thread_info update

- track the peak number of threads created
- thread_info::get_peak_num_threads() returns the peak number of threads

* thread_data update

- generic thread_data inherits from base_thread_data
- thread_data reworked to support dynamic expansion
- base_thread_data updated to invoke private_instance() function
- thread_data<optional<T>> uses stable_vector aligned to cache line width
- thread_data<identity<T>> uses stable_vector aligned to cache line width
- thread_data for optional and identity provide private private_instance function + friend to base_thread_data
- component_bundle_cache<T> is now thread_data<component_bundle_cache_impl<T>>

* causal update

- thread_data<T>::instances -> thread_data<T>::instance(construct_on_thread{ ... })
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
- tim::get_combined_hash_id -> tim::get_hash_id
- update progress_bundle usage to new thread_data API

* backtrace/backtrace_metrics component update

- backtrace_metrics update
  - update to new thread_data API
  - add thread CPU time row in perfetto
  - fix potential bug when rusage categories are disabled
  - fix bug in operator-= not subtracting cpu time of rhs
- backtrace update
  - skip all child call-stack below 'tim::openmp::' if sampling_keep_internal = false

* pthread_gotcha component update

- pthread_gotcha::shutdown() invokes pthread_create_gotcha::shutdown()

* pthread_create_gotcha component update

- minor tweak to {start,stop}_bundle functions: pass in thread id
- update to new thread_data API
- track native handles of internal threads
- implement system with pthread_kill to stop dangling bundles

* rocprofiler/roctracer component update

- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* critical trace (library) update

- update to new thread_data API
- tim::get_combined_hash_id -> tim::get_hash_id

* coverage update

- update to new thread_data API

* tasking update

- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* roctracer update

- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* rocm_smi update

- update to new thread_data API

* runtime.cpp update

- update to new thread_data API

* sampling.cpp update

- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* ompt.cpp update

- invoke pthread_gotcha::shutdown before invoking OMPT finalize function
  - this prevents signals from being delivered to OpenMP threads

* tracing.hpp and tracing.cpp update

- replace get_timemory_hash_{ids,aliases} functions with copy_timemory_hash_ids function
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
- tim::get_combined_hash_id -> tim::get_hash_id
- improvements to + error checking in thread_init function

* library.cpp update

- move copying timemory hash id/aliases to tracing.cpp
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* Update BuildSettings.cmake

- add -Wno-interference-size to suppress warning about use of std::hardware_destructive_interference_size

* Update fork example

- improve scheme for waiting on child processes via waitpid instead of wait
- support running main routine multiple times
- push/pop regions in child process
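
The difference between `waitpid` and `wait` mentioned above can be sketched as follows. This is a POSIX-only illustration using Python's `os` module, not the fork example from the repository; `run_children` is a hypothetical helper.

```python
import os


def run_children(exit_codes):
    """Fork one child per exit code and reap each child by its PID."""
    pids = []
    for code in exit_codes:
        pid = os.fork()
        if pid == 0:
            # Child process: exit immediately with the requested code.
            os._exit(code)
        pids.append(pid)

    results = {}
    for pid in pids:
        # waitpid targets one specific child, whereas wait() returns
        # whichever child happens to terminate first.
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status):
            results[pid] = os.WEXITSTATUS(status)
    return results
```

Reaping by PID keeps each exit status tied to the child that produced it, which matters when the main routine runs more than once.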

* Update lib/common/defines.h.in

- allow user to specify misc values via -D <name>=<value>
  - OMNITRACE_CACHELINE_SIZE
  - OMNITRACE_CACHELINE_SIZE_MIN
  - OMNITRACE_ROCM_MAX_COUNTERS
- remove unused defines
  - OMNITRACE_ROCM_LOOK_AHEAD
  - OMNITRACE_MAX_ROCM_QUEUES

* Update rocprofiler.hpp

- OMNITRACE_MAX_ROCM_COUNTERS -> OMNITRACE_ROCM_MAX_COUNTERS

* Update aligned_static_vector

- set cacheline_align_v from max of OMNITRACE_CACHELINE_SIZE and OMNITRACE_CACHELINE_SIZE_MIN

* Update tracing.cpp

- acquire locks for updating main hash ids/aliases
- only propagate ids/aliases when finalizing

* Update pthread_create_gotcha.cpp

- make sure hash for "start_thread" exists on main thread

* Update causal end to end tests

- if OMNITRACE_BUILD_NUMBER is 1, set OMNITRACE_VERBOSE=0

[ROCm/rocprofiler-systems commit: 518c83e0f9]
2023-10-16 18:04:47 -05:00
Jonathan R. Madsen 63e8cec645 Bump version to 1.11.0 (#314)
Bump version to 1.11.0

[ROCm/rocprofiler-systems commit: 6a61a83452]
2023-10-16 13:11:58 -05:00
Jonathan R. Madsen 41e9dddc74 Installers for ROCm 5.7, Python 3.12 + Remove DEB and RPM installers (#313)
* Dockerfile update

- Python 3.12 support

* Bump version to 1.10.4

* Update docker scripts

- Support Python 3.12
- Set RETRY to 1 if less than 1
- Support ROCm 5.7

* Update scripts/build-release.sh

- Default to python 3.6-3.12 (i.e. add Python 3.12)

* Update cpack workflow

- Packaging for ROCm 5.7
  - Ubuntu 20.04
  - Ubuntu 22.04
  - OpenSUSE 15.4
  - RHEL 8.7
  - RHEL 9.1
- Packaging for older ROCms (by request)
  - RHEL 8.7 + ROCm 5.3
  - OpenSUSE 15.3 + ROCm 5.2
  - OpenSUSE 15.4 + ROCm 5.2
  - OpenSUSE 15.4 + ROCm 5.3
- Remove DEB and RPM installers
  - Only generate STGZ installers

* Update cpack workflow

- disable uploading DEB and RPM artifacts

[ROCm/rocprofiler-systems commit: 2e581e2a10]
2023-10-16 10:39:18 -05:00
Jonathan R. Madsen 1502968c67 Fix roctracer data race (#309)
- roctracer_type_mutex was per-thread, causing a lack of synchronization between callback and activity
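
Why a per-thread mutex provides no synchronization can be shown with a small sketch. It is written in Python rather than C++, and every name in it is hypothetical; it only demonstrates that a thread-local lock is a different object in each thread, so two threads never contend on it.

```python
import threading

_shared_lock = threading.Lock()   # one lock object seen by every thread
_per_thread = threading.local()   # separate attribute storage per thread


def per_thread_lock():
    # Anti-pattern: each thread lazily creates its own lock.
    if not hasattr(_per_thread, "lock"):
        _per_thread.lock = threading.Lock()
    return _per_thread.lock


def shared_lock():
    # Fix: all threads receive the same lock object.
    return _shared_lock


def lock_seen_by_new_thread(getter):
    """Return the lock object that a freshly started thread obtains."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(getter()))
    worker.start()
    worker.join()
    return seen[0]
```

Comparing `lock_seen_by_new_thread(per_thread_lock)` with the caller's own `per_thread_lock()` yields two distinct objects, while the shared variant yields the same one.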

[ROCm/rocprofiler-systems commit: 227980f32b]
2023-09-27 18:06:16 -05:00
Fei Zheng 587bd81d29 A few fixes for the document (#302)
a few doc fixes

[ROCm/rocprofiler-systems commit: f88614dbcd]
2023-09-27 14:43:06 -05:00
Jonathan R. Madsen 553fff7218 Clean package manager caches in CI (#305)
* Clean package manager caches in CI

* remove cpack docker prune

* Apply suggestions from code review

* Apply suggestions from code review

* Update .github/workflows/cpack.yml

- disable removing large packages and swap storage to see if that fixes the jobs losing communication with server

* Update ubuntu-focal workflow

- disable debug flags in code coverage run

[ROCm/rocprofiler-systems commit: 0d84a357e5]
2023-09-26 15:44:49 -05:00
Jonathan R. Madsen 7499945cb9 Reduce release packaging (#300)
* Bump version to 1.10.3

* Drop releases for ROCm < 5.3

- ROCm is no longer providing releases for Ubuntu 18.04 starting with 5.3, so omnitrace is dropping support for Ubuntu 18.04 + ROCm
- Dropping ROCm 5.2 releases for Ubuntu 20.04
- Dropping ROCm 5.2 releases for OpenSUSE 15.4

* Update redhat workflow

- Test RedHat 9.1 + ROCm 5.5
- Test RedHat 9.1 + ROCm 5.6

* Update ubuntu-focal workflow

- drop ROCm 5.2 testing
- add ROCm 5.6 testing

* Update Findroctracer.cmake

- provide /opt/amdgpu to HINTS/PATHS for drm and drm_amdgpu libraries

* Update Findrocprofiler.cmake

- prefer librocprofiler64.so.1

* Update librocprofiler64.so to librocprofiler64.so.1

- search for the SOVERSION 1 library of librocprofiler64.so if ROCm > 5.5.0
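
The version-dependent library name described above can be sketched as a simple comparison. This is a Python illustration of the rule only, not the logic in Findrocprofiler.cmake; `rocprofiler_library_name` is a hypothetical helper.

```python
def rocprofiler_library_name(rocm_version):
    """Return the library file name to search for (hypothetical helper)."""
    # Tuples compare element by element, so (5, 6, 0) > (5, 5, 0).
    if tuple(rocm_version) > (5, 5, 0):
        return "librocprofiler64.so.1"
    return "librocprofiler64.so"
```

Comparing version tuples avoids the pitfalls of comparing version strings lexically.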

* Update Findrocprofiler.cmake

- link to libpciaccess for ROCm 5.5.0

* Update redhat CI workflow

- install libpciaccess for rocm CI

* Update cpack workflow

- Remove all RHEL 9.0 packaging
- Remove all packaging for ROCm 5.3 on OSes where releases are provided for 5.4, 5.5, and 5.6

* Update ubuntu focal workflow

- remove rocm 5.3 jobs

[ROCm/rocprofiler-systems commit: 1216fd99a7]
2023-09-12 20:19:26 -05:00
Jonathan R. Madsen c905a38e5c Dependabot package updates (#301)
Dependabot package updates

[ROCm/rocprofiler-systems commit: 70d6fc7631]
2023-08-10 15:10:04 -05:00
Jonathan R. Madsen e86c87b755 Packaging for ROCm 5.6 (#299)
* Packaging for ROCm 5.6

- Bump version to 1.10.2
- build rocm 5.6 containers for ubuntu 20.04, 22.04
- build rocm 5.6 containers for opensuse 15.4
- build rocm 5.5 and 5.6 for rhel 8.7, 9.0, 9.1
- cpack rocm 5.6 for ubuntu 20.04, ubuntu 22.04, opensuse 15.4, rhel 8.7, rhel 9.0, rhel 9.1

* Update omnitrace.cfg

- remove file_write_period_ms
- remove flush_period_ms

* Remove ROCm 5.6 for RHEL 9.0

- no packaging support

[ROCm/rocprofiler-systems commit: 0b751d2aef]
2023-08-09 18:59:45 -05:00
Jonathan R. Madsen e135d3c6eb Sampling post-processing Perfetto fix (#298)
sampling post-processing perfetto fix

- avoid creating overflow sampling perfetto tracks when there is no data
- fix the parent region begin/end timestamps for the sampling tracks

[ROCm/rocprofiler-systems commit: 5276c957fb]
2023-07-06 02:40:49 -05:00
Jonathan R. Madsen 82c9cdd9b5 Update GOTCHA submodule (via timemory submodule) (#295)
Update timemory submodule

- updated GOTCHA submodule
  - support for wrapping the latest symbol version
- CI updates
- macos-ci in GitHub actions
- misc fixes for macOS
- fixed yaml-cpp install

[ROCm/rocprofiler-systems commit: 68b4d790f3]
2023-06-30 17:56:32 -05:00
Jonathan R. Madsen 077a844cb6 PyTorch Python fork fix part 2 (#292)
PyTorch Python fork fix part 2

- store script file in environment for robustness against restart after fork

[ROCm/rocprofiler-systems commit: 6c9b66d938]
2023-06-22 17:14:40 -05:00
Jonathan R. Madsen 5eeccc1a8a PyTorch Python fork fix (#291)
* PyTorch Python fork fix

- fixes issue where forking process in PyTorch causes omnitrace/__main__.py to fail due to missing script argument

* Update source/python/omnitrace/__main__.py

Remove debugging "print" LOC

[ROCm/rocprofiler-systems commit: a85f141afe]
2023-06-21 22:30:47 -05:00
Jonathan R. Madsen 41394a8fad GitHub Actions workflow and docker updates (#290)
* Support ROCm 5.5 in docker

* Update containers workflow

- add Ubuntu and OpenSUSE container builds for ROCm 5.4 and 5.5
- add RHEL builds

* Update cpack workflow

- build on PR against main when cpack.yml or docker files updated
- removed packaging for ROCm < 5.2 for many OSes
- added packaging for ROCm 5.5

* Update OpenSUSE workflow

- add python 3.11 to OMNITRACE_PYTHON_ENVS
- upload-artifacts name includes strategy.job-index (prevent overwrite)
- only upload artifacts on failure
- continue on error if upload artifacts fails

* Update RedHat workflow

- provide run-name
- add python 3.11 to OMNITRACE_PYTHON_ENVS
- upload-artifacts name includes strategy.job-index (prevent overwrite)
- only upload artifacts on failure
- continue on error if upload artifacts fails

* Update Ubuntu (Bionic) workflow

- add python 3.11 to OMNITRACE_PYTHON_ENVS
- upload-artifacts name includes strategy.job-index (prevent overwrite)
- only upload artifacts on failure
- continue on error if upload artifacts fails

* Update Ubuntu (Focal) workflow

- add python 3.11 to OMNITRACE_PYTHON_ENVS
- upload-artifacts name includes strategy.job-index (prevent overwrite)
- only upload artifacts on failure
- continue on error if upload artifacts fails
- remove testing of ROCm 4.3, 5.0, 5.1
- add testing of ROCm 5.5

* Update Ubuntu (Jammy) workflow

- add python 3.11 to OMNITRACE_PYTHON_ENVS
- upload-artifacts name includes strategy.job-index (prevent overwrite)
- only upload artifacts on failure
- continue on error if upload artifacts fails
- add testing of ROCm latest

* Dockerfile.{rhel,opensuse} update

- remove use of amdgpu-install in favor of installing rocm-dev package
  - In ROCm 5.5, amdgpu-install changed meaning of --usecase=rocm (added rocmdev use case)

* redhat workflow update

- remove use of amdgpu-install in favor of installing rocm-dev package
  - In ROCm 5.5, amdgpu-install changed meaning of --usecase=rocm (added rocmdev use case)

* build-docker.sh update

- add '--progress plain' to docker build commands

* Ubuntu (jammy) workflow update

- fix rocm installation

* Update Dockerfile.rhel

- add LIBRARY_PATH for /opt/amdgpu/lib64 for redhat

* Update Dockerfile.rhel

- install libpciaccess for rocm

[ROCm/rocprofiler-systems commit: 693f753a9e]
2023-06-20 06:26:17 -05:00
Jonathan R. Madsen c6929f545d Perfetto annotation from timemory components (#289)
* Annotate perfetto with timemory component data

- support perfetto annotations via timemory component data, e.g. use PAPI component for exact HW counter annotations

* Tests for perfetto annotation via timemory data

* Update omnitrace-instrument

- remove --default-components argument as this overrides any components set in configuration file
- required by perfetto annotation via timemory data tests

* filter unavailable timemory components

- filter out unavailable timemory components before attempting to invoke the annotate operation on the bundle

* update annotate tests

- account for no PAPI support

* update lulesh-timemory test

- replace '-d wall_clock peak_rss' with '--env OMNITRACE_TIMEMORY_COMPONENTS="wall_clock peak_rss"'

* annotate tests update

- fix misnamed test

* annotate tests update

- restrict binary rewrite to run function to force instrumentation despite heuristics

* annotate tests update

- print {available,overlapping,excluded,instrumented} functions during binary rewrite

* annotate tests update

- add allow-overlapping flag

* Support PAPI with CAP_SYS_ADMIN

- do not disable PAPI if perf_event_paranoid > 2 but has CAP_SYS_ADMIN capability
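
The rule above can be sketched as a small decision function. This is a Python illustration, not the omnitrace implementation; the function names are hypothetical, and the only external facts assumed are that CAP_SYS_ADMIN is capability number 21 in `<linux/capability.h>` and that `/proc/<pid>/status` reports the effective capability mask on a `CapEff:` line in hexadecimal.

```python
CAP_SYS_ADMIN_BIT = 21  # value of CAP_SYS_ADMIN in <linux/capability.h>


def has_cap_sys_admin(proc_status_text):
    """Check the effective capability mask in /proc/<pid>/status text."""
    for line in proc_status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split(":", 1)[1].strip(), 16)
            return bool(mask & (1 << CAP_SYS_ADMIN_BIT))
    return False


def papi_usable(perf_event_paranoid, cap_sys_admin):
    # Keep PAPI enabled when the paranoid level is permissive enough, or
    # when the process holds CAP_SYS_ADMIN despite a restrictive level.
    return perf_event_paranoid <= 2 or cap_sys_admin
```

In practice the inputs would come from `/proc/sys/kernel/perf_event_paranoid` and `/proc/self/status`; passing them in as arguments keeps the decision itself easy to test.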

[ROCm/rocprofiler-systems commit: 1aca8c177b]
2023-06-19 19:18:04 -05:00
Jonathan R. Madsen a0812bfa0b Fix rocprofiler usage in ROCm >= 5.5.x (#288)
Fix rocprofiler usage in ROCm >= 5.5.x

- starting with ROCm 5.5.0, rocprofiler throws exception if OnLoad + dlopen librocprofiler
- CI skipped for this PR since CI does not support GPU usage (tested locally)

[ROCm/rocprofiler-systems commit: 223536896b]
2023-06-15 23:28:45 -05:00
Jonathan R. Madsen 97011ea642 Fix thread index values (#287)
* Update PTL

- PTL submodule waits for threads to start before proceeding

* Initialize perfetto after init_bundle

- perfetto thread creation after pthread_create wrapped

* backtrace component update

- exclude gotcha call-tree

* callchain component update

- callchain::get sorts based on timestamp
- callchain::sample supports duplicate IPs (recursion)

* Bump version to 1.10.1

[ROCm/rocprofiler-systems commit: de9f0e4c10]
2023-06-15 22:37:33 -05:00
Jonathan R. Madsen 973f5bc348 kokkosp update: disable deep copy tracing by default (#286)
kokkosp update

- disable tracing deep copies by default because inconsistent begin/end corrupts call-stack hierarchy
- tracing deep copy operations can be enabled via `OMNITRACE_KOKKOSP_DEEP_COPY`

[ROCm/rocprofiler-systems commit: 262f1e9299]
2023-06-15 17:55:38 -05:00
Jonathan R. Madsen a527ebcf4e Install rocm-dev for hipcc in ubuntu-jammy workflow (#285)
Install rocm-dev for hipcc in ubuntu-jammy workflow

[ROCm/rocprofiler-systems commit: 2c684de367]
2023-06-15 15:58:32 -05:00
Jonathan R. Madsen d38a04a7ea Remove docker hiplibsdk from amdgpu-install (#283)
Remove docker hiplibsdk from amdgpu-install

- amdgpu-install use case hiplibsdk is not necessary and bloats the install
- same as above for package rocm-hip-sdk

[ROCm/rocprofiler-systems commit: 2ce0cb4a19]
2023-06-14 15:09:35 -05:00
Jonathan R. Madsen 8d85410b11 Fix RHEL docker containers (#282)
* Fix RHEL docker containers

- avoid `yum update` since that can update the distro minor version

[ROCm/rocprofiler-systems commit: ad51223960]
2023-06-14 13:18:43 -05:00
Jonathan R. Madsen b65f8e7605 CI timeout + line-info in releases (#279)
* Update perfetto args.gn.in

- remove enable_perfetto_tools_trace_to_text (unused)

* core timeout implementation

- requires OMNITRACE_CI=ON
- requires OMNITRACE_CI_TIMEOUT=<sec>
- adds pthread_self and std::this_thread::get_id to thread info
- pthread_create_gotcha stores native handles (pthread_self)

* Testing updates

- improve detection of segfault/failures when PASS_REGEX exists
- add OMNITRACE_CI_TIMEOUT env variable to all tests

* Line-info in releases

- e.g. -g1 + more options to minimize size of debug info

* Fix typo in config exit action message

* OMNITRACE_UNLIKELY around debug/verbose messages

* format fixes

* Overflow tests + capability check

* transpose example update

- link to threads library

* roctracer/rocprofiler update

- in ROCm 5.5.0, cannot include rocprofiler.h and roctracer.h in same file due to conflicting enum defs
- Moved HSA tracing setup/shutdown to component::roctracer

* roctracer update

- fix definition of roctracer::setup when disabled

* Update fork example

- detach threads on main PID
- flush io outputs when printing info

* Update overflow tests

- pass regular expressions
- overflow on PERF_COUNT_SW_CPU_CLOCK event

* fork gotcha update

- use getpid() instead of getppid()

* update fork example

- wait on threads calling fork

* timeout update

- wait on timeout thread to launch before proceeding

[ROCm/rocprofiler-systems commit: 3e2fa69a14]
2023-06-14 11:55:22 -05:00
Jonathan R. Madsen 557adea45a Linux Perf Support + Causal Profiling Updates (#276)
* causal backtrace updates

- fix initial causal sampling period value

* causal delay updates

- tweak handling of sleep_for_overhead

* Fix experiment global scaling for prog pts

- results in drastically improved predictions

* pthread_mutex_gotcha updates

- disable all wrappers during causal profiling

* validate-causal-json.py updates

- support decimal stddev
- fix setting stddev from command-line

* causal perform_experiment_impl update

- handle start failing because finalizing

* deprecate causal::component::sample_rate

- appears to not help at all

* Rework sample info

* Increase causal unwind_depth

- use OMNITRACE_MAX_UNWIND_DEPTH

* validate-causal-json updates

- min experiments
  - exclude reporting predictions with less than X experiments at a given speedup
- percent samples
  - only print samples within X% of the peak (default: 95%)
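
The "min experiments" filter described above can be sketched as follows. This is not the code in validate-causal-json.py; the function name and the dictionary layout (speedup percentage mapped to experiment count) are assumptions made for illustration.

```python
def filter_predictions(experiment_counts, min_experiments):
    """Keep speedups backed by at least min_experiments experiments.

    experiment_counts maps a speedup percentage to the number of
    experiments run at that speedup (hypothetical data layout).
    """
    return {
        speedup: count
        for speedup, count in experiment_counts.items()
        if count >= min_experiments
    }
```

Using `>=` means a speedup with exactly `min_experiments` experiments is still reported; only those with fewer are excluded.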

* Update timemory submodule

- extensions to sampling for signals delivered via non-timer method
  - e.g. via HW counter overflow

* dwarf_entry::operator< updates

- sort via file

* causal profiling docs updates

- info about backends
- info about installing/enabling perf

* config updates: causal backend

- CausalBackend enum
- OMNITRACE_CAUSAL_BACKEND: perf, timer, auto
- omnitrace-causal option: --backend

* debug update

- use spin_mutex instead of std::mutex

* address_range::contains update

- range from 0-100 contains range from 10-100 but was returning false because high was == 100 not < 100
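
The boundary bug described above can be reproduced with a minimal sketch. The `AddressRange` class below is a hypothetical Python stand-in, not the omnitrace `address_range` type; it contrasts a strict comparison on the upper bound with an inclusive one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressRange:
    """Hypothetical range type with inclusive low and high bounds."""

    low: int
    high: int

    def contains_strict(self, other):
        # Buggy variant: rejects an inner range that shares the upper bound.
        return self.low <= other.low and other.high < self.high

    def contains(self, other):
        # Fixed variant: an equal upper bound still counts as contained.
        return self.low <= other.low and other.high <= self.high
```

With `outer = AddressRange(0, 100)` and `inner = AddressRange(10, 100)`, the strict variant rejects `inner` and the inclusive variant accepts it.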

* symbol::operator< update

- handle load address differences

* sampling updates (non-causal)

- update get_timer to get_trigger + dynamic_cast

* container::static_vector updates

- support construction from container::c_array
- update_size private member func for handling atomic m_size

* Move perf files

- moved library/causal/perf.{hpp,cpp} to library/perf.{hpp,cpp}

* causal example update

- created impl.hpp (forward decls)
- renamed {cpu,rng}_func_impl to {cpu,rng}_impl_func
- only create two threads which run N iterations instead of two threads each iteration

* Update timemory submodule

- updates to unwind::processed_entry
- updates to procfs::maps

* Updated causal documentation

- fixed line numbers changed by modifications to causal example

* omnitrace-causal exe updates

- set OMNITRACE_THREAD_POOL_SIZE to zero by default

* core/containers updates

- static_vector: provide data() member function
- c_array pop_front() and pop_back() member functions

* core: config and argparse updates + perf

- core/perf.{hpp,cpp}
  - forward decl of enums
  - config-related capabilities
- argparse: --sample-overflow
- renamed some config functions
  - e.g. get_sampling_cpu_freq -> get_sampling_cputime_freq
- added config settings related to overflow sampling via perf
- added timer_sampling and overflow_sampling categories

* Update timemory submodule

- sampling allocator flushing

* binary updates

- lookup_ipaddr_entry
- use bfd_find_nearest_line instead of bfd_find_nearest_line_discriminator
  - discriminators are not used
- explicit instantiations of inlined_symbol::serialize

* Bump VERSION to 1.10.0

* sampling and perf updates

- support overflow sampling via Linux Perf
- update perf namespace
- update perf::perf_event
  - update record ctor: pointer instead of const ref
  - update open member func: return optional string
  - add m_batch_size member variable
- sampling updates
  - support overflow sampling
  - flush allocators
  - increase buffer size from 1024 to 2048
  - restructure post-processing in light of perf overflow support
  - improve offload memory usage: only load buffers for thread
  - load_offload_buffer(tid) uses thread-specific filepos
- component updates
  - backtrace_metrics::operator-=
  - backtrace_metrics::operator-
  - backtrace::sample does not record for overflow signal
  - callchain: perf overflow sample

* core updates

- component::sampling_percent does not report self + uses_percent_units

* causal updates

- tweak get_line_info
- overloads for set_current_selection (uint64_t, c_array, std::array)
- delay
  - use sampling::pause/sampling::resume
- experiment
  - experiment::sample derives from unwind::processed_entry
  - experiment::samples is vector instead of set
  - fixed samples
  - overloads for is_selected (uint64_t, c_array, std::array)
  - scaling factor defaults to 100 instead of 50
  - serialize updates follow change to experiment::sample
  - modify algorithm for increasing/decreasing experiment length
- sample_data
  - use map<uintptr, uint64_t> instead of set<sample_data>
  - get_samples returns vector<sample_data> instead of set<sample_data>
- sampling
  - support overflow via Linux Perf
  - update causal_offload_buffer
  - flush sampling allocator
- backtrace
  - overflow component

* libomnitrace-dl updates

- handle dl::InstrumentMode::PythonProfile

* testing updates (causal)

- causal line 155 -> causal line 100
- causal line 165 -> causal line 110

* formatting

* exit_gotcha updates

- exit_info for abort()
- message about non-zero exit code

* testing updates

- fail regex for causal tests
- validate-causal-json: >= min_experiments instead of > min_experiments
- handle OMNITRACE_DEBUG_SETTINGS in omnitrace_write_test_config

* causal sampling updates

- add new lines where appropriate

* causal data updates

- reorder diagnostic info when experiment fails to start

* binary updates

- symbol address range from address to address + symsize + 1
  - add 1 based on debug info

* causal data updates

- sample_selection wait_ns defaults to 1,000 instead of 10,000
- sample_selection wait scaled by iteration number
- save_line_info_impl verbosity
- print latest_eligible_pc when experiment does not start

* causal sampling + component updates

- perf backend disables component::backtrace
- ensure get_sampling_(realtime|cputime|overflow)_signal do not malloc

* causal: remove period stats

* validate-causal-json update

- fix --help

* causal data updates

- improve eligible pc history reporting when experiment fails to start

* causal data updates

- fix compute_eligible_lines_impl
  - eligible address ranges returning too many ranges
  - occasionally, overwrite all *true* eligible address ranges

* causal data updates

- reduce scoped ranges to symbol ranges
- is_eligible_address() returns true if contained (not just coarse)
- revert some sample_selection behavior

* binary address_multirange updates

- make coarse_range private
- fix operator+=(pair<coarse, uintptr_t>)

* causal example update

- fix nsync to default to once per iteration

* binary analysis updates

- tweak header file includes

* causal updates

- remove factoring in sleep_for_overhead
- invoke delay::process() even if experiment is not active

* causal data updates

- update latest_eligible_pc structure

* update omnitrace-install.py.in

- fix support for fedora
  - /etc/os-release does not have ID_LIKE
  - fallback to RHEL 8.7 if version not specified

* update omnitrace-install.py.in

- fix support for debian
  - /etc/os-release does not have ID_LIKE
  - version mapping

* Update documentation

- update docs on installation

* causal data and experiment updates

- data: reset_sample_selection

* causal set_current_selection debugging

- debug messages for failed e2e runs

* causal data and backtrace component updates

- data: set_current_selection returns the number of eligible addresses added
- backtrace: if cputime signal has selected zero IPs > 5x, then realtime signal starts contributing call-stacks

* core library updates

- move config::parse_numeric_range to utility namespace
- add core/utility.cpp
- support range:increment, e.g. 5-25:10 expands to '5 15 25' instead of '5 10 15 20 25'
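
The range:increment expansion above can be illustrated with a minimal Python sketch. This is not the C++ `parse_numeric_range` implementation: the name `expand_numeric_range`, the accepted separators, and the default step of 1 for a plain range are assumptions made for illustration, and negative values are not handled.

```python
def expand_numeric_range(spec):
    """Expand tokens such as '5-25:10' into [5, 15, 25] (hypothetical helper)."""
    values = []
    for token in spec.replace(",", " ").split():
        # Optional ':increment' suffix; a step of 1 is assumed when it is absent.
        body, _, step_text = token.partition(":")
        step = int(step_text) if step_text else 1
        if step <= 0:
            raise ValueError(f"increment must be positive: {token!r}")
        first, sep, last = body.partition("-")
        if not sep:
            # Single value, not a range.
            values.append(int(first))
            continue
        # Inclusive upper bound, stepping by the increment.
        values.extend(range(int(first), int(last) + 1, step))
    return values


print(expand_numeric_range("5-25:10"))
```

Under these assumptions, `5-25:10` yields the three values 5, 15 and 25, matching the example in the commit message.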

* omnitrace-causal update

- end-to-end expands all speedups
- support range:increment in speedups

* causal backtrace updates

- remove select_ival (realtime signal always contributes when select_count == 0)

* containers: static_vector update

- explicit c_array constructor
- explicit std::array constructor

* causal data updates

- remove set_current_selection(uint64_t)
- remove set_current_selection(std::array)
- sample_selection increase default wait time
- report eligible PC candidates
- move reset_sample_selection to perform_experiment_impl
- decrease latest_eligible_pc array size
- set_current_selection does not guard for experiment::active

* core debug updates

- OMNITRACE_PRINT_COLOR macros

* causal data updates

- tweak to experiment never started message

* causal gotcha updates

- remove unused code

* critical trace updates

- remove unused code

* omnitrace-causal

- OMNITRACE_LAUNCHER

* causal data updates

- don't fail on end-to-end + omnitrace-causal

* causal backtrace updates

- reintroduce select_ival behavior

* causal data updates

- tweak verbose messages about number of PC candidates

* core mproc updates

- utilities for waiting on child PID and diagnosing status
  - omnitrace::mproc::wait_pid
  - omnitrace::mproc::diagnose_status

* omnitrace-run updates

- support --fork argument for executing via fork in current process + execvpe on child instead of execvpe in current process
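
The fork-then-exec pattern behind `--fork` can be sketched in Python as follows. This is an illustration of the general technique only, not the omnitrace-run source: `run_forked` is a hypothetical name, and mapping a terminating signal to `128 + signal` is a shell convention assumed here rather than documented behavior.

```python
import os
import sys


def run_forked(argv, env=None):
    """Fork, exec argv in the child, and return the child's exit code."""
    env = dict(os.environ if env is None else env)
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image; execvpe only returns on failure.
        try:
            os.execvpe(argv[0], argv, env)
        except OSError as err:
            print(f"exec failed: {err}", file=sys.stderr)
            os._exit(127)
    # Parent: wait for the child and translate its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return 128 + os.WTERMSIG(status)  # assumed convention
    return os.WEXITSTATUS(status)


code = run_forked([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Without the fork, the exec call would replace the calling process itself, so the launcher could not wait on the command or report its status.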

* omnitrace-causal updates

- wait_pid and diagnose_status just call equivalent functions in omnitrace::mproc

* ubuntu-focal workflow update

- attempt to launch ubuntu-focal-codecov job with CAP_SYS_ADMIN and use perf backend

* tests reorg and updates

- remove binary-rewrite-sampling and runtime-instrument-sampling tests
- rename *-preload tests (which use omnitrace-sample exe) to *-sampling
- split tests/CMakeLists.txt into several tests/omnitrace-<category>-tests.cmake files
- tweak to causal-both-omni-func test
  - add args: -n 2 -b timer

* update validate-causal-json.py

- better reasoning info for adjusting tolerance
- always apply tolerance adjustments in CI mode

* causal e2e tests update

- add "causal-e2e" label
- tweak params
  - old: 80 12 432525 500000000
  - new: 80 50 432525 100000000
- disable processor affinity for slow-func/line-100 tests
  - artificially inflates some speedups with perf

* unblocking_gotcha updates

- overload operator() according to gotcha function index

* blocking_gotcha updates

- overload operator() according to gotcha function index
- fix bug where post-block functors (e.g. for pthread_mutex_trylock) could throw an error if the lock is not acquired

* parse_numeric_range update

- support unordered_set

* config update

- OMNITRACE_DEBUG_{TIDS,PIDS} use parse_numeric_range

[ROCm/rocprofiler-systems commit: 9de3a6b0b4]
2023-04-13 02:14:35 -05:00
Jonathan R. Madsen 778e87c69c Causal Profiling GUI (#265)
* Long workload name layout fix (#269)

* changed layout to fit experiment names

- added span to show full name
- shortened name to fit dropdown
- changed layout for added consistency

* layout Fixes

- refresh button is to the right
- header is more consistent across different width screens

* header layout update

- center div wraps onto multiple lines if not all items fit

* slight improvement for header/graph spacing

* Fixed refresh button shape and function

- moved find_causal_files to parser so that main and gui can access
- resized refresh width to allow for same shape across different screens

* all graphs now have the same width

- graphs now have same width

- chart headers start well below the header with filters

---------

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

* Causal GUI: Linting, synced y-range, remove unused imports/variables, and bug fixes (#274)

* GUI: python linting workflow

- runs flake8 on code in source/python/gui

* GUI: flake8 settings in source/python/gui/setup.cfg

- ignore E501 errors (line too long)
- ignore W503 errors (line break before binary operator)

* GUI: setup.py updates

- remove unused imports

* GUI: __main__.py updates

- stddev is float value
- remove unused imports
- effectively propagate the --stddev argument

* GUI: gui.py updates

- remove unused imports
- fix light mode
- sync initial y range of all causal plots
- fix error bars for causal data
- set x-ticks to 5
- set y-ticks to 10
- only display top 99% of samples
- separate global declarations and assignments to global values
- remove unused variable assignments
- fix mislabeled function_regex and exp_regex
- change if X == False to if not X

* GUI: header.py updates

- remove unused imports
- fix mislabeled function_regex and exp_regex

* GUI: parser.py updates

- add set_num_stddev function for manipulating global num_stddev value
- remove unused variables
- fix latency point object (duplicated __init__ function)
- fix handling latency in JSON
- fix formatting of validation format error message
- replace if X == False with if not X
- fix unused dataframe creation in add_latency
- fix flake8 do not assign lambda for name_wo_ext (use def)

* GUI: gui.py updates

- replace misnamed "func_list" with "experiment_list"
- replace misnamed "exp_list" with "progpt_list"

* GUI: fix python workflow

- quote python versions to avoid truncating 3.10 to 3.1
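
The truncation mentioned above comes from an unquoted `3.10` being read as a number. The Python snippet below mimics that effect with `float`; it does not parse the workflow file, and the variable names are illustrative only.

```python
# An unquoted scalar such as 3.10 is read as a float, which drops the trailing zero.
unquoted = float("3.10")
# A quoted scalar stays a string, so the version is preserved.
quoted = "3.10"

print(unquoted)
print(quoted)
```

Here the float prints as `3.1`, which names a different Python release, while the string keeps the intended `3.10`.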

---------

Co-authored-by: JoseSantosAMD <87447437+JoseSantosAMD@users.noreply.github.com>

[ROCm/rocprofiler-systems commit: cc14b52584]
2023-04-11 23:36:24 -05:00
Jonathan R. Madsen 70e2e7e745 omnitrace-avail updates (#272)
* omnitrace-avail updates

- enables text wrapping for descriptions
- reworks the HW counters display layout
  - added new column "Device" which has either "CPU" or "GPU"
- support sorting HW counters alphabetically
- fixed some minor csv issues
- reorganize the order of the argparse arguments

* Fix tests

[ROCm/rocprofiler-systems commit: b39a683eab]
2023-03-30 15:06:29 -05:00
Jonathan R. Madsen bac3d632d0 rocprofiler_iterate_info workaround v2 (#271)
Revert agent_info->dev_index in hsa_rsrc_factory.cpp

- disable using the driver_node_id and, instead, start at zero
  because it breaks the rocprofiler_pool_fetch
  lookup in rocprofiler.cpp. On my system (one AMD GPU and
  one NVIDIA GPU), it has a value of 1, not 0 and the pool
  size is 1 -- resulting in segfault

[ROCm/rocprofiler-systems commit: ab8894082b]
2023-03-30 11:15:19 -05:00
Jonathan R. Madsen 70c8d1229c rocprofiler_iterate_info workaround + omnitrace-avail update (#270)
* rocprofiler_iterate_info workaround + omnitrace-avail update

- provides workaround for rocprofiler_iterate_info behavior change in ROCm 5.4.0-3
- update timemory submodule with argparse tweaks
- updates hsa_rsrc_factory.{hpp,cpp}
- colorized log in omnitrace-avail
- Bump version to 1.9.2

* Fix empty_base inheritance

- timemory's component::empty_base inherits from concepts::component so direct inheritance was removed

* Fix OMNITRACE_HIP_VERSION_COMPAT_STRING

- defined as "" when OMNITRACE_HIP_VERSION_MAJOR==0

* new defines + extra info

- define OMNITRACE_LIBRARY_ARCH (via CMAKE_LIBRARY_ARCHITECTURE)
- define OMNITRACE_SYSTEM_NAME (via CMAKE_SYSTEM_NAME)
- define OMNITRACE_SYSTEM_PROCESSOR (via CMAKE_SYSTEM_PROCESSOR)
- define OMNITRACE_SYSTEM_VERSION (via CMAKE_SYSTEM_VERSION)
- define OMNITRACE_COMPILER_ID (via CMAKE_CXX_COMPILER_ID)
- define OMNITRACE_COMPILER_VERSION (via CMAKE_CXX_COMPILER_VERSION)
- include this info in metadata
- include subset of this info in --version for bin tools
- tweak to perfetto verbose messages

[ROCm/rocprofiler-systems commit: 4ed5f3e67b]
2023-03-30 04:21:43 -05:00
Jonathan R. Madsen a1213480e0 Roctracer perfetto flow fixes (#267)
* testing label updates

- automatically add "gpu", "roctracer", "rocm-smi", and "rocprofiler" test labels when appropriate

* Bump version to v1.9.1

* roctracer and config updates

- fix perfetto::Flow
  - use roctracer correlation ID instead of critical trace correlation ID
- renamed ambiguous _cid, _parent_cid, _corr_id variables to _crit_cid, _parent_crit_cid, _roct_cid
- use atomic_{mutex,lock} instead of STL mutex/lock
- support for individual perfetto annotations for HIP API args
- OMNITRACE_PERFETTO_COMPACT_ROCTRACER_ANNOTATIONS option for controlling compact vs. individual perfetto annotations for HIP API args

* Update timemory submodule

- argparser updates
  - help prints to std::cout by default now
  - supports setting custom ostream

* cmake formatting

* config::get_setting_value updates

- config::get_setting_value returns std::optional instead of std::pair<bool, Tp>

[ROCm/rocprofiler-systems commit: 279a8e0952]
2023-03-23 01:13:12 -05:00
Jonathan R. Madsen b1f52afeaf rocprofiler and roctx updates (#261)
* Improve locating ROCP_METRICS

- moved common::path to common/path.hpp
- added more functionality to common::path
- common::path::exists now returns true for directories
- check for metrics.xml and gfx_metrics.xml before setting ROCP_METRICS
- throw error if path for ROCP_METRICS cannot be explicitly determined

* Fix roctxRangePop handling

- message is nullptr -> keep thread-local stack

[ROCm/rocprofiler-systems commit: 9eafb23602]
2023-03-22 00:49:14 -05:00
Jonathan R. Madsen 6a8c757822 fix omnitrace-install.py script (#258)
- handle blank lines in /etc/os-release
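
A minimal sketch of tolerant `/etc/os-release` parsing is shown below. It is not the code in omnitrace-install.py: `parse_os_release` is a hypothetical helper, and falling back from `ID_LIKE` to `ID` is an assumption about one reasonable way to handle distributions that omit `ID_LIKE`.

```python
def parse_os_release(text):
    """Parse os-release style KEY=VALUE text into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments instead of failing on them.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


sample = 'NAME="Fedora Linux"\n\nID=fedora\nVERSION_ID=37\n'
info = parse_os_release(sample)
# ID_LIKE may be absent, so fall back to ID (assumed behavior).
like = info.get("ID_LIKE", info["ID"])
```

The sample input contains an empty line between entries, which is the case a naive `line.split("=")` would trip over.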

[ROCm/rocprofiler-systems commit: e580bc9186]
2023-03-16 14:50:32 -05:00
Jonathan R. Madsen a8c505c91d omnitrace-run executable - required for running binary rewrites (#257)
* omnitrace-run exe

- ensure LD_PRELOAD for libomnitrace-dl.so
- convert config options into command-line options
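
Ensuring a library is present in `LD_PRELOAD` before launching a command can be sketched as below. This is an assumption-laden illustration, not the omnitrace-run logic: `ensure_ld_preload` is a hypothetical name, and placing the library first and using `:` as the separator are choices made for the example.

```python
def ensure_ld_preload(env, library):
    """Return a copy of env whose LD_PRELOAD contains library exactly once."""
    updated = dict(env)
    entries = [p for p in updated.get("LD_PRELOAD", "").split(":") if p]
    if library not in entries:
        # Put the library first so it is loaded ahead of other preloads (assumed).
        entries.insert(0, library)
    updated["LD_PRELOAD"] = ":".join(entries)
    return updated


env = ensure_ld_preload({"LD_PRELOAD": "libfoo.so"}, "libomnitrace-dl.so")
print(env["LD_PRELOAD"])
```

Returning a copy keeps the caller's environment untouched, so the modified mapping can be passed directly to an exec-style call.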

* Update timemory submodule

- updates to tsettings
- updates to argparser

* common environment update

- throw error if get_env<bool> has empty string
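
The stricter boolean handling can be pictured with a Python analogue. The real `get_env<bool>` is C++; `get_env_bool` and the accepted spellings below are assumptions chosen for illustration.

```python
import os


def get_env_bool(name, default):
    """Read a boolean environment variable (hypothetical Python analogue)."""
    raw = os.environ.get(name)
    if raw is None:
        # Unset variable: use the caller's default.
        return default
    text = raw.strip().lower()
    if not text:
        # A set-but-empty value is ambiguous, so reject it rather than guess.
        raise ValueError(f"environment variable {name} is set but empty")
    if text in ("1", "true", "on", "yes", "y", "t"):
        return True
    if text in ("0", "false", "off", "no", "n", "f"):
        return False
    raise ValueError(f"environment variable {name} has non-boolean value {raw!r}")
```

Distinguishing "unset" from "set but empty" surfaces mistakes such as `export VAR=` that would otherwise silently select a default.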

* config updates

- minor tweaks to categories of settings

* core lib update

- add argparse for common handling of argument parsers

* omnitrace-sample update

- fix handling of --trace-file (OMNITRACE_PERFETTO_FILE)

* omnitrace-run update

- updated to use omnitrace::argparse functions

* Tests for omnitrace-run

* argparse core update

- remove choices for --cpu-events and --gpu-events

* remove some debugging prints

* fix timemory include in argparse.cpp

* always provide --hsa-interrupt option

* Update source/lib/core/argparse.cpp

- fix pedantic warning

* Update testing

- remove testing args that may not be there in some builds

* roctracer/pthread_create fix

- disable roctracer_data when roctracer not enabled

* omnitrace-causal tweak

* omnitrace-instrument: module_function tweak

- allow DEFAULT_MODULE and LIBRARY_MODULE

* common environment update

- support get_env for enums

* core: config update

- Add "mode" category to OMNITRACE_MODE

* Update timemory submodule

- remove debug print statement

* omnitrace-sample tweak

- change var init

* omnitrace-run testing update

- use --help instead of -?

* core: common.hpp

- tweak header include style

* core: argparser update

- add_ld_preload func
- launcher and command member variables in parser_data
- support launcher

* omnitrace-run update

- cleaned up and reworked

* libomnitrace-dl updates

- require LD_PRELOAD with binary rewrite
- dl::InstrumentMode
- dl::get_instrumented()
- verify_instrumented_preloaded()
- omnitrace_set_instrumented(int)
- relocated omnitrace_main from main.c to dl.cpp
- omnitrace_set_env does not dlopen libomnitrace
- omnitrace_set_main(func_ptr) [internal API]
- OMNITRACE_HIDDEN_API -> OMNITRACE_INTERNAL_API

* Update testing to new LD_PRELOAD requirements

* omnitrace-instrument updates

- adhere to LD_PRELOAD requirements
- invoke omnitrace_set_instrumented
- binary rewrite does not instrument main
- binary rewrite does not instrument call to omnitrace_init
- runtime instr does not instrument main
- runtime instr does not instrument call to omnitrace_init

* Bump to v1.9.0

- LD_PRELOAD requirement necessitates minor version increment

* common: environment

- fix ambiguous get_env calls

* omnitrace-instrument update

- fix issue with temporaries

* omnitrace-instrument and libomnitrace-dl updates

- runtime instrumentation does not work if libomnitrace-dl is preloaded

* libomnitrace-dl and libpyomnitrace updates

- define dl::InstrumentMode in dl.hpp
- handle instrumentation via setprofile libpyomnitrace
  - do not push trace in omnitrace_init

* omnitrace-instrument and libomnitrace-dl updates

- move header to dl subdirectory
- omnitrace::omnitrace-headers include omnitrace-dl folder
- use InstrumentMode in omnitrace-instrument

* Update workflows and scripts

- Use omnitrace-run on instrumented exes

* Update docs

- add omnitrace-run to examples of running binary rewritten exes

[ROCm/rocprofiler-systems commit: abe35de43a]
2023-03-14 19:48:29 -05:00
Jonathan R. Madsen 61a050fd1d omnitrace -> omnitrace-instrument (#256)
* omnitrace-exe -> omnitrace-instrument

- Renamed omnitrace executable to omnitrace-instrument
- Provided dummy omnitrace exe which forwards onto omnitrace-instrument
- updated all docs to reflect the name change of the executable
  - however, it is possible some were missed

* Update dyninst submodule

- correctly handle BOOST_LINK_STATIC in DyninstBoost.cmake

* Disable IPO for omnitrace-instrument

[ROCm/rocprofiler-systems commit: ab0e5d9b44]
2023-03-09 18:18:34 -06:00