30 Commit

Autore SHA1 Messaggio Data
Sajina PK 15c82d6da8 [rocprofiler-system]: Enable UCX Communication API tracing (#2306)
## Motivation

Enable UCX communication tracing and communication metadata 

## Technical Details

Implement UCX API wrappers to trace transport-layer communication. This adds communication data tracking and exposes “UCX Comm Send/Recv” timelines, enabling detailed analysis of MPI, OpenSHMEM, and other UCX-based runtime communication patterns.

- Implements function interception for UCX functions across multiple categories using gotcha component.
- Extended comm_data component to track UCX send/recv operations - Added ucx_send and ucx_recv labels for Perfetto counter tracks. Integrated UCX data tracking with existing MPI/RCCL tracking infrastructure.
- Added ROCPROFSYS_USE_UCX configuration option (enabled by default).
- Created FindUCX.cmake module for UCX header detection. Falls back to internal UCX headers if system headers not found.
- Updated all Dockerfiles  to include UCX dependencies.
2026-01-20 13:16:43 -05:00
David Galiffi 3883bd3e93 Support for TheRock builds (#1545)
* Cleaning up some BUILD_<dep> config variables

The `ROCPROFSYS_BUILD_<dep>` settings were being translated to `BUILD_<dep>` for the old Dyninst dependencies.
Remove this extra layer
Add `rocprofiler_systems_add_option` for the `ROCPROFSYS_BUILD_<dep>` options, so there is a better description in the in the CMakeCache.

* Changes to support USE_ROCM in TheRock builds

* Removed `amd-smi::roctx` from Findamd-smi.cmake

* Fix linking error on rocm-6.4 when including amd_smi

* Format cmake

* Fix typo in logs

* Removing Findamd-smi.cmake

* Refactor the cmake parameters for `amd-smi`.

The `drm` libraries were only required ba amdsmi for rocm-6.4.0. There was no point adding them for other versions.
2025-11-10 14:38:51 -05:00
Aleksandar Djordjevic f39a60ac25 [rocprofiler-systems] Apply new CMake formatting for the latest gersemi version (#1778)
* Fix cmake formatting

* Updated rev. in `.pre-commit-config.yaml`

* Pin the gersemi used in CI to v0.23.1, matching the pre-commit

---------

Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-10 13:08:44 -05:00
David Galiffi 847580dd9e Update minimum_cmake_required to match version used in CI (#679)
- Update minimum_cmake_required to match version used in CI
  - We should match the minimum version that we test against

- Ensure ".S" files are treated as assembly.
2025-08-21 15:56:47 -04:00
David Galiffi 8fcf3a50b0 Use gersemi for CMake formatting (#257)
* Replace `cmake-format` with `gersemi`

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Remove .cmake-format.yaml

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update workflow to use gersemi

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING.md

* Update helper scripts

* Don't include `*/external/*` in workflows

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 122623a929]
2025-06-22 10:44:33 -04:00
Pranjal Swarup 9d94e11e2b Update dyninst to v13 (#190)
Update Dyninst submodule
Refactoring of build scripts to build TBB, Boost, ElfUtils, and LibIberty, since Dyninst build scripts no longer do.
Workflows are now building Dyninst and its dependencies.

---------

Co-authored-by: marantic-amd <marantic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 96df9b6d3e]
2025-06-06 22:52:23 -04:00
David Galiffi c7c3c3f97e Use rocprofiler-sdk for RCCL-API tracing (#126)
- Add support for RCCL API tracing through rocprofiler-sdk.
- Refactored the comm_data code to use the SDK RCCL_API callbacks.
- Add a runtime version check for SDK to gate callback enablement, rather than just the compile-time check.
- Fixed: SAMPLING_TIMEOUT was not being handled correctly in add_test.

[ROCm/rocprofiler-systems commit: af77d93f75]
2025-06-06 11:36:17 -04:00
David Galiffi bd0eeb9555 Reapply "Upgrade ROCm-SMI to AMD SMI (#86)" (#147)
* Reapply "Upgrade ROCm-SMI to AMD SMI (#86)"

This reverts commit 9fcea73122.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 85bbea4954]
2025-03-25 17:31:27 -04:00
Sajina PK fba64f8acd Add support for VA-API and rocDecode tracing (#92)
- VA API tracing using Timemory gotcha wrappers.
- rocDecode API tracing integration using callback to ROCPROFILER_CALLBACK_TRACING_ROCDECODE_API
- Updated videodecode ctest to validate rocDecode APIs in perfetto trace. 

[ROCm/rocprofiler-systems commit: 697d1ac02f]
2025-02-11 13:08:23 -05:00
David Galiffi 2c9d92be33 Remove remaining roctracer references (#82)
[ROCm/rocprofiler-systems commit: e437200e9e]
2025-02-07 23:27:58 -05:00
David Galiffi 9fcea73122 Revert "Upgrade ROCm-SMI to AMD SMI (#86)" (#100)
This reverts commit 8c5db3f1d8.

[ROCm/rocprofiler-systems commit: b3eee295dd]
2025-02-07 11:45:26 -05:00
cfallows-amd 8c5db3f1d8 Upgrade ROCm-SMI to AMD SMI (#86)
* Integrating amd-smi into rocprofiler-systems due to rocm-smi deprecation.
* No functionality changes to users other than naming conventions.
* New tracks available in perfetto- gpu busy percentage metrics now splits gfx busy into separate gfx, umc, and mm engine measurements.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 0c32dfd6bc]
2025-01-30 21:32:07 -05:00
David Galiffi 489eda995d Rename Omnitrace to ROCm Systems Profiler (#4)
The Omnitrace program is being renamed. 

Full name: "ROCm Systems Profiler"
Package name: "rocprofiler-systems"
Binary / Library names: "rocprof-sys-*"

---------
Co-authored-by: Xuan Chen <xuchen@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: d07bf508a9]
2024-10-15 11:20:40 -04:00
Jonathan R. Madsen 7499945cb9 Reduce release packaging (#300)
* Bump version to 1.10.3

* Drop releases for ROCm < 5.3

- ROCm is no longer providing release for Ubuntu 18.04 starting with 5.3 so omnitrace is dropping support for Ubuntu 18.04 + ROCm
- Dropping ROCm 5.2 releases for Ubuntu 20.04
- Dropping ROCm 5.2 releases for OpenSUSE 15.4

* Update redhat workflow

- Test RedHat 9.1 + ROCm 5.5
- Test RedHat 9.1 + ROCm 5.6

* Update ubuntu-focal workflow

- drop ROCm 5.2 testing
- add ROCm 5.6 testing

* Update Findroctracer.cmake

- provide /opt/amdgpu to HINTS/PATHS for drm and drm_amdgpu libraries

* Update Findrocprofiler.cmake

- prefer librocprofiler64.so.1

* Update librocprofiler64.so to librocprofiler64.so.1

- search for the SOVERSION 1 library of librocprofiler64.so if ROCm > 5.5.0

* Update Findrocprofiler.cmake

- link to libpciaccess for ROCm 5.5.0

* Update redhat CI workflow

- install libpciaccess for rocm CI

* Update cpack workflow

- Remove all RHEL 9.0 packaging
- Remove all packaging for ROCm 5.3 on OSes supporting where releases are provided for 5.4, 5.5, and 5.6

* Update ubuntu focal workflow

- remove rocm 5.3 jobs

[ROCm/rocprofiler-systems commit: 1216fd99a7]
2023-09-12 20:19:26 -05:00
Jonathan R. Madsen 5a1cec92e8 Various optimizations (#192)
* CDash name prefix {{ repo_owner }}-{{ ref_name }}

- remove /merge from CI name

* disable using BFD when sampling_include_inlines is OFF

- this consumes a lot of memory

* Improve finalization of rocprofiler

* update timemory submodule

- disable OMPT thread begin/end callbacks
- support hierarchies in signal handlers
- update operation::pop_node debugging
- settings_update_type + setting_supported_data_types
- fixed parsing args in timemory_init

* Improve timemory build time

* Remove kokkosp restrictions for perfetto

* omnitrace exe signal handler update

- configure signal handlers before main to allow libomnitrace to override

* Backtrace and timemory submodule updates

- Use unwind::cache w/o inline info
- update timemory submodule
  - unwind::cache updates
  - filepath updates
  - fix termination_signal_message
  - fix vsettings::report_change

* Update dyninst submodule

- updates BinaryEdit::getResolvedLibraryPath

* update timemory submodule

- update CpuArch support

* Cleanup configure warnings

* Update examples cmake and workflows

- (Mostly) eliminate configuration warnings

* omnitrace exe updates

- pass environ to BPatch::processCreate
- avoid trailing ":" in DYNINST_REWRITER_PATHS

* Update dyninst submodule

- Add flags to DyninstOptimization.cmake
- Remove strtok from BinaryEdit::getResolvedLibraryPath

* examples/mpi CMakeLists.txt update

- STATUS message about missing MPI during CI, otherwise AUTHOR_WARNING

* Dev build and linker flags

- use -gsplit-dwarf when OMNITRACE_BUILD_DEVELOPER is ON
  - disable when OMNITRACE_BUILD_NUMBER > 1
- OMNITRACE_BUILD_LINKER option
- add -fuse-ld=${OMNITRACE_BUILD_LINKER}
- omnitrace_add_cache_option function

* Update workflows to set OMNITRACE_BUILD_NUMBER

* Fix generator expressions for -fuse-ld

* Suppress some configuration warnings during CI

- helps to keep track of real warnings when they arise

* Update timemory and dyninst submodules with CMP0135

* Add -V flag to run-ci script

[ROCm/rocprofiler-systems commit: f147670a7a]
2022-11-01 17:28:12 -05:00
Jonathan R. Madsen 6ec1c639bd Fix spack builds when no ROCm .info/version files (#170)
- closes #166

[ROCm/rocprofiler-systems commit: 19edd0d421]
2022-09-27 07:40:02 -05:00
Jonathan R. Madsen 62fdf2720f Fix building w/ hip, etc. but w/o rocprofiler (#159)
- Fixes builds with `OMNITRACE_USE_ROCPROFILER=OFF` but
`OMNITRACE_USE_ROCTRACER=ON`
- Move rocprofiler/hsa_rsrc_factory.{hpp,cpp}` to rocm folder

[ROCm/rocprofiler-systems commit: 472e96a084]
2022-09-14 00:07:33 -05:00
Jonathan R. Madsen 6227c25220 Fix setup-env + hsa/rocm/ompt serialization + testing + misc (#156)
- Fix setup-env.sh
  - Closes #149 
- omnitrace exe color
- test-install.sh script
- if config variable is updated in config or env, include in generated
config
- metadata for hsa, rocm, and ompt
  - Closes #153 
  - Closes #154

[ROCm/rocprofiler-systems commit: 15e6e6d979]
2022-09-13 08:11:48 -05:00
Jonathan R. Madsen 9defbdcd4d RPATH to rocprofiler_LIBRARY_DIR for ROCm < v5.2 (#126)
* RPATH to rocprofiler_LIBRARY_DIR for ROCm < v5.2

- until v5.2 only librocprofiler64.so was symlinked in /opt/rocm. Thus linker using SOVERSION caused issues finding librocprofiler64.so.1

* Test ROCm w/ CMAKE_INSTALL_RPATH_USE_LINK_PATH=OFF

* INSTALL_RPATH_USE_LINK_PATH for omnitrace exe

[ROCm/rocprofiler-systems commit: b79ce10fee]
2022-08-02 14:19:04 -05:00
Jonathan R. Madsen 93343034bc RCCL support (#93)
* Initial support for RCCL

* OMNITRACE_USE_RCCLP + sampling tweaks

- also OMNITRACE_SAMPLING_KEEP_INTERNAL option
- minor modifications to sampling to use keep internal option + discard funlockfile

* Update docker and workflows to download RCCL

* Update CPack DEB with rocprofiler dependency

* Rework rccl into library and library/components folder

- add tpls/rccl/rccl/rccl.h

* Fix timemory includes

* rcclp inline definitions when disabled

* Tweaks to ubuntu-focal-external-rocm

- disable ompt
- enable building testing

* Tweaks to ubuntu-focal-external-rocm

- ctest exclude

* Tweak ubuntu-focal.yml

- remove source /.../setup-env.sh, replace with $GITHUB_ENV

* Fix ubuntu-focal-rocm + OMPI + root

* Improved rocm-smi error handling

- Recover from rocm-smi errors
- Disabling rocm-smi after recovering from errors
- Werror in developer mode
- Remove State::DelayedInit
- Add State::Disabled

* formatting

* Fix merge of OMNITRACE_SAMPLING_KEEP_INTERNAL

* Update RCCL include directory

- based on ROCm version we need with <rccl/rccl.h> or <rccl.h>

* RCCL Testing

- updated tests to use configuration files
- many tests generate a configuration file
- tests how have GPU option
- enable ncclCommCount, disable ncclGetVersion
- add testing for RCCLP via rccl-tests
- working directory of tests is PROJECT_BINARY_DIR
- add nccl/rccl functions to get_whole_function_names
- some clang compiler fixes

* Handle RCCL include w/o HIP

* RCCL requires HIP

* Update OMNITRACE_SAMPLING_CPUS for testing

* Update tests/CMakeLists.txt

* Debug settings

* Install MPI even when USE_MPI=OFF

* exclude printf

* skip mpi tests w/o USE_MPI or USE_MPI_HEADERS

* update ubuntu rocm workflow

* Fix configure env step for ubuntu rocm

[ROCm/rocprofiler-systems commit: 45be03906a]
2022-07-25 12:16:11 -05:00
Jonathan R. Madsen 17cf00c51d fix omnitrace print-* with libraries (#94)
* fix omnitrace print-* with libraries

* timemory submodule update

* Update workflows to use ./bin/omnitrace instead of ./omnitrace

* cmake format

* update timemory submodule

- fix ODR violations in utility/procfs

* cmake updates

- uniform find_package for all ROCm-based libraries

* tweak transpose example

- throw exception instead of std::exit

* Inspect cmdv name before assuming not exe

- some ELF execs "think" they are libraries so only assume rewrite + simulate + all-functions if filename looks like library
- adds some test for --print-available -- <library>

* Fix _has_lib_prefix when command is < 3

* Updates and reverts to omnitrace exe

- update module_function operator< and operator==
- add function_signature operator<
- refactor module_function ctor
- revert some previous changes w.r.t. simulate and include_unninstr

* Fix source/bin/tests to use same output dir as tests

* cmake format

* Segfault mitigation + refactor + modify function iteration

- refactor module_function ctor to avoid segfaults
- string_t -> std::string
- replace std::string with std::string_view in some places
- get_name(module_t*)
- get_name(procedure_t*)
- disable using both app_modules and app_functions
- new option: --parse-all-modules to iterate over app_modules
- removed some unused code w.r.t. debug info

* Disable module_function address range for uninstrumentable functions

* Disable module_function address range for uninstrumentable functions

* Refactored getting file/line info and init/fini

- use dyninst insertInitCallback and insertFiniCallback if main not found
- fixed all issues with segmentation faults in --simulate --all-functions

* revert changes to Findrocprofiler.cmake

[ROCm/rocprofiler-systems commit: d04cbe862e]
2022-07-21 01:15:41 -05:00
Jonathan R. Madsen 39adaaef98 Unified setup_environ b/t libomni and libomni-dl (#86)
- omnitrace::common::setup_environ provides a standardized exporting of relevant env variables

[ROCm/rocprofiler-systems commit: a2dcc7381b]
2022-07-18 01:52:21 -05:00
Jonathan R. Madsen 7d1989f5a4 GPU HW Counters via rocprofiler (#84)
* Initial support for GPU hardware counters

* Update find modules for roctracer and rocprofiler

- /opt/rocm/{rocprofiler,roctracer} path is deprecated so tweak search procedure

* Improve ConfigCPack for MPI

* Update rocprofiler

- rocm_metrics()
- minor cleanup

* Update rocm find modules

* declare rocm_metrics + call in omnitrace-avail

* relocate omnitrace-launch-compiler

* REALPATH and find_modules

* Examples cmake (may drop)

* omnitrace-avail

- hw_counter categories
- init rocm

* setenv updates for rocprofiler in library.cpp and dl.cpp

* get_rocm_events config

* gpu::hip_device_count()

* rocm_metrics returns hardware_counters::info

* - relocated library/components/roctracer_callbacks.* to library/roctracer.*
- relocated library/components/rocprofiler.* to library/rocprofiler.*
- cleaned up rocprofiler.hpp
- added perfetto output of rocprofiler
- added timemory output of rocprofiler
- renamed omni.roctracer thread to roctracer.hip
- added roctracer.hsa thread name
- updated timemory submodule to support std::variant
- updated timemory submodule to support = in config value
- updated timemory submodule to support standalone storage
- updated timemory submodule to support new hw counter apis
- updated timemory submodule to prevent label/description caching in data_tracker

* update omnitrace-avail info_type generation

* Update timemory submodule

* rocprofiler component

* cmake formatting

* omnitrace-avail handle no GPUs

- Add -c command-line option for --categories
- support verbosity

* hsa_rsrc_factory throws exceptions

- throw exceptions to avoid aborting on HSA_STATUS_ERROR_NOT_INITIALIZED when advantageous
- removed duplicate specialization of is_available for component::rocprofiler

* rocprofiler symbols for when disabled

* Fix warning in omnitrace-avail

- std::stringstream from initializer list would use explicit constructor

* Fix finalization after settings are deleted

* Reorganized rocprofiler source

* Updated formatting

* Miscellaneous tweaks

- added using statements from timemory
- tweaked the main and thread bundle names
- fixed timemory header includes

[ROCm/rocprofiler-systems commit: 4208b5654c]
2022-07-17 21:52:09 -05:00
Jonathan R. Madsen 8cc87ca6b8 MPI headers + mutex gotcha + roctracer + kokkosp (#11)
* MPI headers, mutex gotcha + roctracer + kokkosp

- relocate internal MPI headers
- pthread_barrier in parallel-overhead
- doc fixes to DYNINST options
- minor tweaks to dynamic_library
- dlopen libamdhip64.so
- scoped thread state in kokkos
- extended pthread_mutex_gotcha

* Fix for unused-but-set-variables

[ROCm/rocprofiler-systems commit: 424a3593e7]
2022-05-30 18:25:12 -05:00
Jonathan R. Madsen 938aaef082 Updates installation docs, cmake updates, internal OpenMPI header (#10)
* Updates installation docs + minor cmake tweaks

- OMNITRACE_BUILD_LIBUNWIND option
- Locally set OMNITRACE_USE_HIP=OFF if roctracer and rocm-smi are off
- Force TIMEMORY_BUILD_GOTCHA to avoid bug in gotcha not patched upstream

* MPI-Headers

- include copy of mpi.h from OpenMPI
- reworked FindMPI-Headers to support the internal OpenMPI headers

[ROCm/rocprofiler-systems commit: d5995846df]
2022-05-26 10:03:31 -05:00
Jonathan R. Madsen 127e30a4d7 Documentation + Miscellaneous Fixes (#36)
* Added documentation markdown source

* Replaced AARInternal with AMDResearch in URLs

* Renamed cpack artifact names

* Fix to testing and lulesh submodule checkout

* Docker updates

* CMake and CPack

- force CMAKE_INSTALL_LIBDIR to lib
- CPACK_DEBIAN_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- CPACK_RPM_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- Tweak LIBOMP_LIBRARY find in examples/openmp
- Tweak setup-env.sh.in

* Partial update of README

- status badges
- docs link
- removed install info (covered by docs)

* OMNITRACE_SAMPLING_CPUS setting

- enables control over which CPUs are sampled for frequency

* omnitrace exe updates

- exclude transaction clone, virtual thunk, non-virtual thunk
- module_function::start_address
- module_function::instructions
- verbosity > 0 encodes instructions into JSON

* Miscellaneous fixes

- relocate setup-env.sh.in
- add modulefile.in
- Updated README.md and source/docs/about.md
- cmake fix for libomp
- fix license in miscellaneous places
- dl.hpp and dl.cpp

* Update timemory and dyninst submodules

- timemory signals updates
- dyninst Movement-adhoc updates

* cmake format

[ROCm/rocprofiler-systems commit: 945f541965]
2022-04-04 15:27:38 -05:00
Jonathan R. Madsen 4ddb8405ac cpack workflow for building installers (#35)
* cpack workflow for building installers

- ConfigCPack.cmake update
  - STGZ and DEB + containers + test artifact
  - DEBIAN_FRONTEND + set -v
  - submodule fix
  - actions checkout
- OMNITRACE_ROCM_VERSION + continue-on-error
- Change CPack generators + fix path to DEB
- separate configure, build, and package steps
- use cd instead of pushd
- FindROCmVersion + fix to cpack testing
- use ${ROCM_PATH}/.info/version for ROCm version info
- Tweaks for debian installer
- Packaging fixes
- Use CMAKE_SHARED_LIBRARY_SUFFIX instead of .so
- Split cpack.yml into 4 workflows
- Replace source with export in cpack
- Dyninst boost uses tar.gz instead of zip on Unix

* Fix to common join

* Update VERSION to 1.0.0

[ROCm/rocprofiler-systems commit: 5c4d5c394f]
2022-03-27 22:52:36 -05:00
Jonathan R. Madsen 4ae26e2d08 rocm-smi and KokkosTools support (#23)
* renamed omnitrace_thread_data to thread_data

* initial implementation

* Numerous fixes and updates

- Updated timemory submodule
- Updated perfetto submodule (pulls in fixes for TRACE_EVENT)
- pthread_gotcha only after omnitrace_init_tooling
- omnitrace banner
- config settings for rocm-smi freq and devices
- critical_trace::get_entries
- OMNITRACE_BASIC_PRINT
- rocm_smi perfetto category
- redirect roctracer warnings for ROCm 4.5.0
- property specializations for rocm-smi components
- units fixes data_tracker types
- roctracer entries for pthread_create and start_thread
- omnitrace-avail defaults to settings, not components
- settings have conforming names
- settings warn about duplicates
- ptl named threads
- decreased max freq for sampler SIGALRM
- rocm-smi names thread
- rocm-smi avoids call to hipGetDeviceCount
- name roctracer activity callback threads
- fixed binary rewrite test output names

* Update lulesh example

- supports non-UVM GPU

* Lulesh tweaks + formatting

* KokkosP + Mode + Roctracer sampling deadlock fix

- kokkosp support
- omnitrace_init_library
- config::print_settings()
- config::get_mode()
- omnitrace::Mode
- omnitrace-avail improvements (removes settings)
- handle get_verbose() < 0
- disable dyninst InstrStackFrames by default
- handle perf_event_paranoid > 1 by disabling PAPI
- SIGALRM max freq to 5.0
- Name threads
- rocm-smi handles get_use_perfetto() and get_use_timemory()
- HSA_ENABLE_INTERRUPT=0 when roctracer + sampling (fixes deadlock)

* Tests, API renaming, roctracer

- disable renaming of thread 0
- verbprintf_bare
- enable dyninst merge tramp
- tweaked some omnitrace exe verbose levels
- reworked roctracer::setup and roctracer::shutdown
- rocm_smi::data::poll checks get_state()
- omnitrace_trace_finalize -> omnitrace_finalize
- omnitrace_trace_init -> omnitrace_init
- omnitrace_trace_set_env -> omnitrace_set_env
- omnitrace_trace_set_mpi -> omnitrace_set_mpi
- sampling mode does not disable timemory
- disable roctracer before shutting down rocm-smi
- lulesh tests w/ and w/o kokkosp
- lulesh tests for perfetto only
    - with --dynamic-callsites --traps --allow-overlapping
- lulesh tests for timemory only
    - with --stdlib --dynamic-callsites --traps --allow-overlapping

* Update timemory submodule

- fix for TIMEMORY_PROPERTY_SPECIALIZATION

* get_verbose() handling + timemory submodule update

- Findroctracer.cmake uses find_package(hsakmt)

* Stability fixes + rework roctracer + perfetto

- reworked roctracer start up
- critical_trace perfetto basic values
- perfetto sampling category
- sampler checks signals
- peak_rss in sampling
- pthread_gotcha::shutdown()
- rocm_smi::device_count()
- HSA_TOOLS_LIB is set
- HSA_ENABLE_INTERRUPT in omnitrace exe
- omnitrace exe verbosity level changes
- Avoid instrumenting Impl ns in Kokkos
- gpu::device_count prefers rocm_smi instead of hip
- ptl blocks signals
- fixed pthread_gotcha roctracer_data values
- removed runtime-instrument-sampling tests
- timemory submodule update

* cmake formatting

* timemory + roctracer updates

- fix timemory issue with papi_common
- fix timemory issue with units
- define roctracer::is_setup()

* Miscellaneous tweaks

- Disable sampling during runtime instrument
- Fixed warnings about dynamic callsites
- Fixed backtrace output when timemory disabled
- Test tweaks

* cmake-format

* omnitrace_target_compile_definitions

* timemory submodule update

* config, omnitrace, State, mpi_gotcha updates

- use OMNITRACE_THROW instead of direct throw
- is_attached()
- is_binary_rewrite()
- get_is_continuous_integration()
- get_debug_init()
- get_debug_finalize()
- max_thread_bookmarks default to 1
- State::Init
- app_thread oneTimeCode
- runtime instrumentation uses waitpid
- fixed init_names
- include main in MPI runs
- fixed sampling setup when disabled
- reworked mpi_gotcha
- disabled critical trace in transpose test

* cmake-format

* handle rocm_smi::device_count() exception

* CI timeouts

* Re-enable runtime-instrument + sampling

[ROCm/rocprofiler-systems commit: 39f17ae8b8]
2022-02-08 17:42:17 -06:00
Jonathan R. Madsen efb6d766af Reorganization and critical trace support (#17)
* Roctracer wall clock integration (#16)

* Integrates roctracer values into wall-clock

* Fixed scoping + timemory roctracer

* Fixed data race in roctracer

* Synchronized HIP API on main thread

- Cache hip activity callbacks and execute on main thread
- Minor updates to transpose

* Debugging + MPI + transpose updates

* PTL + HSA and timemory + kernel timing

- PTL usage fixed HSA + timemory issues bc we could control the thread destruction
- Fixed laps counting in roctracer callbacks

* Ignore select HIP API types

- The ignored API types are ignored because there appears to be a bug
  which causes the "end" callback to be labeled as begin
- hipDeviceEnablePeerAccess
- hipImportExternalMemory
- hipDestroyExternalMemory

* Tweaks to PTL config

* Timemory update + pid-prefix w/ mpi headers

- %pid%- prefix with mpi headers
- timemory submodule update

* CMake + critical trace + reorganize library source

- clang-tidy tweaks
- cmake function updates to use hosttrace_ prefix
- update gitignore
- cmake HOSTTRACE_MAX_THREADS option
- Formatting.cmake
- cleaned up MacroUtilities.cmake
- PTL submodule + usage
- tweak to Findroctracer.cmake
- MT transpose
- Updated PTL submodule
- Updated timemory submodule
- fix to hosttrace return value type if type not found
- reorganized library source code
- support for critical trace

* Remove bits/stdint-uintn.h headers

* Rename + config + depth + critical path

- rename hosttrace_timemory_data to instrumentation_bundles
- rename hosttrace_bundle_t to main_bundle_t
- rename bundle_t to instrumentation_bundle_t
- rework of configuration setup
- critical_trace write directly to file option
- tweaked depth calculation
- updated timemory submodule
- improved parallel support in roctracer callbacks
- working critical_trace
- perfetto device-critical-trace and host-critical-trace categories
- made transpose example parallel
- made parallel-overhead example a bit uneven
- relocated LTO activation

* Fixed duplicates in perfetto critical-trace

* reworked critical trace support

- substantial perf improvement (30-45 min -> 30 sec)
- changes to configuration (new and removed options)

* Removed "%pid%-" output prefix in mpi_gotcha

* Update timemory submodule

[ROCm/rocprofiler-systems commit: 752424efc2]
2021-11-23 02:53:14 -06:00
Jonathan R. Madsen cdd2707058 v0.0.3: MPI wrappers w/o full MPI support + setup-env.sh (#15)
* v0.0.3: MPI wrappers w/o full MPI support + setup-env.sh

- bumped to v0.0.3
- enabled gotcha wrapping of MPI functions w/o enabling MPI
- added setup-env.sh script
- minor updates to testing
- Update timemory submodule
- fixes tim::component::configure_mpip undefined symbol
- Script updates

[ROCm/rocprofiler-systems commit: c56b49a0bd]
2021-10-01 16:46:03 -05:00