Commit Graph

15 Commits

Author SHA1 Message Date
marantic-amd 956a73c4c8 [rocprof-sys] Use fmt APIs to construct strings instead of JOIN (#2643)
## Motivation

With the introduction of the new logging system based on the `spdlog` library, there is an opportunity to replace the `timemory`-dependent JOIN implementation with the `fmt` library's `format` and `join` APIs, which ship as part of `spdlog`.

## Technical Details

Use the APIs provided by `fmt` to format and assemble strings.
2026-01-23 00:34:58 -05:00
Milan Radosavljevic 318d13870f [rocprofiler-systems] Update logging to use spdlog library (#2428)
## Motivation

- Structured logging with proper log levels (TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL)
- Better performance through compile-time formatting
- Consistent formatting using fmt library
- Runtime log level control via arguments and environment variables
- Easier maintenance and debugging capabilities

## Technical Details

- Added spdlog as a submodule and integrated it into CMake build system
- Created new `rocprofiler-systems-logger` library wrapping spdlog functionality
- Replaced custom logging macros (`ROCPROFSYS_VERBOSE`, `ROCPROFSYS_DEBUG`, `ROCPROFSYS_FATAL`, `ROCPROFSYS_REQUIRE`, `ROCPROFSYS_CI_THROW`, etc.) with spdlog equivalents (`LOG_DEBUG`, `LOG_WARNING`, `LOG_CRITICAL`, etc.)
- Implemented log level control through command-line arguments and environment variables
- Converted assertion macros to proper error handling with exceptions and std::abort()
2026-01-14 15:27:51 -05:00
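The runtime log level control described in this commit can be pictured with a small standard-library sketch. It is not the project's implementation: the `log_level` enum, both function names, the `EXAMPLE_LOG_LEVEL` variable, and the precedence order (command-line value first, then environment, then a default) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical enum mirroring the six levels named in the commit message.
enum class log_level { trace, debug, info, warning, error, critical };

// Parse a case-insensitive level name; unknown names yield std::nullopt.
inline std::optional<log_level> parse_log_level(std::string_view name)
{
    std::string lowered(name);
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if(lowered == "trace") return log_level::trace;
    if(lowered == "debug") return log_level::debug;
    if(lowered == "info") return log_level::info;
    if(lowered == "warning") return log_level::warning;
    if(lowered == "error") return log_level::error;
    if(lowered == "critical") return log_level::critical;
    return std::nullopt;
}

// Assumed precedence: command-line value, then environment variable, then fallback.
// "EXAMPLE_LOG_LEVEL" is a placeholder name, not a documented setting.
inline log_level resolve_log_level(std::optional<std::string_view> cli_value,
                                   const char* env_name = "EXAMPLE_LOG_LEVEL",
                                   log_level   fallback = log_level::info)
{
    assert(env_name != nullptr);
    if(cli_value)
    {
        if(auto lvl = parse_log_level(*cli_value)) return *lvl;
    }
    if(const char* env = std::getenv(env_name))
    {
        if(auto lvl = parse_log_level(env)) return *lvl;
    }
    return fallback;
}
```

A caller would pass the parsed command-line option (or `std::nullopt` when it is absent) and hand the result to whatever logger backend is in use.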
marantic-amd ba1380a75d Put cached perfetto traces as default one (#2138)
* Put cached perfetto traces as default one

* Improve cached data and perfetto traces in order to be more aligned with E2E tests

* Addressing PR comments and findings

* Force early instrumentation bundle instantiation

* Sync up instrumented containers with thread growth data

* Revert ompvv number of host threads to default 8

* Fixed counter track namings for amd-smi

* AIPROFSYST-34 [rocprof-sys] Update documentation describing newly introduced changes to default tracing mechanism
2025-12-22 12:47:35 +01:00
marantic-amd 3b11e01716 Perfetto traces from cached data (#1704)
## Motivation

The idea is to unify how and where we store our traces. The current implementation uses `trace_cache` for rocpd traces, but perfetto handling is inlined in each module. This change gives us a single place in the code where we collect data, process it, and store it in the desired format. That lets us declutter the code further and have a single point of responsibility and a single point of failure.

## Technical Details

A new `processor` (perfetto_post_processing.cpp) is added to the `trace_cache`; its purpose is to use the cached data to populate perfetto tracks. The cache manager owns the instance of this processor and manages its lifetime.
2025-12-01 09:59:16 -05:00
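The "single point of collection, pluggable output" design described in this commit might look roughly like the sketch below. Every type here (`cached_sample`, `processor`, `counting_processor`, `cache_manager`) is hypothetical and invented for illustration; the commit does not describe the real `trace_cache` interfaces, and the stand-in processor only counts samples instead of emitting perfetto tracks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cached record; the real cache layout is not shown in the commit.
struct cached_sample
{
    std::string   track;
    std::uint64_t timestamp_ns;
    double        value;
};

// Interface that each output format would implement.
struct processor
{
    virtual ~processor()                             = default;
    virtual void handle(const cached_sample& sample) = 0;
};

// Stand-in for a post-processor: counts samples instead of writing a trace.
struct counting_processor : processor
{
    explicit counting_processor(std::size_t& counter)
    : count(counter)
    {}
    void handle(const cached_sample&) override { ++count; }
    std::size_t& count;
};

// Owns the cached data and the processors, and therefore their lifetime.
class cache_manager
{
public:
    void add_processor(std::unique_ptr<processor> p)
    {
        assert(p != nullptr);
        m_processors.push_back(std::move(p));
    }

    void store(cached_sample sample) { m_cache.push_back(std::move(sample)); }

    // Replay every cached sample through every registered processor.
    void post_process() const
    {
        for(const auto& sample : m_cache)
            for(const auto& p : m_processors)
                p->handle(sample);
    }

private:
    std::vector<cached_sample>              m_cache;
    std::vector<std::unique_ptr<processor>> m_processors;
};
```

With this shape, adding another output format means registering one more `processor` rather than touching each module that produces data.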
Milan Radosavljevic 00faa48ac2 Add flushing of perfetto buffer (#1417)
- Add flushing of perfetto buffer
- Add `ROCPROFSYS_PERFETTO_FLUSH_PERIOD_MS` config setting.
- Update CHANGELOG.sh
- Resolves SWDEV-518817

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-10-17 09:30:29 -04:00
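A period-based flush such as the one controlled by `ROCPROFSYS_PERFETTO_FLUSH_PERIOD_MS` can be sketched with a simple timer that the tracing loop polls. This is only an illustration: the `flush_timer` class and `flush_period_from_env` helper are hypothetical, and reading the setting from the environment is an assumption, since the commit only states that a config setting was added.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>

using steady_clock = std::chrono::steady_clock;

// Read a positive millisecond period from the environment.
// Falls back when the variable is unset, not a number, or not positive.
inline std::chrono::milliseconds flush_period_from_env(const char*               name,
                                                       std::chrono::milliseconds fallback)
{
    assert(name != nullptr);
    const char* raw = std::getenv(name);
    if(raw == nullptr) return fallback;
    char*      end   = nullptr;
    const long value = std::strtol(raw, &end, 10);
    if(end == raw || *end != '\0' || value <= 0) return fallback;
    return std::chrono::milliseconds(value);
}

// Tracks when the last flush happened and reports when the next one is due.
class flush_timer
{
public:
    flush_timer(std::chrono::milliseconds period, steady_clock::time_point start)
    : m_period(period)
    , m_last(start)
    {}

    // Returns true and re-arms the timer once at least one period has elapsed.
    bool due(steady_clock::time_point now)
    {
        if(now - m_last < m_period) return false;
        m_last = now;
        return true;
    }

private:
    std::chrono::milliseconds m_period;
    steady_clock::time_point  m_last;
};
```

Passing the time point explicitly keeps the timer deterministic to test; production code would call `due(steady_clock::now())` and flush the buffer whenever it returns true.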
Sajina PK 916aac1e92 Enable MPI tracing for Fortran (#185)
- Move the MPI gotcha functionality from Timemory to the repo.
- Add the PMPI Fortran MPI functions to the existing mpi gotcha handle.

[ROCm/rocprofiler-systems commit: 4fcd8cc78d]
2025-06-04 18:06:18 -04:00
David Galiffi 6fe19b681a Fix path to post-processing merge script (#187)
- Path to merge script not found unless user explicitly sources "share/rocprofiler-systems/setup-env.sh" to setup PATHs.
- Instead, let's derive the path when the application loads and use it when executing the helper script
- Rename script to rocprof-sys-merge-output.sh.
- Change install folder to <prefix>/libexec/rocprofiler-systems based on dev-ops feedback.
- Updated PATH variable in the modulefile and source script.
- For SWDEV-528101

[ROCm/rocprofiler-systems commit: adc66956b0]
2025-05-02 16:52:54 -04:00
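Deriving the script location at load time, rather than relying on PATH, could be sketched as follows. The function name is hypothetical, and the layout assumption (the loaded library lives in `<prefix>/lib`, so the prefix is two directories up) is mine; the commit only gives the install folder and the script name. On Linux, the path of the loaded library could be obtained with `dladdr`, which is left out here to keep the example portable.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Assumption: the library sits in <prefix>/lib, so <prefix> is two levels above the file.
// The install folder and script name are taken from the commit message.
inline fs::path merge_script_path(const fs::path& loaded_library)
{
    assert(loaded_library.has_parent_path());
    const fs::path prefix = loaded_library.parent_path().parent_path();
    return prefix / "libexec" / "rocprofiler-systems" / "rocprof-sys-merge-output.sh";
}
```

Because the result depends only on where the library was loaded from, the script is found even when the user never sourced `setup-env.sh`.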
Peter Park 3f9a3861ac Update copyright year to 2025 (#83)
[ROCm/rocprofiler-systems commit: 0a15d355e0]
2025-01-29 16:53:16 -05:00
Pranjal Swarup 64bb1ea13f Merge proto files from multiprocess run into one file. (#63)
- Added script to merge multiprocess output automatically to one file when there are multiprocess proto files written into output folder
- Execute the merge multiprocess script from the rank 0 process
- Added the scripts folder path to env path, via setup-env.sh
- Installed merge_multiprocess_output.sh to /share/rocprofiler-systems/bin dir

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 0263e951ff]
2024-12-18 17:34:02 -05:00
David Galiffi 7eaaa83024 Fix for proto files not being viewable in Perfetto UI (#16)
- Fix for proto files not being viewable in Perfetto UI
  - Ported from https://github.com/ROCm/omnitrace/pull/411

- Update Workflows

- Use V47 trace_processor_shell for certain OS releases.
  - RedHat 8, SUSE 15.5, and Ubuntu 20.04 are no longer compatible with the latest trace_processor_shell.
  - Incompatible version of GLIBC.

- Remove notes about Perfetto workaround in documentation.

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 21d9ab79fd]
2024-11-05 10:14:25 -05:00
David Galiffi 489eda995d Rename Omnitrace to ROCm Systems Profiler (#4)
The Omnitrace program is being renamed. 

Full name: "ROCm Systems Profiler"
Package name: "rocprofiler-systems"
Binary / Library names: "rocprof-sys-*"

---------
Co-authored-by: Xuan Chen <xuchen@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: d07bf508a9]
2024-10-15 11:20:40 -04:00
Jonathan R. Madsen b5bdba12e4 Roctracer flush activity fix + perfetto.cfg (#317)
* Fix roctracer_flush_activity

- invoke roctracer_flush_activity() before disabling domains

* create comp::roctracer::flush()

- real issue was the global state when roctracer_flush_activity() was called

* formatting

* Update lib/omnitrace/library/components/roctracer.hpp

- provide definition of comp::roctracer::flush when OMNITRACE_USE_ROCTRACER is not defined

* omnitrace.cfg -> perfetto.cfg

- rename provided perfetto config file (omnitrace.cfg) to perfetto.cfg to avoid confusion

* Update lib/core

- gpu.hpp: defines for OMNITRACE_USE_{HIP,ROCTRACER,ROCPROFILER,ROCM_SMI}
- gpu.cpp
  - include core/hip_runtime.hpp
  - fix serialization of hipDeviceProp_t
- add hip_runtime.hpp
  - ensure proper inclusion of hip_runtime.h
- add rccl.hpp
  - ensure proper inclusion of rccl.h

* Update lib/omnitrace/library

- rcclp.cpp
  - update includes for rccl
- roctracer.hpp
  - update includes for hip_runtime
- components/comm_data.hpp
  - update includes for rccl
- components/rcclp.hpp
  - update includes for rccl

* Update bin/omnitrace-avail/avail.cpp

- update includes for hip_runtime

* Update examples/rccl/CMakeLists.txt

- fix find_package for rccl when CI enabled

* Update CMakeLists.txt

- set cmake policy CMP0135 to NEW for cmake >= 3.24
  - Enable DOWNLOAD_EXTRACT_TIMESTAMP with ExternalProject_Add + URL download method

* Update timemory submodule

* Update pybind11 submodule

* Update pybind11 submodule

* Update lib/core/rccl.hpp

- include rccl.h only if OMNITRACE_USE_RCCL > 0

* Update lib/core/{gpu,hip_runtime}.hpp

* Update lib/core/gpu.cpp

- reintroduce some ppdefs

* Update lib/core/gpu.cpp

- fix ifdef on OMNITRACE_HIP_VERSION

* Update lib/core/gpu.cpp

- fix static assert for OMNITRACE_HIP_VERSION_MINOR when HIP version 4.x or older (unreliable minor versions)

* Update lib/core/gpu.cpp

- fix ifdef on OMNITRACE_HIP_VERSION

* Update lib/core/config.cpp

- disable OMNITRACE_PERFETTO_COMBINE_TRACES by default

* Update lib/core/perfetto.cpp

- if unable to open perfetto temp file, return the ReadTraceBlocking()

* Update lib/core/config.*

- flush tmpfile before closing

[ROCm/rocprofiler-systems commit: 7bc50f5a0a]
2024-01-10 05:02:22 -06:00
Jonathan R. Madsen 70c8d1229c rocprofiler_iterate_info workaround + omnitrace-avail update (#270)
* rocprofiler_iterate_info workaround + omnitrace-avail update

- provides workaround for rocprofiler_iterate_info behavior change in ROCm 5.4.0-3
- update timemory submodule with argparse tweaks
- updates hsa_rsrc_factory.{hpp,cpp}
- colorized log in omnitrace-avail
- Bump version to 1.9.2

* Fix empty_base inheritance

- timemory's component::empty_base inherits from concepts::component so direct inheritance was removed

* Fix OMNITRACE_HIP_VERSION_COMPAT_STRING

- defined as "" when OMNITRACE_HIP_VERSION_MAJOR==0

* new defines + extra info

- define OMNITRACE_LIBRARY_ARCH (via CMAKE_LIBRARY_ARCHITECTURE)
- define OMNITRACE_SYSTEM_NAME (via CMAKE_SYSTEM_NAME)
- define OMNITRACE_SYSTEM_PROCESSOR (via CMAKE_SYSTEM_PROCESSOR)
- define OMNITRACE_SYSTEM_VERSION (via OMNITRACE_SYSTEM_VERSION)
- define OMNITRACE_COMPILER_ID (via CMAKE_CXX_COMPILER_ID)
- define OMNITRACE_COMPILER_VERSION (via CMAKE_CXX_COMPILER_VERSION)
- include this info in metadata
- include subset of this info in --version for bin tools
- tweak to perfetto verbose messages

[ROCm/rocprofiler-systems commit: 4ed5f3e67b]
2023-03-30 04:21:43 -05:00
Jonathan R. Madsen 49851b05ae Address and thread sanitizer fixes (#250)
* Address and thread sanitizer fixes

- Fix compilation with clang
- Tweak perfetto copy to build tree
- Added suppression files to scripts
- fix LD_PRELOAD support in omnitrace-causal and omnitrace-sample
- use spin_mutex and spin_lock from timemory instead of atomic_mutex and atomic_lock
- state uses atomic
- fix some memory leaks
- tweak testing
  - mpi tests do not use preload
  - increase timeout when using sanitizers
  - add env LD_PRELOAD when using sanitizers

* Tweak perfetto build

* Update timemory submodule

* Update version to 1.8.1

* Update omnitrace-leak.supp

* Update timemory submodule

- fixed spin_mutex implementation

* Remove previously added addr_space->allowTraps(instr_traps)

- this appears to cause errors during binary rewrite

* causal testing updates

- relaxed causal validation on CI systems (to account for hyperthreading decreasing prediction)
- improved impact calculation
- other general improvements to validate-causal-json.py

* Improve fork handling for perfetto

- numerous updates changing perfetto:: to ::perfetto::
- added perfetto_fwd.hpp

* Updated fork example

- user API for validation that stopping/starting perfetto is valid

* Misc fixes to perfetto + fork support

- tweak regions in fork example
- handle disabling tmp files
- get rid of stop/start with perfetto before/after fork
- fixed sampling support during fork
- tweak env of fork test

* Fix find_package in build-tree

* Fix buildtree export

* Fix buildtree export

* Restructured ConfigInstall before adding examples

* Guard against creating tmp file in sampling when disabled

* Fix buildtree package

* formatting

* exit handlers on child processes

- quick exit to avoid perfetto cleanup

* Further tweaking of causal tests for reliability

- enable PROCESSOR_AFFINITY
- decrease to 5 iterations

* Further tweaking of causal tests for reliability

- disable PROCESSOR_AFFINITY for fast func e2e tests
- enabling affinity results in (valid) speedup predictions greater than zero

* Fixes to fork handling

- use pthread_atfork for redundancy if fork_gotcha fails

* cmake formatting

* Fix fork init settings + install components

- remove dl from PROJECT_BUILD_TARGETS

* Testing tweaks

- fix mpi-binary-rewrite-run regex when OMNITRACE_VERBOSE set > 1 in env
- increase causal e2e iterations to 8

* Fix "Test User API"

- test-find-package.sh included dl component

* Further tweaks to causal validation

- further considerations of variance

[ROCm/rocprofiler-systems commit: 846301bcaf]
2023-02-27 12:09:03 -06:00
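The fork-handling items in this commit (a `pthread_atfork` fallback, and a quick exit in child processes) can be illustrated with a minimal POSIX sketch. The names `g_is_child`, `install_fork_handler`, and `child_observes_handler` are hypothetical; the real code also involves a fork gotcha wrapper and perfetto state that are not modelled here.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

namespace
{
// Set only in a forked child, by the handler registered below.
std::atomic<bool> g_is_child{ false };

void on_fork_child() { g_is_child.store(true); }
}  // namespace

// Register a child-side handler; pthread_atfork returns 0 on success.
inline bool install_fork_handler()
{
    return pthread_atfork(nullptr, nullptr, &on_fork_child) == 0;
}

// Fork once and report whether the child saw the handler's effect.
// Must be called from the original (non-forked) process.
inline bool child_observes_handler()
{
    assert(!g_is_child.load());
    const pid_t pid = fork();
    if(pid < 0) return false;
    if(pid == 0)
    {
        // _exit skips atexit handlers, similar in spirit to the "quick exit" item.
        _exit(g_is_child.load() ? 0 : 1);
    }
    int status = 0;
    if(waitpid(pid, &status, 0) != pid) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A tool would typically use the child flag to skip work that is unsafe after `fork`, such as tearing down tracing state owned by the parent.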
Jonathan R. Madsen b2bedda138 restructure libomnitrace + tasking and omnitrace-causal updates (#237)
* restructured libomnitrace

- this is necessary to incorporate some of the binary analysis capabilities into omnitrace exe
- created libomnitrace-core (static)
- created libomnitrace-binary (static)
- created libomnitrace (static)
- omnitrace-avail links to libomnitrace.a
- omnitrace-critical-trace links to libomnitrace.a
- tweaked the testing
  - reduced verbosity on some of MPI tests
  - excluded trace-time-window from tests on Ubuntu 18.04
  - reduced causal e2e iterations
- minor tweak to tasking
  - manually create `PTL::UserTaskQueue` instance instead of relying on `PTL::ThreadPool` to create it

* Update formatting workflow

- source formatting uses ubuntu-22.04
- check-includes doesn't generate false positive for 'include "timemory.hpp"'

* omnitrace-causal --generate-configs

- fix config generation in omnitrace causal
- add test for omnitrace-causal + generating configs

* Fix omnitrace-object-library build

- accidentally included rocm sources in non-rocm builds

* Fix rocm compilation w/o rocprofiler

* update timemory submodule with mpi_get warning messages

* sampling offload file updates

- more verbose messages
- disable offload before stopping

* testing updates

- increase causal e2e iterations to 12
- increase lock_environment verbose to 2 (for sampling offload messages)
- fix return for omnitrace_add_validation_test

[ROCm/rocprofiler-systems commit: e7d3125459]
2023-02-04 10:59:50 -06:00