Commit Graph

4 Commits

Author SHA1 Message Date
Jonathan R. Madsen 7bc50f5a0a Roctracer flush activity fix + perfetto.cfg (#317)
* Fix roctracer_flush_activity

- invoke roctracer_flush_activity() before disabling domains

* create comp::roctracer::flush()

- real issue was the global state when roctracer_flush_activity() was called

* formatting

* Update lib/omnitrace/library/components/roctracer.hpp

- provide definition of comp::roctracer::flush when OMNITRACE_USE_ROCTRACER is not defined
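
A minimal sketch of the fallback-definition pattern described above, so callers can invoke flush() whether or not roctracer support is compiled in. The `bool` return type and the inline no-op body are assumptions for illustration, not the project's actual signature:

```cpp
#include <cassert>

namespace comp
{
struct roctracer
{
    // Hypothetical signature: returns true if activity records were flushed.
    static bool flush();
};

#if defined(OMNITRACE_USE_ROCTRACER) && OMNITRACE_USE_ROCTRACER > 0
// With roctracer enabled, the real definition would live in a source file
// that calls into the roctracer library (omitted from this sketch).
#else
// Without roctracer, provide a no-op definition so call sites still link.
inline bool roctracer::flush() { return false; }
#endif
}  // namespace comp
```

With this in place, call sites do not need their own `#if` guards around `comp::roctracer::flush()`.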

* omnitrace.cfg -> perfetto.cfg

- rename provided perfetto config file (omnitrace.cfg) to perfetto.cfg to avoid confusion

* Update lib/core

- gpu.hpp: defines for OMNITRACE_USE_{HIP,ROCTRACER,ROCPROFILER,ROCM_SMI}
- gpu.cpp
  - include core/hip_runtime.hpp
  - fix serialization of hipDeviceProp_t
- add hip_runtime.hpp
  - ensure proper inclusion of hip_runtime.h
- add rccl.hpp
  - ensure proper inclusion of rccl.h

* Update lib/omnitrace/library

- rcclp.cpp
  - update includes for rccl
- roctracer.hpp
  - update includes for hip_runtime
- components/comm_data.hpp
  - update includes for rccl
- components/rcclp.hpp
  - update includes for rccl

* Update bin/omnitrace-avail/avail.cpp

- update includes for hip_runtime

* Update examples/rccl/CMakeLists.txt

- fix find_package for rccl when CI enabled

* Update CMakeLists.txt

- set cmake policy CMP0135 to NEW for cmake >= 3.24
  - Enable DOWNLOAD_EXTRACT_TIMESTAMP with ExternalProject_Add + URL download method

* Update timemory submodule

* Update pybind11 submodule

* Update pybind11 submodule

* Update lib/core/rccl.hpp

- include rccl.h only if OMNITRACE_USE_RCCL > 0
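
A sketch of what such a guarded include could look like. Treating an undefined OMNITRACE_USE_RCCL as 0, probing two header locations with `__has_include` (C++17), and the `rccl_enabled()` helper are assumptions for illustration rather than the contents of lib/core/rccl.hpp:

```cpp
// Assumption: the build system defines OMNITRACE_USE_RCCL as 0 or 1;
// an undefined macro is treated as 0 in this sketch.
#if !defined(OMNITRACE_USE_RCCL)
#    define OMNITRACE_USE_RCCL 0
#endif

#if OMNITRACE_USE_RCCL > 0
// Assumption: the header may be installed as <rccl/rccl.h> or <rccl.h>
// depending on the ROCm release, so probe for the nested path first.
#    if __has_include(<rccl/rccl.h>)
#        include <rccl/rccl.h>
#    else
#        include <rccl.h>
#    endif
#endif

// Hypothetical helper: lets regular code branch on the build-time setting.
constexpr bool rccl_enabled() { return OMNITRACE_USE_RCCL > 0; }
```

Translation units that include this wrapper compile on systems without RCCL because the vendor header is never referenced when the flag is 0.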

* Update lib/core/{gpu,hip_runtime}.hpp

* Update lib/core/gpu.cpp

- reintroduce some ppdefs

* Update lib/core/gpu.cpp

- fix ifdef on OMNITRACE_HIP_VERSION

* Update lib/core/gpu.cpp

- fix static assert for OMNITRACE_HIP_VERSION_MINOR when HIP version 4.x or older (unreliable minor versions)

* Update lib/core/gpu.cpp

- fix ifdef on OMNITRACE_HIP_VERSION

* Update lib/core/config.cpp

- disable OMNITRACE_PERFETTO_COMBINE_TRACES by default

* Update lib/core/perfetto.cpp

- if unable to open the perfetto temp file, return the result of ReadTraceBlocking()

* Update lib/core/config.*

- flush tmpfile before closing
2024-01-10 05:02:22 -06:00
Jonathan R. Madsen 4ed5f3e67b rocprofiler_iterate_info workaround + omnitrace-avail update (#270)
* rocprofiler_iterate_info workaround + omnitrace-avail update

- provides workaround for rocprofiler_iterate_info behavior change in ROCm 5.4.0-3
- update timemory submodule with argparse tweaks
- updates hsa_rsrc_factory.{hpp,cpp}
- colorized log in omnitrace-avail
- Bump version to 1.9.2

* Fix empty_base inheritance

- timemory's component::empty_base inherits from concepts::component so direct inheritance was removed

* Fix OMNITRACE_HIP_VERSION_COMPAT_STRING

- defined as "" when OMNITRACE_HIP_VERSION_MAJOR==0

* new defines + extra info

- define OMNITRACE_LIBRARY_ARCH (via CMAKE_LIBRARY_ARCHITECTURE)
- define OMNITRACE_SYSTEM_NAME (via CMAKE_SYSTEM_NAME)
- define OMNITRACE_SYSTEM_PROCESSOR (via CMAKE_SYSTEM_PROCESSOR)
- define OMNITRACE_SYSTEM_VERSION (via CMAKE_SYSTEM_VERSION)
- define OMNITRACE_COMPILER_ID (via CMAKE_CXX_COMPILER_ID)
- define OMNITRACE_COMPILER_VERSION (via CMAKE_CXX_COMPILER_VERSION)
- include this info in metadata
- include subset of this info in --version for bin tools
- tweak to perfetto verbose messages
2023-03-30 04:21:43 -05:00
Jonathan R. Madsen 846301bcaf Address and thread sanitizer fixes (#250)
* Address and thread sanitizer fixes

- Fix compilation with clang
- Tweak perfetto copy to build tree
- Added suppression files to scripts
- fix LD_PRELOAD support in omnitrace-causal and omnitrace-sample
- use spin_mutex and spin_lock from timemory instead of atomic_mutex and atomic_lock
- state uses atomic
- fix some memory leaks
- tweak testing
  - mpi tests do not use preload
  - increase timeout when using sanitizers
  - add env LD_PRELOAD when using sanitizers
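
For readers unfamiliar with the spin_mutex mentioned in the list above, here is a generic sketch built on `std::atomic_flag`. It is not timemory's implementation; the class body and the `guarded_increment` helper are illustrative assumptions:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Generic spin mutex: busy-waits instead of blocking in the kernel.
// Provides lock/try_lock/unlock, so it works with std::lock_guard.
class spin_mutex
{
public:
    void lock() noexcept
    {
        // test_and_set returns the previous value; loop until it was clear.
        while(m_flag.test_and_set(std::memory_order_acquire))
        {}
    }

    bool try_lock() noexcept
    {
        return !m_flag.test_and_set(std::memory_order_acquire);
    }

    void unlock() noexcept { m_flag.clear(std::memory_order_release); }

private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};

// Hypothetical helper showing RAII usage of the mutex.
inline int guarded_increment(spin_mutex& mtx, int& counter)
{
    std::lock_guard<spin_mutex> lk{ mtx };
    return ++counter;
}
```

Spinning only pays off when critical sections are very short; under heavy contention a blocking mutex is usually the safer default.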

* Tweak perfetto build

* Update timemory submodule

* Update version to 1.8.1

* Update omnitrace-leak.supp

* Update timemory submodule

- fixed spin_mutex implementation

* Remove previously added addr_space->allowTraps(instr_traps)

- this appears to cause errors during binary rewrite

* causal testing updates

- relaxed causal validation on CI systems (to account for hyperthreading decreasing prediction)
- improved impact calculation
- other general improvements to validate-causal-json.py

* Improve fork handling for perfetto

- numerous updates changing perfetto:: to ::perfetto::
- added perfetto_fwd.hpp

* Updated fork example

- user API for validation that stopping/starting perfetto is valid

* Misc fixes to perfetto + fork support

- tweak regions in fork example
- handle disabling tmp files
- get rid of stop/start with perfetto before/after fork
- fixed sampling support during fork
- tweak env of fork test

* Fix find_package in build-tree

* Fix buildtree export

* Fix buildtree export

* Restructured ConfigInstall before adding examples

* Guard against creating tmp file in sampling when disabled

* Fix buildtree package

* formatting

* exit handlers on child processes

- quick exit to avoid perfetto cleanup

* Further tweaking of causal tests for reliability

- enable PROCESSOR_AFFINITY
- decrease to 5 iterations

* Further tweaking of causal tests for reliability

- disable PROCESSOR_AFFINITY for fast func e2e tests
- enabling affinity results in (valid) speedup predictions greater than zero

* Fixes to fork handling

- use pthread_atfork for redundancy if fork_gotcha fails
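
A minimal sketch of registering fork handlers with the POSIX `pthread_atfork` call. The handler bodies, the `g_is_child` flag, and both helper functions are hypothetical and only illustrate the mechanism; they are not the project's fork handling:

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

namespace
{
std::atomic<bool> g_is_child{ false };  // set only in the child after fork()

void prefork() {}          // runs in the parent before fork()
void postfork_parent() {}  // runs in the parent after fork()
void postfork_child() { g_is_child.store(true); }  // runs in the child
}  // namespace

// Registers the handlers exactly once; returns true on success.
inline bool install_fork_handlers()
{
    static const int rc =
        pthread_atfork(&prefork, &postfork_parent, &postfork_child);
    return rc == 0;
}

// Forks and returns the child's exit code as seen by the parent,
// or -1 on failure. The child exits with 0 if its handler ran.
inline int fork_and_report()
{
    pid_t pid = fork();
    if(pid < 0) return -1;
    if(pid == 0) _exit(g_is_child.load() ? 0 : 1);  // skip atexit handlers
    int status = 0;
    if(waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the handlers are registered with the C library itself, they run for any fork() in the process, which is what makes them useful as a fallback when a function wrapper misses the call.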

* cmake formatting

* Fix fork init settings + install components

- remove dl from PROJECT_BUILD_TARGETS

* Testing tweaks

- fix mpi-binary-rewrite-run regex when OMNITRACE_VERBOSE set > 1 in env
- increase causal e2e iterations to 8

* Fix "Test User API"

- test-find-package.sh included dl component

* Further tweaks to causal validation

- further considerations of variance
2023-02-27 12:09:03 -06:00
Jonathan R. Madsen e7d3125459 restructure libomnitrace + tasking and omnitrace-causal updates (#237)
* restructured libomnitrace

- this is necessary to incorporate some of the binary analysis capabilities into omnitrace exe
- created libomnitrace-core (static)
- created libomnitrace-binary (static)
- created libomnitrace (static)
- omnitrace-avail links to libomnitrace.a
- omnitrace-critical-trace links to libomnitrace.a
- tweaked the testing
  - reduced verbosity on some of MPI tests
  - excluded trace-time-window from tests on Ubuntu 18.04
  - reduced causal e2e iterations
- minor tweak to tasking
  - manually create `PTL::UserTaskQueue` instance instead of relying on `PTL::ThreadPool` to create it

* Update formatting workflow

- source formatting uses ubuntu-22.04
- check-includes doesn't generate false positive for 'include "timemory.hpp"'

* omnitrace-causal --generate-configs

- fix config generation in omnitrace causal
- add test for omnitrace-causal + generating configs

* Fix omnitrace-object-library build

- accidentally included rocm sources in non-rocm builds

* Fix rocm compilation w/o rocprofiler

* update timemory submodule with mpi_get warning messages

* sampling offload file updates

- more verbose messages
- disable offload before stopping

* testing updates

- increase causal e2e iterations to 12
- increase lock_environment verbose to 2 (for sampling offload messages)
- fix return for omnitrace_add_validation_test
2023-02-04 10:59:50 -06:00