rocm-systems/cmake/BuildSettings.cmake
Jonathan R. Madsen 518c83e0f9 Dynamic expansion of thread data (#294)
* Tests for exceeding OMNITRACE_MAX_THREADS

- tests which exceed the OMNITRACE_MAX_THREADS value for thread creation

* CMake Formatting.cmake update

- include source files in /tests/source directory

* Add unknown-hash= to OMNITRACE_ABORT_FAIL_REGEX

- fail if a timemory hash is not resolved to a name

* Tests for exceeding OMNITRACE_MAX_THREADS

- update

* omnitrace-sample update

- remove env disabling of critical-trace and process-sampling

* core library update

- make_unique in concepts.hpp
- add OMNITRACE_USE_ROCM_SMI to "process_sampling" category
- remove forced disabling of critical-trace in sampling mode
- parentheses for OMNITRACE_PREFER
- use tim::get_hash_id instead of tim::get_combined_hash_id

* core library update (containers)

- added aligned_static_vector.hpp
  - similar to static_vector.hpp but attempts to align to cache line size
- alignment template parameter for stable_vector
- added missing aliases in static_vector
  - consistent with aligned_static_vector aliases

* thread_info update

- track the peak number of threads created
- thread_info::get_peak_num_threads() returns the peak number of threads

* thread_data update

- generic thread_data inherits from base_thread_data
- thread_data reworked to support dynamic expansion
- base_thread_data updated to invoke private_instance() function
- thread_data<optional<T>> uses stable_vector aligned to cache line width
- thread_data<identity<T>> uses stable_vector aligned to cache line width
- thread_data for optional and identity provide private private_instance function + friend to base_thread_data
- component_bundle_cache<T> is now thread_data<component_bundle_cache_impl<T>>

* causal update

- thread_data<T>::instances -> thread_data<T>::instance(construct_on_thread{ ... })
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
- tim::get_combined_hash_id -> tim::get_hash_id
- update progress_bundle usage to new thread_data API

* backtrace/backtrace_metrics component update

- backtrace_metrics update
  - update to new thread_data API
  - add thread CPU time row in perfetto
  - fix potential bug when rusage categories are disabled
  - fix bug in operator-= not subtracting cpu time of rhs
- backtrace update
  - skip all child call-stack below 'tim::openmp::' if sampling_keep_internal = false

* pthread_gotcha component update

- pthread_gotcha::shutdown() invokes pthread_create_gotcha::shutdown()

* pthread_create_gotcha component update

- minor tweak to {start,stop}_bundle functions: pass in thread id
- update to new thread_data API
- track native handles of internal threads
- implement system with pthread_kill to stop dangling bundles

* rocprofiler/roctracer component update

- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* critical trace (library) update

- update to new thread_data API
- tim::get_combined_hash_id -> tim::get_hash_id

* coverage update

- update to new thread_data API

* tasking update

- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* roctracer update

- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* rocm_smi update

- update to new thread_data API

* runtime.cpp update

- update to new thread_data API

* sampling.cpp update

- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* ompt.cpp update

- invoke pthread_gotcha::shutdown before invoking OMPT finalize function
  - this prevents signals from being delivered to OpenMP threads

* tracing.hpp and tracing.cpp update

- replace get_timemory_hash_{ids,aliases} functions with copy_timemory_hash_ids function
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()
- tim::get_combined_hash_id -> tim::get_hash_id
- improve thread_init function and add error checking

* library.cpp update

- move copying timemory hash id/aliases to tracing.cpp
- update to new thread_data API
- loop over max_supported_threads (constexpr) -> loop over thread_info::get_peak_num_threads()

* Update BuildSettings.cmake

- add -Wno-interference-size to suppress warning about use of std::hardware_destructive_interference_size

* Update fork example

- improve scheme for waiting on child processes via waitpid instead of wait
- support running main routine multiple times
- push/pop regions in child process

* Update lib/common/defines.h.in

- allow user to specify misc values via -D <name>=<value>
  - OMNITRACE_CACHELINE_SIZE
  - OMNITRACE_CACHELINE_SIZE_MIN
  - OMNITRACE_ROCM_MAX_COUNTERS
- remove unused defines
  - OMNITRACE_ROCM_LOOK_AHEAD
  - OMNITRACE_MAX_ROCM_QUEUES

* Update rocprofiler.hpp

- OMNITRACE_MAX_ROCM_COUNTERS -> OMNITRACE_ROCM_MAX_COUNTERS

* Update aligned_static_vector

- set cacheline_align_v from max of OMNITRACE_CACHELINE_SIZE and OMNITRACE_CACHELINE_SIZE_MIN

* Update tracing.cpp

- acquire locks for updating main hash ids/aliases
- only propagate ids/aliases when finalizing

* Update pthread_create_gotcha.cpp

- make sure hash for "start_thread" exists on main thread

* Update causal end to end tests

- if OMNITRACE_BUILD_NUMBER is 1, set OMNITRACE_VERBOSE=0
2023-10-16 18:04:47 -05:00

373 lines
14 KiB
CMake

# include guard
include_guard(DIRECTORY)
# ########################################################################################
#
# Handles the build settings
#
# ########################################################################################
include(GNUInstallDirs)
include(Compilers)
include(FindPackageHandleStandardArgs)
include(MacroUtilities)
omnitrace_add_option(
OMNITRACE_BUILD_DEVELOPER "Extra build flags for development like -Werror"
${OMNITRACE_BUILD_CI})
omnitrace_add_option(OMNITRACE_BUILD_RELEASE "Build with minimal debug line info" OFF)
omnitrace_add_option(OMNITRACE_BUILD_EXTRA_OPTIMIZATIONS "Extra optimization flags" OFF)
omnitrace_add_option(OMNITRACE_BUILD_LTO "Build with link-time optimization" OFF)
omnitrace_add_option(OMNITRACE_USE_COMPILE_TIMING
"Build with timing metrics for compilation" OFF)
omnitrace_add_option(OMNITRACE_USE_SANITIZER
"Build with -fsanitize=\${OMNITRACE_SANITIZER_TYPE}" OFF)
omnitrace_add_option(OMNITRACE_BUILD_STATIC_LIBGCC
"Build with -static-libgcc if possible" OFF)
omnitrace_add_option(OMNITRACE_BUILD_STATIC_LIBSTDCXX
"Build with -static-libstdc++ if possible" OFF)
omnitrace_add_option(OMNITRACE_BUILD_STACK_PROTECTOR "Build with -fstack-protector" ON)
omnitrace_add_cache_option(
OMNITRACE_BUILD_LINKER
"If set to a non-empty value, pass -fuse-ld=\${OMNITRACE_BUILD_LINKER}" STRING "bfd")
omnitrace_add_cache_option(OMNITRACE_BUILD_NUMBER "Internal CI use" STRING "0" ADVANCED
NO_FEATURE)
omnitrace_add_interface_library(omnitrace-static-libgcc
"Link to static version of libgcc")
omnitrace_add_interface_library(omnitrace-static-libstdcxx
"Link to static version of libstdc++")
omnitrace_add_interface_library(omnitrace-static-libgcc-optional
"Link to static version of libgcc")
omnitrace_add_interface_library(omnitrace-static-libstdcxx-optional
"Link to static version of libstdc++")
target_compile_definitions(omnitrace-compile-options INTERFACE $<$<CONFIG:DEBUG>:DEBUG>)
set(OMNITRACE_SANITIZER_TYPE
"leak"
CACHE STRING "Sanitizer type")
if(OMNITRACE_USE_SANITIZER)
omnitrace_add_feature(OMNITRACE_SANITIZER_TYPE
"Sanitizer type, e.g. leak, thread, address, memory, etc.")
endif()
if(OMNITRACE_BUILD_CI)
omnitrace_target_compile_definitions(${LIBNAME}-compile-options
INTERFACE OMNITRACE_CI)
endif()
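# Example (a sketch, not part of the original file): the options above could be preset
# in an initial-cache script passed via `cmake -C <script>`. This assumes that
# omnitrace_add_option and omnitrace_add_cache_option honor existing cache entries the
# way the built-in option() and set(... CACHE ...) commands do. The file name
# `my-settings.cmake` is hypothetical.
#
#   # my-settings.cmake
#   set(OMNITRACE_BUILD_DEVELOPER ON CACHE BOOL "Extra build flags for development")
#   set(OMNITRACE_BUILD_LTO ON CACHE BOOL "Build with link-time optimization")
#   set(OMNITRACE_BUILD_LINKER "gold" CACHE STRING "Passed as -fuse-ld=<value>")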
# ----------------------------------------------------------------------------------------#
# dynamic linking and runtime libraries
#
if(CMAKE_DL_LIBS AND NOT "${CMAKE_DL_LIBS}" STREQUAL "dl")
# if cmake provides dl library, use that
set(dl_LIBRARY
${CMAKE_DL_LIBS}
CACHE FILEPATH "dynamic linking system library")
endif()
foreach(_TYPE dl rt dw)
if(NOT ${_TYPE}_LIBRARY)
find_library(${_TYPE}_LIBRARY NAMES ${_TYPE})
endif()
endforeach()
find_package_handle_standard_args(dl-library REQUIRED_VARS dl_LIBRARY)
find_package_handle_standard_args(rt-library REQUIRED_VARS rt_LIBRARY)
# find_package_handle_standard_args(dw-library REQUIRED_VARS dw_LIBRARY)
if(dl_LIBRARY)
target_link_libraries(omnitrace-compile-options INTERFACE ${dl_LIBRARY})
endif()
# ----------------------------------------------------------------------------------------#
# set the compiler flags
#
add_flag_if_avail(
"-W" "-Wall" "-Wno-unknown-pragmas" "-Wno-unused-function" "-Wno-ignored-attributes"
"-Wno-attributes" "-Wno-missing-field-initializers" "-Wno-interference-size")
if(OMNITRACE_BUILD_DEBUG)
add_flag_if_avail("-g3" "-fno-omit-frame-pointer")
endif()
if(WIN32)
# suggested by MSVC for spectre mitigation in rapidjson implementation
add_cxx_flag_if_avail("/Qspectre")
endif()
if(CMAKE_CXX_COMPILER_IS_CLANG)
add_cxx_flag_if_avail("-Wno-mismatched-tags")
endif()
# ----------------------------------------------------------------------------------------#
# extra flags for debug information in debug or optimized binaries
#
omnitrace_add_interface_library(
omnitrace-compile-debuginfo
"Attempts to set best flags for more expressive profiling information in debug or optimized binaries"
)
add_target_flag_if_avail(omnitrace-compile-debuginfo "-g3" "-fno-omit-frame-pointer"
"-fno-optimize-sibling-calls")
if(CMAKE_CUDA_COMPILER_IS_NVIDIA)
add_target_cuda_flag(omnitrace-compile-debuginfo "-lineinfo")
endif()
target_compile_options(
omnitrace-compile-debuginfo
INTERFACE $<$<COMPILE_LANGUAGE:C>:$<$<C_COMPILER_ID:GNU>:-rdynamic>>
$<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU>:-rdynamic>>)
if(NOT APPLE)
target_link_options(omnitrace-compile-debuginfo INTERFACE
$<$<CXX_COMPILER_ID:GNU>:-rdynamic>)
endif()
if(CMAKE_CUDA_COMPILER_IS_NVIDIA)
target_compile_options(
omnitrace-compile-debuginfo
INTERFACE
$<$<COMPILE_LANGUAGE:CUDA>:$<$<CXX_COMPILER_ID:GNU>:-Xcompiler=-rdynamic>>)
endif()
if(dl_LIBRARY)
target_link_libraries(omnitrace-compile-debuginfo INTERFACE ${dl_LIBRARY})
endif()
if(rt_LIBRARY)
target_link_libraries(omnitrace-compile-debuginfo INTERFACE ${rt_LIBRARY})
endif()
# ----------------------------------------------------------------------------------------#
# non-debug optimizations
#
omnitrace_add_interface_library(omnitrace-compile-extra "Extra optimization flags")
if(NOT OMNITRACE_BUILD_CODECOV AND OMNITRACE_BUILD_EXTRA_OPTIMIZATIONS)
add_target_flag_if_avail(
omnitrace-compile-extra "-finline-functions" "-funroll-loops" "-ftree-vectorize"
"-ftree-loop-optimize" "-ftree-loop-vectorize")
endif()
if(NOT "${CMAKE_BUILD_TYPE}" STREQUAL "Debug"
AND OMNITRACE_BUILD_EXTRA_OPTIMIZATIONS
AND NOT OMNITRACE_BUILD_CODECOV)
target_link_libraries(omnitrace-compile-options
INTERFACE $<BUILD_INTERFACE:omnitrace-compile-extra>)
add_flag_if_avail(
"-fno-signaling-nans" "-fno-trapping-math" "-fno-signed-zeros"
"-ffinite-math-only" "-fno-math-errno" "-fpredictive-commoning"
"-fvariable-expansion-in-unroller")
# add_flag_if_avail("-freciprocal-math" "-fno-signed-zeros" "-mfast-fp")
endif()
# ----------------------------------------------------------------------------------------#
# debug-safe optimizations
#
add_cxx_flag_if_avail("-faligned-new")
omnitrace_add_interface_library(omnitrace-lto "Adds link-time-optimization flags")
if(NOT OMNITRACE_BUILD_CODECOV)
omnitrace_save_variables(FLTO VARIABLES CMAKE_CXX_FLAGS)
set(_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
set(CMAKE_CXX_FLAGS "-flto=thin ${_CXX_FLAGS}")
add_target_flag_if_avail(omnitrace-lto "-flto=thin")
if(NOT cxx_omnitrace_lto_flto_thin)
set(CMAKE_CXX_FLAGS "-flto ${_CXX_FLAGS}")
add_target_flag_if_avail(omnitrace-lto "-flto")
if(NOT cxx_omnitrace_lto_flto)
set(OMNITRACE_BUILD_LTO OFF)
else()
target_link_options(omnitrace-lto INTERFACE -flto)
endif()
add_target_flag_if_avail(omnitrace-lto "-fno-fat-lto-objects")
if(cxx_omnitrace_lto_fno_fat_lto_objects)
target_link_options(omnitrace-lto INTERFACE -fno-fat-lto-objects)
endif()
else()
target_link_options(omnitrace-lto INTERFACE -flto=thin)
endif()
omnitrace_restore_variables(FLTO VARIABLES CMAKE_CXX_FLAGS)
endif()
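# Example (a sketch, not part of the original file): the thin-LTO probe with a fallback
# to full LTO could be approximated with the stock CheckCXXCompilerFlag module as shown
# below. The result variables HAS_FLTO_THIN and HAS_FLTO and the target `my-lto` are
# hypothetical; this file relies on its own add_target_flag_if_avail helper instead.
#
#   include(CheckCXXCompilerFlag)
#   add_library(my-lto INTERFACE)
#   check_cxx_compiler_flag("-flto=thin" HAS_FLTO_THIN)
#   if(HAS_FLTO_THIN)
#       target_compile_options(my-lto INTERFACE -flto=thin)
#       target_link_options(my-lto INTERFACE -flto=thin)
#   else()
#       check_cxx_compiler_flag("-flto" HAS_FLTO)
#       if(HAS_FLTO)
#           target_compile_options(my-lto INTERFACE -flto)
#           target_link_options(my-lto INTERFACE -flto)
#       endif()
#   endif()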
# ----------------------------------------------------------------------------------------#
# print compilation timing reports (Clang compiler)
#
omnitrace_add_interface_library(
omnitrace-compile-timing
"Adds compiler flags which report compilation timing metrics")
if(CMAKE_CXX_COMPILER_IS_CLANG)
add_target_flag_if_avail(omnitrace-compile-timing "-ftime-trace")
if(NOT cxx_omnitrace_compile_timing_ftime_trace)
add_target_flag_if_avail(omnitrace-compile-timing "-ftime-report")
endif()
else()
add_target_flag_if_avail(omnitrace-compile-timing "-ftime-report")
endif()
if(OMNITRACE_USE_COMPILE_TIMING)
target_link_libraries(omnitrace-compile-options INTERFACE omnitrace-compile-timing)
endif()
# ----------------------------------------------------------------------------------------#
# fstack-protector
#
omnitrace_add_interface_library(omnitrace-stack-protector
"Adds stack-protector compiler flags")
add_target_flag_if_avail(omnitrace-stack-protector "-fstack-protector-strong"
"-Wstack-protector")
if(OMNITRACE_BUILD_STACK_PROTECTOR)
target_link_libraries(omnitrace-compile-options INTERFACE omnitrace-stack-protector)
endif()
# ----------------------------------------------------------------------------------------#
# developer build flags
#
if(OMNITRACE_BUILD_DEVELOPER)
add_target_flag_if_avail(
omnitrace-compile-options "-Werror" "-Wdouble-promotion" "-Wshadow" "-Wextra"
"-Wpedantic" "-Wstack-usage=524288" # 512 KB
"/showIncludes")
if(OMNITRACE_BUILD_NUMBER GREATER 2)
add_target_flag_if_avail(omnitrace-compile-options "-gsplit-dwarf")
endif()
endif()
if(OMNITRACE_BUILD_LINKER)
target_link_options(
omnitrace-compile-options INTERFACE
$<$<C_COMPILER_ID:GNU>:-fuse-ld=${OMNITRACE_BUILD_LINKER}>
$<$<CXX_COMPILER_ID:GNU>:-fuse-ld=${OMNITRACE_BUILD_LINKER}>)
endif()
# ----------------------------------------------------------------------------------------#
# release build flags
#
if(OMNITRACE_BUILD_RELEASE AND NOT OMNITRACE_BUILD_DEBUG)
add_target_flag_if_avail(
omnitrace-compile-options "-g1" "-feliminate-unused-debug-symbols"
"-gno-column-info" "-gno-variable-location-views" "-gline-tables-only")
endif()
# ----------------------------------------------------------------------------------------#
# visibility build flags
#
omnitrace_add_interface_library(omnitrace-default-visibility
"Adds -fvisibility=default compiler flag")
omnitrace_add_interface_library(omnitrace-hidden-visibility
"Adds -fvisibility=hidden compiler flag")
add_target_flag_if_avail(omnitrace-default-visibility "-fvisibility=default")
add_target_flag_if_avail(omnitrace-hidden-visibility "-fvisibility=hidden"
"-fvisibility-inlines-hidden")
# ----------------------------------------------------------------------------------------#
# export all symbols to the dynamic symbol table
#
if(dl_LIBRARY)
# This instructs the linker to add all symbols, not only used ones, to the dynamic
# symbol table. This option is needed for some uses of dlopen or to allow obtaining
# backtraces from within a program.
add_flag_if_avail("-rdynamic")
endif()
# ----------------------------------------------------------------------------------------#
# sanitizer
#
set(OMNITRACE_SANITIZER_TYPES
address
memory
thread
leak
undefined
unreachable
null
bounds
alignment)
set_property(CACHE OMNITRACE_SANITIZER_TYPE PROPERTY STRINGS
"${OMNITRACE_SANITIZER_TYPES}")
omnitrace_add_interface_library(omnitrace-sanitizer-compile-options
"Adds compiler flags for sanitizers")
omnitrace_add_interface_library(
omnitrace-sanitizer
"Adds compiler flags to enable ${OMNITRACE_SANITIZER_TYPE} sanitizer (-fsanitize=${OMNITRACE_SANITIZER_TYPE})"
)
set(COMMON_SANITIZER_FLAGS "-fno-optimize-sibling-calls" "-fno-omit-frame-pointer"
"-fno-inline-functions")
add_target_flag(omnitrace-sanitizer-compile-options ${COMMON_SANITIZER_FLAGS})
foreach(_TYPE ${OMNITRACE_SANITIZER_TYPES})
set(_FLAG "-fsanitize=${_TYPE}")
omnitrace_add_interface_library(
omnitrace-${_TYPE}-sanitizer
"Adds compiler flags to enable ${_TYPE} sanitizer (${_FLAG})")
add_target_flag(omnitrace-${_TYPE}-sanitizer ${_FLAG})
target_link_libraries(omnitrace-${_TYPE}-sanitizer
INTERFACE omnitrace-sanitizer-compile-options)
set_property(TARGET omnitrace-${_TYPE}-sanitizer
PROPERTY INTERFACE_LINK_OPTIONS ${_FLAG} ${COMMON_SANITIZER_FLAGS})
endforeach()
unset(_FLAG)
unset(COMMON_SANITIZER_FLAGS)
if(OMNITRACE_USE_SANITIZER)
foreach(_TYPE ${OMNITRACE_SANITIZER_TYPE})
if(TARGET omnitrace-${_TYPE}-sanitizer)
target_link_libraries(omnitrace-sanitizer
INTERFACE omnitrace-${_TYPE}-sanitizer)
else()
message(
FATAL_ERROR "Error! Target 'omnitrace-${_TYPE}-sanitizer' does not exist!"
)
endif()
endforeach()
else()
set(OMNITRACE_USE_SANITIZER OFF)
endif()
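# Example (a sketch, not part of the original file): because OMNITRACE_SANITIZER_TYPE
# is expanded unquoted in the foreach() above, a semicolon-separated list should enable
# several sanitizers at once. The target `my-tool` is hypothetical, and the interface
# libraries are assumed to be ordinary INTERFACE targets.
#
#   # before this file is included, e.g. in an initial-cache script
#   set(OMNITRACE_USE_SANITIZER ON CACHE BOOL "")
#   set(OMNITRACE_SANITIZER_TYPE "address;undefined" CACHE STRING "")
#
#   # after this file is included
#   target_link_libraries(my-tool PRIVATE omnitrace-sanitizer)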
# ----------------------------------------------------------------------------------------#
# static lib flags
#
target_compile_options(
omnitrace-static-libgcc
INTERFACE $<$<COMPILE_LANGUAGE:C>:$<$<C_COMPILER_ID:GNU>:-static-libgcc>>
$<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU>:-static-libgcc>>)
target_link_options(
omnitrace-static-libgcc INTERFACE
$<$<COMPILE_LANGUAGE:C>:$<$<C_COMPILER_ID:GNU,Clang>:-static-libgcc>>
$<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU,Clang>:-static-libgcc>>)
target_compile_options(
omnitrace-static-libstdcxx
INTERFACE $<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU>:-static-libstdc++>>)
target_link_options(
omnitrace-static-libstdcxx INTERFACE
$<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU,Clang>:-static-libstdc++>>)
if(OMNITRACE_BUILD_STATIC_LIBGCC)
target_link_libraries(omnitrace-static-libgcc-optional
INTERFACE omnitrace-static-libgcc)
endif()
if(OMNITRACE_BUILD_STATIC_LIBSTDCXX)
target_link_libraries(omnitrace-static-libstdcxx-optional
INTERFACE omnitrace-static-libstdcxx)
endif()
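# Example (a sketch, not part of the original file): a target could link the
# "-optional" interface libraries so that static libgcc/libstdc++ linking applies only
# when OMNITRACE_BUILD_STATIC_LIBGCC / OMNITRACE_BUILD_STATIC_LIBSTDCXX are ON. The
# names `my-exe` and `main.cpp` are hypothetical.
#
#   add_executable(my-exe main.cpp)
#   target_link_libraries(my-exe PRIVATE omnitrace-static-libgcc-optional
#                                        omnitrace-static-libstdcxx-optional)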
# ----------------------------------------------------------------------------------------#
# user customization
#
get_property(LANGUAGES GLOBAL PROPERTY ENABLED_LANGUAGES)
if(NOT APPLE OR "$ENV{CONDA_PYTHON_EXE}" STREQUAL "")
add_user_flags(omnitrace-compile-options "CXX")
endif()