- missing-new-line CI job: ensures all source files end with new line
- logging updates
- add new line to the end of many files
- fix header include ordering in misc places
- transition to use hsa::get_core_table() and hsa::get_amd_ext_table() in various places instead of making copies
* Update lib/rocprofiler-sdk/context/*
- create correlation_id.{hpp,cpp} and move the implementation into these files instead of context.{hpp,cpp}
* Update lib/rocprofiler-sdk/thread_trace/att_core.hpp
- fixed header includes
* Update lib/common/utility.hpp (runtime sizeof)
- added compute_runtime_sizeof<T>() function to set the "size" field to be the offset of the "reserved_padding" field if one exists
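A minimal sketch of the idea behind the runtime sizeof, assuming C++17. The names `has_reserved_padding`, `runtime_sizeof_sketch`, and both record types are hypothetical and are not taken from lib/common/utility.hpp; the actual implementation may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Detects whether T has a data member named reserved_padding.
template <typename T, typename = void>
struct has_reserved_padding : std::false_type
{};

template <typename T>
struct has_reserved_padding<T, std::void_t<decltype(std::declval<T&>().reserved_padding)>>
: std::true_type
{};

// Hypothetical stand-in for compute_runtime_sizeof<T>(): returns the offset
// of reserved_padding when the member exists, otherwise the full sizeof(T).
template <typename T>
constexpr std::size_t
runtime_sizeof_sketch()
{
    if constexpr(has_reserved_padding<T>::value)
        return offsetof(T, reserved_padding);
    else
        return sizeof(T);
}

// Hypothetical record types, for illustration only.
struct record_with_padding
{
    std::uint64_t size;
    std::uint32_t kind;
    std::uint8_t  reserved_padding[20];
};

struct record_without_padding
{
    std::uint64_t size;
    std::uint32_t kind;
};
```

With this approach the reported size covers only the fields in use, so fields can later be carved out of the padding without changing the overall struct layout.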
* Fix to compute_runtime_sizeof
* Use small_vector for API iterate_args
- replace dim3 value arguments with rocprofiler_dim3_t
- dim3 has a non-trivial destructor
- common::mpl::unqualified_type
- common::stringified_argument_array_t<N> alias
- assert_public_data_type_properties()
- common::container::small_vector<T>::at function
- stringize returns small_vector<stringified_argument>
- stack allocated vector
- remove has_pc_sampling condition (HSA, HIP)
- this will be handled in queue interception
* Misc tweaks
- .github/workflows/continuous_integration.yml
- apt-get update before apt-get install
- remove libgtest-dev
- actions-comment-pull-request: v2.4.3 -> v2.5.0
- .github/workflows/formatting.yml
- create-pull-request: v5 -> v6
- cmake/rocprofiler_options.cmake
- remove unused ROCPROFILER_DEBUG_TRACE and ROCPROFILER_LD_AQLPROFILE options
- samples/counter_collection/callback_client.cpp
- corr_id field renamed to correlation_id
- samples/counter_collection/client.cpp
- corr_id field renamed to correlation_id
- include/rocprofiler-sdk/fwd.h
- In rocprofiler_record_counter_t: rename corr_id field to correlation_id
- doxygen fixes
- lib/common/utility.*
- remove get_accurate_clock_id_impl
- timestamp_ns() defaults to CLOCK_BOOTTIME
- lib/rocprofiler-sdk/counters/core.cpp
- fix spelling mistake: extrenal -> external
- corr_id field renamed to correlation_id
- lib/rocprofiler-sdk-tool/tool.cpp
- fix destruction of static tool::output_file before finalization
- scripts/update-docs.sh
- define PROJECT_NAME
- tests/async-copy-tracing/validate.py
- init_time and fini_time checks
- hip_api_traces, marker_api_tracing
- tests/common/serialization.hpp
- fix save function for rocprofiler_record_counter_t following rename of corr_id to correlation_id
- tests/kernel-tracing/validate.py
- init_time and fini_time checks
- relax test_total_runtime range
- tests/rocprofv3/tracing/CMakeLists.txt
- remove -M from rocprofv3-test-systrace-execute
- exclude test_hsa_api_trace in rocprofv3-test-systrace-validate due to HIP API tracing
- tests/rocprofv3/tracing/validate.py
- update test_kernel_trace to accept mangled or demangled
- tests/tools/json-tool.cpp
- remove use of GLOG
- include init_time and fini_time
- write_json(...) function
* Update include/rocprofiler-sdk/hip*
- updates for intercept table
* Update lib/common/units.hpp
- clang-tidy fixes
* Add lib/rocprofiler-sdk/hip
- tracing implementation for the HIP intercept table
* Update source/lib/rocprofiler-sdk/CMakeLists.txt
- add_subdirectory(hip)
* Update source/lib/rocprofiler-sdk/hsa
- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION
* Update lib/rocprofiler-sdk/hip
- rocprofiler::hip::copy_table
- stringize_impl prints dereferenced pointers when possible
* Update lib/rocprofiler-sdk/hsa/utils.hpp
- stringize_impl prints dereferenced pointers when possible
* Update lib/rocprofiler-sdk/tests/intercept_table.cpp
- remove failures for intercepting HIP API tables
* Update include/rocprofiler-sdk/fwd.h
- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args
* Update lib/rocprofiler-sdk/intercept_table.cpp
- support HipDispatchTable and HipCompilerDispatchTable
* Update lib/rocprofiler-sdk/internal_threading.cpp
- Support ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/registration.cpp
- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging
* Update samples/api_{buffered,callback}_tracing
- Modifications to demonstrate HIP API tracing
* Update tests/kernel-tracing
- Modifications to handle/test HIP API tracing
* Separate HIP tracing from HIP compiler tracing
* Fix installation of include/rocprofiler-sdk/hip/*
- add compiler and table headers to install
* Fixes to HIP interception
- hip_api_trace.hpp was updated a bit
- removed hipGetDeviceProperties (generic)
- added hipGetDevicePropertiesR0600
- added hipGetDevicePropertiesR0000
- removed hipRegisterTracerCallback
- reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
- added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers
* Update lib/rocprofiler-sdk/hip/hip.*
- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update lib/rocprofiler-sdk/hsa/hsa.*
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update test/kernel-tracing/validate.py
- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register
* Update tests/tools/json-tool.cpp
- fix context associated with "HIP_API_CALLBACK"
* Update external/CMakeLists.txt
- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
- BUILD_TESTING (OFF)
- BUILD_SHARED_LIBS (OFF)
- BUILD_OBJECT_LIBS (OFF)
- BUILD_STATIC_LIBS (ON)
- CMAKE_POSITION_INDEPENDENT_CODE (ON)
- CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
- CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog
* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt
- remove explicit setting of SKIP_BUILD_RPATH
* Update CMakeLists.txt
- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH
* Update tests/CMakeLists.txt
- include(GNUInstallDirs)
* Update samples/CMakeLists.txt
- include(GNUInstallDirs)
* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h
- remove extern "C" due to incompatibility b/t empty struct in C (size 0) vs. empty struct in C++ (size 1)
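The size mismatch can be checked directly. A small sketch, assuming GCC or Clang; `no_args_t` is a hypothetical name for an argument struct of an API call that takes no parameters. ISO C does not allow a struct with no members; GNU C accepts it as an extension and gives it size 0, so the same header would describe two different layouts depending on the compiler language.

```cpp
// Hypothetical empty argument struct, standing in for an API call
// that takes no arguments.
struct no_args_t
{};

// C++ requires every complete object type to have a nonzero size;
// GCC and Clang use one byte for an empty struct.
static_assert(sizeof(no_args_t) == 1, "empty struct occupies one byte in C++");
```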
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- clang-tidy fixes
* Update cmake/rocprofiler_linting.cmake
- add a feature for clang tidy exe
* Update lib/rocprofiler-sdk/hip/hip.cpp
- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)
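For illustration, the two forms look roughly like this. This is a sketch assuming C++17; `visit_fold` and `visit_recursive` are hypothetical names and do not come from hip.cpp. The fold form expands into a single expression containing one sub-expression per index, whereas the recursive form handles one index per instantiation.

```cpp
#include <cstddef>
#include <utility>

// Fold-expression form: a single comma-fold expands over all indices.
template <typename F, std::size_t... Idx>
void
visit_fold(F&& f, std::index_sequence<Idx...>)
{
    (f(Idx), ...);
}

// Recursive form: each instantiation visits one index and then
// instantiates the next, stopping once Idx reaches N.
template <std::size_t Idx, std::size_t N, typename F>
void
visit_recursive(F&& f)
{
    if constexpr(Idx < N)
    {
        f(Idx);
        visit_recursive<Idx + 1, N>(f);
    }
}
```

Both forms call `f` once per index in ascending order; only the shape of the generated expression differs.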
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- fix merge
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- fix merge
* Update bin/rocprofv3
- args for marker, HIP runtime, and HIP compiler tracing
* Update tests/apps/simple-transpose
- use roctx
* Update tests/rocprofv3/tracing
- validate marker API data
* Update lib/rocprofiler-sdk-tool
- support for HIP runtime, HIP compiler, marker API
* Update queue/queue_controller/registration/utility
- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
- implements a thread yield + sleep
- add a sync function to Queue class
- add an iterate_queues member function to QueueController
- this is used to sync each queue during queue_controller_fini()
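A "thread yield + sleep" helper can be sketched as below. The names `yield_and_sleep` and `wait_until`, the 100 microsecond delay, and the attempt limit are assumptions for illustration; they are not the signatures in common/utility.hpp.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: give up the current time slice, then sleep briefly
// so that a polling loop does not spin at full speed.
template <typename Rep, typename Period>
void
yield_and_sleep(std::chrono::duration<Rep, Period> delay)
{
    std::this_thread::yield();
    std::this_thread::sleep_for(delay);
}

// Polls the predicate until it returns true or the attempts are used up.
// Returns whether the predicate was satisfied.
template <typename Pred>
bool
wait_until(Pred&& done, int max_attempts)
{
    for(int i = 0; i < max_attempts; ++i)
    {
        if(done()) return true;
        yield_and_sleep(std::chrono::microseconds{100});
    }
    return done();
}
```

A loop of this kind could be used while waiting for a queue to drain during finalization, although the actual sync logic in the Queue class is not shown in this log.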
* Fix data races: queue/context/stable_vector
- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array
* Update lib/rocprofiler-sdk/hsa/hsa.*
- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables
* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp
- use HSA subtable accessors
* Update rocprofiler_memcheck and CI workflow
- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
- GCC 13 uses libtsan.so.2
* Update CI workflow
* Update lib/rocprofiler-sdk/counters/{metrics,counters}
- fix possibly dangling reference to a temporary from gcc-13
* Update thread-sanitizer-suppr.txt
- Ignore data races originating in hsa-runtime library
* Update cmake/rocprofiler_memcheck.cmake
- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library
* Update tests/rocprofv3/tracing/CMakeLists.txt
- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test
* Update lib/common/container/record_header_buffer.hpp
- fix data race identified by gcc v13 and libtsan.so.2
* Update hip API id, args, and def
- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0)
* Update lib/common/container/record_header_buffer.hpp
- fix deadlock in save/read/reset
* Update source/docs/CMakeLists.txt
- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- remove overloads for HIP_MEMSET_NODE_PARAMS
* Update docs/CMakeLists.txt
- use find_program for shell instead of hardcoded /bin/bash
* Update LICENSE
- fix inconsistencies
* Revert lib/rocprofiler/counters/parser/scanner.cpp
* Update lib/rocprofiler/counters/tests/dimension.cpp
- revert ending curly brace
* Revert missing curly braces
- missing curly braces when file did not end with a new line
* Limit the number of HSA signals that are active
HSA currently has a hard limit on the number of signals
that can be created; beyond it, HSA hangs or crashes
outright. While work to fix this in HSA/AQL is ongoing,
limit the number of signals we create.
Increased the counter collection example to 200K launches, which
with this change no longer hangs or crashes randomly in HSA.
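The message above does not say how the limit is enforced. One common way to cap the number of live resources is a counting limiter, sketched below. The class `active_signal_limiter` is hypothetical, makes no HSA calls, and is not the implementation from this change.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical limiter: at most max_active slots may be held at once.
class active_signal_limiter
{
public:
    explicit active_signal_limiter(std::size_t max_active)
    : m_available{max_active}
    {}

    // Blocks until a slot is free, then claims it.
    void acquire()
    {
        std::unique_lock<std::mutex> lk{m_mutex};
        m_cv.wait(lk, [this] { return m_available > 0; });
        --m_available;
    }

    // Non-blocking variant: returns false when no slot is free.
    bool try_acquire()
    {
        std::lock_guard<std::mutex> lk{m_mutex};
        if(m_available == 0) return false;
        --m_available;
        return true;
    }

    // Returns a slot and wakes one waiter, if any.
    void release()
    {
        {
            std::lock_guard<std::mutex> lk{m_mutex};
            ++m_available;
        }
        m_cv.notify_one();
    }

private:
    std::mutex              m_mutex;
    std::condition_variable m_cv;
    std::size_t             m_available;
};
```

In such a scheme a slot would be acquired before creating a signal and released once the signal is destroyed, so the number of live signals never exceeds the configured maximum.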
* source formatting (clang-format v11) (#142)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Up timeout
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Update CI and linting workflows
- delete linting workflow
- compile default CI job with clang-tidy
- split out code coverage matrix entry to separate job
- code coverage job runs code coverage 3x
- once for total code coverage
- once for unittests code coverage
- once for samples code coverage
* Update PTL submodule
- improves handling of when thread pool is destroyed in atexit handler
* Update lib/rocprofiler/buffer
- buffer::instance::get_internal_buffer()
- allocate_buffer invokes internal_threading::initialize() on first entry
- update flush routine
- if wait is false, does not wait for task group to finish syncing
- checks for callback pointer
* Update lib/rocprofiler/internal_threading
- modifications to handle destruction of statics before atexit handler is invoked
* Update lib/rocprofiler/registration.cpp
- reorder atexit call in initialize()
- protect finalize from executing more than once
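Guarding a finalizer against repeated execution is often done with an atomic flag. A minimal sketch, assuming this pattern; `finalize_once`, `finalized`, and `finalize_runs` are hypothetical names and the real registration.cpp may use a different mechanism.

```cpp
#include <atomic>

namespace
{
std::atomic<bool> finalized{false};
int               finalize_runs = 0;  // stand-in for real teardown work
}  // namespace

// Returns true only for the first caller; later calls are no-ops.
bool
finalize_once()
{
    // exchange() returns the previous value, so exactly one caller sees false.
    if(finalized.exchange(true)) return false;
    ++finalize_runs;
    return true;
}
```

This matters when finalization can be reached both from an atexit handler and from an explicit call by a tool.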
* Add unittests for rocprofiler buffer
* Update CI workflow
- disable fail-fast for sanitizers
- move AddressSanitizer job to top of the list
* Update lib/rocprofiler/tests/buffer/CMakeLists.txt
- do not set memcheck LD_PRELOAD for rocprofiler-lib-buffer-tests
* Update lib/rocprofiler/registration.{hpp,cpp}
- only invoke client finalizers if initialized
- remove invoke_client_initializer
- move invoke_client functions to anonymous namespace (no declaration in header)
- set fini status in finalize
* Update scripts/thread-sanitizer-suppr.txt
- suppress false positive for double mutex lock in external/ptl/source/PTL/TaskGroup.hh
* Restructure lib/rocprofiler/tests
* Update lib/common
- add utility.cpp
- move read_command_line to utility.{hpp,cpp}
- was formerly in config.cpp
* Update lib/rocprofiler
- checks for init status return configuration locked if status is not greater than -1
- in other words, this prevents calling these functions directly (which was possible when the check was for greater than 0)
* Update lib/rocprofiler/context/context.{hpp,cpp}
- provide deactivate_client_contexts and deregister_client_contexts
- these functions are used when the tool fails to configure
* Update lib/rocprofiler/registration.{hpp,cpp}
- internal "public" get_client_offet()
- client ids are offset by a random value so that default (uninitialized) ids do not appear to be valid
* Update lib/rocprofiler/tests
- fix rocprofiler_lib.registration_lambda_no_result
* Update lib/rocprofiler/tests
- fix rocprofiler_lib.registration_lambda_with_result
* Update lib/rocprofiler/tests
- remove deep bind from rocprofiler_lib.registration_lambda_with_result
* Update lib/rocprofiler/tests
- use RTLD_NOW when dlopen'ing in rocprofiler_lib.registration_lambda_with_result
* Update rocprofiler registration tests
- split registration tests into separate exe that links to shared library
* Formatting
* Update CI workflow
- always checkout submodules via actions/checkout
* Update lib/rocprofiler/buffer.{hpp,cpp}
- fix issue with buffer flushing not working when only called once
* Update rocprofiler lib registration test
- test for buffered callback
* Update include/rocprofiler/rocprofiler.h
- include internal_threading.h header
* Update rocprofiler lib registration test
- add in internal threading for buffered test
* Agent Implementation
* Remove unused Findrocprofiler
* Update lib/rocprofiler/hsa/agent.{hpp,cpp}
- default AgentInfo ctor
- getNumaNode() const
- noexcept move ctors
- default initializers for member variables
- fixed clang-tidy recommendations
- preallocate
- static in anon namespace
- AgentInfo::setName uses strncpy and ensures that it is terminated
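The strncpy detail matters because strncpy does not write a terminating null when the source is at least as long as the count. A sketch of the usual pattern follows; `agent_name_sketch`, its 8-byte buffer, and `set_name` are hypothetical and do not reflect the real AgentInfo layout.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical holder for a fixed-size name buffer.
struct agent_name_sketch
{
    char name[8] = {};

    void set_name(const char* src)
    {
        // Copy at most size-1 bytes, then terminate explicitly, because
        // strncpy leaves the buffer unterminated when src is too long.
        std::strncpy(name, src, sizeof(name) - 1);
        name[sizeof(name) - 1] = '\0';
    }
};
```

A name longer than the buffer is silently truncated rather than overrunning it.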
* Update lib/rocprofiler/rocprofiler.cpp (agent.cpp and pc_sampling.cpp)
- move public PC sampling function implementations to pc_sampling.cpp
- move public agent function implementation to agent.cpp
* Update common/container
- fix namespace issue in operators.hpp
- fix exceptions in stable_vector
- fix exceptions in static_vector
- fix emplace_back construction with no args in static_vector
* Add lib/common/utility.hpp
- get_tid function
* Update lib/common/utility.hpp
- add timestamp_ns function
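A nanosecond timestamp helper on Linux can be sketched with clock_gettime. The default of CLOCK_BOOTTIME follows the entry elsewhere in this log; the name `timestamp_ns_sketch`, the parameter, and the return-0-on-failure behavior are assumptions, not the actual signature in lib/common/utility.hpp.

```cpp
#include <cstdint>
#include <time.h>

// Hypothetical stand-in for timestamp_ns(): reads the given clock and
// converts the result to nanoseconds. Returns 0 if the clock read fails.
// CLOCK_BOOTTIME is Linux-specific and keeps counting while suspended.
inline std::uint64_t
timestamp_ns_sketch(clockid_t clock_id = CLOCK_BOOTTIME)
{
    timespec ts{};
    if(clock_gettime(clock_id, &ts) != 0) return 0;
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ULL +
           static_cast<std::uint64_t>(ts.tv_nsec);
}
```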