From 18da0bd49dcfd80e8699a88ee483e61bcbab14b4 Mon Sep 17 00:00:00 2001 From: "Jonathan R. Madsen" Date: Wed, 20 Sep 2023 19:32:02 -0500 Subject: [PATCH] Contexts, tracing, include reorg, registration, thread-pool (#65) * Update scripts/update-doxygen.sh - ensure build-docs folder exists * Update scripts/run-ci.py - exclude files in details subdirectory from code coverage * Update scripts/thread-sanitizer-suppr.txt - exclude races in glog * Update docs/rocprofiler.dox.in - exclude defines in include/rocprofiler/defines.h from doxygen - Tweak EXCLUDE_PATTERNS and EXAMPLE_PATTERNS * Update docs workflow - trigger workflow whenever there is a change to the public headers (which may be doxygen comments) * Update include/rocprofiler (reorg and overhaul) - rocprofiler_status_t additions - CONTEXT_NOT_FOUND - CONTEXT_ERROR - INVALID_CONTEXT_ID - INVALID_CONTEXT - BUFFER_BUSY - rocprofiler_context_is_active func - rocprofiler_context_is_valid func - rocprofiler_service_callback_tracing_kind_t update - remove ROCPROFILER_SERVICE_CALLBACK_TRACING_HELPER_THREAD - Remove rocprofiler_tracing_helper_thread_operation_t - Remove rocprofiler_helper_thread_callback_tracer_data_t - Added rocprofiler_internal_thread_library_t - Added rocprofiler_at_internal_thread_create - split rocprofiler.h into several smaller headers - reworked rocprofiler_status_t values - added doxygen comments for enums - replaced rocprofiler_trace_record_operation_kind_t with rocprofiler_trace_operation_t - use @ instead of / in doxygen comment in rocprofiler_plugin.h - fix ref to ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API - end group in fwd.h - remove PROFILE_COUNTING group in dispatch_profile.h - remove premature group close in callback_tracing.h - hsa.h: remove rocprofiler_hsa_trace_data_t - fwd.h: remove rocprofiler_tracer_callback_data_t - rename rocprofiler_correlation_id_t.handle to rocprofiler_correlation_id_t.id (consistency) - fwd.h: add rocprofiler_callback_tracing_record_t - callback_tracing.h: 
update rocprofiler_hsa_api_callback_tracer_data_t - callback_tracing.h: add size fields - simplify rocprofiler_tracer_callback_t - removed ROCPROFILER_NONNULL from rocprofiler_get_version - added rocprofiler_get_timestamp - ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED in rocprofiler_status_t - add ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND rocprofiler_status_t - add rocprofiler_buffer_category_t - rocprofiler_trace_operation_t -> rocprofiler_tracing_operation_t - rocprofiler_user_data_t union - tweak rocprofiler_callback_tracing_record_t - make external_correlation_id non-pointer - add rocprofiler_user_data_t data field - tweak rocprofiler_record_header_t - instead of single uint64_t kind field, have union for category + kind (two u32) with u64 hash - API extensions for kind id <-> kind string - API extensions for operation id <-> operation string - rocprofiler_callback_trace_kind_name_cb_t - rocprofiler_callback_trace_operation_name_cb_t - rocprofiler_iterate_callback_trace_kind_names - rocprofiler_iterate_callback_trace_kind_operation_names - modify rocprofiler_hsa_api_callback_tracer_data_t data members (remove pointers) - add rocprofiler_callback_trace_operation_args_cb_t function pointer typedef - add rocprofiler_iterate_callback_trace_operation_args function - fixed inconsistent use of *_trace_* vs. *_tracing_* (opting for tracing) - removed rocprofiler_query_callback_trace_kind_name - removed rocprofiler_query_callback_kind_operation_name - Add include/rocprofiler/registration.h - header dedicated to registering a tool/client with rocprofiler - this header is not intended to be included by rocprofiler.h - rocprofiler_client_id_t - identifier for client tool - rocprofiler_client_finalize_t - function pointer prototype for tool-initiated finalization - rocprofiler_tool_initialize_t - function pointer prototype for tool initialization (i.e. 
configuration) - rocprofiler_tool_finalize_t - function pointer prototype for tool finalization - rocprofiler_tool_configure_result_t - struct returned by tool/client to rocprofiler - rocprofiler_is_initialized - function for querying whether tool-induced initialization is possible - rocprofiler_is_finalized - function for querying whether rocprofiler has been finalized - rocprofiler_configure prototype - this is the function tools implement - prototype is always marked as having default visibility - no implementation in rocprofiler - added typedef for rocprofiler_configure function pointer - added rocprofiler_force_configure to explicitly invoke rocprofiler_configure instead of relying on lazy init - made callback typedef names more consistent (_cb_t suffix) - typedef for rocprofiler_internal_thread_library_cb_t function pointer - added rocprofiler_at_internal_thread_create function - added rocprofiler_callback_thread_t struct - added rocprofiler_create_callback_thread function - added rocprofiler_assign_callback_thread function - removed rocprofiler_buffer_tracing_record_header_t in favor of kind and correlation id in each record type - added rocprofiler_buffer_tracing_kind_name_cb_t typedef - added rocprofiler_buffer_tracing_operation_name_cb_t typedef - added rocprofiler_iterate_buffer_tracing_kind_names function - added rocprofiler_iterate_buffer_tracing_kind_operation_names function - removed rocprofiler_query_buffer_trace_kind_name function - removed rocprofiler_query_buffer_kind_operation_name function * Update lib/common/container/stable_vector.hpp - include limits header - reserve_size struct - overload stable_vector constructor to support reserving as part of construction * Update lib/common/container/record_header_buffer.{hpp,cpp} - add emplace member function accepting category and kind (two u32 variables) instead of one u64 kind - use std::shared_mutex to prevent data-race when reading m_headers - record_header_buffer is now multiple writer, single 
reader - add read_lock member function (shared) - add read_unlock member function (shared) - lock member function gets exclusive lock - unlock member function releases exclusive lock * Rename "config" to "context" + restructure + implement - Restructure config files + license - move config files into lib/rocprofiler/config subfolder - rename some files - add license to some files which were missing it - Rename config/helpers.hpp - rename to allocator.hpp - remove get_domain_max_ops - Create config/domain.{hpp,cpp} - structures for handling tracing domains and ops - Update config/config.{hpp,cpp} - buffer_instance struct - callback_tracing_service struct - buffer_tracing_service struct - config struct - allocate_{config,buffer} func - {validate,start,stop}_config funcs - get_registered_configs func - get_active_configs func - get_buffers func - Update rocprofiler.cpp - Implement rocprofiler_create_context - Implement rocprofiler_start_context - Implement rocprofiler_stop_context - Implement rocprofiler_context_is_active - Implement rocprofiler_context_is_valid - Implement rocprofiler_flush_buffer - Implement rocprofiler_destroy_buffer - Implement rocprofiler_create_buffer - Update lib/rocprofiler/hsa - use rocprofiler_tracer_activity_domain_t instead of rocprofiler_tracer_activity_domain_t - remove ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API from HSA_API_INFO_DEFINITION_* macros - Update lib/rocprofiler/context/domain.* - fixes for domain_info (i.e. 
use correct enums) - update rocprofiler_status_t codes - fix template instantiations - Update lib/rocprofiler/context/context.* - use rocprofiler_service_callback_tracing_kind_t instead of rocprofiler_tracer_activity_domain_t - rename correlation_context to correlation_tracing_service - fix domains in callback_tracing_service and buffer_tracing_service - unique_ptr for callback_tracer and buffered_tracer in context - Update lib/rocprofiler/rocprofiler.cpp - implement rocprofiler_configure_callback_tracing_service - Update lib/rocprofiler/hsa/ostream.hpp - include rocprofiler.h instead of tracer.hpp - Update lib/rocprofiler/hsa - migration to use rocprofiler_hsa_api_callback_tracer_data_t instead of rocprofiler_hsa_trace_data_t - restructure hsa_api_impl - remove phase_enter and phase_exit - add set_data_args (partial replacement for phase_enter) - functor handles the contexts - Update lib/rocprofiler/rocprofiler.cpp - implement rocprofiler_get_version - Update lib/rocprofiler/hsa/hsa.{hpp,cpp} - remove hsa_api_ prefix for functions already in hsa namespace - Update lib/rocprofiler/context/context.{hpp,cpp} - add client_idx to context struct (tool identifier) - add push_client function to set client_idx before context is allocated - add pop_client function to remove client identifier from future context creations - implemented {registered,active}_contexts and buffers to use new container::reserve_size overload to stable_vector - fix implementation of start_context - fix implementation of stop_context - Update lib/rocprofiler/rocprofiler.cpp - prevent context creation, buffer creation, pc sampling config, etc. 
after initialization - add nullptr checks to rocprofiler_context_is_valid - fix rocprofiler_configure_callback_tracing_service - was checking size of buffers, not registered context - implement rocprofiler_iterate_callback_trace_kind_names - implement rocprofiler_iterate_callback_trace_kind_operation_names - Update lib/rocprofiler/CMakeLists.txt - add registration.{hpp,cpp} to rocprofiler-library target sources - Update lib/rocprofiler/hsa/utils.hpp - fix using fmt::format with const char* strings - remove join functions (no longer used) - Update lib/rocprofiler/hsa/hsa.{hpp,cpp} - remove args_string function - remove named_args_string function - update iterate_args function - change callback type - accept user data - rework the hsa_api_impl::functor function - save the rocprofiler_callback_tracing_record_t between callbacks - update update_table function - check buffered_tracer domains - remove comments - Update lib/rocprofiler/hsa/defines.hpp - remove MEMBER_ macros - add ADDR_MEMBER_ macros - remove doxygen comments for GET_MEMBER_FIELDS - add GET_ADDR_MEMBER_FIELDS - update HSA_API_INFO_DEFINITION_{0,V} - rename domain_idx to callback_domain_idx - add buffered_domain_idx - add as_arg_addr function - Update lib/rocprofiler/rocprofiler.cpp - implement rocprofiler_iterate_callback_trace_operation_args - Remove lib/rocprofiler/tracing.{hpp,cpp} and lib/rocprofiler/CMakeLists.txt - unused - Update lib/rocprofiler/hsa/hsa.{hpp,cpp} - support buffered tracing in hsa_api_impl::functor - rocprofiler_callback_trace_operation_args_cb_t -> rocprofiler_callback_tracing_operation_args_cb_t - i.e. 
trace -> tracing - Update lib/rocprofiler/context/context.{hpp,cpp} - removed buffer_instance struct - removed allocate_buffer function - removed get_buffers function - changed buffer_tracing_service::buffer_array_t - Update lib/rocprofiler/hsa: hsa.cpp, ostream.hpp, details folder - move ostream.hpp into details folder to prevent from contributing to code coverage - update cmake build system for new directory * Add lib/rocprofiler/registration.{hpp,cpp} - implements rocprofiler_set_api_table (called by rocprofiler-register) - miscellaneous functions for client configure/initialize/finalize - functions for querying the init/fini status - relocated OnLoad HSA workaround to this file - at present, this is used to workaround ROCr not having rocprofiler-register integration yet - implement rocprofiler_force_configure function - implement rocprofiler_is_initialized function - implement rocprofiler_is_finalized function - ensure configure functions only invoked once - ensure internal thread creation notification functions are invoked - get_status is pair of atomics - fix heap-use-after-free in init_logging - update finalize - invoke hsa_shut_down - set all active contexts to null pointers * Add lib/rocprofiler/buffer_tracing.cpp - contains implementations of buffer_tracing (i.e. rocprofiler/buffer_tracing.h) - previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp * Add lib/rocprofiler/buffer.{hpp,cpp} - contains implementations of buffer (i.e. rocprofiler/buffer.h) and misc internal access functions - previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp and lib/rocprofiler/context/context.{hpp,cpp} * Add lib/rocprofiler/callback_tracing.cpp - contains implementations of callback_tracing (i.e. rocprofiler/callback_tracing.h) - previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp * Add lib/rocprofiler/context.cpp - contains implementations of context public API functions (i.e. 
rocprofiler/context.h) - previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp * Add lib/rocprofiler/internal_threading.{hpp,cpp} - contains implementations of internal_threading (i.e. rocprofiler/internal_threading.h) - also contains implementations of internal access functions - update finalize function - join all task groups and destroy all thread pools first, then reset unique_ptr * Update lib/rocprofiler/rocprofiler.cpp - rocprofiler_get_version returns status - implement rocprofiler_get_timestamp - remove misc implementations that were split into other files * Update lib/rocprofiler/CMakeLists.txt - compile new implementation files - buffer.cpp - buffer_tracing.cpp - callback_tracing.cpp - context.cpp - internal_threading.cpp * Update lib/tests/buffering/buffering-*.cpp - update to reflect changes to rocprofiler_record_header_t * Update CMakeLists.txt - increase minimum cmake version to 3.21 which added HIP support as a language * Add samples/apps/transpose - simple HIP application for testing * Add samples/api_callback_tracing - HIP application and tool library - This effectively demos how to setup HSA API tracing - For each function called in tool, it stores the func/file/line and prints it during finalization - client.hpp and client.cpp are the tool library - Implement use of rocprofiler_iterate_callback_trace_operation_args - add demo of using rocprofiler_get_version - add_test - remove PASS_REGULAR_EXPRESSION - causing false passes during memcheck - add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment - check if rocprofiler is initialized before stopping context * Add samples/api_buffered_tracing - Sample demonstrating tracing the HSA API via buffering - demo rocprofiler_record_header_compute_hash - throw exceptions for unexpected buffer data - add_test - remove PASS_REGULAR_EXPRESSION - causing false passes during memcheck - add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment * Update samples/CMakeLists.txt - add subdirectory 
for api_callback_tracing - add subdirectory api_buffered_tracing * Update samples/pc_sampling/common.h - fix processing of headers * Update lib/rocprofiler/hsa/details/ostream.hpp - fix data race on HSA_depth_max_cnt and recursion - HSA_depth_max_cnt and recursion is now thread-local static instead of global static - replace std::string usage with std::string_view * Actions update - add dependabot.yml - use actions/checkout@v4 - install latest libasan and libtsan in sanitizer containers * Add PTL (Parallel Tasking Library) submodule [ROCm/rocprofiler-sdk commit: d3eaacd6108010c7be18786e87b53cfd488445b7] --- .../rocprofiler-sdk/.github/dependabot.yml | 11 + .../workflows/continuous_integration.yml | 8 +- .../.github/workflows/docs.yml | 8 +- .../.github/workflows/formatting.yml | 6 +- projects/rocprofiler-sdk/.gitmodules | 3 + projects/rocprofiler-sdk/CMakeLists.txt | 2 +- .../cmake/rocprofiler_config_interfaces.cmake | 8 + .../cmake/rocprofiler_interfaces.cmake | 1 + .../rocprofiler-sdk/external/CMakeLists.txt | 42 + projects/rocprofiler-sdk/external/ptl | 1 + .../rocprofiler-sdk/samples/CMakeLists.txt | 2 + .../api_buffered_tracing/CMakeLists.txt | 52 + .../samples/api_buffered_tracing/client.cpp | 383 +++++ .../samples/api_buffered_tracing/client.hpp | 44 + .../samples/api_buffered_tracing/main.cpp | 244 ++++ .../api_callback_tracing/CMakeLists.txt | 52 + .../samples/api_callback_tracing/client.cpp | 317 ++++ .../samples/api_callback_tracing/client.hpp | 44 + .../samples/api_callback_tracing/main.cpp | 244 ++++ .../samples/apps/transpose/CMakeLists.txt | 38 + .../samples/apps/transpose/transpose.cpp | 278 ++++ .../samples/pc_sampling/common.h | 2 +- .../source/docs/rocprofiler.dox.in | 24 +- .../source/include/rocprofiler/CMakeLists.txt | 26 +- .../source/include/rocprofiler/agent.h | 72 + .../include/rocprofiler/agent_profile.h | 70 + .../source/include/rocprofiler/buffer.h | 106 ++ .../include/rocprofiler/buffer_tracing.h | 278 ++++ 
.../include/rocprofiler/callback_tracing.h | 252 ++++ .../source/include/rocprofiler/config.h | 210 --- .../source/include/rocprofiler/context.h | 91 ++ .../source/include/rocprofiler/counters.h | 73 + .../source/include/rocprofiler/defines.h | 31 + .../include/rocprofiler/dispatch_profile.h | 97 ++ .../rocprofiler/external_correlation.h | 60 + .../source/include/rocprofiler/fwd.h | 457 ++++++ .../source/include/rocprofiler/hip.h | 1 - .../source/include/rocprofiler/hsa.h | 29 +- .../source/include/rocprofiler/hsa/api_args.h | 8 + .../include/rocprofiler/internal_threading.h | 123 ++ .../source/include/rocprofiler/marker.h | 3 - .../source/include/rocprofiler/pc_sampling.h | 79 + .../include/rocprofiler/profile_config.h | 63 + .../source/include/rocprofiler/registration.h | 220 +++ .../source/include/rocprofiler/rocprofiler.h | 1293 +---------------- .../include/rocprofiler/rocprofiler_plugin.h | 82 +- .../source/include/rocprofiler/spm.h | 51 + .../source/lib/common/CMakeLists.txt | 3 +- .../common/container/record_header_buffer.cpp | 13 +- .../common/container/record_header_buffer.hpp | 77 +- .../lib/common/container/stable_vector.hpp | 17 + .../source/lib/rocprofiler/CMakeLists.txt | 7 +- .../source/lib/rocprofiler/buffer.cpp | 203 +++ .../source/lib/rocprofiler/buffer.hpp | 122 ++ .../source/lib/rocprofiler/buffer_tracing.cpp | 151 ++ .../lib/rocprofiler/callback_tracing.cpp | 161 ++ .../lib/rocprofiler/config_internal.cpp | 28 - .../lib/rocprofiler/config_internal.hpp | 74 - .../source/lib/rocprofiler/context.cpp | 89 ++ .../lib/rocprofiler/context/CMakeLists.txt | 14 + .../allocator.hpp} | 50 +- .../lib/rocprofiler/context/context.cpp | 230 +++ .../lib/rocprofiler/context/context.hpp | 130 ++ .../source/lib/rocprofiler/context/domain.cpp | 99 ++ .../source/lib/rocprofiler/context/domain.hpp | 89 ++ .../source/lib/rocprofiler/hsa/CMakeLists.txt | 12 +- .../source/lib/rocprofiler/hsa/defines.hpp | 136 +- .../rocprofiler/hsa/details/CMakeLists.txt | 8 + 
.../rocprofiler/hsa/{ => details}/ostream.hpp | 467 +++--- .../source/lib/rocprofiler/hsa/hsa.cpp | 469 +++--- .../source/lib/rocprofiler/hsa/hsa.def.cpp | 422 +++--- .../source/lib/rocprofiler/hsa/hsa.hpp | 53 +- .../source/lib/rocprofiler/hsa/utils.hpp | 65 +- .../lib/rocprofiler/internal_threading.cpp | 279 ++++ .../lib/rocprofiler/internal_threading.hpp | 66 + .../source/lib/rocprofiler/registration.cpp | 556 +++++++ .../source/lib/rocprofiler/registration.hpp | 95 ++ .../source/lib/rocprofiler/rocprofiler.cpp | 76 +- .../lib/rocprofiler/rocprofiler_config.cpp | 701 --------- .../source/lib/rocprofiler/tracer.hpp | 510 ------- .../tests/buffering/buffering-parallel.cpp | 2 +- .../tests/buffering/buffering-save-load.cpp | 2 +- .../lib/tests/buffering/buffering-serial.cpp | 8 +- .../rocprofiler-sdk/source/scripts/run-ci.py | 2 +- .../source/scripts/thread-sanitizer-suppr.txt | 4 + .../source/scripts/update-doxygen.sh | 6 + 86 files changed, 7293 insertions(+), 3792 deletions(-) create mode 100644 projects/rocprofiler-sdk/.github/dependabot.yml create mode 160000 projects/rocprofiler-sdk/external/ptl create mode 100644 projects/rocprofiler-sdk/samples/api_buffered_tracing/CMakeLists.txt create mode 100644 projects/rocprofiler-sdk/samples/api_buffered_tracing/client.cpp create mode 100644 projects/rocprofiler-sdk/samples/api_buffered_tracing/client.hpp create mode 100644 projects/rocprofiler-sdk/samples/api_buffered_tracing/main.cpp create mode 100644 projects/rocprofiler-sdk/samples/api_callback_tracing/CMakeLists.txt create mode 100644 projects/rocprofiler-sdk/samples/api_callback_tracing/client.cpp create mode 100644 projects/rocprofiler-sdk/samples/api_callback_tracing/client.hpp create mode 100644 projects/rocprofiler-sdk/samples/api_callback_tracing/main.cpp create mode 100644 projects/rocprofiler-sdk/samples/apps/transpose/CMakeLists.txt create mode 100644 projects/rocprofiler-sdk/samples/apps/transpose/transpose.cpp create mode 100644 
projects/rocprofiler-sdk/source/include/rocprofiler/agent.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/agent_profile.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/buffer.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/buffer_tracing.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/callback_tracing.h delete mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/config.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/context.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/counters.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/dispatch_profile.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/external_correlation.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/fwd.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/internal_threading.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/pc_sampling.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/profile_config.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/registration.h create mode 100644 projects/rocprofiler-sdk/source/include/rocprofiler/spm.h create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/buffer.cpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/buffer.hpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/buffer_tracing.cpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/callback_tracing.cpp delete mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/config_internal.cpp delete mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/config_internal.hpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/context.cpp create mode 100644 
projects/rocprofiler-sdk/source/lib/rocprofiler/context/CMakeLists.txt rename projects/rocprofiler-sdk/source/lib/rocprofiler/{config_helpers.hpp => context/allocator.hpp} (59%) create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/context/context.cpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/context/context.hpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/context/domain.cpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/context/domain.hpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/details/CMakeLists.txt rename projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/{ => details}/ostream.hpp (76%) create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/internal_threading.cpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/internal_threading.hpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/registration.cpp create mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/registration.hpp delete mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/rocprofiler_config.cpp delete mode 100644 projects/rocprofiler-sdk/source/lib/rocprofiler/tracer.hpp diff --git a/projects/rocprofiler-sdk/.github/dependabot.yml b/projects/rocprofiler-sdk/.github/dependabot.yml new file mode 100644 index 0000000000..90e05c40d0 --- /dev/null +++ b/projects/rocprofiler-sdk/.github/dependabot.yml @@ -0,0 +1,11 @@ +# To get started with Dependabot version updates, you'll need to specify which +# package ecosystems to update and where the package manifests are located. 
+# Please see the documentation for all configuration options: +# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates + +version: 2 +updates: + - package-ecosystem: "github-actions" # See documentation for possible values + directory: "/" # Location of package manifests + schedule: + interval: "weekly" diff --git a/projects/rocprofiler-sdk/.github/workflows/continuous_integration.yml b/projects/rocprofiler-sdk/.github/workflows/continuous_integration.yml index 415c5ebeae..92f5ca0003 100644 --- a/projects/rocprofiler-sdk/.github/workflows/continuous_integration.yml +++ b/projects/rocprofiler-sdk/.github/workflows/continuous_integration.yml @@ -72,7 +72,7 @@ jobs: needs: get_latest_mainline_build_number steps: - - uses: actions/checkout@v2 + - uses: actions/checkout@v4 - name: List Files shell: bash @@ -161,7 +161,9 @@ jobs: needs: get_latest_mainline_build_number steps: - - uses: actions/checkout@v2 + - uses: actions/checkout@v4 + with: + submodules: true - name: List Files shell: bash @@ -174,7 +176,7 @@ jobs: shell: bash run: | pip3 install -r requirements.txt - apt install -y cmake libgtest-dev + apt install -y cmake libgtest-dev libasan8 libtsan2 git config --global --add safe.directory '*' - name: Configure, Build, and Test diff --git a/projects/rocprofiler-sdk/.github/workflows/docs.yml b/projects/rocprofiler-sdk/.github/workflows/docs.yml index 38c4973abc..7945066546 100644 --- a/projects/rocprofiler-sdk/.github/workflows/docs.yml +++ b/projects/rocprofiler-sdk/.github/workflows/docs.yml @@ -6,18 +6,20 @@ on: branches: [main] paths: - '*.md' + - 'VERSION' - 'source/docs/**' - 'source/scripts/update-docs.sh' + - 'source/include/rocprofiler/*' - '.github/workflows/docs.yml' - - 'VERSION' pull_request: branches: [main] paths: - '*.md' + - 'VERSION' - 'source/docs/**' - 'source/scripts/update-docs.sh' + - 'source/include/rocprofiler/*' - '.github/workflows/docs.yml' - - 'VERSION' concurrency: group: "pages" @@ 
-35,7 +37,7 @@ jobs: id-token: write steps: - name: Checkout - uses: actions/checkout@v3 + uses: actions/checkout@v4 with: submodules: true - name: Install Conda diff --git a/projects/rocprofiler-sdk/.github/workflows/formatting.yml b/projects/rocprofiler-sdk/.github/workflows/formatting.yml index 7cbbb93f75..3d07424ce4 100644 --- a/projects/rocprofiler-sdk/.github/workflows/formatting.yml +++ b/projects/rocprofiler-sdk/.github/workflows/formatting.yml @@ -19,7 +19,7 @@ jobs: runs-on: ubuntu-22.04 steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - name: Extract branch name shell: bash @@ -60,7 +60,7 @@ jobs: runs-on: ubuntu-22.04 steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - name: Install dependencies run: | @@ -105,7 +105,7 @@ jobs: python-version: ['3.10'] steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - name: Extract branch name shell: bash diff --git a/projects/rocprofiler-sdk/.gitmodules b/projects/rocprofiler-sdk/.gitmodules index edf419b3a4..68e173b066 100644 --- a/projects/rocprofiler-sdk/.gitmodules +++ b/projects/rocprofiler-sdk/.gitmodules @@ -10,3 +10,6 @@ [submodule "source/docs/doxygen-awesome-css"] path = external/doxygen-awesome-css url = https://github.com/jothepro/doxygen-awesome-css.git +[submodule "external/ptl"] + path = external/ptl + url = https://github.com/jrmadsen/PTL diff --git a/projects/rocprofiler-sdk/CMakeLists.txt b/projects/rocprofiler-sdk/CMakeLists.txt index 6212b92083..049d09723a 100644 --- a/projects/rocprofiler-sdk/CMakeLists.txt +++ b/projects/rocprofiler-sdk/CMakeLists.txt @@ -1,4 +1,4 @@ -cmake_minimum_required(VERSION 3.16 FATAL_ERROR) +cmake_minimum_required(VERSION 3.21 FATAL_ERROR) if(CMAKE_SOURCE_DIR STREQUAL CMAKE_BINARY_DIR AND CMAKE_CURRENT_SOURCE_DIR STREQUAL CMAKE_SOURCE_DIR) diff --git a/projects/rocprofiler-sdk/cmake/rocprofiler_config_interfaces.cmake b/projects/rocprofiler-sdk/cmake/rocprofiler_config_interfaces.cmake index 4b4022cc19..ef08da847e 
100644 --- a/projects/rocprofiler-sdk/cmake/rocprofiler_config_interfaces.cmake +++ b/projects/rocprofiler-sdk/cmake/rocprofiler_config_interfaces.cmake @@ -146,3 +146,11 @@ find_package( lib/cmake/amd_comgr) target_link_libraries(rocprofiler-amd-comgr INTERFACE amd_comgr) + +# ----------------------------------------------------------------------------------------# +# +# PTL (Parallel Tasking Library) +# +# ----------------------------------------------------------------------------------------# + +target_link_libraries(rocprofiler-ptl INTERFACE PTL::ptl-static) diff --git a/projects/rocprofiler-sdk/cmake/rocprofiler_interfaces.cmake b/projects/rocprofiler-sdk/cmake/rocprofiler_interfaces.cmake index 1b24a11293..8914efb451 100644 --- a/projects/rocprofiler-sdk/cmake/rocprofiler_interfaces.cmake +++ b/projects/rocprofiler-sdk/cmake/rocprofiler_interfaces.cmake @@ -49,3 +49,4 @@ rocprofiler_add_interface_library(rocprofiler-gtest "Google Test library" INTERN rocprofiler_add_interface_library(rocprofiler-glog "Google Log library" INTERNAL) rocprofiler_add_interface_library(rocprofiler-fmt "C++ format string library" INTERNAL) rocprofiler_add_interface_library(rocprofiler-stdcxxfs "C++ filesystem library" INTERNAL) +rocprofiler_add_interface_library(rocprofiler-ptl "Parallel Tasking Library" INTERNAL) diff --git a/projects/rocprofiler-sdk/external/CMakeLists.txt b/projects/rocprofiler-sdk/external/CMakeLists.txt index 31394dd42a..b67a0e2dc3 100644 --- a/projects/rocprofiler-sdk/external/CMakeLists.txt +++ b/projects/rocprofiler-sdk/external/CMakeLists.txt @@ -88,3 +88,45 @@ else() find_package(fmt REQUIRED) target_link_libraries(rocprofiler-fmt INTERFACE fmt::fmt) endif() + +if(NOT TARGET PTL::ptl-static) + rocprofiler_checkout_git_submodule( + RELATIVE_PATH external/ptl + WORKING_DIRECTORY ${PROJECT_SOURCE_DIR} + REPO_URL https://github.com/jrmadsen/PTL.git + REPO_BRANCH rocprofiler) + + set(PTL_BUILD_EXAMPLES OFF) + set(PTL_USE_TBB OFF) + set(PTL_USE_GPU OFF) + 
set(PTL_DEVELOPER_INSTALL OFF) + + if(NOT DEFINED BUILD_OBJECT_LIBS) + set(BUILD_OBJECT_LIBS OFF) + endif() + + if(NOT DEFINED BUILD_STATIC_LIBS) + set(BUILD_STATIC_LIBS OFF) + endif() + + rocprofiler_save_variables( + BUILD_CONFIG + VARIABLES BUILD_SHARED_LIBS BUILD_STATIC_LIBS BUILD_OBJECT_LIBS + CMAKE_POSITION_INDEPENDENT_CODE CMAKE_CXX_VISIBILITY_PRESET + CMAKE_VISIBILITY_INLINES_HIDDEN) + + set(BUILD_SHARED_LIBS OFF) + set(BUILD_STATIC_LIBS ON) + set(BUILD_OBJECT_LIBS OFF) + set(CMAKE_POSITION_INDEPENDENT_CODE ON) + set(CMAKE_CXX_VISIBILITY_PRESET "hidden") + set(CMAKE_VISIBILITY_INLINES_HIDDEN ON) + + add_subdirectory(ptl EXCLUDE_FROM_ALL) + + rocprofiler_restore_variables( + BUILD_CONFIG + VARIABLES BUILD_SHARED_LIBS BUILD_STATIC_LIBS BUILD_OBJECT_LIBS + CMAKE_POSITION_INDEPENDENT_CODE CMAKE_CXX_VISIBILITY_PRESET + CMAKE_VISIBILITY_INLINES_HIDDEN) +endif() diff --git a/projects/rocprofiler-sdk/external/ptl b/projects/rocprofiler-sdk/external/ptl new file mode 160000 index 0000000000..7bbc5a4e66 --- /dev/null +++ b/projects/rocprofiler-sdk/external/ptl @@ -0,0 +1 @@ +Subproject commit 7bbc5a4e66d10d7acae5b353838e2404b3dd3742 diff --git a/projects/rocprofiler-sdk/samples/CMakeLists.txt b/projects/rocprofiler-sdk/samples/CMakeLists.txt index a481c79707..fcfac7d151 100644 --- a/projects/rocprofiler-sdk/samples/CMakeLists.txt +++ b/projects/rocprofiler-sdk/samples/CMakeLists.txt @@ -5,3 +5,5 @@ project(rocprofiler-samples LANGUAGES C CXX) # add_subdirectory(api_tracing) add_subdirectory(pc_sampling) +add_subdirectory(api_callback_tracing) +add_subdirectory(api_buffered_tracing) diff --git a/projects/rocprofiler-sdk/samples/api_buffered_tracing/CMakeLists.txt b/projects/rocprofiler-sdk/samples/api_buffered_tracing/CMakeLists.txt new file mode 100644 index 0000000000..cd97d37178 --- /dev/null +++ b/projects/rocprofiler-sdk/samples/api_buffered_tracing/CMakeLists.txt @@ -0,0 +1,52 @@ +# +# +# +cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR) + +if(NOT 
CMAKE_HIP_COMPILER) + find_program( + amdclangpp_EXECUTABLE + NAMES amdclang++ + HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm + PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm + PATH_SUFFIXES bin llvm/bin NO_CACHE) + mark_as_advanced(amdclangpp_EXECUTABLE) + + if(amdclangpp_EXECUTABLE) + set(CMAKE_HIP_COMPILER "${amdclangpp_EXECUTABLE}") + endif() +endif() + +project(rocprofiler-samples-buffered-api-tracing LANGUAGES CXX HIP) + +foreach(_TYPE DEBUG MINSIZEREL RELEASE RELWITHDEBINFO) + if("${CMAKE_HIP_FLAGS_${_TYPE}}" STREQUAL "") + set(CMAKE_HIP_FLAGS_${_TYPE} "${CMAKE_CXX_FLAGS_${_TYPE}}") + endif() +endforeach() + +add_library(buffered-api-tracing-client SHARED) +target_sources(buffered-api-tracing-client PRIVATE client.cpp client.hpp) +target_link_libraries(buffered-api-tracing-client + PRIVATE rocprofiler::rocprofiler-library) + +set_source_files_properties(main.cpp PROPERTIES LANGUAGE HIP) +find_package(Threads REQUIRED) + +add_executable(buffered-api-tracing) +target_sources(buffered-api-tracing PRIVATE main.cpp) +target_link_libraries(buffered-api-tracing PRIVATE buffered-api-tracing-client + Threads::Threads) + +add_test(NAME buffered-api-tracing COMMAND $) + +set_tests_properties( + buffered-api-tracing + PROPERTIES + TIMEOUT + 45 + LABELS + "samples" + ENVIRONMENT + "${ROCPROFILER_MEMCHECK_PRELOAD_ENV};HSA_TOOLS_LIB=$" + ) diff --git a/projects/rocprofiler-sdk/samples/api_buffered_tracing/client.cpp b/projects/rocprofiler-sdk/samples/api_buffered_tracing/client.cpp new file mode 100644 index 0000000000..ba707289af --- /dev/null +++ b/projects/rocprofiler-sdk/samples/api_buffered_tracing/client.cpp @@ -0,0 +1,383 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, 
distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +// undefine NDEBUG so asserts are implemented +#ifdef NDEBUG +# undef NDEBUG +#endif + +/** + * @file samples/api_buffered_tracing/client.cpp + * + * @brief Example rocprofiler client (tool) + */ + +#include "client.hpp" + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define ROCPROFILER_CALL(result, msg) \ + { \ + rocprofiler_status_t CHECKSTATUS = result; \ + if(CHECKSTATUS != ROCPROFILER_STATUS_SUCCESS) \ + { \ + std::cerr << #result << " failed with error code " << CHECKSTATUS << std::endl; \ + throw std::runtime_error(#result " failure"); \ + } \ + } + +namespace client +{ +namespace +{ +struct source_location +{ + std::string function = {}; + std::string file = {}; + uint32_t line = 0; + std::string context = {}; +}; + +using call_stack_t = std::vector; + +rocprofiler_client_id_t* client_id = nullptr; +rocprofiler_client_finalize_t client_fini_func = nullptr; +rocprofiler_context_id_t client_ctx = {}; +rocprofiler_buffer_id_t client_buffer = {}; + +void +print_call_stack(const call_stack_t& _call_stack) +{ + namespace fs = 
::std::filesystem; + + size_t n = 0; + for(const auto& itr : _call_stack) + { + std::clog << std::setw(2) << ++n << "/" << std::setw(2) << _call_stack.size() << " "; + std::clog << "[" << fs::path{itr.file}.filename() << ":" << itr.line << "] " + << std::setw(20) << std::left << itr.function; + if(!itr.context.empty()) std::clog << " :: " << itr.context; + std::clog << "\n"; + } + + std::clog << std::flush; +} + +void +store_buffer_id_names(call_stack_t* tool_data) +{ + // + // callback for each kind operation + // + static auto tracing_operation_names_cb = [](rocprofiler_service_buffer_tracing_kind_t /*kindv*/, + uint32_t /*operation*/, + const char* operation_name, + void* data_v) { + static_cast<call_stack_t*>(data_v)->emplace_back( + source_location{"rocprofiler_iterate_buffer_trace_kind_operation_names", + __FILE__, + __LINE__, + std::string{" "} + std::string{operation_name}}); + return 0; + }; + + // + // callback for each buffer kind (i.e. domain) + // + static auto tracing_kind_names_cb = + [](rocprofiler_service_buffer_tracing_kind_t kind, const char* kind_name, void* data) { + // store the buffer kind name + static_cast<call_stack_t*>(data)->emplace_back( + source_location{"rocprofiler_iterate_buffer_trace_kind_names ", + __FILE__, + __LINE__, + kind_name}); + + // store the operation names for the HSA API + if(kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API) + { + rocprofiler_iterate_buffer_tracing_kind_operation_names( + kind, tracing_operation_names_cb, data); + } + + return 0; + }; + + rocprofiler_iterate_buffer_tracing_kind_names(tracing_kind_names_cb, + static_cast<void*>(tool_data)); +} + +void +tool_tracing_callback(rocprofiler_context_id_t context, + rocprofiler_buffer_id_t buffer_id, + rocprofiler_record_header_t** headers, + size_t num_headers, + void* user_data, + uint64_t drop_count) +{ + assert(user_data != nullptr); + + if(num_headers == 0) + throw std::runtime_error{ + "rocprofiler invoked a buffer callback with no headers. 
this should never happen"}; + else if(headers == nullptr) + throw std::runtime_error{"rocprofiler invoked a buffer callback with a null pointer to the " + "array of headers. this should never happen"}; + + for(size_t i = 0; i < num_headers; ++i) + { + auto* header = headers[i]; + + if(header == nullptr) + { + throw std::runtime_error{ + "rocprofiler provided a null pointer to header. this should never happen"}; + } + else if(header->hash != + rocprofiler_record_header_compute_hash(header->category, header->kind)) + { + throw std::runtime_error{"rocprofiler_record_header_t (category | kind) != hash"}; + } + else if(header->category == ROCPROFILER_BUFFER_CATEGORY_TRACING && + header->kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API) + { + auto* record = + static_cast(header->payload); + auto info = std::stringstream{}; + info << "tid=" << record->thread_id << ", context=" << context.handle + << ", buffer_id=" << buffer_id.handle << ", cid=" << record->correlation_id.id + << ", kind=" << record->kind << ", operation=" << record->operation + << ", drop_count=" << drop_count << ", start=" << record->start_timestamp + << ", stop=" << record->end_timestamp; + + if(record->start_timestamp > record->end_timestamp) + throw std::runtime_error("start > end"); + + static_cast(user_data)->emplace_back( + source_location{__FUNCTION__, __FILE__, __LINE__, info.str()}); + } + else + { + throw std::runtime_error{"unexpected rocprofiler_record_header_t category + kind"}; + } + } +} + +void +thread_precreate(rocprofiler_internal_thread_library_t lib, void* tool_data) +{ + static_cast(tool_data)->emplace_back( + source_location{__FUNCTION__, + __FILE__, + __LINE__, + std::string{"internal thread about to be created by rocprofiler (lib="} + + std::to_string(lib) + ")"}); +} + +void +thread_postcreate(rocprofiler_internal_thread_library_t lib, void* tool_data) +{ + static_cast(tool_data)->emplace_back( + source_location{__FUNCTION__, + __FILE__, + __LINE__, + std::string{"internal 
thread was created by rocprofiler (lib="} + + std::to_string(lib) + ")"}); +} + +int +tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data) +{ + assert(tool_data != nullptr); + + static_cast<call_stack_t*>(tool_data)->emplace_back( + source_location{__FUNCTION__, __FILE__, __LINE__, ""}); + + store_buffer_id_names(static_cast<call_stack_t*>(tool_data)); + + client_fini_func = fini_func; + + ROCPROFILER_CALL(rocprofiler_create_context(&client_ctx), "context creation failed"); + + ROCPROFILER_CALL(rocprofiler_create_buffer(client_ctx, + 4096, + 2048, + ROCPROFILER_BUFFER_POLICY_LOSSLESS, + tool_tracing_callback, + tool_data, + &client_buffer), + "buffer creation failed"); + + ROCPROFILER_CALL( + rocprofiler_configure_buffer_tracing_service( + client_ctx, ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, nullptr, 0, client_buffer), + "buffer tracing service failed to configure"); + + auto client_thread = rocprofiler_callback_thread_t{}; + ROCPROFILER_CALL(rocprofiler_create_callback_thread(&client_thread), + "failure creating callback thread"); + + ROCPROFILER_CALL(rocprofiler_assign_callback_thread(client_buffer, client_thread), + "failed to assign thread for buffer"); + + int valid_ctx = 0; + ROCPROFILER_CALL(rocprofiler_context_is_valid(client_ctx, &valid_ctx), + "failure checking context validity"); + if(valid_ctx == 0) + { + // notify rocprofiler that initialization failed + // and all the contexts, buffers, etc. 
created + // should be ignored + return -1; + } + + ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed"); + + // no errors + return 0; +} + +void +tool_fini(void* tool_data) +{ + assert(tool_data != nullptr); + + auto* _call_stack = static_cast(tool_data); + _call_stack->emplace_back(source_location{__FUNCTION__, __FILE__, __LINE__, ""}); + + print_call_stack(*_call_stack); + + delete _call_stack; +} +} // namespace + +void +setup() +{ + ROCPROFILER_CALL(rocprofiler_force_configure(&rocprofiler_configure), + "failed to force configuration"); +} + +void +shutdown() +{ + if(client_id) + { + auto status = ROCPROFILER_STATUS_SUCCESS; + while((status = rocprofiler_flush_buffer(client_buffer)) == + ROCPROFILER_STATUS_ERROR_BUFFER_BUSY) + { + std::this_thread::yield(); + std::this_thread::sleep_for(std::chrono::milliseconds{10}); + } + ROCPROFILER_CALL(status, "rocprofiler_flush_buffer failed"); + while((status = rocprofiler_flush_buffer(client_buffer)) == + ROCPROFILER_STATUS_ERROR_BUFFER_BUSY) + { + std::this_thread::yield(); + std::this_thread::sleep_for(std::chrono::milliseconds{10}); + } + client_fini_func(*client_id); + } +} + +void +start() +{ + ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed"); +} + +void +stop() +{ + ROCPROFILER_CALL(rocprofiler_stop_context(client_ctx), "rocprofiler context stop failed"); +} +} // namespace client + +extern "C" rocprofiler_tool_configure_result_t* +rocprofiler_configure(uint32_t version, + const char* runtime_version, + uint32_t priority, + rocprofiler_client_id_t* id) +{ + // only activate if main tool + if(priority > 0) return nullptr; + + // set the client name + id->name = "ExampleTool"; + + // store client info + client::client_id = id; + + // compute major/minor/patch version info + uint32_t major = version / 10000; + uint32_t minor = (version % 10000) / 100; + uint32_t patch = version % 100; + + // generate info string + auto info = 
std::stringstream{}; + info << id->name << " is using rocprofiler v" << major << "." << minor << "." << patch << " (" + << runtime_version << ")"; + + std::clog << info.str() << std::endl; + + auto* client_tool_data = new std::vector{}; + + client_tool_data->emplace_back( + client::source_location{__FUNCTION__, __FILE__, __LINE__, info.str()}); + + ROCPROFILER_CALL(rocprofiler_at_internal_thread_create( + client::thread_precreate, + client::thread_postcreate, + ROCPROFILER_LIBRARY | ROCPROFILER_HSA_LIBRARY | ROCPROFILER_HIP_LIBRARY | + ROCPROFILER_MARKER_LIBRARY, + static_cast(client_tool_data)), + "failed to register for thread creation notifications"); + + // create configure data + static auto cfg = + rocprofiler_tool_configure_result_t{sizeof(rocprofiler_tool_configure_result_t), + &client::tool_init, + &client::tool_fini, + static_cast(client_tool_data)}; + + // return pointer to configure data + return &cfg; +} diff --git a/projects/rocprofiler-sdk/samples/api_buffered_tracing/client.hpp b/projects/rocprofiler-sdk/samples/api_buffered_tracing/client.hpp new file mode 100644 index 0000000000..c58ea04b07 --- /dev/null +++ b/projects/rocprofiler-sdk/samples/api_buffered_tracing/client.hpp @@ -0,0 +1,44 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. 
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#ifdef buffered_api_tracing_client_EXPORTS +# define CLIENT_API __attribute__((visibility("default"))) +#else +# define CLIENT_API +#endif + +namespace client +{ +void +setup() CLIENT_API; + +void +shutdown() CLIENT_API; + +void +start() CLIENT_API; + +void +stop() CLIENT_API; +} // namespace client diff --git a/projects/rocprofiler-sdk/samples/api_buffered_tracing/main.cpp b/projects/rocprofiler-sdk/samples/api_buffered_tracing/main.cpp new file mode 100644 index 0000000000..3754a825da --- /dev/null +++ b/projects/rocprofiler-sdk/samples/api_buffered_tracing/main.cpp @@ -0,0 +1,244 @@ +/* +Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. +*/ + +#include "client.hpp" + +#include "hip/hip_runtime.h" + +#include +#include +#include +#include +#include +#include +#include + +#define HIP_API_CALL(CALL) \ + { \ + hipError_t error_ = (CALL); \ + if(error_ != hipSuccess) \ + { \ + auto _hip_api_print_lk = auto_lock_t{print_lock}; \ + fprintf(stderr, \ + "%s:%d :: HIP error : %s\n", \ + __FILE__, \ + __LINE__, \ + hipGetErrorString(error_)); \ + throw std::runtime_error("hip_api_call"); \ + } \ + } + +namespace +{ +using auto_lock_t = std::unique_lock; +auto print_lock = std::mutex{}; +size_t nthreads = 2; +size_t nitr = 500; +size_t nsync = 10; +constexpr unsigned shared_mem_tile_dim = 32; + +void +check_hip_error(void); + +void +verify(int* in, int* out, int M, int N); +} // namespace + +__global__ void +transpose_a(int* in, int* out, int M, int N); + +void +run(int rank, int tid, hipStream_t stream, int argc, char** argv); + +int +main(int argc, char** argv) +{ + client::setup(); // forces rocprofiler to configure/initialize + client::start(); // starts context before any API tables are available + + int rank = 0; + int size = 1; + for(int i = 1; i < argc; ++i) + { + auto _arg = std::string{argv[i]}; + if(_arg == "?" 
|| _arg == "-h" || _arg == "--help") + { + fprintf(stderr, + "usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATION (%zu)] " + "[SYNC_EVERY_N_ITERATIONS (%zu)]\n", + nthreads, + nitr, + nsync); + exit(EXIT_SUCCESS); + } + } + if(argc > 1) nthreads = atoll(argv[1]); + if(argc > 2) nitr = atoll(argv[2]); + if(argc > 3) nsync = atoll(argv[3]); + + printf("[transpose] Number of threads: %zu\n", nthreads); + printf("[transpose] Number of iterations: %zu\n", nitr); + printf("[transpose] Syncing every %zu iterations\n", nsync); + + // this is a temporary workaround in omnitrace when HIP + MPI is enabled + int ndevice = 0; + int devid = rank; + HIP_API_CALL(hipGetDeviceCount(&ndevice)); + printf("[transpose] Number of devices found: %i\n", ndevice); + if(ndevice > 0) + { + devid = rank % ndevice; + HIP_API_CALL(hipSetDevice(devid)); + printf("[transpose] Rank %i assigned to device %i\n", rank, devid); + } + if(rank == devid && rank < ndevice) + { + std::vector _threads{}; + std::vector _streams(nthreads); + for(size_t i = 0; i < nthreads; ++i) + HIP_API_CALL(hipStreamCreate(&_streams.at(i))); + for(size_t i = 1; i < nthreads; ++i) + _threads.emplace_back(run, rank, i, _streams.at(i), argc, argv); + run(rank, 0, _streams.at(0), argc, argv); + for(auto& itr : _threads) + itr.join(); + for(size_t i = 0; i < nthreads; ++i) + HIP_API_CALL(hipStreamDestroy(_streams.at(i))); + } + HIP_API_CALL(hipDeviceSynchronize()); + HIP_API_CALL(hipDeviceReset()); + + client::stop(); + client::shutdown(); + + return 0; +} + +__global__ void +transpose_a(int* in, int* out, int M, int N) +{ + __shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim]; + + int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x; + tile[threadIdx.y][threadIdx.x] = in[idx]; + __syncthreads(); + idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x; + out[idx] = tile[threadIdx.x][threadIdx.y]; +} + +void +run(int rank, int tid, hipStream_t 
stream, int argc, char** argv) +{ + unsigned int M = 4960 * 2; + unsigned int N = 4960 * 2; + if(argc > 2) nitr = atoll(argv[2]); + if(argc > 3) nsync = atoll(argv[3]); + + auto_lock_t _lk{print_lock}; + std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl; + _lk.unlock(); + + std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)}; + std::uniform_int_distribution _dist{0, 1000}; + + size_t size = sizeof(int) * M * N; + int* inp_matrix = new int[size]; + int* out_matrix = new int[size]; + for(size_t i = 0; i < M * N; i++) + { + inp_matrix[i] = _dist(_engine); + out_matrix[i] = 0; + } + int* in = nullptr; + int* out = nullptr; + + HIP_API_CALL(hipMalloc(&in, size)); + HIP_API_CALL(hipMalloc(&out, size)); + HIP_API_CALL(hipMemsetAsync(in, 0, size, stream)); + HIP_API_CALL(hipMemsetAsync(out, 0, size, stream)); + HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream)); + HIP_API_CALL(hipStreamSynchronize(stream)); + + dim3 grid(M / 32, N / 32, 1); + dim3 block(32, 32, 1); // transpose_a + + auto t1 = std::chrono::high_resolution_clock::now(); + for(size_t i = 0; i < nitr; ++i) + { + transpose_a<<<grid, block, 0, stream>>>(in, out, M, N); + check_hip_error(); + if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream)); + } + auto t2 = std::chrono::high_resolution_clock::now(); + HIP_API_CALL(hipStreamSynchronize(stream)); + HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream)); + double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count(); + float GB = (float) size * nitr * 2 / (1 << 30); + + print_lock.lock(); + std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n" + << "The average performance of transpose is " << GB / time << " GBytes/sec" + << std::endl; + print_lock.unlock(); + + HIP_API_CALL(hipStreamSynchronize(stream)); + + // cpu_transpose(matrix, out_matrix, M, N); + verify(inp_matrix, out_matrix, M, N); + + 
HIP_API_CALL(hipFree(in)); + HIP_API_CALL(hipFree(out)); + + delete[] inp_matrix; + delete[] out_matrix; +} + +namespace +{ +void +check_hip_error(void) +{ + hipError_t err = hipGetLastError(); + if(err != hipSuccess) + { + auto_lock_t _lk{print_lock}; + std::cerr << "Error: " << hipGetErrorString(err) << std::endl; + throw std::runtime_error("hip_api_call"); + } +} + +void +verify(int* in, int* out, int M, int N) +{ + for(int i = 0; i < 10; i++) + { + int row = rand() % M; + int col = rand() % N; + if(in[row * N + col] != out[col * M + row]) + { + auto_lock_t _lk{print_lock}; + std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | " + << out[col * M + row] << "\n"; + } + } +} +} // namespace diff --git a/projects/rocprofiler-sdk/samples/api_callback_tracing/CMakeLists.txt b/projects/rocprofiler-sdk/samples/api_callback_tracing/CMakeLists.txt new file mode 100644 index 0000000000..7c349b05fa --- /dev/null +++ b/projects/rocprofiler-sdk/samples/api_callback_tracing/CMakeLists.txt @@ -0,0 +1,52 @@ +# +# +# +cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR) + +if(NOT CMAKE_HIP_COMPILER) + find_program( + amdclangpp_EXECUTABLE + NAMES amdclang++ + HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm + PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm + PATH_SUFFIXES bin llvm/bin NO_CACHE) + mark_as_advanced(amdclangpp_EXECUTABLE) + + if(amdclangpp_EXECUTABLE) + set(CMAKE_HIP_COMPILER "${amdclangpp_EXECUTABLE}") + endif() +endif() + +project(rocprofiler-samples-callback-api-tracing LANGUAGES CXX HIP) + +foreach(_TYPE DEBUG MINSIZEREL RELEASE RELWITHDEBINFO) + if("${CMAKE_HIP_FLAGS_${_TYPE}}" STREQUAL "") + set(CMAKE_HIP_FLAGS_${_TYPE} "${CMAKE_CXX_FLAGS_${_TYPE}}") + endif() +endforeach() + +add_library(callback-api-tracing-client SHARED) +target_sources(callback-api-tracing-client PRIVATE client.cpp client.hpp) +target_link_libraries(callback-api-tracing-client + PRIVATE rocprofiler::rocprofiler-library) + +set_source_files_properties(main.cpp PROPERTIES 
LANGUAGE HIP) +find_package(Threads REQUIRED) + +add_executable(callback-api-tracing) +target_sources(callback-api-tracing PRIVATE main.cpp) +target_link_libraries(callback-api-tracing PRIVATE callback-api-tracing-client + Threads::Threads) + +add_test(NAME callback-api-tracing COMMAND $) + +set_tests_properties( + callback-api-tracing + PROPERTIES + TIMEOUT + 45 + LABELS + "samples" + ENVIRONMENT + "${ROCPROFILER_MEMCHECK_PRELOAD_ENV};HSA_TOOLS_LIB=$" + ) diff --git a/projects/rocprofiler-sdk/samples/api_callback_tracing/client.cpp b/projects/rocprofiler-sdk/samples/api_callback_tracing/client.cpp new file mode 100644 index 0000000000..b8dcc064c6 --- /dev/null +++ b/projects/rocprofiler-sdk/samples/api_callback_tracing/client.cpp @@ -0,0 +1,317 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +// undefine NDEBUG so asserts are implemented +#ifdef NDEBUG +# undef NDEBUG +#endif + +/** + * @file samples/api_callback_tracing/client.cpp + * + * @brief Example rocprofiler client (tool) + */ + +#include "client.hpp" + +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define ROCPROFILER_CALL(result, msg) \ + { \ + rocprofiler_status_t CHECKSTATUS = result; \ + if(CHECKSTATUS != ROCPROFILER_STATUS_SUCCESS) \ + { \ + std::cerr << #result << " failed with error code " << CHECKSTATUS << std::endl; \ + throw std::runtime_error(#result " failure"); \ + } \ + } + +namespace client +{ +namespace +{ +struct source_location +{ + std::string function = {}; + std::string file = {}; + uint32_t line = 0; + std::string context = {}; +}; + +using call_stack_t = std::vector; + +rocprofiler_client_id_t* client_id = nullptr; +rocprofiler_client_finalize_t client_fini_func = nullptr; +rocprofiler_context_id_t client_ctx = {}; + +void +print_call_stack(const call_stack_t& _call_stack) +{ + namespace fs = ::std::filesystem; + + size_t n = 0; + for(const auto& itr : _call_stack) + { + std::clog << std::setw(2) << ++n << "/" << std::setw(2) << _call_stack.size() << " "; + std::clog << "[" << fs::path{itr.file}.filename() << ":" << itr.line << "] " + << std::setw(20) << std::left << itr.function; + if(!itr.context.empty()) std::clog << " :: " << itr.context; + std::clog << "\n"; + } + + std::clog << std::flush; +} + +void +store_callback_id_names(call_stack_t* tool_data) +{ + // + // callback for each kind operation + // + static auto tracing_operation_names_cb = + [](rocprofiler_service_callback_tracing_kind_t /*kindv*/, + uint32_t /*operation*/, + const char* operation_name, + void* data_v) { + static_cast(data_v)->emplace_back( + source_location{"rocprofiler_iterate_callback_tracing_kind_operation_names", + __FILE__, + __LINE__, + std::string{" "} + std::string{operation_name}}); + return 0; + 
}; + + // + // callback for each callback kind (i.e. domain) + // + static auto tracing_kind_names_cb = [](rocprofiler_service_callback_tracing_kind_t kind, + const char* kind_name, + void* data) { + // store the callback kind name + static_cast<call_stack_t*>(data)->emplace_back(source_location{ + "rocprofiler_iterate_callback_tracing_kind_names ", __FILE__, __LINE__, kind_name}); + + // store the operation names for the HSA API + if(kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API) + { + rocprofiler_iterate_callback_tracing_kind_operation_names( + kind, tracing_operation_names_cb, data); + } + + return 0; + }; + + rocprofiler_iterate_callback_tracing_kind_names(tracing_kind_names_cb, + static_cast<void*>(tool_data)); +} + +void +tool_tracing_callback(rocprofiler_callback_tracing_record_t record, void* user_data) +{ + assert(user_data != nullptr); + + auto info = std::stringstream{}; + info << "tid=" << record.thread_id << ", cid=" << record.correlation_id.id + << ", kind=" << record.kind << ", operation=" << record.operation + << ", phase=" << record.phase; + + auto info_data_cb = [](rocprofiler_service_callback_tracing_kind_t, + uint32_t, + uint32_t arg_num, + const char* arg_name, + const char* arg_value_str, + const void* const arg_value_addr, + void* cb_data) -> int { + auto& dss = *static_cast<std::stringstream*>(cb_data); + dss << ((arg_num == 0) ? 
"(" : ", "); + dss << arg_num << ": " << arg_name << "=" << arg_value_str; + (void) arg_value_addr; + return 0; + }; + + auto info_data = std::stringstream{}; + ROCPROFILER_CALL(rocprofiler_iterate_callback_tracing_operation_args( + record, info_data_cb, static_cast(&info_data)), + "Failure iterating trace operation args"); + + auto info_data_str = info_data.str(); + if(!info_data_str.empty()) info << " " << info_data_str << ")"; + + static auto _mutex = std::mutex{}; + _mutex.lock(); + static_cast(user_data)->emplace_back( + source_location{__FUNCTION__, __FILE__, __LINE__, info.str()}); + _mutex.unlock(); +} + +int +tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data) +{ + assert(tool_data != nullptr); + + static_cast(tool_data)->emplace_back( + source_location{__FUNCTION__, __FILE__, __LINE__, ""}); + + store_callback_id_names(static_cast(tool_data)); + + client_fini_func = fini_func; + + ROCPROFILER_CALL(rocprofiler_create_context(&client_ctx), "context creation failed"); + + ROCPROFILER_CALL( + rocprofiler_configure_callback_tracing_service(client_ctx, + ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API, + nullptr, + 0, + tool_tracing_callback, + tool_data), + "callback tracing service failed to configure"); + + int valid_ctx = 0; + ROCPROFILER_CALL(rocprofiler_context_is_valid(client_ctx, &valid_ctx), + "failure checking context validity"); + if(valid_ctx == 0) + { + // notify rocprofiler that initialization failed + // and all the contexts, buffers, etc. 
created + // should be ignored + return -1; + } + + ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed"); + + // no errors + return 0; +} + +void +tool_fini(void* tool_data) +{ + assert(tool_data != nullptr); + + auto* _call_stack = static_cast(tool_data); + _call_stack->emplace_back(source_location{__FUNCTION__, __FILE__, __LINE__, ""}); + + print_call_stack(*_call_stack); + + delete _call_stack; +} +} // namespace + +void +setup() +{} + +void +shutdown() +{ + if(client_id) client_fini_func(*client_id); +} + +void +start() +{ + ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed"); +} + +void +stop() +{ + int status = 0; + ROCPROFILER_CALL(rocprofiler_is_initialized(&status), "failed to retrieve init status"); + if(status != 0) + { + ROCPROFILER_CALL(rocprofiler_stop_context(client_ctx), "rocprofiler context stop failed"); + } +} +} // namespace client + +extern "C" rocprofiler_tool_configure_result_t* +rocprofiler_configure(uint32_t version, + const char* runtime_version, + uint32_t priority, + rocprofiler_client_id_t* id) +{ + // only activate if main tool + if(priority > 0) return nullptr; + + // set the client name + id->name = "ExampleTool"; + + // store client info + client::client_id = id; + + // compute major/minor/patch version info + uint32_t major = version / 10000; + uint32_t minor = (version % 10000) / 100; + uint32_t patch = version % 100; + + // generate info string + auto info = std::stringstream{}; + info << id->name << " is using rocprofiler v" << major << "." << minor << "." 
<< patch << " (" + << runtime_version << ")"; + + std::clog << info.str() << std::endl; + + // demonstration of alternative way to get the version info + { + auto version_info = std::array{}; + ROCPROFILER_CALL( + rocprofiler_get_version(&version_info.at(0), &version_info.at(1), &version_info.at(2)), + "failed to get version info"); + + if(std::array{major, minor, patch} != version_info) + { + throw std::runtime_error{"version info mismatch"}; + } + } + + // data passed around all the callbacks + auto* client_tool_data = new std::vector{}; + + // add first entry + client_tool_data->emplace_back( + client::source_location{__FUNCTION__, __FILE__, __LINE__, info.str()}); + + // create configure data + static auto cfg = + rocprofiler_tool_configure_result_t{sizeof(rocprofiler_tool_configure_result_t), + &client::tool_init, + &client::tool_fini, + static_cast(client_tool_data)}; + + // return pointer to configure data + return &cfg; +} diff --git a/projects/rocprofiler-sdk/samples/api_callback_tracing/client.hpp b/projects/rocprofiler-sdk/samples/api_callback_tracing/client.hpp new file mode 100644 index 0000000000..134efb027d --- /dev/null +++ b/projects/rocprofiler-sdk/samples/api_callback_tracing/client.hpp @@ -0,0 +1,44 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. 
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#ifdef callback_api_tracing_client_EXPORTS +# define CLIENT_API __attribute__((visibility("default"))) +#else +# define CLIENT_API +#endif + +namespace client +{ +void +setup() CLIENT_API; + +void +shutdown() CLIENT_API; + +void +start() CLIENT_API; + +void +stop() CLIENT_API; +} // namespace client diff --git a/projects/rocprofiler-sdk/samples/api_callback_tracing/main.cpp b/projects/rocprofiler-sdk/samples/api_callback_tracing/main.cpp new file mode 100644 index 0000000000..268b8b64f0 --- /dev/null +++ b/projects/rocprofiler-sdk/samples/api_callback_tracing/main.cpp @@ -0,0 +1,244 @@ +/* +Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. +*/ + +#include "client.hpp" + +#include "hip/hip_runtime.h" + +#include <chrono> +#include <iostream> +#include <mutex> +#include <random> +#include <stdexcept> +#include <thread> +#include <vector> + +#define HIP_API_CALL(CALL) \ + { \ + hipError_t error_ = (CALL); \ + if(error_ != hipSuccess) \ + { \ + auto _hip_api_print_lk = auto_lock_t{print_lock}; \ + fprintf(stderr, \ + "%s:%d :: HIP error : %s\n", \ + __FILE__, \ + __LINE__, \ + hipGetErrorString(error_)); \ + throw std::runtime_error("hip_api_call"); \ + } \ + } + +namespace +{ +using auto_lock_t = std::unique_lock<std::mutex>; +auto print_lock = std::mutex{}; +size_t nthreads = 2; +size_t nitr = 500; +size_t nsync = 10; +constexpr unsigned shared_mem_tile_dim = 32; + +void +check_hip_error(void); + +void +verify(int* in, int* out, int M, int N); +} // namespace + +__global__ void +transpose_a(int* in, int* out, int M, int N); + +void +run(int rank, int tid, hipStream_t stream, int argc, char** argv); + +int +main(int argc, char** argv) +{ + client::setup(); // currently does nothing + // client::start(); // currently will fail + + int rank = 0; + int size = 1; + for(int i = 1; i < argc; ++i) + { + auto _arg = std::string{argv[i]}; + if(_arg == "?" 
|| _arg == "-h" || _arg == "--help") + { + fprintf(stderr, + "usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATION (%zu)] " + "[SYNC_EVERY_N_ITERATIONS (%zu)]\n", + nthreads, + nitr, + nsync); + exit(EXIT_SUCCESS); + } + } + if(argc > 1) nthreads = atoll(argv[1]); + if(argc > 2) nitr = atoll(argv[2]); + if(argc > 3) nsync = atoll(argv[3]); + + printf("[transpose] Number of threads: %zu\n", nthreads); + printf("[transpose] Number of iterations: %zu\n", nitr); + printf("[transpose] Syncing every %zu iterations\n", nsync); + + // this is a temporary workaround in omnitrace when HIP + MPI is enabled + int ndevice = 0; + int devid = rank; + HIP_API_CALL(hipGetDeviceCount(&ndevice)); + printf("[transpose] Number of devices found: %i\n", ndevice); + if(ndevice > 0) + { + devid = rank % ndevice; + HIP_API_CALL(hipSetDevice(devid)); + printf("[transpose] Rank %i assigned to device %i\n", rank, devid); + } + if(rank == devid && rank < ndevice) + { + std::vector<std::thread> _threads{}; + std::vector<hipStream_t> _streams(nthreads); + for(size_t i = 0; i < nthreads; ++i) + HIP_API_CALL(hipStreamCreate(&_streams.at(i))); + for(size_t i = 1; i < nthreads; ++i) + _threads.emplace_back(run, rank, i, _streams.at(i), argc, argv); + run(rank, 0, _streams.at(0), argc, argv); + for(auto& itr : _threads) + itr.join(); + for(size_t i = 0; i < nthreads; ++i) + HIP_API_CALL(hipStreamDestroy(_streams.at(i))); + } + HIP_API_CALL(hipDeviceSynchronize()); + HIP_API_CALL(hipDeviceReset()); + + client::stop(); + client::shutdown(); + + return 0; +} + +__global__ void +transpose_a(int* in, int* out, int M, int N) +{ + __shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim]; + + int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x; + tile[threadIdx.y][threadIdx.x] = in[idx]; + __syncthreads(); + idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x; + out[idx] = tile[threadIdx.x][threadIdx.y]; +} + +void +run(int rank, int tid, hipStream_t 
stream, int argc, char** argv) +{ + unsigned int M = 4960 * 2; + unsigned int N = 4960 * 2; + if(argc > 2) nitr = atoll(argv[2]); + if(argc > 3) nsync = atoll(argv[3]); + + auto_lock_t _lk{print_lock}; + std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl; + _lk.unlock(); + + std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)}; + std::uniform_int_distribution<int> _dist{0, 1000}; + + size_t size = sizeof(int) * M * N; + int* inp_matrix = new int[size]; + int* out_matrix = new int[size]; + for(size_t i = 0; i < M * N; i++) + { + inp_matrix[i] = _dist(_engine); + out_matrix[i] = 0; + } + int* in = nullptr; + int* out = nullptr; + + HIP_API_CALL(hipMalloc(&in, size)); + HIP_API_CALL(hipMalloc(&out, size)); + HIP_API_CALL(hipMemsetAsync(in, 0, size, stream)); + HIP_API_CALL(hipMemsetAsync(out, 0, size, stream)); + HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream)); + HIP_API_CALL(hipStreamSynchronize(stream)); + + dim3 grid(M / 32, N / 32, 1); + dim3 block(32, 32, 1); // transpose_a + + auto t1 = std::chrono::high_resolution_clock::now(); + for(size_t i = 0; i < nitr; ++i) + { + transpose_a<<<grid, block, 0, stream>>>(in, out, M, N); + check_hip_error(); + if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream)); + } + auto t2 = std::chrono::high_resolution_clock::now(); + HIP_API_CALL(hipStreamSynchronize(stream)); + HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream)); + double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count(); + float GB = (float) size * nitr * 2 / (1 << 30); + + print_lock.lock(); + std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n" + << "The average performance of transpose is " << GB / time << " GBytes/sec" + << std::endl; + print_lock.unlock(); + + HIP_API_CALL(hipStreamSynchronize(stream)); + + // cpu_transpose(matrix, out_matrix, M, N); + verify(inp_matrix, out_matrix, M, N); + + 
HIP_API_CALL(hipFree(in)); + HIP_API_CALL(hipFree(out)); + + delete[] inp_matrix; + delete[] out_matrix; +} + +namespace +{ +void +check_hip_error(void) +{ + hipError_t err = hipGetLastError(); + if(err != hipSuccess) + { + auto_lock_t _lk{print_lock}; + std::cerr << "Error: " << hipGetErrorString(err) << std::endl; + throw std::runtime_error("hip_api_call"); + } +} + +void +verify(int* in, int* out, int M, int N) +{ + for(int i = 0; i < 10; i++) + { + int row = rand() % M; + int col = rand() % N; + if(in[row * N + col] != out[col * M + row]) + { + auto_lock_t _lk{print_lock}; + std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | " + << out[col * M + row] << "\n"; + } + } +} +} // namespace diff --git a/projects/rocprofiler-sdk/samples/apps/transpose/CMakeLists.txt b/projects/rocprofiler-sdk/samples/apps/transpose/CMakeLists.txt new file mode 100644 index 0000000000..a9604846bb --- /dev/null +++ b/projects/rocprofiler-sdk/samples/apps/transpose/CMakeLists.txt @@ -0,0 +1,38 @@ +cmake_minimum_required(VERSION 3.21 FATAL_ERROR) + +find_program( + HIPCC_EXECUTABLE + NAMES hipcc + HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm + PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm NO_CACHE) +mark_as_advanced(HIPCC_EXECUTABLE) + +if(HIPCC_EXECUTABLE) + set(CMAKE_CXX_COMPILER ${HIPCC_EXECUTABLE}) +endif() + +project(rocprofiler-transpose-sample LANGUAGES CXX) + +option(TRANSPOSE_USE_MPI "Enable MPI support in transpose exe" OFF) + +set(CMAKE_CXX_STANDARD 17) +set(CMAKE_CXX_EXTENSIONS OFF) +set(CMAKE_CXX_STANDARD_REQUIRED ON) + +add_executable(transpose) +target_sources(transpose PRIVATE transpose.cpp) +target_compile_options(transpose PRIVATE -W -Wall -Wextra -Wpedantic -Wshadow -Werror) + +find_package(Threads REQUIRED) +target_link_libraries(transpose PRIVATE Threads::Threads) + +if(TRANSPOSE_USE_MPI) + find_package(MPI REQUIRED) + target_compile_definitions(transpose PRIVATE USE_MPI) + target_link_libraries(transpose PRIVATE MPI::MPI_C) +endif() + 
+install( + TARGETS transpose + DESTINATION bin + COMPONENT rocprofiler-samples) diff --git a/projects/rocprofiler-sdk/samples/apps/transpose/transpose.cpp b/projects/rocprofiler-sdk/samples/apps/transpose/transpose.cpp new file mode 100644 index 0000000000..d473fd1d7e --- /dev/null +++ b/projects/rocprofiler-sdk/samples/apps/transpose/transpose.cpp @@ -0,0 +1,278 @@ +/* +Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+*/ + +#include "hip/hip_runtime.h" + +#include <chrono> +#include <iostream> +#include <mutex> +#include <random> +#include <stdexcept> +#include <thread> +#include <vector> + +#if defined(USE_MPI) +# include <mpi.h> +#endif + +#define HIP_API_CALL(CALL) \ + { \ + hipError_t error_ = (CALL); \ + if(error_ != hipSuccess) \ + { \ + auto _hip_api_print_lk = auto_lock_t{print_lock}; \ + fprintf(stderr, \ + "%s:%d :: HIP error : %s\n", \ + __FILE__, \ + __LINE__, \ + hipGetErrorString(error_)); \ + throw std::runtime_error("hip_api_call"); \ + } \ + } + +namespace +{ +using auto_lock_t = std::unique_lock<std::mutex>; +auto print_lock = std::mutex{}; +size_t nthreads = 2; +size_t nitr = 500; +size_t nsync = 10; +constexpr unsigned shared_mem_tile_dim = 32; + +void +check_hip_error(void); + +void +verify(int* in, int* out, int M, int N); +} // namespace + +__global__ void +transpose_a(int* in, int* out, int M, int N); + +void +run(int rank, int tid, hipStream_t stream, int argc, char** argv); + +#if defined(USE_MPI) +void +do_a2a(int rank); +#endif + +int +main(int argc, char** argv) +{ + int rank = 0; + int size = 1; + for(int i = 1; i < argc; ++i) + { + auto _arg = std::string{argv[i]}; + if(_arg == "?" 
|| _arg == "-h" || _arg == "--help") + { + fprintf(stderr, + "usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATION (%zu)] " + "[SYNC_EVERY_N_ITERATIONS (%zu)]\n", + nthreads, + nitr, + nsync); + exit(EXIT_SUCCESS); + } + } + if(argc > 1) nthreads = atoll(argv[1]); + if(argc > 2) nitr = atoll(argv[2]); + if(argc > 3) nsync = atoll(argv[3]); + + printf("[transpose] Number of threads: %zu\n", nthreads); + printf("[transpose] Number of iterations: %zu\n", nitr); + printf("[transpose] Syncing every %zu iterations\n", nsync); + +#if defined(USE_MPI) + MPI_Init(&argc, &argv); + MPI_Comm_rank(MPI_COMM_WORLD, &rank); + MPI_Comm_size(MPI_COMM_WORLD, &size); +#else + (void) size; +#endif + // this is a temporary workaround in omnitrace when HIP + MPI is enabled + int ndevice = 0; + int devid = rank; + HIP_API_CALL(hipGetDeviceCount(&ndevice)); + printf("[transpose] Number of devices found: %i\n", ndevice); + if(ndevice > 0) + { + devid = rank % ndevice; + HIP_API_CALL(hipSetDevice(devid)); + printf("[transpose] Rank %i assigned to device %i\n", rank, devid); + } + if(rank == devid && rank < ndevice) + { + std::vector<std::thread> _threads{}; + std::vector<hipStream_t> _streams(nthreads); + for(size_t i = 0; i < nthreads; ++i) + HIP_API_CALL(hipStreamCreate(&_streams.at(i))); + for(size_t i = 1; i < nthreads; ++i) + _threads.emplace_back(run, rank, i, _streams.at(i), argc, argv); + run(rank, 0, _streams.at(0), argc, argv); + for(auto& itr : _threads) + itr.join(); + for(size_t i = 0; i < nthreads; ++i) + HIP_API_CALL(hipStreamDestroy(_streams.at(i))); + } + HIP_API_CALL(hipDeviceSynchronize()); + HIP_API_CALL(hipDeviceReset()); + +#if defined(USE_MPI) + MPI_Barrier(MPI_COMM_WORLD); + do_a2a(rank); + MPI_Finalize(); +#endif + + return 0; +} + +__global__ void +transpose_a(int* in, int* out, int M, int N) +{ + __shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim]; + + int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x; + tile[threadIdx.y][threadIdx.x] = 
in[idx]; + __syncthreads(); + idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x; + out[idx] = tile[threadIdx.x][threadIdx.y]; +} + +void +run(int rank, int tid, hipStream_t stream, int argc, char** argv) +{ + unsigned int M = 4960 * 2; + unsigned int N = 4960 * 2; + if(argc > 2) nitr = atoll(argv[2]); + if(argc > 3) nsync = atoll(argv[3]); + + auto_lock_t _lk{print_lock}; + std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl; + _lk.unlock(); + + std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)}; + std::uniform_int_distribution<int> _dist{0, 1000}; + + size_t size = sizeof(int) * M * N; + int* inp_matrix = new int[size]; + int* out_matrix = new int[size]; + for(size_t i = 0; i < M * N; i++) + { + inp_matrix[i] = _dist(_engine); + out_matrix[i] = 0; + } + int* in = nullptr; + int* out = nullptr; + + HIP_API_CALL(hipMalloc(&in, size)); + HIP_API_CALL(hipMalloc(&out, size)); + HIP_API_CALL(hipMemsetAsync(in, 0, size, stream)); + HIP_API_CALL(hipMemsetAsync(out, 0, size, stream)); + HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream)); + HIP_API_CALL(hipStreamSynchronize(stream)); + + dim3 grid(M / 32, N / 32, 1); + dim3 block(32, 32, 1); // transpose_a + + auto t1 = std::chrono::high_resolution_clock::now(); + for(size_t i = 0; i < nitr; ++i) + { + transpose_a<<<grid, block, 0, stream>>>(in, out, M, N); + check_hip_error(); + if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream)); + } + auto t2 = std::chrono::high_resolution_clock::now(); + HIP_API_CALL(hipStreamSynchronize(stream)); + HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream)); + double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count(); + float GB = (float) size * nitr * 2 / (1 << 30); + + print_lock.lock(); + std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n" + << "The average performance of transpose is " << GB / 
time << " GBytes/sec" + << std::endl; + print_lock.unlock(); + + HIP_API_CALL(hipStreamSynchronize(stream)); + + // cpu_transpose(matrix, out_matrix, M, N); + verify(inp_matrix, out_matrix, M, N); + + HIP_API_CALL(hipFree(in)); + HIP_API_CALL(hipFree(out)); + + delete[] inp_matrix; + delete[] out_matrix; +} + +namespace +{ +void +check_hip_error(void) +{ + hipError_t err = hipGetLastError(); + if(err != hipSuccess) + { + auto_lock_t _lk{print_lock}; + std::cerr << "Error: " << hipGetErrorString(err) << std::endl; + throw std::runtime_error("hip_api_call"); + } +} + +void +verify(int* in, int* out, int M, int N) +{ + for(int i = 0; i < 10; i++) + { + int row = rand() % M; + int col = rand() % N; + if(in[row * N + col] != out[col * M + row]) + { + auto_lock_t _lk{print_lock}; + std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | " + << out[col * M + row] << "\n"; + } + } +} +} // namespace + +#if defined(USE_MPI) +void +do_a2a(int rank) +{ + // Define my value + int values[3]; + for(int i = 0; i < 3; ++i) + values[i] = rank * 300 + i * 100; + printf("Process %d, values = %d, %d, %d.\n", rank, values[0], values[1], values[2]); + + int buffer_recv[3]; + MPI_Alltoall(&values, 1, MPI_INT, buffer_recv, 1, MPI_INT, MPI_COMM_WORLD); + printf("Values collected on process %d: %d, %d, %d.\n", + rank, + buffer_recv[0], + buffer_recv[1], + buffer_recv[2]); +} +#endif diff --git a/projects/rocprofiler-sdk/samples/pc_sampling/common.h b/projects/rocprofiler-sdk/samples/pc_sampling/common.h index fa49286a4a..b492112f99 100644 --- a/projects/rocprofiler-sdk/samples/pc_sampling/common.h +++ b/projects/rocprofiler-sdk/samples/pc_sampling/common.h @@ -102,7 +102,7 @@ rocprofiler_pc_sampling_callback(rocprofiler_context_id_t /*context_id*/, for(size_t i = 0; i < num_headers; i++) { auto* cur_header = headers[i]; - if(cur_header->kind == 0) + if(cur_header->category == ROCPROFILER_BUFFER_CATEGORY_PC_SAMPLING) { auto* pc_sample = 
static_cast(cur_header->payload); printf("--- pc: %lx, dispatch_id: %lx, timestamp: %lu, hardware_id: %lu\n", diff --git a/projects/rocprofiler-sdk/source/docs/rocprofiler.dox.in b/projects/rocprofiler-sdk/source/docs/rocprofiler.dox.in index 7a9b0d25cd..9c3e72b103 100644 --- a/projects/rocprofiler-sdk/source/docs/rocprofiler.dox.in +++ b/projects/rocprofiler-sdk/source/docs/rocprofiler.dox.in @@ -142,14 +142,22 @@ RECURSIVE = YES EXCLUDE = EXCLUDE_SYMLINKS = YES EXCLUDE_PATTERNS = */.git/* \ - @SOURCE_DIR@/samples/* \ @SOURCE_DIR@/**/tests/* \ - @SOURCE_DIR@/source/include/rocprofiler/defines.h \ - @SOURCE_DIR@/source/include/rocprofiler/config.h + @SOURCE_DIR@/**/scripts/* \ + @SOURCE_DIR@/**/docs/* EXCLUDE_SYMBOLS = "std::*" \ "ROCPROFILER_ATTRIBUTE" \ "ROCPROFILER_API" \ - "ROCPROFILER_NONNULL" + "ROCPROFILER_NONNULL" \ + "ROCPROFILER_PUBLIC_API" \ + "ROCPROFILER_HIDDEN_API" \ + "ROCPROFILER_EXPORT_DECORATOR" \ + "ROCPROFILER_IMPORT_DECORATOR" \ + "ROCPROFILER_EXPORT" \ + "ROCPROFILER_IMPORT" \ + "ROCPROFILER_HANDLE_LITERAL" \ + "ROCPROFILER_EXTERN_C_INIT" \ + "ROCPROFILER_EXTERN_C_FINI" EXAMPLE_PATH = @SOURCE_DIR@/samples EXAMPLE_PATTERNS = *.h \ *.hh \ @@ -157,7 +165,6 @@ EXAMPLE_PATTERNS = *.h \ *.c \ *.cc \ *.cpp \ - conf.py \ *.txt EXAMPLE_RECURSIVE = YES IMAGE_PATH = @@ -330,6 +337,13 @@ PREDEFINED = "ROCPROFILER_API=" \ "ROCPROFILER_EXPORT=" \ "ROCPROFILER_IMPORT=" \ "ROCPROFILER_NONNULL(...)=" \ + "ROCPROFILER_PUBLIC_API=" \ + "ROCPROFILER_HIDDEN_API=" \ + "ROCPROFILER_EXPORT_DECORATOR=" \ + "ROCPROFILER_IMPORT_DECORATOR=" \ + "ROCPROFILER_HANDLE_LITERAL=" \ + "ROCPROFILER_EXTERN_C_INIT=" \ + "ROCPROFILER_EXTERN_C_FINI=" \ "__attribute__(x)=" \ "__declspec(x)=" \ "size_t=unsigned long" \ diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/CMakeLists.txt b/projects/rocprofiler-sdk/source/include/rocprofiler/CMakeLists.txt index 151b3616e2..1f4a3708d6 100644 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/CMakeLists.txt +++ 
b/projects/rocprofiler-sdk/source/include/rocprofiler/CMakeLists.txt @@ -6,8 +6,30 @@ configure_file(${CMAKE_CURRENT_LIST_DIR}/version.h.in ${CMAKE_CURRENT_BINARY_DIR}/version.h @ONLY) -set(ROCPROFILER_HEADER_FILES config.h defines.h hip.h hsa.h marker.h rocprofiler.h - rocprofiler_plugin.h ${CMAKE_CURRENT_BINARY_DIR}/version.h) +set(ROCPROFILER_HEADER_FILES + # core headers + rocprofiler.h + rocprofiler_plugin.h + # secondary headers + agent.h + agent_profile.h + buffer.h + buffer_tracing.h + callback_tracing.h + context.h + counters.h + defines.h + dispatch_profile.h + external_correlation.h + fwd.h + hip.h + hsa.h + internal_threading.h + marker.h + pc_sampling.h + profile_config.h + spm.h + ${CMAKE_CURRENT_BINARY_DIR}/version.h) install(FILES ${ROCPROFILER_HEADER_FILES} DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/rocprofiler) diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/agent.h b/projects/rocprofiler-sdk/source/include/rocprofiler/agent.h new file mode 100644 index 0000000000..f45dfc9030 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/agent.h @@ -0,0 +1,72 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. 
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup AGENTS Agent Information + * @{ + */ + +/** + * @brief Agent. + */ +typedef struct +{ + rocprofiler_agent_id_t id; + rocprofiler_agent_type_t type; + const char* name; + rocprofiler_pc_sampling_config_array_t pc_sampling_configs; +} rocprofiler_agent_t; + +/** + * @brief Callback function type for querying the available agents + * + * @param [in] agents Array of pointers to agents + * @param [in] num_agents Number of agents in array + * @param [in] user_data Data pointer passback + * @return ::rocprofiler_status_t + */ +typedef rocprofiler_status_t (*rocprofiler_available_agents_cb_t)(rocprofiler_agent_t** agents, + size_t num_agents, + void* user_data); + +/** + * @brief Receive synchronous callback with an array of available agents at moment of invocation + * + * @param [in] callback Callback function accepting list of agents + * @param [in] agent_size Should be set to sizeof(rocprofiler_agent_t) + * @param [in] user_data Data pointer provided to callback + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback, + size_t agent_size, + void* user_data) ROCPROFILER_NONNULL(1); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/agent_profile.h b/projects/rocprofiler-sdk/source/include/rocprofiler/agent_profile.h new file mode 100644 index 
0000000000..819f178f66 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/agent_profile.h @@ -0,0 +1,70 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup AGENT_PROFILE_COUNTING_SERVICE Agent Profile Counting Service + * @{ + */ + +/** + * @brief ROCProfiler Agent Profile Counting Data. + * + * Counters, including identifiers to get counter information and Counters values + */ +typedef struct +{ + /** + */ + rocprofiler_record_counter_t* counters; + uint64_t counters_count; +} rocprofiler_agent_profile_counting_data_t; + +/** + * @brief Configure Profile Counting Service for agent. 
+ * + * @param [in] buffer_id + * @param [in] profile_config_id + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_configure_agent_profile_counting_service( + rocprofiler_buffer_id_t buffer_id, + rocprofiler_profile_config_id_t profile_config_id); + +/** + * @brief Sample Profile Counting Service for agent. + * + * @param [out] data // It is always a size of one + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_sample_agent_profile_counting_service(rocprofiler_agent_profile_counting_data_t* data); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/buffer.h b/projects/rocprofiler-sdk/source/include/rocprofiler/buffer.h new file mode 100644 index 0000000000..cd7f1a1af1 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/buffer.h @@ -0,0 +1,106 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup BUFFER_HANDLING Buffer + * @{ + * + * Every Buffer is associated with a specific service kind. + * OR + * Every Buffer is associated with a specific service ID. + * + */ + +/** + * @brief Async callback function. + * + * @code{.cpp} + * for(size_t i = 0; i < num_headers; ++i) + * { + * rocprofiler_record_header_t* hdr = headers[i]; + * if(hdr->kind == ROCPROFILER_RECORD_KIND_PC_SAMPLE) + * { + * auto* data = static_cast(&hdr->payload); + * ... + * } + * } + * @endcode + */ +typedef void (*rocprofiler_buffer_tracing_cb_t)(rocprofiler_context_id_t context, + rocprofiler_buffer_id_t buffer_id, + rocprofiler_record_header_t** headers, + size_t num_headers, + void* data, + uint64_t drop_count); + +/** + * @brief Create buffer. + * + * @param [in] context Context identifier associated with buffer + * @param [in] size Size of the buffer in bytes + * @param [in] watermark - watermark size, where the callback is called, if set + * to 0 then the callback will be called on every record + * @param [in] policy Behavior policy when buffer is full + * @param [in] callback Callback to invoke when buffer is flushed/full + * @param [in] callback_data Data to provide in callback function + * @param [out] buffer_id Identification handle for buffer + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_create_buffer(rocprofiler_context_id_t context, + size_t size, + size_t watermark, + rocprofiler_buffer_policy_t policy, + rocprofiler_buffer_tracing_cb_t callback, + void* callback_data, + rocprofiler_buffer_id_t* buffer_id) ROCPROFILER_NONNULL(5, 7); + +/** + * @brief Destroy buffer. 
+ * + * @param [in] buffer_id + * @return ::rocprofiler_status_t + * + * Note: This will destroy the buffer even if it is not empty. The user can + * call @ref ::rocprofiler_flush_buffer before it to make sure the buffer is empty. + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id); + +/** + * @brief Flush buffer. + * + * @param [in] buffer_id + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/buffer_tracing.h b/projects/rocprofiler-sdk/source/include/rocprofiler/buffer_tracing.h new file mode 100644 index 0000000000..275659e3ed --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/buffer_tracing.h @@ -0,0 +1,278 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup BUFFER_TRACING_SERVICE Asynchronous Tracing Service + * + * Receive callbacks for batches of records from an internal (background) thread + * + * @{ + */ + +/** + * @brief ROCProfiler Buffer HSA API Tracer Record. + */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_tracing_operation_t operation; // rocprofiler/hsa.h + rocprofiler_timestamp_t start_timestamp; + rocprofiler_timestamp_t end_timestamp; + rocprofiler_thread_id_t thread_id; +} rocprofiler_buffer_tracing_hsa_api_record_t; + +/** + * @brief ROCProfiler Buffer HIP API Tracer Record. + */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_tracing_operation_t operation; // rocprofiler/hip.h + rocprofiler_timestamp_t start_timestamp; + rocprofiler_timestamp_t end_timestamp; + rocprofiler_thread_id_t thread_id; +} rocprofiler_buffer_tracing_hip_api_record_t; + +/** + * @brief ROCProfiler Buffer Marker Tracer Record. + */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_tracing_operation_t operation; // rocprofiler/marker.h + rocprofiler_timestamp_t timestamp; + rocprofiler_thread_id_t thread_id; + uint64_t marker_id; // rocprofiler_marker_id_t + // const char* message; // (Need Review?) +} rocprofiler_buffer_tracing_marker_record_t; + +/** + * @brief ROCProfiler Buffer Memory Copy Tracer Record. 
+ */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + /** + * Memory copy operation that can be derived from + * ::rocprofiler_tracing_operation_t + */ + uint32_t operation; + rocprofiler_timestamp_t start_timestamp; + rocprofiler_timestamp_t end_timestamp; + rocprofiler_queue_id_t queue_id; +} rocprofiler_buffer_tracing_memory_copy_record_t; + +/** + * @brief ROCProfiler Buffer Kernel Dispatch Tracer Record. + */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_timestamp_t start_timestamp; + rocprofiler_timestamp_t end_timestamp; + rocprofiler_queue_id_t queue_id; + const char* kernel_name; +} rocprofiler_buffer_tracing_kernel_dispatch_record_t; + +/** + * @brief ROCProfiler Buffer Page Migration Tracer Record. + */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_timestamp_t start_timestamp; + rocprofiler_timestamp_t end_timestamp; + rocprofiler_queue_id_t queue_id; + // Not Sure What is the info needed here? +} rocprofiler_buffer_tracing_page_migration_record_t; + +/** + * @brief ROCProfiler Buffer Scratch Memory Tracer Record. + */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_timestamp_t start_timestamp; + rocprofiler_timestamp_t end_timestamp; + rocprofiler_queue_id_t queue_id; + // Not Sure What is the info needed here? +} rocprofiler_buffer_tracing_scratch_memory_record_t; + +/** + * @brief ROCProfiler Buffer Queue Scheduling Tracer Record. + */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_timestamp_t start_timestamp; + rocprofiler_timestamp_t end_timestamp; + rocprofiler_queue_id_t queue_id; + // Not Sure What is the info needed here? 
+} rocprofiler_buffer_tracing_queue_scheduling_record_t; + +/** + * @brief ROCProfiler Code Object Tracer Buffer Record. + * + * We need to guarantee that these records are in the buffer before the + * corresponding Exit Phase API calls are called. + */ +// typedef struct { +// rocprofiler_buffer_tracing_record_header_t header; +// rocprofiler_tracing_code_object_kind_id_t kind; +// } rocprofiler_buffer_tracing_code_object_header_t; + +/** + * @brief ROCProfiler Code Object Load Tracer Buffer Record. + * + */ +// typedef struct { +// rocprofiler_buffer_tracing_code_object_header_t header; +// uint64_t load_base; // code object load base +// uint64_t load_size; // code object load size +// const char *uri; // URI string (NULL terminated) +// rocprofiler_timestamp_t timestamp; +// // uint32_t storage_type; // code object storage type (Need Review?) +// // int storage_file; // origin file descriptor (Need Review?) +// // uint64_t memory_base; // origin memory base (Need Review?) +// // uint64_t memory_size; // origin memory size (Need Review?) +// // uint64_t load_delta; // code object load delta (Need Review?) +// } rocprofiler_buffer_tracing_code_object_load_record_t; + +/** + * @brief ROCProfiler Code Object UnLoad Tracer Buffer Record. + * + */ +// typedef struct { +// rocprofiler_buffer_tracing_code_object_header_t header; +// uint64_t load_base; // code object load base +// rocprofiler_timestamp_t timestamp; +// } rocprofiler_buffer_tracing_code_object_unload_record_t; + +/** + * @brief ROCProfiler Code Object Kernel Symbol Tracer Buffer Record. + * + */ +// typedef struct { +// rocprofiler_buffer_tracing_code_object_header_t header; +// const char *kernel_name; // kernel name string (NULL terminated) +// uint64_t kernel_descriptor; // kernel descriptor (Need to be changed from +// // uint64_t to ::rocprofiler_address_t) +// // rocprofiler_timestamp_t timestamp; // (Need Review?) 
+// } rocprofiler_buffer_tracing_code_object_kernel_symbol_record_t; + +/** + * @brief ROCProfiler Buffer External Correlation Tracer Record. + */ +typedef struct +{ + rocprofiler_service_buffer_tracing_kind_t kind; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_external_correlation_id_t external_correlation_id; +} rocprofiler_buffer_tracing_external_correlation_record_t; + +/** + * @brief Callback function for mapping @ref rocprofiler_service_buffer_tracing_kind_t ids to + * string names. @see rocprofiler_iterate_buffer_tracing_kind_names. + */ +typedef int (*rocprofiler_buffer_tracing_kind_name_cb_t)( + rocprofiler_service_buffer_tracing_kind_t kind, + const char* kind_name, + void* data); + +/** + * @brief Callback function for mapping the operations of a given @ref + * rocprofiler_service_buffer_tracing_kind_t to string names. @see + * rocprofiler_iterate_buffer_tracing_kind_operation_names. + */ +typedef int (*rocprofiler_buffer_tracing_operation_name_cb_t)( + rocprofiler_service_buffer_tracing_kind_t kind, + uint32_t operation, + const char* operation_name, + void* data); + +/** + * @brief Configure Buffer Tracing Service. + * + * @param [in] context_id + * @param [in] kind + * @param [in] operations + * @param [in] operations_count + * @param [in] buffer_id + * @return ::rocprofiler_status_t + * + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_configure_buffer_tracing_service(rocprofiler_context_id_t context_id, + rocprofiler_service_buffer_tracing_kind_t kind, + rocprofiler_tracing_operation_t* operations, + size_t operations_count, + rocprofiler_buffer_id_t buffer_id); + +/** + * @brief Iterate over all the mappings of the buffer tracing kinds and get a callback with the id + * mapped to a constant string. The strings provided in the arg will be valid pointers for the + * entire duration of the program. It is recommended to call this function once and cache this data + * in the client instead of making multiple on-demand calls. 
+ * + * @param [in] callback Callback function invoked for each enumeration value in @ref + * rocprofiler_service_buffer_tracing_kind_t with the exception of the `NONE` and `LAST` values. + * @param [in] data User data passed back into the callback + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_iterate_buffer_tracing_kind_names(rocprofiler_buffer_tracing_kind_name_cb_t callback, + void* data) ROCPROFILER_NONNULL(1); + +/** + * @brief Iterates over all the mappings of the operations for a given @ref + * rocprofiler_service_buffer_tracing_kind_t and invokes the callback with the kind, operation id, + * and the string mapping to the operation id. The strings provided in the callback arg will be + * valid pointers for the entire duration of the program. It is recommended to call this function + * once per kind, and cache this data in the client instead of making multiple on-demand calls. + * + * @param [in] kind which buffer tracing kind operations to iterate over + * @param [in] callback Callback function invoked for each operation associated with @ref + * rocprofiler_service_buffer_tracing_kind_t with the exception of the `NONE` and `LAST` values. 
+ * @param [in] data User data passed back into the callback + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_iterate_buffer_tracing_kind_operation_names( + rocprofiler_service_buffer_tracing_kind_t kind, + rocprofiler_buffer_tracing_operation_name_cb_t callback, + void* data) ROCPROFILER_NONNULL(2); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/callback_tracing.h b/projects/rocprofiler-sdk/source/include/rocprofiler/callback_tracing.h new file mode 100644 index 0000000000..75b2fa16cc --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/callback_tracing.h @@ -0,0 +1,252 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +#pragma once + +#include +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup CALLBACK_TRACING_SERVICE Synchronous Tracing Services + * + * Receive immediate callbacks on the calling thread + * + * @{ + */ + +/** + * @brief ROCProfiler HSA API Callback Data. + */ +typedef struct +{ + size_t size; ///< provides the size of this struct + rocprofiler_hsa_api_args_t args; + rocprofiler_hsa_api_retval_t retval; +} rocprofiler_hsa_api_callback_tracer_data_t; + +/** + * @brief ROCProfiler HIP API Callback Data. + * + * Depending on the operation kind, the data can be cast to the corresponding + * structure. + * + */ +typedef void* rocprofiler_hip_api_callback_api_data_t; + +/** + * @brief ROCProfiler HIP API Tracer Callback Data. + */ +typedef struct +{ + size_t size; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_address_t host_kernel_address; + rocprofiler_hip_api_callback_api_data_t data; // Arguments or api_data? +} rocprofiler_hip_api_callback_tracer_data_t; + +/** + * @brief ROCProfiler Marker Callback Data. + * + * Depending on the operation kind, the data can be cast to the corresponding + * structure. + * + */ +typedef void* rocprofiler_marker_callback_api_data_t; + +/** + * @brief ROCProfiler Marker Tracer Callback Data. + */ +typedef struct +{ + size_t size; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_marker_callback_api_data_t data; // Arguments or api_data? +} rocprofiler_marker_callback_tracer_data_t; + +/** + * @brief ROCProfiler Code Object Load Tracer Callback Record. + */ +typedef struct +{ + uint64_t load_base; // code object load base + uint64_t load_size; // code object load size + const char* uri; // URI string (NULL terminated) + // uint32_t storage_type; // code object storage type (Need Review?) + // int storage_file; // origin file descriptor (Need Review?) + // uint64_t memory_base; // origin memory base (Need Review?) + // uint64_t memory_size; // origin memory size (Need Review?) 
+ // uint64_t load_delta; // code object load delta (Need Review?) +} rocprofiler_callback_tracer_code_object_load_data_t; + +/** + * @brief ROCProfiler Code Object UnLoad Tracer Callback Record. + * + */ +typedef struct +{ + uint64_t load_base; // code object load base +} rocprofiler_callback_tracer_code_object_unload_data_t; + +/** + * @brief ROCProfiler Code Object Device Kernel Symbol Tracer Callback Record. + * + */ +typedef struct +{ + const char* kernel_name; // kernel name string (NULL terminated) + rocprofiler_address_t kernel_descriptor; // kernel descriptor +} rocprofiler_callback_tracer_code_object_device_kernel_symbol_data_t; + +/** + * @brief ROCProfiler Code Object Register Host Kernel Symbol Tracer Callback + * Record. + * + */ +typedef struct +{ + rocprofiler_address_t host_address; // host address + // Should this be nullptr if it is unregister? + const char* kernel_name; // kernel name string (NULL terminated) + rocprofiler_address_t kernel_descriptor; // kernel descriptor +} rocprofiler_callback_tracer_code_object_register_host_kernel_symbol_data_t; + +/** + * @brief API Tracing callback function. + */ +typedef void (*rocprofiler_callback_tracing_cb_t)(rocprofiler_callback_tracing_record_t record, + void* user_data); + +/** + * @brief Callback function for mapping @ref rocprofiler_service_callback_tracing_kind_t ids to + * string names. @see rocprofiler_iterate_callback_tracing_kind_names. + */ +typedef int (*rocprofiler_callback_tracing_kind_name_cb_t)( + rocprofiler_service_callback_tracing_kind_t kind, + const char* kind_name, + void* data); + +/** + * @brief Callback function for mapping the operations of a given @ref + * rocprofiler_service_callback_tracing_kind_t to string names. @see + * rocprofiler_iterate_callback_tracing_kind_operation_names. 
+ */ +typedef int (*rocprofiler_callback_tracing_operation_name_cb_t)( + rocprofiler_service_callback_tracing_kind_t kind, + uint32_t operation, + const char* operation_name, + void* data); + +/** + * @brief Callback function for iterating over the function arguments to a traced function. + * This function will be invoked for each argument. + * @see rocprofiler_iterate_callback_tracing_operation_args + * + * @param kind [in] domain + * @param operation [in] associated domain operation + * @param arg_number [in] the argument number, starting at zero + * @param arg_name [in] the name of the argument in the prototype (or rocprofiler union) + * @param arg_value_str [in] conversion of the argument to a string, e.g. operator<< overload + * @param arg_value_addr [in] the address of the argument stored by rocprofiler. + * @param data [in] user data + */ +typedef int (*rocprofiler_callback_tracing_operation_args_cb_t)( + rocprofiler_service_callback_tracing_kind_t kind, + uint32_t operation, + uint32_t arg_number, + const char* arg_name, + const char* arg_value_str, + const void* const arg_value_addr, + void* data); + +/** + * @brief Configure Callback Tracing Service. + * + * @param [in] context_id + * @param [in] kind + * @param [in] operations + * @param [in] operations_count + * @param [in] callback + * @param [in] callback_args + * @return ::rocprofiler_status_t + * + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_configure_callback_tracing_service(rocprofiler_context_id_t context_id, + rocprofiler_service_callback_tracing_kind_t kind, + rocprofiler_tracing_operation_t* operations, + size_t operations_count, + rocprofiler_callback_tracing_cb_t callback, + void* callback_args); + +/** + * @brief Iterate over all the mappings of the callback tracing kinds and get a callback with the id + * mapped to a constant string. The strings provided in the arg will be valid pointers for the + * entire duration of the program. 
It is recommended to call this function once and cache this data + * in the client instead of making multiple on-demand calls. + * + * @param [in] callback Callback function invoked for each enumeration value in @ref + * rocprofiler_service_callback_tracing_kind_t with the exception of the `NONE` and `LAST` values. + * @param [in] data User data passed back into the callback + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_iterate_callback_tracing_kind_names( + rocprofiler_callback_tracing_kind_name_cb_t callback, + void* data) ROCPROFILER_NONNULL(1); + +/** + * @brief Iterates over all the mappings of the operations for a given @ref + * rocprofiler_service_callback_tracing_kind_t and invokes the callback with the kind, operation id, + * and the string mapping to the operation id. The strings provided in the callback arg will be + * valid pointers for the entire duration of the program. It is recommended to call this function + * once per kind, and cache this data in the client instead of making multiple on-demand calls. + * + * @param [in] kind which tracing callback kind operations to iterate over + * @param [in] callback Callback function invoked for each operation associated with @ref + * rocprofiler_service_callback_tracing_kind_t with the exception of the `NONE` and `LAST` values. + * @param [in] data User data passed back into the callback + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_iterate_callback_tracing_kind_operation_names( + rocprofiler_service_callback_tracing_kind_t kind, + rocprofiler_callback_tracing_operation_name_cb_t callback, + void* data) ROCPROFILER_NONNULL(2); + +/** + * @brief Iterates over all the arguments for the traced function (when available). This is + * particularly useful when tools want to annotate traces with the function arguments. See + * @example samples/api_callback_tracing/client.cpp for a usage example. 
+ * + * @param[in] record Record provided by service callback + * @param[in] callback The callback function which will be invoked for each argument + * @param[in] user_data Data to be passed to each invocation of the callback + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_iterate_callback_tracing_operation_args( + rocprofiler_callback_tracing_record_t record, + rocprofiler_callback_tracing_operation_args_cb_t callback, + void* user_data) ROCPROFILER_NONNULL(2); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/config.h b/projects/rocprofiler-sdk/source/include/rocprofiler/config.h deleted file mode 100644 index 4fee724195..0000000000 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/config.h +++ /dev/null @@ -1,210 +0,0 @@ -// MIT License -// -// Copyright (c) 2023 ROCm Developer Tools -// -// Permission is hereby granted, free of charge, to any person obtaining a copy -// of this software and associated documentation files (the "Software"), to deal -// in the Software without restriction, including without limitation the rights -// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -// copies of the Software, and to permit persons to whom the Software is -// furnished to do so, subject to the following conditions: -// -// The above copyright notice and this permission notice shall be included in all -// copies or substantial portions of the Software. -// -// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -// SOFTWARE. 
- -#pragma once - -#include - -#ifdef __cplusplus -extern "C" { -#endif - -#define ROCPROFILER_API_VERSION_ID 1 -#define ROCPROFILER_DOMAIN_OPS_MAX 512 -#define ROCPROFILER_DOMAIN_OPS_RESERVED \ - ((ROCPROFILER_DOMAIN_OPS_MAX * ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST / 8)) - -typedef uint64_t (*rocprofiler_external_cid_cb_t)(rocprofiler_tracer_activity_domain_t, - uint32_t, - uint64_t); -typedef int (*rocprofiler_filter_name_t)(const char*); -typedef int (*rocprofiler_filter_op_id_t)(uint32_t); -typedef int (*rocprofiler_filter_range_t)(uint32_t, uint32_t); -typedef int (*rocprofiler_filter_dispatch_id_t)(uint64_t); - -/// permits tools opportunity to modify the correlation id based on the domain, op, and -/// the rocprofiler generated correlation id -struct rocprofiler_correlation_config -{ - rocprofiler_external_cid_cb_t external_id_callback; -}; - -/// how the tools specify the tracing domain and (optionally) which operations in the -/// domain they want to trace -struct rocprofiler_domain_config -{ - rocprofiler_tracer_callback_t callback; - char reserved0[sizeof(uint64_t)]; - char reserved1[ROCPROFILER_DOMAIN_OPS_RESERVED]; -}; - -/// for buffered callbacks, the tool provides a callback to create a buffer and the size -struct rocprofiler_buffer_config -{ - rocprofiler_buffer_callback_t callback; - uint64_t buffer_size; - // void* reserved0; - char reserved1[sizeof(uint64_t)]; -}; - -/// filters are available to make quick decisions about whether rocprofiler should -/// assemble the data necessary for a callback. 
This is more for convenience and -/// performance -- anything decisions here could be made in the callback but rocprofiler -/// has to first assemble all the infomation on the callback before it (eventually) gets -/// discarded because the tool has decided it (after configuration), that it no longer -/// wants info meeting certain requirements -struct rocprofiler_filter_config -{ - // filter callbacks - rocprofiler_filter_name_t name; - rocprofiler_filter_op_id_t hip_function_id; - rocprofiler_filter_op_id_t hsa_function_id; - rocprofiler_filter_range_t range; - rocprofiler_filter_dispatch_id_t dispatch_id; - - // reserved padding - char padding[24 * sizeof(void*)]; -}; - -/// this is the "single source of truth" for the capabilities of rocprofiler. -/// you can one configuration that activates all the capabilities you want -/// and holistically start/stop the sum of those features. Alternatively, -/// you can have multiple configurations in order to activate certain features -/// modularly. -/// -/// The general workflow is: -/// -/// 1. invoke rocprofiler_allocate_config(...) -/// - rocprofiler allocates any space internally needed for the config -/// - rocprofiler sets a few initial values: -/// - "size" to the size of the config structure used internally -/// - "api_version" to the version id of the API in the rocprofiler library that -/// is being used. -/// - these two values can be used by the tool to identify any potential -/// incompatibilities that the tool might want to know about -/// - rocprofiler checks whether it is too late to configure the tool, e.g. -/// something went wrong and rocprofiler was not able to set itself up as -/// the intercepter -/// 2. 
tool sets up the configuration struct and sets the "size" variable to the size of -/// their configuration struct and sets the "compat_version" field to the -/// ROCPROFILER_API_VERSION_ID defined by the rocprofiler headers when the tool was -/// built -/// - in other words, the user can communicate to rocprofiler, don't read -/// past this distance in my configuration struct and I built against X version -/// so assume the default behavior and capabilties of version X. -/// 3. tool passes this struct to rocprofiler_validate_config(...) -/// - this step checks the config in isolation and will communicate any potential -/// warnings/issues with that configuration, e.g. rocprofiler_X_config is needed, -/// to HW counters XYZ are not available, etc. The tool then has an opportunity -/// to address these issues however they see fit. -/// 4. tool passes this struct to rocprofiler_start_config(...) -/// - internally, we make a call to rocprofiler_validate_config(...) and if any -/// issues still exist with the config in isolation, rocprofiler tells the app -/// to abort -- mechanisms were provided to prevent aborting prior to this call, -/// aborting the app at this point is to guard against rocprofiler "silently" -/// not working because error codes were ignored -/// - rocprofiler then checks whether this config can actually be activated -/// alongside any other active configuration, e.g. this config wants 4 HW counters -/// and another wants 4 HW counters but we can only activate 6 out of 8 of -/// them in this run. Any issues here will not abort execution but, instead, -/// the features of this configuration will not happen (i.e. config won't be -/// activated) and the issues will be communicated with error codes -- giving -/// the tool the opportunity to address the conflicts (i.e. only request tracing -/// and no HW counters) before attempting to activate the modified config. 
-/// - once rocprofiler determines all features of a config can be activated, it -/// makes an internal copy of the config and returns an identifier for that -/// configuration. The tool is then free to delete the config and any modification -/// to the config will NOT be reflected in the behavior of rocprofiler. -/// -/// -struct rocprofiler_config -{ - // size is used to ensure that we never read past the end of the version - size_t size; // = sizeof(rocprofiler_config) - uint32_t compat_version; // set by user - uint32_t api_version; // set by rocprofiler - uint64_t reserved0; // internal field - void* user_data; // data passed to callbacks - struct rocprofiler_correlation_config* correlation_id; // = &my_cid_config (optional) - struct rocprofiler_buffer_config* buffer; // = &my_buffer_config (required) - struct rocprofiler_domain_config* domain; // = &my_domain_config (required) - struct rocprofiler_filter_config* filter; // = &my_filter_config (optional) -}; - -/// \brief returns a properly initialized config struct and allocates any data structures -/// necessary for the config to be used -/// -/// \param [out] cfg may adjust config or assign values within structs. -rocprofiler_status_t -rocprofiler_allocate_config(struct rocprofiler_config* cfg); - -/// \brief rocprofiler validates config, checks for conflicts, etc. Ensures that -/// the configuration is valid *in isolation*, e.g. it may check that the user -/// set the compat_version field and that required config fields, such as buffer -/// are set. This function will be called before \ref rocprofiler_start_config -/// but is provided to help the user validate one or more configs without starting -/// them -/// -/// \param [in] cfg configuration to validate -rocprofiler_status_t -rocprofiler_validate_config(const struct rocprofiler_config* cfg); - -/// \brief rocprofiler activates configuration and provides a context identifier -/// \param [in] cfg may adjust config or assign values within structs. 
If error -/// occurs, could nullptr valid sub-configs and leave the pointers to -/// invalid configs -/// \param [out] id the context identifier for this config. -rocprofiler_status_t -rocprofiler_start_config(struct rocprofiler_config*, rocprofiler_context_id_t* id); - -/// \brief disable the configuration. -rocprofiler_status_t rocprofiler_stop_config(rocprofiler_context_id_t); - -/// -/// -/// the following 4 functions may be changed to permit removing domain/ops and/or -/// identifying domains and operations via strings -/// -/// -rocprofiler_status_t -rocprofiler_domain_set_domain(struct rocprofiler_domain_config*, - rocprofiler_tracer_activity_domain_t); - -rocprofiler_status_t -rocprofiler_domain_add_domains(struct rocprofiler_domain_config*, - rocprofiler_tracer_activity_domain_t*, - size_t); - -rocprofiler_status_t -rocprofiler_domain_add_op(struct rocprofiler_domain_config*, - rocprofiler_tracer_activity_domain_t, - uint32_t); - -rocprofiler_status_t -rocprofiler_domain_add_ops(struct rocprofiler_domain_config*, - rocprofiler_tracer_activity_domain_t, - uint32_t*, - size_t); - -#ifdef __cplusplus -} -#endif diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/context.h b/projects/rocprofiler-sdk/source/include/rocprofiler/context.h new file mode 100644 index 0000000000..34199ec455 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/context.h @@ -0,0 +1,91 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this 
permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** + * @defgroup CONTEXT_OPERATIONS Context + * @{ + */ + +/** + * The NULL Context handle. + */ +#define ROCPROFILER_CONTEXT_NONE ROCPROFILER_HANDLE_LITERAL(rocprofiler_context_id_t, UINT64_MAX) + +/** + * @brief Create context. + * + * @param context_id [out] Context identifier + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_create_context(rocprofiler_context_id_t* context_id) ROCPROFILER_NONNULL(1); + +/** + * @brief Start context. + * + * @param [in] context_id + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_start_context(rocprofiler_context_id_t context_id); + +/** + * @brief Stop context. + * + * @param [in] context_id + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_stop_context(rocprofiler_context_id_t context_id); + +/** + * @brief Query whether context is active. 
+ * + * @param [in] context_id + * @param [out] status If context is active, this will be a nonzero value + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_context_is_active(rocprofiler_context_id_t context_id, int* status) + ROCPROFILER_NONNULL(2); + +/** + * @brief Query whether the context is valid. + * + * @param [in] context_id + * @param [out] status If context is valid, this will be a nonzero value + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_context_is_valid(rocprofiler_context_id_t context_id, int* status) + ROCPROFILER_NONNULL(2); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/counters.h b/projects/rocprofiler-sdk/source/include/rocprofiler/counters.h new file mode 100644 index 0000000000..cf762c4c64 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/counters.h @@ -0,0 +1,73 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup COUNTERS Hardware counters + * @{ + */ + +/** + * @brief Query Counter name. + * + * @param [in] counter_id + * @param [out] name if nullptr, size will be returned + * @param [out] size + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_query_counter_name(rocprofiler_counter_id_t counter_id, const char* name, size_t* size) + ROCPROFILER_NONNULL(3); + +/** + * @brief Query Counter Instances Count. + * + * @param [in] counter_id + * @param [out] instance_count + * @return rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_query_counter_instance_count(rocprofiler_counter_id_t counter_id, + size_t* instance_count) ROCPROFILER_NONNULL(2); + +/** + * @brief Query Agent Counters Availability. + * + * @param [in] agent + * @param [out] counters_list + * @param [out] counters_count + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_query_agent_supported_counters(rocprofiler_agent_t agent, + rocprofiler_counter_id_t* counters_list, + size_t* counters_count) ROCPROFILER_NONNULL(2, 3); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/defines.h b/projects/rocprofiler-sdk/source/include/rocprofiler/defines.h index 41da65489a..bed0d10e97 100644 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/defines.h +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/defines.h @@ -22,6 +22,29 @@ #pragma once +/** @defgroup SYMBOL_VERSIONING_GROUP Symbol Versions + * + * The names used for the shared library versioned symbols. 
+ * + * Every function is annotated with one of the version macros defined in this + * section. Each macro specifies a corresponding symbol version string. After + * dynamically loading the shared library with @p dlopen, the address of each + * function can be obtained using @p dlvsym with the name of the function and + * its corresponding symbol version string. An error will be reported by @p + * dlvsym if the installed library does not support the version for the + * function specified in this version of the interface. + * + * @{ + */ + +/** + * @brief The function was introduced in version 10.0 of the interface and has the + * symbol version string of ``"ROCPROFILER_10.0"``. + */ +#define ROCPROFILER_VERSION_10_0 + +/** @} */ + #if !defined(ROCPROFILER_ATTRIBUTE) # if defined(_MSC_VER) # define ROCPROFILER_ATTRIBUTE(...) __declspec(__VA_ARGS__) @@ -95,3 +118,11 @@ value \ } #endif + +#ifdef __cplusplus +# define ROCPROFILER_EXTERN_C_INIT extern "C" { +# define ROCPROFILER_EXTERN_C_FINI } +#else +# define ROCPROFILER_EXTERN_C_INIT +# define ROCPROFILER_EXTERN_C_FINI +#endif diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/dispatch_profile.h b/projects/rocprofiler-sdk/source/include/rocprofiler/dispatch_profile.h new file mode 100644 index 0000000000..ddeaea755a --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/dispatch_profile.h @@ -0,0 +1,97 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall 
be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include +#include +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup DISPATCH_PROFILE_COUNTING_SERVICE Dispatch Profile Counting + * Service + * @{ + */ + +/** + * @brief ROCProfiler Profile Counting Data. + * + */ +typedef struct +{ + rocprofiler_timestamp_t start_timestamp; + rocprofiler_timestamp_t end_timestamp; + /** + * Counters, including identifiers to get counter information and Counters + * values + * + * Should it be a record per counter? + */ + rocprofiler_record_counter_t* counters; + uint64_t counters_count; + rocprofiler_correlation_id_t correlation_id; +} rocprofiler_dispatch_profile_counting_record_t; + +/** + * @brief Kernel Dispatch Callback + * + * @param [out] queue_id + * @param [out] agent_id + * @param [out] correlation_id + * @param [out] dispatch_packet It can be used to get the kernel descriptor and then using + * code_object tracing, we can get the kernel name. `dispatch_packet->reserved2` is the + * correlation_id used to correlate the dispatch packet with the corresponding API call. 
+ * @param [out] callback_data_args + * @param [in] config + */ +typedef void (*rocprofiler_profile_counting_dispatch_callback_t)( + rocprofiler_queue_id_t queue_id, + rocprofiler_agent_t agent_id, + rocprofiler_correlation_id_t correlation_id, + const hsa_kernel_dispatch_packet_t* dispatch_packet, + void* callback_data_args, + rocprofiler_profile_config_id_t* config); + +/** + * @brief Configure Dispatch Profile Counting Service. + * + * @param [in] context_id + * @param [in] agent_id + * @param [in] buffer_id + * @param [in] callback + * @param [in] callback_data_args + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_configure_dispatch_profile_counting_service( + rocprofiler_context_id_t context_id, + rocprofiler_agent_t agent_id, + rocprofiler_buffer_id_t buffer_id, + rocprofiler_profile_counting_dispatch_callback_t callback, + void* callback_data_args); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/external_correlation.h b/projects/rocprofiler-sdk/source/include/rocprofiler/external_correlation.h new file mode 100644 index 0000000000..85461e6611 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/external_correlation.h @@ -0,0 +1,60 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. 
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** + * @defgroup EXTERNAL_CORRELATION External Correlation IDs + * + * User-defined correlation identifiers to supplement rocprofiler generated correlation ids + * + * @{ + */ + +/** @} */ + +/** + * @brief ROCProfiler Push External Correlation ID. + * + * @param external_correlation_id + * @return rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_push_external_correlation_id( + rocprofiler_external_correlation_id_t external_correlation_id); + +/** + * @brief ROCProfiler Pop External Correlation ID. 
+ * + * @param external_correlation_id + * @return rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_pop_external_correlation_id( + rocprofiler_external_correlation_id_t* external_correlation_id); + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/fwd.h b/projects/rocprofiler-sdk/source/include/rocprofiler/fwd.h new file mode 100644 index 0000000000..45965da2ae --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/fwd.h @@ -0,0 +1,457 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +#pragma once + +#include + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +//--------------------------------------------------------------------------------------// +// +// ENUMERATIONS +// +//--------------------------------------------------------------------------------------// + +/** + * @defgroup BASIC_DATA_TYPES Basic data types + * + * Basic data types and typedefs + * @{ + */ + +/** + * @brief Status codes. + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_STATUS_SUCCESS = 0, ///< No error occurred + ROCPROFILER_STATUS_ERROR, ///< Generalized error + ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND, ///< No valid context for given context id + ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND, ///< No valid buffer for given buffer id + ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND, ///< Domain identifier is invalid + ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND, ///< Operation identifier is invalid for domain + ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND, ///< No valid thread for given thread id + ROCPROFILER_STATUS_ERROR_CONTEXT_ERROR, ///< Generalized context error + ROCPROFILER_STATUS_ERROR_CONTEXT_INVALID, ///< Context configuration is not valid + ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_STARTED, ///< Context was not started (maybe already + ///< started or atomic swap into active array + ///< failed) + ROCPROFILER_STATUS_ERROR_BUFFER_BUSY, ///< Buffer operation failed because it is currently busy + ///< handling another request (e.g. flushing) + ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED, ///< Service has already been configured + ///< in context + ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED, ///< Function call is not valid outside of + ///< rocprofiler configuration (i.e. + ///< function called post-initialization) + ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED, ///< Function is not implemented + ROCPROFILER_STATUS_LAST, +} rocprofiler_status_t; + +/** + * @brief Buffer record categories. 
This enumeration type is encoded in @ref + * rocprofiler_record_header_t category field + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_BUFFER_CATEGORY_NONE = 0, + ROCPROFILER_BUFFER_CATEGORY_TRACING, + ROCPROFILER_BUFFER_CATEGORY_PC_SAMPLING, + ROCPROFILER_BUFFER_CATEGORY_LAST, +} rocprofiler_buffer_category_t; + +/** + * @brief Agent type. + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_AGENT_TYPE_NONE = 0, ///< Agent type is unknown + ROCPROFILER_AGENT_TYPE_CPU, ///< Agent type is a CPU + ROCPROFILER_AGENT_TYPE_GPU, ///< Agent type is a GPU + ROCPROFILER_AGENT_TYPE_LAST, +} rocprofiler_agent_type_t; + +/** + * @brief Service Callback Phase. + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_SERVICE_CALLBACK_PHASE_NONE = 0, ///< Callback has no phase + ROCPROFILER_SERVICE_CALLBACK_PHASE_ENTER, ///< Callback invoked prior to function execution + ROCPROFILER_SERVICE_CALLBACK_PHASE_EXIT, ///< Callback invoked after function execution + ROCPROFILER_SERVICE_CALLBACK_PHASE_LAST, +} rocprofiler_service_callback_phase_t; + +/** + * @brief Service Callback Tracing Kind. + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_SERVICE_CALLBACK_TRACING_NONE = 0, + ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API, ///< Callbacks for HSA functions + ROCPROFILER_SERVICE_CALLBACK_TRACING_HIP_API, ///< Callbacks for HIP functions + ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API, ///< Callbacks for ROCTx functions + ROCPROFILER_SERVICE_CALLBACK_TRACING_CODE_OBJECT, ///< Callbacks for code object info + ROCPROFILER_SERVICE_CALLBACK_TRACING_KERNEL_DISPATCH, ///< Callbacks for kernel dispatches + ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST, +} rocprofiler_service_callback_tracing_kind_t; + +/** + * @brief Service Buffer Tracing Kind. 
+ */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_SERVICE_BUFFER_TRACING_NONE = 0, + ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, ///< Buffer HSA function calls + ROCPROFILER_SERVICE_BUFFER_TRACING_HIP_API, ///< Buffer HIP function calls + ROCPROFILER_SERVICE_BUFFER_TRACING_MARKER_API, ///< Buffer ROCTx function calls + ROCPROFILER_SERVICE_BUFFER_TRACING_MEMORY_COPY, ///< Buffer memory copy info + ROCPROFILER_SERVICE_BUFFER_TRACING_KERNEL_DISPATCH, ///< Buffer kernel dispatch info + ROCPROFILER_SERVICE_BUFFER_TRACING_PAGE_MIGRATION, ///< Buffer page migration info + ROCPROFILER_SERVICE_BUFFER_TRACING_SCRATCH_MEMORY, ///< Buffer scratch memory reclamation info + ROCPROFILER_SERVICE_BUFFER_TRACING_EXTERNAL_CORRELATION, ///< Buffer external correlation info + // To determine if this is possible to implement? + // ROCPROFILER_SERVICE_BUFFER_TRACING_QUEUE_SCHEDULING, + ROCPROFILER_SERVICE_BUFFER_TRACING_LAST, +} rocprofiler_service_buffer_tracing_kind_t; + +/** + * @brief ROCProfiler Code Object Tracer Operation. + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_NONE = 0, + ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_LOAD, + ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_UNLOAD, + ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_REGISTER, + ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_UNREGISTER, + // next two are part of hipRegisterFunction API. + // ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_HOST_KERNEL_SYMBOL_REGISTER, + // ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_HOST_KERNEL_SYMBOL_UNREGISTER, + ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_LAST, +} rocprofiler_callback_tracing_code_object_operation_t; + +/** + * @brief Memory Copy Operation. 
+ */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_NONE = 0, + ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_DEVICE_TO_HOST, + ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_HOST_TO_DEVICE, + ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_DEVICE_TO_DEVICE, + ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_LAST, +} rocprofiler_buffer_tracing_memory_copy_operation_t; + +/** + * @brief PC Sampling Method. + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_PC_SAMPLING_METHOD_NONE = 0, + ROCPROFILER_PC_SAMPLING_METHOD_STOCHASTIC, + ROCPROFILER_PC_SAMPLING_METHOD_HOST_TRAP, + ROCPROFILER_PC_SAMPLING_METHOD_LAST, +} rocprofiler_pc_sampling_method_t; + +/** + * @brief PC Sampling Unit. + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_PC_SAMPLING_UNIT_NONE = 0, ///< Sample interval has unspecified units + ROCPROFILER_PC_SAMPLING_UNIT_INSTRUCTIONS, ///< Sample interval is in instructions + ROCPROFILER_PC_SAMPLING_UNIT_CYCLES, ///< Sample interval is in cycles + ROCPROFILER_PC_SAMPLING_UNIT_TIME, ///< Sample interval is in nanoseconds + ROCPROFILER_PC_SAMPLING_UNIT_LAST, +} rocprofiler_pc_sampling_unit_t; + +/** + * @brief Actions when Buffer is full. + */ +typedef enum // NOLINT(performance-enum-size) +{ + ROCPROFILER_BUFFER_POLICY_NONE = 0, ///< No policy has been set + ROCPROFILER_BUFFER_POLICY_DISCARD, ///< Drop records when buffer is full + ROCPROFILER_BUFFER_POLICY_LOSSLESS, ///< Block when buffer is full + ROCPROFILER_BUFFER_POLICY_LAST, +} rocprofiler_buffer_policy_t; + +//--------------------------------------------------------------------------------------// +// +// ALIASES +// +//--------------------------------------------------------------------------------------// + +/** + * @brief ROCProfiler Timestamp. + */ +typedef uint64_t rocprofiler_timestamp_t; + +/** + * @brief ROCProfiler Address. + */ +typedef uint64_t rocprofiler_address_t; + +/** + * @brief Thread ID. 
Value will be equivalent to `syscall(__NR_gettid)` + */ +typedef uint64_t rocprofiler_thread_id_t; + +/** + * @brief Tracing Operation ID. Depending on the kind, operations can be determined. + * If the value is equal to zero that means all operations will be considered + * for tracing. + */ +typedef uint32_t rocprofiler_tracing_operation_t; + +/** + * @brief Needs non-typedef specification? + */ +typedef uint32_t rocprofiler_counter_instance_id_t; + +// forward declaration of struct +typedef struct rocprofiler_pc_sampling_configuration_s rocprofiler_pc_sampling_configuration_t; + +//--------------------------------------------------------------------------------------// +// +// UNIONS +// +//--------------------------------------------------------------------------------------// + +/** + * @brief User-assignable data type + * + */ +typedef union rocprofiler_user_data_t +{ + uint64_t value; + void* ptr; +} rocprofiler_user_data_t; + +//--------------------------------------------------------------------------------------// +// +// STRUCTS +// +//--------------------------------------------------------------------------------------// + +/** + * @brief Context ID. + */ +typedef struct +{ + uint64_t handle; +} rocprofiler_context_id_t; + +/** + * @brief Queue ID. + */ +typedef struct +{ + uint64_t handle; +} rocprofiler_queue_id_t; + +/** + * @brief ROCProfiler Record Correlation ID. + */ +typedef struct +{ + uint64_t id; +} rocprofiler_correlation_id_t; + +/** + * @brief ROCProfiler External Correlation ID. + */ +typedef struct +{ + uint64_t id; +} rocprofiler_external_correlation_id_t; + +/** + * @brief Buffer ID. + * @addtogroup BUFFER_HANDLING + */ +typedef struct +{ + uint64_t handle; +} rocprofiler_buffer_id_t; + +/** + * @brief Agent Identifier + */ +typedef struct +{ + uint64_t handle; +} rocprofiler_agent_id_t; + +/** + * @brief Counter ID. 
+ */ +typedef struct +{ + uint64_t handle; +} rocprofiler_counter_id_t; + +/** + * @brief Profile Configurations + */ +typedef struct +{ + uint64_t handle; +} rocprofiler_profile_config_id_t; + +/** + * @brief Array of PC Sampling Configurations + */ +typedef struct rocprofiler_pc_sampling_config_array_s +{ + rocprofiler_pc_sampling_configuration_t* data; + size_t size; +} rocprofiler_pc_sampling_config_array_t; + +/** + * @brief Tracing record + * + */ +typedef struct rocprofiler_callback_tracing_record_t +{ + rocprofiler_thread_id_t thread_id; + rocprofiler_correlation_id_t correlation_id; + rocprofiler_external_correlation_id_t external_correlation_id; + rocprofiler_service_callback_tracing_kind_t kind; + uint32_t operation; + rocprofiler_service_callback_phase_t phase; + rocprofiler_user_data_t data; + void* payload; +} rocprofiler_callback_tracing_record_t; + +/** + * @brief Generic record with type identifier(s) and a pointer to data. This data type is used with + * buffered data. + * + * @code{.cpp} + * void + * tool_tracing_callback(rocprofiler_record_header_t** headers, + * size_t num_headers) + * { + * for(size_t i = 0; i < num_headers; ++i) + * { + * rocprofiler_record_header_t* header = headers[i]; + * + * if(header->category == ROCPROFILER_BUFFER_CATEGORY_TRACING && + * header->kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API) + * { + * // cast to rocprofiler_buffer_tracing_hsa_api_record_t which + * // is the type associated with this category + kind + * auto* record = + * static_cast<rocprofiler_buffer_tracing_hsa_api_record_t*>(header->payload); + * + * // trivial test + * assert(record->start_timestamp <= record->end_timestamp); + * } + * } + * } + * + * @endcode + */ +typedef struct +{ + union + { + struct + { + uint32_t category; ///< rocprofiler_buffer_category_t + uint32_t kind; ///< domain + }; + uint64_t hash; ///< generic identifier. You can compute this via: `uint64_t hash = category + ///< | ((uint64_t)(kind) << 32)`, e.g. 
+ }; + void* payload; +} rocprofiler_record_header_t; + +/** + * @brief Function for computing the unsigned 64-bit hash value in @ref rocprofiler_record_header_t + * from a category and kind (two unsigned 32-bit values) + * + * @param category [in] a value from @ref rocprofiler_buffer_category_t + * @param kind [in] depending on the category, this is the domain value, e.g., @ref + * rocprofiler_service_buffer_tracing_kind_t value + * @return uint64_t hash value of category and kind + */ +static inline uint64_t +rocprofiler_record_header_compute_hash(uint32_t category, uint32_t kind) +{ + uint64_t value = category; + value |= ((uint64_t)(kind)) << 32; + return value; +} + +/** + * @brief ROCProfiler Profile Counting Counter per instance. + */ +typedef struct +{ + rocprofiler_counter_id_t counter_id; + rocprofiler_counter_instance_id_t instance_id; + double counter_value; +} rocprofiler_record_counter_t; + +/** + * @brief ROCProfiler PC Sampling Record. + * + */ +typedef struct +{ + uint64_t pc; + uint64_t dispatch_id; + uint64_t timestamp; + uint64_t hardware_id; + union + { + uint8_t arb_value; + }; + union + { + void* data; + }; +} rocprofiler_pc_sampling_record_t; + +/** + * @brief ROCProfiler SPM Record. 
+ * + */ +typedef struct +{ + /** + * Counters, including identifiers to get counter information and Counters + * values + */ + rocprofiler_record_counter_t* counters; + uint64_t counters_count; +} rocprofiler_spm_record_t; + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/hip.h b/projects/rocprofiler-sdk/source/include/rocprofiler/hip.h index fe5d36a131..3a6eff20f4 100644 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/hip.h +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/hip.h @@ -27,7 +27,6 @@ #include -typedef uint32_t rocprofiler_trace_record_hip_operation_kind_t; typedef struct rocprofiler_hip_trace_data_s rocprofiler_hip_trace_data_t; typedef struct rocprofiler_hip_api_data_s rocprofiler_hip_api_data_t; diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/hsa.h b/projects/rocprofiler-sdk/source/include/rocprofiler/hsa.h index d19cac16cc..9ebed17b45 100644 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/hsa.h +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/hsa.h @@ -30,33 +30,14 @@ #include -typedef uint32_t rocprofiler_trace_record_hsa_operation_kind_t; typedef struct hsa_kernel_dispatch_packet_s hsa_kernel_dispatch_packet_t; -typedef struct rocprofiler_hsa_trace_data_s rocprofiler_hsa_trace_data_t; typedef struct rocprofiler_hsa_api_data_s rocprofiler_hsa_api_data_t; struct rocprofiler_hsa_api_data_s { - uint64_t correlation_id; - uint32_t phase; - union - { - uint64_t uint64_t_retval; - uint32_t uint32_t_retval; - hsa_signal_value_t hsa_signal_value_t_retval; - hsa_status_t hsa_status_t_retval; - }; - rocprofiler_hsa_api_args_t args; - uint64_t* phase_data; -}; - -struct rocprofiler_hsa_trace_data_s -{ - rocprofiler_hsa_api_data_t api_data; - uint64_t phase_enter_timestamp; - uint64_t phase_exit_timestamp; - uint64_t phase_data; - - void (*phase_enter)(rocprofiler_hsa_api_id_t operation_id, rocprofiler_hsa_trace_data_t* data); - void 
(*phase_exit)(rocprofiler_hsa_api_id_t operation_id, rocprofiler_hsa_trace_data_t* data); + uint64_t correlation_id; + uint32_t phase; + rocprofiler_hsa_api_args_t args; + rocprofiler_hsa_api_retval_t retval; + uint64_t* phase_data; }; diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/hsa/api_args.h b/projects/rocprofiler-sdk/source/include/rocprofiler/hsa/api_args.h index 13133ba059..3010ccf9fa 100644 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/hsa/api_args.h +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/hsa/api_args.h @@ -26,6 +26,14 @@ #include #include +typedef union rocprofiler_hsa_api_retval_u +{ + uint64_t uint64_t_retval; + uint32_t uint32_t_retval; + hsa_signal_value_t hsa_signal_value_t_retval; + hsa_status_t hsa_status_t_retval; +} rocprofiler_hsa_api_retval_t; + typedef union rocprofiler_hsa_api_args_u { // block: CoreApi API diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/internal_threading.h b/projects/rocprofiler-sdk/source/include/rocprofiler/internal_threading.h new file mode 100644 index 0000000000..e8133c914e --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/internal_threading.h @@ -0,0 +1,123 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. 
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup INTERNAL_THREADING Internal thread handling + * + * Callbacks before and after threads created internally by libraries + * + * @{ + */ + +/** + * @brief Enumeration for specifying which libraries you want callbacks before and after the library + * creates an internal thread. These callbacks will be invoked on the thread that is about to create + * the new thread (not on the newly created thread). In thread-aware tools that wrap pthread_create, + * this can be used to disable the wrapper before the pthread_create invocation and re-enable the + * wrapper afterwards. In many cases, tools will want to ignore the thread(s) created by rocprofiler + * since these threads do not exist in the normal application execution, whereas the internal + * threads for HSA, HIP, etc. are created in normal application execution; however, the HIP, HSA, + * etc. internal threads are typically background threads which just monitor kernel completion and + * are unlikely to contribute to any performance issues. + */ +typedef enum +{ + ROCPROFILER_LIBRARY = (1 << 0), + ROCPROFILER_HSA_LIBRARY = (1 << 1), + ROCPROFILER_HIP_LIBRARY = (1 << 2), + ROCPROFILER_MARKER_LIBRARY = (1 << 3), + ROCPROFILER_LIBRARY_LAST = ROCPROFILER_MARKER_LIBRARY, +} rocprofiler_internal_thread_library_t; + +/** + * @brief Callback type before and after internal thread creation. 
@see + * rocprofiler_at_internal_thread_create + * + */ +typedef void (*rocprofiler_internal_thread_library_cb_t)(rocprofiler_internal_thread_library_t, + void*); + +/** + * @brief Invoke this function to receive callbacks before and after the creation of an internal + * thread by a library. These callbacks are invoked on the thread which is creating the internal + * thread(s). Please note that the postcreate callback is guaranteed to be invoked after the underlying + * system call to create a new thread but it does not guarantee that the new thread has been + * started. Please note that once these callbacks are registered, they cannot be removed so the + * caller is responsible for ignoring these callbacks if they want to ignore them beyond a certain + * point in the application. + * + * @param precreate [in] Callback invoked immediately before a new internal thread is created + * @param postcreate [in] Callback invoked immediately after a new internal thread is created + * @param libs [in] Bitwise-or of libraries, e.g. `ROCPROFILER_LIBRARY | ROCPROFILER_MARKER_LIBRARY` + * means the callbacks will be invoked whenever rocprofiler and/or the marker library create + * internal threads but not when the HSA or HIP libraries create internal threads. + * @param data [in] Data shared between callbacks + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_at_internal_thread_create(rocprofiler_internal_thread_library_cb_t precreate, + rocprofiler_internal_thread_library_cb_t postcreate, + int libs, + void* data); + +/** + * @brief opaque handle to an internal thread identifier which delivers callbacks for buffers + */ +typedef struct +{ + uint64_t handle; +} rocprofiler_callback_thread_t; + +/** + * @brief Create a handle to a unique thread (created by rocprofiler) which, when associated with a + * particular buffer, will guarantee those buffered results always get delivered on the same thread. 
+ * This is useful to prevent/control thread-safety issues and/or enable multithreaded processing of + * buffers with non-overlapping data + * + * @param [in] cb_thread_id User-provided pointer to a @ref rocprofiler_callback_thread_t + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_create_callback_thread(rocprofiler_callback_thread_t* cb_thread_id) + ROCPROFILER_NONNULL(1); + +/** + * @brief By default, all buffered results are delivered on the same thread. Using @ref + * rocprofiler_create_callback_thread, one or more buffers can be assigned to deliver their results + * on a unique, dedicated thread. + * + * @param [in] buffer_id Buffer identifier + * @param [in] cb_thread_id Callback thread identifier via @ref rocprofiler_create_callback_thread + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_assign_callback_thread(rocprofiler_buffer_id_t buffer_id, + rocprofiler_callback_thread_t cb_thread_id); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/marker.h b/projects/rocprofiler-sdk/source/include/rocprofiler/marker.h index 432bfca574..9bb3afe123 100644 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/marker.h +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/marker.h @@ -24,9 +24,6 @@ #include -#include - -typedef uint32_t rocprofiler_trace_record_marker_operation_kind_t; typedef struct rocprofiler_roctx_api_data_s rocprofiler_roctx_api_data_t; struct rocprofiler_roctx_api_data_s diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/pc_sampling.h b/projects/rocprofiler-sdk/source/include/rocprofiler/pc_sampling.h new file mode 100644 index 0000000000..14992b6e45 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/pc_sampling.h @@ -0,0 +1,79 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person 
obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup PC_SAMPLING_SERVICE PC Sampling Service + * @{ + */ + +/** + * @brief Create PC Sampling Service. + * + * @param [in] context_id + * @param [in] agent + * @param [in] method + * @param [in] unit + * @param [in] interval + * @param [in] buffer_id + * @return ::rocprofiler_status_t + * + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t context_id, + rocprofiler_agent_t agent, + rocprofiler_pc_sampling_method_t method, + rocprofiler_pc_sampling_unit_t unit, + uint64_t interval, + rocprofiler_buffer_id_t buffer_id); + +struct rocprofiler_pc_sampling_configuration_s +{ + rocprofiler_pc_sampling_method_t method; + rocprofiler_pc_sampling_unit_t unit; + size_t min_interval; + size_t max_interval; + uint64_t flags; +}; + +/** + * @brief Query PC Sampling Configuration. 
+ * + * @param [in] agent + * @param [out] config + * @param [out] config_count + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_query_pc_sampling_agent_configurations(rocprofiler_agent_t agent, + rocprofiler_pc_sampling_configuration_t* config, + size_t* config_count) ROCPROFILER_NONNULL(2, 3); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/profile_config.h b/projects/rocprofiler-sdk/source/include/rocprofiler/profile_config.h new file mode 100644 index 0000000000..21e07a7593 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/profile_config.h @@ -0,0 +1,63 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +#pragma once + +#include +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup PROFILE_CONFIG Profile Configurations + * + * @{ + */ + +/** + * @brief Create Profile Configuration. + * + * @param [in] agent Agent identifier + * @param [in] counters_list List of GPU counters + * @param [in] counters_count Size of counters list + * @param [out] config_id Identifier for GPU counters group + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_create_profile_config(rocprofiler_agent_t agent, + rocprofiler_counter_id_t* counters_list, + size_t counters_count, + rocprofiler_profile_config_id_t* config_id) + ROCPROFILER_NONNULL(4); + +/** + * @brief Destroy Profile Configuration. + * + * @param [in] config_id + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_destroy_profile_config(rocprofiler_profile_config_id_t config_id); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/registration.h b/projects/rocprofiler-sdk/source/include/rocprofiler/registration.h new file mode 100644 index 0000000000..500e23e85c --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/registration.h @@ -0,0 +1,220 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. 
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** + * @defgroup REGISTRATION_GROUP Tool registration + * + * Data types and functions for tool registration with rocprofiler + * @{ + */ + +/** + * @brief A pointer to this data structure is provided to the client tool initialization function. + * The name member can be set by the client to assist with debugging (e.g. rocprofiler cannot start + * your context because there is a conflicting context started by `` -- at least that is the + * plan). The handle member is a unique identifier assigned by rocprofiler for the client and the + * client can store it and pass it to the @ref rocprofiler_client_finalize_t function to force + * finalization (i.e. deactivate all of its contexts) for the client. 
+ */ +typedef struct +{ + const char* name; ///< clients should set this value for debugging + const uint32_t handle; ///< internal handle +} rocprofiler_client_id_t; + +typedef void (*rocprofiler_client_finalize_t)(rocprofiler_client_id_t); + +typedef int (*rocprofiler_tool_initialize_t)(rocprofiler_client_finalize_t finalize_func, + void* tool_data); + +typedef void (*rocprofiler_tool_finalize_t)(void* tool_data); + +/** + * @brief Data structure containing an initialization function, a finalization function, and tool data + * + */ +typedef struct +{ + size_t size; ///< in case of future extensions + rocprofiler_tool_initialize_t initialize; ///< context creation + rocprofiler_tool_finalize_t finalize; ///< cleanup + void* tool_data; ///< data to provide to init and fini callbacks +} rocprofiler_tool_configure_result_t; + +/** + * @brief Query whether rocprofiler has already scanned the binary for all the instances of @ref + * rocprofiler_configure (or is currently scanning). If rocprofiler has completed its scan, clients + * can directly register themselves with rocprofiler. + * + * @param [out] status 0 indicates rocprofiler has not been initialized (i.e. configured), 1 + * indicates rocprofiler has been initialized, -1 indicates rocprofiler is currently initializing. + * @return rocprofiler_status_t + */ +rocprofiler_status_t +rocprofiler_is_initialized(int* status) ROCPROFILER_API; + +/** + * @brief Query rocprofiler finalization status. + * + * @param [out] status 0 indicates rocprofiler has not been finalized, 1 indicates rocprofiler has + * been finalized, -1 indicates rocprofiler is currently finalizing. + * @return rocprofiler_status_t + */ +rocprofiler_status_t +rocprofiler_is_finalized(int* status) ROCPROFILER_API; + +/** + * @brief This is the special function that tools define to enable rocprofiler support. 
The tool + * should return a pointer to + * @ref rocprofiler_tool_configure_result_t which will contain (1) a pointer to an + * initialization function where all the contexts are created, (2) a pointer to a finalization function (if + * necessary) which will be invoked when rocprofiler shuts down, and (3) a pointer to any data that + * the tool wants communicated between the @ref rocprofiler_tool_configure_result_t::initialize and + * @ref rocprofiler_tool_configure_result_t::finalize functions. If the user + * + * @param [in] version The version of rocprofiler: `(10000 * major) + (100 * minor) + patch` + * @param [in] runtime_version String descriptor of the rocprofiler version and other relevant info. + * @param [in] priority How many client tools were initialized before this client tool + * @param [in, out] client_id tool identifier value. + * @return rocprofiler_tool_configure_result_t* + * + * @code{.cpp} + * #include + * + * static rocprofiler_client_id_t my_client_id; + * static rocprofiler_client_finalize_t my_fini_func; + * static int my_tool_data = 1234; + * + * static int my_init_func(rocprofiler_client_finalize_t fini_func, + * void* tool_data) + * { + * my_fini_func = fini_func; + * + * assert(*static_cast<int*>(tool_data) == 1234 && "tool_data is wrong"); + * + * rocprofiler_context_id_t ctx; + * rocprofiler_create_context(&ctx); + * + * if(int valid_ctx = 0; + * rocprofiler_context_is_valid(ctx, &valid_ctx) != ROCPROFILER_STATUS_SUCCESS || + * valid_ctx != 0) + * { + * // notify rocprofiler that initialization failed + * // and all the contexts, buffers, etc. created + * // should be ignored + * return -1; + * } + * + * if(rocprofiler_start_context(ctx) != ROCPROFILER_STATUS_SUCCESS) + * { + * // notify rocprofiler that initialization failed + * // and all the contexts, buffers, etc. 
created + * // should be ignored + * return -1; + * } + * + * // no errors + * return 0; + * } + * + * static void my_tool_fini_func(void* tool_data) + * { + * assert(*static_cast<int*>(tool_data) == 1234 && "tool_data is wrong"); + * } + * + * rocprofiler_tool_configure_result_t* + * rocprofiler_configure(uint32_t version, + * const char* runtime_version, + * uint32_t priority, + * rocprofiler_client_id_t* client_id) + * { + * // only activate if main tool + * if(priority > 0) return nullptr; + * + * // set the client name + * client_id->name = "ExampleTool"; + * + * // make a copy of client info + * my_client_id = *client_id; + * + * // compute major/minor/patch version info + * uint32_t major = version / 10000; + * uint32_t minor = (version % 10000) / 100; + * uint32_t patch = version % 100; + * + * // print info + * printf("Configuring rocprofiler (v%u.%u.%u) [%s]\n", major, minor, patch, runtime_version); + * + * // create configure data (size comes first, matching the struct layout) + * static auto cfg = rocprofiler_tool_configure_result_t{ sizeof(rocprofiler_tool_configure_result_t), &my_init_func, + * &my_tool_fini_func, + * &my_tool_data }; + * + * // return pointer to configure data + * return &cfg; + * } + * @endcode + */ +rocprofiler_tool_configure_result_t* +rocprofiler_configure(uint32_t version, + const char* runtime_version, + uint32_t priority, + rocprofiler_client_id_t* client_id) ROCPROFILER_PUBLIC_API; + +// NOTE: we use ROCPROFILER_PUBLIC_API above instead of ROCPROFILER_API because we always +// want the symbol to be visible when the user includes the header for the prototype + +/** + * @brief Function pointer typedef for @ref rocprofiler_configure function + * @param [in] version The version of rocprofiler: `(10000 * major) + (100 * minor) + patch` + * @param [in] runtime_version String descriptor of the rocprofiler version and other relevant info. 
+ * @param [in] priority How many client tools were initialized before this client tool + * @param [in, out] client_id tool identifier value. 
+ */ +typedef rocprofiler_tool_configure_result_t* (*rocprofiler_configure_func_t)( + uint32_t version, + const char* runtime_version, + uint32_t priority, + rocprofiler_client_id_t* client_id); + +/** + * @brief Function for explicitly registering a configuration with rocprofiler. This can be invoked + * before any ROCm runtimes (lazily) initialize and context(s) can be started before the runtimes + * initialize. + * @param [in] configure_func Address of @ref rocprofiler_configure function. A null pointer is + * acceptable if the address is not known + * @returns rocprofiler_status_t If rocprofiler has already been configured, or is currently being + * configured, this function will return @ref ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED. + */ +rocprofiler_status_t +rocprofiler_force_configure(rocprofiler_configure_func_t configure_func) ROCPROFILER_API; + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/rocprofiler.h b/projects/rocprofiler-sdk/source/include/rocprofiler/rocprofiler.h index 4254e0a286..f3c466fdd7 100644 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/rocprofiler.h +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/rocprofiler.h @@ -25,28 +25,7 @@ #include #include -/** @defgroup SYMBOL_VERSIONING_GROUP Symbol Versions - * - * The names used for the shared library versioned symbols. - * - * Every function is annotated with one of the version macros defined in this - * section. Each macro specifies a corresponding symbol version string. After - * dynamically loading the shared library with @p dlopen, the address of each - * function can be obtained using @p dlsym with the name of the function and - * its corresponding symbol version string. An error will be reported by @p - * dlvsym if the installed library does not support the version for the - * function specified in this version of the interface. 
- * - * @{ - */ - -/** - * The function was introduced in version 10.0 of the interface and has the - * symbol version string of ``"ROCPROFILER_10.0"``. - */ -#define ROCPROFILER_VERSION_10_0 - -/** @} */ +#include "rocprofiler/defines.h" /** @defgroup VERSIONING_GROUP Library Versioning * @@ -59,1267 +38,51 @@ * less than or equal to the installed library minor version number. */ -#include "rocprofiler/defines.h" +#include "rocprofiler/agent.h" +#include "rocprofiler/agent_profile.h" +#include "rocprofiler/buffer.h" +#include "rocprofiler/buffer_tracing.h" +#include "rocprofiler/callback_tracing.h" +#include "rocprofiler/context.h" +#include "rocprofiler/counters.h" +#include "rocprofiler/dispatch_profile.h" +#include "rocprofiler/external_correlation.h" +#include "rocprofiler/fwd.h" #include "rocprofiler/hip.h" #include "rocprofiler/hsa.h" #include "rocprofiler/marker.h" +#include "rocprofiler/pc_sampling.h" +#include "rocprofiler/profile_config.h" +#include "rocprofiler/spm.h" #include "rocprofiler/version.h" -#ifdef __cplusplus -extern "C" { -#endif /* __cplusplus */ +ROCPROFILER_EXTERN_C_INIT /** - * @fn void rocprofiler_get_version(uint32_t* major, uint32_t* minor, uint32_t* patch) - * @param [out] major The major version number is stored if non-NULL. - * @param [out] minor The minor version number is stored if non-NULL. - * @param [out] patch The patch version number is stored if non-NULL. - * @addtogroup VERSIONING_GROUP - * * @brief Query the version of the installed library. * * Return the version of the installed library. This can be used to check if * it is compatible with this interface version. This function can be used * even when the library is not initialized. - */ -void ROCPROFILER_API -rocprofiler_get_version(uint32_t* major, uint32_t* minor, uint32_t* patch) - ROCPROFILER_NONNULL(1, 2, 3); - -/** - * @defgroup STATUS_CODES Status codes - * @{ - */ - -/** - * @brief Status codes. 
- * - */ -typedef enum -{ - ROCPROFILER_STATUS_SUCCESS = 0, - ROCPROFILER_STATUS_ERROR, - ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND, - ROCPROFILER_STATUS_ERROR_FILTER_NOT_FOUND, - ROCPROFILER_STATUS_ERROR_INCORRECT_DOMAIN, - ROCPROFILER_STATUS_ERROR_INVALID_DOMAIN_ID, - ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND, - ROCPROFILER_STATUS_ERROR_HAS_ACTIVE_CONTEXT, - ROCPROFILER_STATUS_ERROR_INVALID_OPERATION_ID, - ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_ACTIVE, - ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED, - ROCPROFILER_STATUS_LAST, -} rocprofiler_status_t; - -/** @} */ - -/** - * @defgroup CONTEXT_OPERATIONS Context - * @{ - */ - -/** - * @brief Context ID. - * - */ -typedef struct -{ - uint64_t handle; -} rocprofiler_context_id_t; - -/** - * The NULL Context handle. - */ -#define ROCPROFILER_CONTEXT_NONE ROCPROFILER_HANDLE_LITERAL(rocprofiler_context_id_t, 0) - -/** - * @brief Create context. - * - * @param context_id [out] Context identifier - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_create_context(rocprofiler_context_id_t* context_id) ROCPROFILER_NONNULL(1); - -/** - * @brief Start context. - * - * @param [in] context_id - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_start_context(rocprofiler_context_id_t context_id); - -/** - * @brief Stop context. - * - * @param [in] context_id - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_stop_context(rocprofiler_context_id_t context_id); - -/** @} */ - -/** - * @defgroup RECORDS ROCProfiler Records - * @{ - */ - -/** @} */ - -/** - * @brief Buffer ID. - * @addtogroup BUFFER_HANDLING - */ -typedef struct -{ - uint64_t handle; -} rocprofiler_buffer_id_t; - -/** @defgroup SERVICE_OPERATIONS Services - * @{ - */ - -/** - * @brief Agent type. 
- */ -typedef enum -{ - ROCPROFILER_AGENT_TYPE_NONE = 0, ///< agent is unknown type - ROCPROFILER_AGENT_TYPE_CPU, ///< agent is CPU - ROCPROFILER_AGENT_TYPE_GPU, ///< agent is GPU - ROCPROFILER_AGENT_TYPE_LAST, -} rocprofiler_agent_type_t; - -/** - * @brief Agent Identifier - */ -typedef struct -{ - uint64_t handle; -} rocprofiler_agent_id_t; - -typedef struct rocprofiler_pc_sampling_configuration_s rocprofiler_pc_sampling_configuration_t; - -typedef struct rocprofiler_pc_sampling_config_array_s -{ - rocprofiler_pc_sampling_configuration_t* data; - size_t size; -} rocprofiler_pc_sampling_config_array_t; - -/** - * @brief Agent. - */ -typedef struct -{ - rocprofiler_agent_id_t id; - rocprofiler_agent_type_t type; - const char* name; - rocprofiler_pc_sampling_config_array_t pc_sampling_configs; -} rocprofiler_agent_t; - -/** - * @brief Callback function type for querying the available agents - * - * @param [in] agents Array of pointers to agents - * @param [in] num_agents Number of agents in array - * @param [in] user_data Data pointer passback - * @return ::rocprofiler_status_t - */ -typedef rocprofiler_status_t (*rocprofiler_available_agents_cb_t)(rocprofiler_agent_t** agents, - size_t num_agents, - void* user_data); - -/** - * @brief Receive synchronous callback with an array of available agents at moment of invocation - * - * @param [in] callback Callback function accepting list of agents - * @param [in] agent_size Should be set to sizeof(rocprofiler_agent_t) - * @param [in] user_data Data pointer provided to callback - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback, - size_t agent_size, - void* user_data) ROCPROFILER_NONNULL(1); - -/** - * @brief Queue ID. - */ -typedef struct -{ - uint64_t handle; -} rocprofiler_queue_id_t; - -/** - * @brief Thread ID - */ -typedef uint64_t rocprofiler_thread_id_t; - -/** - * @brief ROCProfiler Record Correlation ID. 
- * To be reviewed? - */ -typedef struct -{ - uint64_t handle; -} rocprofiler_correlation_id_t; - -/** - * @brief ROCProfiler Timestamp. - * - */ -typedef uint64_t rocprofiler_timestamp_t; - -/** - * @brief ROCProfiler Address. - */ -typedef uint64_t rocprofiler_address_t; - -/** @defgroup TRACING_SERVICES Tracing Services - * @{ - */ - -/** - * @brief Tracing Domain ID. - * - * Domains for tracing - * - * if the value is equal to zero that means all operations will be considered - * for tracing. - * - */ -typedef enum -{ - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE = 0, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_MARKER_API, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX = ROCPROFILER_TRACER_ACTIVITY_DOMAIN_MARKER_API, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_KFD_API, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_EXT_API, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_OPS, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_OPS, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_EVT, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST -} rocprofiler_tracer_activity_domain_t; - -/** - * @brief Tracing Operation ID. - * - * Depending on the kind, operations can be determined - * - * if the value is equal to zero that means all operations will be considered - * for tracing. - * - */ -typedef uint32_t rocprofiler_trace_operation_t; - -/** @defgroup CALLBACK_TRACING_SERVICE Callback Tracing Service - * @{ - */ - -/** - * @brief Service Callback Tracing Kind. - */ -typedef enum -{ - ROCPROFILER_SERVICE_CALLBACK_TRACING_NONE = 0, - ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API = 1, - ROCPROFILER_SERVICE_CALLBACK_TRACING_HIP_API = 2, - ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER = 3, - ROCPROFILER_SERVICE_CALLBACK_TRACING_CODE_OBJECT = 4, - ROCPROFILER_SERVICE_CALLBACK_TRACING_KERNEL_DISPATCH = 5, - ROCPROFILER_SERVICE_CALLBACK_TRACING_HELPER_THREAD = 6, - // TODO: Is tracing runtime threads possible? 
- // ROCPROFILER_SERVICE_CALLBACK_TRACING_RUNTIME_THREAD = 7, - ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST, -} rocprofiler_service_callback_tracing_kind_t; - -/** - * @defgroup HSA_API_CALLBACK_TRACING_RECORDS HSA API Callback Tracing Records - * @{ - */ - -/** - * @brief ROCProfiler HSA API Callback Data. - * - * Depending on the operation kind, the data can be casted to the corresponding - * structure. - * - */ -typedef void* rocprofiler_hsa_api_callback_api_data_t; - -/** - * @brief ROCProfiler HSA API Callback Data. - */ -typedef struct -{ - rocprofiler_correlation_id_t correlation_id; - rocprofiler_hsa_api_callback_api_data_t data; // Arguments or api_data? -} rocprofiler_hsa_api_callback_tracer_data_t; - -/** - * @brief ROCProfiler HIP API Callback Data. - * - * Depending on the operation kind, the data can be casted to the corresponding - * structure. - * - */ -typedef void* rocprofiler_hip_api_callback_api_data_t; - -/** - * @brief ROCProfiler HIP API Tracer Callback Data. - */ -typedef struct -{ - rocprofiler_correlation_id_t correlation_id; - rocprofiler_address_t host_kernel_address; - rocprofiler_hip_api_callback_api_data_t data; // Arguments or api_data? -} rocprofiler_hip_api_callback_tracer_data_t; - -/** - * @brief ROCProfiler Marker Callback Data. - * - * Depending on the operation kind, the data can be casted to the corresponding - * structure. - * - */ -typedef void* rocprofiler_marker_callback_api_data_t; - -/** - * @brief ROCProfiler Marker Tracer Callback Data. - */ -typedef struct -{ - rocprofiler_correlation_id_t correlation_id; - rocprofiler_marker_callback_api_data_t data; // Arguments or api_data? -} rocprofiler_marker_callback_tracer_data_t; - -/** - * @brief ROCProfiler Tracing Helper Thread. 
- * - */ -typedef enum -{ - - ROCPROFILER_TRACING_HELPER_THREAD_START = 0, - ROCPROFILER_TRACING_HELPER_THREAD_COMPLETE = 1, - ROCPROFILER_TRACING_HELPER_THREAD_LAST, -} rocprofiler_tracing_helper_thread_operation_t; - -/** - * @brief ROCProfiler Helper Thread Callback Data. - * - */ -typedef struct -{ - rocprofiler_tracing_helper_thread_operation_t id; -} rocprofiler_helper_thread_callback_tracer_data_t; - -/** - * @brief ROCProfiler Code Object Tracer Operation. - */ -typedef enum -{ - ROCPROFILER_TRACING_CODE_OBJECT_NONE = 0, - ROCPROFILER_TRACING_CODE_OBJECT_LOAD = 1, - ROCPROFILER_TRACING_CODE_OBJECT_UNLOAD = 2, - ROCPROFILER_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_REGISTER = 3, - ROCPROFILER_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_UNREGISTER = 4, - // Should we remove these as they will be part of hipRegisterFunction API - // tracing? ROCPROFILER_TRACING_CODE_OBJECT_REGISTER_HOST_KERNEL_SYMBOL = 5, - // (?) ROCPROFILER_TRACING_CODE_OBJECT_UNREGISTER_HOST_KERNEL_SYMBOL = 6, (?) - ROCPROFILER_TRACING_CODE_OBJECT_LAST, -} rocprofiler_tracing_code_object_operation_t; - -/** - * @brief ROCProfiler Code Object Load Tracer Callback Record. - */ -typedef struct -{ - uint64_t load_base; // code object load base - uint64_t load_size; // code object load size - const char* uri; // URI string (NULL terminated) - // uint32_t storage_type; // code object storage type (Need Review?) - // int storage_file; // origin file descriptor (Need Review?) - // uint64_t memory_base; // origin memory base (Need Review?) - // uint64_t memory_size; // origin memory size (Need Review?) - // uint64_t load_delta; // code object load delta (Need Review?) -} rocprofiler_callback_tracer_code_object_load_data_t; - -/** - * @brief ROCProfiler Code Object UnLoad Tracer Callback Record. 
- * - */ -typedef struct -{ - uint64_t load_base; // code object load base -} rocprofiler_callback_tracer_code_object_unload_data_t; - -/** - * @brief ROCProfiler Code Object Device Kernel Symbol Tracer Callback Record. - * - */ -typedef struct -{ - const char* kernel_name; // kernel name string (NULL terminated) - rocprofiler_address_t kernel_descriptor; // kernel descriptor -} rocprofiler_callback_tracer_code_object_device_kernel_symbol_data_t; - -/** - * @brief ROCProfiler Code Object Register Host Kernel Symbol Tracer Callback - * Record. - * - */ -typedef struct -{ - rocprofiler_address_t host_address; // host address - // Should this be nullptr if it is unregister? - const char* kernel_name; // kernel name string (NULL terminated) - rocprofiler_address_t kernel_descriptor; // kernel descriptor -} rocprofiler_callback_tracer_code_object_register_host_kernel_symbol_data_t; - -/** @} */ - -/** - * @brief API Tracing callback data. - * - * This can be casted to: - * rocprofiler_hsa_callback_data_t if the record kind is - * @ref ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API - * rocprofiler_hip_callback_data_t if the record kind is - * @ref ROCPROFILER_SERVICE_CALLBACK_TRACING_HIP_API - * rocprofiler_marker_callback_data_t if the record kind is - * @ref ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER - * - */ -typedef void* rocprofiler_tracer_callback_data_t; - -/** - * @brief API Tracing callback operation kind. - * - * Depending on the ::rocprofiler_service_callback_tracing_kind_t - * the operation kind can be determined from the following: - * rocprofiler_marker_trace_record_operation_t for Markers - * rocprofiler_hsa_trace_record_operation_t for HSA API - * rocprofiler_hip_trace_record_operation_t for HIP API - * rocprofiler_code_object_record_operation_t for Code object tracing - * - */ -typedef uint32_t rocprofiler_tracer_callback_operation_t; - -/** - * @brief API Tracing callback function. 
- */ -typedef void (*rocprofiler_tracer_callback_t)(rocprofiler_service_callback_tracing_kind_t kind, - rocprofiler_tracer_callback_operation_t operation, - rocprofiler_tracer_callback_data_t data, - void* callback_args); - -/** - * @brief Configure Callback Tracing Service. - * - * @param [in] context_id - * @param [in] kind - * @param [in] operations - * @param [in] operations_count - * @param [in] callback - * @param [in] callback_args - * @return ::rocprofiler_status_t - * - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_configure_callback_tracing_service(rocprofiler_context_id_t context_id, - rocprofiler_service_callback_tracing_kind_t kind, - rocprofiler_trace_operation_t* operations, - size_t operations_count, - rocprofiler_tracer_callback_t callback, - void* callback_args); - -/** - * @brief Query Callback Trace Kind Name. - * - * @param [in] kind - * @param [out] name if nullptr, size will be returned - * @param [out] size - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_callback_trace_kind_name(rocprofiler_service_callback_tracing_kind_t kind, - const char* name, - size_t* size) ROCPROFILER_NONNULL(3); - -/** - * @brief General Operation kind - * - * That can be used to represent one of the following: - * - ::rocprofiler_trace_record_hsa_operation_kind_t - * - ::rocprofiler_trace_record_hip_operation_kind_t - * - ::rocprofiler_trace_record_marker_operation_kind_t - * - */ -typedef uint32_t rocprofiler_trace_record_operation_kind_t; - -/** - * @brief Query callback kind operation name. 
- * - * @param [in] kind - * @param [in] api_trace_operation - * @param [out] name if nullptr, size will be returned - * @param [out] size - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_callback_kind_operation_name( - rocprofiler_service_callback_tracing_kind_t kind, - rocprofiler_trace_record_operation_kind_t api_trace_operation, - const char* name, - size_t* size) ROCPROFILER_NONNULL(4); - -/** @} */ - -/** @defgroup BUFFER_TRACING_SERVICE Buffer Tracing Service - * @{ - */ - -/** - * @brief Service Buffer Tracing Kind. - */ -typedef enum -{ - ROCPROFILER_SERVICE_BUFFER_TRACING_NONE = 0, - ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API = 1, - ROCPROFILER_SERVICE_BUFFER_TRACING_HIP_API = 2, - ROCPROFILER_SERVICE_BUFFER_TRACING_MARKER = 3, - ROCPROFILER_SERVICE_BUFFER_TRACING_MEMORY_COPY = 4, - ROCPROFILER_SERVICE_BUFFER_TRACING_KERNEL_DISPATCH = 5, - ROCPROFILER_SERVICE_BUFFER_TRACING_PAGE_MIGRATION = 6, - ROCPROFILER_SERVICE_BUFFER_TRACING_SCRATCH_MEMORY = 7, - ROCPROFILER_SERVICE_BUFFER_TRACING_EXTERNAL_CORRELATION = 8, - // To determine if this is possible to implement? - // ROCPROFILER_SERVICE_BUFFER_TRACING_QUEUE_SCHEDULING = 9, - // Do we need to keep it in buffer tracing? - // ROCPROFILER_SERVICE_BUFFER_TRACING_CODE_OBJECT = 10, - ROCPROFILER_SERVICE_BUFFER_TRACING_LAST, -} rocprofiler_service_buffer_tracing_kind_t; - -/** - * @brief ROCProfiler Buffer Tracing Record Header. - */ -typedef struct -{ - rocprofiler_service_buffer_tracing_kind_t kind; - rocprofiler_correlation_id_t correlation_id; -} rocprofiler_buffer_tracing_record_header_t; - -/** - * @defgroup HSA_API_CALLBACK_TRACING_RECORDS HSA API Callback Tracing Records - * @{ - */ - -/** - * @brief ROCProfiler Buffer HSA API Tracer Record. 
- */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - rocprofiler_trace_record_hsa_operation_kind_t operation; // rocprofiler/hsa.h - rocprofiler_timestamp_t start_timestamp; - rocprofiler_timestamp_t end_timestamp; - rocprofiler_thread_id_t thread_id; -} rocprofiler_buffer_tracing_hsa_api_record_t; - -/** - * @brief ROCProfiler Buffer HIP API Tracer Record. - */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - rocprofiler_trace_record_hip_operation_kind_t operation; // rocprofiler/hip.h - rocprofiler_timestamp_t start_timestamp; - rocprofiler_timestamp_t end_timestamp; - rocprofiler_thread_id_t thread_id; -} rocprofiler_buffer_tracing_hip_api_record_t; - -/** - * @brief ROCProfiler Buffer Marker Tracer Record. - */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - rocprofiler_trace_record_marker_operation_kind_t operation; // rocprofiler/marker.h - rocprofiler_timestamp_t timestamp; - rocprofiler_thread_id_t thread_id; - uint64_t marker_id; // rocprofiler_marker_id_t - // const char* message; // (Need Review?) -} rocprofiler_buffer_tracing_marker_record_t; - -/** - * @brief Memory Copy Operation. - */ -typedef enum -{ - ROCPROFILER_TRACER_MEMORY_NONE = 0, - ROCPROFILER_TRACER_MEMORY_COPY_DEVICE_TO_HOST = 1, - ROCPROFILER_TRACER_MEMORY_HOST_TO_DEVICE = 2, - ROCPROFILER_TRACER_MEMORY_DEVICE_TO_DEVICE = 3, - ROCPROFILER_TRACER_MEMORY_LAST, -} rocprofiler_trace_memory_copy_operation_t; - -/** - * @brief ROCProfiler Buffer Memory Copy Tracer Record. - */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - /** - * Memory copy operation that can be derived from - * ::rocprofiler_trace_record_operation_kind_t - */ - uint32_t operation; - rocprofiler_timestamp_t start_timestamp; - rocprofiler_timestamp_t end_timestamp; - rocprofiler_queue_id_t queue_id; -} rocprofiler_buffer_tracing_memory_copy_record_t; - -/** - * @brief ROCProfiler Buffer Kernel Dispatch Tracer Record. 
- */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - rocprofiler_timestamp_t start_timestamp; - rocprofiler_timestamp_t end_timestamp; - rocprofiler_queue_id_t queue_id; - const char* kernel_name; -} rocprofiler_buffer_tracing_kernel_dispatch_record_t; - -/** - * @brief ROCProfiler Buffer Page Migration Tracer Record. - */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - rocprofiler_timestamp_t start_timestamp; - rocprofiler_timestamp_t end_timestamp; - rocprofiler_queue_id_t queue_id; - // Not Sure What is the info needed here? -} rocprofiler_buffer_tracing_page_migration_record_t; - -/** - * @brief ROCProfiler Buffer Scratch Memory Tracer Record. - */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - rocprofiler_timestamp_t start_timestamp; - rocprofiler_timestamp_t end_timestamp; - rocprofiler_queue_id_t queue_id; - // Not Sure What is the info needed here? -} rocprofiler_buffer_tracing_scratch_memory_record_t; - -/** - * @brief ROCProfiler Buffer Queue Scheduling Tracer Record. - */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - rocprofiler_timestamp_t start_timestamp; - rocprofiler_timestamp_t end_timestamp; - rocprofiler_queue_id_t queue_id; - // Not Sure What is the info needed here? -} rocprofiler_buffer_tracing_queue_scheduling_record_t; - -/** - * @brief ROCProfiler Code Object Tracer Buffer Record. - * - * We need to guarantee that these records are in the buffer before the - * corresponding Exit Phase API calls are called. - */ -// typedef struct { -// rocprofiler_buffer_tracing_record_header_t header; -// rocprofiler_tracing_code_object_kind_id_t kind; -// } rocprofiler_buffer_tracing_code_object_header_t; - -/** - * @brief ROCProfiler Code Object Load Tracer Buffer Record. 
- * - */ -// typedef struct { -// rocprofiler_buffer_tracing_code_object_header_t header; -// uint64_t load_base; // code object load base -// uint64_t load_size; // code object load size -// const char *uri; // URI string (NULL terminated) -// rocprofiler_timestamp_t timestamp; -// // uint32_t storage_type; // code object storage type (Need Review?) -// // int storage_file; // origin file descriptor (Need Review?) -// // uint64_t memory_base; // origin memory base (Need Review?) -// // uint64_t memory_size; // origin memory size (Need Review?) -// // uint64_t load_delta; // code object load delta (Need Review?) -// } rocprofiler_buffer_tracing_code_object_load_record_t; - -/** - * @brief ROCProfiler Code Object UnLoad Tracer Buffer Record. - * - */ -// typedef struct { -// rocprofiler_buffer_tracing_code_object_header_t header; -// uint64_t load_base; // code object load base -// rocprofiler_timestamp_t timestamp; -// } rocprofiler_buffer_tracing_code_object_unload_record_t; - -/** - * @brief ROCProfiler Code Object Kernel Symbol Tracer Buffer Record. - * - */ -// typedef struct { -// rocprofiler_buffer_tracing_code_object_header_t header; -// const char *kernel_name; // kernel name string (NULL terminated) -// uint64_t kernel_descriptor; // kernel descriptor (Need to be changed from -// // uint64_t to ::rocprofiler_address_t) -// // rocprofiler_timestamp_t timestamp; // (Need Review?) -// } rocprofiler_buffer_tracing_code_object_kernel_symbol_record_t; - -/** - * @brief ROCProfiler External Correlation ID. - * - */ -typedef struct -{ - uint64_t id; -} rocprofiler_external_correlation_id_t; - -/** - * @brief ROCProfiler Buffer External Correlation Tracer Record. - */ -typedef struct -{ - rocprofiler_buffer_tracing_record_header_t header; - rocprofiler_external_correlation_id_t external_correlation_id; -} rocprofiler_buffer_tracing_external_correlation_record_t; - -/** @} */ - -/** - * @brief ROCProfiler Push External Correlation ID. 
- * - * @param external_correlation_id - * @return rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_push_external_correlation_id( - rocprofiler_external_correlation_id_t external_correlation_id); - -/** - * @brief ROCProfiler Push External Correlation ID. - * - * @param external_correlation_id - * @return rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_pop_external_correlation_id( - rocprofiler_external_correlation_id_t* external_correlation_id); - -/** - * @brief Configure Buffer Tracing Service. - * - * @param [in] context_id - * @param [in] kind - * @param [in] operations - * @param [in] operations_count - * @param [in] buffer_id - * @return ::rocprofiler_status_t - * - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_configure_buffer_tracing_service(rocprofiler_context_id_t context_id, - rocprofiler_service_buffer_tracing_kind_t kind, - rocprofiler_trace_operation_t* operations, - size_t operations_count, - rocprofiler_buffer_id_t buffer_id); - -/** - * @brief Query Buffer Trace Kind Name. - * - * @param [in] kind - * @param [out] name if nullptr, size will be returned - * @param [out] size - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_buffer_trace_kind_name(rocprofiler_service_buffer_tracing_kind_t kind, - const char* name, - size_t* size) ROCPROFILER_NONNULL(3); - -/** - * @brief Query buffer kind operation name. 
- * - * @param [in] kind - * @param [in] api_trace_operation_id - * @param [out] name if nullptr, size will be returned - * @param [out] size - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_buffer_kind_operation_name( - rocprofiler_service_buffer_tracing_kind_t kind, - rocprofiler_trace_record_operation_kind_t api_trace_operation_id, - const char* name, - size_t* size) ROCPROFILER_NONNULL(4); - -/** @} */ - -/** @} */ - -/** @defgroup PROFILE_CONFIG Profile Configurations - * @{ - */ - -/** - * @brief Counter ID. - * - */ -typedef struct -{ - uint64_t handle; -} rocprofiler_counter_id_t; - -/** - * @brief Profile Configurations - * - */ -typedef struct -{ - uint64_t handle; -} rocprofiler_profile_config_id_t; - -/** - * @brief Create Profile Configuration. - * - * @param [in] agent Agent identifier - * @param [in] counters_list List of GPU counters - * @param [in] counters_count Size of counters list - * @param [out] config_id Identifier for GPU counters group - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_create_profile_config(rocprofiler_agent_t agent, - rocprofiler_counter_id_t* counters_list, - size_t counters_count, - rocprofiler_profile_config_id_t* config_id) - ROCPROFILER_NONNULL(4); - -/** - * @brief Destroy Profile Configuration. - * - * @param [in] config_id - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_destroy_profile_config(rocprofiler_profile_config_id_t config_id); - -/** @} */ - -/** @defgroup PROFILE_COUNTING Profile Counting - * @{ - */ - -/** - * @brief Needs non-typedef specification? - */ -typedef uint32_t rocprofiler_counter_instance_id_t; - -/** - * @brief ROCProfiler Profile Counting Counter per instance. 
- */ -typedef struct -{ - rocprofiler_counter_id_t counter_id; - rocprofiler_counter_instance_id_t instance_id; - double counter_value; -} rocprofiler_record_counter_t; - -/** @defgroup DISPATCH_PROFILE_COUNTING_SERVICE Dispatch Profile Counting - * Service - * @{ - */ - -/** - * @brief ROCProfiler Profile Counting Data. - * - */ -typedef struct -{ - rocprofiler_timestamp_t start_timestamp; - rocprofiler_timestamp_t end_timestamp; - /** - * Counters, including identifiers to get counter information and Counters - * values - * - * Should it be a record per counter? - */ - rocprofiler_record_counter_t* counters; - uint64_t counters_count; - rocprofiler_correlation_id_t correlation_id; -} rocprofiler_dispatch_profile_counting_record_t; - -/** - * @brief Kernel Dispatch Callback - * - * @param [out] queue_id - * @param [out] agent_id - * @param [out] correlation_id - * @param [out] dispatch_packet It can be used to get the kernel descriptor and then using - * code_object tracing, we can get the kernel name. `dispatch_packet->reserved2` is the - * correlation_id used to correlate the dispatch packet with the corresponding API call. - * @param [out] callback_data_args - * @param [in] config - */ -typedef void (*rocprofiler_profile_counting_dispatch_callback_t)( - rocprofiler_queue_id_t queue_id, - rocprofiler_agent_t agent_id, - rocprofiler_correlation_id_t correlation_id, - const hsa_kernel_dispatch_packet_t* dispatch_packet, - void* callback_data_args, - rocprofiler_profile_config_id_t* config); - -/** - * @brief Configure Dispatch Profile Counting Service. 
- * - * @param [in] context_id - * @param [in] agent_id - * @param [in] buffer_id - * @param [in] callback - * @param [in] callback_data_args - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_configure_dispatch_profile_counting_service( - rocprofiler_context_id_t context_id, - rocprofiler_agent_t agent_id, - rocprofiler_buffer_id_t buffer_id, - rocprofiler_profile_counting_dispatch_callback_t callback, - void* callback_data_args); - -/** @} */ - -/** @defgroup AGENT_PROFILE_COUNTING_SERVICE Agent Profile Counting Service - * @{ - */ - -/** - * @brief ROCProfiler Agent Profile Counting Data. - * - */ -typedef struct -{ - /** - * Counters, including identifiers to get counter information and Counters - * values - */ - rocprofiler_record_counter_t* counters; - uint64_t counters_count; -} rocprofiler_agent_profile_counting_data_t; - -/** - * @brief Configure Profile Counting Service for agent. - * - * @param [in] buffer_id - * @param [in] profile_config_id - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_configure_agent_profile_counting_service( - rocprofiler_buffer_id_t buffer_id, - rocprofiler_profile_config_id_t profile_config_id); - -/** - * @brief Sample Profile Counting Service for agent. - * - * @param [out] data // It is always a size of one - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_sample_agent_profile_counting_service(rocprofiler_agent_profile_counting_data_t* data); - -/** @} */ - -/** - * @brief Query Counter name. * - * @param [in] counter_id - * @param [out] name if nullptr, size will be returned - * @param [out] size - * @return ::rocprofiler_status_t + * @param [out] major The major version number is stored if non-NULL. + * @param [out] minor The minor version number is stored if non-NULL. + * @param [out] patch The patch version number is stored if non-NULL. 
+ * @addtogroup VERSIONING_GROUP */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_counter_name(rocprofiler_counter_id_t counter_id, const char* name, size_t* size) - ROCPROFILER_NONNULL(3); +rocprofiler_status_t +rocprofiler_get_version(uint32_t* major, uint32_t* minor, uint32_t* patch) ROCPROFILER_API; /** - * @brief Query Counter Instances Count. - * - * @param [in] counter_id - * @param [out] instance_count - * @return rocprofiler_status_t + * @defgroup MISCELLANEOUS_GROUP Miscellaneous utility functions */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_counter_instance_count(rocprofiler_counter_id_t counter_id, - size_t* instance_count) ROCPROFILER_NONNULL(2); /** - * @brief Query Agent Counters Availability. - * - * @param [in] agent - * @param [out] counters_list - * @param [out] counters_count - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_agent_supported_counters(rocprofiler_agent_t agent, - rocprofiler_counter_id_t* counters_list, - size_t* counters_count) ROCPROFILER_NONNULL(2, 3); - -/** @} */ - -/** @defgroup PC_SAMPLING_SERVICE PC Sampling Service - * @{ - */ - -/** - * @brief ROCProfiler PC Sampling Record. - * + * @brief Get the timestamp value that rocprofiler uses + * @param [out] ts Output address of the rocprofiler timestamp value + * @addtogroup MISCELLANEOUS_GROUP */ -typedef struct -{ - uint64_t pc; - uint64_t dispatch_id; - uint64_t timestamp; - uint64_t hardware_id; - union - { - uint8_t arb_value; - }; - union - { - void* data; - }; -} rocprofiler_pc_sampling_record_t; - -/** - * @brief PC Sampling Method. - * - */ -typedef enum -{ - ROCPROFILER_PC_SAMPLING_METHOD_NONE = 0, - ROCPROFILER_PC_SAMPLING_METHOD_STOCHASTIC = 1, - ROCPROFILER_PC_SAMPLING_METHOD_HOST_TRAP = 2, - ROCPROFILER_PC_SAMPLING_METHOD_LAST, -} rocprofiler_pc_sampling_method_t; - -/** - * @brief PC Sampling Unit. 
- * - */ -typedef enum -{ - ROCPROFILER_PC_SAMPLING_UNIT_NONE = 0, - ROCPROFILER_PC_SAMPLING_UNIT_INSTRUCTIONS = 1, - ROCPROFILER_PC_SAMPLING_UNIT_CYCLES = 2, - ROCPROFILER_PC_SAMPLING_UNIT_TIME = 3, - ROCPROFILER_PC_SAMPLING_UNIT_LAST, -} rocprofiler_pc_sampling_unit_t; - -/** - * @brief Create PC Sampling Service. - * - * @param [in] context_id - * @param [in] agent - * @param [in] method - * @param [in] unit - * @param [in] interval - * @param [in] buffer_id - * @return ::rocprofiler_status_t - * - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t context_id, - rocprofiler_agent_t agent, - rocprofiler_pc_sampling_method_t method, - rocprofiler_pc_sampling_unit_t unit, - uint64_t interval, - rocprofiler_buffer_id_t buffer_id); - -struct rocprofiler_pc_sampling_configuration_s -{ - rocprofiler_pc_sampling_method_t method; - rocprofiler_pc_sampling_unit_t unit; - size_t min_interval; - size_t max_interval; - uint64_t flags; -}; - -/** - * @brief Query PC Sampling Configuration. - * - * @param [in] agent - * @param [out] config - * @param [out] config_count - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_query_pc_sampling_agent_configurations(rocprofiler_agent_t agent, - rocprofiler_pc_sampling_configuration_t* config, - size_t* config_count) ROCPROFILER_NONNULL(2, 3); - -/** @} */ - -/** @defgroup SPM_SERVICE SPM Service - * @{ - */ - -/** - * @brief ROCProfiler SPM Record. - * - */ -typedef struct -{ - /** - * Counters, including identifiers to get counter information and Counters - * values - */ - rocprofiler_record_counter_t* counters; - uint64_t counters_count; -} rocprofiler_spm_record_t; - -/** - * @brief Configure SPM Service. 
- * - * @param [in] context_id - * @param [in] buffer_id - * @param [in] profile_config - * @param [in] interval - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_configure_spm_service(rocprofiler_context_id_t context_id, - rocprofiler_buffer_id_t buffer_id, - rocprofiler_profile_config_id_t profile_config, - uint64_t interval); - -/** @} */ - -/** @} */ - -/** @defgroup BUFFER_HANDLING Buffer - * @{ - * - * Every Buffer is associated with a specific service kind. - * OR - * Every Buffer is associated with a specific service ID. - * - */ - -// TODO: We need to add rocprofiler_record_header_t -/** - * @brief Generic record with a type and a pointer to data - */ -typedef struct -{ - uint64_t kind; - void* payload; -} rocprofiler_record_header_t; - -typedef rocprofiler_record_header_t rocprofiler_record_tracer_t; - -/** - * @brief Async callback function. - * - * @code{.cpp} - * for(size_t i = 0; i < num_headers; ++i) - * { - * rocprofiler_record_header_t* hdr = headers[i]; - * if(hdr->kind == ROCPROFILER_RECORD_KIND_PC_SAMPLE) - * { - * auto* data = static_cast(&hdr->payload); - * ... - * } - * } - * @endcode - */ -typedef void (*rocprofiler_buffer_callback_t)(rocprofiler_context_id_t context, - rocprofiler_buffer_id_t buffer_id, - rocprofiler_record_header_t** headers, - size_t num_headers, - void* data, - uint64_t drop_count); - -/** - * @brief Actions when Buffer is full. - * - */ -typedef enum -{ - ROCPROFILER_BUFFER_POLICY_NONE = 0, - /** - * Drop records when buffer is full. - */ - ROCPROFILER_BUFFER_POLICY_DISCARD = 1, - /** - * Block when buffer is full. - */ - ROCPROFILER_BUFFER_POLICY_LOSSLESS = 2, - ROCPROFILER_BUFFER_POLICY_LAST, -} rocprofiler_buffer_policy_t; - -/** - * @brief Create buffer. 
- * - * @param [in] context Context identifier associated with buffer - * @param [in] size Size of the buffer in bytes - * @param [in] watermark - watermark size, where the callback is called, if set - * to 0 then the callback will be called on every record - * @param [in] policy Behavior policy when buffer is full - * @param [in] callback Callback to invoke when buffer is flushed/full - * @param [in] callback_data Data to provide in callback function - * @param [out] buffer_id Identification handle for buffer - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_create_buffer(rocprofiler_context_id_t context, - size_t size, - size_t watermark, - rocprofiler_buffer_policy_t policy, - rocprofiler_buffer_callback_t callback, - void* callback_data, - rocprofiler_buffer_id_t* buffer_id) ROCPROFILER_NONNULL(5, 7); - -/** - * @brief Destroy buffer. - * - * @param [in] buffer_id - * @return ::rocprofiler_status_t - * - * Note: This will destroy the buffer even if it is not empty. The user can - * call @ref ::rocprofiler_flush_buffer before it to make sure the buffer is empty. - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id); - -/** - * @brief Flush buffer. 
- * - * @param [in] buffer_id - * @return ::rocprofiler_status_t - */ -rocprofiler_status_t ROCPROFILER_API -rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id); - -/** @} */ +rocprofiler_status_t +rocprofiler_get_timestamp(rocprofiler_timestamp_t* ts) ROCPROFILER_API ROCPROFILER_NONNULL(1); -#ifdef __cplusplus -} // extern "C" block -#endif // __cplusplus +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/rocprofiler_plugin.h b/projects/rocprofiler-sdk/source/include/rocprofiler/rocprofiler_plugin.h index 9fa8d34ce7..209d136a2c 100644 --- a/projects/rocprofiler-sdk/source/include/rocprofiler/rocprofiler_plugin.h +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/rocprofiler_plugin.h @@ -20,7 +20,7 @@ // OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE // SOFTWARE. -/** \section rocprofiler_plugin_api ROCProfiler Plugin API +/** @section rocprofiler_plugin_api ROCProfiler Plugin API * * The ROCProfiler Plugin API is used by the ROCProfiler Tool to output all * profiling information. Different implementations of the ROCProfiler Plugin @@ -37,7 +37,7 @@ */ /** - * \file + * @file * ROCProfiler Tool Plugin API interface. */ @@ -47,44 +47,42 @@ #include -#ifdef __cplusplus -extern "C" { -#endif /* __cplusplus */ +ROCPROFILER_EXTERN_C_INIT /* __cplusplus */ -/** \defgroup rocprofiler_plugins ROCProfiler Plugin API Specification - * @{ - */ + /** @defgroup rocprofiler_plugins ROCProfiler Plugin API Specification + * @{ + */ -/** \defgroup initialization_group Initialization and Finalization - * \ingroup rocprofiler_plugins - * - * The ROCProfiler Plugin API must be initialized before using any of the - * operations to report trace data, and finalized after the last trace data has - * been reported. 
- * - * @{ - */ + /** @defgroup initialization_group Initialization and Finalization + * @ingroup rocprofiler_plugins + * + * The ROCProfiler Plugin API must be initialized before using any of the + * operations to report trace data, and finalized after the last trace data has + * been reported. + * + * @{ + */ -/** - * Initialize plugin. - * Must be called before any other operation. - * - * @param[in] rocprofiler_major_version The major version of the ROCProfiler API - * being used by the ROCProfiler Tool. An error is reported if this does not - * match the major version of the ROCProfiler API used to build the plugin - * library. This ensures compatibility of the trace data format. - * @param[in] rocprofiler_minor_version The minor version of the ROCProfiler API - * being used by the ROCProfiler Tool. An error is reported if the - * \p rocprofiler_major_version matches and this is greater than the minor - * version of the ROCProfiler API used to build the plugin library. This ensures - * compatibility of the trace data format. - * @param[in] data Pointer to the data passed to the ROCProfiler Plugin by the tool - * @return Returns 0 on success and -1 on error. - */ -ROCPROFILER_EXPORT int -rocprofiler_plugin_initialize(uint32_t rocprofiler_major_version, - uint32_t rocprofiler_minor_version, - void* data); + /** + * Initialize plugin. + * Must be called before any other operation. + * + * @param[in] rocprofiler_major_version The major version of the ROCProfiler API + * being used by the ROCProfiler Tool. An error is reported if this does not + * match the major version of the ROCProfiler API used to build the plugin + * library. This ensures compatibility of the trace data format. + * @param[in] rocprofiler_minor_version The minor version of the ROCProfiler API + * being used by the ROCProfiler Tool. 
An error is reported if the + * @p rocprofiler_major_version matches and this is greater than the minor + * version of the ROCProfiler API used to build the plugin library. This ensures + * compatibility of the trace data format. + * @param[in] data Pointer to the data passed to the ROCProfiler Plugin by the tool + * @return Returns 0 on success and -1 on error. + */ + ROCPROFILER_EXPORT int + rocprofiler_plugin_initialize(uint32_t rocprofiler_major_version, + uint32_t rocprofiler_minor_version, + void* data); /** * Finalize plugin. @@ -97,8 +95,8 @@ rocprofiler_plugin_finalize(); /** @} */ -/** \defgroup profiling_record_write_functions Profiling data reporting - * \ingroup rocprofiler_plugins +/** @defgroup profiling_record_write_functions Profiling data reporting + * @ingroup rocprofiler_plugins * Operations to output profiling data. * @{ */ @@ -128,12 +126,10 @@ rocprofiler_plugin_write_buffer_records(rocprofiler_context_id_t context_id */ ROCPROFILER_EXPORT int -rocprofiler_plugin_write_record(rocprofiler_record_tracer_t record); +rocprofiler_plugin_write_record(rocprofiler_record_header_t record); /** @} */ /** @} */ -#ifdef __cplusplus -} /* extern "C" */ -#endif /* __cplusplus */ +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/include/rocprofiler/spm.h b/projects/rocprofiler-sdk/source/include/rocprofiler/spm.h new file mode 100644 index 0000000000..c23bcf98d7 --- /dev/null +++ b/projects/rocprofiler-sdk/source/include/rocprofiler/spm.h @@ -0,0 +1,51 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, 
subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include + +ROCPROFILER_EXTERN_C_INIT + +/** @defgroup SPM_SERVICE SPM Service + * @{ + */ + +/** + * @brief Configure SPM Service. + * + * @param [in] context_id + * @param [in] buffer_id + * @param [in] profile_config + * @param [in] interval + * @return ::rocprofiler_status_t + */ +rocprofiler_status_t ROCPROFILER_API +rocprofiler_configure_spm_service(rocprofiler_context_id_t context_id, + rocprofiler_buffer_id_t buffer_id, + rocprofiler_profile_config_id_t profile_config, + uint64_t interval); + +/** @} */ + +ROCPROFILER_EXTERN_C_FINI diff --git a/projects/rocprofiler-sdk/source/lib/common/CMakeLists.txt b/projects/rocprofiler-sdk/source/lib/common/CMakeLists.txt index 84139c3190..c67c3f976c 100644 --- a/projects/rocprofiler-sdk/source/lib/common/CMakeLists.txt +++ b/projects/rocprofiler-sdk/source/lib/common/CMakeLists.txt @@ -26,6 +26,7 @@ target_link_libraries( $ $ $ - $) + $ + $) set_target_properties(rocprofiler-common-library PROPERTIES OUTPUT_NAME rocprofiler-common) diff --git a/projects/rocprofiler-sdk/source/lib/common/container/record_header_buffer.cpp b/projects/rocprofiler-sdk/source/lib/common/container/record_header_buffer.cpp index 22ec8a1b8e..cf35a38a19 100644 --- a/projects/rocprofiler-sdk/source/lib/common/container/record_header_buffer.cpp +++ 
b/projects/rocprofiler-sdk/source/lib/common/container/record_header_buffer.cpp @@ -59,7 +59,7 @@ record_header_buffer::operator=(record_header_buffer&& _rhs) noexcept if(this != &_rhs) { auto _lk = rhb_raii_lock{_rhs}; - m_index = _rhs.m_index.load(std::memory_order_relaxed); + m_index = _rhs.m_index.load(std::memory_order_acquire); m_buffer = std::move(_rhs.m_buffer); m_headers = std::move(_rhs.m_headers); _rhs.reset(); @@ -74,7 +74,8 @@ record_header_buffer::allocate(size_t num_bytes) auto _lk = rhb_raii_lock{*this}; m_buffer.init(num_bytes); - m_headers.resize(m_buffer.capacity(), rocprofiler_record_header_t{0, nullptr}); + m_headers.resize(m_buffer.capacity(), + rocprofiler_record_header_t{.hash = 0, .payload = nullptr}); return true; } @@ -83,13 +84,13 @@ record_header_buffer::get_record_headers(size_t _n) { auto _lk = rhb_raii_lock{*this}; - auto _sz = m_index.load(std::memory_order_relaxed); + auto _sz = m_index.load(std::memory_order_acquire); if(_n > _sz) _n = _sz; auto _ret = record_ptr_vec_t{}; _ret.reserve(_n); for(size_t i = 0; i < _n; ++i) { - if(auto& itr = m_headers.at(i); itr.kind > 0 && itr.payload != nullptr) + if(auto& itr = m_headers.at(i); itr.hash > 0 && itr.payload != nullptr) _ret.emplace_back(&itr); } return _ret; @@ -105,9 +106,9 @@ record_header_buffer::clear() auto _sz = m_buffer.capacity(); if(!m_buffer.clear(std::nothrow_t{})) return 0; std::for_each(m_headers.begin(), m_headers.end(), [](auto& itr) { - itr = rocprofiler_record_header_t{0, nullptr}; + itr = rocprofiler_record_header_t{.hash = 0, .payload = nullptr}; }); - m_headers.resize(_sz, rocprofiler_record_header_t{0, nullptr}); + m_headers.resize(_sz, rocprofiler_record_header_t{.hash = 0, .payload = nullptr}); m_index.store(0, std::memory_order_release); } diff --git a/projects/rocprofiler-sdk/source/lib/common/container/record_header_buffer.hpp b/projects/rocprofiler-sdk/source/lib/common/container/record_header_buffer.hpp index b4906e3ce0..f5b8c7ec38 100644 --- 
a/projects/rocprofiler-sdk/source/lib/common/container/record_header_buffer.hpp +++ b/projects/rocprofiler-sdk/source/lib/common/container/record_header_buffer.hpp @@ -29,6 +29,7 @@ #include #include #include +#include #include namespace rocprofiler @@ -70,17 +71,29 @@ struct record_header_buffer template bool emplace(uint64_t, Tp&); + /// place an object in the buffer using the specified numerical identifier + template + bool emplace(uint32_t, uint32_t, Tp&); + /// this function will return a vector of pointers to the record headers /// at the time of invocation. record_ptr_vec_t get_record_headers(size_t _n = std::numeric_limits::max()); - /// prevent emplace + /// record_header_buffer is a multiple writer, single reader data structure so + /// this function prevents writing via emplace void lock(); - /// try to re-enable emplace + /// potentially re-enable emplace if no other readers have locked void unlock(); - /// check if emplace is available + /// record_header_buffer is a multiple writer, single reader data structure so + /// this function prevents reading while emplacing + void read_lock(); + + /// potentially allow reading after writing via emplace + void read_unlock(); + + /// check if writing is available bool is_locked() const; /// restores to original empty state @@ -116,6 +129,7 @@ struct record_header_buffer private: std::atomic m_locked = {0}; std::atomic m_index = {}; + std::shared_mutex m_shared = {}; base_buffer_t m_buffer = {}; record_vec_t m_headers = {}; }; @@ -129,13 +143,27 @@ record_header_buffer::is_locked() const inline void record_header_buffer::lock() { - m_locked.fetch_add(1, std::memory_order_release); + auto n = m_locked.fetch_add(1, std::memory_order_release); + if(n == 0) m_shared.lock(); } inline void record_header_buffer::unlock() { - m_locked.fetch_add(-1, std::memory_order_release); + auto n = m_locked.fetch_add(-1, std::memory_order_release); + if(n <= 1) m_shared.unlock(); +} + +inline void +record_header_buffer::read_lock() 
+{ + m_shared.lock_shared(); +} + +inline void +record_header_buffer::read_unlock() +{ + m_shared.unlock_shared(); } inline bool @@ -182,7 +210,7 @@ record_header_buffer::is_full() const template bool -record_header_buffer::emplace(uint64_t _kind, Tp& _v) +record_header_buffer::emplace(uint64_t _hash, Tp& _v) { if(is_locked() || m_headers.empty()) return false; @@ -195,6 +223,7 @@ record_header_buffer::emplace(uint64_t _kind, Tp& _v) return _ptr; }; + read_lock(); auto _addr = _create_record(m_buffer, _v); if(_addr) { @@ -202,9 +231,41 @@ record_header_buffer::emplace(uint64_t _kind, Tp& _v) // for where the header record should be placed. // NOTE: m_headers was resized to be large enough to accomodate // sizeof(Tp) == 1 for every entry in buffer - auto _idx = m_index++; - m_headers.at(_idx) = rocprofiler_record_header_t{_kind, _addr}; + auto idx = m_index.fetch_add(1, std::memory_order_release); + m_headers.at(idx) = rocprofiler_record_header_t{.hash = _hash, .payload = _addr}; } + read_unlock(); + return (_addr != nullptr); +} + +template +bool +record_header_buffer::emplace(uint32_t _category, uint32_t _kind, Tp& _v) +{ + if(is_locked() || m_headers.empty()) return false; + + // request N bytes in the buffer (where N=sizeof(Tp)) and if + // available, copy _v into the buffer region + auto _create_record = [](auto& _buf, auto& _data) { + constexpr auto buffer_sz = sizeof(Tp); + void* _ptr = _buf.request(buffer_sz, false); + if(_ptr) new(_ptr) Tp{_data}; + return _ptr; + }; + + read_lock(); + auto _addr = _create_record(m_buffer, _v); + if(_addr) + { + // if there is space in the buffer, atomically get an index + // for where the header record should be placed. 
+ // NOTE: m_headers was resized to be large enough to accomodate + // sizeof(Tp) == 1 for every entry in buffer + auto idx = m_index.fetch_add(1, std::memory_order_release); + m_headers.at(idx) = + rocprofiler_record_header_t{.category = _category, .kind = _kind, .payload = _addr}; + } + read_unlock(); return (_addr != nullptr); } diff --git a/projects/rocprofiler-sdk/source/lib/common/container/stable_vector.hpp b/projects/rocprofiler-sdk/source/lib/common/container/stable_vector.hpp index fcf972aa80..bfd50ed3a7 100644 --- a/projects/rocprofiler-sdk/source/lib/common/container/stable_vector.hpp +++ b/projects/rocprofiler-sdk/source/lib/common/container/stable_vector.hpp @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -40,6 +41,15 @@ namespace common { namespace container { +struct reserve_size +{ + explicit reserve_size(size_t _v) + : value{_v} + {} + + size_t value; +}; + template class stable_vector { @@ -155,6 +165,7 @@ public: stable_vector() = default; explicit stable_vector(size_type count, const Tp& value); explicit stable_vector(size_type count); + explicit stable_vector(reserve_size&& reserve_count); template ::stable_vector(size_type count) } } +template +stable_vector::stable_vector(reserve_size&& reserve_count) +{ + reserve(reserve_count.value); +} + template template stable_vector::stable_vector(InputItrT first, InputItrT last) diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/CMakeLists.txt b/projects/rocprofiler-sdk/source/lib/rocprofiler/CMakeLists.txt index b478d2c1db..888b535cc4 100644 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/CMakeLists.txt +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/CMakeLists.txt @@ -1,8 +1,10 @@ # # # -set(ROCPROFILER_LIB_HEADERS config_helpers.hpp config_internal.hpp tracer.hpp) -set(ROCPROFILER_LIB_SOURCES config_internal.cpp rocprofiler_config.cpp rocprofiler.cpp) +set(ROCPROFILER_LIB_HEADERS buffer.hpp internal_threading.hpp registration.hpp) 
+set(ROCPROFILER_LIB_SOURCES + buffer.cpp buffer_tracing.cpp callback_tracing.cpp context.cpp internal_threading.cpp + rocprofiler.cpp registration.cpp) add_library(rocprofiler-library SHARED) add_library(rocprofiler::rocprofiler-library ALIAS rocprofiler-library) @@ -11,6 +13,7 @@ target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_SOURCES} ${ROCPROFILER_LIB_HEADERS}) add_subdirectory(hsa) +add_subdirectory(context) target_link_libraries( rocprofiler-library diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer.cpp new file mode 100644 index 0000000000..bc3a0249ff --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer.cpp @@ -0,0 +1,203 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +#include "lib/rocprofiler/buffer.hpp" + +#include + +#include "lib/common/container/stable_vector.hpp" +#include "lib/common/utility.hpp" +#include "lib/rocprofiler/context/context.hpp" +#include "lib/rocprofiler/context/domain.hpp" +#include "lib/rocprofiler/hsa/hsa.hpp" +#include "lib/rocprofiler/internal_threading.hpp" +#include "lib/rocprofiler/registration.hpp" + +#include +#include +#include + +namespace rocprofiler +{ +namespace buffer +{ +namespace +{ +using reserve_size_t = common::container::reserve_size; + +auto& +get_buffers_mutex() +{ + static auto _v = std::mutex{}; + return _v; +} +} // namespace + +unique_buffer_vec_t& +get_buffers() +{ + static auto _v = unique_buffer_vec_t{reserve_size_t{unique_buffer_vec_t::chunk_size}}; + return _v; +} + +std::optional +allocate_buffer() +{ + // ... allocate any internal space needed to handle another context ... + auto _lk = std::unique_lock{get_buffers_mutex()}; + + // initial context identifier number + auto _idx = get_buffers().size(); + + // make space in registered + get_buffers().emplace_back(nullptr); + + // create an entry in the registered + auto& _cfg_v = get_buffers().back(); + _cfg_v = std::make_unique(); + auto* _cfg = _cfg_v.get(); + + if(!_cfg) return std::nullopt; + + return rocprofiler_buffer_id_t{_idx}; +} + +rocprofiler_status_t +flush(rocprofiler_buffer_id_t buffer_id, bool wait) +{ + if(buffer_id.handle >= get_buffers().size()) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND; + + auto& buff = get_buffers().at(buffer_id.handle); + + auto* task_group = rocprofiler::internal_threading::get_task_group( + rocprofiler_callback_thread_t{buff->task_group_id}); + + if(task_group) task_group->wait(); + + // buffer is currently being flushed or destroyed + if(buff->syncer.test_and_set()) return ROCPROFILER_STATUS_ERROR_BUFFER_BUSY; + + auto buff_idx = buff->buffer_idx++; + + auto _task = [buff_idx, buffer_id]() { + auto& _buff = get_buffers().at(buffer_id.handle); + auto& buff_v = 
_buff->buffers.at(buff_idx % _buff->buffers.size()); + + if(!buff_v.is_empty()) + { + // get the array of record headers + auto buff_data = buff_v.get_record_headers(); + + // invoke buffer callback + try + { + _buff->callback(rocprofiler_context_id_t{_buff->context_id}, + rocprofiler_buffer_id_t{_buff->buffer_id}, + buff_data.data(), + buff_data.size(), + _buff->callback_data, + _buff->drop_count); + } catch(std::exception& e) + { + LOG(ERROR) << "buffer callback threw an exception: " << e.what(); + } + // clear the buffer + buff_v.clear(); + } + + _buff->syncer.clear(); + }; + + if(task_group) + { + task_group->exec(_task); + if(wait) task_group->wait(); + } + else + { + _task(); + } + + return ROCPROFILER_STATUS_SUCCESS; +} +} // namespace buffer +} // namespace rocprofiler + +extern "C" { +rocprofiler_status_t +rocprofiler_create_buffer(rocprofiler_context_id_t context, + size_t size, + size_t watermark, + rocprofiler_buffer_policy_t action, + rocprofiler_buffer_tracing_cb_t callback, + void* callback_data, + rocprofiler_buffer_id_t* buffer_id) +{ + if(rocprofiler::registration::get_init_status() > 0) + return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED; + + auto opt_buff_id = rocprofiler::buffer::allocate_buffer(); + if(!opt_buff_id) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND; + *buffer_id = *opt_buff_id; + + auto& buff = rocprofiler::buffer::get_buffers().at(opt_buff_id->handle); + + // allocate the buffers. 
if it is lossless, we allocate a second buffer to store data while + // other buffer is being flushed + buff->buffers.front().allocate(size); + if(action == ROCPROFILER_BUFFER_POLICY_LOSSLESS) buff->buffers.back().allocate(size); + + buff->watermark = watermark; + buff->policy = action; + buff->callback = callback; + buff->callback_data = callback_data; + buff->context_id = context.handle; + buff->buffer_idx = buffer_id->handle; + + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id) +{ + return rocprofiler::buffer::flush(buffer_id, true); +} + +rocprofiler_status_t +rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id) +{ + if(buffer_id.handle >= rocprofiler::buffer::get_buffers().size()) + return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND; + + auto& buff = rocprofiler::buffer::get_buffers().at(buffer_id.handle); + + // buffer is currently being flushed or destroyed + if(buff->syncer.test_and_set()) return ROCPROFILER_STATUS_ERROR_BUFFER_BUSY; + + for(auto& itr : buff->buffers) + itr.reset(); + + buff->syncer.clear(); + + return ROCPROFILER_STATUS_SUCCESS; +} +} diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer.hpp new file mode 100644 index 0000000000..65b3a066f7 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer.hpp @@ -0,0 +1,122 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice 
and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include +#include + +#include "lib/common/container/record_header_buffer.hpp" +#include "lib/common/container/stable_vector.hpp" +#include "lib/common/demangle.hpp" + +#include +#include +#include +#include + +namespace rocprofiler +{ +namespace buffer +{ +struct instance +{ + using buffer_t = common::container::record_header_buffer; + + mutable std::array buffers = {}; + mutable std::atomic buffer_idx = {}; + mutable std::atomic_flag syncer = ATOMIC_FLAG_INIT; + mutable std::atomic drop_count = {}; + uint64_t watermark = 0; + uint64_t context_id = 0; + uint64_t buffer_id = 0; + uint64_t task_group_id = 0; + rocprofiler_buffer_tracing_cb_t callback = nullptr; + void* callback_data = nullptr; + rocprofiler_buffer_policy_t policy = ROCPROFILER_BUFFER_POLICY_NONE; + + template + void emplace(uint32_t, uint32_t, Tp&); +}; + +using unique_buffer_vec_t = common::container::stable_vector, 4>; + +std::optional +allocate_buffer(); + +unique_buffer_vec_t& +get_buffers(); + +rocprofiler_status_t +flush(rocprofiler_buffer_id_t buffer_id, bool wait); + +inline rocprofiler_status_t +flush(uint64_t buffer_idx, bool wait) +{ + return flush(rocprofiler_buffer_id_t{buffer_idx}, wait); +} +} // namespace buffer +} // namespace rocprofiler + +template +inline void +rocprofiler::buffer::instance::emplace(uint32_t category, uint32_t kind, Tp& value) +{ + // get the index 
of the current buffer + auto get_idx = [this]() { return buffer_idx.load(std::memory_order_acquire) % buffers.size(); }; + + auto idx = get_idx(); + if(!buffers.at(idx).emplace(category, kind, value)) + { + if(buffers.at(idx).size() < sizeof(value)) + { + auto msg = std::stringstream{}; + msg << "buffer " << buffer_id << " too small (size=" << buffers.at(idx).size() + << ") to hold an object of type " << common::cxx_demangle(typeid(value).name()) + << " with size " << sizeof(value); + throw std::runtime_error(msg.str()); + } + + if(policy == ROCPROFILER_BUFFER_POLICY_LOSSLESS) + { + // blocks until buffer is flushed + bool success = false; + while(!success) + { + buffer::flush(buffer_id, true); + idx = get_idx(); + success = buffers.at(idx).emplace(category, kind, value); + } + } + else + { + ++drop_count; + } + } + + if(buffers.at(idx).count() >= watermark) + { + // flush without syncing + buffer::flush(buffer_id, false); + } +} diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer_tracing.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer_tracing.cpp new file mode 100644 index 0000000000..b0ca72fb4e --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/buffer_tracing.cpp @@ -0,0 +1,151 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. 
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#include +#include + +#include "lib/rocprofiler/context/context.hpp" +#include "lib/rocprofiler/context/domain.hpp" +#include "lib/rocprofiler/hsa/hsa.hpp" +#include "lib/rocprofiler/registration.hpp" + +#include + +#include +#include +#include + +#define RETURN_STATUS_ON_FAIL(...) \ + if(rocprofiler_status_t _status; (_status = __VA_ARGS__) != ROCPROFILER_STATUS_SUCCESS) \ + { \ + return _status; \ + } + +extern "C" { +rocprofiler_status_t +rocprofiler_configure_buffer_tracing_service(rocprofiler_context_id_t context_id, + rocprofiler_service_buffer_tracing_kind_t kind, + rocprofiler_tracing_operation_t* operations, + size_t operations_count, + rocprofiler_buffer_id_t buffer_id) +{ + if(rocprofiler::registration::get_init_status() > 0) + return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED; + + if(context_id.handle >= rocprofiler::context::get_registered_contexts().size()) + { + return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; + } + + auto& ctx = rocprofiler::context::get_registered_contexts().at(context_id.handle); + + if(!ctx) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; + + constexpr auto invalid_buffer_id = + rocprofiler_buffer_id_t{std::numeric_limits::max()}; + + if(!ctx->buffered_tracer) + { + ctx->buffered_tracer = std::make_unique(); + ctx->buffered_tracer->buffer_data.fill(invalid_buffer_id); + } + + if(ctx->buffered_tracer->buffer_data.at(kind).handle != invalid_buffer_id.handle) + return ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED; + + 
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain(ctx->buffered_tracer->domains, kind)); + + ctx->buffered_tracer->buffer_data.at(kind) = buffer_id; + + for(size_t i = 0; i < operations_count; ++i) + { + RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain_op( + ctx->buffered_tracer->domains, kind, operations[i])); + } + + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_iterate_buffer_tracing_kind_names(rocprofiler_buffer_tracing_kind_name_cb_t callback, + void* data) +{ + // TODO(jrmadsen): need to add for other kinds + size_t n = 0; + bool premature = false; + using pair_t = std::pair; + for(auto [eitr, sitr] : { + pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, "HSA_API"}, + pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_HIP_API, "HIP_API"}, + pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_MARKER_API, "MARKER_API"}, + pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_MEMORY_COPY, "MEMORY_COPY"}, + pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_KERNEL_DISPATCH, "KERNEL_DISPATCH"}, + pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_PAGE_MIGRATION, "PAGE_MIGRATION"}, + pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_SCRATCH_MEMORY, "SCRATCH_MEMORY"}, + pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_EXTERNAL_CORRELATION, "EXTERNAL_CORRELATION"}, + }) + { + auto _success = callback(eitr, sitr, data); + if(_success != 0) + { + premature = true; + break; + } + ++n; + } + +#if defined(ROCPROFILER_CI) + if(!premature) + { + LOG_ASSERT(n == ROCPROFILER_SERVICE_BUFFER_TRACING_LAST - 1) + << " :: new enumeration value added. 
Update this function"; + } +#else + (void) n; + (void) premature; +#endif + + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_iterate_buffer_tracing_kind_operation_names( + rocprofiler_service_buffer_tracing_kind_t kind, + rocprofiler_buffer_tracing_operation_name_cb_t callback, + void* data) +{ + if(kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API) + { + auto ops = rocprofiler::hsa::get_ids(); + for(const auto& itr : ops) + { + auto _success = callback(kind, itr, rocprofiler::hsa::name_by_id(itr), data); + if(_success != 0) break; + } + return ROCPROFILER_STATUS_SUCCESS; + } + + return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; +} +} + +#undef RETURN_STATUS_ON_FAIL diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/callback_tracing.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/callback_tracing.cpp new file mode 100644 index 0000000000..f4411fcc71 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/callback_tracing.cpp @@ -0,0 +1,161 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#include + +#include "lib/rocprofiler/context/context.hpp" +#include "lib/rocprofiler/context/domain.hpp" +#include "lib/rocprofiler/hsa/hsa.hpp" +#include "lib/rocprofiler/registration.hpp" + +#include + +#include +#include + +#define RETURN_STATUS_ON_FAIL(...) \ + if(rocprofiler_status_t _status; (_status = __VA_ARGS__) != ROCPROFILER_STATUS_SUCCESS) \ + { \ + return _status; \ + } + +extern "C" { +rocprofiler_status_t +rocprofiler_configure_callback_tracing_service(rocprofiler_context_id_t context_id, + rocprofiler_service_callback_tracing_kind_t kind, + rocprofiler_tracing_operation_t* operations, + size_t operations_count, + rocprofiler_callback_tracing_cb_t callback, + void* callback_args) +{ + if(rocprofiler::registration::get_init_status() > 0) + return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED; + + if(context_id.handle >= rocprofiler::context::get_registered_contexts().size()) + { + return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; + } + + auto& ctx = rocprofiler::context::get_registered_contexts().at(context_id.handle); + + if(!ctx) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; + + if(!ctx->callback_tracer) + ctx->callback_tracer = std::make_unique(); + + if(ctx->callback_tracer->callback_data.at(kind).callback) + return ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED; + + RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain(ctx->callback_tracer->domains, kind)); + + ctx->callback_tracer->callback_data.at(kind) = {callback, callback_args}; + + for(size_t i = 0; i < operations_count; ++i) + { + RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain_op( + ctx->callback_tracer->domains, kind, operations[i])); + } + + return ROCPROFILER_STATUS_SUCCESS; +} + 
+rocprofiler_status_t +rocprofiler_iterate_callback_tracing_kind_names( + rocprofiler_callback_tracing_kind_name_cb_t callback, + void* data) +{ + // TODO(jrmadsen): need to add for other kinds + size_t n = 0; + bool premature = false; + using pair_t = std::pair; + for(auto [eitr, sitr] : { + pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API, "HSA_API"}, + pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_HIP_API, "HIP_API"}, + pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API, "MARKER_API"}, + pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_CODE_OBJECT, "CODE_OBJECT"}, + pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_KERNEL_DISPATCH, "KERNEL_DISPATCH"}, + }) + { + auto _success = callback(eitr, sitr, data); + if(_success != 0) + { + premature = true; + break; + } + ++n; + } + +#if defined(ROCPROFILER_CI) + if(!premature) + { + LOG_ASSERT(n == ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST - 1) + << " :: new enumeration value added. Update this function"; + } +#else + (void) n; + (void) premature; +#endif + + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_iterate_callback_tracing_kind_operation_names( + rocprofiler_service_callback_tracing_kind_t kind, + rocprofiler_callback_tracing_operation_name_cb_t callback, + void* data) +{ + if(kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API) + { + auto ops = rocprofiler::hsa::get_ids(); + for(const auto& itr : ops) + { + auto _success = callback(kind, itr, rocprofiler::hsa::name_by_id(itr), data); + if(_success != 0) break; + } + return ROCPROFILER_STATUS_SUCCESS; + } + + return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; +} + +rocprofiler_status_t +rocprofiler_iterate_callback_tracing_operation_args( + rocprofiler_callback_tracing_record_t record, + rocprofiler_callback_tracing_operation_args_cb_t callback, + void* user_data) +{ + if(record.kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API) + { + rocprofiler::hsa::iterate_args( + record.operation, + *static_cast(record.payload), + callback, + 
user_data); + return ROCPROFILER_STATUS_SUCCESS; + } + + return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; +} +} + +#undef RETURN_STATUS_ON_FAIL diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/config_internal.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/config_internal.cpp deleted file mode 100644 index 95178e2530..0000000000 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/config_internal.cpp +++ /dev/null @@ -1,28 +0,0 @@ - -#include "config_internal.hpp" - -namespace rocprofiler -{ -namespace internal -{ -uint64_t -correlation_config::get_unique_record_id() -{ - static auto _v = std::atomic{}; - return _v++; -} - -bool -domain_config::operator()(rocprofiler_tracer_activity_domain_t _domain) const -{ - return ((1 << _domain) & domains) == (1 << _domain); -} - -bool -domain_config::operator()(rocprofiler_tracer_activity_domain_t _domain, uint32_t _op) const -{ - auto _offset = (_domain * rocprofiler::internal::domain_ops_offset); - return (*this)(_domain) && (opcodes.none() || opcodes.test(_offset + _op)); -} -} // namespace internal -} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/config_internal.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/config_internal.hpp deleted file mode 100644 index a55ebbb88f..0000000000 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/config_internal.hpp +++ /dev/null @@ -1,74 +0,0 @@ - -#pragma once - -#include -#include - -#include -#include -#include -#include -#include - -namespace rocprofiler -{ -namespace internal -{ -// number of bits to reserve all op codes -constexpr size_t domain_ops_offset = ROCPROFILER_DOMAIN_OPS_MAX; -constexpr size_t reserved_domain_size = ROCPROFILER_DOMAIN_OPS_RESERVED * 8; -constexpr size_t max_configs_count = 8; - -struct correlation_config -{ - uint64_t id = 0; - uint64_t external_id = 0; - ::rocprofiler_external_cid_cb_t external_id_callback = nullptr; - - static uint64_t get_unique_record_id(); -}; - -struct 
domain_config -{ - ::rocprofiler_tracer_callback_t user_sync_callback = nullptr; - int64_t domains = 0; - std::bitset opcodes = {}; - - /// check if domain is enabled - bool operator()(::rocprofiler_tracer_activity_domain_t) const; - - /// check if op in a domain is enabled - bool operator()(::rocprofiler_tracer_activity_domain_t, uint32_t) const; -}; - -struct buffer_config -{ - ::rocprofiler_buffer_callback_t callback = nullptr; - uint64_t buffer_size; - // Memory::GenericBuffer* buffer = nullptr; - uint64_t buffer_idx = 0; -}; - -using filter_config = ::rocprofiler_filter_config; - -struct config -{ - // size is used to ensure that we never read past the end of the version - size_t size = 0; // = sizeof(rocprofiler_config) - uint32_t compat_version = 0; // set by user - uint32_t api_version = 0; // set by rocprofiler - uint64_t context_idx = 0; // context id index - void* user_data = nullptr; // user data passed to callbacks - correlation_config* correlation_id = nullptr; // &my_cid_config (optional) - buffer_config* buffer = nullptr; // = &my_buffer_config (required) - domain_config* domain = nullptr; // = &my_domain_config (required) - filter_config* filter = nullptr; // = &my_filter_config (optional) -}; - -std::array& -get_registered_configs(); - -std::array, max_configs_count>& -get_active_configs(); -} // namespace internal -} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/context.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/context.cpp new file mode 100644 index 0000000000..5fb2b37553 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/context.cpp @@ -0,0 +1,89 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, 
modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#include + +#include "lib/rocprofiler/context/context.hpp" +#include "lib/rocprofiler/context/domain.hpp" +#include "lib/rocprofiler/hsa/hsa.hpp" +#include "lib/rocprofiler/registration.hpp" + +#include +#include + +extern "C" { +rocprofiler_status_t +rocprofiler_create_context(rocprofiler_context_id_t* context_id) +{ + if(rocprofiler::registration::get_init_status() > 0) + return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED; + + auto cfg_id = rocprofiler::context::allocate_context(); + if(!cfg_id) return ROCPROFILER_STATUS_ERROR_CONTEXT_ERROR; + *context_id = *cfg_id; + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_start_context(rocprofiler_context_id_t context_id) +{ + return rocprofiler::context::start_context(context_id); +} + +rocprofiler_status_t +rocprofiler_stop_context(rocprofiler_context_id_t context_id) +{ + return rocprofiler::context::stop_context(context_id); +} + +rocprofiler_status_t +rocprofiler_context_is_active(rocprofiler_context_id_t context_id, int* status) +{ + *status = 0; + for(const auto& itr : rocprofiler::context::get_active_contexts()) + { + auto* cfg = itr.load(std::memory_order_relaxed); + 
if(cfg && cfg->context_idx == context_id.handle) + { + *status = 1; + return ROCPROFILER_STATUS_SUCCESS; + } + } + return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; +} + +rocprofiler_status_t +rocprofiler_context_is_valid(rocprofiler_context_id_t context_id, int* status) +{ + *status = 0; + for(const auto& itr : rocprofiler::context::get_registered_contexts()) + { + if(itr && itr->context_idx == context_id.handle) + { + auto _ret = rocprofiler::context::validate_context(itr.get()); + *status = (_ret == ROCPROFILER_STATUS_SUCCESS) ? 1 : 0; + return _ret; + } + } + return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; +} +} diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/context/CMakeLists.txt b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/CMakeLists.txt new file mode 100644 index 0000000000..b81bfc2795 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/CMakeLists.txt @@ -0,0 +1,14 @@ +# +# context +# +set(ROCPROFILER_LIB_CONFIG_SOURCES context.cpp domain.cpp) +set(ROCPROFILER_LIB_CONFIG_HEADERS context.hpp domain.hpp allocator.hpp) + +target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_CONFIG_SOURCES} + ${ROCPROFILER_LIB_CONFIG_HEADERS}) + +# add_executable(rocr-example hsa.cpp rocr.hpp) target_link_libraries(rocr-example PRIVATE +# rocprofiler-v2) target_include_directories( rocr-example PRIVATE ${PROJECT_SOURCE_DIR} +# ${PROJECT_BINARY_DIR} ${PROJECT_SOURCE_DIR}/src ${PROJECT_BINARY_DIR}/src) +# target_compile_definitions( rocr-example PRIVATE AMD_INTERNAL_BUILD PROF_API_IMPL +# HIP_PROF_HIP_API_STRING=1 __HIP_PLATFORM_AMD__=1) diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/config_helpers.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/allocator.hpp similarity index 59% rename from projects/rocprofiler-sdk/source/lib/rocprofiler/config_helpers.hpp rename to projects/rocprofiler-sdk/source/lib/rocprofiler/context/allocator.hpp index a055923b8d..5b6c8ff368 100644 --- 
a/projects/rocprofiler-sdk/source/lib/rocprofiler/config_helpers.hpp +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/allocator.hpp @@ -1,36 +1,38 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
#pragma once -#include "rocprofiler/rocprofiler.h" - #include #include #include #include -namespace +namespace rocprofiler { -inline size_t // NOLINTNEXTLINE -get_domain_max_op(rocprofiler_tracer_activity_domain_t _domain) +namespace context { - switch(_domain) - { - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE: return -1; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API: return 0; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API: return 0; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_MARKER_API: return 0; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_KFD_API: return -1; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_EXT_API: return -1; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_OPS: return 0; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_OPS: return 0; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_EVT: return 0; - case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST: return -1; - } - return -1; -} - template -struct allocator +struct locality_allocator { void construct(Tp* const _p, const Tp& _v) const { ::new((void*) _p) Tp{_v}; } void construct(Tp* const _p, Tp&& _v) const { ::new((void*) _p) Tp{std::move(_v)}; } @@ -103,5 +105,5 @@ struct allocator void reserve(const size_t) {} }; - -} // namespace +} // namespace context +} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/context/context.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/context.cpp new file mode 100644 index 0000000000..db675831eb --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/context.cpp @@ -0,0 +1,230 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit 
persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#include +#include + +#include "lib/common/container/stable_vector.hpp" +#include "lib/rocprofiler/context/context.hpp" + +#include + +#include +#include +#include +#include +#include +#include + +namespace rocprofiler +{ +namespace context +{ +namespace +{ +auto& +get_contexts_mutex() +{ + static auto _v = std::mutex{}; + return _v; +} + +constexpr auto invalid_client_idx = std::numeric_limits::max(); + +auto& +get_client_index() +{ + static auto _v = invalid_client_idx; + return _v; +} +} // namespace + +uint64_t +correlation_tracing_service::get_unique_record_id() +{ + static auto _v = std::atomic{}; + return _v++; +} + +using reserve_size_t = common::container::reserve_size; + +unique_context_vec_t& +get_registered_contexts() +{ + static auto _v = unique_context_vec_t{reserve_size_t{unique_context_vec_t::chunk_size}}; + return _v; +} + +active_context_vec_t& +get_active_contexts() +{ + static auto* _v = new active_context_vec_t{reserve_size_t{active_context_vec_t::chunk_size}}; + static auto _once = std::once_flag{}; + std::call_once(_once, std::atexit, []() { + for(auto& itr : *_v) + { + itr.store(nullptr); + } + }); + return *_v; +} + +// set the client index needs to be called before allocate_context() +void +push_client(uint32_t value) +{ + 
LOG_ASSERT(get_client_index() == invalid_client_idx) + << " rocprofiler client index is currently " << get_client_index() + << "... which means that a new client is initializing before the last client finished " + "initializing. This is an internal error, please file a bug report with a reproducer"; + get_client_index() = value; +} + +// remove the client index +void +pop_client(uint32_t value) +{ + LOG_ASSERT(get_client_index() == value) + << " rocprofiler client index is currently not " << value + << "... which means that a new client was initialized before this client finished " + "initializing. This is an internal error, please file a bug report with a reproducer"; + get_client_index() = invalid_client_idx; +} + +std::optional +allocate_context() +{ + // ... allocate any internal space needed to handle another context ... + auto _lk = std::unique_lock{get_contexts_mutex()}; + + // initial context identifier number + auto _idx = get_registered_contexts().size(); + + // make space in registered + get_registered_contexts().emplace_back(nullptr); + + // create an entry in the registered + auto& _cfg_v = get_registered_contexts().back(); + _cfg_v = std::make_unique(); + auto* _cfg = _cfg_v.get(); + // ... + + if(!_cfg) return std::nullopt; + + _cfg->size = sizeof(context); + _cfg->context_idx = _idx; + _cfg->client_idx = get_client_index(); + + LOG_ASSERT(_cfg->client_idx != invalid_client_idx) + << " rocprofiler internal error: a context was allocated without an associated tool client " + "identifier"; + + return rocprofiler_context_id_t{_idx}; +} + +rocprofiler_status_t +validate_context(const context* cfg) +{ + // if(cfg->buffer == nullptr) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND; + + // if(cfg->filter == nullptr) return ROCPROFILER_STATUS_ERROR_FILTER_NOT_FOUND; + + return (cfg) ? 
ROCPROFILER_STATUS_SUCCESS : ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; +} + +rocprofiler_status_t +start_context(rocprofiler_context_id_t context_id) +{ + if(context_id.handle >= get_registered_contexts().size()) + { + return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; + } + + context* cfg = get_registered_contexts().at(context_id.handle).get(); + + if(!cfg) + { + return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; + } + + if(validate_context(cfg) != ROCPROFILER_STATUS_SUCCESS) + { + return ROCPROFILER_STATUS_ERROR_CONTEXT_INVALID; + } + + uint64_t rocp_tot_contexts = get_registered_contexts().size(); + auto idx = rocp_tot_contexts; + { + // hold a lock here so prevent multiple threads from finding the same nullptr slot + auto _lk = std::unique_lock{get_contexts_mutex()}; + // try to find a nullptr slot first + for(size_t i = 0; i < get_active_contexts().size(); ++i) + { + auto* itr = get_active_contexts().at(i).load(std::memory_order_relaxed); + if(itr == nullptr) + { + idx = i; + break; + } + else if(context_id.handle == itr->context_idx) + { + return ROCPROFILER_STATUS_SUCCESS; + } + } + // if no nullptr slot was found, then create one while lock is held + if(idx == rocp_tot_contexts) + { + idx = get_active_contexts().size(); + get_active_contexts().emplace_back(); + } + } + + // atomic swap the pointer into the "active" array used internally + context* _expected = nullptr; + bool success = get_active_contexts().at(idx).compare_exchange_strong( + _expected, get_registered_contexts().at(context_id.handle).get()); + + if(!success) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_STARTED; + + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +stop_context(rocprofiler_context_id_t idx) +{ + // atomically assign the context pointer to NULL so that it is skipped in future + // callbacks + for(auto& itr : get_active_contexts()) + { + auto* _expected = itr.load(std::memory_order_relaxed); + if(_expected && _expected->context_idx == idx.handle) + { + bool success 
= itr.compare_exchange_strong(_expected, nullptr); + + if(success) return ROCPROFILER_STATUS_SUCCESS; + } + } + + return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; // compare exchange failed +} +} // namespace context +} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/context/context.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/context.hpp new file mode 100644 index 0000000000..4d84be86ea --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/context.hpp @@ -0,0 +1,130 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +#pragma once + +#include +#include + +#include "lib/common/container/stable_vector.hpp" +#include "lib/rocprofiler/context/domain.hpp" + +#include +#include +#include +#include +#include + +namespace rocprofiler +{ +namespace context +{ +using external_cid_cb_t = uint64_t (*)(rocprofiler_service_callback_tracing_kind_t, + uint32_t, + uint64_t); + +/// gives tools the opportunity to modify the correlation id based on the domain, op, and +/// the rocprofiler generated correlation id +struct correlation_tracing_service +{ + uint64_t id = 0; + uint64_t external_id = 0; + external_cid_cb_t external_id_callback = nullptr; + + static uint64_t get_unique_record_id(); +}; + +struct callback_tracing_service +{ + struct callback_data + { + rocprofiler_callback_tracing_cb_t callback = nullptr; + void* data = nullptr; + }; + + using domain_t = rocprofiler_service_callback_tracing_kind_t; + using callback_array_t = std::array::last>; + + domain_context domains = {}; + callback_array_t callback_data = {}; +}; + +struct buffer_tracing_service +{ + using domain_t = rocprofiler_service_buffer_tracing_kind_t; + using buffer_array_t = std::array::last>; + + domain_context domains = {}; + buffer_array_t buffer_data = {}; +}; + +struct context +{ + // size is used to ensure that we never read past the end of the version + size_t size = 0; + uint64_t context_idx = 0; // context id + uint32_t client_idx = 0; // tool id + correlation_tracing_service correlation_tracer = {}; + std::unique_ptr callback_tracer = {}; + std::unique_ptr buffered_tracer = {}; +}; + +// set the client index; this needs to be called before allocate_context() +void push_client(uint32_t); + +// remove the client index +void pop_client(uint32_t); + +/// @brief creates a context struct and returns a handle for locating the context struct +/// +std::optional +allocate_context(); + +/// \brief rocprofiler validates context, checks for conflicts, etc. Ensures that +/// the context is valid *in isolation*, e.g.
it may check that the user +/// set the compat_version field and that required context fields, such as buffer, +/// are set. This function will be called before \ref start_context +/// but is provided to help the user validate one or more contexts without starting +/// them +/// +/// \param [in] cfg context to validate +rocprofiler_status_t +validate_context(const context* cfg); + +/// \brief rocprofiler activates the context and provides a context identifier +/// \param [in] id the context identifier to start. +rocprofiler_status_t +start_context(rocprofiler_context_id_t id); + +/// \brief disable the context. +rocprofiler_status_t stop_context(rocprofiler_context_id_t); + +using unique_context_vec_t = common::container::stable_vector, 8>; +using active_context_vec_t = common::container::stable_vector, 8>; + +unique_context_vec_t& +get_registered_contexts(); + +active_context_vec_t& +get_active_contexts(); +} // namespace context +} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/context/domain.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/domain.cpp new file mode 100644 index 0000000000..56ec2bbd1b --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/domain.cpp @@ -0,0 +1,99 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software.
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#include "lib/rocprofiler/context/domain.hpp" +#include + +namespace rocprofiler +{ +namespace context +{ +template +bool +domain_context::operator()(DomainT _domain) const +{ + return ((1 << _domain) & domains) == (1 << _domain); +} + +template +bool +domain_context::operator()(DomainT _domain, uint32_t _op) const +{ + auto _offset = (_domain * opcode_padding_v); + return (*this)(_domain) && (opcodes.none() || opcodes.test(_offset + _op)); +} + +template +rocprofiler_status_t +add_domain(domain_context& _cfg, DomainT _domain) +{ + if(_domain <= domain_info::none || _domain >= domain_info::last) + return ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND; + + _cfg.domains |= (1 << _domain); + return ROCPROFILER_STATUS_SUCCESS; +} + +template +rocprofiler_status_t +add_domain_op(domain_context& _cfg, DomainT _domain, uint32_t _op) +{ + if(_domain <= domain_info::none || _domain >= domain_info::last) + return ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND; + + if(_op >= domain_info::padding) return ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND; + + auto _offset = (_domain * domain_info::padding); + if(_offset >= _cfg.opcodes.size()) return ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND; + + _cfg.opcodes.set(_offset + _op, true); + return ROCPROFILER_STATUS_SUCCESS; +} + +// instantiate the templates +template struct domain_context; + +template rocprofiler_status_t +add_domain( + domain_context&, + rocprofiler_service_callback_tracing_kind_t); + +template rocprofiler_status_t +add_domain( + 
domain_context&, + rocprofiler_service_buffer_tracing_kind_t); + +template rocprofiler_status_t +add_domain_op( + domain_context&, + rocprofiler_service_callback_tracing_kind_t, + uint32_t); + +template struct domain_context; + +template rocprofiler_status_t +add_domain_op( + domain_context&, + rocprofiler_service_buffer_tracing_kind_t, + uint32_t); +} // namespace context +} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/context/domain.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/domain.hpp new file mode 100644 index 0000000000..9c4c451a75 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/context/domain.hpp @@ -0,0 +1,89 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +#pragma once + +#include + +#include "lib/common/mpl.hpp" + +#include +#include +#include + +namespace rocprofiler +{ +namespace context +{ +// number of bits to reserve all op codes +constexpr size_t domain_ops_padding = 512; + +template +struct domain_info; + +template <> +struct domain_info +{ + static constexpr size_t none = ROCPROFILER_SERVICE_CALLBACK_TRACING_NONE; + static constexpr size_t last = ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST; + static constexpr auto padding = domain_ops_padding; +}; + +template <> +struct domain_info +{ + static constexpr size_t none = ROCPROFILER_SERVICE_BUFFER_TRACING_NONE; + static constexpr size_t last = ROCPROFILER_SERVICE_BUFFER_TRACING_LAST; + static constexpr auto padding = domain_ops_padding; +}; + +/// how the tools specify the tracing domain and (optionally) which operations in the +/// domain they want to trace +template +struct domain_context +{ + using supported_domains_v = common::mpl::type_list; + static_assert(common::mpl::is_one_of::value, + "Unsupported domain type"); + static constexpr auto opcode_padding_v = domain_info::padding; + static constexpr auto max_opcodes_v = opcode_padding_v * domain_info::last; + + /// check if domain is enabled + bool operator()(DomainT) const; + + /// check if op in a domain is enabled + bool operator()(DomainT, uint32_t) const; + + int64_t domains = 0; + std::bitset opcodes = {}; +}; + +template +rocprofiler_status_t +add_domain(domain_context&, DomainT); + +template +rocprofiler_status_t +add_domain_op(domain_context&, DomainT, uint32_t); +} // namespace context +} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/CMakeLists.txt b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/CMakeLists.txt index b4504714eb..29d29b5eb5 100644 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/CMakeLists.txt +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/CMakeLists.txt @@ -1,10 +1,10 @@ +# +# +# set(ROCPROFILER_LIB_HSA_SOURCES 
hsa.cpp) -set(ROCPROFILER_LIB_HSA_HEADERS hsa.hpp defines.hpp ostream.hpp types.hpp utils.hpp) +set(ROCPROFILER_LIB_HSA_HEADERS hsa.hpp defines.hpp types.hpp utils.hpp) + target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_HSA_SOURCES} ${ROCPROFILER_LIB_HSA_HEADERS}) -# add_executable(rocr-example hsa.cpp rocr.hpp) target_link_libraries(rocr-example PRIVATE -# rocprofiler-v2) target_include_directories( rocr-example PRIVATE ${PROJECT_SOURCE_DIR} -# ${PROJECT_BINARY_DIR} ${PROJECT_SOURCE_DIR}/src ${PROJECT_BINARY_DIR}/src) -# target_compile_definitions( rocr-example PRIVATE AMD_INTERNAL_BUILD PROF_API_IMPL -# HIP_PROF_HIP_API_STRING=1 __HIP_PLATFORM_AMD__=1) +add_subdirectory(details) diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/defines.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/defines.hpp index b03a4f5d15..b6b7117aa6 100644 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/defines.hpp +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/defines.hpp @@ -32,30 +32,27 @@ #define IMPL_DETAIL_FOR_EACH(MACRO, PREFIX, ...) \ IMPL_DETAIL_FOR_EACH_(IMPL_DETAIL_FOR_EACH_NARG(__VA_ARGS__), MACRO, PREFIX, __VA_ARGS__) -#define MEMBER_0(...) 
-#define MEMBER_1(PREFIX, FIELD) PREFIX.FIELD -#define MEMBER_2(PREFIX, A, B) MEMBER_1(PREFIX, A), MEMBER_1(PREFIX, B) -#define MEMBER_3(PREFIX, A, B, C) MEMBER_2(PREFIX, A, B), MEMBER_1(PREFIX, C) -#define MEMBER_4(PREFIX, A, B, C, D) MEMBER_3(PREFIX, A, B, C), MEMBER_1(PREFIX, D) -#define MEMBER_5(PREFIX, A, B, C, D, E) MEMBER_4(PREFIX, A, B, C, D), MEMBER_1(PREFIX, E) -#define MEMBER_6(PREFIX, A, B, C, D, E, F) MEMBER_5(PREFIX, A, B, C, D, E), MEMBER_1(PREFIX, F) -#define MEMBER_7(PREFIX, A, B, C, D, E, F, G) \ - MEMBER_6(PREFIX, A, B, C, D, E, F), MEMBER_1(PREFIX, G) - -#define MEMBER_8(PREFIX, A, B, C, D, E, F, G, H) \ - MEMBER_7(PREFIX, A, B, C, D, E, F, G), MEMBER_1(PREFIX, H) - -#define MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I) \ - MEMBER_8(PREFIX, A, B, C, D, E, F, G, H), MEMBER_1(PREFIX, I) - -#define MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J) \ - MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I), MEMBER_1(PREFIX, J) - -#define MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K) \ - MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J), MEMBER_1(PREFIX, K) - -#define MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \ - MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), MEMBER_1(PREFIX, L) +#define ADDR_MEMBER_0(...) 
+#define ADDR_MEMBER_1(PREFIX, FIELD) static_cast(&PREFIX.FIELD) +#define ADDR_MEMBER_2(PREFIX, A, B) ADDR_MEMBER_1(PREFIX, A), ADDR_MEMBER_1(PREFIX, B) +#define ADDR_MEMBER_3(PREFIX, A, B, C) ADDR_MEMBER_2(PREFIX, A, B), ADDR_MEMBER_1(PREFIX, C) +#define ADDR_MEMBER_4(PREFIX, A, B, C, D) ADDR_MEMBER_3(PREFIX, A, B, C), ADDR_MEMBER_1(PREFIX, D) +#define ADDR_MEMBER_5(PREFIX, A, B, C, D, E) \ + ADDR_MEMBER_4(PREFIX, A, B, C, D), ADDR_MEMBER_1(PREFIX, E) +#define ADDR_MEMBER_6(PREFIX, A, B, C, D, E, F) \ + ADDR_MEMBER_5(PREFIX, A, B, C, D, E), ADDR_MEMBER_1(PREFIX, F) +#define ADDR_MEMBER_7(PREFIX, A, B, C, D, E, F, G) \ + ADDR_MEMBER_6(PREFIX, A, B, C, D, E, F), ADDR_MEMBER_1(PREFIX, G) +#define ADDR_MEMBER_8(PREFIX, A, B, C, D, E, F, G, H) \ + ADDR_MEMBER_7(PREFIX, A, B, C, D, E, F, G), ADDR_MEMBER_1(PREFIX, H) +#define ADDR_MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I) \ + ADDR_MEMBER_8(PREFIX, A, B, C, D, E, F, G, H), ADDR_MEMBER_1(PREFIX, I) +#define ADDR_MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J) \ + ADDR_MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I), ADDR_MEMBER_1(PREFIX, J) +#define ADDR_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K) \ + ADDR_MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J), ADDR_MEMBER_1(PREFIX, K) +#define ADDR_MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \ + ADDR_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), ADDR_MEMBER_1(PREFIX, L) #define NAMED_MEMBER_0(...) #define NAMED_MEMBER_1(PREFIX, FIELD) std::make_pair(#FIELD, PREFIX.FIELD) @@ -80,44 +77,10 @@ #define NAMED_MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \ NAMED_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), NAMED_MEMBER_1(PREFIX, L) -/// @def GET_MEMBER_FIELDS -/// @param VAR some struct instance -/// @param ... 
The member fields of the struct -/// -/// @brief this macro is used to expand one variable (VAR) + one or more member fields (FIELDS) -/// into a sequence of something like: `(VAR.FIELD, ...)` -/// For example, `GET_MEMBER_FIELDS(foo, a, b, c)` would transform into `foo.a, foo.b, foo.c`: -/// -/// @code{.cpp} -/// -/// struct Foo -/// { -/// int a; -/// float b; -/// double c; -/// }; -/// -/// // some function taking int, float, and double -/// void some_function(int, float, double); -/// -/// // overload to some_function accepting Foo instance and using -/// // the args to invoke "real" function -/// void some_function(Foo _foo_v) -/// { -/// some_function(GET_MEMBER_FIELDS(_foo_v, a, b, c)); -/// } -/// -/// int main() -/// { -/// Foo _foo_v = {-1, 0.5f, 2.0}; -/// invoke_some_function(_foo_v); -/// } -/// -/// @code -#define GET_MEMBER_FIELDS(VAR, ...) IMPL_DETAIL_FOR_EACH(MEMBER_, VAR, __VA_ARGS__) +#define GET_ADDR_MEMBER_FIELDS(VAR, ...) IMPL_DETAIL_FOR_EACH(ADDR_MEMBER_, VAR, __VA_ARGS__) #define GET_NAMED_MEMBER_FIELDS(VAR, ...) 
IMPL_DETAIL_FOR_EACH(NAMED_MEMBER_, VAR, __VA_ARGS__) -#define HSA_API_INFO_DEFINITION_0(HSA_DOMAIN, HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR) \ +#define HSA_API_INFO_DEFINITION_0(HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR) \ namespace rocprofiler \ { \ namespace hsa \ @@ -125,10 +88,11 @@ template <> \ struct hsa_api_info \ { \ - static constexpr auto domain_idx = HSA_DOMAIN; \ - static constexpr auto table_idx = HSA_TABLE; \ - static constexpr auto operation_idx = HSA_API_ID; \ - static constexpr auto name = #HSA_FUNC; \ + static constexpr auto callback_domain_idx = ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API; \ + static constexpr auto buffered_domain_idx = ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API; \ + static constexpr auto table_idx = HSA_TABLE; \ + static constexpr auto operation_idx = HSA_API_ID; \ + static constexpr auto name = #HSA_FUNC; \ \ using this_type = hsa_api_info; \ using base_type = hsa_api_impl; \ @@ -160,7 +124,7 @@ template \ static auto& get_api_data_args(DataT& _data) \ { \ - return _data.api_data.args.HSA_FUNC; \ + return _data.HSA_FUNC; \ } \ \ template \ @@ -174,18 +138,13 @@ \ static auto get_functor() { return get_functor(get_table_func()); } \ \ - static std::string as_string(rocprofiler_hsa_trace_data_t) \ + static std::vector as_arg_addr(rocprofiler_hsa_api_callback_tracer_data_t) \ { \ - return std::string{name} + "()"; \ - } \ - \ - static std::string as_named_string(rocprofiler_hsa_trace_data_t) \ - { \ - return std::string{name} + "()"; \ + return std::vector{}; \ } \ \ static std::vector> as_arg_list( \ - rocprofiler_hsa_trace_data_t) \ + rocprofiler_hsa_api_callback_tracer_data_t) \ { \ return {}; \ } \ @@ -193,7 +152,7 @@ } \ } -#define HSA_API_INFO_DEFINITION_V(HSA_DOMAIN, HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR, ...) \ +#define HSA_API_INFO_DEFINITION_V(HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR, ...) 
\ namespace rocprofiler \ { \ namespace hsa \ @@ -201,10 +160,11 @@ template <> \ struct hsa_api_info \ { \ - static constexpr auto domain_idx = HSA_DOMAIN; \ - static constexpr auto table_idx = HSA_TABLE; \ - static constexpr auto operation_idx = HSA_API_ID; \ - static constexpr auto name = #HSA_FUNC; \ + static constexpr auto callback_domain_idx = ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API; \ + static constexpr auto buffered_domain_idx = ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API; \ + static constexpr auto table_idx = HSA_TABLE; \ + static constexpr auto operation_idx = HSA_API_ID; \ + static constexpr auto name = #HSA_FUNC; \ \ using this_type = hsa_api_info; \ using base_type = hsa_api_impl; \ @@ -236,7 +196,7 @@ template \ static auto& get_api_data_args(DataT& _data) \ { \ - return _data.api_data.args.HSA_FUNC; \ + return _data.HSA_FUNC; \ } \ \ template \ @@ -250,23 +210,17 @@ \ static auto get_functor() { return get_functor(get_table_func()); } \ \ - static std::string as_string(rocprofiler_hsa_trace_data_t trace_data) \ + static std::vector as_arg_addr( \ + rocprofiler_hsa_api_callback_tracer_data_t trace_data) \ { \ - return utils::join(utils::join_args{std::string{name} + "(", ")", ", "}, \ - GET_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \ + return std::vector{ \ + GET_ADDR_MEMBER_FIELDS(get_api_data_args(trace_data.args), __VA_ARGS__)}; \ } \ \ - static std::string as_named_string(rocprofiler_hsa_trace_data_t trace_data) \ - { \ - return utils::join( \ - utils::join_args{std::string{name} + "(", ")", ", "}, \ - GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \ - } \ - \ - static auto as_arg_list(rocprofiler_hsa_trace_data_t trace_data) \ + static auto as_arg_list(rocprofiler_hsa_api_callback_tracer_data_t trace_data) \ { \ return utils::stringize( \ - GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \ + GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data.args), __VA_ARGS__)); \ } \ }; \ } \ 
diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/details/CMakeLists.txt b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/details/CMakeLists.txt new file mode 100644 index 0000000000..44db613fb6 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/details/CMakeLists.txt @@ -0,0 +1,8 @@ +# +# +# +set(ROCPROFILER_LIB_HSA_DETAILS_SOURCES) +set(ROCPROFILER_LIB_HSA_DETAILS_HEADERS ostream.hpp) + +target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_HSA_DETAILS_SOURCES} + ${ROCPROFILER_LIB_HSA_DETAILS_HEADERS}) diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/ostream.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/details/ostream.hpp similarity index 76% rename from projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/ostream.hpp rename to projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/details/ostream.hpp index 2b5336f8cc..1fd8077b7c 100644 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/ostream.hpp +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/details/ostream.hpp @@ -23,29 +23,29 @@ THE SOFTWARE. 
#pragma once -#include "lib/rocprofiler/tracer.hpp" +#include #include #include +#include namespace rocprofiler { namespace hsa { -static int HSA_depth_max = 1; -static int HSA_depth_max_cnt = 0; -static std::string HSA_structs_regex = ""; +static int HSA_depth_max = 1; +static thread_local int HSA_depth_max_cnt = 0; +static std::string_view HSA_structs_regex = {}; // begin ostream ops for HSA // basic ostream ops namespace detail { template - inline static std::ostream& operator<<(std::ostream& out, const T& v) { - using std:: operator<<; - static bool recursion = false; + using std:: operator<<; + static thread_local bool recursion = false; if(recursion == false) { recursion = true; @@ -77,19 +77,19 @@ operator<<(std::ostream& out, const hsa_dim3_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_dim3_t::z").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_dim3_t::z"}.find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "z="); rocprofiler::hsa::detail::operator<<(out, v.z); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_dim3_t::y").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_dim3_t::y"}.find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "y="); rocprofiler::hsa::detail::operator<<(out, v.y); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_dim3_t::x").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_dim3_t::x"}.find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "x="); rocprofiler::hsa::detail::operator<<(out, v.x); @@ -107,7 +107,8 @@ operator<<(std::ostream& out, const hsa_agent_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_agent_t::handle").find(HSA_structs_regex) != 
std::string::npos) + if(std::string_view{"hsa_agent_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -125,7 +126,8 @@ operator<<(std::ostream& out, const hsa_cache_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_cache_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_cache_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -143,7 +145,8 @@ operator<<(std::ostream& out, const hsa_signal_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_signal_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_signal_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -161,7 +164,8 @@ operator<<(std::ostream& out, const hsa_signal_group_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_signal_group_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_signal_group_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -179,7 +183,8 @@ operator<<(std::ostream& out, const hsa_region_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_region_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_region_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); 
rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -197,37 +202,40 @@ operator<<(std::ostream& out, const hsa_queue_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_queue_t::id").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_queue_t::id"}.find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "id="); rocprofiler::hsa::detail::operator<<(out, v.id); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_queue_t::reserved1").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_queue_t::reserved1"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved1="); rocprofiler::hsa::detail::operator<<(out, v.reserved1); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_queue_t::size").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_queue_t::size"}.find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "size="); rocprofiler::hsa::detail::operator<<(out, v.size); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_queue_t::doorbell_signal").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_queue_t::doorbell_signal"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "doorbell_signal="); rocprofiler::hsa::detail::operator<<(out, v.doorbell_signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_queue_t::features").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_queue_t::features"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "features="); rocprofiler::hsa::detail::operator<<(out, v.features); rocprofiler::hsa::detail::operator<<(out, ", "); } - 
if(std::string("hsa_queue_t::type").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_queue_t::type"}.find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "type="); rocprofiler::hsa::detail::operator<<(out, v.type); @@ -245,99 +253,99 @@ operator<<(std::ostream& out, const hsa_kernel_dispatch_packet_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_kernel_dispatch_packet_t::completion_signal").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::completion_signal"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "completion_signal="); rocprofiler::hsa::detail::operator<<(out, v.completion_signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::reserved2").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::reserved2"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved2="); rocprofiler::hsa::detail::operator<<(out, v.reserved2); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::kernel_object").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::kernel_object"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "kernel_object="); rocprofiler::hsa::detail::operator<<(out, v.kernel_object); rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_kernel_dispatch_packet_t::group_segment_size") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "group_segment_size="); rocprofiler::hsa::detail::operator<<(out, v.group_segment_size); 
rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_kernel_dispatch_packet_t::private_segment_size") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "private_segment_size="); rocprofiler::hsa::detail::operator<<(out, v.private_segment_size); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::grid_size_z").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::grid_size_z"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "grid_size_z="); rocprofiler::hsa::detail::operator<<(out, v.grid_size_z); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::grid_size_y").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::grid_size_y"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "grid_size_y="); rocprofiler::hsa::detail::operator<<(out, v.grid_size_y); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::grid_size_x").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::grid_size_x"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "grid_size_x="); rocprofiler::hsa::detail::operator<<(out, v.grid_size_x); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::reserved0").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::reserved0"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved0="); rocprofiler::hsa::detail::operator<<(out, v.reserved0); rocprofiler::hsa::detail::operator<<(out, ", 
"); } - if(std::string("hsa_kernel_dispatch_packet_t::workgroup_size_z").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::workgroup_size_z"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "workgroup_size_z="); rocprofiler::hsa::detail::operator<<(out, v.workgroup_size_z); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::workgroup_size_y").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::workgroup_size_y"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "workgroup_size_y="); rocprofiler::hsa::detail::operator<<(out, v.workgroup_size_y); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::workgroup_size_x").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::workgroup_size_x"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "workgroup_size_x="); rocprofiler::hsa::detail::operator<<(out, v.workgroup_size_x); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::setup").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::setup"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "setup="); rocprofiler::hsa::detail::operator<<(out, v.setup); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_kernel_dispatch_packet_t::header").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_kernel_dispatch_packet_t::header"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "header="); rocprofiler::hsa::detail::operator<<(out, v.header); @@ -355,43 +363,43 @@ 
operator<<(std::ostream& out, const hsa_agent_dispatch_packet_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_agent_dispatch_packet_t::completion_signal").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_agent_dispatch_packet_t::completion_signal"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "completion_signal="); rocprofiler::hsa::detail::operator<<(out, v.completion_signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_agent_dispatch_packet_t::reserved2").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_agent_dispatch_packet_t::reserved2"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved2="); rocprofiler::hsa::detail::operator<<(out, v.reserved2); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_agent_dispatch_packet_t::arg").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_agent_dispatch_packet_t::arg"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "arg="); rocprofiler::hsa::detail::operator<<(out, v.arg); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_agent_dispatch_packet_t::reserved0").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_agent_dispatch_packet_t::reserved0"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved0="); rocprofiler::hsa::detail::operator<<(out, v.reserved0); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_agent_dispatch_packet_t::type").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_agent_dispatch_packet_t::type"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "type="); 
rocprofiler::hsa::detail::operator<<(out, v.type); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_agent_dispatch_packet_t::header").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_agent_dispatch_packet_t::header"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "header="); rocprofiler::hsa::detail::operator<<(out, v.header); @@ -409,43 +417,43 @@ operator<<(std::ostream& out, const hsa_barrier_and_packet_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_barrier_and_packet_t::completion_signal").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_and_packet_t::completion_signal"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "completion_signal="); rocprofiler::hsa::detail::operator<<(out, v.completion_signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_and_packet_t::reserved2").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_and_packet_t::reserved2"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved2="); rocprofiler::hsa::detail::operator<<(out, v.reserved2); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_and_packet_t::dep_signal").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_and_packet_t::dep_signal"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "dep_signal="); rocprofiler::hsa::detail::operator<<(out, v.dep_signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_and_packet_t::reserved1").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_and_packet_t::reserved1"}.find(HSA_structs_regex) != + 
std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved1="); rocprofiler::hsa::detail::operator<<(out, v.reserved1); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_and_packet_t::reserved0").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_and_packet_t::reserved0"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved0="); rocprofiler::hsa::detail::operator<<(out, v.reserved0); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_and_packet_t::header").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_and_packet_t::header"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "header="); rocprofiler::hsa::detail::operator<<(out, v.header); @@ -463,43 +471,43 @@ operator<<(std::ostream& out, const hsa_barrier_or_packet_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_barrier_or_packet_t::completion_signal").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_or_packet_t::completion_signal"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "completion_signal="); rocprofiler::hsa::detail::operator<<(out, v.completion_signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_or_packet_t::reserved2").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_or_packet_t::reserved2"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved2="); rocprofiler::hsa::detail::operator<<(out, v.reserved2); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_or_packet_t::dep_signal").find(HSA_structs_regex) != - std::string::npos) + 
if(std::string_view{"hsa_barrier_or_packet_t::dep_signal"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "dep_signal="); rocprofiler::hsa::detail::operator<<(out, v.dep_signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_or_packet_t::reserved1").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_or_packet_t::reserved1"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved1="); rocprofiler::hsa::detail::operator<<(out, v.reserved1); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_or_packet_t::reserved0").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_or_packet_t::reserved0"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved0="); rocprofiler::hsa::detail::operator<<(out, v.reserved0); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_barrier_or_packet_t::header").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_barrier_or_packet_t::header"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "header="); rocprofiler::hsa::detail::operator<<(out, v.header); @@ -517,7 +525,7 @@ operator<<(std::ostream& out, const hsa_isa_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_isa_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_isa_t::handle"}.find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -535,7 +543,8 @@ operator<<(std::ostream& out, const hsa_wavefront_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - 
if(std::string("hsa_wavefront_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_wavefront_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -553,8 +562,8 @@ operator<<(std::ostream& out, const hsa_code_object_reader_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_code_object_reader_t::handle").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_code_object_reader_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -572,7 +581,8 @@ operator<<(std::ostream& out, const hsa_executable_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_executable_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_executable_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -590,8 +600,8 @@ operator<<(std::ostream& out, const hsa_loaded_code_object_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_loaded_code_object_t::handle").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_loaded_code_object_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -609,8 +619,8 @@ operator<<(std::ostream& out, const hsa_executable_symbol_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_executable_symbol_t::handle").find(HSA_structs_regex) != - 
std::string::npos) + if(std::string_view{"hsa_executable_symbol_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -628,7 +638,8 @@ operator<<(std::ostream& out, const hsa_code_object_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_code_object_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_code_object_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -646,7 +657,8 @@ operator<<(std::ostream& out, const hsa_callback_data_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_callback_data_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_callback_data_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -664,7 +676,8 @@ operator<<(std::ostream& out, const hsa_code_symbol_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_code_symbol_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_code_symbol_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -682,7 +695,8 @@ operator<<(std::ostream& out, const hsa_ext_image_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_ext_image_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_ext_image_t::handle"}.find(HSA_structs_regex) != + 
std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -700,15 +714,15 @@ operator<<(std::ostream& out, const hsa_ext_image_format_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_ext_image_format_t::channel_order").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_format_t::channel_order"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "channel_order="); rocprofiler::hsa::detail::operator<<(out, v.channel_order); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_image_format_t::channel_type").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_format_t::channel_type"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "channel_type="); rocprofiler::hsa::detail::operator<<(out, v.channel_type); @@ -726,43 +740,43 @@ operator<<(std::ostream& out, const hsa_ext_image_descriptor_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_ext_image_descriptor_t::format").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_descriptor_t::format"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "format="); rocprofiler::hsa::detail::operator<<(out, v.format); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_image_descriptor_t::array_size").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_descriptor_t::array_size"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "array_size="); rocprofiler::hsa::detail::operator<<(out, v.array_size); rocprofiler::hsa::detail::operator<<(out, ", "); } - 
if(std::string("hsa_ext_image_descriptor_t::depth").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_descriptor_t::depth"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "depth="); rocprofiler::hsa::detail::operator<<(out, v.depth); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_image_descriptor_t::height").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_descriptor_t::height"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "height="); rocprofiler::hsa::detail::operator<<(out, v.height); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_image_descriptor_t::width").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_descriptor_t::width"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "width="); rocprofiler::hsa::detail::operator<<(out, v.width); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_image_descriptor_t::geometry").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_descriptor_t::geometry"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "geometry="); rocprofiler::hsa::detail::operator<<(out, v.geometry); @@ -780,15 +794,15 @@ operator<<(std::ostream& out, const hsa_ext_image_data_info_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_ext_image_data_info_t::alignment").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_data_info_t::alignment"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "alignment="); rocprofiler::hsa::detail::operator<<(out, v.alignment); 
rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_image_data_info_t::size").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_data_info_t::size"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "size="); rocprofiler::hsa::detail::operator<<(out, v.size); @@ -806,15 +820,15 @@ operator<<(std::ostream& out, const hsa_ext_image_region_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_ext_image_region_t::range").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_region_t::range"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "range="); rocprofiler::hsa::detail::operator<<(out, v.range); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_image_region_t::offset").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_image_region_t::offset"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "offset="); rocprofiler::hsa::detail::operator<<(out, v.offset); @@ -832,7 +846,8 @@ operator<<(std::ostream& out, const hsa_ext_sampler_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_ext_sampler_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_ext_sampler_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -850,22 +865,22 @@ operator<<(std::ostream& out, const hsa_ext_sampler_descriptor_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_ext_sampler_descriptor_t::address_mode").find(HSA_structs_regex) != - std::string::npos) + 
if(std::string_view{"hsa_ext_sampler_descriptor_t::address_mode"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "address_mode="); rocprofiler::hsa::detail::operator<<(out, v.address_mode); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_sampler_descriptor_t::filter_mode").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_sampler_descriptor_t::filter_mode"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "filter_mode="); rocprofiler::hsa::detail::operator<<(out, v.filter_mode); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_sampler_descriptor_t::coordinate_mode").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_sampler_descriptor_t::coordinate_mode"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "coordinate_mode="); rocprofiler::hsa::detail::operator<<(out, v.coordinate_mode); @@ -884,42 +899,42 @@ operator<<(std::ostream& out, const hsa_ext_images_1_00_pfn_t& v) if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { if(std::string("hsa_ext_images_1_00_pfn_t::hsa_ext_sampler_destroy") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_sampler_destroy="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_sampler_destroy); rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_ext_images_1_00_pfn_t::hsa_ext_sampler_create") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_sampler_create="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_sampler_create); rocprofiler::hsa::detail::operator<<(out, ", "); } - 
if(std::string("hsa_ext_images_1_00_pfn_t::hsa_ext_image_copy").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_images_1_00_pfn_t::hsa_ext_image_copy"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_copy="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_copy); rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_ext_images_1_00_pfn_t::hsa_ext_image_destroy") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_destroy="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_destroy); rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_ext_images_1_00_pfn_t::hsa_ext_image_data_get_info") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_data_get_info="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_data_get_info); rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_ext_images_1_00_pfn_t::hsa_ext_image_get_capability") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_get_capability="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_get_capability); @@ -938,56 +953,56 @@ operator<<(std::ostream& out, const hsa_ext_images_1_pfn_t& v) if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { if(std::string("hsa_ext_images_1_pfn_t::hsa_ext_image_data_get_info_with_layout") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_data_get_info_with_layout="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_data_get_info_with_layout); 
rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_ext_images_1_pfn_t::hsa_ext_image_get_capability_with_layout") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_get_capability_with_layout="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_get_capability_with_layout); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_images_1_pfn_t::hsa_ext_sampler_destroy").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_images_1_pfn_t::hsa_ext_sampler_destroy"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_sampler_destroy="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_sampler_destroy); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_images_1_pfn_t::hsa_ext_sampler_create").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_images_1_pfn_t::hsa_ext_sampler_create"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_sampler_create="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_sampler_create); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_images_1_pfn_t::hsa_ext_image_copy").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_images_1_pfn_t::hsa_ext_image_copy"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_copy="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_copy); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_ext_images_1_pfn_t::hsa_ext_image_destroy").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_ext_images_1_pfn_t::hsa_ext_image_destroy"}.find( + HSA_structs_regex) != std::string_view::npos) { 
rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_destroy="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_destroy); rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_ext_images_1_pfn_t::hsa_ext_image_data_get_info") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_data_get_info="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_data_get_info); rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_ext_images_1_pfn_t::hsa_ext_image_get_capability") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "hsa_ext_image_get_capability="); rocprofiler::hsa::detail::operator<<(out, v.hsa_ext_image_get_capability); @@ -1005,22 +1020,22 @@ operator<<(std::ostream& out, const hsa_amd_vendor_packet_header_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_vendor_packet_header_t::reserved").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_vendor_packet_header_t::reserved"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved="); rocprofiler::hsa::detail::operator<<(out, v.reserved); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_vendor_packet_header_t::AmdFormat").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_vendor_packet_header_t::AmdFormat"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "AmdFormat="); rocprofiler::hsa::detail::operator<<(out, v.AmdFormat); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_vendor_packet_header_t::header").find(HSA_structs_regex) != - std::string::npos) + 
if(std::string_view{"hsa_amd_vendor_packet_header_t::header"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "header="); rocprofiler::hsa::detail::operator<<(out, v.header); @@ -1039,70 +1054,70 @@ operator<<(std::ostream& out, const hsa_amd_barrier_value_packet_t& v) if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { if(std::string("hsa_amd_barrier_value_packet_t::completion_signal") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "completion_signal="); rocprofiler::hsa::detail::operator<<(out, v.completion_signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_barrier_value_packet_t::reserved3").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::reserved3"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved3="); rocprofiler::hsa::detail::operator<<(out, v.reserved3); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_barrier_value_packet_t::reserved2").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::reserved2"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved2="); rocprofiler::hsa::detail::operator<<(out, v.reserved2); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_barrier_value_packet_t::reserved1").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::reserved1"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved1="); rocprofiler::hsa::detail::operator<<(out, v.reserved1); rocprofiler::hsa::detail::operator<<(out, ", "); } - 
if(std::string("hsa_amd_barrier_value_packet_t::cond").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::cond"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "cond="); rocprofiler::hsa::detail::operator<<(out, v.cond); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_barrier_value_packet_t::mask").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::mask"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "mask="); rocprofiler::hsa::detail::operator<<(out, v.mask); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_barrier_value_packet_t::value").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::value"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "value="); rocprofiler::hsa::detail::operator<<(out, v.value); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_barrier_value_packet_t::signal").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::signal"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "signal="); rocprofiler::hsa::detail::operator<<(out, v.signal); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_barrier_value_packet_t::reserved0").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::reserved0"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "reserved0="); rocprofiler::hsa::detail::operator<<(out, v.reserved0); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_barrier_value_packet_t::header").find(HSA_structs_regex) != - 
std::string::npos) + if(std::string_view{"hsa_amd_barrier_value_packet_t::header"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "header="); rocprofiler::hsa::detail::operator<<(out, v.header); @@ -1120,15 +1135,15 @@ operator<<(std::ostream& out, const hsa_amd_hdp_flush_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_hdp_flush_t::HDP_REG_FLUSH_CNTL").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_hdp_flush_t::HDP_REG_FLUSH_CNTL"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "HDP_REG_FLUSH_CNTL="); rocprofiler::hsa::detail::operator<<(out, v.HDP_REG_FLUSH_CNTL); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_hdp_flush_t::HDP_MEM_FLUSH_CNTL").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_hdp_flush_t::HDP_MEM_FLUSH_CNTL"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "HDP_MEM_FLUSH_CNTL="); rocprofiler::hsa::detail::operator<<(out, v.HDP_MEM_FLUSH_CNTL); @@ -1146,15 +1161,15 @@ operator<<(std::ostream& out, const hsa_amd_profiling_dispatch_time_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_profiling_dispatch_time_t::end").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_profiling_dispatch_time_t::end"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "end="); rocprofiler::hsa::detail::operator<<(out, v.end); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_profiling_dispatch_time_t::start").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_profiling_dispatch_time_t::start"}.find(HSA_structs_regex) != + std::string_view::npos) { 
rocprofiler::hsa::detail::operator<<(out, "start="); rocprofiler::hsa::detail::operator<<(out, v.start); @@ -1172,15 +1187,15 @@ operator<<(std::ostream& out, const hsa_amd_profiling_async_copy_time_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_profiling_async_copy_time_t::end").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_profiling_async_copy_time_t::end"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "end="); rocprofiler::hsa::detail::operator<<(out, v.end); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_profiling_async_copy_time_t::start").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_profiling_async_copy_time_t::start"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "start="); rocprofiler::hsa::detail::operator<<(out, v.start); @@ -1198,8 +1213,8 @@ operator<<(std::ostream& out, const hsa_amd_memory_pool_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_memory_pool_t::handle").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_memory_pool_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -1217,13 +1232,15 @@ operator<<(std::ostream& out, const hsa_pitched_ptr_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_pitched_ptr_t::slice").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_pitched_ptr_t::slice"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "slice="); rocprofiler::hsa::detail::operator<<(out, v.slice); 
rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_pitched_ptr_t::pitch").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_pitched_ptr_t::pitch"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "pitch="); rocprofiler::hsa::detail::operator<<(out, v.pitch); @@ -1241,43 +1258,43 @@ operator<<(std::ostream& out, const hsa_amd_memory_pool_link_info_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_memory_pool_link_info_t::numa_distance").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_memory_pool_link_info_t::numa_distance"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "numa_distance="); rocprofiler::hsa::detail::operator<<(out, v.numa_distance); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_memory_pool_link_info_t::link_type").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_memory_pool_link_info_t::link_type"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "link_type="); rocprofiler::hsa::detail::operator<<(out, v.link_type); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_memory_pool_link_info_t::max_bandwidth").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_memory_pool_link_info_t::max_bandwidth"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "max_bandwidth="); rocprofiler::hsa::detail::operator<<(out, v.max_bandwidth); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_memory_pool_link_info_t::min_bandwidth").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_memory_pool_link_info_t::min_bandwidth"}.find( + HSA_structs_regex) != 
std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "min_bandwidth="); rocprofiler::hsa::detail::operator<<(out, v.min_bandwidth); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_memory_pool_link_info_t::max_latency").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_memory_pool_link_info_t::max_latency"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "max_latency="); rocprofiler::hsa::detail::operator<<(out, v.max_latency); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_memory_pool_link_info_t::min_latency").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_memory_pool_link_info_t::min_latency"}.find( + HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "min_latency="); rocprofiler::hsa::detail::operator<<(out, v.min_latency); @@ -1295,22 +1312,22 @@ operator<<(std::ostream& out, const hsa_amd_image_descriptor_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_image_descriptor_t::data").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_image_descriptor_t::data"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "data="); rocprofiler::hsa::detail::operator<<(out, v.data); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_image_descriptor_t::deviceID").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_image_descriptor_t::deviceID"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "deviceID="); rocprofiler::hsa::detail::operator<<(out, v.deviceID); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_image_descriptor_t::version").find(HSA_structs_regex) != - std::string::npos) + 
if(std::string_view{"hsa_amd_image_descriptor_t::version"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "version="); rocprofiler::hsa::detail::operator<<(out, v.version); @@ -1328,34 +1345,36 @@ operator<<(std::ostream& out, const hsa_amd_pointer_info_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_pointer_info_t::global_flags").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_pointer_info_t::global_flags"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "global_flags="); rocprofiler::hsa::detail::operator<<(out, v.global_flags); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_pointer_info_t::agentOwner").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_pointer_info_t::agentOwner"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "agentOwner="); rocprofiler::hsa::detail::operator<<(out, v.agentOwner); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_pointer_info_t::sizeInBytes").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_pointer_info_t::sizeInBytes"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "sizeInBytes="); rocprofiler::hsa::detail::operator<<(out, v.sizeInBytes); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_pointer_info_t::type").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_amd_pointer_info_t::type"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "type="); rocprofiler::hsa::detail::operator<<(out, v.type); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_pointer_info_t::size").find(HSA_structs_regex) != 
std::string::npos) + if(std::string_view{"hsa_amd_pointer_info_t::size"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "size="); rocprofiler::hsa::detail::operator<<(out, v.size); @@ -1373,7 +1392,8 @@ operator<<(std::ostream& out, const hsa_amd_ipc_memory_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_ipc_memory_t::handle").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_amd_ipc_memory_t::handle"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "handle="); rocprofiler::hsa::detail::operator<<(out, v.handle); @@ -1392,21 +1412,21 @@ operator<<(std::ostream& out, const hsa_amd_gpu_memory_fault_info_t& v) if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { if(std::string("hsa_amd_gpu_memory_fault_info_t::fault_reason_mask") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "fault_reason_mask="); rocprofiler::hsa::detail::operator<<(out, v.fault_reason_mask); rocprofiler::hsa::detail::operator<<(out, ", "); } if(std::string("hsa_amd_gpu_memory_fault_info_t::virtual_address") - .find(HSA_structs_regex) != std::string::npos) + .find(HSA_structs_regex) != std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "virtual_address="); rocprofiler::hsa::detail::operator<<(out, v.virtual_address); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_gpu_memory_fault_info_t::agent").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_gpu_memory_fault_info_t::agent"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "agent="); rocprofiler::hsa::detail::operator<<(out, v.agent); @@ -1424,7 +1444,8 @@ operator<<(std::ostream& out, const hsa_amd_event_t& v) 
HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_event_t::event_type").find(HSA_structs_regex) != std::string::npos) + if(std::string_view{"hsa_amd_event_t::event_type"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "event_type="); rocprofiler::hsa::detail::operator<<(out, v.event_type); @@ -1442,15 +1463,15 @@ operator<<(std::ostream& out, const hsa_amd_svm_attribute_pair_t& v) HSA_depth_max_cnt++; if(HSA_depth_max == -1 || HSA_depth_max_cnt <= HSA_depth_max) { - if(std::string("hsa_amd_svm_attribute_pair_t::value").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_svm_attribute_pair_t::value"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "value="); rocprofiler::hsa::detail::operator<<(out, v.value); rocprofiler::hsa::detail::operator<<(out, ", "); } - if(std::string("hsa_amd_svm_attribute_pair_t::attribute").find(HSA_structs_regex) != - std::string::npos) + if(std::string_view{"hsa_amd_svm_attribute_pair_t::attribute"}.find(HSA_structs_regex) != + std::string_view::npos) { rocprofiler::hsa::detail::operator<<(out, "attribute="); rocprofiler::hsa::detail::operator<<(out, v.attribute); diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.cpp index bc06fff9a7..d076b404e2 100644 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.cpp +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.cpp @@ -19,12 +19,20 @@ // THE SOFTWARE. 
#include "lib/rocprofiler/hsa/hsa.hpp" - #include "lib/common/defines.hpp" -#include "lib/rocprofiler/hsa/ostream.hpp" +#include "lib/common/utility.hpp" +#include "lib/rocprofiler/buffer.hpp" +#include "lib/rocprofiler/context/context.hpp" +#include "lib/rocprofiler/hsa/details/ostream.hpp" #include "lib/rocprofiler/hsa/types.hpp" #include "lib/rocprofiler/hsa/utils.hpp" +#include +#include +#include + +#include + #include #include #include @@ -46,7 +54,12 @@ template void set_data_retval(DataT& _data, Tp _val) { - if constexpr(std::is_same::value) + if constexpr(std::is_same::value) + { + (void) _data; + (void) _val; + } + else if constexpr(std::is_same::value) { _data.hsa_signal_value_t_retval = _val; } @@ -100,65 +113,35 @@ get_table() } template -template +template auto -hsa_api_impl::phase_enter(DataT& _data, DataArgsT& _data_args, Args... args) +hsa_api_impl::set_data_args(DataArgsT& _data_args, Args... args) { - using info_type = hsa_api_info; - - activity_functor_t _func = report_activity.load(std::memory_order_relaxed); - if(_func) + if constexpr(Idx == ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect) { - if constexpr(Idx == ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect) - { - auto _tuple = std::make_tuple(args...); - _data.api_data.args.hsa_amd_memory_async_copy_rect.dst = std::get<0>(_tuple); - _data.api_data.args.hsa_amd_memory_async_copy_rect.dst_offset = std::get<1>(_tuple); - _data.api_data.args.hsa_amd_memory_async_copy_rect.src = std::get<2>(_tuple); - _data.api_data.args.hsa_amd_memory_async_copy_rect.src_offset = std::get<3>(_tuple); - _data.api_data.args.hsa_amd_memory_async_copy_rect.range = std::get<4>(_tuple); - _data.api_data.args.hsa_amd_memory_async_copy_rect.range__val = *(std::get<4>(_tuple)); - _data.api_data.args.hsa_amd_memory_async_copy_rect.copy_agent = std::get<5>(_tuple); - _data.api_data.args.hsa_amd_memory_async_copy_rect.dir = std::get<6>(_tuple); - 
_data.api_data.args.hsa_amd_memory_async_copy_rect.num_dep_signals = - std::get<7>(_tuple); - _data.api_data.args.hsa_amd_memory_async_copy_rect.dep_signals = std::get<8>(_tuple); - _data.api_data.args.hsa_amd_memory_async_copy_rect.completion_signal = - std::get<9>(_tuple); - } - else - { - _data_args = DataArgsT{args...}; - } - if(_func(info_type::domain_idx, info_type::operation_idx, &_data) == 0) - { - if(_data.phase_enter != nullptr) _data.phase_enter(info_type::operation_idx, &_data); - return true; - } - return false; + auto _tuple = std::make_tuple(args...); + _data_args.dst = std::get<0>(_tuple); + _data_args.dst_offset = std::get<1>(_tuple); + _data_args.src = std::get<2>(_tuple); + _data_args.src_offset = std::get<3>(_tuple); + _data_args.range = std::get<4>(_tuple); + _data_args.range__val = *(std::get<4>(_tuple)); + _data_args.copy_agent = std::get<5>(_tuple); + _data_args.dir = std::get<6>(_tuple); + _data_args.num_dep_signals = std::get<7>(_tuple); + _data_args.dep_signals = std::get<8>(_tuple); + _data_args.completion_signal = std::get<9>(_tuple); + } + else + { + _data_args = DataArgsT{args...}; } - return false; } template -template +template auto -hsa_api_impl::phase_exit(DataT& _data) -{ - using info_type = hsa_api_info; - - if(_data.phase_exit != nullptr) - { - _data.phase_exit(info_type::operation_idx, &_data); - return true; - } - return false; -} - -template -template -auto -hsa_api_impl::exec(DataT& _data, FuncT&& _func, Args&&... args) +hsa_api_impl::exec(FuncT&& _func, Args&&... args) { using return_type = std::decay_t>; @@ -175,9 +158,7 @@ hsa_api_impl::exec(DataT& _data, FuncT&& _func, Args&&... args) } else { - auto _ret = _func(std::forward(args)...); - set_data_retval(_data.api_data, _ret); - return _ret; + return _func(std::forward(args)...); } } @@ -194,14 +175,161 @@ hsa_api_impl::functor(Args&&... 
args) { using info_type = hsa_api_info; - auto trace_data = rocprofiler_hsa_trace_data_t{}; + LOG(INFO) << __PRETTY_FUNCTION__; - auto _enabled = phase_enter( - trace_data, info_type::get_api_data_args(trace_data), std::forward(args)...); + struct callback_context_data + { + context::context* ctx = nullptr; + rocprofiler_callback_tracing_record_t record = {}; + }; - auto _ret = exec(trace_data, info_type::get_table_func(), std::forward(args)...); + struct buffered_context_data + { + context::context* ctx = nullptr; + }; - if(_enabled) phase_exit(trace_data); + auto callback_contexts = std::vector{}; + auto buffered_contexts = std::vector{}; + for(const auto& aitr : context::get_active_contexts()) + { + auto* itr = aitr.load(); + if(!itr) continue; + + if(itr->callback_tracer) + { + // if the given domain + op is not enabled, skip this context + if(!itr->callback_tracer->domains(info_type::callback_domain_idx, + info_type::operation_idx)) + continue; + + callback_contexts.emplace_back( + callback_context_data{itr, rocprofiler_callback_tracing_record_t{}}); + } + + if(itr->buffered_tracer) + { + // if the given domain + op is not enabled, skip this context + if(!itr->buffered_tracer->domains(info_type::buffered_domain_idx, + info_type::operation_idx)) + continue; + + buffered_contexts.emplace_back(buffered_context_data{itr}); + } + } + + if(callback_contexts.empty() && buffered_contexts.empty()) + { + auto _ret = exec(info_type::get_table_func(), std::forward(args)...); + if constexpr(!std::is_same::value) + return _ret; + else + return HSA_STATUS_SUCCESS; + } + + auto buffer_record = rocprofiler_buffer_tracing_hsa_api_record_t{}; + auto tracer_data = rocprofiler_hsa_api_callback_tracer_data_t{}; + auto corr_id = context::correlation_tracing_service::get_unique_record_id(); + auto thr_id = common::get_tid(); + + // construct the buffered info before the callback so the callbacks are as closely wrapped + // around the function call as possible + 
if(!buffered_contexts.empty()) + { + buffer_record.kind = info_type::buffered_domain_idx; + buffer_record.correlation_id = rocprofiler_correlation_id_t{corr_id}; + buffer_record.operation = info_type::operation_idx; + buffer_record.thread_id = thr_id; + } + + // invoke the callbacks + if(!callback_contexts.empty()) + { + tracer_data.size = sizeof(rocprofiler_hsa_api_callback_tracer_data_t); + set_data_args(info_type::get_api_data_args(tracer_data.args), std::forward(args)...); + + for(auto& itr : callback_contexts) + { + auto& ctx = itr.ctx; + auto& record = itr.record; + + uint64_t extern_corr_id = 0; + auto& _correlation = ctx->correlation_tracer; + if(_correlation.external_id_callback) + { + _correlation.external_id = _correlation.external_id_callback( + info_type::callback_domain_idx, info_type::operation_idx, corr_id); + extern_corr_id = _correlation.external_id; + } + auto user_data = rocprofiler_user_data_t{.value = 0}; + + record = rocprofiler_callback_tracing_record_t{ + thr_id, + rocprofiler_correlation_id_t{corr_id}, + rocprofiler_external_correlation_id_t{extern_corr_id}, + info_type::callback_domain_idx, + info_type::operation_idx, + ROCPROFILER_SERVICE_CALLBACK_PHASE_ENTER, + user_data, + static_cast(&tracer_data)}; + + auto& callback_info = + ctx->callback_tracer->callback_data.at(info_type::callback_domain_idx); + callback_info.callback(record, callback_info.data); + } + } + + // record the start timestamp as close to the function call as possible + if(!buffered_contexts.empty()) + { + buffer_record.start_timestamp = common::timestamp_ns(); + } + + auto _ret = exec(info_type::get_table_func(), std::forward(args)...); + + // record the end timestamp as close to the function call as possible + if(!buffered_contexts.empty()) + { + buffer_record.end_timestamp = common::timestamp_ns(); + } + + if(!callback_contexts.empty()) + { + set_data_retval(tracer_data.retval, _ret); + + for(auto& itr : callback_contexts) + { + auto& ctx = itr.ctx; + auto& record = 
itr.record; + + record.phase = ROCPROFILER_SERVICE_CALLBACK_PHASE_EXIT; + record.payload = static_cast(&tracer_data); + + auto& callback_info = + ctx->callback_tracer->callback_data.at(info_type::callback_domain_idx); + callback_info.callback(record, callback_info.data); + } + } + + if(!buffered_contexts.empty()) + { + for(auto& itr : buffered_contexts) + { + assert(itr.ctx->buffered_tracer); + auto buffer_id = + itr.ctx->buffered_tracer->buffer_data.at(info_type::buffered_domain_idx); + for(auto& bitr : buffer::get_buffers()) + { + if(bitr && bitr->context_id == itr.ctx->context_idx && + bitr->buffer_id == buffer_id.handle) + { + bitr->emplace(ROCPROFILER_BUFFER_CATEGORY_TRACING, + info_type::buffered_domain_idx, + buffer_record); + break; + } + } + } + } if constexpr(!std::is_same::value) return _ret; @@ -222,74 +350,59 @@ namespace { template const char* -hsa_api_name(const uint32_t id, std::index_sequence) +name_by_id(const uint32_t id, std::index_sequence) { if(Idx == id) return hsa_api_info::name; if constexpr(sizeof...(IdxTail) > 0) - return hsa_api_name(id, std::index_sequence{}); + return name_by_id(id, std::index_sequence{}); else return nullptr; } template uint32_t -hsa_api_id_by_name(const char* name, std::index_sequence) +id_by_name(const char* name, std::index_sequence) { if(std::string_view{hsa_api_info::name} == std::string_view{name}) return hsa_api_info::operation_idx; if constexpr(sizeof...(IdxTail) > 0) - return hsa_api_id_by_name(name, std::index_sequence{}); + return id_by_name(name, std::index_sequence{}); else return ROCPROFILER_HSA_API_ID_NONE; } -template -std::string -hsa_api_data_string(const uint32_t id, - const rocprofiler_hsa_trace_data_t& _data, - std::index_sequence) -{ - if(Idx == id) return hsa_api_info::as_string(_data); - if constexpr(sizeof...(IdxTail) > 0) - return hsa_api_data_string(id, _data, std::index_sequence{}); - else - return std::string{}; -} - -template -std::string -hsa_api_named_data_string(const uint32_t id, - 
const rocprofiler_hsa_trace_data_t& _data, - std::index_sequence) -{ - if(Idx == id) return hsa_api_info::as_named_string(_data); - if constexpr(sizeof...(IdxTail) > 0) - return hsa_api_named_data_string(id, _data, std::index_sequence{}); - else - return std::string{}; -} - template void -hsa_api_iterate_args(const uint32_t id, - const rocprofiler_hsa_trace_data_t& _data, - int (*_func)(const char*, const char*), - std::index_sequence) +iterate_args(const uint32_t id, + const rocprofiler_hsa_api_callback_tracer_data_t& data, + rocprofiler_callback_tracing_operation_args_cb_t func, + void* user_data, + std::index_sequence) { if(Idx == id) { - for(auto&& itr : hsa_api_info::as_arg_list(_data)) + using info_type = hsa_api_info; + auto&& arg_list = info_type::as_arg_list(data); + auto&& arg_addr = info_type::as_arg_addr(data); + for(size_t i = 0; i < std::min(arg_list.size(), arg_addr.size()); ++i) { - _func(itr.first.c_str(), itr.second.c_str()); + auto ret = func(info_type::callback_domain_idx, // kind + id, // operation + i, // arg_number + arg_list.at(i).first.c_str(), // arg_name + arg_list.at(i).second.c_str(), // arg_value_str + arg_addr.at(i), // arg_value_addr + user_data); + if(ret != 0) break; } } if constexpr(sizeof...(IdxTail) > 0) - hsa_api_iterate_args(id, _data, _func, std::index_sequence{}); + iterate_args(id, data, func, user_data, std::index_sequence{}); } template void -hsa_api_get_ids(std::vector& _id_list, std::index_sequence) +get_ids(std::vector& _id_list, std::index_sequence) { auto _emplace = [](auto& _vec, uint32_t _v) { if(_v < ROCPROFILER_HSA_API_ID_LAST) _vec.emplace_back(_v); @@ -300,7 +413,7 @@ hsa_api_get_ids(std::vector& _id_list, std::index_sequence) template void -hsa_api_get_names(std::vector& _name_list, std::index_sequence) +get_names(std::vector& _name_list, std::index_sequence) { auto _emplace = [](auto& _vec, const char* _v) { if(_v != nullptr && strnlen(_v, 1) > 0) _vec.emplace_back(_v); @@ -311,9 +424,42 @@ 
hsa_api_get_names(std::vector& _name_list, std::index_sequence void -hsa_api_update_table(hsa_api_table_t* _orig, std::index_sequence) +update_table(hsa_api_table_t* _orig, std::index_sequence) { + static auto _should_wrap_functor = + [](auto _callback_domain, auto _buffered_domain, auto _operation) { + for(const auto& itr : context::get_registered_contexts()) + { + if(!itr) continue; + + if(itr->callback_tracer) + { + // domain not enabled so skip to next callback_tracer + if(!itr->callback_tracer->domains(_callback_domain)) continue; + + // if the given domain + op is enabled, we need to wrap + if(itr->callback_tracer->domains(_callback_domain, _operation)) return true; + } + + if(itr->buffered_tracer) + { + // domain not enabled so skip to next buffered_tracer + if(!itr->buffered_tracer->domains(_buffered_domain)) continue; + + // if the given domain + op is enabled, we need to wrap + if(itr->buffered_tracer->domains(_buffered_domain, _operation)) return true; + } + } + return false; + }; + (void) _should_wrap_functor; + auto _update = [](hsa_api_table_t* _orig_v, auto _info) { + // check to see if there are any contexts which enable this operation in the HSA API domain + if(!_should_wrap_functor( + _info.callback_domain_idx, _info.buffered_domain_idx, _info.operation_idx)) + return; + // 1. get the sub-table containing the function pointer // 2. get reference to function pointer in sub-table // 3. update function pointer with functor @@ -328,140 +474,57 @@ hsa_api_update_table(hsa_api_table_t* _orig, std::index_sequence) // check out the assembly here...
this compiles to a switch statement const char* -hsa_api_name(uint32_t id) +name_by_id(uint32_t id) { - return hsa_api_name(id, std::make_index_sequence{}); + return name_by_id(id, std::make_index_sequence{}); } uint32_t -hsa_api_id_by_name(const char* name) +id_by_name(const char* name) { - return hsa_api_id_by_name(name, std::make_index_sequence{}); -} - -std::string -hsa_api_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data) -{ - return hsa_api_data_string(id, _data, std::make_index_sequence{}); -} - -std::string -hsa_api_named_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data) -{ - return hsa_api_named_data_string( - id, _data, std::make_index_sequence{}); + return id_by_name(name, std::make_index_sequence{}); } void -hsa_api_iterate_args(uint32_t id, - const rocprofiler_hsa_trace_data_t& _data, - int (*_func)(const char*, const char*)) +iterate_args(uint32_t id, + const rocprofiler_hsa_api_callback_tracer_data_t& data, + rocprofiler_callback_tracing_operation_args_cb_t callback, + void* user_data) { - if(_func) - hsa_api_iterate_args( - id, _data, _func, std::make_index_sequence{}); + if(callback) + iterate_args( + id, data, callback, user_data, std::make_index_sequence{}); } std::vector -hsa_api_get_ids() +get_ids() { auto _data = std::vector{}; _data.reserve(ROCPROFILER_HSA_API_ID_LAST); - hsa_api_get_ids(_data, std::make_index_sequence{}); + get_ids(_data, std::make_index_sequence{}); return _data; } std::vector -hsa_api_get_names() +get_names() { auto _data = std::vector{}; _data.reserve(ROCPROFILER_HSA_API_ID_LAST); - hsa_api_get_names(_data, std::make_index_sequence{}); + get_names(_data, std::make_index_sequence{}); return _data; } void -hsa_api_set_callback(activity_functor_t _func) +set_callback(activity_functor_t _func) { auto&& _v = report_activity.load(); report_activity.compare_exchange_strong(_v, _func); } void -hsa_api_update_table(hsa_api_table_t* _orig) +update_table(hsa_api_table_t* _orig) { - if(_orig) 
hsa_api_update_table(_orig, std::make_index_sequence{}); + if(_orig) update_table(_orig, std::make_index_sequence{}); } } // namespace hsa } // namespace rocprofiler - -extern "C" { -bool -OnLoad(HsaApiTable* table, - uint64_t runtime_version, - uint64_t failed_tool_count, - const char* const* failed_tool_names) -{ - (void) runtime_version; - (void) failed_tool_count; - (void) failed_tool_names; - - fprintf(stderr, "[%s:%i] %s\n", __FILE__, __LINE__, __FUNCTION__); - - auto& _saved = rocprofiler::hsa::get_table(); - ::copyTables(table, &_saved); - - rocprofiler::hsa::hsa_api_update_table(table); - - return true; -} -} - -/* -#include - -int -main() -{ - rocprofiler::hsa::activity_functor_t _cb = - [](rocprofiler_tracer_activity_domain_t domain, uint32_t operation_id, void* data) { - const auto* _name = rocprofiler::hsa::hsa_api_name(operation_id); - auto _name_id = rocprofiler::hsa::hsa_api_id_by_name(_name); - auto& _data = *static_cast(data); - std::cout << "[cb] domain=" << domain << ", op_id=" << operation_id << ", data=" << data - << ", name=" << _name << ", name_id=" << _name_id << ", named_string='" - << rocprofiler::hsa::hsa_api_named_data_string(operation_id, _data) << "'" - << "\n"; - auto _func = [](const char* name, const char* value) { - std::cout << " " << std::setw(20) << name << " = " << value << "\n"; - return 0; - }; - rocprofiler::hsa::hsa_api_iterate_args(operation_id, _data, _func); - return 0; - }; - - rocprofiler::hsa::report_activity.store(_cb); - - { - double val = 40; - hsa_code_object_t code_object = {}; - hsa_code_object_info_t attribute = HSA_CODE_OBJECT_INFO_TYPE; - void* value = &val; - - auto _func = - rocprofiler::hsa::hsa_api_info::get_functor(); - _func(code_object, attribute, value); - } - - { - bool result = false; - uint16_t ext = 1; - uint16_t major = 4; - uint16_t minor = 2; - - auto _func = rocprofiler::hsa::hsa_api_info< - HSA_API_ID_hsa_system_extension_supported>::get_functor(); - _func(ext, major, minor, &result); - } -} 
-*/
diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.def.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.def.cpp
index 51ba7ecc8a..9a977c491f 100644
--- a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.def.cpp
+++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.def.cpp
@@ -28,204 +28,203 @@ HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, core_)
 HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, amd_ext_)
 HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, image_ext_)
-HSA_API_INFO_DEFINITION_0(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_init, hsa_init, hsa_init_fn)
-HSA_API_INFO_DEFINITION_0(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_shut_down, hsa_shut_down, hsa_shut_down_fn)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_info, hsa_system_get_info, hsa_system_get_info_fn, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_extension_supported, hsa_system_extension_supported, hsa_system_extension_supported_fn, extension, version_major, version_minor, result)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_extension_table, hsa_system_get_extension_table, hsa_system_get_extension_table_fn, extension, version_major, version_minor, table)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_iterate_agents, hsa_iterate_agents, hsa_iterate_agents_fn, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_info, hsa_agent_get_info, hsa_agent_get_info_fn, agent, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_create, hsa_queue_create, hsa_queue_create_fn, agent, size, type, callback, data, private_segment_size, group_segment_size, queue)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_soft_queue_create, hsa_soft_queue_create, hsa_soft_queue_create_fn, region, size, type, features, doorbell_signal, queue)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_destroy, hsa_queue_destroy, hsa_queue_destroy_fn, queue)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_inactivate, hsa_queue_inactivate, hsa_queue_inactivate_fn, queue)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire_fn, queue)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed_fn, queue)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire_fn, queue)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed_fn, queue)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed_fn, queue, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease_fn, queue, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl_fn, queue, expected, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire_fn, queue, expected, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed_fn, queue, expected, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease_fn, queue, expected, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl_fn, queue, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire_fn, queue, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed_fn, queue, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease_fn, queue, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed_fn, queue, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease_fn, queue, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_regions, hsa_agent_iterate_regions, hsa_agent_iterate_regions_fn, agent, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_region_get_info, hsa_region_get_info, hsa_region_get_info_fn, region, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_exception_policies, hsa_agent_get_exception_policies, hsa_agent_get_exception_policies_fn, agent, profile, mask)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_extension_supported, hsa_agent_extension_supported, hsa_agent_extension_supported_fn, extension, agent, version_major, version_minor, result)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_register, hsa_memory_register, hsa_memory_register_fn, ptr, size)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_deregister, hsa_memory_deregister, hsa_memory_deregister_fn, ptr, size)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_allocate, hsa_memory_allocate, hsa_memory_allocate_fn, region, size, ptr)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_free, hsa_memory_free, hsa_memory_free_fn, ptr)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_copy, hsa_memory_copy, hsa_memory_copy_fn, dst, src, size)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_assign_agent, hsa_memory_assign_agent, hsa_memory_assign_agent_fn, ptr, agent, access)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_create, hsa_signal_create, hsa_signal_create_fn, initial_value, num_consumers, consumers, signal)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_destroy, hsa_signal_destroy, hsa_signal_destroy_fn, signal)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_relaxed, hsa_signal_load_relaxed, hsa_signal_load_relaxed_fn, signal)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_scacquire, hsa_signal_load_scacquire, hsa_signal_load_scacquire_fn, signal)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_relaxed, hsa_signal_store_relaxed, hsa_signal_store_relaxed_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_screlease, hsa_signal_store_screlease, hsa_signal_store_screlease_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_relaxed, hsa_signal_wait_relaxed, hsa_signal_wait_relaxed_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_scacquire, hsa_signal_wait_scacquire, hsa_signal_wait_scacquire_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_relaxed, hsa_signal_and_relaxed, hsa_signal_and_relaxed_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacquire, hsa_signal_and_scacquire, hsa_signal_and_scacquire_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_screlease, hsa_signal_and_screlease, hsa_signal_and_screlease_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_relaxed, hsa_signal_or_relaxed, hsa_signal_or_relaxed_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacquire, hsa_signal_or_scacquire, hsa_signal_or_scacquire_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_screlease, hsa_signal_or_screlease, hsa_signal_or_screlease_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_relaxed, hsa_signal_xor_relaxed, hsa_signal_xor_relaxed_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacquire, hsa_signal_xor_scacquire, hsa_signal_xor_scacquire_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_screlease, hsa_signal_xor_screlease, hsa_signal_xor_screlease_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_screlease, hsa_signal_exchange_screlease, hsa_signal_exchange_screlease_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_relaxed, hsa_signal_add_relaxed, hsa_signal_add_relaxed_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacquire, hsa_signal_add_scacquire, hsa_signal_add_scacquire_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_screlease, hsa_signal_add_screlease, hsa_signal_add_screlease_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_screlease, hsa_signal_subtract_screlease, hsa_signal_subtract_screlease_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_relaxed, hsa_signal_cas_relaxed, hsa_signal_cas_relaxed_fn, signal, expected, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacquire, hsa_signal_cas_scacquire, hsa_signal_cas_scacquire_fn, signal, expected, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_screlease, hsa_signal_cas_screlease, hsa_signal_cas_screlease_fn, signal, expected, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl_fn, signal, expected, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_from_name, hsa_isa_from_name, hsa_isa_from_name_fn, name, isa)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info, hsa_isa_get_info, hsa_isa_get_info_fn, isa, attribute, index, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_compatible, hsa_isa_compatible, hsa_isa_compatible_fn, code_object_isa, agent_isa, result)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_serialize, hsa_code_object_serialize, hsa_code_object_serialize_fn, code_object, alloc_callback, callback_data, options, serialized_code_object, serialized_code_object_size)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_deserialize, hsa_code_object_deserialize, hsa_code_object_deserialize_fn, serialized_code_object, serialized_code_object_size, options, code_object)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_destroy, hsa_code_object_destroy, hsa_code_object_destroy_fn, code_object)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_info, hsa_code_object_get_info, hsa_code_object_get_info_fn, code_object, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol, hsa_code_object_get_symbol, hsa_code_object_get_symbol_fn, code_object, symbol_name, symbol)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_symbol_get_info, hsa_code_symbol_get_info, hsa_code_symbol_get_info_fn, code_symbol, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols_fn, code_object, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create, hsa_executable_create, hsa_executable_create_fn, profile, executable_state, options, executable)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_destroy, hsa_executable_destroy, hsa_executable_destroy_fn, executable)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_code_object, hsa_executable_load_code_object, hsa_executable_load_code_object_fn, executable, agent, code_object, options)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_freeze, hsa_executable_freeze, hsa_executable_freeze_fn, executable, options)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_info, hsa_executable_get_info, hsa_executable_get_info_fn, executable, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_global_variable_define, hsa_executable_global_variable_define, hsa_executable_global_variable_define_fn, executable, variable_name, address)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define_fn, executable, agent, variable_name, address)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define_fn, executable, agent, variable_name, address)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate, hsa_executable_validate, hsa_executable_validate_fn, executable, result)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol, hsa_executable_get_symbol, hsa_executable_get_symbol_fn, executable, module_name, symbol_name, agent, call_convention, symbol)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_symbol_get_info, hsa_executable_symbol_get_info, hsa_executable_symbol_get_info_fn, executable_symbol, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_symbols, hsa_executable_iterate_symbols, hsa_executable_iterate_symbols_fn, executable, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_status_string, hsa_status_string, hsa_status_string_fn, status, status_string)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_extension_get_name, hsa_extension_get_name, hsa_extension_get_name_fn, extension, name)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_major_extension_supported, hsa_system_major_extension_supported, hsa_system_major_extension_supported_fn, extension, version_major, version_minor, result)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_major_extension_table, hsa_system_get_major_extension_table, hsa_system_get_major_extension_table_fn, extension, version_major, table_length, table)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_major_extension_supported, hsa_agent_major_extension_supported, hsa_agent_major_extension_supported_fn, extension, agent, version_major, version_minor, result)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_cache_get_info, hsa_cache_get_info, hsa_cache_get_info_fn, cache, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_caches, hsa_agent_iterate_caches, hsa_agent_iterate_caches_fn, agent, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease_fn, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_create, hsa_signal_group_create, hsa_signal_group_create_fn, num_signals, signals, num_consumers, consumers, signal_group)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_destroy, hsa_signal_group_destroy, hsa_signal_group_destroy_fn, signal_group)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_isas, hsa_agent_iterate_isas, hsa_agent_iterate_isas_fn, agent, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info_alt, hsa_isa_get_info_alt, hsa_isa_get_info_alt_fn, isa, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_exception_policies, hsa_isa_get_exception_policies, hsa_isa_get_exception_policies_fn, isa, profile, mask)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_round_method, hsa_isa_get_round_method, hsa_isa_get_round_method_fn, isa, fp_type, flush_mode, round_method)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_wavefront_get_info, hsa_wavefront_get_info, hsa_wavefront_get_info_fn, wavefront, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts_fn, isa, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name_fn, code_object, module_name, symbol_name, symbol)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file_fn, file, code_object_reader)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory_fn, code_object, size, code_object_reader)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_destroy, hsa_code_object_reader_destroy, hsa_code_object_reader_destroy_fn, code_object_reader)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create_alt, hsa_executable_create_alt, hsa_executable_create_alt_fn, profile, default_float_rounding_mode, options, executable)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_program_code_object, hsa_executable_load_program_code_object, hsa_executable_load_program_code_object_fn, executable, code_object_reader, options, loaded_code_object)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object_fn, executable, agent, code_object_reader, options, loaded_code_object)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate_alt, hsa_executable_validate_alt, hsa_executable_validate_alt_fn, executable, options, result)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name_fn, executable, symbol_name, agent, symbol)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols_fn, executable, agent, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols_fn, executable, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_get_type, hsa_amd_coherency_get_type, hsa_amd_coherency_get_type_fn, agent, type)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_set_type, hsa_amd_coherency_set_type, hsa_amd_coherency_set_type_fn, agent, type)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled_fn, queue, enable)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable_fn, enable)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time_fn, agent, signal, time)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time_fn, signal, time)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain_fn, agent, agent_tick, system_tick)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_async_handler, hsa_amd_signal_async_handler, hsa_amd_signal_async_handler_fn, signal, cond, value, handler, arg)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_async_function, hsa_amd_async_function, hsa_amd_async_function_fn, callback, arg)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_wait_any, hsa_amd_signal_wait_any, hsa_amd_signal_wait_any_fn, signal_count, signals, conds, values, timeout_hint, wait_hint, satisfying_value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask_fn, queue, num_cu_mask_count, cu_mask)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info_fn, memory_pool, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools_fn, agent, callback, data)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate_fn, memory_pool, size, flags, ptr)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_free, hsa_amd_memory_pool_free, hsa_amd_memory_pool_free_fn, ptr)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy, hsa_amd_memory_async_copy, hsa_amd_memory_async_copy_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal, engine_id, force_copy_on_sdma)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status_fn, dst_agent, src_agent, engine_ids_mask)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info_fn, agent, memory_pool, attribute, value)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agents_allow_access, hsa_amd_agents_allow_access, hsa_amd_agents_allow_access_fn, num_agents, agents, flags, ptr)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate_fn, src_memory_pool, dst_memory_pool, result)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_migrate, hsa_amd_memory_migrate, hsa_amd_memory_migrate_fn, ptr, memory_pool, flags)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock, hsa_amd_memory_lock, hsa_amd_memory_lock_fn, host_ptr, size, agents, num_agent, agent_ptr)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_unlock, hsa_amd_memory_unlock, hsa_amd_memory_unlock_fn, host_ptr)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_fill, hsa_amd_memory_fill, hsa_amd_memory_fill_fn, ptr, value, count)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer_fn, num_agents, agents, interop_handle, flags, size, ptr, metadata_size, metadata)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer_fn, ptr)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_image_create, hsa_amd_image_create, hsa_amd_image_create_fn, agent, image_descriptor, image_layout, image_data, access_permission, image)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info, hsa_amd_pointer_info, hsa_amd_pointer_info_fn, ptr, info, alloc, num_agents_accessible, accessible)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata_fn, ptr, userdata)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create_fn, ptr, len, handle)
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach_fn, handle, len, num_agents, mapping_agents, mapped_ptr) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach_fn, mapped_ptr) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_create, hsa_amd_signal_create, hsa_amd_signal_create_fn, initial_value, num_consumers, consumers, attributes, signal) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create_fn, signal, handle) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach_fn, handle, signal) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler_fn, callback, data) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_set_priority, hsa_amd_queue_set_priority, hsa_amd_queue_set_priority_fn, queue, priority) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect_fn, dst, dst_offset, src, src_offset, range, 
copy_agent, dir, num_dep_signals, dep_signals, completion_signal) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool_fn, host_ptr, size, agents, num_agent, pool, flags, agent_ptr) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback_fn, ptr, callback, user_data) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback_fn, ptr, callback) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer_fn, signal, value_ptr) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set_fn, ptr, size, attribute_list, attribute_count) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get_fn, ptr, size, attribute_list, attribute_count) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async_fn, ptr, size, agent, num_dep_signals, dep_signals, completion_signal) 
-HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_acquire, hsa_amd_spm_acquire, hsa_amd_spm_acquire_fn, preferred_agent) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_release, hsa_amd_spm_release, hsa_amd_spm_release_fn, preferred_agent) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer_fn, preferred_agent, size_in_bytes, timeout, size_copied, dest, is_data_loss) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask_fn, queue, num_cu_mask_count, cu_mask) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf_fn, ptr, size, dmabuf, offset) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf_fn, dmabuf) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability, hsa_ext_image_get_capability, hsa_ext_image_get_capability_fn, agent, geometry, image_format, capability_mask) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info, hsa_ext_image_data_get_info, hsa_ext_image_data_get_info_fn, agent, image_descriptor, 
access_permission, image_data_info) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create, hsa_ext_image_create, hsa_ext_image_create_fn, agent, image_descriptor, image_data, access_permission, image) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_import, hsa_ext_image_import, hsa_ext_image_import_fn, agent, src_memory, src_row_pitch, src_slice_pitch, dst_image, image_region) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_export, hsa_ext_image_export, hsa_ext_image_export_fn, agent, src_image, dst_memory, dst_row_pitch, dst_slice_pitch, image_region) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_copy, hsa_ext_image_copy, hsa_ext_image_copy_fn, agent, src_image, src_offset, dst_image, dst_offset, range) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_clear, hsa_ext_image_clear, hsa_ext_image_clear_fn, agent, image, data, image_region) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_destroy, hsa_ext_image_destroy, hsa_ext_image_destroy_fn, agent, image) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_create, hsa_ext_sampler_create, hsa_ext_sampler_create_fn, agent, sampler_descriptor, sampler) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_destroy, hsa_ext_sampler_destroy, 
hsa_ext_sampler_destroy_fn, agent, sampler) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout_fn, agent, geometry, image_format, image_data_layout, capability_mask) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout_fn, agent, image_descriptor, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image_data_info) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout_fn, agent, image_descriptor, image_data, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create_fn, agent_handle, size, type, callback, data, private_segment_size, group_segment_size, queue) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register_fn, queue, callback, user_data) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register_fn, callback, user_data) 
+HSA_API_INFO_DEFINITION_0(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_init, hsa_init, hsa_init_fn) +HSA_API_INFO_DEFINITION_0(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_shut_down, hsa_shut_down, hsa_shut_down_fn) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_info, hsa_system_get_info, hsa_system_get_info_fn, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_extension_supported, hsa_system_extension_supported, hsa_system_extension_supported_fn, extension, version_major, version_minor, result) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_extension_table, hsa_system_get_extension_table, hsa_system_get_extension_table_fn, extension, version_major, version_minor, table) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_iterate_agents, hsa_iterate_agents, hsa_iterate_agents_fn, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_info, hsa_agent_get_info, hsa_agent_get_info_fn, agent, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_create, hsa_queue_create, hsa_queue_create_fn, agent, size, type, callback, data, private_segment_size, group_segment_size, queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_soft_queue_create, hsa_soft_queue_create, hsa_soft_queue_create_fn, region, size, type, features, doorbell_signal, queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_destroy, hsa_queue_destroy, hsa_queue_destroy_fn, queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_inactivate, hsa_queue_inactivate, hsa_queue_inactivate_fn, 
queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire_fn, queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed_fn, queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire_fn, queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed_fn, queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed_fn, queue, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease_fn, queue, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl_fn, queue, expected, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire_fn, queue, expected, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed_fn, queue, expected, value) 
+HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease_fn, queue, expected, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl_fn, queue, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire_fn, queue, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed_fn, queue, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease_fn, queue, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed_fn, queue, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease_fn, queue, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_regions, hsa_agent_iterate_regions, hsa_agent_iterate_regions_fn, agent, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_region_get_info, hsa_region_get_info, hsa_region_get_info_fn, region, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, 
ROCPROFILER_HSA_API_ID_hsa_agent_get_exception_policies, hsa_agent_get_exception_policies, hsa_agent_get_exception_policies_fn, agent, profile, mask) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_extension_supported, hsa_agent_extension_supported, hsa_agent_extension_supported_fn, extension, agent, version_major, version_minor, result) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_register, hsa_memory_register, hsa_memory_register_fn, ptr, size) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_deregister, hsa_memory_deregister, hsa_memory_deregister_fn, ptr, size) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_allocate, hsa_memory_allocate, hsa_memory_allocate_fn, region, size, ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_free, hsa_memory_free, hsa_memory_free_fn, ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_copy, hsa_memory_copy, hsa_memory_copy_fn, dst, src, size) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_assign_agent, hsa_memory_assign_agent, hsa_memory_assign_agent_fn, ptr, agent, access) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_create, hsa_signal_create, hsa_signal_create_fn, initial_value, num_consumers, consumers, signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_destroy, hsa_signal_destroy, hsa_signal_destroy_fn, signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_relaxed, hsa_signal_load_relaxed, hsa_signal_load_relaxed_fn, signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, 
ROCPROFILER_HSA_API_ID_hsa_signal_load_scacquire, hsa_signal_load_scacquire, hsa_signal_load_scacquire_fn, signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_relaxed, hsa_signal_store_relaxed, hsa_signal_store_relaxed_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_screlease, hsa_signal_store_screlease, hsa_signal_store_screlease_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_relaxed, hsa_signal_wait_relaxed, hsa_signal_wait_relaxed_fn, signal, condition, compare_value, timeout_hint, wait_state_hint) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_scacquire, hsa_signal_wait_scacquire, hsa_signal_wait_scacquire_fn, signal, condition, compare_value, timeout_hint, wait_state_hint) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_relaxed, hsa_signal_and_relaxed, hsa_signal_and_relaxed_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacquire, hsa_signal_and_scacquire, hsa_signal_and_scacquire_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_screlease, hsa_signal_and_screlease, hsa_signal_and_screlease_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_relaxed, hsa_signal_or_relaxed, hsa_signal_or_relaxed_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacquire, 
hsa_signal_or_scacquire, hsa_signal_or_scacquire_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_screlease, hsa_signal_or_screlease, hsa_signal_or_screlease_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_relaxed, hsa_signal_xor_relaxed, hsa_signal_xor_relaxed_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacquire, hsa_signal_xor_scacquire, hsa_signal_xor_scacquire_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_screlease, hsa_signal_xor_screlease, hsa_signal_xor_screlease_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_screlease, hsa_signal_exchange_screlease, hsa_signal_exchange_screlease_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl_fn, signal, value) 
+HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_relaxed, hsa_signal_add_relaxed, hsa_signal_add_relaxed_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacquire, hsa_signal_add_scacquire, hsa_signal_add_scacquire_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_screlease, hsa_signal_add_screlease, hsa_signal_add_screlease_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_screlease, hsa_signal_subtract_screlease, hsa_signal_subtract_screlease_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_relaxed, hsa_signal_cas_relaxed, hsa_signal_cas_relaxed_fn, signal, expected, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacquire, hsa_signal_cas_scacquire, hsa_signal_cas_scacquire_fn, signal, expected, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, 
ROCPROFILER_HSA_API_ID_hsa_signal_cas_screlease, hsa_signal_cas_screlease, hsa_signal_cas_screlease_fn, signal, expected, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl_fn, signal, expected, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_from_name, hsa_isa_from_name, hsa_isa_from_name_fn, name, isa) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info, hsa_isa_get_info, hsa_isa_get_info_fn, isa, attribute, index, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_compatible, hsa_isa_compatible, hsa_isa_compatible_fn, code_object_isa, agent_isa, result) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_serialize, hsa_code_object_serialize, hsa_code_object_serialize_fn, code_object, alloc_callback, callback_data, options, serialized_code_object, serialized_code_object_size) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_deserialize, hsa_code_object_deserialize, hsa_code_object_deserialize_fn, serialized_code_object, serialized_code_object_size, options, code_object) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_destroy, hsa_code_object_destroy, hsa_code_object_destroy_fn, code_object) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_info, hsa_code_object_get_info, hsa_code_object_get_info_fn, code_object, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol, hsa_code_object_get_symbol, hsa_code_object_get_symbol_fn, code_object, symbol_name, symbol) 
+HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_symbol_get_info, hsa_code_symbol_get_info, hsa_code_symbol_get_info_fn, code_symbol, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols_fn, code_object, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create, hsa_executable_create, hsa_executable_create_fn, profile, executable_state, options, executable) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_destroy, hsa_executable_destroy, hsa_executable_destroy_fn, executable) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_code_object, hsa_executable_load_code_object, hsa_executable_load_code_object_fn, executable, agent, code_object, options) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_freeze, hsa_executable_freeze, hsa_executable_freeze_fn, executable, options) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_info, hsa_executable_get_info, hsa_executable_get_info_fn, executable, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_global_variable_define, hsa_executable_global_variable_define, hsa_executable_global_variable_define_fn, executable, variable_name, address) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define_fn, executable, agent, variable_name, address) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, 
ROCPROFILER_HSA_API_ID_hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define_fn, executable, agent, variable_name, address) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate, hsa_executable_validate, hsa_executable_validate_fn, executable, result) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol, hsa_executable_get_symbol, hsa_executable_get_symbol_fn, executable, module_name, symbol_name, agent, call_convention, symbol) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_symbol_get_info, hsa_executable_symbol_get_info, hsa_executable_symbol_get_info_fn, executable_symbol, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_symbols, hsa_executable_iterate_symbols, hsa_executable_iterate_symbols_fn, executable, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_status_string, hsa_status_string, hsa_status_string_fn, status, status_string) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_extension_get_name, hsa_extension_get_name, hsa_extension_get_name_fn, extension, name) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_major_extension_supported, hsa_system_major_extension_supported, hsa_system_major_extension_supported_fn, extension, version_major, version_minor, result) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_major_extension_table, hsa_system_get_major_extension_table, hsa_system_get_major_extension_table_fn, extension, version_major, table_length, table) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, 
ROCPROFILER_HSA_API_ID_hsa_agent_major_extension_supported, hsa_agent_major_extension_supported, hsa_agent_major_extension_supported_fn, extension, agent, version_major, version_minor, result) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_cache_get_info, hsa_cache_get_info, hsa_cache_get_info_fn, cache, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_caches, hsa_agent_iterate_caches, hsa_agent_iterate_caches_fn, agent, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease_fn, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_create, hsa_signal_group_create, hsa_signal_group_create_fn, num_signals, signals, num_consumers, consumers, signal_group) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_destroy, hsa_signal_group_destroy, hsa_signal_group_destroy_fn, signal_group) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, 
ROCPROFILER_HSA_API_ID_hsa_agent_iterate_isas, hsa_agent_iterate_isas, hsa_agent_iterate_isas_fn, agent, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info_alt, hsa_isa_get_info_alt, hsa_isa_get_info_alt_fn, isa, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_exception_policies, hsa_isa_get_exception_policies, hsa_isa_get_exception_policies_fn, isa, profile, mask) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_round_method, hsa_isa_get_round_method, hsa_isa_get_round_method_fn, isa, fp_type, flush_mode, round_method) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_wavefront_get_info, hsa_wavefront_get_info, hsa_wavefront_get_info_fn, wavefront, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts_fn, isa, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name_fn, code_object, module_name, symbol_name, symbol) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file_fn, file, code_object_reader) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory_fn, code_object, size, code_object_reader) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_destroy, 
hsa_code_object_reader_destroy, hsa_code_object_reader_destroy_fn, code_object_reader) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create_alt, hsa_executable_create_alt, hsa_executable_create_alt_fn, profile, default_float_rounding_mode, options, executable) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_program_code_object, hsa_executable_load_program_code_object, hsa_executable_load_program_code_object_fn, executable, code_object_reader, options, loaded_code_object) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object_fn, executable, agent, code_object_reader, options, loaded_code_object) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate_alt, hsa_executable_validate_alt, hsa_executable_validate_alt_fn, executable, options, result) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name_fn, executable, symbol_name, agent, symbol) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols_fn, executable, agent, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols_fn, executable, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_get_type, hsa_amd_coherency_get_type, hsa_amd_coherency_get_type_fn, agent, type) 
+HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_set_type, hsa_amd_coherency_set_type, hsa_amd_coherency_set_type_fn, agent, type) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled_fn, queue, enable) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable_fn, enable) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time_fn, agent, signal, time) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time_fn, signal, time) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain_fn, agent, agent_tick, system_tick) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_async_handler, hsa_amd_signal_async_handler, hsa_amd_signal_async_handler_fn, signal, cond, value, handler, arg) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_async_function, hsa_amd_async_function, hsa_amd_async_function_fn, callback, arg) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_wait_any, hsa_amd_signal_wait_any, hsa_amd_signal_wait_any_fn, signal_count, signals, conds, values, timeout_hint, wait_hint, satisfying_value) 
+HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask_fn, queue, num_cu_mask_count, cu_mask) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info_fn, memory_pool, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools_fn, agent, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate_fn, memory_pool, size, flags, ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_free, hsa_amd_memory_pool_free, hsa_amd_memory_pool_free_fn, ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy, hsa_amd_memory_async_copy, hsa_amd_memory_async_copy_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal, engine_id, force_copy_on_sdma) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status_fn, dst_agent, src_agent, engine_ids_mask) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info, 
hsa_amd_agent_memory_pool_get_info_fn, agent, memory_pool, attribute, value) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agents_allow_access, hsa_amd_agents_allow_access, hsa_amd_agents_allow_access_fn, num_agents, agents, flags, ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate_fn, src_memory_pool, dst_memory_pool, result) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_migrate, hsa_amd_memory_migrate, hsa_amd_memory_migrate_fn, ptr, memory_pool, flags) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock, hsa_amd_memory_lock, hsa_amd_memory_lock_fn, host_ptr, size, agents, num_agent, agent_ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_unlock, hsa_amd_memory_unlock, hsa_amd_memory_unlock_fn, host_ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_fill, hsa_amd_memory_fill, hsa_amd_memory_fill_fn, ptr, value, count) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer_fn, num_agents, agents, interop_handle, flags, size, ptr, metadata_size, metadata) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer_fn, ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_image_create, hsa_amd_image_create, hsa_amd_image_create_fn, agent, image_descriptor, image_layout, image_data, access_permission, image) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, 
ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info, hsa_amd_pointer_info, hsa_amd_pointer_info_fn, ptr, info, alloc, num_agents_accessible, accessible) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata_fn, ptr, userdata) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create_fn, ptr, len, handle) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach_fn, handle, len, num_agents, mapping_agents, mapped_ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach_fn, mapped_ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_create, hsa_amd_signal_create, hsa_amd_signal_create_fn, initial_value, num_consumers, consumers, attributes, signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create_fn, signal, handle) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach_fn, handle, signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler_fn, callback, data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_set_priority, hsa_amd_queue_set_priority, hsa_amd_queue_set_priority_fn, queue, priority) 
+HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect_fn, dst, dst_offset, src, src_offset, range, copy_agent, dir, num_dep_signals, dep_signals, completion_signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool_fn, host_ptr, size, agents, num_agent, pool, flags, agent_ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback_fn, ptr, callback, user_data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback_fn, ptr, callback) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer_fn, signal, value_ptr) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set_fn, ptr, size, attribute_list, attribute_count) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get_fn, ptr, size, attribute_list, attribute_count) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async_fn, ptr, size, agent, num_dep_signals, dep_signals, completion_signal) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_acquire, 
hsa_amd_spm_acquire, hsa_amd_spm_acquire_fn, preferred_agent) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_release, hsa_amd_spm_release, hsa_amd_spm_release_fn, preferred_agent) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer_fn, preferred_agent, size_in_bytes, timeout, size_copied, dest, is_data_loss) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask_fn, queue, num_cu_mask_count, cu_mask) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf_fn, ptr, size, dmabuf, offset) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf_fn, dmabuf) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability, hsa_ext_image_get_capability, hsa_ext_image_get_capability_fn, agent, geometry, image_format, capability_mask) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info, hsa_ext_image_data_get_info, hsa_ext_image_data_get_info_fn, agent, image_descriptor, access_permission, image_data_info) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create, hsa_ext_image_create, hsa_ext_image_create_fn, agent, image_descriptor, image_data, access_permission, image) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_import, hsa_ext_image_import, hsa_ext_image_import_fn, agent, src_memory, src_row_pitch, src_slice_pitch, 
dst_image, image_region) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_export, hsa_ext_image_export, hsa_ext_image_export_fn, agent, src_image, dst_memory, dst_row_pitch, dst_slice_pitch, image_region) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_copy, hsa_ext_image_copy, hsa_ext_image_copy_fn, agent, src_image, src_offset, dst_image, dst_offset, range) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_clear, hsa_ext_image_clear, hsa_ext_image_clear_fn, agent, image, data, image_region) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_destroy, hsa_ext_image_destroy, hsa_ext_image_destroy_fn, agent, image) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_create, hsa_ext_sampler_create, hsa_ext_sampler_create_fn, agent, sampler_descriptor, sampler) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_destroy, hsa_ext_sampler_destroy, hsa_ext_sampler_destroy_fn, agent, sampler) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout_fn, agent, geometry, image_format, image_data_layout, capability_mask) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout_fn, agent, image_descriptor, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image_data_info) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create_with_layout, 
hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout_fn, agent, image_descriptor, image_data, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create_fn, agent_handle, size, type, callback, data, private_segment_size, group_segment_size, queue) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register_fn, queue, callback, user_data) +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register_fn, callback, user_data) // clang-format on #if HSA_AMD_EXT_API_TABLE_MAJOR_VERSION >= 0x02 -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_address_reserve, hsa_amd_vmem_address_reserve, hsa_amd_vmem_address_reserve_fn, @@ -233,15 +232,13 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, size, address, flags) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_address_free, hsa_amd_vmem_address_free, hsa_amd_vmem_address_free_fn, ptr, size) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_handle_create, hsa_amd_vmem_handle_create, hsa_amd_vmem_handle_create_fn, @@ -250,14 +247,12 @@ 
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, type, flags, memory_handle) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_handle_release, hsa_amd_vmem_handle_release, hsa_amd_vmem_handle_release_fn, memory_handle) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_map, hsa_amd_vmem_map, hsa_amd_vmem_map_fn, @@ -266,15 +261,13 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, in_offset, memory_handle, flags) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_unmap, hsa_amd_vmem_unmap, hsa_amd_vmem_unmap_fn, va, size) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_set_access, hsa_amd_vmem_set_access, hsa_amd_vmem_set_access_fn, @@ -282,38 +275,33 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, size, desc, desc_cnt) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_get_access, hsa_amd_vmem_get_access, hsa_amd_vmem_get_access_fn, va, perms, agent_handle) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_export_shareable_handle, 
hsa_amd_vmem_export_shareable_handle, hsa_amd_vmem_export_shareable_handle_fn, dmabuf_fd, handle, flags) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_import_shareable_handle, hsa_amd_vmem_import_shareable_handle, hsa_amd_vmem_import_shareable_handle_fn, dmabuf_fd, handle) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_retain_alloc_handle, hsa_amd_vmem_retain_alloc_handle, hsa_amd_vmem_retain_alloc_handle_fn, handle, addr) -HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_HSA_API_TABLE_ID_AmdExt, +HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_vmem_get_alloc_properties_from_handle, hsa_amd_vmem_get_alloc_properties_from_handle, hsa_amd_vmem_get_alloc_properties_from_handle_fn, diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.hpp index 138fed459e..78eb3991ea 100644 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.hpp +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/hsa.hpp @@ -29,9 +29,9 @@ namespace rocprofiler { namespace hsa { -using activity_functor_t = int (*)(rocprofiler_tracer_activity_domain_t domain, - uint32_t operation_id, - void* data); +using activity_functor_t = int (*)(rocprofiler_service_callback_tracing_kind_t domain, + uint32_t operation_id, + void* data); using hsa_api_table_t = HsaApiTable; @@ -44,14 +44,11 @@ struct hsa_table_lookup; template struct hsa_api_impl { - template - static auto phase_enter(DataT& _data, DataArgsT&, Args... args); + template + static auto set_data_args(DataArgsT&, Args... 
args); - template - static auto phase_exit(DataT& _data); - - template - static auto exec(DataT& _data, FuncT&&, Args&&... args); + template + static auto exec(FuncT&&, Args&&... args); template static auto functor(Args&&... args); @@ -61,39 +58,27 @@ template struct hsa_api_info; const char* -hsa_api_name(uint32_t id); +name_by_id(uint32_t id); uint32_t -hsa_api_id_by_name(const char* name); - -std::string -hsa_api_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data); - -std::string -hsa_api_named_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data); +id_by_name(const char* name); void -hsa_api_iterate_args(uint32_t id, - const rocprofiler_hsa_trace_data_t& _data, - int (*_func)(const char*, const char*)); +iterate_args(uint32_t id, + const rocprofiler_hsa_api_callback_tracer_data_t& data, + rocprofiler_callback_tracing_operation_args_cb_t callback, + void* user_data); std::vector -hsa_api_get_names(); +get_names(); std::vector -hsa_api_get_ids(); +get_ids(); void -hsa_api_set_callback(activity_functor_t _func); +set_callback(activity_functor_t _func); + +void +update_table(hsa_api_table_t* _orig); } // namespace hsa } // namespace rocprofiler - -extern "C" { -using on_load_t = bool (*)(HsaApiTable*, uint64_t, uint64_t, const char* const*); - -bool -OnLoad(HsaApiTable* table, - uint64_t runtime_version, - uint64_t failed_tool_count, - const char* const* failed_tool_names) ROCPROFILER_PUBLIC_API; -} diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/utils.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/utils.hpp index fd3a361bd4..59cd4b3b09 100644 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/utils.hpp +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/hsa/utils.hpp @@ -45,70 +45,47 @@ namespace hsa { namespace utils { -template ::value, int> = 0> -std::string -stringize_impl(Tp _v, int) -{ - return fmt::format("{}", _v); -} - template -std::string -stringize_impl(Tp _v, long) +struct is_pair_impl 
{ - auto _ss = std::stringstream{}; - _ss << _v; - return _ss.str(); -} + static constexpr auto value = false; +}; template -auto -stringize_impl(const std::pair& _v, int) +struct is_pair_impl> { - return std::make_pair(stringize_impl(_v.first, 0), stringize_impl(_v.second, 0)); -} - -struct join_args -{ - std::string_view prefix = {}; - std::string_view suffix = {}; - std::string_view separator = {}; + static constexpr auto value = true; }; template -std::string -join_impl(const Tp& _v) -{ - return stringize_impl(_v, 0); -} +struct is_pair : is_pair_impl>>> +{}; -template -std::string -join_impl(const std::pair& _v) -{ - return fmt::format("{}={}", join_impl(_v.first), join_impl(_v.second)); -} - -template +template auto -join(join_args ja, Args... args) +stringize_impl(const Tp& _v) { - auto _content = std::string{}; + if constexpr(is_pair::value) + { + return std::make_pair(stringize_impl(_v.first), stringize_impl(_v.second)); + } + else if constexpr(fmt::is_formattable::value && !std::is_pointer::value) + { + return fmt::format("{}", _v); + } + else { auto _ss = std::stringstream{}; - ((_ss << ja.separator << join_impl(args)), ...); - auto _v = _ss.str(); - if(_v.length() > ja.separator.length()) _content = _v.substr(2); + _ss << _v; + return _ss.str(); } - - return (std::stringstream{} << ja.prefix << _content << ja.suffix).str(); } template auto stringize(Args... 
args) { - return std::vector>{stringize_impl(args, 0)...}; + return std::vector>{stringize_impl(args)...}; } template diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/internal_threading.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/internal_threading.cpp new file mode 100644 index 0000000000..a52834f754 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/internal_threading.cpp @@ -0,0 +1,279 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +#include +#include +#include + +#include "lib/common/container/stable_vector.hpp" +#include "lib/rocprofiler/buffer.hpp" +#include "lib/rocprofiler/context/context.hpp" +#include "lib/rocprofiler/internal_threading.hpp" + +#include +#include +#include +#include + +namespace rocprofiler +{ +namespace internal_threading +{ +namespace +{ +template +using library_sequence_t = std::integer_sequence; +using creation_notifier_cb_t = void (*)(rocprofiler_internal_thread_library_t, void*); +using thread_pool_config_t = PTL::ThreadPool::Config; + +// this is used to loop over the different libraries +constexpr auto creation_notifier_library_seq = library_sequence_t{}; + +// check that creation_notifier_library_seq is up to date +static_assert((1 << (creation_notifier_library_seq.size() - 1)) == ROCPROFILER_LIBRARY_LAST, + "Update creation_notifier_library_seq to include new libraries"); + +// used to distinguish invoking pre vs. post at compile-time +enum class notifier_stage +{ + precreation = 0, + postcreation, +}; + +// data structure holding list of callbacks +template +struct creation_notifier +{ + static constexpr auto value = LibT; + + std::vector precreate_callbacks = {}; + std::vector postcreate_callbacks = {}; + std::vector user_data = {}; + std::mutex mutex = {}; +}; + +// static accessor for creation_notifier instance +template +auto& +get_creation_notifier() +{ + static auto _v = creation_notifier{}; + return _v; +} + +// adds callbacks to creation_notifier instance(s) +template +void +update_creation_notifiers(creation_notifier_cb_t pre, + creation_notifier_cb_t post, + int libs, + void* data, + library_sequence_t) +{ + auto update = [pre, post, libs, data](auto& notifier) { + if(libs == 0 || ((libs & notifier.value) == notifier.value)) + { + notifier.mutex.lock(); + notifier.precreate_callbacks.emplace_back(pre); + notifier.postcreate_callbacks.emplace_back(post); + notifier.user_data.emplace_back(data); + notifier.mutex.unlock(); + } + }; + + 
(update(get_creation_notifier()), ...); +} + +// invokes creation notifiers +template +void +execute_creation_notifiers(rocprofiler_internal_thread_library_t libs, + std::integer_sequence) +{ + auto execute = [libs](auto& notifier) { + if(((libs & notifier.value) == notifier.value)) + { + notifier.mutex.lock(); + if constexpr(StageT == notifier_stage::precreation) + { + for(size_t i = 0; i < notifier.precreate_callbacks.size(); ++i) + { + auto itr = notifier.precreate_callbacks.at(i); + if(itr) itr(notifier.value, notifier.user_data.at(i)); + } + } + else if constexpr(StageT == notifier_stage::postcreation) + { + for(size_t i = 0; i < notifier.postcreate_callbacks.size(); ++i) + { + auto itr = notifier.postcreate_callbacks.at(i); + if(itr) itr(notifier.value, notifier.user_data.at(i)); + } + } + notifier.mutex.unlock(); + } + }; + + (execute(get_creation_notifier()), ...); +} + +auto& +get_thread_pools() +{ + static auto _v = thread_pool_vec_t{}; + return _v; +} + +auto& +get_task_groups() +{ + static auto _v = task_group_vec_t{}; + return _v; +} +} // namespace + +// initialize the default thread pool +void +initialize() +{ + static auto _once = std::once_flag{}; + std::call_once(_once, create_callback_thread); +} + +// sync all the task groups and destroy the thread pools +void +finalize() +{ + for(auto& itr : get_task_groups()) + { + if(itr) itr->join(); + } + + for(auto& itr : get_thread_pools()) + { + if(itr) itr->destroy_threadpool(); + } + + for(auto& itr : get_task_groups()) + itr.reset(); + + for(auto& itr : get_thread_pools()) + itr.reset(); +} + +void +notify_pre_internal_thread_create(rocprofiler_internal_thread_library_t libs) +{ + execute_creation_notifiers(libs, creation_notifier_library_seq); +} + +void +notify_post_internal_thread_create(rocprofiler_internal_thread_library_t libs) +{ + execute_creation_notifiers(libs, creation_notifier_library_seq); +} + +rocprofiler_callback_thread_t +create_callback_thread() +{ + // notify that rocprofiler 
library is about to create an internal thread + notify_pre_internal_thread_create(ROCPROFILER_LIBRARY); + + // this will be index after emplace_back + auto idx = get_thread_pools().size(); + + auto& thr_pool = get_thread_pools().emplace_back( + new thread_pool_t{thread_pool_config_t{.pool_size = 1}}, [](thread_pool_t* v) { + v->destroy_threadpool(); + delete v; + }); + + // construct the task group to use the newly created thread pool + get_task_groups().emplace_back(new task_group_t{thr_pool.get()}); + + // notify that rocprofiler library finished creating an internal thread + notify_post_internal_thread_create(ROCPROFILER_LIBRARY); + + return rocprofiler_callback_thread_t{idx}; +} + +// returns the task group for the given callback thread identifier +task_group_t* +get_task_group(rocprofiler_callback_thread_t cb_tid) +{ + return get_task_groups().at(cb_tid.handle).get(); +} +} // namespace internal_threading +} // namespace rocprofiler + +extern "C" { +rocprofiler_status_t +rocprofiler_at_internal_thread_create(rocprofiler_internal_thread_library_cb_t precreate, + rocprofiler_internal_thread_library_cb_t postcreate, + int libs, + void* data) +{ + rocprofiler::internal_threading::update_creation_notifiers( + precreate, + postcreate, + libs, + data, + rocprofiler::internal_threading::creation_notifier_library_seq); + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_create_callback_thread(rocprofiler_callback_thread_t* cb_thread_id) +{ + rocprofiler::internal_threading::initialize(); + + auto cb_tid = rocprofiler::internal_threading::create_callback_thread(); + if(cb_tid.handle > 0) + { + *cb_thread_id = cb_tid; + return ROCPROFILER_STATUS_SUCCESS; + } + + return ROCPROFILER_STATUS_ERROR; +} + +rocprofiler_status_t ROCPROFILER_API +rocprofiler_assign_callback_thread(rocprofiler_buffer_id_t buffer_id, + rocprofiler_callback_thread_t cb_thread_id) +{ + if(cb_thread_id.handle >= rocprofiler::internal_threading::get_task_groups().size()) + 
return ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND; + + for(auto& bitr : rocprofiler::buffer::get_buffers()) + { + if(bitr && bitr->buffer_id == buffer_id.handle) + { + bitr->task_group_id = cb_thread_id.handle; + return ROCPROFILER_STATUS_SUCCESS; + } + } + return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND; +} +} diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/internal_threading.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/internal_threading.hpp new file mode 100644 index 0000000000..03730fd547 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/internal_threading.hpp @@ -0,0 +1,66 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. 
+ +#pragma once + +#include + +#include "lib/common/container/stable_vector.hpp" +#include "lib/common/defines.hpp" + +#include +#include + +#include +#include +#include + +namespace rocprofiler +{ +namespace internal_threading +{ +using thread_pool_t = PTL::ThreadPool; +using task_group_t = PTL::TaskGroup; +using unique_thread_pool_t = std::unique_ptr; +using unique_task_group_t = std::unique_ptr; +using thread_pool_vec_t = std::vector; +using task_group_vec_t = std::vector; + +void notify_pre_internal_thread_create(rocprofiler_internal_thread_library_t); +void notify_post_internal_thread_create(rocprofiler_internal_thread_library_t); + +// initialize the default thread pool +void +initialize(); + +// destroy all the thread pools +void +finalize(); + +// creates a new thread +rocprofiler_callback_thread_t +create_callback_thread(); + +// returns the task group for the given callback thread identifier +task_group_t* get_task_group(rocprofiler_callback_thread_t); +} // namespace internal_threading +} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/registration.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/registration.cpp new file mode 100644 index 0000000000..78c9dec108 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/registration.cpp @@ -0,0 +1,556 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the 
Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#include "lib/rocprofiler/registration.hpp" +#include "lib/rocprofiler/context/context.hpp" +#include "lib/rocprofiler/hsa/hsa.hpp" +#include "lib/rocprofiler/internal_threading.hpp" + +#include +#include +#include +#include + +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +extern "C" { +#pragma weak rocprofiler_configure + +extern rocprofiler_tool_configure_result_t* +rocprofiler_configure(uint32_t, const char*, uint32_t, rocprofiler_client_id_t*); +} + +namespace rocprofiler +{ +namespace registration +{ +namespace +{ +auto& +get_status() +{ + static auto _v = std::pair, std::atomic>{0, 0}; + return _v; +} + +auto& +get_invoked_configures() +{ + static auto _v = std::unordered_set{}; + return _v; +} + +auto& +get_forced_configure() +{ + static rocprofiler_configure_func_t _v = nullptr; + return _v; +} + +void +init_logging() +{ + static auto _once = std::once_flag{}; + std::call_once(_once, []() { + auto get_argv0 = []() { + auto ifs = std::ifstream{"/proc/self/cmdline"}; + auto sarg = std::string{}; + while(ifs && !ifs.eof()) + { + ifs >> sarg; + if(!sarg.empty()) break; + } + return sarg; + }; + + static auto argv0 = get_argv0(); + google::InitGoogleLogging(argv0.c_str()); + LOG(INFO) << "logging initialized"; + }); +} + +std::vector +get_link_map() +{ + auto chain = std::vector{}; + void* handle = nullptr; + handle = dlopen(nullptr, 
RTLD_LAZY | RTLD_NOLOAD); + + if(handle) + { + struct link_map* link_map_v = nullptr; + dlinfo(handle, RTLD_DI_LINKMAP, &link_map_v); + struct link_map* next_link = link_map_v->l_next; + while(next_link) + { + if(next_link->l_name != nullptr && !std::string_view{next_link->l_name}.empty()) + { + chain.emplace_back(next_link->l_name); + } + next_link = next_link->l_next; + } + } + + return chain; +} + +struct client_library +{ + std::string name = {}; + void* dlhandle = nullptr; + decltype(::rocprofiler_configure)* configure_func = nullptr; + std::unique_ptr configure_result = {}; + rocprofiler_client_id_t internal_client_id = {}; + rocprofiler_client_id_t mutable_client_id = {}; +}; + +std::vector +find_clients() +{ + auto data = std::vector{}; + + if(get_forced_configure()) + { + data.emplace_back(client_library{"(forced)", nullptr, get_forced_configure()}); + } + + if(!rocprofiler_configure && !get_forced_configure()) + { + LOG(ERROR) << "no rocprofiler_configure function found"; + return data; + } + + if(rocprofiler_configure != &rocprofiler_configure) + throw std::runtime_error("rocprofiler_configure != &rocprofiler_configure"); + + if(&rocprofiler_configure != get_forced_configure()) + data.emplace_back(client_library{"unknown", nullptr, &rocprofiler_configure}); + + for(const auto& itr : get_link_map()) + { + LOG(INFO) << "searching " << itr << " for rocprofiler_configure"; + + void* handle = dlopen(itr.c_str(), RTLD_LAZY | RTLD_NOLOAD); + LOG_IF(ERROR, handle == nullptr) << "error dlopening " << itr; + + decltype(::rocprofiler_configure)* _sym = nullptr; + *(void**) (&_sym) = dlsym(handle, "rocprofiler_configure"); + + // skip the configure function that was forced + if(_sym == get_forced_configure()) + { + data.front().name = itr; + data.front().dlhandle = handle; + data.front().internal_client_id.name = "(forced)"; + continue; + } + + if(!_sym) + { + LOG(INFO) << "|_" << itr << " did not contain rocprofiler_configure symbol"; + continue; + } + + if(_sym == 
&rocprofiler_configure && data.size() == 1) + { + data.front().name = itr; + data.front().dlhandle = handle; + data.front().internal_client_id.name = "default"; + } + else + { + uint32_t _prio = data.size(); + auto& entry = + data.emplace_back(client_library{itr, + handle, + _sym, + nullptr, + rocprofiler_client_id_t{nullptr, _prio}, + rocprofiler_client_id_t{nullptr, _prio}}); + entry.internal_client_id.name = entry.name.c_str(); + } + } + + LOG(ERROR) << __FUNCTION__ << " found " << data.size() << " clients"; + + return data; +} + +std::vector& +get_clients() +{ + static auto _v = find_clients(); + return _v; +} + +using mutex_t = std::recursive_mutex; +using scoped_lock_t = std::unique_lock; + +mutex_t& +get_registration_mutex() +{ + static auto _v = mutex_t{}; + return _v; +} +} // namespace + +int +get_init_status() +{ + return get_status().first.load(std::memory_order_acquire); +} + +int +get_fini_status() +{ + return get_status().second.load(std::memory_order_acquire); +} + +void +set_init_status(int v) +{ + get_status().first.store(v, std::memory_order_release); +} + +void +set_fini_status(int v) +{ + get_status().second.store(v, std::memory_order_release); +} + +bool +invoke_client_configures() +{ + if(get_init_status() > 0) return false; + + auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock}; + if(_lk.owns_lock()) return false; + _lk.lock(); + + LOG(ERROR) << __FUNCTION__; + + size_t prio = 0; + for(auto& itr : get_clients()) + { + if(get_invoked_configures().find(itr.configure_func) != get_invoked_configures().end()) + { + LOG(ERROR) << "rocprofiler::registration::invoke_client_configures() attempted to " + "invoke configure function from " + << itr.name << " (addr=" + << fmt::format("{:#018x}", reinterpret_cast(itr.configure_func)) + << ") more than once"; + continue; + } + else + { + LOG(INFO) << "rocprofiler::registration::invoke_client_configures() invoking configure " + "function from " + << itr.name << " (addr=" + << 
fmt::format("{:#018x}", reinterpret_cast(itr.configure_func)) + << ")"; + } + + auto* _result = itr.configure_func( + ROCPROFILER_VERSION, ROCPROFILER_VERSION_STRING, prio++, &itr.mutable_client_id); + if(_result) + itr.configure_result = std::make_unique(*_result); + + get_invoked_configures().emplace(itr.configure_func); + } + + return true; +} + +bool +invoke_client_initializers() +{ + if(get_init_status() > 0) return false; + + auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock}; + if(_lk.owns_lock()) return false; + _lk.lock(); + + LOG(ERROR) << __FUNCTION__; + + set_init_status(-1); + for(auto& itr : get_clients()) + { + if(itr.configure_result && itr.configure_result->initialize) + { + context::push_client(itr.internal_client_id.handle); + itr.configure_result->initialize(&invoke_client_finalizer, + itr.configure_result->tool_data); + context::pop_client(itr.internal_client_id.handle); + // set to nullptr so initialize only gets called once + itr.configure_result->initialize = nullptr; + } + } + + // initialization is no longer available + set_init_status(1); + + return true; +} + +bool +invoke_client_finalizers() +{ + if(get_fini_status() > 0) return false; + + auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock}; + if(_lk.owns_lock()) return false; + _lk.lock(); + + set_fini_status(-1); + for(auto& itr : get_clients()) + { + if(itr.configure_result && itr.configure_result->finalize) + { + itr.configure_result->finalize(itr.configure_result->tool_data); + // set to nullptr so finalize only gets called once + itr.configure_result->finalize = nullptr; + } + } + + set_fini_status(1); + + return true; +} + +bool +invoke_client_initializer(rocprofiler_client_id_t client_id) +{ + if(get_init_status() > 0) return false; + + auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock}; + if(_lk.owns_lock()) return false; + _lk.lock(); + + // save the original status + auto _restore_status = get_init_status(); + 
set_init_status(-1); + for(auto& itr : get_clients()) + { + if(itr.internal_client_id.handle == client_id.handle && + itr.mutable_client_id.handle == client_id.handle) + { + if(itr.configure_result && itr.configure_result->initialize) + { + context::push_client(itr.internal_client_id.handle); + itr.configure_result->initialize(&invoke_client_finalizer, + itr.configure_result->tool_data); + context::pop_client(itr.internal_client_id.handle); + // set to nullptr so initialize only gets called once + itr.configure_result->initialize = nullptr; + } + } + } + + // we don't want the explicit client initialization to set the init status to 1 + // we just want to restore what it previously was + set_init_status(_restore_status); + + return true; +} + +void +invoke_client_finalizer(rocprofiler_client_id_t client_id) +{ + auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock}; + if(_lk.owns_lock()) return; + _lk.lock(); + + for(auto& itr : get_clients()) + { + if(itr.internal_client_id.handle == client_id.handle && + itr.mutable_client_id.handle == client_id.handle) + { + if(itr.configure_result && itr.configure_result->finalize) + { + itr.configure_result->finalize(itr.configure_result->tool_data); + // set to nullptr so finalize only gets called once + itr.configure_result->finalize = nullptr; + } + } + } +} + +void +initialize() +{ + static auto _once = std::once_flag{}; + static auto _ready = std::atomic{false}; + + std::call_once(_once, []() { + init_logging(); + invoke_client_configures(); + invoke_client_initializers(); + internal_threading::initialize(); + std::atexit(&finalize); + _ready.store(true, std::memory_order_release); + }); + + if(!_ready.load(std::memory_order_acquire)) + { + while(!_ready.load(std::memory_order_acquire)) + std::this_thread::yield(); + } +} + +void +finalize() +{ + hsa_shut_down(); + invoke_client_finalizers(); + for(auto& itr : rocprofiler::context::get_active_contexts()) + itr.store(nullptr, std::memory_order_seq_cst); + 
internal_threading::finalize(); +} +} // namespace registration +} // namespace rocprofiler + +extern "C" { +rocprofiler_status_t +rocprofiler_is_initialized(int* status) +{ + *status = rocprofiler::registration::get_init_status(); + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_is_finalized(int* status) +{ + *status = rocprofiler::registration::get_fini_status(); + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_force_configure(rocprofiler_configure_func_t configure_func) +{ + auto& forced_config = rocprofiler::registration::get_forced_configure(); + + // init status may be -1 (currently initializing) or 1 (already initialized). + // in either case, we want to ignore this function call + if(rocprofiler::registration::get_init_status() != 0) + return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED; + + // if another tool forced configure, the init status should be 1, but + // let's just make sure that the forced configure function is a nullptr + if(forced_config) return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED; + + forced_config = configure_func; + rocprofiler::registration::initialize(); + + return ROCPROFILER_STATUS_SUCCESS; +} + +int +rocprofiler_set_api_table(const char* name, + uint64_t lib_version, + uint64_t lib_instance, + void** tables, + uint64_t num_tables) +{ + static auto _once = std::once_flag{}; + std::call_once(_once, rocprofiler::registration::initialize); + + // pass to roctx init + LOG_IF(ERROR, num_tables == 0) << " rocprofiler expected " << name + << " library to pass at least one table, not " << num_tables; + LOG_IF(ERROR, tables == nullptr) << " rocprofiler expected pointer to array of tables from " + << name << " library, not a nullptr"; + + if(std::string_view{name} == "hip") + { + // pass to hip init + LOG_IF(ERROR, num_tables > 1) + << " rocprofiler expected HIP library to pass 1 API table, not " << num_tables; + } + else if(std::string_view{name} == "hsa") + { + // 
pass to hsa init + LOG_IF(ERROR, num_tables > 1) + << " rocprofiler expected HSA library to pass 1 API table, not " << num_tables; + + auto* hsa_api_table = static_cast(*tables); + auto& saved_hsa_api_table = rocprofiler::hsa::get_table(); + ::copyTables(hsa_api_table, &saved_hsa_api_table); + + rocprofiler::hsa::update_table(hsa_api_table); + } + else if(std::string_view{name} == "roctx") + { + // pass to roctx init + LOG_IF(ERROR, num_tables > 1) + << " rocprofiler expected ROCTX library to pass 1 API table, not " << num_tables; + } + else + { + LOG(ERROR) << "rocprofiler does not accept API tables from " << name; + LOG_ASSERT(false) << " rocprofiler does not accept API tables from " << name; + } + + (void) lib_version; + (void) lib_instance; + (void) tables; + (void) num_tables; + + return 0; +} + +bool +OnLoad(HsaApiTable* table, + uint64_t runtime_version, + uint64_t failed_tool_count, + const char* const* failed_tool_names) +{ + rocprofiler::registration::init_logging(); + + (void) runtime_version; + (void) failed_tool_count; + (void) failed_tool_names; + + fprintf(stderr, "[%s:%i] %s\n", __FILE__, __LINE__, __FUNCTION__); + + void* table_v = static_cast(table); + rocprofiler_set_api_table("hsa", runtime_version, 0, &table_v, 1); + + return true; +} +} diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/registration.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/registration.hpp new file mode 100644 index 0000000000..4943f3fd54 --- /dev/null +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/registration.hpp @@ -0,0 +1,95 @@ +// MIT License +// +// Copyright (c) 2023 ROCm Developer Tools +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and 
to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in all +// copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. + +#pragma once + +#include +#include "lib/common/defines.hpp" + +#include +#include +#include + +extern "C" { +struct HsaApiTable; + +using on_load_t = bool (*)(HsaApiTable*, uint64_t, uint64_t, const char* const*); + +bool +OnLoad(HsaApiTable* table, + uint64_t runtime_version, + uint64_t failed_tool_count, + const char* const* failed_tool_names) ROCPROFILER_PUBLIC_API; + +// this is the "hidden" function that rocprofiler-register invokes to pass +// the API tables to rocprofiler +int +rocprofiler_set_api_table(const char* name, + uint64_t lib_version, + uint64_t lib_instance, + void** tables, + uint64_t num_tables) ROCPROFILER_PUBLIC_API; +} + +namespace rocprofiler +{ +namespace registration +{ +// initialize the clients +void +initialize(); + +// finalize the clients +void +finalize(); + +// invoke all rocprofiler_configure symbols +bool +invoke_client_configures(); + +// invoke initialize functions returned from rocprofiler_configure +bool +invoke_client_initializers(); + +// invoke finalize functions returned from rocprofiler_configure +bool +invoke_client_finalizers(); + +// explicitly invoke the initialize function of a specific client +bool invoke_client_initializer(rocprofiler_client_id_t); + +// explicitly invoke the finalize 
function of a specific client +void invoke_client_finalizer(rocprofiler_client_id_t); + +int +get_init_status(); + +int +get_fini_status(); + +void +set_init_status(int); + +void +set_fini_status(int); +} // namespace registration +} // namespace rocprofiler diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/rocprofiler.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/rocprofiler.cpp index 0e1d25dcda..4fabea84f2 100644 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/rocprofiler.cpp +++ b/projects/rocprofiler-sdk/source/lib/rocprofiler/rocprofiler.cpp @@ -20,9 +20,16 @@ // OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE // SOFTWARE. +#include #include -#include +#include "lib/common/utility.hpp" +#include "lib/rocprofiler/context/context.hpp" +#include "lib/rocprofiler/context/domain.hpp" +#include "lib/rocprofiler/hsa/hsa.hpp" +#include "lib/rocprofiler/registration.hpp" + +#include #include namespace @@ -34,6 +41,22 @@ consume_args(Tp&&...) 
} // namespace extern "C" { +rocprofiler_status_t +rocprofiler_get_version(uint32_t* major, uint32_t* minor, uint32_t* patch) +{ + if(major) *major = ROCPROFILER_VERSION_MAJOR; + if(minor) *minor = ROCPROFILER_VERSION_MINOR; + if(patch) *patch = ROCPROFILER_VERSION_PATCH; + return ROCPROFILER_STATUS_SUCCESS; +} + +rocprofiler_status_t +rocprofiler_get_timestamp(rocprofiler_timestamp_t* ts) +{ + *ts = rocprofiler::common::timestamp_ns(); + return ROCPROFILER_STATUS_SUCCESS; +} + rocprofiler_status_t rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback, size_t agent_size, @@ -76,54 +99,6 @@ rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback, return callback(_agents.data(), _agents.size(), user_data); } -rocprofiler_status_t -rocprofiler_create_context(rocprofiler_context_id_t* context_id) -{ - consume_args(context_id); - return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; -} - -rocprofiler_status_t -rocprofiler_start_context(rocprofiler_context_id_t context_id) -{ - consume_args(context_id); - return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; -} - -rocprofiler_status_t -rocprofiler_stop_context(rocprofiler_context_id_t context_id) -{ - consume_args(context_id); - return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; -} - -rocprofiler_status_t -rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id) -{ - consume_args(buffer_id); - return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; -} - -rocprofiler_status_t -rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id) -{ - consume_args(buffer_id); - return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; -} - -rocprofiler_status_t -rocprofiler_create_buffer(rocprofiler_context_id_t context, - size_t size, - size_t watermark, - rocprofiler_buffer_policy_t action, - rocprofiler_buffer_callback_t callback, - void* callback_data, - rocprofiler_buffer_id_t* buffer_id) -{ - consume_args(context, size, watermark, action, callback, callback_data, buffer_id); - return 
ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; -} - rocprofiler_status_t rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t context_id, rocprofiler_agent_t agent, @@ -132,6 +107,9 @@ rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t conte uint64_t interval, rocprofiler_buffer_id_t buffer_id) { + if(rocprofiler::registration::get_init_status() > 0) + return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED; + consume_args(context_id, agent, method, unit, interval, buffer_id); return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED; } diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/rocprofiler_config.cpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/rocprofiler_config.cpp deleted file mode 100644 index 9b52c7611e..0000000000 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/rocprofiler_config.cpp +++ /dev/null @@ -1,701 +0,0 @@ - -#include -#include - -#include "config_helpers.hpp" -#include "config_internal.hpp" - -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include - -typedef enum -{ - ACTIVITY_API_PHASE_ENTER = 0, - ACTIVITY_API_PHASE_EXIT = 1 -} activity_api_phase_t; - -typedef struct roctx_api_data_s -{ - union - { - struct - { - const char* message; - roctx_range_id_t id; - }; - struct - { - const char* message; - } roctxMarkA; - struct - { - const char* message; - } roctxRangePushA; - struct - { - const char* message; - } roctxRangePop; - struct - { - const char* message; - roctx_range_id_t id; - } roctxRangeStartA; - struct - { - const char* message; - roctx_range_id_t id; - } roctxRangeStop; - } args; -} roctx_api_data_t; - -// helper macros ensuring C and C++ structs adhere to specific naming convention -#define ROCP_PUBLIC_CONFIG(TYPE) ::rocprofiler_##TYPE -#define ROCP_PRIVATE_CONFIG(TYPE) ::rocprofiler::internal::TYPE - -// Below asserts at compile time that the external C object has the same size as internal -// C++ object, e.g., -// 
sizeof(rocprofiler_domain_config) == sizeof(rocprofiler::internal::domain_config) -#define ROCP_ASSERT_CONFIG_ABI(TYPE) \ - static_assert(sizeof(ROCP_PUBLIC_CONFIG(TYPE)) == sizeof(ROCP_PRIVATE_CONFIG(TYPE)), \ - "Error! rocprofiler_" #TYPE " ABI error"); - -// Below asserts at compile time that the external C struct members has the same offset as -// internal C++ struct members -#define ROCP_ASSERT_CONFIG_OFFSET_ABI(TYPE, PUB_FIELD, PRIV_FIELD) \ - static_assert(offsetof(ROCP_PUBLIC_CONFIG(TYPE), PUB_FIELD) == \ - offsetof(ROCP_PRIVATE_CONFIG(TYPE), PRIV_FIELD), \ - "Error! rocprofiler_" #TYPE "." #PUB_FIELD " ABI offset error"); \ - static_assert(sizeof(ROCP_PUBLIC_CONFIG(TYPE)::PUB_FIELD) == \ - sizeof(ROCP_PRIVATE_CONFIG(TYPE)::PRIV_FIELD), \ - "Error! rocprofiler_" #TYPE "." #PUB_FIELD " ABI size error"); - -// this defines a template specialization for ensuring that the reinterpret_cast is only -// applied between public C structs and private C++ structs which are compatible. -#define ROCP_DEFINE_API_CAST_IMPL(INPUT_TYPE, OUTPUT_TYPE) \ - namespace traits \ - { \ - template <> \ - struct api_cast \ - { \ - using input_type = INPUT_TYPE; \ - using output_type = OUTPUT_TYPE; \ - \ - output_type* operator()(input_type* _v) const \ - { \ - return reinterpret_cast(_v); \ - } \ - \ - const output_type* operator()(const input_type* _v) const \ - { \ - return reinterpret_cast(_v); \ - } \ - }; \ - } - -// define C -> C++ and C++ -> C casting rules -#define ROCP_DEFINE_API_CAST_D(TYPE) \ - ROCP_DEFINE_API_CAST_IMPL(ROCP_PUBLIC_CONFIG(TYPE), ROCP_PRIVATE_CONFIG(TYPE)) \ - ROCP_DEFINE_API_CAST_IMPL(ROCP_PRIVATE_CONFIG(TYPE), ROCP_PUBLIC_CONFIG(TYPE)) - -// use only when C++ struct is just an alias for C struct -#define ROCP_DEFINE_API_CAST_S(TYPE) \ - ROCP_DEFINE_API_CAST_IMPL(ROCP_PUBLIC_CONFIG(TYPE), ROCP_PRIVATE_CONFIG(TYPE)) - -namespace -{ -namespace traits -{ -// left undefined to ensure template specialization -template -struct api_cast; - -// ensure api_cast 
where decltype(a) is const Tp equates to api_cast -template -struct api_cast : api_cast -{}; - -// ensure api_cast where decltype(a) is Tp& equates to api_cast -template -struct api_cast : api_cast -{}; - -// ensure api_cast where decltype(a) is Tp* equates to api_cast -template -struct api_cast : api_cast -{}; -} // namespace traits - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -// -// -// SEE BELOW! VERY IMPORTANT! -// -// -// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -// -// -// EVERY NEW CONFIG AND ALL OF ITS MEMBER FIELDS NEED TO HAVE THESE COMPILE TIME CHECKS! -// -// these checks verify the two structs have the same size and that each -// member field has the same size and offset into the struct -// -// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - -ROCP_ASSERT_CONFIG_ABI(config) -ROCP_ASSERT_CONFIG_OFFSET_ABI(config, size, size) -ROCP_ASSERT_CONFIG_OFFSET_ABI(config, compat_version, compat_version) -ROCP_ASSERT_CONFIG_OFFSET_ABI(config, api_version, api_version) -ROCP_ASSERT_CONFIG_OFFSET_ABI(config, reserved0, context_idx) -ROCP_ASSERT_CONFIG_OFFSET_ABI(config, user_data, user_data) -ROCP_ASSERT_CONFIG_OFFSET_ABI(config, buffer, buffer) -ROCP_ASSERT_CONFIG_OFFSET_ABI(config, domain, domain) -ROCP_ASSERT_CONFIG_OFFSET_ABI(config, filter, filter) - -ROCP_ASSERT_CONFIG_ABI(domain_config) -ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, callback, user_sync_callback) -ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, reserved0, domains) -ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, reserved1, opcodes) - -ROCP_ASSERT_CONFIG_ABI(buffer_config) -ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, callback, callback) -ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, buffer_size, buffer_size) -// ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, reserved0, buffer) -ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, reserved1, buffer_idx) - 
-ROCP_DEFINE_API_CAST_D(config) -ROCP_DEFINE_API_CAST_D(domain_config) -ROCP_DEFINE_API_CAST_D(buffer_config) -ROCP_DEFINE_API_CAST_S(filter_config) - -// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -// -// -// SEE ABOVE! VERY IMPORTANT! -// -// -// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - -/// use this to ensure that reinterpret_cast from public C struct to internal C++ struct -/// is valid, e.g. guard against accidentally casting to wrong type -template -auto -rocp_cast(Tp* _val) -{ - return traits::api_cast{}(_val); -} - -/// helper function for making copies of the fields in rocprofiler_config. If the config -/// field needs to be copied in some special way, use a template specialization of the -/// "construct" function in the allocator to handle this, e.g.: -/// -/// using special_config = ::rocprofiler::internal::special_config; -/// -/// template <> -/// void -/// allocator::construct(special_config* const _p, -/// const special_config& _v) const -/// { -/// auto _tmp = special_config{}; -/// // ... special copy of fields from _v into _tmp -/// -/// // placement new of _tmp into _p -/// _p = new(_p) special_config{ _tmp }; -/// } -/// -/// template <> -/// void -/// allocator::construct(special_config* const _p, -/// special_config&& _v) const -/// { -/// auto _tmp = std::move(_v); -/// // ... 
perform special needs -/// -/// // placement new of _tmp into _p -/// _p = new(_p) special_config{ std::move(_tmp) }; -/// } -/// -template -Tp*& -copy_config_field(Tp*& _dst, Up* _src_v) -{ - static auto _allocator = allocator{}; - - if constexpr(!std::is_same::value) - { - using PrivateT = typename traits::api_cast::output_type; - static_assert(std::is_same::value, "Error incorrect field copy"); - - auto _src = rocp_cast(_src_v); - if(_src) - { - _dst = _allocator.allocate(1); - _allocator.construct(_dst, *_src); - } - return _dst; - } - else - { - if(_src_v) - { - _dst = _allocator.allocate(1); - _allocator.construct(_dst, *_src_v); - } - return _dst; - } -} - -auto& -get_configs_buffer() -{ - static char - _v[::rocprofiler::internal::max_configs_count * sizeof(rocprofiler::internal::config)]; - return _v; -} - -auto& -get_configs_mutex() -{ - static auto _v = std::mutex{}; - return _v; -} - -inline uint32_t -get_tid() -{ - return syscall(__NR_gettid); -} - -constexpr auto rocp_max_configs = ::rocprofiler::internal::max_configs_count; -} // namespace - -namespace rocprofiler -{ -namespace internal -{ -std::array& -get_registered_configs() -{ - static auto _v = std::array{}; - return _v; -} - -std::array, max_configs_count>& -get_active_configs() -{ - static auto _v = std::array, max_configs_count>{}; - return _v; -} -} // namespace internal -} // namespace rocprofiler - -extern "C" { - -rocprofiler_status_t -rocprofiler_allocate_config(rocprofiler_config* _inp_cfg) -{ - // perform checks that rocprofiler can be activated - - ::memset(_inp_cfg, 0, sizeof(rocprofiler_config)); - - auto* _cfg = rocp_cast(_inp_cfg); - - _cfg->size = sizeof(::rocprofiler_config); - _cfg->compat_version = 0; - _cfg->api_version = ROCPROFILER_API_VERSION_ID; - _cfg->context_idx = std::numeric_limitscontext_idx)>::max(); - - // initial value checks - assert(_cfg->size == sizeof(rocprofiler::internal::config)); - assert(_cfg->compat_version == 0); - assert(_cfg->api_version == 
ROCPROFILER_API_VERSION_ID); - assert(_cfg->buffer == nullptr); - assert(_cfg->domain == nullptr); - assert(_cfg->filter == nullptr); - assert(_cfg->context_idx == - std::numeric_limits::max()); - - // ... allocate any internal space needed to handle another config ... - { - auto _lk = std::unique_lock{get_configs_mutex()}; - // ... - } - - return ROCPROFILER_STATUS_SUCCESS; -} - -rocprofiler_status_t -rocprofiler_validate_config(const rocprofiler_config* cfg_v) -{ - const auto* cfg = rocp_cast(cfg_v); - - if(cfg->buffer == nullptr) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND; - - if(cfg->filter == nullptr) return ROCPROFILER_STATUS_ERROR_FILTER_NOT_FOUND; - - if(cfg->domain == nullptr || cfg->domain->domains == 0) - return ROCPROFILER_STATUS_ERROR_INCORRECT_DOMAIN; - - return ROCPROFILER_STATUS_SUCCESS; -} - -rocprofiler_status_t -rocprofiler_start_config(rocprofiler_config* cfg_v, rocprofiler_context_id_t* context_id) -{ - if(rocprofiler_validate_config(cfg_v) != ROCPROFILER_STATUS_SUCCESS) - { - std::cerr << "rocprofiler_start_config() provided an invalid configuration. tool " - "should use rocprofiler_validate_config() to check whether the " - "config is valid and adapt accordingly to issues before trying to " - "start the configuration." 
- << std::endl; - abort(); - } - - auto* cfg = rocp_cast(cfg_v); - - uint64_t idx = rocp_max_configs; - { - auto _lk = std::unique_lock{get_configs_mutex()}; - for(size_t i = 0; i < rocp_max_configs; ++i) - { - if(rocprofiler::internal::get_registered_configs().at(i) == nullptr) - { - idx = i; - break; - } - } - } - - // too many configs already registered - if(idx == rocp_max_configs) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_ACTIVE; - - cfg->context_idx = idx; - context_id->handle = idx; - - // using the context id, compute the location in the buffer of configs - auto* _offset = get_configs_buffer() + (idx * sizeof(rocprofiler::internal::config)); - - // placement new into the buffer - auto* _copy_cfg = new(_offset) rocprofiler::internal::config{*cfg}; - - // make copies of non-null config fields - copy_config_field(_copy_cfg->buffer, cfg->buffer); - copy_config_field(_copy_cfg->domain, cfg->domain); - copy_config_field(_copy_cfg->filter, cfg->filter); - - // store until "deallocation" - rocprofiler::internal::get_registered_configs().at(idx) = _copy_cfg; - - using config_t = rocprofiler::internal::config; - // atomic swap the pointer into the "active" array used internally - config_t* _expected = nullptr; - bool success = rocprofiler::internal::get_active_configs().at(idx).compare_exchange_strong( - _expected, rocprofiler::internal::get_registered_configs().at(idx)); - - if(!success) return ROCPROFILER_STATUS_ERROR_HAS_ACTIVE_CONTEXT; // need relevant enum - - return ROCPROFILER_STATUS_SUCCESS; -} - -rocprofiler_status_t -rocprofiler_stop_config(rocprofiler_context_id_t idx) -{ - // atomically assign the config pointer to NULL so that it is skipped in future - // callbacks - auto* _expected = - rocprofiler::internal::get_active_configs().at(idx.handle).load(std::memory_order_relaxed); - bool success = rocprofiler::internal::get_active_configs() - .at(idx.handle) - .compare_exchange_strong(_expected, nullptr); - - if(!success) - return 
ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; // compare exchange strong - // failed - - return ROCPROFILER_STATUS_SUCCESS; -} - -rocprofiler_status_t -rocprofiler_domain_add_domain(struct rocprofiler_domain_config* _inp_cfg, - rocprofiler_tracer_activity_domain_t _domain) -{ - auto* _cfg = rocp_cast(_inp_cfg); - if(_domain <= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE || - _domain >= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST) - return ROCPROFILER_STATUS_ERROR_INVALID_DOMAIN_ID; - - _cfg->domains |= (1 << _domain); - return ROCPROFILER_STATUS_SUCCESS; -} - -rocprofiler_status_t -rocprofiler_domain_add_domains(struct rocprofiler_domain_config* _inp_cfg, - rocprofiler_tracer_activity_domain_t* _domains, - size_t _ndomains) -{ - for(size_t i = 0; i < _ndomains; ++i) - { - auto _status = rocprofiler_domain_add_domain(_inp_cfg, _domains[i]); - if(_status != ROCPROFILER_STATUS_SUCCESS) return _status; - } - return ROCPROFILER_STATUS_SUCCESS; -} - -rocprofiler_status_t -rocprofiler_domain_add_op(struct rocprofiler_domain_config* _inp_cfg, - rocprofiler_tracer_activity_domain_t _domain, - uint32_t _op) -{ - auto* _cfg = rocp_cast(_inp_cfg); - if(_domain <= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE || - _domain >= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST) - return ROCPROFILER_STATUS_ERROR_INVALID_DOMAIN_ID; - - if(_op >= get_domain_max_op(_domain)) return ROCPROFILER_STATUS_ERROR_INVALID_OPERATION_ID; - - auto _offset = (_domain * rocprofiler::internal::domain_ops_offset); - _cfg->opcodes.set(_offset + _op, true); - return ROCPROFILER_STATUS_SUCCESS; -} - -rocprofiler_status_t -rocprofiler_domain_add_ops(struct rocprofiler_domain_config* _inp_cfg, - rocprofiler_tracer_activity_domain_t _domain, - uint32_t* _ops, - size_t _nops) -{ - for(size_t i = 0; i < _nops; ++i) - { - auto _status = rocprofiler_domain_add_op(_inp_cfg, _domain, _ops[i]); - if(_status != ROCPROFILER_STATUS_SUCCESS) return _status; - } - return ROCPROFILER_STATUS_SUCCESS; -} - -// 
------------------------------------------------------------------------------------ // -// -// demo of internal implementation -// -// ------------------------------------------------------------------------------------ // - -void -api_callback(rocprofiler_tracer_activity_domain_t domain, - uint32_t cid, - const void* /*callback_data*/, - void*) -{ - for(const auto& aitr : rocprofiler::internal::get_active_configs()) - { - auto* itr = aitr.load(); - if(!itr) continue; - - // below should be valid so this might need to raise error - if(!itr->domain) continue; - - // if the given domain + op is not enabled, skip this config - if(!(*itr->domain)(domain, cid)) continue; - - if(itr->filter) - { - if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX) - {} - else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API) - { - if(itr->filter->hsa_function_id && itr->filter->hsa_function_id(cid) == 0) continue; - } - else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API) - { - if(itr->filter->hip_function_id && itr->filter->hip_function_id(cid) == 0) continue; - } - } - - auto& _domain = (*itr->domain); - auto& _correlation = (*itr->correlation_id); - - auto _correlation_id = rocprofiler::internal::correlation_config::get_unique_record_id(); - if(_correlation.external_id_callback) - _correlation.external_id = - _correlation.external_id_callback(domain, cid, _correlation_id); - - auto timestamp_ns = []() -> uint64_t { - return std::chrono::steady_clock::now().time_since_epoch().count(); - }; - - (void) _domain; - (void) timestamp_ns; - /* - auto _header = rocprofiler_record_header_t{ROCPROFILER_TRACER_RECORD, - rocprofiler_record_id_t{_correlation_id}}; - auto _op_id = rocprofiler_tracer_operation_id_t{cid}; - auto _agent_id = rocprofiler_agent_id_t{0}; - auto _queue_id = rocprofiler_queue_id_t{0}; - auto _thread_id = rocprofiler_thread_id_t{get_tid()}; - auto _context = rocprofiler_context_id_t{itr->context_idx}; - auto _timestamp_raw = 
rocprofiler_timestamp_t{timestamp_ns()}; - auto _timestamp = rocprofiler_record_header_timestamp_t{_timestamp_raw, _timestamp_raw}; - - if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX) - { - auto _api_data = rocprofiler_tracer_api_data_t{}; - const roctx_api_data_t* _data = - reinterpret_cast(callback_data); - - if(itr->filter && itr->filter->name && itr->filter->name(_data->args.message) == 0) - continue; - - _api_data.roctx = _data; - - auto _phase = rocprofiler_api_tracing_phase_t{ROCPROFILER_PHASE_ENTER}; - _timestamp = {_timestamp_raw, _timestamp_raw}; - - auto _external_cid = rocprofiler_tracer_external_id_t{_data ? _data->args.id : 0}; - auto _activity_cid = rocprofiler_tracer_activity_correlation_id_t{0}; - const char* _name = _data->args.message; - - _domain.user_sync_callback(rocprofiler_record_tracer_t{_header, - _external_cid, - ACTIVITY_DOMAIN_ROCTX, - _op_id, - _api_data, - _activity_cid, - _timestamp, - _agent_id, - _queue_id, - _thread_id, - _phase, - _name}, - _context); - } - else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API) - { - auto _api_data = rocprofiler_tracer_api_data_t{}; - const hsa_api_data_t* _data = reinterpret_cast(callback_data); - _api_data.hsa = _data; - - auto _phase = rocprofiler_api_tracing_phase_t{(_data->phase == ACTIVITY_API_PHASE_ENTER) - ? 
ROCPROFILER_PHASE_ENTER - : ROCPROFILER_PHASE_EXIT}; - - if(_phase == ROCPROFILER_PHASE_ENTER) - _timestamp.begin = _timestamp_raw; - else - _timestamp.end = _timestamp_raw; - - auto _external_cid = rocprofiler_tracer_external_id_t{0}; - auto _activity_cid = - rocprofiler_tracer_activity_correlation_id_t{_data->correlation_id}; - const char* _name = nullptr; - - _domain.user_sync_callback(rocprofiler_record_tracer_t{_header, - _external_cid, - ACTIVITY_DOMAIN_HSA_API, - _op_id, - _api_data, - _activity_cid, - _timestamp, - _agent_id, - _queue_id, - _thread_id, - _phase, - _name}, - _context); - } - else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API) - { - auto _api_data = rocprofiler_tracer_api_data_t{}; - const hip_api_data_t* _data = reinterpret_cast(callback_data); - _api_data.hip = _data; - - auto _phase = rocprofiler_api_tracing_phase_t{(_data->phase == ACTIVITY_API_PHASE_ENTER) - ? ROCPROFILER_PHASE_ENTER - : ROCPROFILER_PHASE_EXIT}; - - if(_phase == ROCPROFILER_PHASE_ENTER) - _timestamp.begin = _timestamp_raw; - else - _timestamp.end = _timestamp_raw; - - auto _external_cid = rocprofiler_tracer_external_id_t{0}; - auto _activity_cid = - rocprofiler_tracer_activity_correlation_id_t{_data->correlation_id}; - const char* _name = nullptr; - - _domain.user_sync_callback(rocprofiler_record_tracer_t{_header, - _external_cid, - ACTIVITY_DOMAIN_HIP_API, - _op_id, - _api_data, - _activity_cid, - _timestamp, - _agent_id, - _queue_id, - _thread_id, - _phase, - _name}, - _context); - } - */ - } -} - -void -InitRoctracer() -{ - for(const auto& itr : rocprofiler::internal::get_registered_configs()) - { - if(!itr) continue; - - // below should be valid so this might need to raise error - if(!itr->domain) continue; - - for(auto ditr : {ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX}) - { - if((*itr->domain)(ditr)) - { - if(itr->domain->user_sync_callback) - { - // ... 
- } - else - { - // ... - } - } - } - - for(auto ditr : {ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_OPS, - ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_OPS}) - { - if((*itr->domain)(ditr)) - { - if(itr->domain->opcodes.none()) - { - // ... - } - else - { - for(size_t i = 0; i < itr->domain->opcodes.size(); ++i) - { - if((*itr->domain)(ditr, i)) - { - // ... - } - } - } - } - } - } -} -} diff --git a/projects/rocprofiler-sdk/source/lib/rocprofiler/tracer.hpp b/projects/rocprofiler-sdk/source/lib/rocprofiler/tracer.hpp deleted file mode 100644 index b60871f622..0000000000 --- a/projects/rocprofiler-sdk/source/lib/rocprofiler/tracer.hpp +++ /dev/null @@ -1,510 +0,0 @@ -/* Copyright (c) 2018-2022 Advanced Micro Devices, Inc. - - Permission is hereby granted, free of charge, to any person obtaining a copy - of this software and associated documentation files (the "Software"), to deal - in the Software without restriction, including without limitation the rights - to use, copy, modify, merge, publish, distribute, sublicense, and/or sell - copies of the Software, and to permit persons to whom the Software is - furnished to do so, subject to the following conditions: - - The above copyright notice and this permission notice shall be included in - all copies or substantial portions of the Software. - - THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE - AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, - OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN - THE SOFTWARE. 
*/ - -#pragma once - -#include - -#include -#include - -typedef struct -{ - rocprofiler_context_id_t context_id; - rocprofiler_buffer_id_t buffer_id; -} context_buffer_id_t; - -typedef context_buffer_id_t roctracer_pool_t; - -/* Correlation id */ -typedef uint64_t activity_correlation_id_t; - -typedef uint32_t activity_kind_t; -typedef uint32_t activity_op_t; - -typedef uint64_t roctracer_timestamp_t; - -typedef rocprofiler_tracer_activity_domain_t roctracer_domain_t; -typedef rocprofiler_tracer_activity_domain_t activity_domain_t; - -// Prof_Protocol -/* Activity record type */ -typedef struct activity_record_s -{ - uint32_t domain; /* activity domain id */ - activity_kind_t kind; /* activity kind */ - activity_op_t op; /* activity op */ - union - { - struct - { - activity_correlation_id_t correlation_id; /* activity ID */ - roctracer_timestamp_t begin_ns; /* host begin timestamp */ - roctracer_timestamp_t end_ns; /* host end timestamp */ - }; - struct - { - uint32_t se; /* sampled SE */ - uint64_t cycle; /* sample cycle */ - uint64_t pc; /* sample PC */ - } pc_sample; - }; - union - { - struct - { - int device_id; /* device id */ - uint64_t queue_id; /* queue id */ - }; - struct - { - uint32_t process_id; /* device id */ - uint32_t thread_id; /* thread id */ - }; - struct - { - activity_correlation_id_t external_id; /* external correlation id */ - }; - }; - union - { - size_t bytes; /* data size bytes */ - const char* kernel_name; /* kernel name */ - const char* mark_message; - }; -} activity_record_t; - -typedef activity_record_t roctracer_record_t; - -/* Activity sync callback type */ -typedef void (*activity_sync_callback_t)(activity_domain_t cid, - activity_record_t* record, - const void* data, - void* arg); -/* Activity async callback type */ -typedef void (*activity_async_callback_t)(activity_domain_t op, void* record, void* arg); - -/* API callback type */ -typedef void (*activity_rtapi_callback_t)(activity_domain_t domain, - uint32_t cid, - const void* 
data, - void* arg); -typedef activity_rtapi_callback_t roctracer_rtapi_callback_t; - -typedef roctracer_timestamp_t (*roctracer_get_timestamp_t)(); -typedef rocprofiler_timestamp_t (*rocprofiler_get_timestamp_t)(); - -typedef uint32_t activity_kind_t; -typedef uint32_t activity_op_t; - -/* API callback phase */ -typedef enum -{ - ACTIVITY_API_PHASE_ENTER = 0, - ACTIVITY_API_PHASE_EXIT = 1 -} activity_api_phase_t; - -const char* -roctracer_op_string(uint32_t domain, uint32_t op); - -/* Trace record types */ - -/** - * Memory pool allocator callback. - * - * If \p *ptr is NULL, then allocate memory of \p size bytes and save address - * in \p *ptr. - * - * If \p *ptr is non-NULL and size is non-0, then reallocate the memory at \p - * *ptr with size \p size and save the address in \p *ptr. The memory will have - * been allocated by the same callback. - * - * If \p *ptr is non-NULL and size is 0, then deallocate the memory at \p *ptr. - * The memory will have been allocated by the same callback. - * - * \p size is the size of the memory allocation or reallocation, or 0 if - * deallocating. - * - * \p arg Argument provided - */ -typedef void (*roctracer_allocator_t)(char** ptr, size_t size, void* arg); - -/** - * Memory pool buffer callback. - * - * The callback that will be invoked when a memory pool buffer becomes full or - * is flushed. - * - * \p begin pointer to first entry entry in the buffer. - * - * \p end pointer to one past the end entry in the buffer. - * - * \p arg the argument specified when the callback was defined. - */ -typedef void (*roctracer_buffer_callback_t)(const char* begin, const char* end, void* arg); - -/** - * Memory pool properties. - * - * Defines the properties when a tracer memory pool is created. - */ -typedef struct -{ - /** - * ROC Tracer mode. - */ - uint32_t mode; - - /** - * Size of buffer in bytes. - */ - size_t buffer_size; - - /** - * The allocator function to use to allocate and deallocate the buffer. 
If - * NULL then \p malloc, \p realloc, and \p free are used. - */ - roctracer_allocator_t alloc_fun; - - /** - * The argument to pass when invoking the \p alloc_fun allocator. - */ - void* alloc_arg; - - /** - * The function to call when a buffer becomes full or is flushed. - */ - roctracer_buffer_callback_t buffer_callback_fun; - - /** - * The argument to pass when invoking the \p buffer_callback_fun callback. - */ - void* buffer_callback_arg; -} roctracer_properties_t; - -/** - * ROC Tracer API status codes. - */ -typedef enum -{ - /** - * The function has executed successfully. - */ - ROCTRACER_STATUS_SUCCESS = 0, - /** - * A generic error has occurred. - */ - ROCTRACER_STATUS_ERROR = -1, - /** - * The domain ID is invalid. - */ - ROCTRACER_STATUS_ERROR_INVALID_DOMAIN_ID = -2, - /** - * An invalid argument was given to the function. - */ - ROCTRACER_STATUS_ERROR_INVALID_ARGUMENT = -3, - /** - * No default pool is defined. - */ - ROCTRACER_STATUS_ERROR_DEFAULT_POOL_UNDEFINED = -4, - /** - * The default pool is already defined. - */ - ROCTRACER_STATUS_ERROR_DEFAULT_POOL_ALREADY_DEFINED = -5, - /** - * Memory allocation error. - */ - ROCTRACER_STATUS_ERROR_MEMORY_ALLOCATION = -6, - /** - * External correlation ID pop mismatch. - */ - ROCTRACER_STATUS_ERROR_MISMATCHED_EXTERNAL_CORRELATION_ID = -7, - /** - * The operation is not currently implemented. This error may be reported by - * any function. Check the \ref known_limitations section to determine the - * status of the library implementation of the interface. - */ - ROCTRACER_STATUS_ERROR_NOT_IMPLEMENTED = -8, - /** - * Deprecated error code. - */ - ROCTRACER_STATUS_UNINIT = 2, - /** - * Deprecated error code. - */ - ROCTRACER_STATUS_BREAK = 3, - /** - * Deprecated error code. - */ - ROCTRACER_STATUS_BAD_DOMAIN = ROCTRACER_STATUS_ERROR_INVALID_DOMAIN_ID, - /** - * Deprecated error code. - */ - ROCTRACER_STATUS_BAD_PARAMETER = ROCTRACER_STATUS_ERROR_INVALID_ARGUMENT, - /** - * Deprecated error code. 
- */ - ROCTRACER_STATUS_HIP_API_ERR = 6, - /** - * Deprecated error code. - */ - ROCTRACER_STATUS_HIP_OPS_ERR = 7, - /** - * Deprecated error code. - */ - ROCTRACER_STATUS_HCC_OPS_ERR = ROCTRACER_STATUS_HIP_OPS_ERR, - /** - * Deprecated error code. - */ - ROCTRACER_STATUS_HSA_ERR = 7, - /** - * Deprecated error code. - */ - ROCTRACER_STATUS_ROCTX_ERR = 8, -} roctracer_status_t; - -/** - * Query textual name of an operation of a domain. - * @param[in] domain Domain being queried. - * @param[in] op Operation within \p domain. - * @param[in] kind \todo Define kind. - * @return Returns the NUL terminated string for the operation name, or NULL if - * the domain or operation are invalid. The string is owned by the ROC Tracer - * library. - */ -const char* -roctracer_op_string(uint32_t domain, uint32_t op, uint32_t kind); - -/** - * Query the operation code given a domain and the name of an operation. - * @param[in] domain The domain being queried. - * @param[in] str The NUL terminated name of the operation name being queried. - * @param[out] op The operation code. - * @param[out] kind If not NULL then the operation kind code. - */ -void -roctracer_op_code(uint32_t domain, const char* str, uint32_t* op, uint32_t* kind); - -/** - * Set the properties of a domain. - * @param[in] domain The domain. - * @param[in] properties The properties. Each domain defines its own type for - * the properties. Some domains require the properties to be set before they - * can be enabled. - */ -void -roctracer_set_properties(roctracer_domain_t domain, void* properties); - -/** - * Enable runtime API callback for a specific operation of a domain. - * @param domain The domain. - * @param op The operation ID in \p domain. - * @param callback The callback to invoke each time the operation is performed - * on entry and exit. - * @param pool Value to pass as last argument of \p callback. 
- */ -void -roctracer_enable_op_callback(roctracer_domain_t domain, - uint32_t op, - roctracer_rtapi_callback_t callback); - -/** - * Enable runtime API callback for all operations of a domain. - * @param domain The domain - * @param callback The callback to invoke each time the operation is performed - * on entry and exit. - * @param arg Value to pass as last argument of \p callback. - */ -void -roctracer_enable_domain_callback(roctracer_domain_t domain, - roctracer_rtapi_callback_t callback, - void* user_data = nullptr); - -/** - * Disable runtime API callback for a specific operation of a domain. - * @param domain The domain - * @param op The operation in \p domain. - */ -void -roctracer_disable_op_callback(roctracer_domain_t domain, uint32_t op); - -/** - * Disable runtime API callback for all operations of a domain. - * @param domain The domain - */ -void -roctracer_disable_domain_callback(roctracer_domain_t domain); - -/** - * Enable activity record logging for a specified operation of a domain using - * the default memory pool. - * @param[in] domain The domain. - * @param[in] op The activity operation ID in \p domain. - */ -void -roctracer_enable_op_activity(roctracer_domain_t domain, uint32_t op, roctracer_pool_t pool); - -/** - * Enable activity record logging for all operations of a domain using the - * default memory pool. - * @param[in] domain The domain. - */ -void -roctracer_enable_domain_activity(roctracer_domain_t domain, roctracer_pool_t pool); - -/** - * Disable activity record logging for a specified operation of a domain. - * @param[in] domain The domain. - * @param[in] op The activity operation ID in \p domain. - */ -void -roctracer_disable_op_activity(roctracer_domain_t domain, uint32_t op); - -/** - * Disable activity record logging for all operations of a domain. - * @param[in] domain The domain. 
- */ -void -roctracer_disable_domain_activity(roctracer_domain_t domain); - -// HIP Support -typedef enum -{ - HIP_OP_ID_DISPATCH = 0, - HIP_OP_ID_COPY = 1, - HIP_OP_ID_BARRIER = 2, - HIP_OP_ID_NUMBER = 3 -} hip_op_id_t; - -// HSA Support -// HSA OP ID enumeration -enum hsa_op_id_t -{ - HSA_OP_ID_DISPATCH = 0, - HSA_OP_ID_COPY = 1, - HSA_OP_ID_BARRIER = 2, - HSA_OP_ID_RESERVED1 = 3, - HSA_OP_ID_NUMBER -}; - -// HSA EVT ID enumeration -enum hsa_evt_id_t -{ - HSA_EVT_ID_ALLOCATE = 0, // Memory allocate callback - HSA_EVT_ID_DEVICE = 1, // Device assign callback - HSA_EVT_ID_MEMCOPY = 2, // Memcopy callback - HSA_EVT_ID_SUBMIT = 3, // Packet submission callback - HSA_EVT_ID_KSYMBOL = 4, // Loading/unloading of kernel symbol - HSA_EVT_ID_CODEOBJ = 5, // Loading/unloading of device code object - HSA_EVT_ID_NUMBER -}; - -struct hsa_ops_properties_t -{ - void* reserved1[4]; -}; - -// ROCTx Support -typedef uint64_t roctx_range_id_t; - -/** - * ROCTX API ID enumeration - */ -enum roctx_api_id_t -{ - ROCTX_API_ID_roctxMarkA = 0, - ROCTX_API_ID_roctxRangePushA = 1, - ROCTX_API_ID_roctxRangePop = 2, - ROCTX_API_ID_roctxRangeStartA = 3, - ROCTX_API_ID_roctxRangeStop = 4, - ROCTX_API_ID_NUMBER, -}; - -/** - * ROCTX callbacks data type - */ -typedef struct roctx_api_data_s -{ - union - { - struct - { - const char* message; - roctx_range_id_t id; - }; - struct - { - const char* message; - } roctxMarkA; - struct - { - const char* message; - } roctxRangePushA; - struct - { - const char* message; - } roctxRangePop; - struct - { - const char* message; - roctx_range_id_t id; - } roctxRangeStartA; - struct - { - const char* message; - roctx_range_id_t id; - } roctxRangeStop; - } args; -} roctx_api_data_t; - -// External Support -/* Extension opcodes */ -typedef enum -{ - ACTIVITY_EXT_OP_MARK = 0, - ACTIVITY_EXT_OP_EXTERN_ID = 1 -} activity_ext_op_t; - -typedef void (*roctracer_start_cb_t)(); -typedef void (*roctracer_stop_cb_t)(); -typedef struct -{ - roctracer_start_cb_t start_cb; - 
roctracer_stop_cb_t stop_cb; -} roctracer_ext_properties_t; - -// Tracing start -void -roctracer_start(); - -// Tracing stop -void -roctracer_stop(); - -// Notifies that the calling thread is entering an external region. -// Push an external correlation id for the calling thread. -void -roctracer_activity_push_external_correlation_id(activity_correlation_id_t id); - -// Notifies that the calling thread is leaving an external region. -// Pop an external correlation id for the calling thread. -// 'lastId' returns the last external correlation if not NULL -void -roctracer_activity_pop_external_correlation_id(activity_correlation_id_t* last_id); diff --git a/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-parallel.cpp b/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-parallel.cpp index c4396cbc94..172a5efc3d 100644 --- a/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-parallel.cpp +++ b/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-parallel.cpp @@ -154,7 +154,7 @@ validate(const std::vector& _headers) auto& _ref_data = get_generated_array(); for(auto* itr : _headers) { - if(itr->kind == typeid(data_type).hash_code()) + if(itr->hash == typeid(data_type).hash_code()) { auto* _data = static_cast(itr->payload); EXPECT_EQ(_ref_data, *_data); diff --git a/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-save-load.cpp b/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-save-load.cpp index 18bb6e2c4c..383fd80ad9 100644 --- a/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-save-load.cpp +++ b/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-save-load.cpp @@ -147,7 +147,7 @@ validate(const std::vector& _headers) auto& _ref_data = get_generated_array(); for(auto* itr : _headers) { - if(itr->kind == typeid(data_type).hash_code()) + if(itr->hash == typeid(data_type).hash_code()) { auto* _data = static_cast(itr->payload); ASSERT_TRUE(_data != nullptr); diff --git 
a/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-serial.cpp b/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-serial.cpp index 03e1ba49dc..3d4c0be2a6 100644 --- a/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-serial.cpp +++ b/projects/rocprofiler-sdk/source/lib/tests/buffering/buffering-serial.cpp @@ -54,7 +54,7 @@ template void extract_header(std::vector& _arr, rocprofiler_record_header_t* _hdr) { - if(_hdr->kind == typeid(Tp).hash_code()) + if(_hdr->hash == typeid(Tp).hash_code()) { auto* _v = reinterpret_cast(_hdr->payload); _arr.emplace_back(*_v); @@ -129,17 +129,17 @@ TEST(buffering, serial) { ASSERT_TRUE(itr->payload) << "nullptr to payload not expected"; - if(itr->kind == typeid(uint_raw_array_t).hash_code()) + if(itr->hash == typeid(uint_raw_array_t).hash_code()) { extract_header(_ui_result, itr); } - else if(itr->kind == typeid(flt_raw_array_t).hash_code()) + else if(itr->hash == typeid(flt_raw_array_t).hash_code()) { extract_header(_fp_result, itr); } else { - GTEST_FAIL() << "unknown type id hash code: " << std::to_string(itr->kind); + GTEST_FAIL() << "unknown type id hash code: " << std::to_string(itr->hash); } } diff --git a/projects/rocprofiler-sdk/source/scripts/run-ci.py b/projects/rocprofiler-sdk/source/scripts/run-ci.py index fd37eaad5d..9897925854 100755 --- a/projects/rocprofiler-sdk/source/scripts/run-ci.py +++ b/projects/rocprofiler-sdk/source/scripts/run-ci.py @@ -105,7 +105,7 @@ def generate_custom(args, cmake_args, ctest_args): set(CTEST_CUSTOM_MAXIMUM_NUMBER_OF_ERRORS "100") set(CTEST_CUSTOM_MAXIMUM_NUMBER_OF_WARNINGS "100") set(CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE "51200") - set(CTEST_CUSTOM_COVERAGE_EXCLUDE "/usr/.*;/opt/.*;.*external/.*;.*samples/.*;.*tests/.*") + set(CTEST_CUSTOM_COVERAGE_EXCLUDE "/usr/.*;/opt/.*;.*external/.*;.*samples/.*;.*tests/.*;.*/details/.*") set(CTEST_MEMORYCHECK_TYPE "{MEMCHECK_TYPE}") set(CTEST_MEMORYCHECK_SUPPRESSIONS_FILE "{MEMCHECK_SUPPRESSION_FILE}") 
diff --git a/projects/rocprofiler-sdk/source/scripts/thread-sanitizer-suppr.txt b/projects/rocprofiler-sdk/source/scripts/thread-sanitizer-suppr.txt index 32d5847d3f..069460aa20 100644 --- a/projects/rocprofiler-sdk/source/scripts/thread-sanitizer-suppr.txt +++ b/projects/rocprofiler-sdk/source/scripts/thread-sanitizer-suppr.txt @@ -7,3 +7,7 @@ thread:libhsa-runtime64.so # unlock of an unlocked mutex (or by a wrong thread) mutex:librocm_smi64.so + +# google logging +race:google::LogMessageTime::CalcGmtOffset +race:tzset_internal diff --git a/projects/rocprofiler-sdk/source/scripts/update-doxygen.sh b/projects/rocprofiler-sdk/source/scripts/update-doxygen.sh index e183d771e2..b9567d8986 100755 --- a/projects/rocprofiler-sdk/source/scripts/update-doxygen.sh +++ b/projects/rocprofiler-sdk/source/scripts/update-doxygen.sh @@ -3,8 +3,14 @@ WORK_DIR=$(cd $(dirname ${BASH_SOURCE[0]})/../docs &> /dev/null && pwd) SOURCE_DIR=$(cd ${WORK_DIR}/../.. &> /dev/null && pwd) +pushd ${SOURCE_DIR} +cmake -B build-docs ${SOURCE_DIR} -DROCPROFILER_INTERNAL_BUILD_DOCS=ON +popd + +pushd ${WORK_DIR} cmake -DSOURCE_DIR=${SOURCE_DIR} -P generate-doxyfile.cmake doxygen rocprofiler.dox doxysphinx build ${WORK_DIR} ${WORK_DIR}/_build/html ${WORK_DIR}/_doxygen/html +popd