Contexts, tracing, include reorg, registration, thread-pool (#65)
* Update scripts/update-doxygen.sh
- ensure build-docs folder exists
* Update scripts/run-ci.py
- exclude files in details subdirectory from code coverage
* Update scripts/thread-sanitizer-suppr.txt
- exclude races in glog
* Update docs/rocprofiler.dox.in
- exclude defines in include/rocprofiler/defines.h from doxygen
- Tweak EXCLUDE_PATTERNS and EXAMPLE_PATTERNS
* Update docs workflow
- trigger workflow whenever there is a change to the public headers (which may be doxygen comments)
* Update include/rocprofiler (reorg and overhaul)
- rocprofiler_status_t additions
- CONTEXT_NOT_FOUND
- CONTEXT_ERROR
- INVALID_CONTEXT_ID
- INVALID_CONTEXT
- BUFFER_BUSY
- rocprofiler_context_is_active func
- rocprofiler_context_is_valid func
- rocprofiler_service_callback_tracing_kind_t update
- remove ROCPROFILER_SERVICE_CALLBACK_TRACING_HELPER_THREAD
- Remove rocprofiler_tracing_helper_thread_operation_t
- Remove rocprofiler_helper_thread_callback_tracer_data_t
- Added rocprofiler_internal_thread_library_t
- Added rocprofiler_at_internal_thread_create
- split rocprofiler.h into several smaller headers
- reworked rocprofiler_status_t values
- added doxygen comments for enums
- replaced rocprofiler_trace_record_operation_kind_t with rocprofiler_trace_operation_t
- use @ instead of / in doxygen comment in rocprofiler_plugin.h
- fix ref to ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API
- end group in fwd.h
- remove PROFILE_COUNTING group in dispatch_profile.h
- remove premature group close in callback_tracing.h
- hsa.h: remove rocprofiler_hsa_trace_data_t
- fwd.h: remove rocprofiler_tracer_callback_data_t
- rename rocprofiler_correlation_id_t.handle to rocprofiler_correlation_id_t.id (consistency)
- fwd.h: add rocprofiler_callback_tracing_record_t
- callback_tracing.h: update rocprofiler_hsa_api_callback_tracer_data_t
- callback_tracing.h: add size fields
- simplify rocprofiler_tracer_callback_t
- removed ROCPROFILER_NONNULL from rocprofiler_get_version
- added rocprofiler_get_timestamp
- add ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED to rocprofiler_status_t
- add ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND to rocprofiler_status_t
- add rocprofiler_buffer_category_t
- rocprofiler_trace_operation_t -> rocprofiler_tracing_operation_t
- rocprofiler_user_data_t union
- tweak rocprofiler_callback_tracing_record_t
- make external_correlation_id non-pointer
- add rocprofiler_user_data_t data field
- tweak rocprofiler_record_header_t
- instead of single uint64_t kind field, have union for category + kind (two u32) with u64 hash
- API extensions for kind id <-> kind string
- API extensions for operation id <-> operation string
- rocprofiler_callback_trace_kind_name_cb_t
- rocprofiler_callback_trace_operation_name_cb_t
- rocprofiler_iterate_callback_trace_kind_names
- rocprofiler_iterate_callback_trace_kind_operation_names
- modify rocprofiler_hsa_api_callback_tracer_data_t data members (remove pointers)
- add rocprofiler_callback_trace_operation_args_cb_t function pointer typedef
- add rocprofiler_iterate_callback_trace_operation_args function
- fixed inconsistent use of *_trace_* vs. *_tracing_* (opting for tracing)
- removed rocprofiler_query_callback_trace_kind_name
- removed rocprofiler_query_callback_kind_operation_name
- Add include/rocprofiler/registration.h
- header dedicated to registering a tool/client with rocprofiler
- this header is not intended to be included by rocprofiler.h
- rocprofiler_client_id_t
- identifier for client tool
- rocprofiler_client_finalize_t
- function pointer prototype for tool-initiated finalization
- rocprofiler_tool_initialize_t
- function pointer prototype for tool initialization (i.e. configuration)
- rocprofiler_tool_finalize_t
- function pointer prototype for tool finalization
- rocprofiler_tool_configure_result_t
- struct returned by tool/client to rocprofiler
- rocprofiler_is_initialized
- function for querying whether tool-induced initialization is possible
- rocprofiler_is_finalized
- function for querying whether rocprofiler has been finalized
- rocprofiler_configure prototype
- this is the function tools implement
- prototype is always marked as having default visibility
- no implementation in rocprofiler
- added typedef for rocprofiler_configure function pointer
- added rocprofiler_force_configure to explicitly invoke rocprofiler_configure instead of relying on lazy init
- made callback typedef names more consistent (_cb_t suffix)
- typedef for rocprofiler_internal_thread_library_cb_t function pointer
- added rocprofiler_at_internal_thread_create function
- added rocprofiler_callback_thread_t struct
- added rocprofiler_create_callback_thread function
- added rocprofiler_assign_callback_thread function
- removed rocprofiler_buffer_tracing_record_header_t in favor of kind and correlation id in each record type
- added rocprofiler_buffer_tracing_kind_name_cb_t typedef
- added rocprofiler_buffer_tracing_operation_name_cb_t typedef
- added rocprofiler_iterate_buffer_tracing_kind_names function
- added rocprofiler_iterate_buffer_tracing_kind_operation_names function
- removed rocprofiler_query_buffer_trace_kind_name function
- removed rocprofiler_query_buffer_kind_operation_name function
* Update lib/common/container/stable_vector.hpp
- include limits header
- reserve_size struct
- overload stable_vector constructor to support reserving as part of construction
* Update lib/common/container/record_header_buffer.{hpp,cpp}
- add emplace member function accepting category and kind (two u32 variables) instead of one u64 kind
- use std::shared_mutex to prevent data-race when reading m_headers
- record_header_buffer is now multiple writer, single reader
- add read_lock member function (shared)
- add read_unlock member function (shared)
- lock member function gets exclusive lock
- unlock member function releases exclusive lock
* Rename "config" to "context" + restructure + implement
- Restructure config files + license
- move config files into lib/rocprofiler/config subfolder
- rename some files
- add license to some files which were missing it
- Rename config/helpers.hpp
- rename to allocator.hpp
- remove get_domain_max_ops
- Create config/domain.{hpp,cpp}
- structures for handling tracing domains and ops
- Update config/config.{hpp,cpp}
- buffer_instance struct
- callback_tracing_service struct
- buffer_tracing_service struct
- config struct
- allocate_{config,buffer} func
- {validate,start,stop}_config funcs
- get_registered_configs func
- get_active_configs func
- get_buffers func
- Update rocprofiler.cpp
- Implement rocprofiler_create_context
- Implement rocprofiler_start_context
- Implement rocprofiler_stop_context
- Implement rocprofiler_context_is_active
- Implement rocprofiler_context_is_valid
- Implement rocprofiler_flush_buffer
- Implement rocprofiler_destroy_buffer
- Implement rocprofiler_create_buffer
- Update lib/rocprofiler/hsa
- use rocprofiler_tracer_activity_domain_t instead of rocprofiler_tracer_activity_domain_t
- remove ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API from HSA_API_INFO_DEFINITION_* macros
- Update lib/rocprofiler/context/domain.*
- fixes for domain_info (i.e. use correct enums)
- update rocprofiler_status_t codes
- fix template instantiations
- Update lib/rocprofiler/context/context.*
- use rocprofiler_service_callback_tracing_kind_t instead of rocprofiler_tracer_activity_domain_t
- rename correlation_context to correlation_tracing_service
- fix domains in callback_tracing_service and buffer_tracing_service
- unique_ptr for callback_tracer and buffered_tracer in context
- Update lib/rocprofiler/rocprofiler.cpp
- implement rocprofiler_configure_callback_tracing_service
- Update lib/rocprofiler/hsa/ostream.hpp
- include rocprofiler.h instead of tracer.hpp
- Update lib/rocprofiler/hsa
- migration to use rocprofiler_hsa_api_callback_tracer_data_t instead of rocprofiler_hsa_trace_data_t
- restructure hsa_api_impl<Idx>
- remove phase_enter and phase_exit
- add set_data_args (partial replacement for phase_enter)
- functor handles the contexts
- Update lib/rocprofiler/rocprofiler.cpp
- implement rocprofiler_get_version
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
- remove hsa_api_ prefix for functions already in hsa namespace
- Update lib/rocprofiler/context/context.{hpp,cpp}
- add client_idx to context struct (tool identifier)
- add push_client function to set client_idx before context is allocated
- add pop_client function to remove client identifier from future context creations
- implemented {registered,active}_contexts and buffers to use new container::reserve_size overload to stable_vector
- fix implementation of start_context
- fix implementation of stop_context
- Update lib/rocprofiler/rocprofiler.cpp
- prevent context creation, buffer creation, pc sampling config, etc. after initialization
- add nullptr checks to rocprofiler_context_is_valid
- fix rocprofiler_configure_callback_tracing_service
- was checking size of buffers, not registered context
- implement rocprofiler_iterate_callback_trace_kind_names
- implement rocprofiler_iterate_callback_trace_kind_operation_names
- Update lib/rocprofiler/CMakeLists.txt
- add registration.{hpp,cpp} to rocprofiler-library target sources
- Update lib/rocprofiler/hsa/utils.hpp
- fix using fmt::format with const char* strings
- remove join functions (no longer used)
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
- remove args_string function
- remove named_args_string function
- update iterate_args function
- change callback type
- accept user data
- rework the hsa_api_impl<Idx>::functor function
- save the rocprofiler_callback_tracing_record_t between callbacks
- update update_table function
- check buffered_tracer domains
- remove comments
- Update lib/rocprofiler/hsa/defines.hpp
- remove MEMBER_<N> macros
- add ADDR_MEMBER_<N> macros
- remove doxygen comments for GET_MEMBER_FIELDS
- add GET_ADDR_MEMBER_FIELDS
- update HSA_API_INFO_DEFINITION_{0,V}
- rename domain_idx to callback_domain_idx
- add buffered_domain_idx
- add as_arg_addr function
- Update lib/rocprofiler/rocprofiler.cpp
- implement rocprofiler_iterate_callback_trace_operation_args
- Remove lib/rocprofiler/tracing.{hpp,cpp} and lib/rocprofiler/CMakeLists.txt
- unused
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
- support buffered tracing in hsa_api_impl<Idx>::functor
- rocprofiler_callback_trace_operation_args_cb_t -> rocprofiler_callback_tracing_operation_args_cb_t
- i.e. trace -> tracing
- Update lib/rocprofiler/context/context.{hpp,cpp}
- removed buffer_instance struct
- removed allocate_buffer function
- removed get_buffers function
- changed buffer_tracing_service::buffer_array_t
- Update lib/rocprofiler/hsa: hsa.cpp, ostream.hpp, details folder
- move ostream.hpp into details folder to prevent from contributing to code coverage
- update cmake build system for new directory
* Add lib/rocprofiler/registration.{hpp,cpp}
- implements rocprofiler_set_api_table (called by rocprofiler-register)
- miscellaneous functions for client configure/initialize/finalize
- functions for querying the init/fini status
- relocated OnLoad HSA workaround to this file
- at present, this is used to workaround ROCr not having rocprofiler-register integration yet
- implement rocprofiler_force_configure function
- implement rocprofiler_is_initialized function
- implement rocprofiler_is_finalized function
- ensure configure functions only invoked once
- ensure internal thread creation notification functions are invoked
- get_status is pair of atomics
- fix heap-use-after-free in init_logging
- update finalize
- invoke hsa_shut_down
- set all active contexts to null pointers
* Add lib/rocprofiler/buffer_tracing.cpp
- contains implementations of buffer_tracing (i.e. rocprofiler/buffer_tracing.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp
* Add lib/rocprofiler/buffer.{hpp,cpp}
- contains implementations of buffer (i.e. rocprofiler/buffer.h) and misc internal access functions
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp and lib/rocprofiler/context/context.{hpp,cpp}
* Add lib/rocprofiler/callback_tracing.cpp
- contains implementations of callback_tracing (i.e. rocprofiler/callback_tracing.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp
* Add lib/rocprofiler/context.cpp
- contains implementations of context public API functions (i.e. rocprofiler/context.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp
* Add lib/rocprofiler/internal_threading.{hpp,cpp}
- contains implementations of internal_threading (i.e. rocprofiler/internal_threading.h)
- also contains implementations of internal access functions
- update finalize function
- join all task groups and destroy all thread pools first, then reset unique_ptr
* Update lib/rocprofiler/rocprofiler.cpp
- rocprofiler_get_version returns status
- implement rocprofiler_get_timestamp
- remove misc implementations that were split into other files
* Update lib/rocprofiler/CMakeLists.txt
- compile new implementation files
- buffer.cpp
- buffer_tracing.cpp
- callback_tracing.cpp
- context.cpp
- internal_threading.cpp
* Update lib/tests/buffering/buffering-*.cpp
- update to reflect changes to rocprofiler_record_header_t
* Update CMakeLists.txt
- increase minimum cmake version to 3.21 which added HIP support as a language
* Add samples/apps/transpose
- simple HIP application for testing
* Add samples/api_callback_tracing
- HIP application and tool library
- This effectively demos how to set up HSA API tracing
- For each function called in tool, it stores the func/file/line and prints it during finalization
- client.hpp and client.cpp are the tool library
- Implement use of rocprofiler_iterate_callback_trace_operation_args
- add demo of using rocprofiler_get_version
- add_test
- remove PASS_REGULAR_EXPRESSION
- causing false passes during memcheck
- add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment
- check if rocprofiler is initialized before stopping context
* Add samples/api_buffered_tracing
- Sample demonstrating tracing the HSA API via buffering
- demo rocprofiler_record_header_compute_hash
- throw exceptions for unexpected buffer data
- add_test
- remove PASS_REGULAR_EXPRESSION
- causing false passes during memcheck
- add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment
* Update samples/CMakeLists.txt
- add subdirectory for api_callback_tracing
- add subdirectory api_buffered_tracing
* Update samples/pc_sampling/common.h
- fix processing of headers
* Update lib/rocprofiler/hsa/details/ostream.hpp
- fix data race on HSA_depth_max_cnt and recursion
- HSA_depth_max_cnt and recursion is now thread-local static instead of global static
- replace std::string usage with std::string_view
* Actions update
- add dependabot.yml
- use actions/checkout@v4
- install latest libasan and libtsan in sanitizer containers
* Add PTL (Parallel Tasking Library) submodule
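
The record_header_buffer entry above mentions a std::shared_mutex with shared (read_lock/read_unlock) and exclusive (lock/unlock) access. The following is only a minimal sketch of how shared and exclusive locking differ with std::shared_mutex. The class name header_log and its members are hypothetical and chosen for illustration; this is not the rocprofiler implementation and does not reproduce its writer/reader policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical container used only for illustration; not the rocprofiler class.
class header_log
{
public:
    // Exclusive access: no other thread may hold the mutex in any mode.
    void push(uint64_t value)
    {
        auto lk = std::unique_lock<std::shared_mutex>{m_mutex};
        m_headers.push_back(value);
    }

    // Shared access: several threads may hold this lock at the same time.
    std::size_t size() const
    {
        auto lk = std::shared_lock<std::shared_mutex>{m_mutex};
        return m_headers.size();
    }

    // Shared access while iterating, so the vector cannot be resized mid-loop.
    uint64_t sum() const
    {
        auto     lk    = std::shared_lock<std::shared_mutex>{m_mutex};
        uint64_t total = 0;
        for(auto v : m_headers)
            total += v;
        return total;
    }

private:
    mutable std::shared_mutex m_mutex;
    std::vector<uint64_t>     m_headers;
};
```

In this sketch, calls to size() and sum() can overlap with each other, while push() waits until it is the only holder of the mutex. Requires C++17.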
[ROCm/rocprofiler-sdk commit: d3eaacd610]
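
The commit message describes rocprofiler_record_header_t as holding a category and a kind (two u32 values) that together form a u64 hash, and the buffered tracing sample checks the hash against rocprofiler_record_header_compute_hash. One plausible way to pack two 32-bit values into one 64-bit value is sketched below. The bit layout (category in the upper half, kind in the lower half) and the function names compute_hash, hash_category, and hash_kind are assumptions made for illustration; the actual rocprofiler definition may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: category occupies the upper 32 bits and kind the lower 32 bits.
constexpr uint64_t
compute_hash(uint32_t category, uint32_t kind)
{
    return (static_cast<uint64_t>(category) << 32) | static_cast<uint64_t>(kind);
}

// Recover the (assumed) upper half of the packed value.
constexpr uint32_t
hash_category(uint64_t hash)
{
    return static_cast<uint32_t>(hash >> 32);
}

// Recover the (assumed) lower half of the packed value.
constexpr uint32_t
hash_kind(uint64_t hash)
{
    return static_cast<uint32_t>(hash & 0xffffffffULL);
}
```

With a packing like this, a consumer can compare a single 64-bit value instead of two separate fields, which is the kind of consistency check the sample performs on each header.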
This commit is contained in:
@@ -0,0 +1,11 @@
+# To get started with Dependabot version updates, you'll need to specify which
+# package ecosystems to update and where the package manifests are located.
+# Please see the documentation for all configuration options:
+# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates
+
+version: 2
+updates:
+  - package-ecosystem: "github-actions" # See documentation for possible values
+    directory: "/" # Location of package manifests
+    schedule:
+      interval: "weekly"
@@ -72,7 +72,7 @@ jobs:
     needs: get_latest_mainline_build_number

     steps:
-    - uses: actions/checkout@v2
+    - uses: actions/checkout@v4

     - name: List Files
       shell: bash
@@ -161,7 +161,9 @@ jobs:
     needs: get_latest_mainline_build_number

     steps:
-    - uses: actions/checkout@v2
+    - uses: actions/checkout@v4
+      with:
+        submodules: true

     - name: List Files
       shell: bash
@@ -174,7 +176,7 @@ jobs:
       shell: bash
       run: |
         pip3 install -r requirements.txt
-        apt install -y cmake libgtest-dev
+        apt install -y cmake libgtest-dev libasan8 libtsan2
         git config --global --add safe.directory '*'

     - name: Configure, Build, and Test
@@ -6,18 +6,20 @@ on:
    branches: [main]
    paths:
      - '*.md'
      - 'VERSION'
      - 'source/docs/**'
      - 'source/scripts/update-docs.sh'
      - 'source/include/rocprofiler/*'
      - '.github/workflows/docs.yml'
      - 'VERSION'
  pull_request:
    branches: [main]
    paths:
      - '*.md'
      - 'VERSION'
      - 'source/docs/**'
      - 'source/scripts/update-docs.sh'
      - 'source/include/rocprofiler/*'
      - '.github/workflows/docs.yml'
      - 'VERSION'

concurrency:
  group: "pages"
@@ -35,7 +37,7 @@ jobs:
       id-token: write
     steps:
       - name: Checkout
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4
         with:
           submodules: true
       - name: Install Conda
@@ -19,7 +19,7 @@ jobs:
     runs-on: ubuntu-22.04

     steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4

     - name: Extract branch name
       shell: bash
@@ -60,7 +60,7 @@ jobs:
     runs-on: ubuntu-22.04

     steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4

     - name: Install dependencies
       run: |
@@ -105,7 +105,7 @@ jobs:
         python-version: ['3.10']

     steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4

     - name: Extract branch name
       shell: bash
@@ -10,3 +10,6 @@
 [submodule "source/docs/doxygen-awesome-css"]
     path = external/doxygen-awesome-css
     url = https://github.com/jothepro/doxygen-awesome-css.git
+[submodule "external/ptl"]
+    path = external/ptl
+    url = https://github.com/jrmadsen/PTL
@@ -1,4 +1,4 @@
-cmake_minimum_required(VERSION 3.16 FATAL_ERROR)
+cmake_minimum_required(VERSION 3.21 FATAL_ERROR)

 if(CMAKE_SOURCE_DIR STREQUAL CMAKE_BINARY_DIR AND CMAKE_CURRENT_SOURCE_DIR STREQUAL
                                                   CMAKE_SOURCE_DIR)
@@ -146,3 +146,11 @@ find_package(
     lib/cmake/amd_comgr)

 target_link_libraries(rocprofiler-amd-comgr INTERFACE amd_comgr)
+
+# ----------------------------------------------------------------------------------------#
+#
+# PTL (Parallel Tasking Library)
+#
+# ----------------------------------------------------------------------------------------#
+
+target_link_libraries(rocprofiler-ptl INTERFACE PTL::ptl-static)
@@ -49,3 +49,4 @@ rocprofiler_add_interface_library(rocprofiler-gtest "Google Test library" INTERN
 rocprofiler_add_interface_library(rocprofiler-glog "Google Log library" INTERNAL)
 rocprofiler_add_interface_library(rocprofiler-fmt "C++ format string library" INTERNAL)
 rocprofiler_add_interface_library(rocprofiler-stdcxxfs "C++ filesystem library" INTERNAL)
+rocprofiler_add_interface_library(rocprofiler-ptl "Parallel Tasking Library" INTERNAL)
@@ -88,3 +88,45 @@ else()
     find_package(fmt REQUIRED)
     target_link_libraries(rocprofiler-fmt INTERFACE fmt::fmt)
 endif()
+
+if(NOT TARGET PTL::ptl-static)
+    rocprofiler_checkout_git_submodule(
+        RELATIVE_PATH external/ptl
+        WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
+        REPO_URL https://github.com/jrmadsen/PTL.git
+        REPO_BRANCH rocprofiler)
+
+    set(PTL_BUILD_EXAMPLES OFF)
+    set(PTL_USE_TBB OFF)
+    set(PTL_USE_GPU OFF)
+    set(PTL_DEVELOPER_INSTALL OFF)
+
+    if(NOT DEFINED BUILD_OBJECT_LIBS)
+        set(BUILD_OBJECT_LIBS OFF)
+    endif()
+
+    if(NOT DEFINED BUILD_STATIC_LIBS)
+        set(BUILD_STATIC_LIBS OFF)
+    endif()
+
+    rocprofiler_save_variables(
+        BUILD_CONFIG
+        VARIABLES BUILD_SHARED_LIBS BUILD_STATIC_LIBS BUILD_OBJECT_LIBS
+                  CMAKE_POSITION_INDEPENDENT_CODE CMAKE_CXX_VISIBILITY_PRESET
+                  CMAKE_VISIBILITY_INLINES_HIDDEN)
+
+    set(BUILD_SHARED_LIBS OFF)
+    set(BUILD_STATIC_LIBS ON)
+    set(BUILD_OBJECT_LIBS OFF)
+    set(CMAKE_POSITION_INDEPENDENT_CODE ON)
+    set(CMAKE_CXX_VISIBILITY_PRESET "hidden")
+    set(CMAKE_VISIBILITY_INLINES_HIDDEN ON)
+
+    add_subdirectory(ptl EXCLUDE_FROM_ALL)
+
+    rocprofiler_restore_variables(
+        BUILD_CONFIG
+        VARIABLES BUILD_SHARED_LIBS BUILD_STATIC_LIBS BUILD_OBJECT_LIBS
+                  CMAKE_POSITION_INDEPENDENT_CODE CMAKE_CXX_VISIBILITY_PRESET
+                  CMAKE_VISIBILITY_INLINES_HIDDEN)
+endif()
+1
Submodule projects/rocprofiler-sdk/external/ptl added at 7bbc5a4e66
@@ -5,3 +5,5 @@ project(rocprofiler-samples LANGUAGES C CXX)

 # add_subdirectory(api_tracing)
 add_subdirectory(pc_sampling)
+add_subdirectory(api_callback_tracing)
+add_subdirectory(api_buffered_tracing)
@@ -0,0 +1,52 @@
+#
+#
+#
+cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)
+
+if(NOT CMAKE_HIP_COMPILER)
+    find_program(
+        amdclangpp_EXECUTABLE
+        NAMES amdclang++
+        HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
+        PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
+        PATH_SUFFIXES bin llvm/bin NO_CACHE)
+    mark_as_advanced(amdclangpp_EXECUTABLE)
+
+    if(amdclangpp_EXECUTABLE)
+        set(CMAKE_HIP_COMPILER "${amdclangpp_EXECUTABLE}")
+    endif()
+endif()
+
+project(rocprofiler-samples-buffered-api-tracing LANGUAGES CXX HIP)
+
+foreach(_TYPE DEBUG MINSIZEREL RELEASE RELWITHDEBINFO)
+    if("${CMAKE_HIP_FLAGS_${_TYPE}}" STREQUAL "")
+        set(CMAKE_HIP_FLAGS_${_TYPE} "${CMAKE_CXX_FLAGS_${_TYPE}}")
+    endif()
+endforeach()
+
+add_library(buffered-api-tracing-client SHARED)
+target_sources(buffered-api-tracing-client PRIVATE client.cpp client.hpp)
+target_link_libraries(buffered-api-tracing-client
+                      PRIVATE rocprofiler::rocprofiler-library)
+
+set_source_files_properties(main.cpp PROPERTIES LANGUAGE HIP)
+find_package(Threads REQUIRED)
+
+add_executable(buffered-api-tracing)
+target_sources(buffered-api-tracing PRIVATE main.cpp)
+target_link_libraries(buffered-api-tracing PRIVATE buffered-api-tracing-client
+                                                   Threads::Threads)
+
+add_test(NAME buffered-api-tracing COMMAND $<TARGET_FILE:buffered-api-tracing>)
+
+set_tests_properties(
+    buffered-api-tracing
+    PROPERTIES
+        TIMEOUT
+        45
+        LABELS
+        "samples"
+        ENVIRONMENT
+        "${ROCPROFILER_MEMCHECK_PRELOAD_ENV};HSA_TOOLS_LIB=$<TARGET_FILE:rocprofiler::rocprofiler-library>"
+    )
@@ -0,0 +1,383 @@
+// MIT License
+//
+// Copyright (c) 2023 ROCm Developer Tools
+//
+// Permission is hereby granted, free of charge, to any person obtaining a copy
+// of this software and associated documentation files (the "Software"), to deal
+// in the Software without restriction, including without limitation the rights
+// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+// copies of the Software, and to permit persons to whom the Software is
+// furnished to do so, subject to the following conditions:
+//
+// The above copyright notice and this permission notice shall be included in all
+// copies or substantial portions of the Software.
+//
+// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+// SOFTWARE.
+
+// undefine NDEBUG so asserts are implemented
+#ifdef NDEBUG
+#    undef NDEBUG
+#endif
+
+/**
+ * @file samples/api_buffered_tracing/client.cpp
+ *
+ * @brief Example rocprofiler client (tool)
+ */
+
+#include "client.hpp"
+
+#include <rocprofiler/buffer.h>
+#include <rocprofiler/fwd.h>
+#include <rocprofiler/internal_threading.h>
+#include <rocprofiler/registration.h>
+#include <rocprofiler/rocprofiler.h>
+
+#include <cassert>
+#include <chrono>
+#include <cstddef>
+#include <cstdint>
+#include <cstdio>
+#include <cstdlib>
+#include <filesystem>
+#include <iostream>
+#include <mutex>
+#include <string>
+#include <string_view>
+#include <thread>
+#include <vector>
+
+#define ROCPROFILER_CALL(result, msg) \
+    { \
+        rocprofiler_status_t CHECKSTATUS = result; \
+        if(CHECKSTATUS != ROCPROFILER_STATUS_SUCCESS) \
+        { \
+            std::cerr << #result << " failed with error code " << CHECKSTATUS << std::endl; \
+            throw std::runtime_error(#result " failure"); \
+        } \
+    }
+
+namespace client
+{
+namespace
+{
+struct source_location
+{
+    std::string function = {};
+    std::string file     = {};
+    uint32_t    line     = 0;
+    std::string context  = {};
+};
+
+using call_stack_t = std::vector<source_location>;
+
+rocprofiler_client_id_t*      client_id        = nullptr;
+rocprofiler_client_finalize_t client_fini_func = nullptr;
+rocprofiler_context_id_t      client_ctx       = {};
+rocprofiler_buffer_id_t       client_buffer    = {};
+
+void
+print_call_stack(const call_stack_t& _call_stack)
+{
+    namespace fs = ::std::filesystem;
+
+    size_t n = 0;
+    for(const auto& itr : _call_stack)
+    {
+        std::clog << std::setw(2) << ++n << "/" << std::setw(2) << _call_stack.size() << " ";
+        std::clog << "[" << fs::path{itr.file}.filename() << ":" << itr.line << "] "
+                  << std::setw(20) << std::left << itr.function;
+        if(!itr.context.empty()) std::clog << " :: " << itr.context;
+        std::clog << "\n";
+    }
+
+    std::clog << std::flush;
+}
+
+void
+store_buffer_id_names(call_stack_t* tool_data)
+{
+    //
+    // buffered for each kind operation
+    //
+    static auto tracing_operation_names_cb = [](rocprofiler_service_buffer_tracing_kind_t /*kindv*/,
+                                                uint32_t /*operation*/,
+                                                const char* operation_name,
+                                                void*       data_v) {
+        static_cast<call_stack_t*>(data_v)->emplace_back(
+            source_location{"rocprofiler_iterate_buffer_trace_kind_operation_names",
+                            __FILE__,
+                            __LINE__,
+                            std::string{" "} + std::string{operation_name}});
+        return 0;
+    };
+
+    //
+    // callback for each buffer kind (i.e. domain)
+    //
+    static auto tracing_kind_names_cb =
+        [](rocprofiler_service_buffer_tracing_kind_t kind, const char* kind_name, void* data) {
+            // store the buffer kind name
+            static_cast<call_stack_t*>(data)->emplace_back(
+                source_location{"rocprofiler_iterate_buffer_trace_kind_names ",
+                                __FILE__,
+                                __LINE__,
+                                kind_name});
+
+            // store the operation names for the HSA API
+            if(kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API)
+            {
+                rocprofiler_iterate_buffer_tracing_kind_operation_names(
+                    kind, tracing_operation_names_cb, data);
+            }
+
+            return 0;
+        };
+
+    rocprofiler_iterate_buffer_tracing_kind_names(tracing_kind_names_cb,
+                                                  static_cast<void*>(tool_data));
+}
+
+void
+tool_tracing_callback(rocprofiler_context_id_t      context,
+                      rocprofiler_buffer_id_t       buffer_id,
+                      rocprofiler_record_header_t** headers,
+                      size_t                        num_headers,
+                      void*                         user_data,
+                      uint64_t                      drop_count)
+{
+    assert(user_data != nullptr);
+
+    if(num_headers == 0)
+        throw std::runtime_error{
+            "rocprofiler invoked a buffer callback with no headers. this should never happen"};
+    else if(headers == nullptr)
+        throw std::runtime_error{"rocprofiler invoked a buffer callback with a null pointer to the "
+                                 "array of headers. this should never happen"};
+
+    for(size_t i = 0; i < num_headers; ++i)
+    {
+        auto* header = headers[i];
+
+        if(header == nullptr)
+        {
+            throw std::runtime_error{
+                "rocprofiler provided a null pointer to header. this should never happen"};
+        }
+        else if(header->hash !=
+                rocprofiler_record_header_compute_hash(header->category, header->kind))
+        {
+            throw std::runtime_error{"rocprofiler_record_header_t (category | kind) != hash"};
+        }
+        else if(header->category == ROCPROFILER_BUFFER_CATEGORY_TRACING &&
+                header->kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API)
+        {
+            auto* record =
+                static_cast<rocprofiler_buffer_tracing_hsa_api_record_t*>(header->payload);
+            auto info = std::stringstream{};
+            info << "tid=" << record->thread_id << ", context=" << context.handle
+                 << ", buffer_id=" << buffer_id.handle << ", cid=" << record->correlation_id.id
+                 << ", kind=" << record->kind << ", operation=" << record->operation
+                 << ", drop_count=" << drop_count << ", start=" << record->start_timestamp
+                 << ", stop=" << record->end_timestamp;
+
+            if(record->start_timestamp > record->end_timestamp)
+                throw std::runtime_error("start > end");
+
+            static_cast<call_stack_t*>(user_data)->emplace_back(
+                source_location{__FUNCTION__, __FILE__, __LINE__, info.str()});
+        }
+        else
+        {
+            throw std::runtime_error{"unexpected rocprofiler_record_header_t category + kind"};
+        }
+    }
+}
+
+void
+thread_precreate(rocprofiler_internal_thread_library_t lib, void* tool_data)
+{
+    static_cast<call_stack_t*>(tool_data)->emplace_back(
+        source_location{__FUNCTION__,
+                        __FILE__,
+                        __LINE__,
+                        std::string{"internal thread about to be created by rocprofiler (lib="} +
+                            std::to_string(lib) + ")"});
+}
+
+void
+thread_postcreate(rocprofiler_internal_thread_library_t lib, void* tool_data)
+{
+    static_cast<call_stack_t*>(tool_data)->emplace_back(
+        source_location{__FUNCTION__,
+                        __FILE__,
+                        __LINE__,
+                        std::string{"internal thread was created by rocprofiler (lib="} +
+                            std::to_string(lib) + ")"});
+}
+
+int
+tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
+{
+    assert(tool_data != nullptr);
+
+    static_cast<call_stack_t*>(tool_data)->emplace_back(
+        source_location{__FUNCTION__, __FILE__, __LINE__, ""});
+
+    store_buffer_id_names(static_cast<call_stack_t*>(tool_data));
+
+    client_fini_func = fini_func;
+
+    ROCPROFILER_CALL(rocprofiler_create_context(&client_ctx), "context creation failed");
+
+    ROCPROFILER_CALL(rocprofiler_create_buffer(client_ctx,
+                                               4096,
+                                               2048,
+                                               ROCPROFILER_BUFFER_POLICY_LOSSLESS,
+                                               tool_tracing_callback,
+                                               tool_data,
+                                               &client_buffer),
+                     "buffer creation failed");
+
+    ROCPROFILER_CALL(
+        rocprofiler_configure_buffer_tracing_service(
+            client_ctx, ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, nullptr, 0, client_buffer),
+        "buffer tracing service failed to configure");
+
+    auto client_thread = rocprofiler_callback_thread_t{};
+    ROCPROFILER_CALL(rocprofiler_create_callback_thread(&client_thread),
+                     "failure creating callback thread");
+
+    ROCPROFILER_CALL(rocprofiler_assign_callback_thread(client_buffer, client_thread),
+                     "failed to assign thread for buffer");
+
+    int valid_ctx = 0;
+    ROCPROFILER_CALL(rocprofiler_context_is_valid(client_ctx, &valid_ctx),
+                     "failure checking context validity");
+    if(valid_ctx == 0)
+    {
+        // notify rocprofiler that initialization failed
+        // and all the contexts, buffers, etc. created
+        // should be ignored
+        return -1;
+    }
+
+    ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed");
+
+    // no errors
+    return 0;
+}
+
void
|
||||
tool_fini(void* tool_data)
|
||||
{
|
||||
assert(tool_data != nullptr);
|
||||
|
||||
auto* _call_stack = static_cast<call_stack_t*>(tool_data);
|
||||
_call_stack->emplace_back(source_location{__FUNCTION__, __FILE__, __LINE__, ""});
|
||||
|
||||
print_call_stack(*_call_stack);
|
||||
|
||||
delete _call_stack;
|
||||
}
|
||||
} // namespace
|
||||
|
||||
void
|
||||
setup()
|
||||
{
|
||||
ROCPROFILER_CALL(rocprofiler_force_configure(&rocprofiler_configure),
|
||||
"failed to force configuration");
|
||||
}
|
||||
|
||||
void
|
||||
shutdown()
|
||||
{
|
||||
if(client_id)
|
||||
{
|
||||
auto status = ROCPROFILER_STATUS_SUCCESS;
|
||||
while((status = rocprofiler_flush_buffer(client_buffer)) ==
|
||||
ROCPROFILER_STATUS_ERROR_BUFFER_BUSY)
|
||||
{
|
||||
std::this_thread::yield();
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds{10});
|
||||
}
|
||||
ROCPROFILER_CALL(status, "rocprofiler_flush_buffer failed");
|
||||
while((status = rocprofiler_flush_buffer(client_buffer)) ==
|
||||
ROCPROFILER_STATUS_ERROR_BUFFER_BUSY)
|
||||
{
|
||||
std::this_thread::yield();
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds{10});
|
||||
}
|
||||
client_fini_func(*client_id);
|
||||
}
|
||||
}
|
||||
|
||||
void
|
||||
start()
|
||||
{
|
||||
ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed");
|
||||
}
|
||||
|
||||
void
|
||||
stop()
|
||||
{
|
||||
ROCPROFILER_CALL(rocprofiler_stop_context(client_ctx), "rocprofiler context stop failed");
|
||||
}
|
||||
} // namespace client
|
||||
|
||||
extern "C" rocprofiler_tool_configure_result_t*
|
||||
rocprofiler_configure(uint32_t version,
|
||||
const char* runtime_version,
|
||||
uint32_t priority,
|
||||
rocprofiler_client_id_t* id)
|
||||
{
|
||||
// only activate if main tool
|
||||
if(priority > 0) return nullptr;
|
||||
|
||||
// set the client name
|
||||
id->name = "ExampleTool";
|
||||
|
||||
// store client info
|
||||
client::client_id = id;
|
||||
|
||||
// compute major/minor/patch version info
|
||||
uint32_t major = version / 10000;
|
||||
uint32_t minor = (version % 10000) / 100;
|
||||
uint32_t patch = version % 100;
|
||||
|
||||
// generate info string
|
||||
auto info = std::stringstream{};
|
||||
info << id->name << " is using rocprofiler v" << major << "." << minor << "." << patch << " ("
|
||||
<< runtime_version << ")";
|
||||
|
||||
std::clog << info.str() << std::endl;
|
||||
|
||||
auto* client_tool_data = new std::vector<client::source_location>{};
|
||||
|
||||
client_tool_data->emplace_back(
|
||||
client::source_location{__FUNCTION__, __FILE__, __LINE__, info.str()});
|
||||
|
||||
ROCPROFILER_CALL(rocprofiler_at_internal_thread_create(
|
||||
client::thread_precreate,
|
||||
client::thread_postcreate,
|
||||
ROCPROFILER_LIBRARY | ROCPROFILER_HSA_LIBRARY | ROCPROFILER_HIP_LIBRARY |
|
||||
ROCPROFILER_MARKER_LIBRARY,
|
||||
static_cast<void*>(client_tool_data)),
|
||||
"failed to register for thread creation notifications");
|
||||
|
||||
// create configure data
|
||||
static auto cfg =
|
||||
rocprofiler_tool_configure_result_t{sizeof(rocprofiler_tool_configure_result_t),
|
||||
&client::tool_init,
|
||||
&client::tool_fini,
|
||||
static_cast<void*>(client_tool_data)};
|
||||
|
||||
// return pointer to configure data
|
||||
return &cfg;
|
||||
}
|
||||
@@ -0,0 +1,44 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once

#ifdef buffered_api_tracing_client_EXPORTS
# define CLIENT_API __attribute__((visibility("default")))
#else
# define CLIENT_API
#endif

namespace client
{
void
setup() CLIENT_API;

void
shutdown() CLIENT_API;

void
start() CLIENT_API;

void
stop() CLIENT_API;
} // namespace client
|
||||
@@ -0,0 +1,244 @@
|
||||
/*
|
||||
Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved.
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in
|
||||
all copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
||||
THE SOFTWARE.
|
||||
*/
|
||||
|
||||
#include "client.hpp"
|
||||
|
||||
#include "hip/hip_runtime.h"
|
||||
|
||||
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <random>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>
|
||||
|
||||
#define HIP_API_CALL(CALL) \
|
||||
{ \
|
||||
hipError_t error_ = (CALL); \
|
||||
if(error_ != hipSuccess) \
|
||||
{ \
|
||||
auto _hip_api_print_lk = auto_lock_t{print_lock}; \
|
||||
fprintf(stderr, \
|
||||
"%s:%d :: HIP error : %s\n", \
|
||||
__FILE__, \
|
||||
__LINE__, \
|
||||
hipGetErrorString(error_)); \
|
||||
throw std::runtime_error("hip_api_call"); \
|
||||
} \
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
using auto_lock_t = std::unique_lock<std::mutex>;
|
||||
auto print_lock = std::mutex{};
|
||||
size_t nthreads = 2;
|
||||
size_t nitr = 500;
|
||||
size_t nsync = 10;
|
||||
constexpr unsigned shared_mem_tile_dim = 32;
|
||||
|
||||
void
|
||||
check_hip_error(void);
|
||||
|
||||
void
|
||||
verify(int* in, int* out, int M, int N);
|
||||
} // namespace
|
||||
|
||||
__global__ void
|
||||
transpose_a(int* in, int* out, int M, int N);
|
||||
|
||||
void
|
||||
run(int rank, int tid, hipStream_t stream, int argc, char** argv);
|
||||
|
||||
int
|
||||
main(int argc, char** argv)
|
||||
{
|
||||
client::setup(); // forces rocprofiler to configure/initialize
|
||||
client::start(); // starts context before any API tables are available
|
||||
|
||||
int rank = 0;
|
||||
int size = 1;
|
||||
for(int i = 1; i < argc; ++i)
|
||||
{
|
||||
auto _arg = std::string{argv[i]};
|
||||
if(_arg == "?" || _arg == "-h" || _arg == "--help")
|
||||
{
|
||||
fprintf(stderr,
|
||||
"usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATIONS (%zu)] "
|
||||
"[SYNC_EVERY_N_ITERATIONS (%zu)]\n",
|
||||
nthreads,
|
||||
nitr,
|
||||
nsync);
|
||||
exit(EXIT_SUCCESS);
|
||||
}
|
||||
}
|
||||
if(argc > 1) nthreads = atoll(argv[1]);
|
||||
if(argc > 2) nitr = atoll(argv[2]);
|
||||
if(argc > 3) nsync = atoll(argv[3]);
|
||||
|
||||
printf("[transpose] Number of threads: %zu\n", nthreads);
|
||||
printf("[transpose] Number of iterations: %zu\n", nitr);
|
||||
printf("[transpose] Syncing every %zu iterations\n", nsync);
|
||||
|
||||
// this is a temporary workaround in omnitrace when HIP + MPI is enabled
|
||||
int ndevice = 0;
|
||||
int devid = rank;
|
||||
HIP_API_CALL(hipGetDeviceCount(&ndevice));
|
||||
printf("[transpose] Number of devices found: %i\n", ndevice);
|
||||
if(ndevice > 0)
|
||||
{
|
||||
devid = rank % ndevice;
|
||||
HIP_API_CALL(hipSetDevice(devid));
|
||||
printf("[transpose] Rank %i assigned to device %i\n", rank, devid);
|
||||
}
|
||||
if(rank == devid && rank < ndevice)
|
||||
{
|
||||
std::vector<std::thread> _threads{};
|
||||
std::vector<hipStream_t> _streams(nthreads);
|
||||
for(size_t i = 0; i < nthreads; ++i)
|
||||
HIP_API_CALL(hipStreamCreate(&_streams.at(i)));
|
||||
for(size_t i = 1; i < nthreads; ++i)
|
||||
_threads.emplace_back(run, rank, i, _streams.at(i), argc, argv);
|
||||
run(rank, 0, _streams.at(0), argc, argv);
|
||||
for(auto& itr : _threads)
|
||||
itr.join();
|
||||
for(size_t i = 0; i < nthreads; ++i)
|
||||
HIP_API_CALL(hipStreamDestroy(_streams.at(i)));
|
||||
}
|
||||
HIP_API_CALL(hipDeviceSynchronize());
|
||||
HIP_API_CALL(hipDeviceReset());
|
||||
|
||||
client::stop();
|
||||
client::shutdown();
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
__global__ void
|
||||
transpose_a(int* in, int* out, int M, int N)
|
||||
{
|
||||
__shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim];
|
||||
|
||||
int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x;
|
||||
tile[threadIdx.y][threadIdx.x] = in[idx];
|
||||
__syncthreads();
|
||||
idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x;
|
||||
out[idx] = tile[threadIdx.x][threadIdx.y];
|
||||
}
|
||||
|
||||
void
|
||||
run(int rank, int tid, hipStream_t stream, int argc, char** argv)
|
||||
{
|
||||
unsigned int M = 4960 * 2;
|
||||
unsigned int N = 4960 * 2;
|
||||
if(argc > 2) nitr = atoll(argv[2]);
|
||||
if(argc > 3) nsync = atoll(argv[3]);
|
||||
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl;
|
||||
_lk.unlock();
|
||||
|
||||
std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)};
|
||||
std::uniform_int_distribution<int> _dist{0, 1000};
|
||||
|
||||
size_t size = sizeof(int) * M * N;
|
||||
int* inp_matrix = new int[M * N]; // size is a byte count; allocate M * N elements
int* out_matrix = new int[M * N];
|
||||
for(size_t i = 0; i < M * N; i++)
|
||||
{
|
||||
inp_matrix[i] = _dist(_engine);
|
||||
out_matrix[i] = 0;
|
||||
}
|
||||
int* in = nullptr;
|
||||
int* out = nullptr;
|
||||
|
||||
HIP_API_CALL(hipMalloc(&in, size));
|
||||
HIP_API_CALL(hipMalloc(&out, size));
|
||||
HIP_API_CALL(hipMemsetAsync(in, 0, size, stream));
|
||||
HIP_API_CALL(hipMemsetAsync(out, 0, size, stream));
|
||||
HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream));
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
|
||||
dim3 grid(M / 32, N / 32, 1);
|
||||
dim3 block(32, 32, 1); // transpose_a
|
||||
|
||||
auto t1 = std::chrono::high_resolution_clock::now();
|
||||
for(size_t i = 0; i < nitr; ++i)
|
||||
{
|
||||
transpose_a<<<grid, block, 0, stream>>>(in, out, M, N);
|
||||
check_hip_error();
|
||||
if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
}
|
||||
auto t2 = std::chrono::high_resolution_clock::now();
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream));
|
||||
double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
|
||||
float GB = (float) size * nitr * 2 / (1 << 30);
|
||||
|
||||
print_lock.lock();
|
||||
std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n"
|
||||
<< "The average performance of transpose is " << GB / time << " GBytes/sec"
|
||||
<< std::endl;
|
||||
print_lock.unlock();
|
||||
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
|
||||
// cpu_transpose(matrix, out_matrix, M, N);
|
||||
verify(inp_matrix, out_matrix, M, N);
|
||||
|
||||
HIP_API_CALL(hipFree(in));
|
||||
HIP_API_CALL(hipFree(out));
|
||||
|
||||
delete[] inp_matrix;
|
||||
delete[] out_matrix;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
void
|
||||
check_hip_error(void)
|
||||
{
|
||||
hipError_t err = hipGetLastError();
|
||||
if(err != hipSuccess)
|
||||
{
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cerr << "Error: " << hipGetErrorString(err) << std::endl;
|
||||
throw std::runtime_error("hip_api_call");
|
||||
}
|
||||
}
|
||||
|
||||
void
|
||||
verify(int* in, int* out, int M, int N)
|
||||
{
|
||||
for(int i = 0; i < 10; i++)
|
||||
{
|
||||
int row = rand() % M;
|
||||
int col = rand() % N;
|
||||
if(in[row * N + col] != out[col * M + row])
|
||||
{
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | "
|
||||
<< out[col * M + row] << "\n";
|
||||
}
|
||||
}
|
||||
}
|
||||
} // namespace
|
||||
@@ -0,0 +1,52 @@
|
||||
#
|
||||
#
|
||||
#
|
||||
cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)

if(NOT CMAKE_HIP_COMPILER)
    find_program(
        amdclangpp_EXECUTABLE
        NAMES amdclang++
        HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
        PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
        PATH_SUFFIXES bin llvm/bin NO_CACHE)
    mark_as_advanced(amdclangpp_EXECUTABLE)

    if(amdclangpp_EXECUTABLE)
        set(CMAKE_HIP_COMPILER "${amdclangpp_EXECUTABLE}")
    endif()
endif()

project(rocprofiler-samples-callback-api-tracing LANGUAGES CXX HIP)

foreach(_TYPE DEBUG MINSIZEREL RELEASE RELWITHDEBINFO)
    if("${CMAKE_HIP_FLAGS_${_TYPE}}" STREQUAL "")
        set(CMAKE_HIP_FLAGS_${_TYPE} "${CMAKE_CXX_FLAGS_${_TYPE}}")
    endif()
endforeach()

add_library(callback-api-tracing-client SHARED)
target_sources(callback-api-tracing-client PRIVATE client.cpp client.hpp)
target_link_libraries(callback-api-tracing-client
                      PRIVATE rocprofiler::rocprofiler-library)

set_source_files_properties(main.cpp PROPERTIES LANGUAGE HIP)
find_package(Threads REQUIRED)

add_executable(callback-api-tracing)
target_sources(callback-api-tracing PRIVATE main.cpp)
target_link_libraries(callback-api-tracing PRIVATE callback-api-tracing-client
                                                   Threads::Threads)

add_test(NAME callback-api-tracing COMMAND $<TARGET_FILE:callback-api-tracing>)

set_tests_properties(
    callback-api-tracing
    PROPERTIES
        TIMEOUT
        45
        LABELS
        "samples"
        ENVIRONMENT
        "${ROCPROFILER_MEMCHECK_PRELOAD_ENV};HSA_TOOLS_LIB=$<TARGET_FILE:rocprofiler::rocprofiler-library>"
)
|
||||
@@ -0,0 +1,317 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
// undefine NDEBUG so that assert() is not compiled out
|
||||
#ifdef NDEBUG
|
||||
# undef NDEBUG
|
||||
#endif
|
||||
|
||||
/**
|
||||
* @file samples/api_callback_tracing/client.cpp
|
||||
*
|
||||
* @brief Example rocprofiler client (tool)
|
||||
*/
|
||||
|
||||
#include "client.hpp"
|
||||
|
||||
#include <rocprofiler/registration.h>
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <filesystem>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <sstream>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>
|
||||
|
||||
#define ROCPROFILER_CALL(result, msg) \
|
||||
{ \
|
||||
rocprofiler_status_t CHECKSTATUS = result; \
|
||||
if(CHECKSTATUS != ROCPROFILER_STATUS_SUCCESS) \
|
||||
{ \
|
||||
std::cerr << #result << " failed with error code " << CHECKSTATUS << std::endl; \
|
||||
throw std::runtime_error(#result " failure"); \
|
||||
} \
|
||||
}
|
||||
|
||||
namespace client
|
||||
{
|
||||
namespace
|
||||
{
|
||||
struct source_location
|
||||
{
|
||||
std::string function = {};
|
||||
std::string file = {};
|
||||
uint32_t line = 0;
|
||||
std::string context = {};
|
||||
};
|
||||
|
||||
using call_stack_t = std::vector<source_location>;
|
||||
|
||||
rocprofiler_client_id_t* client_id = nullptr;
|
||||
rocprofiler_client_finalize_t client_fini_func = nullptr;
|
||||
rocprofiler_context_id_t client_ctx = {};
|
||||
|
||||
void
|
||||
print_call_stack(const call_stack_t& _call_stack)
|
||||
{
|
||||
namespace fs = ::std::filesystem;
|
||||
|
||||
size_t n = 0;
|
||||
for(const auto& itr : _call_stack)
|
||||
{
|
||||
std::clog << std::setw(2) << ++n << "/" << std::setw(2) << _call_stack.size() << " ";
|
||||
std::clog << "[" << fs::path{itr.file}.filename() << ":" << itr.line << "] "
|
||||
<< std::setw(20) << std::left << itr.function;
|
||||
if(!itr.context.empty()) std::clog << " :: " << itr.context;
|
||||
std::clog << "\n";
|
||||
}
|
||||
|
||||
std::clog << std::flush;
|
||||
}
|
||||
|
||||
void
|
||||
store_callback_id_names(call_stack_t* tool_data)
|
||||
{
|
||||
//
|
||||
// callback invoked for each operation of a given tracing kind
|
||||
//
|
||||
static auto tracing_operation_names_cb =
|
||||
[](rocprofiler_service_callback_tracing_kind_t /*kindv*/,
|
||||
uint32_t /*operation*/,
|
||||
const char* operation_name,
|
||||
void* data_v) {
|
||||
static_cast<call_stack_t*>(data_v)->emplace_back(
|
||||
source_location{"rocprofiler_iterate_callback_tracing_kind_operation_names",
|
||||
__FILE__,
|
||||
__LINE__,
|
||||
std::string{" "} + std::string{operation_name}});
|
||||
return 0;
|
||||
};
|
||||
|
||||
//
|
||||
// callback for each callback kind (i.e. domain)
|
||||
//
|
||||
static auto tracing_kind_names_cb = [](rocprofiler_service_callback_tracing_kind_t kind,
|
||||
const char* kind_name,
|
||||
void* data) {
|
||||
// store the callback kind name
|
||||
static_cast<call_stack_t*>(data)->emplace_back(source_location{
|
||||
"rocprofiler_iterate_callback_tracing_kind_names ", __FILE__, __LINE__, kind_name});
|
||||
|
||||
// store the operation names for the HSA API
|
||||
if(kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API)
|
||||
{
|
||||
rocprofiler_iterate_callback_tracing_kind_operation_names(
|
||||
kind, tracing_operation_names_cb, data);
|
||||
}
|
||||
|
||||
return 0;
|
||||
};
|
||||
|
||||
rocprofiler_iterate_callback_tracing_kind_names(tracing_kind_names_cb,
|
||||
static_cast<void*>(tool_data));
|
||||
}
|
||||
|
||||
void
|
||||
tool_tracing_callback(rocprofiler_callback_tracing_record_t record, void* user_data)
|
||||
{
|
||||
assert(user_data != nullptr);
|
||||
|
||||
auto info = std::stringstream{};
|
||||
info << "tid=" << record.thread_id << ", cid=" << record.correlation_id.id
|
||||
<< ", kind=" << record.kind << ", operation=" << record.operation
|
||||
<< ", phase=" << record.phase;
|
||||
|
||||
auto info_data_cb = [](rocprofiler_service_callback_tracing_kind_t,
|
||||
uint32_t,
|
||||
uint32_t arg_num,
|
||||
const char* arg_name,
|
||||
const char* arg_value_str,
|
||||
const void* const arg_value_addr,
|
||||
void* cb_data) -> int {
|
||||
auto& dss = *static_cast<std::stringstream*>(cb_data);
|
||||
dss << ((arg_num == 0) ? "(" : ", ");
|
||||
dss << arg_num << ": " << arg_name << "=" << arg_value_str;
|
||||
(void) arg_value_addr;
|
||||
return 0;
|
||||
};
|
||||
|
||||
auto info_data = std::stringstream{};
|
||||
ROCPROFILER_CALL(rocprofiler_iterate_callback_tracing_operation_args(
|
||||
record, info_data_cb, static_cast<void*>(&info_data)),
|
||||
"Failure iterating trace operation args");
|
||||
|
||||
auto info_data_str = info_data.str();
|
||||
if(!info_data_str.empty()) info << " " << info_data_str << ")";
|
||||
|
||||
static auto _mutex = std::mutex{};
|
||||
// scoped lock: the mutex is released even if emplace_back throws
std::lock_guard<std::mutex> _lk{_mutex};
static_cast<call_stack_t*>(user_data)->emplace_back(
source_location{__FUNCTION__, __FILE__, __LINE__, info.str()});
|
||||
}
|
||||
|
||||
int
|
||||
tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
|
||||
{
|
||||
assert(tool_data != nullptr);
|
||||
|
||||
static_cast<call_stack_t*>(tool_data)->emplace_back(
|
||||
source_location{__FUNCTION__, __FILE__, __LINE__, ""});
|
||||
|
||||
store_callback_id_names(static_cast<call_stack_t*>(tool_data));
|
||||
|
||||
client_fini_func = fini_func;
|
||||
|
||||
ROCPROFILER_CALL(rocprofiler_create_context(&client_ctx), "context creation failed");
|
||||
|
||||
ROCPROFILER_CALL(
|
||||
rocprofiler_configure_callback_tracing_service(client_ctx,
|
||||
ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API,
|
||||
nullptr,
|
||||
0,
|
||||
tool_tracing_callback,
|
||||
tool_data),
|
||||
"callback tracing service failed to configure");
|
||||
|
||||
int valid_ctx = 0;
|
||||
ROCPROFILER_CALL(rocprofiler_context_is_valid(client_ctx, &valid_ctx),
|
||||
"failure checking context validity");
|
||||
if(valid_ctx == 0)
|
||||
{
|
||||
// notify rocprofiler that initialization failed
|
||||
// and all the contexts, buffers, etc. created
|
||||
// should be ignored
|
||||
return -1;
|
||||
}
|
||||
|
||||
ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed");
|
||||
|
||||
// no errors
|
||||
return 0;
|
||||
}
|
||||
|
||||
void
|
||||
tool_fini(void* tool_data)
|
||||
{
|
||||
assert(tool_data != nullptr);
|
||||
|
||||
auto* _call_stack = static_cast<call_stack_t*>(tool_data);
|
||||
_call_stack->emplace_back(source_location{__FUNCTION__, __FILE__, __LINE__, ""});
|
||||
|
||||
print_call_stack(*_call_stack);
|
||||
|
||||
delete _call_stack;
|
||||
}
|
||||
} // namespace
|
||||
|
||||
void
|
||||
setup()
|
||||
{}
|
||||
|
||||
void
|
||||
shutdown()
|
||||
{
|
||||
if(client_id) client_fini_func(*client_id);
|
||||
}
|
||||
|
||||
void
|
||||
start()
|
||||
{
|
||||
ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed");
|
||||
}
|
||||
|
||||
void
|
||||
stop()
|
||||
{
|
||||
int status = 0;
|
||||
ROCPROFILER_CALL(rocprofiler_is_initialized(&status), "failed to retrieve init status");
|
||||
if(status != 0)
|
||||
{
|
||||
ROCPROFILER_CALL(rocprofiler_stop_context(client_ctx), "rocprofiler context stop failed");
|
||||
}
|
||||
}
|
||||
} // namespace client
|
||||
|
||||
extern "C" rocprofiler_tool_configure_result_t*
|
||||
rocprofiler_configure(uint32_t version,
|
||||
const char* runtime_version,
|
||||
uint32_t priority,
|
||||
rocprofiler_client_id_t* id)
|
||||
{
|
||||
// only activate if main tool
|
||||
if(priority > 0) return nullptr;
|
||||
|
||||
// set the client name
|
||||
id->name = "ExampleTool";
|
||||
|
||||
// store client info
|
||||
client::client_id = id;
|
||||
|
||||
// compute major/minor/patch version info
|
||||
uint32_t major = version / 10000;
|
||||
uint32_t minor = (version % 10000) / 100;
|
||||
uint32_t patch = version % 100;
|
||||
|
||||
// generate info string
|
||||
auto info = std::stringstream{};
|
||||
info << id->name << " is using rocprofiler v" << major << "." << minor << "." << patch << " ("
|
||||
<< runtime_version << ")";
|
||||
|
||||
std::clog << info.str() << std::endl;
|
||||
|
||||
// demonstration of alternative way to get the version info
|
||||
{
|
||||
auto version_info = std::array<uint32_t, 3>{};
|
||||
ROCPROFILER_CALL(
|
||||
rocprofiler_get_version(&version_info.at(0), &version_info.at(1), &version_info.at(2)),
|
||||
"failed to get version info");
|
||||
|
||||
if(std::array<uint32_t, 3>{major, minor, patch} != version_info)
|
||||
{
|
||||
throw std::runtime_error{"version info mismatch"};
|
||||
}
|
||||
}
|
||||
|
||||
// data passed around all the callbacks
|
||||
auto* client_tool_data = new std::vector<client::source_location>{};
|
||||
|
||||
// add first entry
|
||||
client_tool_data->emplace_back(
|
||||
client::source_location{__FUNCTION__, __FILE__, __LINE__, info.str()});
|
||||
|
||||
// create configure data
|
||||
static auto cfg =
|
||||
rocprofiler_tool_configure_result_t{sizeof(rocprofiler_tool_configure_result_t),
|
||||
&client::tool_init,
|
||||
&client::tool_fini,
|
||||
static_cast<void*>(client_tool_data)};
|
||||
|
||||
// return pointer to configure data
|
||||
return &cfg;
|
||||
}
|
||||
@@ -0,0 +1,44 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once

#ifdef callback_api_tracing_client_EXPORTS
# define CLIENT_API __attribute__((visibility("default")))
#else
# define CLIENT_API
#endif

namespace client
{
void
setup() CLIENT_API;

void
shutdown() CLIENT_API;

void
start() CLIENT_API;

void
stop() CLIENT_API;
} // namespace client
|
||||
@@ -0,0 +1,244 @@
|
||||
/*
|
||||
Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved.
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in
|
||||
all copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
||||
THE SOFTWARE.
|
||||
*/
|
||||
|
||||
#include "client.hpp"
|
||||
|
||||
#include "hip/hip_runtime.h"
|
||||
|
||||
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <random>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>
|
||||
|
||||
#define HIP_API_CALL(CALL) \
|
||||
{ \
|
||||
hipError_t error_ = (CALL); \
|
||||
if(error_ != hipSuccess) \
|
||||
{ \
|
||||
auto _hip_api_print_lk = auto_lock_t{print_lock}; \
|
||||
fprintf(stderr, \
|
||||
"%s:%d :: HIP error : %s\n", \
|
||||
__FILE__, \
|
||||
__LINE__, \
|
||||
hipGetErrorString(error_)); \
|
||||
throw std::runtime_error("hip_api_call"); \
|
||||
} \
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
using auto_lock_t = std::unique_lock<std::mutex>;
|
||||
auto print_lock = std::mutex{};
|
||||
size_t nthreads = 2;
|
||||
size_t nitr = 500;
|
||||
size_t nsync = 10;
|
||||
constexpr unsigned shared_mem_tile_dim = 32;
|
||||
|
||||
void
|
||||
check_hip_error(void);
|
||||
|
||||
void
|
||||
verify(int* in, int* out, int M, int N);
|
||||
} // namespace
|
||||
|
||||
__global__ void
|
||||
transpose_a(int* in, int* out, int M, int N);
|
||||
|
||||
void
|
||||
run(int rank, int tid, hipStream_t stream, int argc, char** argv);
|
||||
|
||||
int
|
||||
main(int argc, char** argv)
|
||||
{
|
||||
client::setup(); // currently does nothing
|
||||
// client::start(); // currently will fail
|
||||
|
||||
int rank = 0;
|
||||
int size = 1;
|
||||
for(int i = 1; i < argc; ++i)
|
||||
{
|
||||
auto _arg = std::string{argv[i]};
|
||||
if(_arg == "?" || _arg == "-h" || _arg == "--help")
|
||||
{
|
||||
fprintf(stderr,
|
||||
"usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATIONS (%zu)] "
|
||||
"[SYNC_EVERY_N_ITERATIONS (%zu)]\n",
|
||||
nthreads,
|
||||
nitr,
|
||||
nsync);
|
||||
exit(EXIT_SUCCESS);
|
||||
}
|
||||
}
|
||||
if(argc > 1) nthreads = atoll(argv[1]);
|
||||
if(argc > 2) nitr = atoll(argv[2]);
|
||||
if(argc > 3) nsync = atoll(argv[3]);
|
||||
|
||||
printf("[transpose] Number of threads: %zu\n", nthreads);
|
||||
printf("[transpose] Number of iterations: %zu\n", nitr);
|
||||
printf("[transpose] Syncing every %zu iterations\n", nsync);
|
||||
|
||||
// this is a temporary workaround in omnitrace when HIP + MPI is enabled
|
||||
int ndevice = 0;
|
||||
int devid = rank;
|
||||
HIP_API_CALL(hipGetDeviceCount(&ndevice));
|
||||
printf("[transpose] Number of devices found: %i\n", ndevice);
|
||||
if(ndevice > 0)
|
||||
{
|
||||
devid = rank % ndevice;
|
||||
HIP_API_CALL(hipSetDevice(devid));
|
||||
printf("[transpose] Rank %i assigned to device %i\n", rank, devid);
|
||||
}
|
||||
if(rank == devid && rank < ndevice)
|
||||
{
|
||||
std::vector<std::thread> _threads{};
|
||||
std::vector<hipStream_t> _streams(nthreads);
|
||||
for(size_t i = 0; i < nthreads; ++i)
|
||||
HIP_API_CALL(hipStreamCreate(&_streams.at(i)));
|
||||
for(size_t i = 1; i < nthreads; ++i)
|
||||
_threads.emplace_back(run, rank, i, _streams.at(i), argc, argv);
|
||||
run(rank, 0, _streams.at(0), argc, argv);
|
||||
for(auto& itr : _threads)
|
||||
itr.join();
|
||||
for(size_t i = 0; i < nthreads; ++i)
|
||||
HIP_API_CALL(hipStreamDestroy(_streams.at(i)));
|
||||
}
|
||||
HIP_API_CALL(hipDeviceSynchronize());
|
||||
HIP_API_CALL(hipDeviceReset());
|
||||
|
||||
client::stop();
|
||||
client::shutdown();
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
__global__ void
|
||||
transpose_a(int* in, int* out, int M, int N)
|
||||
{
|
||||
__shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim];
|
||||
|
||||
int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x;
|
||||
tile[threadIdx.y][threadIdx.x] = in[idx];
|
||||
__syncthreads();
|
||||
idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x;
|
||||
out[idx] = tile[threadIdx.x][threadIdx.y];
|
||||
}
|
||||
|
||||
void
|
||||
run(int rank, int tid, hipStream_t stream, int argc, char** argv)
|
||||
{
|
||||
unsigned int M = 4960 * 2;
|
||||
unsigned int N = 4960 * 2;
|
||||
if(argc > 2) nitr = atoll(argv[2]);
|
||||
if(argc > 3) nsync = atoll(argv[3]);
|
||||
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl;
|
||||
_lk.unlock();
|
||||
|
||||
std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)};
|
||||
std::uniform_int_distribution<int> _dist{0, 1000};
|
||||
|
||||
size_t size = sizeof(int) * M * N;
|
||||
// `size` is a byte count; allocate M * N elements rather than `size` elements
int* inp_matrix = new int[M * N];
int* out_matrix = new int[M * N];
|
||||
for(size_t i = 0; i < M * N; i++)
|
||||
{
|
||||
inp_matrix[i] = _dist(_engine);
|
||||
out_matrix[i] = 0;
|
||||
}
|
||||
int* in = nullptr;
|
||||
int* out = nullptr;
|
||||
|
||||
HIP_API_CALL(hipMalloc(&in, size));
|
||||
HIP_API_CALL(hipMalloc(&out, size));
|
||||
HIP_API_CALL(hipMemsetAsync(in, 0, size, stream));
|
||||
HIP_API_CALL(hipMemsetAsync(out, 0, size, stream));
|
||||
HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream));
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
|
||||
dim3 grid(M / 32, N / 32, 1);
|
||||
dim3 block(32, 32, 1); // transpose_a
|
||||
|
||||
auto t1 = std::chrono::high_resolution_clock::now();
|
||||
for(size_t i = 0; i < nitr; ++i)
|
||||
{
|
||||
transpose_a<<<grid, block, 0, stream>>>(in, out, M, N);
|
||||
check_hip_error();
|
||||
if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
}
|
||||
auto t2 = std::chrono::high_resolution_clock::now();
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream));
|
||||
double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
|
||||
float GB = (float) size * nitr * 2 / (1 << 30);
|
||||
|
||||
print_lock.lock();
|
||||
std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n"
|
||||
<< "The average performance of transpose is " << GB / time << " GBytes/sec"
|
||||
<< std::endl;
|
||||
print_lock.unlock();
|
||||
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
|
||||
// cpu_transpose(matrix, out_matrix, M, N);
|
||||
verify(inp_matrix, out_matrix, M, N);
|
||||
|
||||
HIP_API_CALL(hipFree(in));
|
||||
HIP_API_CALL(hipFree(out));
|
||||
|
||||
delete[] inp_matrix;
|
||||
delete[] out_matrix;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
void
|
||||
check_hip_error(void)
|
||||
{
|
||||
hipError_t err = hipGetLastError();
|
||||
if(err != hipSuccess)
|
||||
{
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cerr << "Error: " << hipGetErrorString(err) << std::endl;
|
||||
throw std::runtime_error("hip_api_call");
|
||||
}
|
||||
}
|
||||
|
||||
void
|
||||
verify(int* in, int* out, int M, int N)
|
||||
{
|
||||
for(int i = 0; i < 10; i++)
|
||||
{
|
||||
int row = rand() % M;
|
||||
int col = rand() % N;
|
||||
if(in[row * N + col] != out[col * M + row])
|
||||
{
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | "
|
||||
<< out[col * M + row] << "\n";
|
||||
}
|
||||
}
|
||||
}
|
||||
} // namespace
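
The commented-out `cpu_transpose(...)` call in `run` hints at a host-side reference that the sample itself does not define. A minimal sketch of what such a reference could look like, assuming row-major storage as in `verify` (where `out[col * M + row]` is compared against `in[row * N + col]`). The names `cpu_transpose` and `count_mismatches` and their signatures are illustrative assumptions, not part of the sample:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not part of the sample): transposes an
// M x N row-major matrix into an N x M row-major matrix.
std::vector<int>
cpu_transpose(const std::vector<int>& in, std::size_t M, std::size_t N)
{
    assert(in.size() == M * N);
    std::vector<int> out(M * N, 0);
    for(std::size_t row = 0; row < M; ++row)
        for(std::size_t col = 0; col < N; ++col)
            out[col * M + row] = in[row * N + col];
    return out;
}

// Exhaustive variant of the sample's random spot check: counts every element
// of `out` that differs from the corresponding transposed element of `in`.
std::size_t
count_mismatches(const std::vector<int>& in,
                 const std::vector<int>& out,
                 std::size_t             M,
                 std::size_t             N)
{
    std::size_t mismatches = 0;
    for(std::size_t row = 0; row < M; ++row)
        for(std::size_t col = 0; col < N; ++col)
            if(in[row * N + col] != out[col * M + row]) ++mismatches;
    return mismatches;
}
```

Checking every element is far slower than the ten random probes in `verify`, so a check like this is probably only worthwhile for small matrices or debugging runs.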
|
||||
@@ -0,0 +1,38 @@
|
||||
cmake_minimum_required(VERSION 3.21 FATAL_ERROR)

find_program(
    HIPCC_EXECUTABLE
    NAMES hipcc
    HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
    PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm NO_CACHE)
mark_as_advanced(HIPCC_EXECUTABLE)

if(HIPCC_EXECUTABLE)
    set(CMAKE_CXX_COMPILER ${HIPCC_EXECUTABLE})
endif()

project(rocprofiler-transpose-sample LANGUAGES CXX)

option(TRANSPOSE_USE_MPI "Enable MPI support in transpose exe" OFF)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(transpose)
target_sources(transpose PRIVATE transpose.cpp)
target_compile_options(transpose PRIVATE -W -Wall -Wextra -Wpedantic -Wshadow -Werror)

find_package(Threads REQUIRED)
target_link_libraries(transpose PRIVATE Threads::Threads)
|
||||
|
||||
if(TRANSPOSE_USE_MPI)
    # only CXX is enabled in project(), so FindMPI provides MPI::MPI_CXX (not MPI::MPI_C)
    find_package(MPI REQUIRED)
    target_compile_definitions(transpose PRIVATE USE_MPI)
    target_link_libraries(transpose PRIVATE MPI::MPI_CXX)
endif()
|
||||
|
||||
install(
|
||||
TARGETS transpose
|
||||
DESTINATION bin
|
||||
COMPONENT rocprofiler-samples)
|
||||
@@ -0,0 +1,278 @@
|
||||
/*
|
||||
Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved.
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in
|
||||
all copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
||||
THE SOFTWARE.
|
||||
*/
|
||||
|
||||
#include "hip/hip_runtime.h"
|
||||
|
||||
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <random>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>
|
||||
|
||||
#if defined(USE_MPI)
|
||||
# include <mpi.h>
|
||||
#endif
|
||||
|
||||
#define HIP_API_CALL(CALL) \
|
||||
{ \
|
||||
hipError_t error_ = (CALL); \
|
||||
if(error_ != hipSuccess) \
|
||||
{ \
|
||||
auto _hip_api_print_lk = auto_lock_t{print_lock}; \
|
||||
fprintf(stderr, \
|
||||
"%s:%d :: HIP error : %s\n", \
|
||||
__FILE__, \
|
||||
__LINE__, \
|
||||
hipGetErrorString(error_)); \
|
||||
throw std::runtime_error("hip_api_call"); \
|
||||
} \
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
using auto_lock_t = std::unique_lock<std::mutex>;
|
||||
auto print_lock = std::mutex{};
|
||||
size_t nthreads = 2;
|
||||
size_t nitr = 500;
|
||||
size_t nsync = 10;
|
||||
constexpr unsigned shared_mem_tile_dim = 32;
|
||||
|
||||
void
|
||||
check_hip_error(void);
|
||||
|
||||
void
|
||||
verify(int* in, int* out, int M, int N);
|
||||
} // namespace
|
||||
|
||||
__global__ void
|
||||
transpose_a(int* in, int* out, int M, int N);
|
||||
|
||||
void
|
||||
run(int rank, int tid, hipStream_t stream, int argc, char** argv);
|
||||
|
||||
#if defined(USE_MPI)
|
||||
void
|
||||
do_a2a(int rank);
|
||||
#endif
|
||||
|
||||
int
|
||||
main(int argc, char** argv)
|
||||
{
|
||||
int rank = 0;
|
||||
int size = 1;
|
||||
for(int i = 1; i < argc; ++i)
|
||||
{
|
||||
auto _arg = std::string{argv[i]};
|
||||
if(_arg == "?" || _arg == "-h" || _arg == "--help")
|
||||
{
|
||||
fprintf(stderr,
|
||||
"usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATION (%zu)] "
|
||||
"[SYNC_EVERY_N_ITERATIONS (%zu)]\n",
|
||||
nthreads,
|
||||
nitr,
|
||||
nsync);
|
||||
exit(EXIT_SUCCESS);
|
||||
}
|
||||
}
|
||||
if(argc > 1) nthreads = atoll(argv[1]);
|
||||
if(argc > 2) nitr = atoll(argv[2]);
|
||||
if(argc > 3) nsync = atoll(argv[3]);
|
||||
|
||||
printf("[transpose] Number of threads: %zu\n", nthreads);
|
||||
printf("[transpose] Number of iterations: %zu\n", nitr);
|
||||
printf("[transpose] Syncing every %zu iterations\n", nsync);
|
||||
|
||||
#if defined(USE_MPI)
|
||||
MPI_Init(&argc, &argv);
|
||||
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
|
||||
MPI_Comm_size(MPI_COMM_WORLD, &size);
|
||||
#else
|
||||
(void) size;
|
||||
#endif
|
||||
// this is a temporary workaround in omnitrace when HIP + MPI is enabled
|
||||
int ndevice = 0;
|
||||
int devid = rank;
|
||||
HIP_API_CALL(hipGetDeviceCount(&ndevice));
|
||||
printf("[transpose] Number of devices found: %i\n", ndevice);
|
||||
if(ndevice > 0)
|
||||
{
|
||||
devid = rank % ndevice;
|
||||
HIP_API_CALL(hipSetDevice(devid));
|
||||
printf("[transpose] Rank %i assigned to device %i\n", rank, devid);
|
||||
}
|
||||
if(rank == devid && rank < ndevice)
|
||||
{
|
||||
std::vector<std::thread> _threads{};
|
||||
std::vector<hipStream_t> _streams(nthreads);
|
||||
for(size_t i = 0; i < nthreads; ++i)
|
||||
HIP_API_CALL(hipStreamCreate(&_streams.at(i)));
|
||||
for(size_t i = 1; i < nthreads; ++i)
|
||||
_threads.emplace_back(run, rank, i, _streams.at(i), argc, argv);
|
||||
run(rank, 0, _streams.at(0), argc, argv);
|
||||
for(auto& itr : _threads)
|
||||
itr.join();
|
||||
for(size_t i = 0; i < nthreads; ++i)
|
||||
HIP_API_CALL(hipStreamDestroy(_streams.at(i)));
|
||||
}
|
||||
HIP_API_CALL(hipDeviceSynchronize());
|
||||
HIP_API_CALL(hipDeviceReset());
|
||||
|
||||
#if defined(USE_MPI)
|
||||
MPI_Barrier(MPI_COMM_WORLD);
|
||||
do_a2a(rank);
|
||||
MPI_Finalize();
|
||||
#endif
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
__global__ void
|
||||
transpose_a(int* in, int* out, int M, int N)
|
||||
{
|
||||
__shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim];
|
||||
|
||||
int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x;
|
||||
tile[threadIdx.y][threadIdx.x] = in[idx];
|
||||
__syncthreads();
|
||||
idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x;
|
||||
out[idx] = tile[threadIdx.x][threadIdx.y];
|
||||
}
|
||||
|
||||
void
|
||||
run(int rank, int tid, hipStream_t stream, int argc, char** argv)
|
||||
{
|
||||
unsigned int M = 4960 * 2;
|
||||
unsigned int N = 4960 * 2;
|
||||
if(argc > 2) nitr = atoll(argv[2]);
|
||||
if(argc > 3) nsync = atoll(argv[3]);
|
||||
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl;
|
||||
_lk.unlock();
|
||||
|
||||
std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)};
|
||||
std::uniform_int_distribution<int> _dist{0, 1000};
|
||||
|
||||
size_t size = sizeof(int) * M * N;
|
||||
// `size` is a byte count; allocate M * N elements rather than `size` elements
int* inp_matrix = new int[M * N];
int* out_matrix = new int[M * N];
|
||||
for(size_t i = 0; i < M * N; i++)
|
||||
{
|
||||
inp_matrix[i] = _dist(_engine);
|
||||
out_matrix[i] = 0;
|
||||
}
|
||||
int* in = nullptr;
|
||||
int* out = nullptr;
|
||||
|
||||
HIP_API_CALL(hipMalloc(&in, size));
|
||||
HIP_API_CALL(hipMalloc(&out, size));
|
||||
HIP_API_CALL(hipMemsetAsync(in, 0, size, stream));
|
||||
HIP_API_CALL(hipMemsetAsync(out, 0, size, stream));
|
||||
HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream));
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
|
||||
dim3 grid(M / 32, N / 32, 1);
|
||||
dim3 block(32, 32, 1); // transpose_a
|
||||
|
||||
auto t1 = std::chrono::high_resolution_clock::now();
|
||||
for(size_t i = 0; i < nitr; ++i)
|
||||
{
|
||||
transpose_a<<<grid, block, 0, stream>>>(in, out, M, N);
|
||||
check_hip_error();
|
||||
if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
}
|
||||
auto t2 = std::chrono::high_resolution_clock::now();
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream));
|
||||
double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
|
||||
float GB = (float) size * nitr * 2 / (1 << 30);
|
||||
|
||||
print_lock.lock();
|
||||
std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n"
|
||||
<< "The average performance of transpose is " << GB / time << " GBytes/sec"
|
||||
<< std::endl;
|
||||
print_lock.unlock();
|
||||
|
||||
HIP_API_CALL(hipStreamSynchronize(stream));
|
||||
|
||||
// cpu_transpose(matrix, out_matrix, M, N);
|
||||
verify(inp_matrix, out_matrix, M, N);
|
||||
|
||||
HIP_API_CALL(hipFree(in));
|
||||
HIP_API_CALL(hipFree(out));
|
||||
|
||||
delete[] inp_matrix;
|
||||
delete[] out_matrix;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
void
|
||||
check_hip_error(void)
|
||||
{
|
||||
hipError_t err = hipGetLastError();
|
||||
if(err != hipSuccess)
|
||||
{
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cerr << "Error: " << hipGetErrorString(err) << std::endl;
|
||||
throw std::runtime_error("hip_api_call");
|
||||
}
|
||||
}
|
||||
|
||||
void
|
||||
verify(int* in, int* out, int M, int N)
|
||||
{
|
||||
for(int i = 0; i < 10; i++)
|
||||
{
|
||||
int row = rand() % M;
|
||||
int col = rand() % N;
|
||||
if(in[row * N + col] != out[col * M + row])
|
||||
{
|
||||
auto_lock_t _lk{print_lock};
|
||||
std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | "
|
||||
<< out[col * M + row] << "\n";
|
||||
}
|
||||
}
|
||||
}
|
||||
} // namespace
|
||||
|
||||
#if defined(USE_MPI)
|
||||
void
|
||||
do_a2a(int rank)
|
||||
{
|
||||
// Define my value
|
||||
int values[3];
|
||||
for(int i = 0; i < 3; ++i)
|
||||
values[i] = rank * 300 + i * 100;
|
||||
printf("Process %d, values = %d, %d, %d.\n", rank, values[0], values[1], values[2]);
|
||||
|
||||
int buffer_recv[3];
|
||||
MPI_Alltoall(&values, 1, MPI_INT, buffer_recv, 1, MPI_INT, MPI_COMM_WORLD);
|
||||
printf("Values collected on process %d: %d, %d, %d.\n",
|
||||
rank,
|
||||
buffer_recv[0],
|
||||
buffer_recv[1],
|
||||
buffer_recv[2]);
|
||||
}
|
||||
#endif
|
||||
@@ -102,7 +102,7 @@ rocprofiler_pc_sampling_callback(rocprofiler_context_id_t /*context_id*/,
|
||||
for(size_t i = 0; i < num_headers; i++)
|
||||
{
|
||||
auto* cur_header = headers[i];
|
||||
- if(cur_header->kind == 0)
+ if(cur_header->category == ROCPROFILER_BUFFER_CATEGORY_PC_SAMPLING)
|
||||
{
|
||||
auto* pc_sample = static_cast<rocprofiler_pc_sampling_record_t*>(cur_header->payload);
|
||||
printf("--- pc: %lx, dispatch_id: %lx, timestamp: %lu, hardware_id: %lu\n",
|
||||
|
||||
@@ -142,14 +142,22 @@ RECURSIVE = YES
|
||||
EXCLUDE =
|
||||
EXCLUDE_SYMLINKS = YES
|
||||
EXCLUDE_PATTERNS = */.git/* \
|
||||
@SOURCE_DIR@/samples/* \
|
||||
@SOURCE_DIR@/**/tests/* \
|
||||
@SOURCE_DIR@/source/include/rocprofiler/defines.h \
|
||||
@SOURCE_DIR@/source/include/rocprofiler/config.h
|
||||
@SOURCE_DIR@/**/scripts/* \
|
||||
@SOURCE_DIR@/**/docs/*
|
||||
EXCLUDE_SYMBOLS = "std::*" \
|
||||
"ROCPROFILER_ATTRIBUTE" \
|
||||
"ROCPROFILER_API" \
|
||||
"ROCPROFILER_NONNULL"
|
||||
"ROCPROFILER_NONNULL" \
|
||||
"ROCPROFILER_PUBLIC_API" \
|
||||
"ROCPROFILER_HIDDEN_API" \
|
||||
"ROCPROFILER_EXPORT_DECORATOR" \
|
||||
"ROCPROFILER_IMPORT_DECORATOR" \
|
||||
"ROCPROFILER_EXPORT" \
|
||||
"ROCPROFILER_IMPORT" \
|
||||
"ROCPROFILER_HANDLE_LITERAL" \
|
||||
"ROCPROFILER_EXTERN_C_INIT" \
|
||||
"ROCPROFILER_EXTERN_C_FINI"
|
||||
EXAMPLE_PATH = @SOURCE_DIR@/samples
|
||||
EXAMPLE_PATTERNS = *.h \
|
||||
*.hh \
|
||||
@@ -157,7 +165,6 @@ EXAMPLE_PATTERNS = *.h \
|
||||
*.c \
|
||||
*.cc \
|
||||
*.cpp \
|
||||
conf.py \
|
||||
*.txt
|
||||
EXAMPLE_RECURSIVE = YES
|
||||
IMAGE_PATH =
|
||||
@@ -330,6 +337,13 @@ PREDEFINED = "ROCPROFILER_API=" \
|
||||
"ROCPROFILER_EXPORT=" \
|
||||
"ROCPROFILER_IMPORT=" \
|
||||
"ROCPROFILER_NONNULL(...)=" \
|
||||
"ROCPROFILER_PUBLIC_API=" \
|
||||
"ROCPROFILER_HIDDEN_API=" \
|
||||
"ROCPROFILER_EXPORT_DECORATOR=" \
|
||||
"ROCPROFILER_IMPORT_DECORATOR=" \
|
||||
"ROCPROFILER_HANDLE_LITERAL=" \
|
||||
"ROCPROFILER_EXTERN_C_INIT=" \
|
||||
"ROCPROFILER_EXTERN_C_FINI=" \
|
||||
"__attribute__(x)=" \
|
||||
"__declspec(x)=" \
|
||||
"size_t=unsigned long" \
|
||||
|
||||
@@ -6,8 +6,30 @@
|
||||
configure_file(${CMAKE_CURRENT_LIST_DIR}/version.h.in
|
||||
${CMAKE_CURRENT_BINARY_DIR}/version.h @ONLY)
|
||||
|
||||
set(ROCPROFILER_HEADER_FILES config.h defines.h hip.h hsa.h marker.h rocprofiler.h
|
||||
rocprofiler_plugin.h ${CMAKE_CURRENT_BINARY_DIR}/version.h)
|
||||
set(ROCPROFILER_HEADER_FILES
|
||||
# core headers
|
||||
rocprofiler.h
|
||||
rocprofiler_plugin.h
|
||||
# secondary headers
|
||||
agent.h
|
||||
agent_profile.h
|
||||
buffer.h
|
||||
buffer_tracing.h
|
||||
callback_tracing.h
|
||||
context.h
|
||||
counters.h
|
||||
defines.h
|
||||
dispatch_profile.h
|
||||
external_correlation.h
|
||||
fwd.h
|
||||
hip.h
|
||||
hsa.h
|
||||
internal_threading.h
|
||||
marker.h
|
||||
pc_sampling.h
|
||||
profile_config.h
|
||||
spm.h
|
||||
${CMAKE_CURRENT_BINARY_DIR}/version.h)
|
||||
|
||||
install(FILES ${ROCPROFILER_HEADER_FILES}
|
||||
DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/rocprofiler)
|
||||
|
||||
@@ -0,0 +1,72 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup AGENTS Agent Information
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief Agent.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_agent_id_t id;
|
||||
rocprofiler_agent_type_t type;
|
||||
const char* name;
|
||||
rocprofiler_pc_sampling_config_array_t pc_sampling_configs;
|
||||
} rocprofiler_agent_t;
|
||||
|
||||
/**
|
||||
* @brief Callback function type for querying the available agents
|
||||
*
|
||||
* @param [in] agents Array of pointers to agents
|
||||
* @param [in] num_agents Number of agents in array
|
||||
* @param [in] user_data Data pointer passback
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
typedef rocprofiler_status_t (*rocprofiler_available_agents_cb_t)(rocprofiler_agent_t** agents,
|
||||
size_t num_agents,
|
||||
void* user_data);
|
||||
|
||||
/**
|
||||
* @brief Receive a synchronous callback with an array of the agents available at the moment of invocation
|
||||
*
|
||||
* @param [in] callback Callback function accepting list of agents
|
||||
* @param [in] agent_size Should be set to sizeof(rocprofiler_agent_t)
|
||||
* @param [in] user_data Data pointer provided to callback
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback,
|
||||
size_t agent_size,
|
||||
void* user_data) ROCPROFILER_NONNULL(1);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
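
The callback-plus-`user_data` pattern declared in this header can be illustrated without the real library. The sketch below uses stand-in types and a stand-in query function (every `demo_` name, the two fake agents, and the `agent_size` check are assumptions made for illustration; they are not the rocprofiler implementation). It shows how `user_data` might carry a container through a C-style callback that is invoked synchronously:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for the real status and agent types (hypothetical).
enum demo_status_t
{
    DEMO_STATUS_SUCCESS = 0,
    DEMO_STATUS_ERROR
};

struct demo_agent_t
{
    int         id;
    const char* name;
};

using demo_agents_cb_t = demo_status_t (*)(demo_agent_t** agents,
                                           std::size_t    num_agents,
                                           void*          user_data);

// Stand-in for the query function: invokes the callback synchronously with an
// array of pointers to two fake agents. The size check is an assumed use of
// the agent_size argument (guarding against mismatched struct layouts).
inline demo_status_t
demo_query_available_agents(demo_agents_cb_t callback, std::size_t agent_size, void* user_data)
{
    assert(callback != nullptr);
    if(agent_size != sizeof(demo_agent_t)) return DEMO_STATUS_ERROR;

    static demo_agent_t cpu{0, "cpu"};
    static demo_agent_t gpu{1, "gpu"};
    demo_agent_t*       agents[] = {&cpu, &gpu};
    return callback(agents, 2, user_data);
}

// Tool-side callback: user_data is passed back unchanged, so it can point at
// any object owned by the caller, here a vector that collects agent names.
inline demo_status_t
collect_agent_names(demo_agent_t** agents, std::size_t num_agents, void* user_data)
{
    auto* names = static_cast<std::vector<std::string>*>(user_data);
    for(std::size_t i = 0; i < num_agents; ++i)
        names->emplace_back(agents[i]->name);
    return DEMO_STATUS_SUCCESS;
}
```

Because the callback runs before the query function returns, the vector passed via `user_data` can live on the caller's stack.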
|
||||
@@ -0,0 +1,70 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup AGENT_PROFILE_COUNTING_SERVICE Agent Profile Counting Service
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
 * @brief ROCProfiler Agent Profile Counting Data.
 *
 * Holds the sampled counters, including the identifiers used to look up counter
 * information and the counter values.
 */
typedef struct
{
    rocprofiler_record_counter_t* counters;        ///< Array of counter records
    uint64_t                      counters_count;  ///< Number of entries in `counters`
} rocprofiler_agent_profile_counting_data_t;
|
||||
|
||||
/**
|
||||
* @brief Configure Profile Counting Service for agent.
|
||||
*
|
||||
* @param [in] buffer_id
|
||||
* @param [in] profile_config_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_configure_agent_profile_counting_service(
|
||||
rocprofiler_buffer_id_t buffer_id,
|
||||
rocprofiler_profile_config_id_t profile_config_id);
|
||||
|
||||
/**
|
||||
* @brief Sample Profile Counting Service for agent.
|
||||
*
|
||||
* @param [out] data Sampled counter data; always a single element
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_sample_agent_profile_counting_service(rocprofiler_agent_profile_counting_data_t* data);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
@@ -0,0 +1,106 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup BUFFER_HANDLING Buffer
|
||||
* @{
|
||||
*
|
||||
* Every Buffer is associated with a specific service kind.
|
||||
* OR
|
||||
* Every Buffer is associated with a specific service ID.
|
||||
*
|
||||
*/
|
||||
|
||||
/**
 * @brief Callback invoked asynchronously with a batch of buffered records.
 *
 * @code{.cpp}
 * for(size_t i = 0; i < num_headers; ++i)
 * {
 *     rocprofiler_record_header_t* hdr = headers[i];
 *     if(hdr->category == ROCPROFILER_BUFFER_CATEGORY_PC_SAMPLING)
 *     {
 *         auto* data = static_cast<rocprofiler_pc_sampling_record_t*>(hdr->payload);
 *         ...
 *     }
 * }
 * @endcode
 */
|
||||
typedef void (*rocprofiler_buffer_tracing_cb_t)(rocprofiler_context_id_t context,
|
||||
rocprofiler_buffer_id_t buffer_id,
|
||||
rocprofiler_record_header_t** headers,
|
||||
size_t num_headers,
|
||||
void* data,
|
||||
uint64_t drop_count);
|
||||
|
||||
/**
|
||||
* @brief Create buffer.
|
||||
*
|
||||
* @param [in] context Context identifier associated with buffer
|
||||
* @param [in] size Size of the buffer in bytes
|
||||
* @param [in] watermark Watermark size at which the callback is invoked; if set
* to 0, the callback is invoked on every record
|
||||
* @param [in] policy Behavior policy when buffer is full
|
||||
* @param [in] callback Callback to invoke when buffer is flushed/full
|
||||
* @param [in] callback_data Data to provide in callback function
|
||||
* @param [out] buffer_id Identification handle for buffer
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_create_buffer(rocprofiler_context_id_t context,
|
||||
size_t size,
|
||||
size_t watermark,
|
||||
rocprofiler_buffer_policy_t policy,
|
||||
rocprofiler_buffer_tracing_cb_t callback,
|
||||
void* callback_data,
|
||||
rocprofiler_buffer_id_t* buffer_id) ROCPROFILER_NONNULL(5, 7);
|
||||
|
||||
/**
|
||||
* @brief Destroy buffer.
|
||||
*
|
||||
* @param [in] buffer_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*
|
||||
* @note The buffer is destroyed even if it is not empty. Call
* @ref ::rocprofiler_flush_buffer first to make sure the buffer is empty.
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id);
|
||||
|
||||
/**
|
||||
* @brief Flush buffer.
|
||||
*
|
||||
* @param [in] buffer_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
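
The `watermark` parameter documented above ("if set to 0 then the callback will be called on every record") can be made concrete with a toy model. The following sketch is not the rocprofiler implementation: `toy_record`, `toy_buffer`, and the choice to measure the watermark against accumulated record bytes are assumptions made purely to illustrate watermark-triggered batching and an explicit flush:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical record type: only the size matters for this illustration.
struct toy_record
{
    int         kind;
    std::size_t bytes;
};

// Toy buffer (assumed semantics): records accumulate until the number of
// buffered bytes reaches the watermark, then the callback receives the batch.
// A watermark of 0 delivers every record individually.
class toy_buffer
{
public:
    using callback_t = std::function<void(const std::vector<toy_record>&)>;

    toy_buffer(std::size_t watermark, callback_t callback)
    : m_watermark{watermark}
    , m_callback{std::move(callback)}
    {
        assert(m_callback != nullptr);
    }

    void emplace(toy_record rec)
    {
        m_used += rec.bytes;
        m_records.push_back(rec);
        if(m_watermark == 0 || m_used >= m_watermark) flush();
    }

    // Delivers any pending records, mirroring an explicit flush request.
    void flush()
    {
        if(m_records.empty()) return;
        m_callback(m_records);
        m_records.clear();
        m_used = 0;
    }

private:
    std::size_t             m_watermark = 0;
    callback_t              m_callback  = {};
    std::size_t             m_used      = 0;
    std::vector<toy_record> m_records   = {};
};
```

In this model, records still pending when the buffer goes away are simply lost, which is the situation the note on `rocprofiler_destroy_buffer` warns about; calling `flush()` first delivers them.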
|
||||
@@ -0,0 +1,278 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup BUFFER_TRACING_SERVICE Asynchronous Tracing Service
|
||||
*
|
||||
* Receive callbacks for batches of records from an internal (background) thread
|
||||
*
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer HSA API Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_tracing_operation_t operation; // rocprofiler/hsa.h
|
||||
rocprofiler_timestamp_t start_timestamp;
|
||||
rocprofiler_timestamp_t end_timestamp;
|
||||
rocprofiler_thread_id_t thread_id;
|
||||
} rocprofiler_buffer_tracing_hsa_api_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer HIP API Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_tracing_operation_t operation; // rocprofiler/hip.h
|
||||
rocprofiler_timestamp_t start_timestamp;
|
||||
rocprofiler_timestamp_t end_timestamp;
|
||||
rocprofiler_thread_id_t thread_id;
|
||||
} rocprofiler_buffer_tracing_hip_api_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer Marker Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_tracing_operation_t operation; // rocprofiler/marker.h
|
||||
rocprofiler_timestamp_t timestamp;
|
||||
rocprofiler_thread_id_t thread_id;
|
||||
uint64_t marker_id; // rocprofiler_marker_id_t
|
||||
// const char* message; // (Need Review?)
|
||||
} rocprofiler_buffer_tracing_marker_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer Memory Copy Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
/**
|
||||
* Memory copy operation that can be derived from
|
||||
* ::rocprofiler_tracing_operation_t
|
||||
*/
|
||||
uint32_t operation;
|
||||
rocprofiler_timestamp_t start_timestamp;
|
||||
rocprofiler_timestamp_t end_timestamp;
|
||||
rocprofiler_queue_id_t queue_id;
|
||||
} rocprofiler_buffer_tracing_memory_copy_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer Kernel Dispatch Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_timestamp_t start_timestamp;
|
||||
rocprofiler_timestamp_t end_timestamp;
|
||||
rocprofiler_queue_id_t queue_id;
|
||||
const char* kernel_name;
|
||||
} rocprofiler_buffer_tracing_kernel_dispatch_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer Page Migration Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_timestamp_t start_timestamp;
|
||||
rocprofiler_timestamp_t end_timestamp;
|
||||
rocprofiler_queue_id_t queue_id;
|
||||
// TODO: determine what additional info is needed here
|
||||
} rocprofiler_buffer_tracing_page_migration_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer Scratch Memory Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_timestamp_t start_timestamp;
|
||||
rocprofiler_timestamp_t end_timestamp;
|
||||
rocprofiler_queue_id_t queue_id;
|
||||
// TODO: determine what additional info is needed here
|
||||
} rocprofiler_buffer_tracing_scratch_memory_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer Queue Scheduling Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_timestamp_t start_timestamp;
|
||||
rocprofiler_timestamp_t end_timestamp;
|
||||
rocprofiler_queue_id_t queue_id;
|
||||
// TODO: determine what additional info is needed here
|
||||
} rocprofiler_buffer_tracing_queue_scheduling_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object Tracer Buffer Record.
|
||||
*
|
||||
* We need to guarantee that these records are in the buffer before the
|
||||
* corresponding Exit Phase API calls are called.
|
||||
*/
|
||||
// typedef struct {
|
||||
// rocprofiler_buffer_tracing_record_header_t header;
|
||||
// rocprofiler_tracing_code_object_kind_id_t kind;
|
||||
// } rocprofiler_buffer_tracing_code_object_header_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object Load Tracer Buffer Record.
|
||||
*
|
||||
*/
|
||||
// typedef struct {
|
||||
// rocprofiler_buffer_tracing_code_object_header_t header;
|
||||
// uint64_t load_base; // code object load base
|
||||
// uint64_t load_size; // code object load size
|
||||
// const char *uri; // URI string (NULL terminated)
|
||||
// rocprofiler_timestamp_t timestamp;
|
||||
// // uint32_t storage_type; // code object storage type (Need Review?)
|
||||
// // int storage_file; // origin file descriptor (Need Review?)
|
||||
// // uint64_t memory_base; // origin memory base (Need Review?)
|
||||
// // uint64_t memory_size; // origin memory size (Need Review?)
|
||||
// // uint64_t load_delta; // code object load delta (Need Review?)
|
||||
// } rocprofiler_buffer_tracing_code_object_load_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object UnLoad Tracer Buffer Record.
|
||||
*
|
||||
*/
|
||||
// typedef struct {
|
||||
// rocprofiler_buffer_tracing_code_object_header_t header;
|
||||
// uint64_t load_base; // code object load base
|
||||
// rocprofiler_timestamp_t timestamp;
|
||||
// } rocprofiler_buffer_tracing_code_object_unload_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object Kernel Symbol Tracer Buffer Record.
|
||||
*
|
||||
*/
|
||||
// typedef struct {
|
||||
// rocprofiler_buffer_tracing_code_object_header_t header;
|
||||
// const char *kernel_name; // kernel name string (NULL terminated)
|
||||
// uint64_t kernel_descriptor; // kernel descriptor (Need to be changed from
|
||||
// // uint64_t to ::rocprofiler_address_t)
|
||||
// // rocprofiler_timestamp_t timestamp; // (Need Review?)
|
||||
// } rocprofiler_buffer_tracing_code_object_kernel_symbol_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Buffer External Correlation Tracer Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_service_buffer_tracing_kind_t kind;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_external_correlation_id_t external_correlation_id;
|
||||
} rocprofiler_buffer_tracing_external_correlation_record_t;
|
||||
|
||||
/**
|
||||
* @brief Callback function for mapping @ref rocprofiler_service_buffer_tracing_kind_t ids to
|
||||
* string names. @see rocprofiler_iterate_buffer_tracing_kind_names.
|
||||
*/
|
||||
typedef int (*rocprofiler_buffer_tracing_kind_name_cb_t)(
|
||||
rocprofiler_service_buffer_tracing_kind_t kind,
|
||||
const char* kind_name,
|
||||
void* data);
|
||||
|
||||
/**
|
||||
* @brief Callback function for mapping the operations of a given @ref
|
||||
* rocprofiler_service_buffer_tracing_kind_t to string names. @see
|
||||
* rocprofiler_iterate_buffer_tracing_kind_operation_names.
|
||||
*/
|
||||
typedef int (*rocprofiler_buffer_tracing_operation_name_cb_t)(
|
||||
rocprofiler_service_buffer_tracing_kind_t kind,
|
||||
uint32_t operation,
|
||||
const char* operation_name,
|
||||
void* data);
|
||||
|
||||
/**
|
||||
* @brief Configure Buffer Tracing Service.
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @param [in] kind
|
||||
* @param [in] operations
|
||||
* @param [in] operations_count
|
||||
* @param [in] buffer_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_configure_buffer_tracing_service(rocprofiler_context_id_t context_id,
|
||||
rocprofiler_service_buffer_tracing_kind_t kind,
|
||||
rocprofiler_tracing_operation_t* operations,
|
||||
size_t operations_count,
|
||||
rocprofiler_buffer_id_t buffer_id);
|
||||
|
||||
/**
|
||||
* @brief Iterate over all the mappings of the buffer tracing kinds and get a callback with the id
|
||||
* mapped to a constant string. The strings provided in the arg will be valid pointers for the
|
||||
* entire duration of the program. It is recommended to call this function once and cache this data
|
||||
* in the client instead of making multiple on-demand calls.
|
||||
*
|
||||
* @param [in] callback Callback function invoked for each enumeration value in @ref
|
||||
* rocprofiler_service_buffer_tracing_kind_t with the exception of the `NONE` and `LAST` values.
|
||||
* @param [in] data User data passed back into the callback
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_iterate_buffer_tracing_kind_names(rocprofiler_buffer_tracing_kind_name_cb_t callback,
|
||||
void* data) ROCPROFILER_NONNULL(1);
|
||||
|
||||
/**
|
||||
* @brief Iterates over all the mappings of the operations for a given @ref
|
||||
* rocprofiler_service_buffer_tracing_kind_t and invokes the callback with the kind, operation id,
|
||||
* and the string mapping to the operation id. The strings provided in the callback arg will be
|
||||
* valid pointers for the entire duration of the program. It is recommended to call this function
|
||||
* once per kind, and cache this data in the client instead of making multiple on-demand calls.
|
||||
*
|
||||
* @param [in] kind which buffer tracing kind operations to iterate over
|
||||
* @param [in] callback Callback function invoked for each operation associated with @ref
|
||||
* rocprofiler_service_buffer_tracing_kind_t with the exception of the `NONE` and `LAST` values.
|
||||
* @param [in] data User data passed back into the callback
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_iterate_buffer_tracing_kind_operation_names(
|
||||
rocprofiler_service_buffer_tracing_kind_t kind,
|
||||
rocprofiler_buffer_tracing_operation_name_cb_t callback,
|
||||
void* data) ROCPROFILER_NONNULL(2);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
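The recommendation above to iterate the kind names once and cache them can be sketched as follows. This is a minimal, self-contained illustration, not the library's implementation: every `demo_` type, enumerator, and function is a hypothetical stand-in for the corresponding rocprofiler declaration, and stopping the iteration on a nonzero callback return is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rocprofiler_service_buffer_tracing_kind_t. */
typedef enum
{
    DEMO_BUFFER_TRACING_NONE = 0,
    DEMO_BUFFER_TRACING_HSA_API,
    DEMO_BUFFER_TRACING_HIP_API,
    DEMO_BUFFER_TRACING_LAST
} demo_buffer_tracing_kind_t;

/* Same shape as rocprofiler_buffer_tracing_kind_name_cb_t. */
typedef int (*demo_kind_name_cb_t)(demo_buffer_tracing_kind_t kind,
                                   const char*                kind_name,
                                   void*                      data);

/* Hypothetical stand-in for rocprofiler_iterate_buffer_tracing_kind_names:
 * invokes the callback once per kind, skipping NONE and LAST.
 * Assumption: a nonzero return from the callback stops the iteration. */
static int
demo_iterate_kind_names(demo_kind_name_cb_t callback, void* data)
{
    static const char* const names[] = {NULL, "HSA_API", "HIP_API"};
    for(int k = DEMO_BUFFER_TRACING_NONE + 1; k < DEMO_BUFFER_TRACING_LAST; ++k)
    {
        int ret = callback((demo_buffer_tracing_kind_t) k, names[k], data);
        if(ret != 0) return ret;
    }
    return 0;
}

/* Client-side cache, indexed by kind. */
typedef struct
{
    const char* names[DEMO_BUFFER_TRACING_LAST];
    size_t      count;
} demo_name_cache_t;

/* Callback: store the string pointer. No copy is made because the strings
 * are documented to stay valid for the duration of the program. */
static int
demo_cache_kind_name(demo_buffer_tracing_kind_t kind, const char* kind_name, void* data)
{
    demo_name_cache_t* cache = (demo_name_cache_t*) data;
    assert(cache != NULL);
    cache->names[kind] = kind_name;
    cache->count++;
    return 0;
}

/* Build the cache with a single iteration. */
static demo_name_cache_t
demo_build_name_cache(void)
{
    demo_name_cache_t cache = {{NULL}, 0};
    demo_iterate_kind_names(demo_cache_kind_name, &cache);
    return cache;
}
```

A tool would call `demo_build_name_cache()` once during initialization and afterwards look names up by indexing `cache.names[kind]`.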
|
||||
@@ -0,0 +1,252 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
#include <rocprofiler/hsa.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup CALLBACK_TRACING_SERVICE Synchronous Tracing Services
|
||||
*
|
||||
* Receive immediate callbacks on the calling thread
|
||||
*
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler HSA API Callback Data.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
size_t size; ///< provides the size of this struct
|
||||
rocprofiler_hsa_api_args_t args;
|
||||
rocprofiler_hsa_api_retval_t retval;
|
||||
} rocprofiler_hsa_api_callback_tracer_data_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler HIP API Callback Data.
|
||||
*
|
||||
* Depending on the operation kind, the data can be cast to the corresponding
|
||||
* structure.
|
||||
*
|
||||
*/
|
||||
typedef void* rocprofiler_hip_api_callback_api_data_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler HIP API Tracer Callback Data.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
size_t size;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_address_t host_kernel_address;
|
||||
rocprofiler_hip_api_callback_api_data_t data; // Arguments or api_data?
|
||||
} rocprofiler_hip_api_callback_tracer_data_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Marker Callback Data.
|
||||
*
|
||||
* Depending on the operation kind, the data can be cast to the corresponding
|
||||
* structure.
|
||||
*
|
||||
*/
|
||||
typedef void* rocprofiler_marker_callback_api_data_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Marker Tracer Callback Data.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
size_t size;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_marker_callback_api_data_t data; // Arguments or api_data?
|
||||
} rocprofiler_marker_callback_tracer_data_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object Load Tracer Callback Record.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t load_base; // code object load base
|
||||
uint64_t load_size; // code object load size
|
||||
const char* uri; // URI string (NULL terminated)
|
||||
// uint32_t storage_type; // code object storage type (Need Review?)
|
||||
// int storage_file; // origin file descriptor (Need Review?)
|
||||
// uint64_t memory_base; // origin memory base (Need Review?)
|
||||
// uint64_t memory_size; // origin memory size (Need Review?)
|
||||
// uint64_t load_delta; // code object load delta (Need Review?)
|
||||
} rocprofiler_callback_tracer_code_object_load_data_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object UnLoad Tracer Callback Record.
|
||||
*
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t load_base; // code object load base
|
||||
} rocprofiler_callback_tracer_code_object_unload_data_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object Device Kernel Symbol Tracer Callback Record.
|
||||
*
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
const char* kernel_name; // kernel name string (NULL terminated)
|
||||
rocprofiler_address_t kernel_descriptor; // kernel descriptor
|
||||
} rocprofiler_callback_tracer_code_object_device_kernel_symbol_data_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object Register Host Kernel Symbol Tracer Callback
|
||||
* Record.
|
||||
*
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_address_t host_address; // host address
|
||||
// Should this be nullptr if it is unregister?
|
||||
const char* kernel_name; // kernel name string (NULL terminated)
|
||||
rocprofiler_address_t kernel_descriptor; // kernel descriptor
|
||||
} rocprofiler_callback_tracer_code_object_register_host_kernel_symbol_data_t;
|
||||
|
||||
/**
|
||||
* @brief API Tracing callback function.
|
||||
*/
|
||||
typedef void (*rocprofiler_callback_tracing_cb_t)(rocprofiler_callback_tracing_record_t record,
|
||||
void* user_data);
|
||||
|
||||
/**
|
||||
* @brief Callback function for mapping @ref rocprofiler_service_callback_tracing_kind_t ids to
|
||||
* string names. @see rocprofiler_iterate_callback_tracing_kind_names.
|
||||
*/
|
||||
typedef int (*rocprofiler_callback_tracing_kind_name_cb_t)(
|
||||
rocprofiler_service_callback_tracing_kind_t kind,
|
||||
const char* kind_name,
|
||||
void* data);
|
||||
|
||||
/**
|
||||
* @brief Callback function for mapping the operations of a given @ref
|
||||
* rocprofiler_service_callback_tracing_kind_t to string names. @see
|
||||
* rocprofiler_iterate_callback_tracing_kind_operation_names.
|
||||
*/
|
||||
typedef int (*rocprofiler_callback_tracing_operation_name_cb_t)(
|
||||
rocprofiler_service_callback_tracing_kind_t kind,
|
||||
uint32_t operation,
|
||||
const char* operation_name,
|
||||
void* data);
|
||||
|
||||
/**
|
||||
* @brief Callback function for iterating over the function arguments to a traced function.
|
||||
* This function will be invoked for each argument.
|
||||
* @see rocprofiler_iterate_callback_tracing_operation_args
|
||||
*
|
||||
* @param [in] kind domain
* @param [in] operation associated domain operation
* @param [in] arg_number the argument number, starting at zero
* @param [in] arg_name the name of the argument in the prototype (or rocprofiler union)
* @param [in] arg_value_str conversion of the argument to a string, e.g. operator<< overload
* @param [in] arg_value_addr the address of the argument stored by rocprofiler.
* @param [in] data user data
|
||||
*/
|
||||
typedef int (*rocprofiler_callback_tracing_operation_args_cb_t)(
|
||||
rocprofiler_service_callback_tracing_kind_t kind,
|
||||
uint32_t operation,
|
||||
uint32_t arg_number,
|
||||
const char* arg_name,
|
||||
const char* arg_value_str,
|
||||
const void* const arg_value_addr,
|
||||
void* data);
|
||||
|
||||
/**
|
||||
* @brief Configure Callback Tracing Service.
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @param [in] kind
|
||||
* @param [in] operations
|
||||
* @param [in] operations_count
|
||||
* @param [in] callback
|
||||
* @param [in] callback_args
|
||||
* @return ::rocprofiler_status_t
|
||||
*
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_configure_callback_tracing_service(rocprofiler_context_id_t context_id,
|
||||
rocprofiler_service_callback_tracing_kind_t kind,
|
||||
rocprofiler_tracing_operation_t* operations,
|
||||
size_t operations_count,
|
||||
rocprofiler_callback_tracing_cb_t callback,
|
||||
void* callback_args);
|
||||
|
||||
/**
|
||||
* @brief Iterate over all the mappings of the callback tracing kinds and get a callback with the id
|
||||
* mapped to a constant string. The strings provided in the arg will be valid pointers for the
|
||||
* entire duration of the program. It is recommended to call this function once and cache this data
|
||||
* in the client instead of making multiple on-demand calls.
|
||||
*
|
||||
* @param [in] callback Callback function invoked for each enumeration value in @ref
|
||||
* rocprofiler_service_callback_tracing_kind_t with the exception of the `NONE` and `LAST` values.
|
||||
* @param [in] data User data passed back into the callback
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_iterate_callback_tracing_kind_names(
|
||||
rocprofiler_callback_tracing_kind_name_cb_t callback,
|
||||
void* data) ROCPROFILER_NONNULL(1);
|
||||
|
||||
/**
|
||||
* @brief Iterates over all the mappings of the operations for a given @ref
|
||||
* rocprofiler_service_callback_tracing_kind_t and invokes the callback with the kind, operation id,
|
||||
* and the string mapping to the operation id. The strings provided in the callback arg will be
|
||||
* valid pointers for the entire duration of the program. It is recommended to call this function
|
||||
* once per kind, and cache this data in the client instead of making multiple on-demand calls.
|
||||
*
|
||||
* @param [in] kind which tracing callback kind operations to iterate over
|
||||
* @param [in] callback Callback function invoked for each operation associated with @ref
|
||||
* rocprofiler_service_callback_tracing_kind_t with the exception of the `NONE` and `LAST` values.
|
||||
* @param [in] data User data passed back into the callback
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_iterate_callback_tracing_kind_operation_names(
|
||||
rocprofiler_service_callback_tracing_kind_t kind,
|
||||
rocprofiler_callback_tracing_operation_name_cb_t callback,
|
||||
void* data) ROCPROFILER_NONNULL(2);
|
||||
|
||||
/**
|
||||
* @brief Iterates over all the arguments for the traced function (when available). This is
|
||||
* particularly useful when tools want to annotate traces with the function arguments. See
|
||||
* @example samples/api_callback_tracing/client.cpp for a usage example.
|
||||
*
|
||||
* @param[in] record Record provided by service callback
|
||||
* @param[in] callback The callback function which will be invoked for each argument
|
||||
* @param[in] user_data Data to be passed to each invocation of the callback
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_iterate_callback_tracing_operation_args(
|
||||
rocprofiler_callback_tracing_record_t record,
|
||||
rocprofiler_callback_tracing_operation_args_cb_t callback,
|
||||
void* user_data) ROCPROFILER_NONNULL(2);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
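The tracer data structs above begin with a `size` field that reports the size of the struct. One plausible use, sketched here under assumptions, is to let a tool check that a field exists before reading it when the tool and the library were built against different header versions. The `demo_tracer_data_t` layout and `demo_has_retval` helper are hypothetical and only mirror the leading `size` member of the real structs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload layout. Only the leading `size` member mirrors the
 * structs in the header; `args` and `retval` are simplified placeholders. */
typedef struct
{
    size_t   size;   /* size of the struct as known to the producer */
    uint64_t args;
    uint64_t retval;
} demo_tracer_data_t;

/* Returns 1 if the producer's struct is large enough to contain `retval`,
 * and 0 otherwise, so the consumer never reads past the end of a smaller
 * struct written by an older producer. */
static int
demo_has_retval(const demo_tracer_data_t* data)
{
    assert(data != NULL);
    return data->size >= offsetof(demo_tracer_data_t, retval) + sizeof(data->retval);
}
```

A callback would call `demo_has_retval(data)` before touching `data->retval` and fall back to the fields that are guaranteed to be present otherwise.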
|
||||
@@ -1,210 +0,0 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#ifdef __cplusplus
|
||||
extern "C" {
|
||||
#endif
|
||||
|
||||
#define ROCPROFILER_API_VERSION_ID 1
|
||||
#define ROCPROFILER_DOMAIN_OPS_MAX 512
|
||||
#define ROCPROFILER_DOMAIN_OPS_RESERVED \
|
||||
((ROCPROFILER_DOMAIN_OPS_MAX * ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST / 8))
|
||||
|
||||
typedef uint64_t (*rocprofiler_external_cid_cb_t)(rocprofiler_tracer_activity_domain_t,
|
||||
uint32_t,
|
||||
uint64_t);
|
||||
typedef int (*rocprofiler_filter_name_t)(const char*);
|
||||
typedef int (*rocprofiler_filter_op_id_t)(uint32_t);
|
||||
typedef int (*rocprofiler_filter_range_t)(uint32_t, uint32_t);
|
||||
typedef int (*rocprofiler_filter_dispatch_id_t)(uint64_t);
|
||||
|
||||
/// permits tools opportunity to modify the correlation id based on the domain, op, and
|
||||
/// the rocprofiler generated correlation id
|
||||
struct rocprofiler_correlation_config
|
||||
{
|
||||
rocprofiler_external_cid_cb_t external_id_callback;
|
||||
};
|
||||
|
||||
/// how the tools specify the tracing domain and (optionally) which operations in the
|
||||
/// domain they want to trace
|
||||
struct rocprofiler_domain_config
|
||||
{
|
||||
rocprofiler_tracer_callback_t callback;
|
||||
char reserved0[sizeof(uint64_t)];
|
||||
char reserved1[ROCPROFILER_DOMAIN_OPS_RESERVED];
|
||||
};
|
||||
|
||||
/// for buffered callbacks, the tool provides a callback to create a buffer and the size
|
||||
struct rocprofiler_buffer_config
|
||||
{
|
||||
rocprofiler_buffer_callback_t callback;
|
||||
uint64_t buffer_size;
|
||||
// void* reserved0;
|
||||
char reserved1[sizeof(uint64_t)];
|
||||
};
|
||||
|
||||
/// filters are available to make quick decisions about whether rocprofiler should
/// assemble the data necessary for a callback. This is mostly for convenience and
/// performance: any decision made here could also be made in the callback, but then
/// rocprofiler must first assemble all the information for the callback, only for it
/// to be discarded because the tool has decided (after configuration) that it no
/// longer wants info meeting certain requirements
|
||||
struct rocprofiler_filter_config
|
||||
{
|
||||
// filter callbacks
|
||||
rocprofiler_filter_name_t name;
|
||||
rocprofiler_filter_op_id_t hip_function_id;
|
||||
rocprofiler_filter_op_id_t hsa_function_id;
|
||||
rocprofiler_filter_range_t range;
|
||||
rocprofiler_filter_dispatch_id_t dispatch_id;
|
||||
|
||||
// reserved padding
|
||||
char padding[24 * sizeof(void*)];
|
||||
};
|
||||
|
||||
/// this is the "single source of truth" for the capabilities of rocprofiler.
|
||||
/// you can have one configuration that activates all the capabilities you want
|
||||
/// and holistically start/stop the sum of those features. Alternatively,
|
||||
/// you can have multiple configurations in order to activate certain features
|
||||
/// modularly.
|
||||
///
|
||||
/// The general workflow is:
|
||||
///
|
||||
/// 1. invoke rocprofiler_allocate_config(...)
|
||||
/// - rocprofiler allocates any space internally needed for the config
|
||||
/// - rocprofiler sets a few initial values:
|
||||
/// - "size" to the size of the config structure used internally
|
||||
/// - "api_version" to the version id of the API in the rocprofiler library that
|
||||
/// is being used.
|
||||
/// - these two values can be used by the tool to identify any potential
|
||||
/// incompatibilities that the tool might want to know about
|
||||
/// - rocprofiler checks whether it is too late to configure the tool, e.g.
|
||||
/// something went wrong and rocprofiler was not able to set itself up as
|
||||
/// the interceptor
|
||||
/// 2. tool sets up the configuration struct and sets the "size" variable to the size of
|
||||
/// their configuration struct and sets the "compat_version" field to the
|
||||
/// ROCPROFILER_API_VERSION_ID defined by the rocprofiler headers when the tool was
|
||||
/// built
|
||||
/// - in other words, the user can communicate to rocprofiler, don't read
|
||||
/// past this distance in my configuration struct and I built against X version
|
||||
/// so assume the default behavior and capabilities of version X.
|
||||
/// 3. tool passes this struct to rocprofiler_validate_config(...)
|
||||
/// - this step checks the config in isolation and will communicate any potential
|
||||
/// warnings/issues with that configuration, e.g. rocprofiler_X_config is needed,
|
||||
/// HW counters XYZ are not available, etc. The tool then has an opportunity
|
||||
/// to address these issues however they see fit.
|
||||
/// 4. tool passes this struct to rocprofiler_start_config(...)
|
||||
/// - internally, we make a call to rocprofiler_validate_config(...) and if any
|
||||
/// issues still exist with the config in isolation, rocprofiler tells the app
|
||||
/// to abort -- mechanisms were provided to prevent aborting prior to this call,
|
||||
/// aborting the app at this point is to guard against rocprofiler "silently"
|
||||
/// not working because error codes were ignored
|
||||
/// - rocprofiler then checks whether this config can actually be activated
|
||||
/// alongside any other active configuration, e.g. this config wants 4 HW counters
|
||||
/// and another wants 4 HW counters but we can only activate 6 out of 8 of
|
||||
/// them in this run. Any issues here will not abort execution but, instead,
|
||||
/// the features of this configuration will not happen (i.e. config won't be
|
||||
/// activated) and the issues will be communicated with error codes -- giving
|
||||
/// the tool the opportunity to address the conflicts (i.e. only request tracing
|
||||
/// and no HW counters) before attempting to activate the modified config.
|
||||
/// - once rocprofiler determines all features of a config can be activated, it
|
||||
/// makes an internal copy of the config and returns an identifier for that
|
||||
/// configuration. The tool is then free to delete the config and any modification
|
||||
/// to the config will NOT be reflected in the behavior of rocprofiler.
|
||||
///
|
||||
///
|
||||
struct rocprofiler_config
|
||||
{
|
||||
// size is used to ensure that we never read past the end of the version
|
||||
size_t size; // = sizeof(rocprofiler_config)
|
||||
uint32_t compat_version; // set by user
|
||||
uint32_t api_version; // set by rocprofiler
|
||||
uint64_t reserved0; // internal field
|
||||
void* user_data; // data passed to callbacks
|
||||
struct rocprofiler_correlation_config* correlation_id; // = &my_cid_config (optional)
|
||||
struct rocprofiler_buffer_config* buffer; // = &my_buffer_config (required)
|
||||
struct rocprofiler_domain_config* domain; // = &my_domain_config (required)
|
||||
struct rocprofiler_filter_config* filter; // = &my_filter_config (optional)
|
||||
};
|
||||
|
||||
/// \brief returns a properly initialized config struct and allocates any data structures
|
||||
/// necessary for the config to be used
|
||||
///
|
||||
/// \param [out] cfg may adjust config or assign values within structs.
|
||||
rocprofiler_status_t
|
||||
rocprofiler_allocate_config(struct rocprofiler_config* cfg);
|
||||
|
||||
/// \brief rocprofiler validates config, checks for conflicts, etc. Ensures that
|
||||
/// the configuration is valid *in isolation*, e.g. it may check that the user
|
||||
/// set the compat_version field and that required config fields, such as buffer
|
||||
/// are set. This function will be called before \ref rocprofiler_start_config
|
||||
/// but is provided to help the user validate one or more configs without starting
|
||||
/// them
|
||||
///
|
||||
/// \param [in] cfg configuration to validate
|
||||
rocprofiler_status_t
|
||||
rocprofiler_validate_config(const struct rocprofiler_config* cfg);
|
||||
|
||||
/// \brief rocprofiler activates configuration and provides a context identifier
|
||||
/// \param [in] cfg may adjust config or assign values within structs. If an error
/// occurs, pointers to valid sub-configs may be set to nullptr while pointers to
/// invalid configs are left in place
|
||||
/// \param [out] id the context identifier for this config.
|
||||
rocprofiler_status_t
|
||||
rocprofiler_start_config(struct rocprofiler_config*, rocprofiler_context_id_t* id);
|
||||
|
||||
/// \brief disable the configuration.
|
||||
rocprofiler_status_t rocprofiler_stop_config(rocprofiler_context_id_t);
|
||||
|
||||
///
|
||||
///
|
||||
/// the following 4 functions may be changed to permit removing domain/ops and/or
|
||||
/// identifying domains and operations via strings
|
||||
///
|
||||
///
|
||||
rocprofiler_status_t
|
||||
rocprofiler_domain_set_domain(struct rocprofiler_domain_config*,
|
||||
rocprofiler_tracer_activity_domain_t);
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_domain_add_domains(struct rocprofiler_domain_config*,
|
||||
rocprofiler_tracer_activity_domain_t*,
|
||||
size_t);
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_domain_add_op(struct rocprofiler_domain_config*,
|
||||
rocprofiler_tracer_activity_domain_t,
|
||||
uint32_t);
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_domain_add_ops(struct rocprofiler_domain_config*,
|
||||
rocprofiler_tracer_activity_domain_t,
|
||||
uint32_t*,
|
||||
size_t);
|
||||
|
||||
#ifdef __cplusplus
|
||||
}
|
||||
#endif
|
||||
@@ -0,0 +1,91 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/**
|
||||
* @defgroup CONTEXT_OPERATIONS Context
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* The NULL Context handle.
|
||||
*/
|
||||
#define ROCPROFILER_CONTEXT_NONE ROCPROFILER_HANDLE_LITERAL(rocprofiler_context_id_t, UINT64_MAX)
|
||||
|
||||
/**
|
||||
* @brief Create context.
|
||||
*
|
||||
* @param [out] context_id Context identifier
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_create_context(rocprofiler_context_id_t* context_id) ROCPROFILER_NONNULL(1);
|
||||
|
||||
/**
|
||||
* @brief Start context.
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_start_context(rocprofiler_context_id_t context_id);
|
||||
|
||||
/**
|
||||
* @brief Stop context.
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_stop_context(rocprofiler_context_id_t context_id);
|
||||
|
||||
/**
|
||||
* @brief Query whether context is active.
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @param [out] status If context is active, this will be a nonzero value
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_context_is_active(rocprofiler_context_id_t context_id, int* status)
|
||||
ROCPROFILER_NONNULL(2);
|
||||
|
||||
/**
|
||||
* @brief Query whether the context is valid
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @param [out] status If context is invalid, this will be a nonzero value
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_context_is_valid(rocprofiler_context_id_t context_id, int* status)
|
||||
ROCPROFILER_NONNULL(2);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
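The declarations above suggest a create, start, query, stop call order. The sketch below walks through that order against toy in-memory stand-ins. Every `demo_` name is hypothetical, and the behaviour of the stand-ins (a fixed table of four contexts) is an assumption made so the example compiles on its own; it is not taken from the rocprofiler implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real types are declared in rocprofiler/fwd.h. */
typedef enum
{
    DEMO_STATUS_SUCCESS = 0,
    DEMO_STATUS_ERROR,
    DEMO_STATUS_ERROR_CONTEXT_NOT_FOUND,
} demo_status_t;

typedef struct
{
    uint64_t handle;
} demo_context_id_t;

#define DEMO_MAX_CONTEXTS 4

/* Assumed bookkeeping: one "created" flag and one "active" flag per slot. */
static int demo_created[DEMO_MAX_CONTEXTS];
static int demo_active[DEMO_MAX_CONTEXTS];

static demo_status_t
demo_create_context(demo_context_id_t* context_id)
{
    for(uint64_t i = 0; i < DEMO_MAX_CONTEXTS; ++i)
    {
        if(demo_created[i] == 0)
        {
            demo_created[i]    = 1;
            demo_active[i]     = 0;
            context_id->handle = i;
            return DEMO_STATUS_SUCCESS;
        }
    }
    return DEMO_STATUS_ERROR; /* table is full */
}

static int
demo_exists(demo_context_id_t context_id)
{
    return context_id.handle < DEMO_MAX_CONTEXTS && demo_created[context_id.handle] != 0;
}

static demo_status_t
demo_set_active(demo_context_id_t context_id, int active)
{
    if(!demo_exists(context_id)) return DEMO_STATUS_ERROR_CONTEXT_NOT_FOUND;
    demo_active[context_id.handle] = active;
    return DEMO_STATUS_SUCCESS;
}

static demo_status_t
demo_start_context(demo_context_id_t context_id)
{
    return demo_set_active(context_id, 1);
}

static demo_status_t
demo_stop_context(demo_context_id_t context_id)
{
    return demo_set_active(context_id, 0);
}

static demo_status_t
demo_context_is_active(demo_context_id_t context_id, int* status)
{
    if(!demo_exists(context_id)) return DEMO_STATUS_ERROR_CONTEXT_NOT_FOUND;
    *status = demo_active[context_id.handle];
    return DEMO_STATUS_SUCCESS;
}

/* Runs create -> start -> query -> stop -> query.
 * Returns the final "active" flag, or -1 if any step fails. */
static int
demo_run_lifecycle(void)
{
    demo_context_id_t ctx    = {UINT64_MAX}; /* same value as ROCPROFILER_CONTEXT_NONE */
    int               active = 0;

    if(demo_create_context(&ctx) != DEMO_STATUS_SUCCESS) return -1;
    if(demo_start_context(ctx) != DEMO_STATUS_SUCCESS) return -1;
    if(demo_context_is_active(ctx, &active) != DEMO_STATUS_SUCCESS || !active) return -1;
    if(demo_stop_context(ctx) != DEMO_STATUS_SUCCESS) return -1;
    if(demo_context_is_active(ctx, &active) != DEMO_STATUS_SUCCESS) return -1;
    return active;
}
```

Checking every returned status, as `demo_run_lifecycle` does, keeps a failed create from silently turning into a start on an uninitialised handle.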
|
||||
@@ -0,0 +1,73 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/agent.h>
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup COUNTERS Hardware counters
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief Query Counter name.
|
||||
*
|
||||
* @param [in] counter_id
|
||||
* @param [out] name if nullptr, size will be returned
|
||||
* @param [out] size
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_query_counter_name(rocprofiler_counter_id_t counter_id, const char* name, size_t* size)
|
||||
ROCPROFILER_NONNULL(3);
|
||||
|
||||
/**
|
||||
* @brief Query Counter Instances Count.
|
||||
*
|
||||
* @param [in] counter_id
|
||||
* @param [out] instance_count
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_query_counter_instance_count(rocprofiler_counter_id_t counter_id,
|
||||
size_t* instance_count) ROCPROFILER_NONNULL(2);
|
||||
|
||||
/**
|
||||
* @brief Query Agent Counters Availability.
|
||||
*
|
||||
* @param [in] agent
|
||||
* @param [out] counters_list
|
||||
* @param [out] counters_count
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_query_agent_supported_counters(rocprofiler_agent_t agent,
|
||||
rocprofiler_counter_id_t* counters_list,
|
||||
size_t* counters_count) ROCPROFILER_NONNULL(2, 3);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
@@ -22,6 +22,29 @@
|
||||
|
||||
#pragma once
|
||||
|
||||
/** @defgroup SYMBOL_VERSIONING_GROUP Symbol Versions
|
||||
*
|
||||
* The names used for the shared library versioned symbols.
|
||||
*
|
||||
* Every function is annotated with one of the version macros defined in this
|
||||
* section. Each macro specifies a corresponding symbol version string. After
|
||||
* dynamically loading the shared library with @p dlopen, the address of each
|
||||
* function can be obtained using @p dlvsym with the name of the function and
|
||||
* its corresponding symbol version string. An error will be reported by @p
|
||||
* dlvsym if the installed library does not support the version for the
|
||||
* function specified in this version of the interface.
|
||||
*
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief The function was introduced in version 10.0 of the interface and has the
|
||||
* symbol version string of ``"ROCPROFILER_10.0"``.
|
||||
*/
|
||||
#define ROCPROFILER_VERSION_10_0
|
||||
|
||||
/** @} */
|
||||
|
||||
#if !defined(ROCPROFILER_ATTRIBUTE)
|
||||
# if defined(_MSC_VER)
|
||||
# define ROCPROFILER_ATTRIBUTE(...) __declspec(__VA_ARGS__)
|
||||
@@ -95,3 +118,11 @@
|
||||
value \
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifdef __cplusplus
|
||||
# define ROCPROFILER_EXTERN_C_INIT extern "C" {
|
||||
# define ROCPROFILER_EXTERN_C_FINI }
|
||||
#else
|
||||
# define ROCPROFILER_EXTERN_C_INIT
|
||||
# define ROCPROFILER_EXTERN_C_FINI
|
||||
#endif
|
||||
|
||||
@@ -0,0 +1,97 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/agent.h>
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
#include <rocprofiler/hsa.h>
|
||||
#include <rocprofiler/profile_config.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup DISPATCH_PROFILE_COUNTING_SERVICE Dispatch Profile Counting
|
||||
* Service
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Profile Counting Data.
|
||||
*
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_timestamp_t start_timestamp;
|
||||
rocprofiler_timestamp_t end_timestamp;
|
||||
/**
|
||||
* Counters, including identifiers to get counter information and Counters
|
||||
* values
|
||||
*
|
||||
* Should it be a record per counter?
|
||||
*/
|
||||
rocprofiler_record_counter_t* counters;
|
||||
uint64_t counters_count;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
} rocprofiler_dispatch_profile_counting_record_t;
|
||||
|
||||
/**
|
||||
* @brief Kernel Dispatch Callback
|
||||
*
|
||||
* @param [out] queue_id
|
||||
* @param [out] agent_id
|
||||
* @param [out] correlation_id
|
||||
* @param [out] dispatch_packet It can be used to get the kernel descriptor and then using
|
||||
* code_object tracing, we can get the kernel name. `dispatch_packet->reserved2` is the
|
||||
* correlation_id used to correlate the dispatch packet with the corresponding API call.
|
||||
* @param [out] callback_data_args
|
||||
* @param [in] config
|
||||
*/
|
||||
typedef void (*rocprofiler_profile_counting_dispatch_callback_t)(
|
||||
rocprofiler_queue_id_t queue_id,
|
||||
rocprofiler_agent_t agent_id,
|
||||
rocprofiler_correlation_id_t correlation_id,
|
||||
const hsa_kernel_dispatch_packet_t* dispatch_packet,
|
||||
void* callback_data_args,
|
||||
rocprofiler_profile_config_id_t* config);
|
||||
|
||||
/**
|
||||
* @brief Configure Dispatch Profile Counting Service.
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @param [in] agent_id
|
||||
* @param [in] buffer_id
|
||||
* @param [in] callback
|
||||
* @param [in] callback_data_args
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_configure_dispatch_profile_counting_service(
|
||||
rocprofiler_context_id_t context_id,
|
||||
rocprofiler_agent_t agent_id,
|
||||
rocprofiler_buffer_id_t buffer_id,
|
||||
rocprofiler_profile_counting_dispatch_callback_t callback,
|
||||
void* callback_data_args);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
@@ -0,0 +1,60 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/**
|
||||
* @defgroup EXTERNAL_CORRELATION External Correlation IDs
|
||||
*
|
||||
* User-defined correlation identifiers to supplement rocprofiler generated correlation ids
|
||||
*
|
||||
* @{
|
||||
*/
|
||||
|
||||
/** @} */
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Push External Correlation ID.
|
||||
*
|
||||
* @param external_correlation_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_push_external_correlation_id(
|
||||
rocprofiler_external_correlation_id_t external_correlation_id);
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Pop External Correlation ID.
|
||||
*
|
||||
* @param external_correlation_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_pop_external_correlation_id(
|
||||
rocprofiler_external_correlation_id_t* external_correlation_id);
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
@@ -0,0 +1,457 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
//--------------------------------------------------------------------------------------//
|
||||
//
|
||||
// ENUMERATIONS
|
||||
//
|
||||
//--------------------------------------------------------------------------------------//
|
||||
|
||||
/**
|
||||
* @defgroup BASIC_DATA_TYPES Basic data types
|
||||
*
|
||||
* Basic data types and typedefs
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief Status codes.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_STATUS_SUCCESS = 0, ///< No error occurred
|
||||
ROCPROFILER_STATUS_ERROR, ///< Generalized error
|
||||
ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND, ///< No valid context for given context id
|
||||
ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND, ///< No valid buffer for given buffer id
|
||||
ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND, ///< Domain identifier is invalid
|
||||
ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND, ///< Operation identifier is invalid for domain
|
||||
ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND, ///< No valid thread for given thread id
|
||||
ROCPROFILER_STATUS_ERROR_CONTEXT_ERROR, ///< Generalized context error
|
||||
ROCPROFILER_STATUS_ERROR_CONTEXT_INVALID, ///< Context configuration is not valid
|
||||
ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_STARTED, ///< Context was not started (maybe already
|
||||
///< started or atomic swap into active array
|
||||
///< failed)
|
||||
ROCPROFILER_STATUS_ERROR_BUFFER_BUSY, ///< buffer operation failed because it is currently busy
|
||||
///< handling another request (e.g. flushing)
|
||||
ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED, ///< service has already been configured
|
||||
///< in context
|
||||
ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED, ///< Function call is not valid outside of
|
||||
///< rocprofiler configuration (i.e.
|
||||
///< function called post-initialization)
|
||||
ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED, ///< Function is not implemented
|
||||
ROCPROFILER_STATUS_LAST,
|
||||
} rocprofiler_status_t;
|
||||
|
||||
/**
|
||||
* @brief Buffer record categories. This enumeration type is encoded in @ref
|
||||
* rocprofiler_record_header_t category field
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_BUFFER_CATEGORY_NONE = 0,
|
||||
ROCPROFILER_BUFFER_CATEGORY_TRACING,
|
||||
ROCPROFILER_BUFFER_CATEGORY_PC_SAMPLING,
|
||||
ROCPROFILER_BUFFER_CATEGORY_LAST,
|
||||
} rocprofiler_buffer_category_t;
|
||||
|
||||
/**
|
||||
* @brief Agent type.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_AGENT_TYPE_NONE = 0, ///< Agent type is unknown
|
||||
ROCPROFILER_AGENT_TYPE_CPU, ///< Agent type is a CPU
|
||||
ROCPROFILER_AGENT_TYPE_GPU, ///< Agent type is a GPU
|
||||
ROCPROFILER_AGENT_TYPE_LAST,
|
||||
} rocprofiler_agent_type_t;
|
||||
|
||||
/**
|
||||
* @brief Service Callback Phase.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_SERVICE_CALLBACK_PHASE_NONE = 0, ///< Callback has no phase
|
||||
ROCPROFILER_SERVICE_CALLBACK_PHASE_ENTER, ///< Callback invoked prior to function execution
|
||||
ROCPROFILER_SERVICE_CALLBACK_PHASE_EXIT, ///< Callback invoked after function execution
|
||||
ROCPROFILER_SERVICE_CALLBACK_PHASE_LAST,
|
||||
} rocprofiler_service_callback_phase_t;
|
||||
|
||||
/**
|
||||
* @brief Service Callback Tracing Kind.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_SERVICE_CALLBACK_TRACING_NONE = 0,
|
||||
ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API, ///< Callbacks for HSA functions
|
||||
ROCPROFILER_SERVICE_CALLBACK_TRACING_HIP_API, ///< Callbacks for HIP functions
|
||||
ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API, ///< Callbacks for ROCTx functions
|
||||
ROCPROFILER_SERVICE_CALLBACK_TRACING_CODE_OBJECT, ///< Callbacks for code object info
|
||||
ROCPROFILER_SERVICE_CALLBACK_TRACING_KERNEL_DISPATCH, ///< Callbacks for kernel dispatches
|
||||
ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST,
|
||||
} rocprofiler_service_callback_tracing_kind_t;
|
||||
|
||||
/**
|
||||
* @brief Service Buffer Tracing Kind.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_NONE = 0,
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, ///< Buffer HSA function calls
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_HIP_API, ///< Buffer HIP function calls
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_MARKER_API, ///< Buffer ROCTx function calls
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_MEMORY_COPY, ///< Buffer memory copy info
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_KERNEL_DISPATCH, ///< Buffer kernel dispatch info
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_PAGE_MIGRATION, ///< Buffer page migration info
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_SCRATCH_MEMORY, ///< Buffer scratch memory reclamation info
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_EXTERNAL_CORRELATION, ///< Buffer external correlation info
|
||||
// To determine if this is possible to implement?
|
||||
// ROCPROFILER_SERVICE_BUFFER_TRACING_QUEUE_SCHEDULING,
|
||||
ROCPROFILER_SERVICE_BUFFER_TRACING_LAST,
|
||||
} rocprofiler_service_buffer_tracing_kind_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Code Object Tracer Operation.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_NONE = 0,
|
||||
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_LOAD,
|
||||
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_UNLOAD,
|
||||
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_REGISTER,
|
||||
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_UNREGISTER,
|
||||
// next two are part of hipRegisterFunction API.
|
||||
// ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_HOST_KERNEL_SYMBOL_REGISTER,
|
||||
// ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_HOST_KERNEL_SYMBOL_UNREGISTER,
|
||||
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_LAST,
|
||||
} rocprofiler_callback_tracing_code_object_operation_t;
|
||||
|
||||
/**
|
||||
* @brief Memory Copy Operation.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_NONE = 0,
|
||||
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_DEVICE_TO_HOST,
|
||||
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_HOST_TO_DEVICE,
|
||||
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_DEVICE_TO_DEVICE,
|
||||
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_LAST,
|
||||
} rocprofiler_buffer_tracing_memory_copy_operation_t;
|
||||
|
||||
/**
|
||||
* @brief PC Sampling Method.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_PC_SAMPLING_METHOD_NONE = 0,
|
||||
ROCPROFILER_PC_SAMPLING_METHOD_STOCHASTIC,
|
||||
ROCPROFILER_PC_SAMPLING_METHOD_HOST_TRAP,
|
||||
ROCPROFILER_PC_SAMPLING_METHOD_LAST,
|
||||
} rocprofiler_pc_sampling_method_t;
|
||||
|
||||
/**
|
||||
* @brief PC Sampling Unit.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_PC_SAMPLING_UNIT_NONE = 0, ///< Sample interval has unspecified units
|
||||
ROCPROFILER_PC_SAMPLING_UNIT_INSTRUCTIONS, ///< Sample interval is in instructions
|
||||
ROCPROFILER_PC_SAMPLING_UNIT_CYCLES, ///< Sample interval is in cycles
|
||||
ROCPROFILER_PC_SAMPLING_UNIT_TIME, ///< Sample interval is in nanoseconds
|
||||
ROCPROFILER_PC_SAMPLING_UNIT_LAST,
|
||||
} rocprofiler_pc_sampling_unit_t;
|
||||
|
||||
/**
|
||||
* @brief Actions when Buffer is full.
|
||||
*/
|
||||
typedef enum // NOLINT(performance-enum-size)
|
||||
{
|
||||
ROCPROFILER_BUFFER_POLICY_NONE = 0, ///< No policy has been set
|
||||
ROCPROFILER_BUFFER_POLICY_DISCARD, ///< Drop records when buffer is full
|
||||
ROCPROFILER_BUFFER_POLICY_LOSSLESS, ///< Block when buffer is full
|
||||
ROCPROFILER_BUFFER_POLICY_LAST,
|
||||
} rocprofiler_buffer_policy_t;
|
||||
|
||||
//--------------------------------------------------------------------------------------//
|
||||
//
|
||||
// ALIASES
|
||||
//
|
||||
//--------------------------------------------------------------------------------------//
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Timestamp.
|
||||
*/
|
||||
typedef uint64_t rocprofiler_timestamp_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Address.
|
||||
*/
|
||||
typedef uint64_t rocprofiler_address_t;
|
||||
|
||||
/**
|
||||
* @brief Thread ID. Value will be equivalent to `syscall(__NR_gettid)`
|
||||
*/
|
||||
typedef uint64_t rocprofiler_thread_id_t;
|
||||
|
||||
/**
|
||||
* @brief Tracing Operation ID. Depending on the kind, operations can be determined.
|
||||
* If the value is equal to zero that means all operations will be considered
|
||||
* for tracing.
|
||||
*/
|
||||
typedef uint32_t rocprofiler_tracing_operation_t;
|
||||
|
||||
/**
|
||||
* @brief Needs non-typedef specification?
|
||||
*/
|
||||
typedef uint32_t rocprofiler_counter_instance_id_t;
|
||||
|
||||
// forward declaration of struct
|
||||
typedef struct rocprofiler_pc_sampling_configuration_s rocprofiler_pc_sampling_configuration_t;
|
||||
|
||||
//--------------------------------------------------------------------------------------//
|
||||
//
|
||||
// UNIONS
|
||||
//
|
||||
//--------------------------------------------------------------------------------------//
|
||||
|
||||
/**
|
||||
* @brief User-assignable data type
|
||||
*
|
||||
*/
|
||||
typedef union rocprofiler_user_data_t
|
||||
{
|
||||
uint64_t value;
|
||||
void* ptr;
|
||||
} rocprofiler_user_data_t;
|
||||
|
||||
//--------------------------------------------------------------------------------------//
|
||||
//
|
||||
// STRUCTS
|
||||
//
|
||||
//--------------------------------------------------------------------------------------//
|
||||
|
||||
/**
|
||||
* @brief Context ID.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t handle;
|
||||
} rocprofiler_context_id_t;
|
||||
|
||||
/**
|
||||
* @brief Queue ID.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t handle;
|
||||
} rocprofiler_queue_id_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Record Correlation ID.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t id;
|
||||
} rocprofiler_correlation_id_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler External Correlation ID.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t id;
|
||||
} rocprofiler_external_correlation_id_t;
|
||||
|
||||
/**
|
||||
* @brief Buffer ID.
|
||||
* @addtogroup BUFFER_HANDLING
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t handle;
|
||||
} rocprofiler_buffer_id_t;
|
||||
|
||||
/**
|
||||
* @brief Agent Identifier
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t handle;
|
||||
} rocprofiler_agent_id_t;
|
||||
|
||||
/**
|
||||
* @brief Counter ID.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t handle;
|
||||
} rocprofiler_counter_id_t;
|
||||
|
||||
/**
|
||||
* @brief Profile Configurations
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t handle;
|
||||
} rocprofiler_profile_config_id_t;
|
||||
|
||||
/**
|
||||
* @brief Array of PC Sampling Configurations
|
||||
*/
|
||||
typedef struct rocprofiler_pc_sampling_config_array_s
|
||||
{
|
||||
rocprofiler_pc_sampling_configuration_t* data;
|
||||
size_t size;
|
||||
} rocprofiler_pc_sampling_config_array_t;
|
||||
|
||||
/**
|
||||
* @brief Tracing record
|
||||
*
|
||||
*/
|
||||
typedef struct rocprofiler_callback_tracing_record_t
|
||||
{
|
||||
rocprofiler_thread_id_t thread_id;
|
||||
rocprofiler_correlation_id_t correlation_id;
|
||||
rocprofiler_external_correlation_id_t external_correlation_id;
|
||||
rocprofiler_service_callback_tracing_kind_t kind;
|
||||
uint32_t operation;
|
||||
rocprofiler_service_callback_phase_t phase;
|
||||
rocprofiler_user_data_t data;
|
||||
void* payload;
|
||||
} rocprofiler_callback_tracing_record_t;
|
||||
|
||||
/**
|
||||
* @brief Generic record with type identifier(s) and a pointer to data. This data type is used with
|
||||
* buffered data.
|
||||
*
|
||||
* @code{.cpp}
|
||||
* void
|
||||
* tool_tracing_callback(rocprofiler_record_header_t** headers,
|
||||
* size_t num_headers)
|
||||
* {
|
||||
* for(size_t i = 0; i < num_headers; ++i)
|
||||
* {
|
||||
* rocprofiler_record_header_t* header = headers[i];
|
||||
*
|
||||
* if(header->category == ROCPROFILER_BUFFER_CATEGORY_TRACING &&
|
||||
* header->kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API)
|
||||
* {
|
||||
* // cast to rocprofiler_buffer_tracing_hsa_api_record_t which
|
||||
* // is the type associated with this category + kind
|
||||
* auto* record =
|
||||
* static_cast<rocprofiler_buffer_tracing_hsa_api_record_t*>(header->payload);
|
||||
*
|
||||
* // trivial test
|
||||
* assert(record->start_timestamp <= record->end_timestamp);
|
||||
* }
|
||||
* }
|
||||
* }
|
||||
*
|
||||
* @endcode
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
union
|
||||
{
|
||||
struct
|
||||
{
|
||||
uint32_t category; ///< rocprofiler_buffer_category_t
|
||||
uint32_t kind; ///< domain
|
||||
};
|
||||
uint64_t hash; ///< generic identifier. You can compute this via: `uint64_t hash = category
|
||||
///< | ((uint64_t)(kind) << 32)`
|
||||
};
|
||||
void* payload;
|
||||
} rocprofiler_record_header_t;
|
||||
|
||||
/**
|
||||
* @brief Function for computing the unsigned 64-bit hash value in @ref rocprofiler_record_header_t
|
||||
* from a category and kind (two unsigned 32-bit values)
|
||||
*
|
||||
* @param category [in] a value from @ref rocprofiler_buffer_category_t
|
||||
* @param kind [in] depending on the category, this is the domain value, e.g., @ref
|
||||
* rocprofiler_service_buffer_tracing_kind_t value
|
||||
* @return uint64_t hash value of category and kind
|
||||
*/
|
||||
static inline uint64_t
|
||||
rocprofiler_record_header_compute_hash(uint32_t category, uint32_t kind)
|
||||
{
|
||||
uint64_t value = category;
|
||||
value |= ((uint64_t)(kind)) << 32;
|
||||
return value;
|
||||
}
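The relationship between the `category`/`kind` pair and the `hash` member can be checked with a short sketch. The types below are local mirrors of the declarations above under hypothetical `demo_` names. The comparison against the union overlay assumes a little-endian host, where `category` occupies the low 32 bits of `hash`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of rocprofiler_record_header_t (hypothetical demo_ name). */
typedef struct
{
    union
    {
        struct
        {
            uint32_t category;
            uint32_t kind;
        };
        uint64_t hash;
    };
    void* payload;
} demo_record_header_t;

/* Same arithmetic as rocprofiler_record_header_compute_hash. */
static inline uint64_t
demo_compute_hash(uint32_t category, uint32_t kind)
{
    uint64_t value = category;
    value |= ((uint64_t) kind) << 32;
    return value;
}

/* Returns 1 when the least significant byte of a uint64_t is stored first. */
static inline int
demo_is_little_endian(void)
{
    const uint64_t probe = 1;
    uint8_t        first = 0;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Returns 1 when the overlaid hash member equals the computed hash. */
static inline int
demo_header_matches(uint32_t category, uint32_t kind)
{
    demo_record_header_t header;
    memset(&header, 0, sizeof(header));
    header.category = category;
    header.kind     = kind;
    return header.hash == demo_compute_hash(category, kind);
}
```

Comparing a single 64-bit `hash` lets a callback test category and kind in one step instead of two separate field comparisons.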
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler Profile Counting Counter per instance.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
rocprofiler_counter_id_t counter_id;
|
||||
rocprofiler_counter_instance_id_t instance_id;
|
||||
double counter_value;
|
||||
} rocprofiler_record_counter_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler PC Sampling Record.
|
||||
*
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t pc;
|
||||
uint64_t dispatch_id;
|
||||
uint64_t timestamp;
|
||||
uint64_t hardware_id;
|
||||
union
|
||||
{
|
||||
uint8_t arb_value;
|
||||
};
|
||||
union
|
||||
{
|
||||
void* data;
|
||||
};
|
||||
} rocprofiler_pc_sampling_record_t;
|
||||
|
||||
/**
|
||||
* @brief ROCProfiler SPM Record.
|
||||
*
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
/**
|
||||
* Counters, including identifiers to get counter information and Counters
|
||||
* values
|
||||
*/
|
||||
rocprofiler_record_counter_t* counters;
|
||||
uint64_t counters_count;
|
||||
} rocprofiler_spm_record_t;
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
@@ -27,7 +27,6 @@
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
typedef uint32_t rocprofiler_trace_record_hip_operation_kind_t;
|
||||
typedef struct rocprofiler_hip_trace_data_s rocprofiler_hip_trace_data_t;
|
||||
typedef struct rocprofiler_hip_api_data_s rocprofiler_hip_api_data_t;
|
||||
|
||||
|
||||
@@ -30,33 +30,14 @@
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
typedef uint32_t rocprofiler_trace_record_hsa_operation_kind_t;
|
||||
typedef struct hsa_kernel_dispatch_packet_s hsa_kernel_dispatch_packet_t;
|
||||
typedef struct rocprofiler_hsa_trace_data_s rocprofiler_hsa_trace_data_t;
|
||||
typedef struct rocprofiler_hsa_api_data_s rocprofiler_hsa_api_data_t;
|
||||
|
||||
struct rocprofiler_hsa_api_data_s
|
||||
{
|
||||
uint64_t correlation_id;
|
||||
uint32_t phase;
|
||||
union
|
||||
{
|
||||
uint64_t uint64_t_retval;
|
||||
uint32_t uint32_t_retval;
|
||||
hsa_signal_value_t hsa_signal_value_t_retval;
|
||||
hsa_status_t hsa_status_t_retval;
|
||||
};
|
||||
rocprofiler_hsa_api_args_t args;
|
||||
uint64_t* phase_data;
|
||||
};
|
||||
|
||||
struct rocprofiler_hsa_trace_data_s
|
||||
{
|
||||
rocprofiler_hsa_api_data_t api_data;
|
||||
uint64_t phase_enter_timestamp;
|
||||
uint64_t phase_exit_timestamp;
|
||||
uint64_t phase_data;
|
||||
|
||||
void (*phase_enter)(rocprofiler_hsa_api_id_t operation_id, rocprofiler_hsa_trace_data_t* data);
|
||||
void (*phase_exit)(rocprofiler_hsa_api_id_t operation_id, rocprofiler_hsa_trace_data_t* data);
|
||||
uint64_t correlation_id;
|
||||
uint32_t phase;
|
||||
rocprofiler_hsa_api_args_t args;
|
||||
rocprofiler_hsa_api_retval_t retval;
|
||||
uint64_t* phase_data;
|
||||
};
|
||||
|
||||
@@ -26,6 +26,14 @@
|
||||
#include <hsa/hsa_ext_image.h>
|
||||
#include <rocprofiler/version.h>
|
||||
|
||||
typedef union rocprofiler_hsa_api_retval_u
|
||||
{
|
||||
uint64_t uint64_t_retval;
|
||||
uint32_t uint32_t_retval;
|
||||
hsa_signal_value_t hsa_signal_value_t_retval;
|
||||
hsa_status_t hsa_status_t_retval;
|
||||
} rocprofiler_hsa_api_retval_t;
|
||||
|
||||
typedef union rocprofiler_hsa_api_args_u
|
||||
{
|
||||
// block: CoreApi API
|
||||
|
||||
@@ -0,0 +1,123 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup INTERNAL_THREADING Internal thread handling
|
||||
*
|
||||
* Callbacks before and after threads created internally by libraries
|
||||
*
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief Enumeration for specifying which libraries you want callbacks before and after the library
|
||||
* creates an internal thread. These callbacks will be invoked on the thread that is about to create
|
||||
* the new thread (not on the newly created thread). In thread-aware tools that wrap pthread_create,
|
||||
* this can be used to disable the wrapper before the pthread_create invocation and re-enable the
|
||||
* wrapper afterwards. In many cases, tools will want to ignore the thread(s) created by rocprofiler
|
||||
* since these threads do not exist in the normal application execution, whereas the internal
|
||||
* threads for HSA, HIP, etc. are created in normal application execution; however, the HIP, HSA,
|
||||
* etc. internal threads are typically background threads which just monitor kernel completion and
|
||||
* are unlikely to contribute to any performance issues.
|
||||
*/
|
||||
typedef enum
|
||||
{
|
||||
ROCPROFILER_LIBRARY = (1 << 0),
|
||||
ROCPROFILER_HSA_LIBRARY = (1 << 1),
|
||||
ROCPROFILER_HIP_LIBRARY = (1 << 2),
|
||||
ROCPROFILER_MARKER_LIBRARY = (1 << 3),
|
||||
ROCPROFILER_LIBRARY_LAST = ROCPROFILER_MARKER_LIBRARY,
|
||||
} rocprofiler_internal_thread_library_t;
|
||||
|
||||
/**
|
||||
* @brief Callback type before and after internal thread creation. @see
|
||||
* rocprofiler_at_internal_thread_create
|
||||
*
|
||||
*/
|
||||
typedef void (*rocprofiler_internal_thread_library_cb_t)(rocprofiler_internal_thread_library_t,
|
||||
void*);
|
||||
|
||||
/**
|
||||
* @brief Invoke this function to receive callbacks before and after a library creates an internal
* thread. The callbacks are invoked on the thread which is creating the internal thread(s).
* Please note that the postcreate callback is guaranteed to be invoked after the underlying
* system call to create a new thread but it does not guarantee that the new thread has been
* started. Once these callbacks are registered, they cannot be removed, so the caller is
* responsible for ignoring them if they are no longer wanted beyond a certain point in the
* application.
|
||||
*
|
||||
* @param precreate [in] Callback invoked immediately before a new internal thread is created
|
||||
* @param postcreate [in] Callback invoked immediately after a new internal thread is created
|
||||
* @param libs [in] Bitwise-or of libraries, e.g. `ROCPROFILER_LIBRARY | ROCPROFILER_MARKER_LIBRARY`
|
||||
* means the callbacks will be invoked whenever rocprofiler and/or the marker library create
|
||||
* internal threads but not when the HSA or HIP libraries create internal threads.
|
||||
* @param data [in] Data shared between callbacks
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_at_internal_thread_create(rocprofiler_internal_thread_library_cb_t precreate,
|
||||
rocprofiler_internal_thread_library_cb_t postcreate,
|
||||
int libs,
|
||||
void* data);
|
||||
|
||||
/**
|
||||
* @brief opaque handle to an internal thread identifier which delivers callbacks for buffers
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
uint64_t handle;
|
||||
} rocprofiler_callback_thread_t;
|
||||
|
||||
/**
|
||||
* @brief Create a handle to a unique thread (created by rocprofiler) which, when associated with a
|
||||
* particular buffer, will guarantee those buffered results always get delivered on the same thread.
|
||||
* This is useful to prevent/control thread-safety issues and/or enable multithreaded processing of
|
||||
* buffers with non-overlapping data
|
||||
*
|
||||
* @param [out] cb_thread_id User-provided pointer to a @ref rocprofiler_callback_thread_t
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_create_callback_thread(rocprofiler_callback_thread_t* cb_thread_id)
|
||||
ROCPROFILER_NONNULL(1);
|
||||
|
||||
/**
|
||||
* @brief By default, all buffered results are delivered on the same thread. Using @ref
|
||||
* rocprofiler_create_callback_thread, one or more buffers can be assigned to deliver their results
|
||||
* on a unique, dedicated thread.
|
||||
*
|
||||
* @param [in] buffer_id Buffer identifier
|
||||
* @param [in] cb_thread_id Callback thread identifier via @ref rocprofiler_create_callback_thread
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_assign_callback_thread(rocprofiler_buffer_id_t buffer_id,
|
||||
rocprofiler_callback_thread_t cb_thread_id);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
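The `libs` argument documented above is a bitwise-or of `rocprofiler_internal_thread_library_t` values. A minimal sketch of how such a mask could be tested inside a callback follows. The enum is re-declared locally so the snippet compiles without the rocprofiler headers, and `wants_callback` is a hypothetical helper, not part of the rocprofiler API.

```cpp
#include <cassert>

// Local copy of the enum from the header above, so that this sketch
// compiles without rocprofiler installed.
typedef enum
{
    ROCPROFILER_LIBRARY        = (1 << 0),
    ROCPROFILER_HSA_LIBRARY    = (1 << 1),
    ROCPROFILER_HIP_LIBRARY    = (1 << 2),
    ROCPROFILER_MARKER_LIBRARY = (1 << 3),
    ROCPROFILER_LIBRARY_LAST   = ROCPROFILER_MARKER_LIBRARY,
} rocprofiler_internal_thread_library_t;

// Hypothetical helper (not a rocprofiler function): returns true when the
// bitmask `libs` selects the library `lib`.
inline bool
wants_callback(int libs, rocprofiler_internal_thread_library_t lib)
{
    assert(lib > 0 && lib <= ROCPROFILER_LIBRARY_LAST);
    return (libs & static_cast<int>(lib)) != 0;
}
```

With `libs = ROCPROFILER_LIBRARY | ROCPROFILER_MARKER_LIBRARY`, the helper reports true for rocprofiler and the marker library and false for HSA and HIP, which matches the behavior described for the `libs` parameter.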
|
||||
@@ -24,9 +24,6 @@
|
||||
|
||||
#include <rocprofiler/marker/api_args.h>
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
typedef uint32_t rocprofiler_trace_record_marker_operation_kind_t;
|
||||
typedef struct rocprofiler_roctx_api_data_s rocprofiler_roctx_api_data_t;
|
||||
|
||||
struct rocprofiler_roctx_api_data_s
|
||||
|
||||
@@ -0,0 +1,79 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/agent.h>
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup PC_SAMPLING_SERVICE PC Sampling Service
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief Create PC Sampling Service.
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @param [in] agent
|
||||
* @param [in] method
|
||||
* @param [in] unit
|
||||
* @param [in] interval
|
||||
* @param [in] buffer_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t context_id,
|
||||
rocprofiler_agent_t agent,
|
||||
rocprofiler_pc_sampling_method_t method,
|
||||
rocprofiler_pc_sampling_unit_t unit,
|
||||
uint64_t interval,
|
||||
rocprofiler_buffer_id_t buffer_id);
|
||||
|
||||
struct rocprofiler_pc_sampling_configuration_s
|
||||
{
|
||||
rocprofiler_pc_sampling_method_t method;
|
||||
rocprofiler_pc_sampling_unit_t unit;
|
||||
size_t min_interval;
|
||||
size_t max_interval;
|
||||
uint64_t flags;
|
||||
};
|
||||
|
||||
/**
|
||||
* @brief Query PC Sampling Configuration.
|
||||
*
|
||||
* @param [in] agent
|
||||
* @param [out] config
|
||||
* @param [out] config_count
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_query_pc_sampling_agent_configurations(rocprofiler_agent_t agent,
|
||||
rocprofiler_pc_sampling_configuration_t* config,
|
||||
size_t* config_count) ROCPROFILER_NONNULL(2, 3);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
@@ -0,0 +1,63 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/agent.h>
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup PROFILE_CONFIG Profile Configurations
|
||||
*
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief Create Profile Configuration.
|
||||
*
|
||||
* @param [in] agent Agent identifier
|
||||
* @param [in] counters_list List of GPU counters
|
||||
* @param [in] counters_count Size of counters list
|
||||
* @param [out] config_id Identifier for GPU counters group
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_create_profile_config(rocprofiler_agent_t agent,
|
||||
rocprofiler_counter_id_t* counters_list,
|
||||
size_t counters_count,
|
||||
rocprofiler_profile_config_id_t* config_id)
|
||||
ROCPROFILER_NONNULL(4);
|
||||
|
||||
/**
|
||||
* @brief Destroy Profile Configuration.
|
||||
*
|
||||
* @param [in] config_id
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_destroy_profile_config(rocprofiler_profile_config_id_t config_id);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
@@ -0,0 +1,220 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/**
|
||||
* @defgroup REGISTRATION_GROUP Tool registration
|
||||
*
|
||||
* Data types and functions for tool registration with rocprofiler
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief A pointer to this data structure is provided to the client tool initialization function.
|
||||
* The name member can be set by the client to assist with debugging (e.g. rocprofiler cannot start
|
||||
* your context because there is a conflicting context started by `<name>` -- at least that is the
|
||||
* plan). The handle member is a unique identifier assigned by rocprofiler for the client and the
* client can store it and pass it to the @ref rocprofiler_client_finalize_t function to force
* finalization (i.e. deactivate all of its contexts) for the client.
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
const char* name; ///< clients should set this value for debugging
|
||||
const uint32_t handle; ///< internal handle
|
||||
} rocprofiler_client_id_t;
|
||||
|
||||
typedef void (*rocprofiler_client_finalize_t)(rocprofiler_client_id_t);
|
||||
|
||||
typedef int (*rocprofiler_tool_initialize_t)(rocprofiler_client_finalize_t finalize_func,
|
||||
void* tool_data);
|
||||
|
||||
typedef void (*rocprofiler_tool_finalize_t)(void* tool_data);
|
||||
|
||||
/**
|
||||
* @brief Data structure containing an initialization function, a finalization function, and tool data
|
||||
*
|
||||
*/
|
||||
typedef struct
|
||||
{
|
||||
size_t size; ///< in case of future extensions
|
||||
rocprofiler_tool_initialize_t initialize; ///< context creation
|
||||
rocprofiler_tool_finalize_t finalize; ///< cleanup
|
||||
void* tool_data; ///< data to provide to init and fini callbacks
|
||||
} rocprofiler_tool_configure_result_t;
|
||||
|
||||
/**
|
||||
* @brief Query whether rocprofiler has already scanned the binary for all the instances of @ref
|
||||
* rocprofiler_configure (or is currently scanning). If rocprofiler has completed its scan, clients
|
||||
* can directly register themselves with rocprofiler.
|
||||
*
|
||||
* @param [out] status 0 indicates rocprofiler has not been initialized (i.e. configured), 1
|
||||
* indicates rocprofiler has been initialized, -1 indicates rocprofiler is currently initializing.
|
||||
* @return rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t
|
||||
rocprofiler_is_initialized(int* status) ROCPROFILER_API;
|
||||
|
||||
/**
|
||||
* @brief Query rocprofiler finalization status.
|
||||
*
|
||||
* @param [out] status 0 indicates rocprofiler has not been finalized, 1 indicates rocprofiler has
|
||||
* been finalized, -1 indicates rocprofiler is currently finalizing.
|
||||
* @return rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t
|
||||
rocprofiler_is_finalized(int* status) ROCPROFILER_API;
|
||||
|
||||
/**
|
||||
* @brief This is the special function that tools define to enable rocprofiler support. The tool
|
||||
* should return a pointer to
|
||||
* @ref rocprofiler_tool_configure_result_t which will contain a function pointer to (1) an
|
||||
* initialization function where all the contexts are created, (2) a finalization function (if
|
||||
* necessary) which will be invoked when rocprofiler shuts down, and (3) a pointer to any data that
|
||||
* the tool wants communicated between the @ref rocprofiler_tool_configure_result_t::initialize and
|
||||
* @ref rocprofiler_tool_configure_result_t::finalize functions. If the user
|
||||
*
|
||||
* @param [in] version The version of rocprofiler: `(10000 * major) + (100 * minor) + patch`
|
||||
* @param [in] runtime_version String descriptor of the rocprofiler version and other relevant info.
|
||||
* @param [in] priority How many client tools were initialized before this client tool
|
||||
* @param [in, out] client_id tool identifier value.
|
||||
* @return rocprofiler_tool_configure_result_t*
|
||||
*
|
||||
* @code{.cpp}
|
||||
* #include <rocprofiler/registration.h>
*
* #include <cassert>
* #include <cstdint>
* #include <cstdio>
*
* // rocprofiler_client_id_t has a const member, so store the handle instead of copying the struct
* static uint32_t                      my_client_handle = 0;
* static rocprofiler_client_finalize_t my_fini_func     = nullptr;
* static int                           my_tool_data     = 1234;
*
* static int my_init_func(rocprofiler_client_finalize_t fini_func,
*                         void*                         tool_data)
* {
*     my_fini_func = fini_func;
*
*     assert(*static_cast<int*>(tool_data) == 1234 && "tool_data is wrong");
*
*     rocprofiler_context_id_t ctx;
*     rocprofiler_create_context(&ctx);
*
*     if(int valid_ctx = 0;
*        rocprofiler_context_is_valid(ctx, &valid_ctx) != ROCPROFILER_STATUS_SUCCESS ||
*        valid_ctx != 0)
*     {
*         // notify rocprofiler that initialization failed
*         // and all the contexts, buffers, etc. created
*         // should be ignored
*         return -1;
*     }
*
*     if(rocprofiler_start_context(ctx) != ROCPROFILER_STATUS_SUCCESS)
*     {
*         // notify rocprofiler that initialization failed
*         // and all the contexts, buffers, etc. created
*         // should be ignored
*         return -1;
*     }
*
*     // no errors
*     return 0;
* }
*
* // matches rocprofiler_tool_finalize_t: returns void and must not reuse the
* // name of the my_fini_func variable above
* static void my_tool_fini(void* tool_data)
* {
*     assert(*static_cast<int*>(tool_data) == 1234 && "tool_data is wrong");
* }
*
* rocprofiler_tool_configure_result_t*
* rocprofiler_configure(uint32_t                 version,
*                       const char*              runtime_version,
*                       uint32_t                 priority,
*                       rocprofiler_client_id_t* client_id)
* {
*     // only activate if main tool
*     if(priority > 0) return nullptr;
*
*     // set the client name
*     client_id->name = "ExampleTool";
*
*     // store the handle assigned by rocprofiler
*     my_client_handle = client_id->handle;
*
*     // compute major/minor/patch version info
*     uint32_t major = version / 10000;
*     uint32_t minor = (version % 10000) / 100;
*     uint32_t patch = version % 100;
*
*     // print info
*     printf("Configuring rocprofiler (v%u.%u.%u) [%s]\n", major, minor, patch, runtime_version);
*
*     // create configure data (size is the first member of the struct)
*     static auto cfg =
*         rocprofiler_tool_configure_result_t{sizeof(rocprofiler_tool_configure_result_t),
*                                             &my_init_func,
*                                             &my_tool_fini,
*                                             &my_tool_data};
*
*     // return pointer to configure data
*     return &cfg;
* }
|
||||
* @endcode
|
||||
*/
|
||||
rocprofiler_tool_configure_result_t*
|
||||
rocprofiler_configure(uint32_t version,
|
||||
const char* runtime_version,
|
||||
uint32_t priority,
|
||||
rocprofiler_client_id_t* client_id) ROCPROFILER_PUBLIC_API;
|
||||
|
||||
// NOTE: we use ROCPROFILER_PUBLIC_API above instead of ROCPROFILER_API because we always
|
||||
// want the symbol to be visible when the user includes the header for the prototype
|
||||
|
||||
/**
|
||||
* @brief Function pointer typedef for @ref rocprofiler_configure function
|
||||
* @param [in] version The version of rocprofiler: `(10000 * major) + (100 * minor) + patch`
|
||||
* @param [in] runtime_version String descriptor of the rocprofiler version and other relevant info.
|
||||
* @param [in] priority How many client tools were initialized before this client tool
|
||||
* @param [in, out] client_id tool identifier value.
|
||||
*/
|
||||
typedef rocprofiler_tool_configure_result_t* (*rocprofiler_configure_func_t)(
|
||||
uint32_t version,
|
||||
const char* runtime_version,
|
||||
uint32_t priority,
|
||||
rocprofiler_client_id_t* client_id);
|
||||
|
||||
/**
|
||||
* @brief Function for explicitly registering a configuration with rocprofiler. This can be invoked
|
||||
* before any ROCm runtimes (lazily) initialize and context(s) can be started before the runtimes
|
||||
* initialize.
|
||||
* @param [in] configure_func Address of @ref rocprofiler_configure function. A null pointer is
|
||||
* acceptable if the address is not known
|
||||
* @returns rocprofiler_status_t If rocprofiler has already been configured, or is currently being
|
||||
* configured, this function will return @ref ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED.
|
||||
*/
|
||||
rocprofiler_status_t
|
||||
rocprofiler_force_configure(rocprofiler_configure_func_t configure_func) ROCPROFILER_API;
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
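The `version` parameter above is documented as `(10000 * major) + (100 * minor) + patch`. As a small worked example of that arithmetic, the sketch below packs and unpacks such a value. The names `version_triplet`, `encode_version`, and `decode_version` are hypothetical and are not part of the rocprofiler headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type; rocprofiler itself only passes the packed integer.
struct version_triplet
{
    std::uint32_t major_version;
    std::uint32_t minor_version;
    std::uint32_t patch_version;
};

// Pack the three components using the formula from the doxygen comment.
constexpr std::uint32_t
encode_version(std::uint32_t major_version,
               std::uint32_t minor_version,
               std::uint32_t patch_version)
{
    return (10000 * major_version) + (100 * minor_version) + patch_version;
}

// Reverse the packing with integer division and remainder.
constexpr version_triplet
decode_version(std::uint32_t version)
{
    return version_triplet{version / 10000, (version % 10000) / 100, version % 100};
}

static_assert(encode_version(1, 2, 3) == 10203, "packing follows the documented formula");
```

The decoding only round-trips when the minor and patch components are each below 100, which is implied by the packing formula.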
|
||||
File diff suppressed because it is too large to display
@@ -20,7 +20,7 @@
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
/** \section rocprofiler_plugin_api ROCProfiler Plugin API
|
||||
/** @section rocprofiler_plugin_api ROCProfiler Plugin API
|
||||
*
|
||||
* The ROCProfiler Plugin API is used by the ROCProfiler Tool to output all
|
||||
* profiling information. Different implementations of the ROCProfiler Plugin
|
||||
@@ -37,7 +37,7 @@
|
||||
*/
|
||||
|
||||
/**
|
||||
* \file
|
||||
* @file
|
||||
* ROCProfiler Tool Plugin API interface.
|
||||
*/
|
||||
|
||||
@@ -47,44 +47,42 @@
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#ifdef __cplusplus
|
||||
extern "C" {
|
||||
#endif /* __cplusplus */
|
||||
ROCPROFILER_EXTERN_C_INIT /* __cplusplus */
|
||||
|
||||
/** \defgroup rocprofiler_plugins ROCProfiler Plugin API Specification
|
||||
* @{
|
||||
*/
|
||||
/** @defgroup rocprofiler_plugins ROCProfiler Plugin API Specification
|
||||
* @{
|
||||
*/
|
||||
|
||||
/** \defgroup initialization_group Initialization and Finalization
|
||||
* \ingroup rocprofiler_plugins
|
||||
*
|
||||
* The ROCProfiler Plugin API must be initialized before using any of the
|
||||
* operations to report trace data, and finalized after the last trace data has
|
||||
* been reported.
|
||||
*
|
||||
* @{
|
||||
*/
|
||||
/** @defgroup initialization_group Initialization and Finalization
|
||||
* @ingroup rocprofiler_plugins
|
||||
*
|
||||
* The ROCProfiler Plugin API must be initialized before using any of the
|
||||
* operations to report trace data, and finalized after the last trace data has
|
||||
* been reported.
|
||||
*
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* Initialize plugin.
|
||||
* Must be called before any other operation.
|
||||
*
|
||||
* @param[in] rocprofiler_major_version The major version of the ROCProfiler API
|
||||
* being used by the ROCProfiler Tool. An error is reported if this does not
|
||||
* match the major version of the ROCProfiler API used to build the plugin
|
||||
* library. This ensures compatibility of the trace data format.
|
||||
* @param[in] rocprofiler_minor_version The minor version of the ROCProfiler API
|
||||
* being used by the ROCProfiler Tool. An error is reported if the
|
||||
* \p rocprofiler_major_version matches and this is greater than the minor
|
||||
* version of the ROCProfiler API used to build the plugin library. This ensures
|
||||
* compatibility of the trace data format.
|
||||
* @param[in] data Pointer to the data passed to the ROCProfiler Plugin by the tool
|
||||
* @return Returns 0 on success and -1 on error.
|
||||
*/
|
||||
ROCPROFILER_EXPORT int
|
||||
rocprofiler_plugin_initialize(uint32_t rocprofiler_major_version,
|
||||
uint32_t rocprofiler_minor_version,
|
||||
void* data);
|
||||
/**
|
||||
* Initialize plugin.
|
||||
* Must be called before any other operation.
|
||||
*
|
||||
* @param[in] rocprofiler_major_version The major version of the ROCProfiler API
|
||||
* being used by the ROCProfiler Tool. An error is reported if this does not
|
||||
* match the major version of the ROCProfiler API used to build the plugin
|
||||
* library. This ensures compatibility of the trace data format.
|
||||
* @param[in] rocprofiler_minor_version The minor version of the ROCProfiler API
|
||||
* being used by the ROCProfiler Tool. An error is reported if the
|
||||
* @p rocprofiler_major_version matches and this is greater than the minor
|
||||
* version of the ROCProfiler API used to build the plugin library. This ensures
|
||||
* compatibility of the trace data format.
|
||||
* @param[in] data Pointer to the data passed to the ROCProfiler Plugin by the tool
|
||||
* @return Returns 0 on success and -1 on error.
|
||||
*/
|
||||
ROCPROFILER_EXPORT int
|
||||
rocprofiler_plugin_initialize(uint32_t rocprofiler_major_version,
|
||||
uint32_t rocprofiler_minor_version,
|
||||
void* data);
|
||||
|
||||
/**
|
||||
* Finalize plugin.
|
||||
@@ -97,8 +95,8 @@ rocprofiler_plugin_finalize();
|
||||
|
||||
/** @} */
|
||||
|
||||
/** \defgroup profiling_record_write_functions Profiling data reporting
|
||||
* \ingroup rocprofiler_plugins
|
||||
/** @defgroup profiling_record_write_functions Profiling data reporting
|
||||
* @ingroup rocprofiler_plugins
|
||||
* Operations to output profiling data.
|
||||
* @{
|
||||
*/
|
||||
@@ -128,12 +126,10 @@ rocprofiler_plugin_write_buffer_records(rocprofiler_context_id_t context_id
|
||||
*/
|
||||
|
||||
ROCPROFILER_EXPORT int
|
||||
rocprofiler_plugin_write_record(rocprofiler_record_tracer_t record);
|
||||
rocprofiler_plugin_write_record(rocprofiler_record_header_t record);
|
||||
|
||||
/** @} */
|
||||
|
||||
/** @} */
|
||||
|
||||
#ifdef __cplusplus
|
||||
} /* extern "C" */
|
||||
#endif /* __cplusplus */
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
|
||||
@@ -0,0 +1,51 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/defines.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
ROCPROFILER_EXTERN_C_INIT
|
||||
|
||||
/** @defgroup SPM_SERVICE SPM Service
|
||||
* @{
|
||||
*/
|
||||
|
||||
/**
|
||||
* @brief Configure SPM Service.
|
||||
*
|
||||
* @param [in] context_id
|
||||
* @param [in] buffer_id
|
||||
* @param [in] profile_config
|
||||
* @param [in] interval
|
||||
* @return ::rocprofiler_status_t
|
||||
*/
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_configure_spm_service(rocprofiler_context_id_t context_id,
|
||||
rocprofiler_buffer_id_t buffer_id,
|
||||
rocprofiler_profile_config_id_t profile_config,
|
||||
uint64_t interval);
|
||||
|
||||
/** @} */
|
||||
|
||||
ROCPROFILER_EXTERN_C_FINI
|
||||
@@ -26,6 +26,7 @@ target_link_libraries(
|
||||
$<BUILD_INTERFACE:rocprofiler::rocprofiler-dl>
|
||||
$<BUILD_INTERFACE:rocprofiler::rocprofiler-hip>
|
||||
$<BUILD_INTERFACE:rocprofiler::rocprofiler-amd-comgr>
|
||||
$<BUILD_INTERFACE:rocprofiler::rocprofiler-hsa-runtime>)
|
||||
$<BUILD_INTERFACE:rocprofiler::rocprofiler-hsa-runtime>
|
||||
$<BUILD_INTERFACE:rocprofiler::rocprofiler-ptl>)
|
||||
set_target_properties(rocprofiler-common-library PROPERTIES OUTPUT_NAME
|
||||
rocprofiler-common)
|
||||
|
||||
@@ -59,7 +59,7 @@ record_header_buffer::operator=(record_header_buffer&& _rhs) noexcept
|
||||
if(this != &_rhs)
|
||||
{
|
||||
auto _lk = rhb_raii_lock{_rhs};
|
||||
m_index = _rhs.m_index.load(std::memory_order_relaxed);
|
||||
m_index = _rhs.m_index.load(std::memory_order_acquire);
|
||||
m_buffer = std::move(_rhs.m_buffer);
|
||||
m_headers = std::move(_rhs.m_headers);
|
||||
_rhs.reset();
|
||||
@@ -74,7 +74,8 @@ record_header_buffer::allocate(size_t num_bytes)
|
||||
|
||||
auto _lk = rhb_raii_lock{*this};
|
||||
m_buffer.init(num_bytes);
|
||||
m_headers.resize(m_buffer.capacity(), rocprofiler_record_header_t{0, nullptr});
|
||||
m_headers.resize(m_buffer.capacity(),
|
||||
rocprofiler_record_header_t{.hash = 0, .payload = nullptr});
|
||||
return true;
|
||||
}
|
||||
|
||||
@@ -83,13 +84,13 @@ record_header_buffer::get_record_headers(size_t _n)
|
||||
{
|
||||
auto _lk = rhb_raii_lock{*this};
|
||||
|
||||
auto _sz = m_index.load(std::memory_order_relaxed);
|
||||
auto _sz = m_index.load(std::memory_order_acquire);
|
||||
if(_n > _sz) _n = _sz;
|
||||
auto _ret = record_ptr_vec_t{};
|
||||
_ret.reserve(_n);
|
||||
for(size_t i = 0; i < _n; ++i)
|
||||
{
|
||||
if(auto& itr = m_headers.at(i); itr.kind > 0 && itr.payload != nullptr)
|
||||
if(auto& itr = m_headers.at(i); itr.hash > 0 && itr.payload != nullptr)
|
||||
_ret.emplace_back(&itr);
|
||||
}
|
||||
return _ret;
|
||||
@@ -105,9 +106,9 @@ record_header_buffer::clear()
|
||||
auto _sz = m_buffer.capacity();
|
||||
if(!m_buffer.clear(std::nothrow_t{})) return 0;
|
||||
std::for_each(m_headers.begin(), m_headers.end(), [](auto& itr) {
|
||||
itr = rocprofiler_record_header_t{0, nullptr};
|
||||
itr = rocprofiler_record_header_t{.hash = 0, .payload = nullptr};
|
||||
});
|
||||
m_headers.resize(_sz, rocprofiler_record_header_t{0, nullptr});
|
||||
m_headers.resize(_sz, rocprofiler_record_header_t{.hash = 0, .payload = nullptr});
|
||||
m_index.store(0, std::memory_order_release);
|
||||
}
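The hunks above switch the loads of `m_index` from relaxed to acquire ordering. As a general illustration of how a release store pairs with an acquire load (this is not a reproduction of `record_header_buffer`), here is a minimal sketch. The `mailbox` type and its `publish` and `try_consume` members are hypothetical names introduced only for this example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical single-slot mailbox used only to illustrate the ordering.
struct mailbox
{
    int               payload = 0;
    std::atomic<bool> ready   = {false};

    // Writer: fill in the payload first, then publish with a release store.
    // Everything written before the store is visible to a thread that
    // observes ready == true through an acquire load.
    void publish(int value)
    {
        payload = value;
        ready.store(true, std::memory_order_release);
    }

    // Reader: the acquire load pairs with the release store above, so the
    // payload read below cannot be reordered before the flag check.
    bool try_consume(int& out)
    {
        if(!ready.load(std::memory_order_acquire)) return false;
        out = payload;
        ready.store(false, std::memory_order_relaxed);
        return true;
    }
};
```

With a relaxed load in `try_consume`, a reader on another thread could in principle see the flag set without seeing the payload write; the acquire load rules that out.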
|
||||
|
||||
|
||||
@@ -29,6 +29,7 @@
|
||||
#include <atomic>
|
||||
#include <limits>
|
||||
#include <mutex>
|
||||
#include <shared_mutex>
|
||||
#include <vector>
|
||||
|
||||
namespace rocprofiler
|
||||
@@ -70,17 +71,29 @@ struct record_header_buffer
|
||||
template <typename Tp>
|
||||
bool emplace(uint64_t, Tp&);
|
||||
|
||||
/// place an object in the buffer using the specified numerical identifier
|
||||
template <typename Tp>
|
||||
bool emplace(uint32_t, uint32_t, Tp&);
|
||||
|
||||
/// this function will return a vector of pointers to the record headers
|
||||
/// at the time of invocation.
|
||||
record_ptr_vec_t get_record_headers(size_t _n = std::numeric_limits<size_t>::max());
|
||||
|
||||
/// prevent emplace
|
||||
/// record_header_buffer is a multiple writer, single reader data structure so
|
||||
/// this function prevents writing via emplace
|
||||
void lock();
|
||||
|
||||
/// try to re-enable emplace
|
||||
/// potentially re-enable emplace if no other readers have locked
|
||||
void unlock();
|
||||
|
||||
/// check if emplace is available
|
||||
/// record_header_buffer is a multiple writer, single reader data structure so
|
||||
/// this function prevents reading while emplacing
|
||||
void read_lock();
|
||||
|
||||
/// potentially allow reading after writing via emplace
|
||||
void read_unlock();
|
||||
|
||||
/// check if writing is available
|
||||
bool is_locked() const;
|
||||
|
||||
/// restores to original empty state
|
||||
@@ -116,6 +129,7 @@ struct record_header_buffer
|
||||
private:
|
||||
std::atomic<int32_t> m_locked = {0};
|
||||
std::atomic<size_t> m_index = {};
|
||||
std::shared_mutex m_shared = {};
|
||||
base_buffer_t m_buffer = {};
|
||||
record_vec_t m_headers = {};
|
||||
};
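The comments in this hunk describe `record_header_buffer` as a multiple writer, single reader structure guarded by a `std::shared_mutex`. The sketch below shows one way that idea can be arranged, under the assumption that writers take the mutex in shared mode and the reader takes it exclusively. `slot_buffer`, `emplace`, and `drain` here are hypothetical, simplified stand-ins and not the actual implementation.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical, simplified stand-in for a multiple-writer, single-reader buffer.
class slot_buffer
{
public:
    explicit slot_buffer(std::size_t n)
    : m_slots(n, 0)
    {}

    // Writers hold the mutex in shared mode: several may run at once because
    // each one claims a distinct slot through the atomic index.
    bool emplace(int value)
    {
        auto lk  = std::shared_lock<std::shared_mutex>{m_mutex};
        auto idx = m_index.fetch_add(1, std::memory_order_acq_rel);
        if(idx >= m_slots.size()) return false;  // buffer is full
        m_slots[idx] = value;
        return true;
    }

    // The single reader holds the mutex exclusively, so no writer can be in
    // the middle of emplace() while the contents are copied out and reset.
    std::vector<int> drain()
    {
        auto lk    = std::unique_lock<std::shared_mutex>{m_mutex};
        auto count = std::min(m_index.load(std::memory_order_acquire), m_slots.size());
        auto last  = m_slots.begin() + static_cast<std::ptrdiff_t>(count);
        auto out   = std::vector<int>(m_slots.begin(), last);
        m_index.store(0, std::memory_order_release);
        return out;
    }

private:
    std::shared_mutex        m_mutex = {};
    std::atomic<std::size_t> m_index = {0};
    std::vector<int>         m_slots;
};
```

Using the shared side of the mutex for writers may look inverted at first, but it fits this pattern: writers never touch the same slot, so they only need to be excluded while the reader drains the buffer.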
|
||||
@@ -129,13 +143,27 @@ record_header_buffer::is_locked() const
|
||||
inline void
|
||||
record_header_buffer::lock()
|
||||
{
|
||||
m_locked.fetch_add(1, std::memory_order_release);
|
||||
auto n = m_locked.fetch_add(1, std::memory_order_release);
|
||||
if(n == 0) m_shared.lock();
|
||||
}
|
||||
|
||||
inline void
|
||||
record_header_buffer::unlock()
|
||||
{
|
||||
m_locked.fetch_add(-1, std::memory_order_release);
|
||||
auto n = m_locked.fetch_add(-1, std::memory_order_release);
|
||||
if(n <= 1) m_shared.unlock();
|
||||
}
|
||||
|
||||
inline void
|
||||
record_header_buffer::read_lock()
|
||||
{
|
||||
m_shared.lock_shared();
|
||||
}
|
||||
|
||||
inline void
|
||||
record_header_buffer::read_unlock()
|
||||
{
|
||||
m_shared.unlock_shared();
|
||||
}
|
||||
|
||||
inline bool
|
||||
@@ -182,7 +210,7 @@ record_header_buffer::is_full() const
|
||||
|
||||
template <typename Tp>
|
||||
bool
|
||||
record_header_buffer::emplace(uint64_t _kind, Tp& _v)
|
||||
record_header_buffer::emplace(uint64_t _hash, Tp& _v)
|
||||
{
|
||||
if(is_locked() || m_headers.empty()) return false;
|
||||
|
||||
@@ -195,6 +223,7 @@ record_header_buffer::emplace(uint64_t _kind, Tp& _v)
|
||||
return _ptr;
|
||||
};
|
||||
|
||||
read_lock();
|
||||
auto _addr = _create_record(m_buffer, _v);
|
||||
if(_addr)
|
||||
{
|
||||
@@ -202,9 +231,41 @@ record_header_buffer::emplace(uint64_t _kind, Tp& _v)
|
||||
// for where the header record should be placed.
|
||||
// NOTE: m_headers was resized to be large enough to accommodate
|
||||
// sizeof(Tp) == 1 for every entry in buffer
|
||||
auto _idx = m_index++;
|
||||
m_headers.at(_idx) = rocprofiler_record_header_t{_kind, _addr};
|
||||
auto idx = m_index.fetch_add(1, std::memory_order_release);
|
||||
m_headers.at(idx) = rocprofiler_record_header_t{.hash = _hash, .payload = _addr};
|
||||
}
|
||||
read_unlock();
|
||||
return (_addr != nullptr);
|
||||
}
|
||||
|
||||
template <typename Tp>
|
||||
bool
|
||||
record_header_buffer::emplace(uint32_t _category, uint32_t _kind, Tp& _v)
|
||||
{
|
||||
if(is_locked() || m_headers.empty()) return false;
|
||||
|
||||
// request N bytes in the buffer (where N=sizeof(Tp)) and if
|
||||
// available, copy _v into the buffer region
|
||||
auto _create_record = [](auto& _buf, auto& _data) {
|
||||
constexpr auto buffer_sz = sizeof(Tp);
|
||||
void* _ptr = _buf.request(buffer_sz, false);
|
||||
if(_ptr) new(_ptr) Tp{_data};
|
||||
return _ptr;
|
||||
};
|
||||
|
||||
read_lock();
|
||||
auto _addr = _create_record(m_buffer, _v);
|
||||
if(_addr)
|
||||
{
|
||||
// if there is space in the buffer, atomically get an index
|
||||
// for where the header record should be placed.
|
||||
// NOTE: m_headers was resized to be large enough to accommodate
|
||||
// sizeof(Tp) == 1 for every entry in buffer
|
||||
auto idx = m_index.fetch_add(1, std::memory_order_release);
|
||||
m_headers.at(idx) =
|
||||
rocprofiler_record_header_t{.category = _category, .kind = _kind, .payload = _addr};
|
||||
}
|
||||
read_unlock();
|
||||
return (_addr != nullptr);
|
||||
}
|
||||
|
||||
|
||||
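Review note: in the hunks above, `emplace` takes the shared side of `m_shared` (via `read_lock()`), while `lock()` takes the exclusive side on behalf of the single reader. The following is a minimal, self-contained sketch of that multiple-writer, single-reader arrangement. `guarded_log`, `drain`, and the fixed-capacity slot vector are hypothetical names invented for illustration; they are not part of `record_header_buffer`.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-in for a multiple-writer, single-reader buffer.
class guarded_log
{
public:
    explicit guarded_log(size_t capacity)
    : m_slots(capacity)
    {}

    // Writers hold the shared side of the mutex, so they never block each
    // other; each writer claims a distinct slot through the atomic index.
    bool emplace(uint64_t value)
    {
        auto lk  = std::shared_lock<std::shared_mutex>{m_mutex};
        auto idx = m_index.fetch_add(1, std::memory_order_relaxed);
        if(idx >= m_slots.size()) return false;  // full: value is not stored
        m_slots[idx] = value;
        return true;
    }

    // The reader holds the exclusive side, so no writer is mid-copy while
    // the stored values are collected and the index is reset.
    std::vector<uint64_t> drain()
    {
        auto lk  = std::unique_lock<std::shared_mutex>{m_mutex};
        auto n   = std::min(m_index.load(), m_slots.size());
        auto out = std::vector<uint64_t>(m_slots.begin(), m_slots.begin() + n);
        m_index.store(0);
        return out;
    }

private:
    std::shared_mutex     m_mutex = {};
    std::atomic<size_t>   m_index = {0};
    std::vector<uint64_t> m_slots;
};
```

The inversion of the usual reader/writer roles is deliberate in this sketch: the many parties (writers) share, and the single party (reader) excludes.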
@@ -29,6 +29,7 @@
|
||||
#include <algorithm>
|
||||
#include <initializer_list>
|
||||
#include <iterator>
|
||||
#include <limits>
|
||||
#include <memory>
|
||||
#include <numeric>
|
||||
#include <type_traits>
|
||||
@@ -40,6 +41,15 @@ namespace common
|
||||
{
|
||||
namespace container
|
||||
{
|
||||
struct reserve_size
|
||||
{
|
||||
explicit reserve_size(size_t _v)
|
||||
: value{_v}
|
||||
{}
|
||||
|
||||
size_t value;
|
||||
};
|
||||
|
||||
template <typename Tp, size_t ChunkSizeV = 64>
|
||||
class stable_vector
|
||||
{
|
||||
@@ -155,6 +165,7 @@ public:
|
||||
stable_vector() = default;
|
||||
explicit stable_vector(size_type count, const Tp& value);
|
||||
explicit stable_vector(size_type count);
|
||||
explicit stable_vector(reserve_size&& reserve_count);
|
||||
|
||||
template <typename InputItrT,
|
||||
typename = std::enable_if_t<
|
||||
@@ -247,6 +258,12 @@ stable_vector<Tp, ChunkSizeV>::stable_vector(size_type count)
|
||||
}
|
||||
}
|
||||
|
||||
template <typename Tp, size_t ChunkSizeV>
|
||||
stable_vector<Tp, ChunkSizeV>::stable_vector(reserve_size&& reserve_count)
|
||||
{
|
||||
reserve(reserve_count.value);
|
||||
}
|
||||
|
||||
template <typename Tp, size_t ChunkSizeV>
|
||||
template <typename InputItrT, typename>
|
||||
stable_vector<Tp, ChunkSizeV>::stable_vector(InputItrT first, InputItrT last)
|
||||
|
||||
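Review note: the `reserve_size` wrapper added above lets a caller ask for capacity without constructing elements, which a bare `size_type` argument cannot express because `stable_vector(size_type count)` already means "construct `count` elements". A minimal sketch of the same tag-type idea follows, built on `std::vector` for brevity. `small_vec` is a hypothetical class used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Same shape as the wrapper in the diff: an explicit single-value tag type.
struct reserve_size
{
    explicit reserve_size(size_t _v)
    : value{_v}
    {}

    size_t value;
};

// Hypothetical container showing how the two constructors stay distinct.
template <typename Tp>
class small_vec
{
public:
    // Constructs `count` value-initialized elements.
    explicit small_vec(size_t count)
    : m_data(count)
    {}

    // Only reserves storage; the container stays empty.
    explicit small_vec(reserve_size&& n) { m_data.reserve(n.value); }

    size_t size() const { return m_data.size(); }
    size_t capacity() const { return m_data.capacity(); }

private:
    std::vector<Tp> m_data = {};
};
```

Because `reserve_size` has an explicit constructor, an integer argument can never silently select the reserving overload.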
@@ -1,8 +1,10 @@
|
||||
#
|
||||
#
|
||||
#
|
||||
set(ROCPROFILER_LIB_HEADERS config_helpers.hpp config_internal.hpp tracer.hpp)
|
||||
set(ROCPROFILER_LIB_SOURCES config_internal.cpp rocprofiler_config.cpp rocprofiler.cpp)
|
||||
set(ROCPROFILER_LIB_HEADERS buffer.hpp internal_threading.hpp registration.hpp)
|
||||
set(ROCPROFILER_LIB_SOURCES
|
||||
buffer.cpp buffer_tracing.cpp callback_tracing.cpp context.cpp internal_threading.cpp
|
||||
rocprofiler.cpp registration.cpp)
|
||||
|
||||
add_library(rocprofiler-library SHARED)
|
||||
add_library(rocprofiler::rocprofiler-library ALIAS rocprofiler-library)
|
||||
@@ -11,6 +13,7 @@ target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_SOURCES}
|
||||
${ROCPROFILER_LIB_HEADERS})
|
||||
|
||||
add_subdirectory(hsa)
|
||||
add_subdirectory(context)
|
||||
|
||||
target_link_libraries(
|
||||
rocprofiler-library
|
||||
|
||||
@@ -0,0 +1,203 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#include "lib/rocprofiler/buffer.hpp"
|
||||
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include "lib/common/container/stable_vector.hpp"
|
||||
#include "lib/common/utility.hpp"
|
||||
#include "lib/rocprofiler/context/context.hpp"
|
||||
#include "lib/rocprofiler/context/domain.hpp"
|
||||
#include "lib/rocprofiler/hsa/hsa.hpp"
|
||||
#include "lib/rocprofiler/internal_threading.hpp"
|
||||
#include "lib/rocprofiler/registration.hpp"
|
||||
|
||||
#include <atomic>
|
||||
#include <exception>
|
||||
#include <vector>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace buffer
|
||||
{
|
||||
namespace
|
||||
{
|
||||
using reserve_size_t = common::container::reserve_size;
|
||||
|
||||
auto&
|
||||
get_buffers_mutex()
|
||||
{
|
||||
static auto _v = std::mutex{};
|
||||
return _v;
|
||||
}
|
||||
} // namespace
|
||||
|
||||
unique_buffer_vec_t&
|
||||
get_buffers()
|
||||
{
|
||||
static auto _v = unique_buffer_vec_t{reserve_size_t{unique_buffer_vec_t::chunk_size}};
|
||||
return _v;
|
||||
}
|
||||
|
||||
std::optional<rocprofiler_buffer_id_t>
|
||||
allocate_buffer()
|
||||
{
|
||||
// ... allocate any internal space needed to handle another buffer ...
|
||||
auto _lk = std::unique_lock<std::mutex>{get_buffers_mutex()};
|
||||
|
||||
// identifier of the new buffer (its index among the registered buffers)
|
||||
auto _idx = get_buffers().size();
|
||||
|
||||
// make space in the registered buffers
|
||||
get_buffers().emplace_back(nullptr);
|
||||
|
||||
// create an entry in the registered buffers
|
||||
auto& _cfg_v = get_buffers().back();
|
||||
_cfg_v = std::make_unique<buffer::instance>();
|
||||
auto* _cfg = _cfg_v.get();
|
||||
|
||||
if(!_cfg) return std::nullopt;
|
||||
|
||||
return rocprofiler_buffer_id_t{_idx};
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
flush(rocprofiler_buffer_id_t buffer_id, bool wait)
|
||||
{
|
||||
if(buffer_id.handle >= get_buffers().size()) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
|
||||
|
||||
auto& buff = get_buffers().at(buffer_id.handle);
|
||||
|
||||
auto* task_group = rocprofiler::internal_threading::get_task_group(
|
||||
rocprofiler_callback_thread_t{buff->task_group_id});
|
||||
|
||||
if(task_group) task_group->wait();
|
||||
|
||||
// buffer is currently being flushed or destroyed
|
||||
if(buff->syncer.test_and_set()) return ROCPROFILER_STATUS_ERROR_BUFFER_BUSY;
|
||||
|
||||
auto buff_idx = buff->buffer_idx++;
|
||||
|
||||
auto _task = [buff_idx, buffer_id]() {
|
||||
auto& _buff = get_buffers().at(buffer_id.handle);
|
||||
auto& buff_v = _buff->buffers.at(buff_idx % _buff->buffers.size());
|
||||
|
||||
if(!buff_v.is_empty())
|
||||
{
|
||||
// get the array of record headers
|
||||
auto buff_data = buff_v.get_record_headers();
|
||||
|
||||
// invoke buffer callback
|
||||
try
|
||||
{
|
||||
_buff->callback(rocprofiler_context_id_t{_buff->context_id},
|
||||
rocprofiler_buffer_id_t{_buff->buffer_id},
|
||||
buff_data.data(),
|
||||
buff_data.size(),
|
||||
_buff->callback_data,
|
||||
_buff->drop_count);
|
||||
} catch(std::exception& e)
|
||||
{
|
||||
LOG(ERROR) << "buffer callback threw an exception: " << e.what();
|
||||
}
|
||||
// clear the buffer
|
||||
buff_v.clear();
|
||||
}
|
||||
|
||||
_buff->syncer.clear();
|
||||
};
|
||||
|
||||
if(task_group)
|
||||
{
|
||||
task_group->exec(_task);
|
||||
if(wait) task_group->wait();
|
||||
}
|
||||
else
|
||||
{
|
||||
_task();
|
||||
}
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
} // namespace buffer
|
||||
} // namespace rocprofiler
|
||||
|
||||
extern "C" {
|
||||
rocprofiler_status_t
|
||||
rocprofiler_create_buffer(rocprofiler_context_id_t context,
|
||||
size_t size,
|
||||
size_t watermark,
|
||||
rocprofiler_buffer_policy_t action,
|
||||
rocprofiler_buffer_tracing_cb_t callback,
|
||||
void* callback_data,
|
||||
rocprofiler_buffer_id_t* buffer_id)
|
||||
{
|
||||
if(rocprofiler::registration::get_init_status() > 0)
|
||||
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
|
||||
|
||||
auto opt_buff_id = rocprofiler::buffer::allocate_buffer();
|
||||
if(!opt_buff_id) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
|
||||
*buffer_id = *opt_buff_id;
|
||||
|
||||
auto& buff = rocprofiler::buffer::get_buffers().at(opt_buff_id->handle);
|
||||
|
||||
// allocate the buffers. if it is lossless, we allocate a second buffer to store data while
|
||||
// other buffer is being flushed
|
||||
buff->buffers.front().allocate(size);
|
||||
if(action == ROCPROFILER_BUFFER_POLICY_LOSSLESS) buff->buffers.back().allocate(size);
|
||||
|
||||
buff->watermark = watermark;
|
||||
buff->policy = action;
|
||||
buff->callback = callback;
|
||||
buff->callback_data = callback_data;
|
||||
buff->context_id = context.handle;
|
||||
buff->buffer_id = buffer_id->handle;
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id)
|
||||
{
|
||||
return rocprofiler::buffer::flush(buffer_id, true);
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id)
|
||||
{
|
||||
if(buffer_id.handle >= rocprofiler::buffer::get_buffers().size())
|
||||
return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
|
||||
|
||||
auto& buff = rocprofiler::buffer::get_buffers().at(buffer_id.handle);
|
||||
|
||||
// buffer is currently being flushed or destroyed
|
||||
if(buff->syncer.test_and_set()) return ROCPROFILER_STATUS_ERROR_BUFFER_BUSY;
|
||||
|
||||
for(auto& itr : buff->buffers)
|
||||
itr.reset();
|
||||
|
||||
buff->syncer.clear();
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
}
|
||||
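Review note: `flush` above guards each buffer with an `std::atomic_flag` (`syncer`) and advances `buffer_idx` so that writers move to the other half of the pair while the previous half is handed to the callback. Below is a minimal single-file sketch of that guard plus double-buffer rotation. The names `double_buffer`, `status`, and `current` are hypothetical, and the task-group (thread-pool) dispatch in the real function is left out.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

enum class status
{
    success,
    busy
};

// Hypothetical stand-in for buffer::instance with two int buffers.
struct double_buffer
{
    std::array<std::vector<int>, 2> buffers = {};
    std::atomic<unsigned>           active  = {0};
    std::atomic_flag                syncer  = ATOMIC_FLAG_INIT;

    // Buffer that writers should currently append to.
    std::vector<int>& current() { return buffers[active.load() % buffers.size()]; }

    template <typename FuncT>
    status flush(FuncT&& consume)
    {
        // test_and_set returns the previous value: true means another
        // flush (or a destroy) already owns the buffer.
        if(syncer.test_and_set()) return status::busy;

        // Rotate first so new writes land in the other buffer, then
        // hand the previously active buffer to the consumer.
        auto  idx  = active.fetch_add(1) % buffers.size();
        auto& full = buffers[idx];
        if(!full.empty())
        {
            consume(full);
            full.clear();
        }

        syncer.clear();
        return status::success;
    }
};
```

Returning `busy` instead of blocking keeps the caller free to decide whether to retry, which mirrors the `ROCPROFILER_STATUS_ERROR_BUFFER_BUSY` return in the diff.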
@@ -0,0 +1,122 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <bits/stdint-uintn.h>
|
||||
#include <rocprofiler/buffer.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
#include "lib/common/container/record_header_buffer.hpp"
|
||||
#include "lib/common/container/stable_vector.hpp"
|
||||
#include "lib/common/demangle.hpp"
|
||||
|
||||
#include <array>
|
||||
#include <atomic>
|
||||
#include <cstdint>
|
||||
#include <optional>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace buffer
|
||||
{
|
||||
struct instance
|
||||
{
|
||||
using buffer_t = common::container::record_header_buffer;
|
||||
|
||||
mutable std::array<buffer_t, 2> buffers = {};
|
||||
mutable std::atomic<unsigned short> buffer_idx = {};
|
||||
mutable std::atomic_flag syncer = ATOMIC_FLAG_INIT;
|
||||
mutable std::atomic<uint64_t> drop_count = {};
|
||||
uint64_t watermark = 0;
|
||||
uint64_t context_id = 0;
|
||||
uint64_t buffer_id = 0;
|
||||
uint64_t task_group_id = 0;
|
||||
rocprofiler_buffer_tracing_cb_t callback = nullptr;
|
||||
void* callback_data = nullptr;
|
||||
rocprofiler_buffer_policy_t policy = ROCPROFILER_BUFFER_POLICY_NONE;
|
||||
|
||||
template <typename Tp>
|
||||
void emplace(uint32_t, uint32_t, Tp&);
|
||||
};
|
||||
|
||||
using unique_buffer_vec_t = common::container::stable_vector<std::unique_ptr<instance>, 4>;
|
||||
|
||||
std::optional<rocprofiler_buffer_id_t>
|
||||
allocate_buffer();
|
||||
|
||||
unique_buffer_vec_t&
|
||||
get_buffers();
|
||||
|
||||
rocprofiler_status_t
|
||||
flush(rocprofiler_buffer_id_t buffer_id, bool wait);
|
||||
|
||||
inline rocprofiler_status_t
|
||||
flush(uint64_t buffer_idx, bool wait)
|
||||
{
|
||||
return flush(rocprofiler_buffer_id_t{buffer_idx}, wait);
|
||||
}
|
||||
} // namespace buffer
|
||||
} // namespace rocprofiler
|
||||
|
||||
template <typename Tp>
|
||||
inline void
|
||||
rocprofiler::buffer::instance::emplace(uint32_t category, uint32_t kind, Tp& value)
|
||||
{
|
||||
// get the index of the current buffer
|
||||
auto get_idx = [this]() { return buffer_idx.load(std::memory_order_acquire) % buffers.size(); };
|
||||
|
||||
auto idx = get_idx();
|
||||
if(!buffers.at(idx).emplace(category, kind, value))
|
||||
{
|
||||
if(buffers.at(idx).size() < sizeof(value))
|
||||
{
|
||||
auto msg = std::stringstream{};
|
||||
msg << "buffer " << buffer_id << " too small (size=" << buffers.at(idx).size()
|
||||
<< ") to hold an object of type " << common::cxx_demangle(typeid(value).name())
|
||||
<< " with size " << sizeof(value);
|
||||
throw std::runtime_error(msg.str());
|
||||
}
|
||||
|
||||
if(policy == ROCPROFILER_BUFFER_POLICY_LOSSLESS)
|
||||
{
|
||||
// blocks until buffer is flushed
|
||||
bool success = false;
|
||||
while(!success)
|
||||
{
|
||||
buffer::flush(buffer_id, true);
|
||||
idx = get_idx();
|
||||
success = buffers.at(idx).emplace(category, kind, value);
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
++drop_count;
|
||||
}
|
||||
}
|
||||
|
||||
if(buffers.at(idx).count() >= watermark)
|
||||
{
|
||||
// flush without syncing
|
||||
buffer::flush(buffer_id, false);
|
||||
}
|
||||
}
|
||||
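Review note: `instance::emplace` above distinguishes two outcomes when the active buffer is full: under `ROCPROFILER_BUFFER_POLICY_LOSSLESS` it flushes and retries, otherwise it increments `drop_count`. The sketch below reduces that decision to a single-threaded toy. `bounded_sink`, `policy::discard`, and `try_emplace` are hypothetical names, and the sketch assumes a nonzero capacity so that one flush always makes room.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class policy
{
    discard,   // hypothetical name for the non-lossless behaviour
    lossless
};

// Hypothetical single-threaded model of the full-buffer decision.
struct bounded_sink
{
    size_t           capacity   = 0;
    policy           mode       = policy::discard;
    std::vector<int> pending    = {};
    std::vector<int> delivered  = {};
    uint64_t         drop_count = 0;

    // Deliver everything pending and make room for new records.
    void flush()
    {
        delivered.insert(delivered.end(), pending.begin(), pending.end());
        pending.clear();
    }

    bool try_emplace(int v)
    {
        if(pending.size() >= capacity) return false;
        pending.push_back(v);
        return true;
    }

    void emplace(int v)
    {
        if(try_emplace(v)) return;

        if(mode == policy::lossless)
        {
            // Keep flushing until the record fits (assumes capacity > 0).
            do
            {
                flush();
            } while(!try_emplace(v));
        }
        else
        {
            ++drop_count;  // record is lost, but the loss is counted
        }
    }
};
```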
@@ -0,0 +1,151 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#include <rocprofiler/fwd.h>
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include "lib/rocprofiler/context/context.hpp"
|
||||
#include "lib/rocprofiler/context/domain.hpp"
|
||||
#include "lib/rocprofiler/hsa/hsa.hpp"
|
||||
#include "lib/rocprofiler/registration.hpp"
|
||||
|
||||
#include <glog/logging.h>
|
||||
|
||||
#include <atomic>
|
||||
#include <limits>
|
||||
#include <vector>
|
||||
|
||||
#define RETURN_STATUS_ON_FAIL(...) \
|
||||
if(rocprofiler_status_t _status; (_status = __VA_ARGS__) != ROCPROFILER_STATUS_SUCCESS) \
|
||||
{ \
|
||||
return _status; \
|
||||
}
|
||||
|
||||
extern "C" {
|
||||
rocprofiler_status_t
|
||||
rocprofiler_configure_buffer_tracing_service(rocprofiler_context_id_t context_id,
|
||||
rocprofiler_service_buffer_tracing_kind_t kind,
|
||||
rocprofiler_tracing_operation_t* operations,
|
||||
size_t operations_count,
|
||||
rocprofiler_buffer_id_t buffer_id)
|
||||
{
|
||||
if(rocprofiler::registration::get_init_status() > 0)
|
||||
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
|
||||
|
||||
if(context_id.handle >= rocprofiler::context::get_registered_contexts().size())
|
||||
{
|
||||
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
}
|
||||
|
||||
auto& ctx = rocprofiler::context::get_registered_contexts().at(context_id.handle);
|
||||
|
||||
if(!ctx) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
|
||||
constexpr auto invalid_buffer_id =
|
||||
rocprofiler_buffer_id_t{std::numeric_limits<uint64_t>::max()};
|
||||
|
||||
if(!ctx->buffered_tracer)
|
||||
{
|
||||
ctx->buffered_tracer = std::make_unique<rocprofiler::context::buffer_tracing_service>();
|
||||
ctx->buffered_tracer->buffer_data.fill(invalid_buffer_id);
|
||||
}
|
||||
|
||||
if(ctx->buffered_tracer->buffer_data.at(kind).handle != invalid_buffer_id.handle)
|
||||
return ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED;
|
||||
|
||||
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain(ctx->buffered_tracer->domains, kind));
|
||||
|
||||
ctx->buffered_tracer->buffer_data.at(kind) = buffer_id;
|
||||
|
||||
for(size_t i = 0; i < operations_count; ++i)
|
||||
{
|
||||
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain_op(
|
||||
ctx->buffered_tracer->domains, kind, operations[i]));
|
||||
}
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_iterate_buffer_tracing_kind_names(rocprofiler_buffer_tracing_kind_name_cb_t callback,
|
||||
void* data)
|
||||
{
|
||||
// TODO(jrmadsen): need to add for other kinds
|
||||
size_t n = 0;
|
||||
bool premature = false;
|
||||
using pair_t = std::pair<rocprofiler_service_buffer_tracing_kind_t, const char*>;
|
||||
for(auto [eitr, sitr] : {
|
||||
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, "HSA_API"},
|
||||
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_HIP_API, "HIP_API"},
|
||||
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_MARKER_API, "MARKER_API"},
|
||||
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_MEMORY_COPY, "MEMORY_COPY"},
|
||||
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_KERNEL_DISPATCH, "KERNEL_DISPATCH"},
|
||||
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_PAGE_MIGRATION, "PAGE_MIGRATION"},
|
||||
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_SCRATCH_MEMORY, "SCRATCH_MEMORY"},
|
||||
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_EXTERNAL_CORRELATION, "EXTERNAL_CORRELATION"},
|
||||
})
|
||||
{
|
||||
auto _success = callback(eitr, sitr, data);
|
||||
if(_success != 0)
|
||||
{
|
||||
premature = true;
|
||||
break;
|
||||
}
|
||||
++n;
|
||||
}
|
||||
|
||||
#if defined(ROCPROFILER_CI)
|
||||
if(!premature)
|
||||
{
|
||||
LOG_ASSERT(n == ROCPROFILER_SERVICE_BUFFER_TRACING_LAST - 1)
|
||||
<< " :: new enumeration value added. Update this function";
|
||||
}
|
||||
#else
|
||||
(void) n;
|
||||
(void) premature;
|
||||
#endif
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_iterate_buffer_tracing_kind_operation_names(
|
||||
rocprofiler_service_buffer_tracing_kind_t kind,
|
||||
rocprofiler_buffer_tracing_operation_name_cb_t callback,
|
||||
void* data)
|
||||
{
|
||||
if(kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API)
|
||||
{
|
||||
auto ops = rocprofiler::hsa::get_ids();
|
||||
for(const auto& itr : ops)
|
||||
{
|
||||
auto _success = callback(kind, itr, rocprofiler::hsa::name_by_id(itr), data);
|
||||
if(_success != 0) break;
|
||||
}
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
|
||||
}
|
||||
}
|
||||
|
||||
#undef RETURN_STATUS_ON_FAIL
|
||||
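Review note: `RETURN_STATUS_ON_FAIL` in the file above relies on the C++17 `if` statement with an initializer, so the temporary status variable is scoped to the `if` and the macro can be used several times in one function. A minimal sketch with a stand-in status enum follows. `status_t`, `step`, and `run_steps` are hypothetical and exist only to exercise the macro.

```cpp
#include <cassert>

enum status_t
{
    STATUS_SUCCESS = 0,
    STATUS_ERROR_NEGATIVE
};

// Same shape as the macro in the diff, using the stand-in enum.
#define RETURN_STATUS_ON_FAIL(...)                                              \
    if(status_t _status; (_status = __VA_ARGS__) != STATUS_SUCCESS)             \
    {                                                                           \
        return _status;                                                         \
    }

// Hypothetical operation that fails on negative input.
inline status_t
step(int v)
{
    return (v < 0) ? STATUS_ERROR_NEGATIVE : STATUS_SUCCESS;
}

// Runs two steps and reports how many completed before any failure.
inline status_t
run_steps(int a, int b, int& completed)
{
    completed = 0;
    RETURN_STATUS_ON_FAIL(step(a));
    ++completed;
    RETURN_STATUS_ON_FAIL(step(b));
    ++completed;
    return STATUS_SUCCESS;
}

#undef RETURN_STATUS_ON_FAIL
```

As in the diff, the macro is undefined at the end of the file so it does not leak into other translation units through unity builds or later includes.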
@@ -0,0 +1,161 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include "lib/rocprofiler/context/context.hpp"
|
||||
#include "lib/rocprofiler/context/domain.hpp"
|
||||
#include "lib/rocprofiler/hsa/hsa.hpp"
|
||||
#include "lib/rocprofiler/registration.hpp"
|
||||
|
||||
#include <glog/logging.h>
|
||||
|
||||
#include <atomic>
|
||||
#include <vector>
|
||||
|
||||
#define RETURN_STATUS_ON_FAIL(...) \
|
||||
if(rocprofiler_status_t _status; (_status = __VA_ARGS__) != ROCPROFILER_STATUS_SUCCESS) \
|
||||
{ \
|
||||
return _status; \
|
||||
}
|
||||
|
||||
extern "C" {
|
||||
rocprofiler_status_t
|
||||
rocprofiler_configure_callback_tracing_service(rocprofiler_context_id_t context_id,
|
||||
rocprofiler_service_callback_tracing_kind_t kind,
|
||||
rocprofiler_tracing_operation_t* operations,
|
||||
size_t operations_count,
|
||||
rocprofiler_callback_tracing_cb_t callback,
|
||||
void* callback_args)
|
||||
{
|
||||
if(rocprofiler::registration::get_init_status() > 0)
|
||||
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
|
||||
|
||||
if(context_id.handle >= rocprofiler::context::get_registered_contexts().size())
|
||||
{
|
||||
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
}
|
||||
|
||||
auto& ctx = rocprofiler::context::get_registered_contexts().at(context_id.handle);
|
||||
|
||||
if(!ctx) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
|
||||
if(!ctx->callback_tracer)
|
||||
ctx->callback_tracer = std::make_unique<rocprofiler::context::callback_tracing_service>();
|
||||
|
||||
if(ctx->callback_tracer->callback_data.at(kind).callback)
|
||||
return ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED;
|
||||
|
||||
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain(ctx->callback_tracer->domains, kind));
|
||||
|
||||
ctx->callback_tracer->callback_data.at(kind) = {callback, callback_args};
|
||||
|
||||
for(size_t i = 0; i < operations_count; ++i)
|
||||
{
|
||||
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain_op(
|
||||
ctx->callback_tracer->domains, kind, operations[i]));
|
||||
}
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_iterate_callback_tracing_kind_names(
|
||||
rocprofiler_callback_tracing_kind_name_cb_t callback,
|
||||
void* data)
|
||||
{
|
||||
// TODO(jrmadsen): need to add for other kinds
|
||||
size_t n = 0;
|
||||
bool premature = false;
|
||||
using pair_t = std::pair<rocprofiler_service_callback_tracing_kind_t, const char*>;
|
||||
for(auto [eitr, sitr] : {
|
||||
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API, "HSA_API"},
|
||||
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_HIP_API, "HIP_API"},
|
||||
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API, "MARKER_API"},
|
||||
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_CODE_OBJECT, "CODE_OBJECT"},
|
||||
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_KERNEL_DISPATCH, "KERNEL_DISPATCH"},
|
||||
})
|
||||
{
|
||||
auto _success = callback(eitr, sitr, data);
|
||||
if(_success != 0)
|
||||
{
|
||||
premature = true;
|
||||
break;
|
||||
}
|
||||
++n;
|
||||
}
|
||||
|
||||
#if defined(ROCPROFILER_CI)
|
||||
if(!premature)
|
||||
{
|
||||
LOG_ASSERT(n == ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST - 1)
|
||||
<< " :: new enumeration value added. Update this function";
|
||||
}
|
||||
#else
|
||||
(void) n;
|
||||
(void) premature;
|
||||
#endif
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_iterate_callback_tracing_kind_operation_names(
|
||||
rocprofiler_service_callback_tracing_kind_t kind,
|
||||
rocprofiler_callback_tracing_operation_name_cb_t callback,
|
||||
void* data)
|
||||
{
|
||||
if(kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API)
|
||||
{
|
||||
auto ops = rocprofiler::hsa::get_ids();
|
||||
for(const auto& itr : ops)
|
||||
{
|
||||
auto _success = callback(kind, itr, rocprofiler::hsa::name_by_id(itr), data);
|
||||
if(_success != 0) break;
|
||||
}
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_iterate_callback_tracing_operation_args(
|
||||
rocprofiler_callback_tracing_record_t record,
|
||||
rocprofiler_callback_tracing_operation_args_cb_t callback,
|
||||
void* user_data)
|
||||
{
|
||||
if(record.kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API)
|
||||
{
|
||||
rocprofiler::hsa::iterate_args(
|
||||
record.operation,
|
||||
*static_cast<rocprofiler_hsa_api_callback_tracer_data_t*>(record.payload),
|
||||
callback,
|
||||
user_data);
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
|
||||
}
|
||||
}
|
||||
|
||||
#undef RETURN_STATUS_ON_FAIL
|
||||
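Review note: the `rocprofiler_iterate_*_kind_names` functions above walk a fixed list of (enum, name) pairs and stop early when the user callback returns nonzero. The sketch below shows the same C-style callback convention in isolation. `kind_t`, `name_cb_t`, `iterate_kind_names`, and both callbacks are hypothetical; unlike the real functions, the sketch returns the number of entries visited so the early stop is observable.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <utility>
#include <vector>

enum kind_t
{
    KIND_NONE = 0,
    KIND_A,
    KIND_B,
    KIND_C,
    KIND_LAST
};

// Callback returns 0 to continue, nonzero to stop the iteration.
using name_cb_t = int (*)(kind_t kind, const char* name, void* data);

inline size_t
iterate_kind_names(name_cb_t callback, void* data)
{
    size_t n     = 0;
    using pair_t = std::pair<kind_t, const char*>;
    for(auto [kind, name] : {pair_t{KIND_A, "A"}, pair_t{KIND_B, "B"}, pair_t{KIND_C, "C"}})
    {
        if(callback(kind, name, data) != 0) break;
        ++n;
    }
    return n;
}

// Appends every name to the std::vector<std::string> passed as user data.
inline int
collect_all(kind_t, const char* name, void* data)
{
    static_cast<std::vector<std::string>*>(data)->emplace_back(name);
    return 0;
}

// Requests a stop as soon as KIND_B is reached.
inline int
stop_at_b(kind_t kind, const char*, void*)
{
    return (kind == KIND_B) ? 1 : 0;
}
```

A full pass visits `KIND_LAST - 1` entries, which is the same invariant the `ROCPROFILER_CI` assertion in the diff checks.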
@@ -1,28 +0,0 @@
|
||||
|
||||
#include "config_internal.hpp"
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace internal
|
||||
{
|
||||
uint64_t
|
||||
correlation_config::get_unique_record_id()
|
||||
{
|
||||
static auto _v = std::atomic<uint64_t>{};
|
||||
return _v++;
|
||||
}
|
||||
|
||||
bool
|
||||
domain_config::operator()(rocprofiler_tracer_activity_domain_t _domain) const
|
||||
{
|
||||
return ((1 << _domain) & domains) == (1 << _domain);
|
||||
}
|
||||
|
||||
bool
|
||||
domain_config::operator()(rocprofiler_tracer_activity_domain_t _domain, uint32_t _op) const
|
||||
{
|
||||
auto _offset = (_domain * rocprofiler::internal::domain_ops_offset);
|
||||
return (*this)(_domain) && (opcodes.none() || opcodes.test(_offset + _op));
|
||||
}
|
||||
} // namespace internal
|
||||
} // namespace rocprofiler
|
||||
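Review note: the removed `domain_config::operator()` overloads above test one bit per domain and then an opcode bit at `domain * domain_ops_offset + op`, treating an empty opcode set as "all operations enabled". For reference, here is a minimal sketch of that filter. `domain_filter`, `enable`, and the constants `ops_per_domain` and `max_domains` are hypothetical values chosen for illustration, not the `ROCPROFILER_DOMAIN_OPS_*` definitions.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t ops_per_domain = 64;  // hypothetical opcode slots per domain
constexpr size_t max_domains    = 8;   // hypothetical number of domains

struct domain_filter
{
    int64_t                                   domains = 0;
    std::bitset<ops_per_domain * max_domains> opcodes = {};

    // Enable every operation in a domain.
    void enable(uint32_t domain) { domains |= (int64_t{1} << domain); }

    // Enable one operation; once any opcode bit is set, only listed
    // operations pass the two-argument check.
    void enable(uint32_t domain, uint32_t op)
    {
        enable(domain);
        opcodes.set(domain * ops_per_domain + op);
    }

    // Is the domain enabled?
    bool operator()(uint32_t domain) const
    {
        return ((int64_t{1} << domain) & domains) != 0;
    }

    // Is the operation enabled within the domain?
    bool operator()(uint32_t domain, uint32_t op) const
    {
        return (*this)(domain) &&
               (opcodes.none() || opcodes.test(domain * ops_per_domain + op));
    }
};
```

Note that `opcodes.none()` inspects the whole bitset, so in this sketch (as in the removed code) restricting one domain to specific operations also restricts every other domain.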
@@ -1,74 +0,0 @@
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/config.h>
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include <array>
|
||||
#include <atomic>
|
||||
#include <bitset>
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace internal
|
||||
{
|
||||
// number of bits to reserve all op codes
|
||||
constexpr size_t domain_ops_offset = ROCPROFILER_DOMAIN_OPS_MAX;
|
||||
constexpr size_t reserved_domain_size = ROCPROFILER_DOMAIN_OPS_RESERVED * 8;
|
||||
constexpr size_t max_configs_count = 8;
|
||||
|
||||
struct correlation_config
|
||||
{
|
||||
uint64_t id = 0;
|
||||
uint64_t external_id = 0;
|
||||
::rocprofiler_external_cid_cb_t external_id_callback = nullptr;
|
||||
|
||||
static uint64_t get_unique_record_id();
|
||||
};
|
||||
|
||||
struct domain_config
|
||||
{
|
||||
::rocprofiler_tracer_callback_t user_sync_callback = nullptr;
|
||||
int64_t domains = 0;
|
||||
std::bitset<reserved_domain_size> opcodes = {};
|
||||
|
||||
/// check if domain is enabled
|
||||
bool operator()(::rocprofiler_tracer_activity_domain_t) const;
|
||||
|
||||
/// check if op in a domain is enabled
|
||||
bool operator()(::rocprofiler_tracer_activity_domain_t, uint32_t) const;
|
||||
};
|
||||
|
||||
struct buffer_config
|
||||
{
|
||||
::rocprofiler_buffer_callback_t callback = nullptr;
|
||||
uint64_t buffer_size;
|
||||
// Memory::GenericBuffer* buffer = nullptr;
|
||||
uint64_t buffer_idx = 0;
|
||||
};
|
||||
|
||||
using filter_config = ::rocprofiler_filter_config;
|
||||
|
||||
struct config
|
||||
{
|
||||
// size is used to ensure that we never read past the end of the version
|
||||
size_t size = 0; // = sizeof(rocprofiler_config)
|
||||
uint32_t compat_version = 0; // set by user
|
||||
uint32_t api_version = 0; // set by rocprofiler
|
||||
uint64_t context_idx = 0; // context id index
|
||||
void* user_data = nullptr; // user data passed to callbacks
|
||||
correlation_config* correlation_id = nullptr; // &my_cid_config (optional)
|
||||
buffer_config* buffer = nullptr; // = &my_buffer_config (required)
|
||||
domain_config* domain = nullptr; // = &my_domain_config (required)
|
||||
filter_config* filter = nullptr; // = &my_filter_config (optional)
|
||||
};
|
||||
|
||||
std::array<rocprofiler::internal::config*, max_configs_count>&
|
||||
get_registered_configs();
|
||||
|
||||
std::array<std::atomic<rocprofiler::internal::config*>, max_configs_count>&
|
||||
get_active_configs();
|
||||
} // namespace internal
|
||||
} // namespace rocprofiler
|
||||
@@ -0,0 +1,89 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include "lib/rocprofiler/context/context.hpp"
|
||||
#include "lib/rocprofiler/context/domain.hpp"
|
||||
#include "lib/rocprofiler/hsa/hsa.hpp"
|
||||
#include "lib/rocprofiler/registration.hpp"
|
||||
|
||||
#include <atomic>
|
||||
#include <vector>
|
||||
|
||||
extern "C" {
|
||||
rocprofiler_status_t
|
||||
rocprofiler_create_context(rocprofiler_context_id_t* context_id)
|
||||
{
|
||||
if(rocprofiler::registration::get_init_status() > 0)
|
||||
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
|
||||
|
||||
auto cfg_id = rocprofiler::context::allocate_context();
|
||||
if(!cfg_id) return ROCPROFILER_STATUS_ERROR_CONTEXT_ERROR;
|
||||
*context_id = *cfg_id;
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_start_context(rocprofiler_context_id_t context_id)
|
||||
{
|
||||
return rocprofiler::context::start_context(context_id);
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_stop_context(rocprofiler_context_id_t context_id)
|
||||
{
|
||||
return rocprofiler::context::stop_context(context_id);
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_context_is_active(rocprofiler_context_id_t context_id, int* status)
|
||||
{
|
||||
*status = 0;
|
||||
for(const auto& itr : rocprofiler::context::get_active_contexts())
|
||||
{
|
||||
auto* cfg = itr.load(std::memory_order_relaxed);
|
||||
if(cfg && cfg->context_idx == context_id.handle)
|
||||
{
|
||||
*status = 1;
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
}
|
||||
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_context_is_valid(rocprofiler_context_id_t context_id, int* status)
|
||||
{
|
||||
*status = 0;
|
||||
for(const auto& itr : rocprofiler::context::get_registered_contexts())
|
||||
{
|
||||
if(itr && itr->context_idx == context_id.handle)
|
||||
{
|
||||
auto _ret = rocprofiler::context::validate_context(itr.get());
|
||||
*status = (_ret == ROCPROFILER_STATUS_SUCCESS) ? 1 : 0;
|
||||
return _ret;
|
||||
}
|
||||
}
|
||||
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
}
|
||||
}
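The two query functions above share one shape: zero the out-parameter, scan a table of atomic pointers, and report the result through a status code. Below is a minimal, self-contained sketch of that shape. The names `status_t`, `ctx_t`, `active_slots`, and `context_is_active` are hypothetical stand-ins chosen for illustration and are not part of the rocprofiler API; the fixed-size `std::array` is likewise an assumption.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for the rocprofiler status and context types.
enum status_t
{
    STATUS_SUCCESS = 0,
    STATUS_ERROR_CONTEXT_NOT_FOUND
};

struct ctx_t
{
    uint64_t context_idx = 0;
};

// Table of "active" slots; an empty slot holds nullptr.
inline std::array<std::atomic<ctx_t*>, 4>&
active_slots()
{
    static std::array<std::atomic<ctx_t*>, 4> _v{};
    return _v;
}

// Writes 1 to *status and returns success when some slot holds the given id.
// Otherwise *status stays 0 and the "not found" status is returned.
inline status_t
context_is_active(uint64_t id, int* status)
{
    *status = 0;
    for(const auto& slot : active_slots())
    {
        const ctx_t* cfg = slot.load(std::memory_order_relaxed);
        if(cfg && cfg->context_idx == id)
        {
            *status = 1;
            return STATUS_SUCCESS;
        }
    }
    return STATUS_ERROR_CONTEXT_NOT_FOUND;
}
```

Resetting the out-parameter before the scan means a caller that ignores the return code still reads a well-defined value.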
|
||||
@@ -0,0 +1,14 @@
|
||||
#
|
||||
# context
|
||||
#
|
||||
set(ROCPROFILER_LIB_CONFIG_SOURCES context.cpp domain.cpp)
|
||||
set(ROCPROFILER_LIB_CONFIG_HEADERS context.hpp domain.hpp allocator.hpp)
|
||||
|
||||
target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_CONFIG_SOURCES}
|
||||
${ROCPROFILER_LIB_CONFIG_HEADERS})
|
||||
|
||||
# add_executable(rocr-example hsa.cpp rocr.hpp) target_link_libraries(rocr-example PRIVATE
|
||||
# rocprofiler-v2) target_include_directories( rocr-example PRIVATE ${PROJECT_SOURCE_DIR}
|
||||
# ${PROJECT_BINARY_DIR} ${PROJECT_SOURCE_DIR}/src ${PROJECT_BINARY_DIR}/src)
|
||||
# target_compile_definitions( rocr-example PRIVATE AMD_INTERNAL_BUILD PROF_API_IMPL
|
||||
# HIP_PROF_HIP_API_STRING=1 __HIP_PLATFORM_AMD__=1)
|
||||
+26
-24
@@ -1,36 +1,38 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include "rocprofiler/rocprofiler.h"
|
||||
|
||||
#include <array>
|
||||
#include <atomic>
|
||||
#include <cstddef>
|
||||
#include <utility>
|
||||
|
||||
namespace
|
||||
namespace rocprofiler
|
||||
{
|
||||
inline size_t // NOLINTNEXTLINE
|
||||
get_domain_max_op(rocprofiler_tracer_activity_domain_t _domain)
|
||||
namespace context
|
||||
{
|
||||
switch(_domain)
|
||||
{
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE: return -1;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API: return 0;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API: return 0;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_MARKER_API: return 0;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_KFD_API: return -1;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_EXT_API: return -1;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_OPS: return 0;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_OPS: return 0;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_EVT: return 0;
|
||||
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST: return -1;
|
||||
}
|
||||
return -1;
|
||||
}
|
||||
|
||||
template <typename Tp, size_t N = 8>
|
||||
struct allocator
|
||||
struct locality_allocator
|
||||
{
|
||||
void construct(Tp* const _p, const Tp& _v) const { ::new((void*) _p) Tp{_v}; }
|
||||
void construct(Tp* const _p, Tp&& _v) const { ::new((void*) _p) Tp{std::move(_v)}; }
|
||||
@@ -103,5 +105,5 @@ struct allocator
|
||||
|
||||
void reserve(const size_t) {}
|
||||
};
|
||||
|
||||
} // namespace
|
||||
} // namespace context
|
||||
} // namespace rocprofiler
|
||||
@@ -0,0 +1,230 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#include <rocprofiler/fwd.h>
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include "lib/common/container/stable_vector.hpp"
|
||||
#include "lib/rocprofiler/context/context.hpp"
|
||||
|
||||
#include <glog/logging.h>
|
||||
|
||||
#include <unistd.h>
|
||||
#include <atomic>
|
||||
#include <cstddef>
|
||||
#include <limits>
#include <memory>
|
||||
#include <mutex>
|
||||
#include <optional>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace context
|
||||
{
|
||||
namespace
|
||||
{
|
||||
auto&
|
||||
get_contexts_mutex()
|
||||
{
|
||||
static auto _v = std::mutex{};
|
||||
return _v;
|
||||
}
|
||||
|
||||
constexpr auto invalid_client_idx = std::numeric_limits<uint32_t>::max();
|
||||
|
||||
auto&
|
||||
get_client_index()
|
||||
{
|
||||
static auto _v = invalid_client_idx;
|
||||
return _v;
|
||||
}
|
||||
} // namespace
|
||||
|
||||
uint64_t
|
||||
correlation_tracing_service::get_unique_record_id()
|
||||
{
|
||||
static auto _v = std::atomic<uint64_t>{};
|
||||
return _v++;
|
||||
}
|
||||
|
||||
using reserve_size_t = common::container::reserve_size;
|
||||
|
||||
unique_context_vec_t&
|
||||
get_registered_contexts()
|
||||
{
|
||||
static auto _v = unique_context_vec_t{reserve_size_t{unique_context_vec_t::chunk_size}};
|
||||
return _v;
|
||||
}
|
||||
|
||||
active_context_vec_t&
|
||||
get_active_contexts()
|
||||
{
|
||||
static auto* _v = new active_context_vec_t{reserve_size_t{active_context_vec_t::chunk_size}};
|
||||
static auto _once = std::once_flag{};
|
||||
std::call_once(_once, std::atexit, []() {
|
||||
for(auto& itr : *_v)
|
||||
{
|
||||
itr.store(nullptr);
|
||||
}
|
||||
});
|
||||
return *_v;
|
||||
}
|
||||
|
||||
// set the client index; this needs to be called before allocate_context()
|
||||
void
|
||||
push_client(uint32_t value)
|
||||
{
|
||||
LOG_ASSERT(get_client_index() == invalid_client_idx)
|
||||
<< " rocprofiler client index is currently " << get_client_index()
|
||||
<< "... which means that a new client is initializing before the last client finished "
|
||||
"initializing. This is an internal error, please file a bug report with a reproducer";
|
||||
get_client_index() = value;
|
||||
}
|
||||
|
||||
// remove the client index
|
||||
void
|
||||
pop_client(uint32_t value)
|
||||
{
|
||||
LOG_ASSERT(get_client_index() == value)
|
||||
<< " rocprofiler client index is currently not " << value
|
||||
<< "... which means that a new client was initialized before this client finished "
|
||||
"initializing. This is an internal error, please file a bug report with a reproducer";
|
||||
get_client_index() = invalid_client_idx;
|
||||
}
|
||||
|
||||
std::optional<rocprofiler_context_id_t>
|
||||
allocate_context()
|
||||
{
|
||||
// ... allocate any internal space needed to handle another context ...
|
||||
auto _lk = std::unique_lock<std::mutex>{get_contexts_mutex()};
|
||||
|
||||
// initial context identifier number
|
||||
auto _idx = get_registered_contexts().size();
|
||||
|
||||
// make space in the registered contexts
|
||||
get_registered_contexts().emplace_back(nullptr);
|
||||
|
||||
// create an entry in the registered contexts
|
||||
auto& _cfg_v = get_registered_contexts().back();
|
||||
_cfg_v = std::make_unique<context>();
|
||||
auto* _cfg = _cfg_v.get();
|
||||
// ...
|
||||
|
||||
if(!_cfg) return std::nullopt;
|
||||
|
||||
_cfg->size = sizeof(context);
|
||||
_cfg->context_idx = _idx;
|
||||
_cfg->client_idx = get_client_index();
|
||||
|
||||
LOG_ASSERT(_cfg->client_idx != invalid_client_idx)
|
||||
<< " rocprofiler internal error: a context was allocated without an associated tool client "
|
||||
"identifier";
|
||||
|
||||
return rocprofiler_context_id_t{_idx};
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
validate_context(const context* cfg)
|
||||
{
|
||||
// if(cfg->buffer == nullptr) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
|
||||
|
||||
// if(cfg->filter == nullptr) return ROCPROFILER_STATUS_ERROR_FILTER_NOT_FOUND;
|
||||
|
||||
return (cfg) ? ROCPROFILER_STATUS_SUCCESS : ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
start_context(rocprofiler_context_id_t context_id)
|
||||
{
|
||||
if(context_id.handle >= get_registered_contexts().size())
|
||||
{
|
||||
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
}
|
||||
|
||||
context* cfg = get_registered_contexts().at(context_id.handle).get();
|
||||
|
||||
if(!cfg)
|
||||
{
|
||||
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
|
||||
}
|
||||
|
||||
if(validate_context(cfg) != ROCPROFILER_STATUS_SUCCESS)
|
||||
{
|
||||
return ROCPROFILER_STATUS_ERROR_CONTEXT_INVALID;
|
||||
}
|
||||
|
||||
uint64_t rocp_tot_contexts = get_registered_contexts().size();
|
||||
auto idx = rocp_tot_contexts;
|
||||
{
|
||||
// hold a lock here to prevent multiple threads from finding the same nullptr slot
|
||||
auto _lk = std::unique_lock<std::mutex>{get_contexts_mutex()};
|
||||
// try to find a nullptr slot first
|
||||
for(size_t i = 0; i < get_active_contexts().size(); ++i)
|
||||
{
|
||||
auto* itr = get_active_contexts().at(i).load(std::memory_order_relaxed);
|
||||
if(itr == nullptr)
|
||||
{
|
||||
idx = i;
|
||||
break;
|
||||
}
|
||||
else if(context_id.handle == itr->context_idx)
|
||||
{
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
}
|
||||
// if no nullptr slot was found, then create one while lock is held
|
||||
if(idx == rocp_tot_contexts)
|
||||
{
|
||||
idx = get_active_contexts().size();
|
||||
get_active_contexts().emplace_back();
|
||||
}
|
||||
}
|
||||
|
||||
// atomic swap the pointer into the "active" array used internally
|
||||
context* _expected = nullptr;
|
||||
bool success = get_active_contexts().at(idx).compare_exchange_strong(
|
||||
_expected, get_registered_contexts().at(context_id.handle).get());
|
||||
|
||||
if(!success) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_STARTED;
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
stop_context(rocprofiler_context_id_t idx)
|
||||
{
|
||||
// atomically assign the context pointer to NULL so that it is skipped in future
|
||||
// callbacks
|
||||
for(auto& itr : get_active_contexts())
|
||||
{
|
||||
auto* _expected = itr.load(std::memory_order_relaxed);
|
||||
if(_expected && _expected->context_idx == idx.handle)
|
||||
{
|
||||
bool success = itr.compare_exchange_strong(_expected, nullptr);
|
||||
|
||||
if(success) return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
}
|
||||
|
||||
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;  // no active context matched, or compare exchange failed
|
||||
}
|
||||
} // namespace context
|
||||
} // namespace rocprofiler
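The start/stop logic above amounts to claiming and releasing a slot in a table of atomic pointers with `compare_exchange_strong`. The following is a minimal sketch of that technique under simplifying assumptions: a fixed-size `std::array` replaces the growable `stable_vector`, `bool` replaces the status codes, and `ctx_t`, `slots`, `slots_mutex`, `start`, and `stop` are hypothetical names. Unlike the code above, this sketch scans the whole table before picking a free slot, so an already-active entry sitting after an empty slot is still detected.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <mutex>

struct ctx_t
{
    uint64_t context_idx = 0;
};

constexpr size_t max_slots = 4;

inline std::array<std::atomic<ctx_t*>, max_slots>&
slots()
{
    static std::array<std::atomic<ctx_t*>, max_slots> _v{};
    return _v;
}

inline std::mutex&
slots_mutex()
{
    static std::mutex _m;
    return _m;
}

// Returns true when ctx occupies a slot after the call.
inline bool
start(ctx_t* ctx)
{
    size_t idx = max_slots;
    {
        // the lock keeps two threads from selecting the same empty slot
        std::lock_guard<std::mutex> lk{slots_mutex()};
        for(size_t i = 0; i < max_slots; ++i)
        {
            ctx_t* cur = slots()[i].load(std::memory_order_relaxed);
            if(cur == ctx) return true;  // already active
            if(cur == nullptr && idx == max_slots) idx = i;
        }
    }
    if(idx == max_slots) return false;  // table is full

    // publish the pointer only if the slot is still empty
    ctx_t* expected = nullptr;
    return slots()[idx].compare_exchange_strong(expected, ctx);
}

// Returns true when a slot holding the given id was cleared.
inline bool
stop(uint64_t id)
{
    for(auto& slot : slots())
    {
        ctx_t* expected = slot.load(std::memory_order_relaxed);
        if(expected && expected->context_idx == id)
        {
            if(slot.compare_exchange_strong(expected, nullptr)) return true;
        }
    }
    return false;
}
```

Readers of the table only ever load a pointer, so they need no lock; a stopped context simply shows up as `nullptr` on the next scan.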
|
||||
@@ -0,0 +1,130 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/fwd.h>
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include "lib/common/container/stable_vector.hpp"
|
||||
#include "lib/rocprofiler/context/domain.hpp"
|
||||
|
||||
#include <array>
|
||||
#include <atomic>
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
#include <memory>
#include <optional>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace context
|
||||
{
|
||||
using external_cid_cb_t = uint64_t (*)(rocprofiler_service_callback_tracing_kind_t,
|
||||
uint32_t,
|
||||
uint64_t);
|
||||
|
||||
/// gives tools the opportunity to modify the correlation id based on the domain, op, and
|
||||
/// the rocprofiler generated correlation id
|
||||
struct correlation_tracing_service
|
||||
{
|
||||
uint64_t id = 0;
|
||||
uint64_t external_id = 0;
|
||||
external_cid_cb_t external_id_callback = nullptr;
|
||||
|
||||
static uint64_t get_unique_record_id();
|
||||
};
|
||||
|
||||
struct callback_tracing_service
|
||||
{
|
||||
struct callback_data
|
||||
{
|
||||
rocprofiler_callback_tracing_cb_t callback = nullptr;
|
||||
void* data = nullptr;
|
||||
};
|
||||
|
||||
using domain_t = rocprofiler_service_callback_tracing_kind_t;
|
||||
using callback_array_t = std::array<callback_data, domain_info<domain_t>::last>;
|
||||
|
||||
domain_context<domain_t> domains = {};
|
||||
callback_array_t callback_data = {};
|
||||
};
|
||||
|
||||
struct buffer_tracing_service
|
||||
{
|
||||
using domain_t = rocprofiler_service_buffer_tracing_kind_t;
|
||||
using buffer_array_t = std::array<rocprofiler_buffer_id_t, domain_info<domain_t>::last>;
|
||||
|
||||
domain_context<domain_t> domains = {};
|
||||
buffer_array_t buffer_data = {};
|
||||
};
|
||||
|
||||
struct context
|
||||
{
|
||||
// size is used to ensure that we never read past the end of the version
|
||||
size_t size = 0;
|
||||
uint64_t context_idx = 0; // context id
|
||||
uint32_t client_idx = 0; // tool id
|
||||
correlation_tracing_service correlation_tracer = {};
|
||||
std::unique_ptr<callback_tracing_service> callback_tracer = {};
|
||||
std::unique_ptr<buffer_tracing_service> buffered_tracer = {};
|
||||
};
|
||||
|
||||
// set the client index; this needs to be called before allocate_context()
|
||||
void push_client(uint32_t);
|
||||
|
||||
// remove the client index
|
||||
void pop_client(uint32_t);
|
||||
|
||||
/// @brief creates a context struct and returns a handle for locating the context struct
|
||||
///
|
||||
std::optional<rocprofiler_context_id_t>
|
||||
allocate_context();
|
||||
|
||||
/// \brief rocprofiler validates context, checks for conflicts, etc. Ensures that
|
||||
/// the context is valid *in isolation*, e.g. it may check that the user
|
||||
/// set the compat_version field and that required context fields, such as buffer
|
||||
/// are set. This function will be called before \ref start_context
|
||||
/// but is provided to help the user validate one or more contexts without starting
|
||||
/// them
|
||||
///
|
||||
/// \param [in] cfg context to validate
|
||||
rocprofiler_status_t
|
||||
validate_context(const context* cfg);
|
||||
|
||||
/// \brief rocprofiler activates the context associated with the given identifier
|
||||
/// \param [in] id the context identifier to start.
|
||||
rocprofiler_status_t
|
||||
start_context(rocprofiler_context_id_t id);
|
||||
|
||||
/// \brief disable the context.
|
||||
rocprofiler_status_t stop_context(rocprofiler_context_id_t);
|
||||
|
||||
using unique_context_vec_t = common::container::stable_vector<std::unique_ptr<context>, 8>;
|
||||
using active_context_vec_t = common::container::stable_vector<std::atomic<context*>, 8>;
|
||||
|
||||
unique_context_vec_t&
|
||||
get_registered_contexts();
|
||||
|
||||
active_context_vec_t&
|
||||
get_active_contexts();
|
||||
} // namespace context
|
||||
} // namespace rocprofiler
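The declarations above pair a "current client" guard (`push_client` / `pop_client`) with an allocator that hands back an index-based handle wrapped in `std::optional`. A minimal sketch of that arrangement follows, assuming C++17. It uses a plain `std::vector` instead of `stable_vector`, returns `std::nullopt` where the real code asserts, and the names `ctx_t`, `ctx_id_t`, `current_client`, `registry`, and `allocate` are hypothetical.

```cpp
#include <cstdint>
#include <limits>
#include <memory>
#include <mutex>
#include <optional>
#include <vector>

struct ctx_t
{
    uint64_t context_idx = 0;
    uint32_t client_idx  = 0;
};

struct ctx_id_t
{
    uint64_t handle;
};

constexpr uint32_t invalid_client = std::numeric_limits<uint32_t>::max();

// Index of the client that is currently initializing (invalid when none is).
inline uint32_t&
current_client()
{
    static uint32_t _v = invalid_client;
    return _v;
}

inline std::vector<std::unique_ptr<ctx_t>>&
registry()
{
    static std::vector<std::unique_ptr<ctx_t>> _v;
    return _v;
}

inline std::mutex&
registry_mutex()
{
    static std::mutex _m;
    return _m;
}

// Registers a new context and returns its position in the registry as the handle.
// Returns std::nullopt when no client is initializing (an assumption of this sketch).
inline std::optional<ctx_id_t>
allocate()
{
    if(current_client() == invalid_client) return std::nullopt;

    std::lock_guard<std::mutex> lk{registry_mutex()};
    const uint64_t idx   = registry().size();
    auto&          entry = registry().emplace_back(std::make_unique<ctx_t>());
    entry->context_idx   = idx;
    entry->client_idx    = current_client();
    return ctx_id_t{idx};
}
```

Because the handle is just the registry index, looking a context up later is a bounds check plus an index, with no search.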
|
||||
@@ -0,0 +1,99 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#include "lib/rocprofiler/context/domain.hpp"
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace context
|
||||
{
|
||||
template <typename DomainT>
|
||||
bool
|
||||
domain_context<DomainT>::operator()(DomainT _domain) const
|
||||
{
|
||||
return ((1 << _domain) & domains) == (1 << _domain);
|
||||
}
|
||||
|
||||
template <typename DomainT>
|
||||
bool
|
||||
domain_context<DomainT>::operator()(DomainT _domain, uint32_t _op) const
|
||||
{
|
||||
auto _offset = (_domain * opcode_padding_v);
|
||||
return (*this)(_domain) && (opcodes.none() || opcodes.test(_offset + _op));
|
||||
}
|
||||
|
||||
template <typename DomainT>
|
||||
rocprofiler_status_t
|
||||
add_domain(domain_context<DomainT>& _cfg, DomainT _domain)
|
||||
{
|
||||
if(_domain <= domain_info<DomainT>::none || _domain >= domain_info<DomainT>::last)
|
||||
return ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND;
|
||||
|
||||
_cfg.domains |= (1 << _domain);
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
template <typename DomainT>
|
||||
rocprofiler_status_t
|
||||
add_domain_op(domain_context<DomainT>& _cfg, DomainT _domain, uint32_t _op)
|
||||
{
|
||||
if(_domain <= domain_info<DomainT>::none || _domain >= domain_info<DomainT>::last)
|
||||
return ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND;
|
||||
|
||||
if(_op >= domain_info<DomainT>::padding) return ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND;
|
||||
|
||||
auto _offset = (_domain * domain_info<DomainT>::padding);
|
||||
if(_offset >= _cfg.opcodes.size()) return ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND;
|
||||
|
||||
_cfg.opcodes.set(_offset + _op, true);
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
// instantiate the templates
|
||||
template struct domain_context<rocprofiler_service_callback_tracing_kind_t>;
|
||||
|
||||
template rocprofiler_status_t
|
||||
add_domain<rocprofiler_service_callback_tracing_kind_t>(
|
||||
domain_context<rocprofiler_service_callback_tracing_kind_t>&,
|
||||
rocprofiler_service_callback_tracing_kind_t);
|
||||
|
||||
template rocprofiler_status_t
|
||||
add_domain<rocprofiler_service_buffer_tracing_kind_t>(
|
||||
domain_context<rocprofiler_service_buffer_tracing_kind_t>&,
|
||||
rocprofiler_service_buffer_tracing_kind_t);
|
||||
|
||||
template rocprofiler_status_t
|
||||
add_domain_op<rocprofiler_service_callback_tracing_kind_t>(
|
||||
domain_context<rocprofiler_service_callback_tracing_kind_t>&,
|
||||
rocprofiler_service_callback_tracing_kind_t,
|
||||
uint32_t);
|
||||
|
||||
template struct domain_context<rocprofiler_service_buffer_tracing_kind_t>;
|
||||
|
||||
template rocprofiler_status_t
|
||||
add_domain_op<rocprofiler_service_buffer_tracing_kind_t>(
|
||||
domain_context<rocprofiler_service_buffer_tracing_kind_t>&,
|
||||
rocprofiler_service_buffer_tracing_kind_t,
|
||||
uint32_t);
|
||||
} // namespace context
|
||||
} // namespace rocprofiler
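The domain filter above combines an integer bitmask (one bit per domain) with a `std::bitset` that reserves a fixed block of bits per domain for operation codes. A minimal sketch of that layout follows; the enum `domain_t`, the block size `ops_padding = 8`, and the names `domain_filter`, `add_domain`, and `add_domain_op` are hypothetical, and `bool` stands in for the status codes.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical domain enumeration with "none" and "last" sentinels.
enum domain_t : uint32_t
{
    DOMAIN_NONE = 0,
    DOMAIN_A,
    DOMAIN_B,
    DOMAIN_LAST
};

// Number of opcode bits reserved per domain (assumed small for the sketch).
constexpr size_t ops_padding = 8;

struct domain_filter
{
    uint64_t                               domains = 0;
    std::bitset<ops_padding * DOMAIN_LAST> opcodes = {};

    // check if the domain bit is set
    bool enabled(domain_t d) const { return (domains & (uint64_t{1} << d)) != 0; }

    // check if an op is enabled; an empty opcode set lets every op through
    bool enabled(domain_t d, uint32_t op) const
    {
        if(!enabled(d) || op >= ops_padding) return false;
        return opcodes.none() || opcodes.test(d * ops_padding + op);
    }
};

inline bool
add_domain(domain_filter& f, domain_t d)
{
    if(d <= DOMAIN_NONE || d >= DOMAIN_LAST) return false;
    f.domains |= (uint64_t{1} << d);
    return true;
}

inline bool
add_domain_op(domain_filter& f, domain_t d, uint32_t op)
{
    if(d <= DOMAIN_NONE || d >= DOMAIN_LAST) return false;
    if(op >= ops_padding) return false;
    f.opcodes.set(d * ops_padding + op);
    return true;
}
```

Note that `opcodes.none()` is a global test in this layout: once any operation in any domain has been selected, every enabled domain is filtered by its own opcode bits.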
|
||||
@@ -0,0 +1,89 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include "lib/common/mpl.hpp"
|
||||
|
||||
#include <bitset>
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace context
|
||||
{
|
||||
// number of bits reserved for the op codes of each domain
|
||||
constexpr size_t domain_ops_padding = 512;
|
||||
|
||||
template <typename Tp>
|
||||
struct domain_info;
|
||||
|
||||
template <>
|
||||
struct domain_info<rocprofiler_service_callback_tracing_kind_t>
|
||||
{
|
||||
static constexpr size_t none = ROCPROFILER_SERVICE_CALLBACK_TRACING_NONE;
|
||||
static constexpr size_t last = ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST;
|
||||
static constexpr auto padding = domain_ops_padding;
|
||||
};
|
||||
|
||||
template <>
|
||||
struct domain_info<rocprofiler_service_buffer_tracing_kind_t>
|
||||
{
|
||||
static constexpr size_t none = ROCPROFILER_SERVICE_BUFFER_TRACING_NONE;
|
||||
static constexpr size_t last = ROCPROFILER_SERVICE_BUFFER_TRACING_LAST;
|
||||
static constexpr auto padding = domain_ops_padding;
|
||||
};
|
||||
|
||||
/// how the tools specify the tracing domain and (optionally) which operations in the
|
||||
/// domain they want to trace
|
||||
template <typename DomainT>
|
||||
struct domain_context
|
||||
{
|
||||
using supported_domains_v = common::mpl::type_list<rocprofiler_service_callback_tracing_kind_t,
|
||||
rocprofiler_service_buffer_tracing_kind_t>;
|
||||
static_assert(common::mpl::is_one_of<DomainT, supported_domains_v>::value,
|
||||
"Unsupported domain type");
|
||||
static constexpr auto opcode_padding_v = domain_info<DomainT>::padding;
|
||||
static constexpr auto max_opcodes_v = opcode_padding_v * domain_info<DomainT>::last;
|
||||
|
||||
/// check if domain is enabled
|
||||
bool operator()(DomainT) const;
|
||||
|
||||
/// check if op in a domain is enabled
|
||||
bool operator()(DomainT, uint32_t) const;
|
||||
|
||||
int64_t domains = 0;
|
||||
std::bitset<max_opcodes_v> opcodes = {};
|
||||
};
|
||||
|
||||
template <typename DomainT>
|
||||
rocprofiler_status_t
|
||||
add_domain(domain_context<DomainT>&, DomainT);
|
||||
|
||||
template <typename DomainT>
|
||||
rocprofiler_status_t
|
||||
add_domain_op(domain_context<DomainT>&, DomainT, uint32_t);
|
||||
} // namespace context
|
||||
} // namespace rocprofiler
|
||||
@@ -1,10 +1,10 @@
|
||||
#
|
||||
#
|
||||
#
|
||||
set(ROCPROFILER_LIB_HSA_SOURCES hsa.cpp)
|
||||
set(ROCPROFILER_LIB_HSA_HEADERS hsa.hpp defines.hpp ostream.hpp types.hpp utils.hpp)
|
||||
set(ROCPROFILER_LIB_HSA_HEADERS hsa.hpp defines.hpp types.hpp utils.hpp)
|
||||
|
||||
target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_HSA_SOURCES}
|
||||
${ROCPROFILER_LIB_HSA_HEADERS})
|
||||
|
||||
# add_executable(rocr-example hsa.cpp rocr.hpp) target_link_libraries(rocr-example PRIVATE
|
||||
# rocprofiler-v2) target_include_directories( rocr-example PRIVATE ${PROJECT_SOURCE_DIR}
|
||||
# ${PROJECT_BINARY_DIR} ${PROJECT_SOURCE_DIR}/src ${PROJECT_BINARY_DIR}/src)
|
||||
# target_compile_definitions( rocr-example PRIVATE AMD_INTERNAL_BUILD PROF_API_IMPL
|
||||
# HIP_PROF_HIP_API_STRING=1 __HIP_PLATFORM_AMD__=1)
|
||||
add_subdirectory(details)
|
||||
|
||||
@@ -32,30 +32,27 @@
|
||||
#define IMPL_DETAIL_FOR_EACH(MACRO, PREFIX, ...) \
|
||||
IMPL_DETAIL_FOR_EACH_(IMPL_DETAIL_FOR_EACH_NARG(__VA_ARGS__), MACRO, PREFIX, __VA_ARGS__)
|
||||
|
||||
#define MEMBER_0(...)
|
||||
#define MEMBER_1(PREFIX, FIELD) PREFIX.FIELD
|
||||
#define MEMBER_2(PREFIX, A, B) MEMBER_1(PREFIX, A), MEMBER_1(PREFIX, B)
|
||||
#define MEMBER_3(PREFIX, A, B, C) MEMBER_2(PREFIX, A, B), MEMBER_1(PREFIX, C)
|
||||
#define MEMBER_4(PREFIX, A, B, C, D) MEMBER_3(PREFIX, A, B, C), MEMBER_1(PREFIX, D)
|
||||
#define MEMBER_5(PREFIX, A, B, C, D, E) MEMBER_4(PREFIX, A, B, C, D), MEMBER_1(PREFIX, E)
|
||||
#define MEMBER_6(PREFIX, A, B, C, D, E, F) MEMBER_5(PREFIX, A, B, C, D, E), MEMBER_1(PREFIX, F)
|
||||
#define MEMBER_7(PREFIX, A, B, C, D, E, F, G) \
|
||||
MEMBER_6(PREFIX, A, B, C, D, E, F), MEMBER_1(PREFIX, G)
|
||||
|
||||
#define MEMBER_8(PREFIX, A, B, C, D, E, F, G, H) \
|
||||
MEMBER_7(PREFIX, A, B, C, D, E, F, G), MEMBER_1(PREFIX, H)
|
||||
|
||||
#define MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I) \
|
||||
MEMBER_8(PREFIX, A, B, C, D, E, F, G, H), MEMBER_1(PREFIX, I)
|
||||
|
||||
#define MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J) \
|
||||
MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I), MEMBER_1(PREFIX, J)
|
||||
|
||||
#define MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K) \
|
||||
MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J), MEMBER_1(PREFIX, K)
|
||||
|
||||
#define MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \
|
||||
MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), MEMBER_1(PREFIX, L)
|
||||
#define ADDR_MEMBER_0(...)
|
||||
#define ADDR_MEMBER_1(PREFIX, FIELD) static_cast<void*>(&PREFIX.FIELD)
|
||||
#define ADDR_MEMBER_2(PREFIX, A, B) ADDR_MEMBER_1(PREFIX, A), ADDR_MEMBER_1(PREFIX, B)
|
||||
#define ADDR_MEMBER_3(PREFIX, A, B, C) ADDR_MEMBER_2(PREFIX, A, B), ADDR_MEMBER_1(PREFIX, C)
|
||||
#define ADDR_MEMBER_4(PREFIX, A, B, C, D) ADDR_MEMBER_3(PREFIX, A, B, C), ADDR_MEMBER_1(PREFIX, D)
|
||||
#define ADDR_MEMBER_5(PREFIX, A, B, C, D, E) \
|
||||
ADDR_MEMBER_4(PREFIX, A, B, C, D), ADDR_MEMBER_1(PREFIX, E)
|
||||
#define ADDR_MEMBER_6(PREFIX, A, B, C, D, E, F) \
|
||||
ADDR_MEMBER_5(PREFIX, A, B, C, D, E), ADDR_MEMBER_1(PREFIX, F)
|
||||
#define ADDR_MEMBER_7(PREFIX, A, B, C, D, E, F, G) \
|
||||
ADDR_MEMBER_6(PREFIX, A, B, C, D, E, F), ADDR_MEMBER_1(PREFIX, G)
|
||||
#define ADDR_MEMBER_8(PREFIX, A, B, C, D, E, F, G, H) \
|
||||
ADDR_MEMBER_7(PREFIX, A, B, C, D, E, F, G), ADDR_MEMBER_1(PREFIX, H)
|
||||
#define ADDR_MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I) \
|
||||
ADDR_MEMBER_8(PREFIX, A, B, C, D, E, F, G, H), ADDR_MEMBER_1(PREFIX, I)
|
||||
#define ADDR_MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J) \
|
||||
ADDR_MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I), ADDR_MEMBER_1(PREFIX, J)
|
||||
#define ADDR_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K) \
|
||||
ADDR_MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J), ADDR_MEMBER_1(PREFIX, K)
|
||||
#define ADDR_MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \
|
||||
ADDR_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), ADDR_MEMBER_1(PREFIX, L)
|
||||
|
||||
#define NAMED_MEMBER_0(...)
|
||||
#define NAMED_MEMBER_1(PREFIX, FIELD) std::make_pair(#FIELD, PREFIX.FIELD)
|
||||
@@ -80,44 +77,10 @@
|
||||
#define NAMED_MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \
|
||||
NAMED_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), NAMED_MEMBER_1(PREFIX, L)
|
||||
|
||||
/// @def GET_MEMBER_FIELDS
/// @param VAR some struct instance
/// @param ... The member fields of the struct
///
/// @brief This macro expands one variable (VAR) plus one or more member fields
/// into a comma-separated sequence of the form `VAR.FIELD, ...`.
/// For example, `GET_MEMBER_FIELDS(foo, a, b, c)` would transform into `foo.a, foo.b, foo.c`:
///
/// @code{.cpp}
///
/// struct Foo
/// {
///     int    a;
///     float  b;
///     double c;
/// };
///
/// // some function taking int, float, and double
/// void some_function(int, float, double);
///
/// // overload to some_function accepting Foo instance and using
/// // the args to invoke "real" function
/// void some_function(Foo _foo_v)
/// {
///     some_function(GET_MEMBER_FIELDS(_foo_v, a, b, c));
/// }
///
/// int main()
/// {
///     Foo _foo_v = {-1, 0.5f, 2.0};
///     some_function(_foo_v);
/// }
///
/// @endcode
#define GET_MEMBER_FIELDS(VAR, ...) IMPL_DETAIL_FOR_EACH(MEMBER_, VAR, __VA_ARGS__)
#define GET_ADDR_MEMBER_FIELDS(VAR, ...) IMPL_DETAIL_FOR_EACH(ADDR_MEMBER_, VAR, __VA_ARGS__)
#define GET_NAMED_MEMBER_FIELDS(VAR, ...) IMPL_DETAIL_FOR_EACH(NAMED_MEMBER_, VAR, __VA_ARGS__)
|
||||
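The `GET_*_MEMBER_FIELDS` macros above dispatch through `IMPL_DETAIL_FOR_EACH`, whose definition is not shown in this hunk. The following is a minimal sketch of how such a count-and-dispatch macro could work, limited to three fields. Every `SKETCH_`-prefixed name, along with `Foo`, `sum`, and `addresses`, is hypothetical and exists only for illustration; the project's real helper may be implemented differently.

```cpp
#include <vector>

// Hypothetical per-arity expansions, mirroring the MEMBER_N / ADDR_MEMBER_N pattern.
#define SKETCH_MEMBER_1(PREFIX, A) PREFIX.A
#define SKETCH_MEMBER_2(PREFIX, A, B) SKETCH_MEMBER_1(PREFIX, A), SKETCH_MEMBER_1(PREFIX, B)
#define SKETCH_MEMBER_3(PREFIX, A, B, C) SKETCH_MEMBER_2(PREFIX, A, B), SKETCH_MEMBER_1(PREFIX, C)

#define SKETCH_ADDR_1(PREFIX, A) static_cast<void*>(&PREFIX.A)
#define SKETCH_ADDR_2(PREFIX, A, B) SKETCH_ADDR_1(PREFIX, A), SKETCH_ADDR_1(PREFIX, B)
#define SKETCH_ADDR_3(PREFIX, A, B, C) SKETCH_ADDR_2(PREFIX, A, B), SKETCH_ADDR_1(PREFIX, C)

// Count 1..3 variadic arguments by shifting a descending list of numbers.
#define SKETCH_COUNT_IMPL(_1, _2, _3, N, ...) N
#define SKETCH_COUNT(...) SKETCH_COUNT_IMPL(__VA_ARGS__, 3, 2, 1, 0)

// Two-level concatenation so the count is expanded before token pasting.
#define SKETCH_CAT_IMPL(A, B) A##B
#define SKETCH_CAT(A, B) SKETCH_CAT_IMPL(A, B)

// Select SKETCH_<KIND>_<N> based on the number of fields and invoke it.
#define SKETCH_FOR_EACH(MACRO, PREFIX, ...) \
    SKETCH_CAT(MACRO, SKETCH_COUNT(__VA_ARGS__))(PREFIX, __VA_ARGS__)

struct Foo
{
    int    a;
    float  b;
    double c;
};

// "Real" function taking the unpacked members.
inline double sum(int a, float b, double c) { return a + b + c; }

// The macro call expands to: sum(foo.a, foo.b, foo.c)
inline double sum(const Foo& foo) { return sum(SKETCH_FOR_EACH(SKETCH_MEMBER_, foo, a, b, c)); }

// The macro call expands to three static_cast<void*>(&foo.X) expressions.
inline std::vector<void*> addresses(Foo& foo)
{
    return {SKETCH_FOR_EACH(SKETCH_ADDR_, foo, a, b, c)};
}
```

Generating the member list from a single field list keeps the value, address, and named variants in step with each other, which appears to be why the diff adds `GET_ADDR_MEMBER_FIELDS` next to the existing macros.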
|
||||
#define HSA_API_INFO_DEFINITION_0(HSA_DOMAIN, HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR) \
|
||||
#define HSA_API_INFO_DEFINITION_0(HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR) \
|
||||
namespace rocprofiler \
|
||||
{ \
|
||||
namespace hsa \
|
||||
@@ -125,10 +88,11 @@
|
||||
template <> \
|
||||
struct hsa_api_info<HSA_API_ID> \
|
||||
{ \
|
||||
static constexpr auto domain_idx = HSA_DOMAIN; \
|
||||
static constexpr auto table_idx = HSA_TABLE; \
|
||||
static constexpr auto operation_idx = HSA_API_ID; \
|
||||
static constexpr auto name = #HSA_FUNC; \
|
||||
static constexpr auto callback_domain_idx = ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API; \
|
||||
static constexpr auto buffered_domain_idx = ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API; \
|
||||
static constexpr auto table_idx = HSA_TABLE; \
|
||||
static constexpr auto operation_idx = HSA_API_ID; \
|
||||
static constexpr auto name = #HSA_FUNC; \
|
||||
\
|
||||
using this_type = hsa_api_info<operation_idx>; \
|
||||
using base_type = hsa_api_impl<operation_idx>; \
|
||||
@@ -160,7 +124,7 @@
|
||||
template <typename DataT> \
|
||||
static auto& get_api_data_args(DataT& _data) \
|
||||
{ \
|
||||
return _data.api_data.args.HSA_FUNC; \
|
||||
return _data.HSA_FUNC; \
|
||||
} \
|
||||
\
|
||||
template <typename RetT, typename... Args> \
|
||||
@@ -174,18 +138,13 @@
|
||||
\
|
||||
static auto get_functor() { return get_functor(get_table_func()); } \
|
||||
\
|
||||
static std::string as_string(rocprofiler_hsa_trace_data_t) \
|
||||
static std::vector<void*> as_arg_addr(rocprofiler_hsa_api_callback_tracer_data_t) \
|
||||
{ \
|
||||
return std::string{name} + "()"; \
|
||||
} \
|
||||
\
|
||||
static std::string as_named_string(rocprofiler_hsa_trace_data_t) \
|
||||
{ \
|
||||
return std::string{name} + "()"; \
|
||||
return std::vector<void*>{}; \
|
||||
} \
|
||||
\
|
||||
static std::vector<std::pair<std::string, std::string>> as_arg_list( \
|
||||
rocprofiler_hsa_trace_data_t) \
|
||||
rocprofiler_hsa_api_callback_tracer_data_t) \
|
||||
{ \
|
||||
return {}; \
|
||||
} \
|
||||
@@ -193,7 +152,7 @@
|
||||
} \
|
||||
}
|
||||
|
||||
#define HSA_API_INFO_DEFINITION_V(HSA_DOMAIN, HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR, ...) \
|
||||
#define HSA_API_INFO_DEFINITION_V(HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR, ...) \
|
||||
namespace rocprofiler \
|
||||
{ \
|
||||
namespace hsa \
|
||||
@@ -201,10 +160,11 @@
|
||||
template <> \
|
||||
struct hsa_api_info<HSA_API_ID> \
|
||||
{ \
|
||||
static constexpr auto domain_idx = HSA_DOMAIN; \
|
||||
static constexpr auto table_idx = HSA_TABLE; \
|
||||
static constexpr auto operation_idx = HSA_API_ID; \
|
||||
static constexpr auto name = #HSA_FUNC; \
|
||||
static constexpr auto callback_domain_idx = ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API; \
|
||||
static constexpr auto buffered_domain_idx = ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API; \
|
||||
static constexpr auto table_idx = HSA_TABLE; \
|
||||
static constexpr auto operation_idx = HSA_API_ID; \
|
||||
static constexpr auto name = #HSA_FUNC; \
|
||||
\
|
||||
using this_type = hsa_api_info<operation_idx>; \
|
||||
using base_type = hsa_api_impl<operation_idx>; \
|
||||
@@ -236,7 +196,7 @@
|
||||
template <typename DataT> \
|
||||
static auto& get_api_data_args(DataT& _data) \
|
||||
{ \
|
||||
return _data.api_data.args.HSA_FUNC; \
|
||||
return _data.HSA_FUNC; \
|
||||
} \
|
||||
\
|
||||
template <typename RetT, typename... Args> \
|
||||
@@ -250,23 +210,17 @@
|
||||
\
|
||||
static auto get_functor() { return get_functor(get_table_func()); } \
|
||||
\
|
||||
static std::string as_string(rocprofiler_hsa_trace_data_t trace_data) \
|
||||
static std::vector<void*> as_arg_addr( \
|
||||
rocprofiler_hsa_api_callback_tracer_data_t trace_data) \
|
||||
{ \
|
||||
return utils::join(utils::join_args{std::string{name} + "(", ")", ", "}, \
|
||||
GET_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \
|
||||
return std::vector<void*>{ \
|
||||
GET_ADDR_MEMBER_FIELDS(get_api_data_args(trace_data.args), __VA_ARGS__)}; \
|
||||
} \
|
||||
\
|
||||
static std::string as_named_string(rocprofiler_hsa_trace_data_t trace_data) \
|
||||
{ \
|
||||
return utils::join( \
|
||||
utils::join_args{std::string{name} + "(", ")", ", "}, \
|
||||
GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \
|
||||
} \
|
||||
\
|
||||
static auto as_arg_list(rocprofiler_hsa_trace_data_t trace_data) \
|
||||
static auto as_arg_list(rocprofiler_hsa_api_callback_tracer_data_t trace_data) \
|
||||
{ \
|
||||
return utils::stringize( \
|
||||
GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \
|
||||
GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data.args), __VA_ARGS__)); \
|
||||
} \
|
||||
}; \
|
||||
} \
|
||||
|
||||
@@ -0,0 +1,8 @@
#
#
#
set(ROCPROFILER_LIB_HSA_DETAILS_SOURCES)
set(ROCPROFILER_LIB_HSA_DETAILS_HEADERS ostream.hpp)

target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_HSA_DETAILS_SOURCES}
                                           ${ROCPROFILER_LIB_HSA_DETAILS_HEADERS})
|
||||
+244
-223
File diff is too large to display
@@ -19,12 +19,20 @@
|
||||
// THE SOFTWARE.
|
||||
|
||||
#include "lib/rocprofiler/hsa/hsa.hpp"
|
||||
|
||||
#include "lib/common/defines.hpp"
|
||||
#include "lib/rocprofiler/hsa/ostream.hpp"
|
||||
#include "lib/common/utility.hpp"
|
||||
#include "lib/rocprofiler/buffer.hpp"
|
||||
#include "lib/rocprofiler/context/context.hpp"
|
||||
#include "lib/rocprofiler/hsa/details/ostream.hpp"
|
||||
#include "lib/rocprofiler/hsa/types.hpp"
|
||||
#include "lib/rocprofiler/hsa/utils.hpp"
|
||||
|
||||
#include <rocprofiler/buffer.h>
|
||||
#include <rocprofiler/callback_tracing.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
|
||||
#include <glog/logging.h>
|
||||
|
||||
#include <atomic>
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
@@ -46,7 +54,12 @@ template <typename DataT, typename Tp>
|
||||
void
|
||||
set_data_retval(DataT& _data, Tp _val)
|
||||
{
|
||||
if constexpr(std::is_same<Tp, hsa_signal_value_t>::value)
|
||||
if constexpr(std::is_same<Tp, null_type>::value)
|
||||
{
|
||||
(void) _data;
|
||||
(void) _val;
|
||||
}
|
||||
else if constexpr(std::is_same<Tp, hsa_signal_value_t>::value)
|
||||
{
|
||||
_data.hsa_signal_value_t_retval = _val;
|
||||
}
|
||||
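The hunk above adds a `null_type` branch to `set_data_retval` so that wrapped functions without a return value compile through the same path. A minimal sketch of that `if constexpr` dispatch follows, assuming a hypothetical `null_type` definition and a hypothetical `retval_storage` struct; the project's real return-value storage has different members.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical placeholder meaning "the wrapped function returned nothing".
struct null_type
{};

// Hypothetical storage for return values; illustrative only.
struct retval_storage
{
    std::int64_t  int64_retval  = 0;
    std::uint32_t uint32_retval = 0;
    bool          was_set       = false;
};

template <typename DataT, typename Tp>
void
set_data_retval(DataT& data, Tp val)
{
    if constexpr(std::is_same<Tp, null_type>::value)
    {
        // Nothing to record; the casts only silence unused-parameter warnings.
        (void) data;
        (void) val;
    }
    else if constexpr(std::is_same<Tp, std::int64_t>::value)
    {
        data.int64_retval = val;
        data.was_set      = true;
    }
    else if constexpr(std::is_same<Tp, std::uint32_t>::value)
    {
        data.uint32_retval = val;
        data.was_set       = true;
    }
    else
    {
        // Dependent condition: fires only if this branch is instantiated.
        static_assert(sizeof(Tp) == 0, "unsupported return type");
    }
}
```

Because discarded `if constexpr` branches are not instantiated, the assignment to a member of the wrong type is never compiled for `null_type`.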
@@ -100,65 +113,35 @@ get_table()
|
||||
}
|
||||
|
||||
template <size_t Idx>
|
||||
template <typename DataT, typename DataArgsT, typename... Args>
|
||||
template <typename DataArgsT, typename... Args>
|
||||
auto
|
||||
hsa_api_impl<Idx>::phase_enter(DataT& _data, DataArgsT& _data_args, Args... args)
|
||||
hsa_api_impl<Idx>::set_data_args(DataArgsT& _data_args, Args... args)
|
||||
{
|
||||
using info_type = hsa_api_info<Idx>;
|
||||
|
||||
activity_functor_t _func = report_activity.load(std::memory_order_relaxed);
|
||||
if(_func)
|
||||
if constexpr(Idx == ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect)
|
||||
{
|
||||
if constexpr(Idx == ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect)
|
||||
{
|
||||
auto _tuple = std::make_tuple(args...);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.dst = std::get<0>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.dst_offset = std::get<1>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.src = std::get<2>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.src_offset = std::get<3>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.range = std::get<4>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.range__val = *(std::get<4>(_tuple));
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.copy_agent = std::get<5>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.dir = std::get<6>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.num_dep_signals =
|
||||
std::get<7>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.dep_signals = std::get<8>(_tuple);
|
||||
_data.api_data.args.hsa_amd_memory_async_copy_rect.completion_signal =
|
||||
std::get<9>(_tuple);
|
||||
}
|
||||
else
|
||||
{
|
||||
_data_args = DataArgsT{args...};
|
||||
}
|
||||
if(_func(info_type::domain_idx, info_type::operation_idx, &_data) == 0)
|
||||
{
|
||||
if(_data.phase_enter != nullptr) _data.phase_enter(info_type::operation_idx, &_data);
|
||||
return true;
|
||||
}
|
||||
return false;
|
||||
auto _tuple = std::make_tuple(args...);
|
||||
_data_args.dst = std::get<0>(_tuple);
|
||||
_data_args.dst_offset = std::get<1>(_tuple);
|
||||
_data_args.src = std::get<2>(_tuple);
|
||||
_data_args.src_offset = std::get<3>(_tuple);
|
||||
_data_args.range = std::get<4>(_tuple);
|
||||
_data_args.range__val = *(std::get<4>(_tuple));
|
||||
_data_args.copy_agent = std::get<5>(_tuple);
|
||||
_data_args.dir = std::get<6>(_tuple);
|
||||
_data_args.num_dep_signals = std::get<7>(_tuple);
|
||||
_data_args.dep_signals = std::get<8>(_tuple);
|
||||
_data_args.completion_signal = std::get<9>(_tuple);
|
||||
}
|
||||
else
|
||||
{
|
||||
_data_args = DataArgsT{args...};
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
template <size_t Idx>
|
||||
template <typename DataT, typename... Args>
|
||||
template <typename FuncT, typename... Args>
|
||||
auto
|
||||
hsa_api_impl<Idx>::phase_exit(DataT& _data)
|
||||
{
|
||||
using info_type = hsa_api_info<Idx>;
|
||||
|
||||
if(_data.phase_exit != nullptr)
|
||||
{
|
||||
_data.phase_exit(info_type::operation_idx, &_data);
|
||||
return true;
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
template <size_t Idx>
|
||||
template <typename DataT, typename FuncT, typename... Args>
|
||||
auto
|
||||
hsa_api_impl<Idx>::exec(DataT& _data, FuncT&& _func, Args&&... args)
|
||||
hsa_api_impl<Idx>::exec(FuncT&& _func, Args&&... args)
|
||||
{
|
||||
using return_type = std::decay_t<std::invoke_result_t<FuncT, Args...>>;
|
||||
|
||||
@@ -175,9 +158,7 @@ hsa_api_impl<Idx>::exec(DataT& _data, FuncT&& _func, Args&&... args)
|
||||
}
|
||||
else
|
||||
{
|
||||
auto _ret = _func(std::forward<Args>(args)...);
|
||||
set_data_retval(_data.api_data, _ret);
|
||||
return _ret;
|
||||
return _func(std::forward<Args>(args)...);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -194,14 +175,161 @@ hsa_api_impl<Idx>::functor(Args&&... args)
|
||||
{
|
||||
using info_type = hsa_api_info<Idx>;
|
||||
|
||||
auto trace_data = rocprofiler_hsa_trace_data_t{};
|
||||
LOG(INFO) << __PRETTY_FUNCTION__;
|
||||
|
||||
auto _enabled = phase_enter(
|
||||
trace_data, info_type::get_api_data_args(trace_data), std::forward<Args>(args)...);
|
||||
struct callback_context_data
|
||||
{
|
||||
context::context* ctx = nullptr;
|
||||
rocprofiler_callback_tracing_record_t record = {};
|
||||
};
|
||||
|
||||
auto _ret = exec(trace_data, info_type::get_table_func(), std::forward<Args>(args)...);
|
||||
struct buffered_context_data
|
||||
{
|
||||
context::context* ctx = nullptr;
|
||||
};
|
||||
|
||||
if(_enabled) phase_exit(trace_data);
|
||||
auto callback_contexts = std::vector<callback_context_data>{};
|
||||
auto buffered_contexts = std::vector<buffered_context_data>{};
|
||||
for(const auto& aitr : context::get_active_contexts())
|
||||
{
|
||||
auto* itr = aitr.load();
|
||||
if(!itr) continue;
|
||||
|
||||
if(itr->callback_tracer)
|
||||
{
|
||||
// if the given domain + op is not enabled, skip this context
|
||||
if(!itr->callback_tracer->domains(info_type::callback_domain_idx,
|
||||
info_type::operation_idx))
|
||||
continue;
|
||||
|
||||
callback_contexts.emplace_back(
|
||||
callback_context_data{itr, rocprofiler_callback_tracing_record_t{}});
|
||||
}
|
||||
|
||||
if(itr->buffered_tracer)
|
||||
{
|
||||
// if the given domain + op is not enabled, skip this context
|
||||
if(!itr->buffered_tracer->domains(info_type::buffered_domain_idx,
|
||||
info_type::operation_idx))
|
||||
continue;
|
||||
|
||||
buffered_contexts.emplace_back(buffered_context_data{itr});
|
||||
}
|
||||
}
|
||||
|
||||
if(callback_contexts.empty() && buffered_contexts.empty())
|
||||
{
|
||||
auto _ret = exec(info_type::get_table_func(), std::forward<Args>(args)...);
|
||||
if constexpr(!std::is_same<decltype(_ret), null_type>::value)
|
||||
return _ret;
|
||||
else
|
||||
return HSA_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
auto buffer_record = rocprofiler_buffer_tracing_hsa_api_record_t{};
|
||||
auto tracer_data = rocprofiler_hsa_api_callback_tracer_data_t{};
|
||||
auto corr_id = context::correlation_tracing_service::get_unique_record_id();
|
||||
auto thr_id = common::get_tid();
|
||||
|
||||
// construct the buffered info before the callback so the callbacks are as closely wrapped
|
||||
// around the function call as possible
|
||||
if(!buffered_contexts.empty())
|
||||
{
|
||||
buffer_record.kind = info_type::buffered_domain_idx;
|
||||
buffer_record.correlation_id = rocprofiler_correlation_id_t{corr_id};
|
||||
buffer_record.operation = info_type::operation_idx;
|
||||
buffer_record.thread_id = thr_id;
|
||||
}
|
||||
|
||||
// invoke the callbacks
|
||||
if(!callback_contexts.empty())
|
||||
{
|
||||
tracer_data.size = sizeof(rocprofiler_hsa_api_callback_tracer_data_t);
|
||||
set_data_args(info_type::get_api_data_args(tracer_data.args), std::forward<Args>(args)...);
|
||||
|
||||
for(auto& itr : callback_contexts)
|
||||
{
|
||||
auto& ctx = itr.ctx;
|
||||
auto& record = itr.record;
|
||||
|
||||
uint64_t extern_corr_id = 0;
|
||||
auto& _correlation = ctx->correlation_tracer;
|
||||
if(_correlation.external_id_callback)
|
||||
{
|
||||
_correlation.external_id = _correlation.external_id_callback(
|
||||
info_type::callback_domain_idx, info_type::operation_idx, corr_id);
|
||||
extern_corr_id = _correlation.external_id;
|
||||
}
|
||||
auto user_data = rocprofiler_user_data_t{.value = 0};
|
||||
|
||||
record = rocprofiler_callback_tracing_record_t{
|
||||
thr_id,
|
||||
rocprofiler_correlation_id_t{corr_id},
|
||||
rocprofiler_external_correlation_id_t{extern_corr_id},
|
||||
info_type::callback_domain_idx,
|
||||
info_type::operation_idx,
|
||||
ROCPROFILER_SERVICE_CALLBACK_PHASE_ENTER,
|
||||
user_data,
|
||||
static_cast<void*>(&tracer_data)};
|
||||
|
||||
auto& callback_info =
|
||||
ctx->callback_tracer->callback_data.at(info_type::callback_domain_idx);
|
||||
callback_info.callback(record, callback_info.data);
|
||||
}
|
||||
}
|
||||
|
||||
// record the start timestamp as close to the function call as possible
|
||||
if(!buffered_contexts.empty())
|
||||
{
|
||||
buffer_record.start_timestamp = common::timestamp_ns();
|
||||
}
|
||||
|
||||
auto _ret = exec(info_type::get_table_func(), std::forward<Args>(args)...);
|
||||
|
||||
// record the end timestamp as close to the function call as possible
|
||||
if(!buffered_contexts.empty())
|
||||
{
|
||||
buffer_record.end_timestamp = common::timestamp_ns();
|
||||
}
|
||||
|
||||
if(!callback_contexts.empty())
|
||||
{
|
||||
set_data_retval(tracer_data.retval, _ret);
|
||||
|
||||
for(auto& itr : callback_contexts)
|
||||
{
|
||||
auto& ctx = itr.ctx;
|
||||
auto& record = itr.record;
|
||||
|
||||
record.phase = ROCPROFILER_SERVICE_CALLBACK_PHASE_EXIT;
|
||||
record.payload = static_cast<void*>(&tracer_data);
|
||||
|
||||
auto& callback_info =
|
||||
ctx->callback_tracer->callback_data.at(info_type::callback_domain_idx);
|
||||
callback_info.callback(record, callback_info.data);
|
||||
}
|
||||
}
|
||||
|
||||
if(!buffered_contexts.empty())
|
||||
{
|
||||
for(auto& itr : buffered_contexts)
|
||||
{
|
||||
assert(itr.ctx->buffered_tracer);
|
||||
auto buffer_id =
|
||||
itr.ctx->buffered_tracer->buffer_data.at(info_type::buffered_domain_idx);
|
||||
for(auto& bitr : buffer::get_buffers())
|
||||
{
|
||||
if(bitr && bitr->context_id == itr.ctx->context_idx &&
|
||||
bitr->buffer_id == buffer_id.handle)
|
||||
{
|
||||
bitr->emplace(ROCPROFILER_BUFFER_CATEGORY_TRACING,
|
||||
info_type::buffered_domain_idx,
|
||||
buffer_record);
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if constexpr(!std::is_same<decltype(_ret), null_type>::value)
|
||||
return _ret;
|
||||
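The rewritten `functor` body above follows a fixed order: enter callbacks, start timestamp, the wrapped call, end timestamp, exit callbacks. The sketch below isolates that ordering under simplifying assumptions. The types `phase`, `record`, `buffer_record`, and the function `traced_call` are hypothetical, callbacks are modelled as `std::function`, and the wrapped function is assumed to return a non-void value.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

enum class phase
{
    enter,
    exit
};

// Hypothetical, reduced stand-in for a callback tracing record.
struct record
{
    phase         ph;
    std::uint64_t correlation_id;
};

// Hypothetical, reduced stand-in for a buffered tracing record.
struct buffer_record
{
    std::uint64_t start_ns = 0;
    std::uint64_t end_ns   = 0;
};

using callback_t = std::function<void(const record&)>;

inline std::uint64_t
timestamp_ns()
{
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

template <typename FuncT, typename... Args>
auto
traced_call(const std::vector<callback_t>& callbacks,
            buffer_record&                 buffered,
            std::uint64_t                  corr_id,
            FuncT&&                        func,
            Args&&... args)
{
    // Enter callbacks run first so the timestamps wrap only the call itself.
    for(const auto& cb : callbacks)
        cb(record{phase::enter, corr_id});

    buffered.start_ns = timestamp_ns();
    auto ret          = std::forward<FuncT>(func)(std::forward<Args>(args)...);
    buffered.end_ns   = timestamp_ns();

    // Exit callbacks observe the same correlation id as the enter callbacks.
    for(const auto& cb : callbacks)
        cb(record{phase::exit, corr_id});

    return ret;
}
```

Taking the timestamps inside the callback pair keeps user callback cost out of the measured interval, which matches the "as close to the function call as possible" comments in the diff.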
@@ -222,74 +350,59 @@ namespace
|
||||
{
|
||||
template <size_t Idx, size_t... IdxTail>
|
||||
const char*
|
||||
hsa_api_name(const uint32_t id, std::index_sequence<Idx, IdxTail...>)
|
||||
name_by_id(const uint32_t id, std::index_sequence<Idx, IdxTail...>)
|
||||
{
|
||||
if(Idx == id) return hsa_api_info<Idx>::name;
|
||||
if constexpr(sizeof...(IdxTail) > 0)
|
||||
return hsa_api_name(id, std::index_sequence<IdxTail...>{});
|
||||
return name_by_id(id, std::index_sequence<IdxTail...>{});
|
||||
else
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
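`name_by_id` and `id_by_name` walk a compile-time `std::index_sequence` and compare each index against the runtime value. A self-contained sketch of the same recursion over a hypothetical three-entry table is shown below; `api_info`, `api_id_last`, `api_id_none`, and the entry names are assumptions made for illustration, not the project's definitions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <utility>

// Hypothetical per-operation traits standing in for hsa_api_info<Idx>.
template <std::size_t Idx>
struct api_info;

template <>
struct api_info<0>
{
    static constexpr auto name = "api_init";
};

template <>
struct api_info<1>
{
    static constexpr auto name = "api_shut_down";
};

template <>
struct api_info<2>
{
    static constexpr auto name = "api_get_info";
};

constexpr std::size_t   api_id_last = 3;
constexpr std::uint32_t api_id_none = 3;  // sentinel for "not found"

template <std::size_t Idx, std::size_t... IdxTail>
const char*
name_by_id(std::uint32_t id, std::index_sequence<Idx, IdxTail...>)
{
    if(Idx == id) return api_info<Idx>::name;
    // Recurse on the remaining indices; the recursion ends when the tail is empty.
    if constexpr(sizeof...(IdxTail) > 0)
        return name_by_id(id, std::index_sequence<IdxTail...>{});
    else
        return nullptr;
}

template <std::size_t Idx, std::size_t... IdxTail>
std::uint32_t
id_by_name(std::string_view name, std::index_sequence<Idx, IdxTail...>)
{
    if(std::string_view{api_info<Idx>::name} == name) return static_cast<std::uint32_t>(Idx);
    if constexpr(sizeof...(IdxTail) > 0)
        return id_by_name(name, std::index_sequence<IdxTail...>{});
    else
        return api_id_none;
}

// Public entry points: generate the full index list once.
inline const char*
name_by_id(std::uint32_t id)
{
    return name_by_id(id, std::make_index_sequence<api_id_last>{});
}

inline std::uint32_t
id_by_name(std::string_view name)
{
    return id_by_name(name, std::make_index_sequence<api_id_last>{});
}
```

The lookup is linear in the number of operations, but each comparison is against a compile-time constant, so an optimizing compiler is free to lower the chain into a jump table.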
template <size_t Idx, size_t... IdxTail>
|
||||
uint32_t
|
||||
hsa_api_id_by_name(const char* name, std::index_sequence<Idx, IdxTail...>)
|
||||
id_by_name(const char* name, std::index_sequence<Idx, IdxTail...>)
|
||||
{
|
||||
if(std::string_view{hsa_api_info<Idx>::name} == std::string_view{name})
|
||||
return hsa_api_info<Idx>::operation_idx;
|
||||
if constexpr(sizeof...(IdxTail) > 0)
|
||||
return hsa_api_id_by_name(name, std::index_sequence<IdxTail...>{});
|
||||
return id_by_name(name, std::index_sequence<IdxTail...>{});
|
||||
else
|
||||
return ROCPROFILER_HSA_API_ID_NONE;
|
||||
}
|
||||
|
||||
template <size_t Idx, size_t... IdxTail>
|
||||
std::string
|
||||
hsa_api_data_string(const uint32_t id,
|
||||
const rocprofiler_hsa_trace_data_t& _data,
|
||||
std::index_sequence<Idx, IdxTail...>)
|
||||
{
|
||||
if(Idx == id) return hsa_api_info<Idx>::as_string(_data);
|
||||
if constexpr(sizeof...(IdxTail) > 0)
|
||||
return hsa_api_data_string(id, _data, std::index_sequence<IdxTail...>{});
|
||||
else
|
||||
return std::string{};
|
||||
}
|
||||
|
||||
template <size_t Idx, size_t... IdxTail>
|
||||
std::string
|
||||
hsa_api_named_data_string(const uint32_t id,
|
||||
const rocprofiler_hsa_trace_data_t& _data,
|
||||
std::index_sequence<Idx, IdxTail...>)
|
||||
{
|
||||
if(Idx == id) return hsa_api_info<Idx>::as_named_string(_data);
|
||||
if constexpr(sizeof...(IdxTail) > 0)
|
||||
return hsa_api_named_data_string(id, _data, std::index_sequence<IdxTail...>{});
|
||||
else
|
||||
return std::string{};
|
||||
}
|
||||
|
||||
template <size_t Idx, size_t... IdxTail>
|
||||
void
|
||||
hsa_api_iterate_args(const uint32_t id,
|
||||
const rocprofiler_hsa_trace_data_t& _data,
|
||||
int (*_func)(const char*, const char*),
|
||||
std::index_sequence<Idx, IdxTail...>)
|
||||
iterate_args(const uint32_t id,
|
||||
const rocprofiler_hsa_api_callback_tracer_data_t& data,
|
||||
rocprofiler_callback_tracing_operation_args_cb_t func,
|
||||
void* user_data,
|
||||
std::index_sequence<Idx, IdxTail...>)
|
||||
{
|
||||
if(Idx == id)
|
||||
{
|
||||
for(auto&& itr : hsa_api_info<Idx>::as_arg_list(_data))
|
||||
using info_type = hsa_api_info<Idx>;
|
||||
auto&& arg_list = info_type::as_arg_list(data);
|
||||
auto&& arg_addr = info_type::as_arg_addr(data);
|
||||
for(size_t i = 0; i < std::min(arg_list.size(), arg_addr.size()); ++i)
|
||||
{
|
||||
_func(itr.first.c_str(), itr.second.c_str());
|
||||
auto ret = func(info_type::callback_domain_idx, // kind
|
||||
id, // operation
|
||||
i, // arg_number
|
||||
arg_list.at(i).first.c_str(), // arg_name
|
||||
arg_list.at(i).second.c_str(), // arg_value_str
|
||||
arg_addr.at(i), // arg_value_addr
|
||||
user_data);
|
||||
if(ret != 0) break;
|
||||
}
|
||||
}
|
||||
if constexpr(sizeof...(IdxTail) > 0)
|
||||
hsa_api_iterate_args(id, _data, _func, std::index_sequence<IdxTail...>{});
|
||||
iterate_args(id, data, func, user_data, std::index_sequence<IdxTail...>{});
|
||||
}
|
||||
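The new `iterate_args` passes each argument's index, name, stringified value, and address to a user callback, and stops as soon as the callback returns non-zero. The sketch below shows that loop in isolation. The callback signature `arg_callback_t` and the function `visit_args` are hypothetical simplifications and omit the kind and operation parameters used in the diff.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified callback signature.
using arg_callback_t = int (*)(std::size_t arg_number,
                               const char* arg_name,
                               const char* arg_value_str,
                               void*       arg_value_addr,
                               void*       user_data);

// Returns the number of arguments handed to the callback.
inline std::size_t
visit_args(const std::vector<std::pair<std::string, std::string>>& arg_list,
           const std::vector<void*>&                               arg_addr,
           arg_callback_t                                          callback,
           void*                                                   user_data)
{
    if(!callback) return 0;

    std::size_t visited = 0;
    // Guard against the two lists disagreeing in length.
    const auto n = std::min(arg_list.size(), arg_addr.size());
    for(std::size_t i = 0; i < n; ++i)
    {
        ++visited;
        const int ret = callback(
            i, arg_list[i].first.c_str(), arg_list[i].second.c_str(), arg_addr[i], user_data);
        if(ret != 0) break;  // non-zero return requests early termination
    }
    return visited;
}
```

Passing the address alongside the string form lets a tool read the argument in its native type instead of parsing text.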
|
||||
template <size_t... Idx>
|
||||
void
|
||||
hsa_api_get_ids(std::vector<uint32_t>& _id_list, std::index_sequence<Idx...>)
|
||||
get_ids(std::vector<uint32_t>& _id_list, std::index_sequence<Idx...>)
|
||||
{
|
||||
auto _emplace = [](auto& _vec, uint32_t _v) {
|
||||
if(_v < ROCPROFILER_HSA_API_ID_LAST) _vec.emplace_back(_v);
|
||||
@@ -300,7 +413,7 @@ hsa_api_get_ids(std::vector<uint32_t>& _id_list, std::index_sequence<Idx...>)
|
||||
|
||||
template <size_t... Idx>
|
||||
void
|
||||
hsa_api_get_names(std::vector<const char*>& _name_list, std::index_sequence<Idx...>)
|
||||
get_names(std::vector<const char*>& _name_list, std::index_sequence<Idx...>)
|
||||
{
|
||||
auto _emplace = [](auto& _vec, const char* _v) {
|
||||
if(_v != nullptr && strnlen(_v, 1) > 0) _vec.emplace_back(_v);
|
||||
@@ -311,9 +424,42 @@ hsa_api_get_names(std::vector<const char*>& _name_list, std::index_sequence<Idx.
|
||||
|
||||
template <size_t... Idx>
|
||||
void
|
||||
hsa_api_update_table(hsa_api_table_t* _orig, std::index_sequence<Idx...>)
|
||||
update_table(hsa_api_table_t* _orig, std::index_sequence<Idx...>)
|
||||
{
|
||||
static auto _should_wrap_functor =
|
||||
[](auto _callback_domain, auto _buffered_domain, auto _operation) {
|
||||
for(const auto& itr : context::get_registered_contexts())
|
||||
{
|
||||
if(!itr) continue;
|
||||
|
||||
if(itr->callback_tracer)
|
||||
{
|
||||
// domain not enabled so skip to next callback_tracer
|
||||
if(!itr->callback_tracer->domains(_callback_domain)) continue;
|
||||
|
||||
// if the given domain + op is enabled, we need to wrap
|
||||
if(itr->callback_tracer->domains(_callback_domain, _operation)) return true;
|
||||
}
|
||||
|
||||
if(itr->buffered_tracer)
|
||||
{
|
||||
// domain not enabled so skip to next buffered_tracer
|
||||
if(!itr->buffered_tracer->domains(_buffered_domain)) continue;
|
||||
|
||||
// if the given domain + op is enabled, we need to wrap
|
||||
if(itr->buffered_tracer->domains(_buffered_domain, _operation)) return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
};
|
||||
(void) _should_wrap_functor;
|
||||
|
||||
auto _update = [](hsa_api_table_t* _orig_v, auto _info) {
|
||||
// check to see if there are any contexts which enable this operation in the HSA API domain
|
||||
if(!_should_wrap_functor(
|
||||
_info.callback_domain_idx, _info.buffered_domain_idx, _info.operation_idx))
|
||||
return;
|
||||
|
||||
// 1. get the sub-table containing the function pointer
|
||||
// 2. get reference to function pointer in sub-table
|
||||
// 3. update function pointer with functor
|
||||
@@ -328,140 +474,57 @@ hsa_api_update_table(hsa_api_table_t* _orig, std::index_sequence<Idx...>)
|
||||
|
||||
// check out the assembly here... this compiles to a switch statement
|
||||
const char*
|
||||
hsa_api_name(uint32_t id)
|
||||
name_by_id(uint32_t id)
|
||||
{
|
||||
return hsa_api_name(id, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
return name_by_id(id, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
}
|
||||
|
||||
uint32_t
|
||||
hsa_api_id_by_name(const char* name)
|
||||
id_by_name(const char* name)
|
||||
{
|
||||
return hsa_api_id_by_name(name, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
}
|
||||
|
||||
std::string
|
||||
hsa_api_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data)
|
||||
{
|
||||
return hsa_api_data_string(id, _data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
}
|
||||
|
||||
std::string
|
||||
hsa_api_named_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data)
|
||||
{
|
||||
return hsa_api_named_data_string(
|
||||
id, _data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
return id_by_name(name, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
}
|
||||
|
||||
void
|
||||
hsa_api_iterate_args(uint32_t id,
|
||||
const rocprofiler_hsa_trace_data_t& _data,
|
||||
int (*_func)(const char*, const char*))
|
||||
iterate_args(uint32_t id,
|
||||
const rocprofiler_hsa_api_callback_tracer_data_t& data,
|
||||
rocprofiler_callback_tracing_operation_args_cb_t callback,
|
||||
void* user_data)
|
||||
{
|
||||
if(_func)
|
||||
hsa_api_iterate_args(
|
||||
id, _data, _func, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
if(callback)
|
||||
iterate_args(
|
||||
id, data, callback, user_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
}
|
||||
|
||||
std::vector<uint32_t>
|
||||
hsa_api_get_ids()
|
||||
get_ids()
|
||||
{
|
||||
auto _data = std::vector<uint32_t>{};
|
||||
_data.reserve(ROCPROFILER_HSA_API_ID_LAST);
|
||||
hsa_api_get_ids(_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
get_ids(_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
return _data;
|
||||
}
|
||||
|
||||
std::vector<const char*>
|
||||
hsa_api_get_names()
|
||||
get_names()
|
||||
{
|
||||
auto _data = std::vector<const char*>{};
|
||||
_data.reserve(ROCPROFILER_HSA_API_ID_LAST);
|
||||
hsa_api_get_names(_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
get_names(_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
return _data;
|
||||
}
|
||||
|
||||
void
|
||||
hsa_api_set_callback(activity_functor_t _func)
|
||||
set_callback(activity_functor_t _func)
|
||||
{
|
||||
auto&& _v = report_activity.load();
|
||||
report_activity.compare_exchange_strong(_v, _func);
|
||||
}
|
||||
|
||||
void
|
||||
hsa_api_update_table(hsa_api_table_t* _orig)
|
||||
update_table(hsa_api_table_t* _orig)
|
||||
{
|
||||
if(_orig) hsa_api_update_table(_orig, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
if(_orig) update_table(_orig, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
|
||||
}
|
||||
} // namespace hsa
|
||||
} // namespace rocprofiler
|
||||
|
||||
extern "C" {
|
||||
bool
|
||||
OnLoad(HsaApiTable* table,
|
||||
uint64_t runtime_version,
|
||||
uint64_t failed_tool_count,
|
||||
const char* const* failed_tool_names)
|
||||
{
|
||||
(void) runtime_version;
|
||||
(void) failed_tool_count;
|
||||
(void) failed_tool_names;
|
||||
|
||||
fprintf(stderr, "[%s:%i] %s\n", __FILE__, __LINE__, __FUNCTION__);
|
||||
|
||||
auto& _saved = rocprofiler::hsa::get_table();
|
||||
::copyTables(table, &_saved);
|
||||
|
||||
rocprofiler::hsa::hsa_api_update_table(table);
|
||||
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
#include <iomanip>
|
||||
|
||||
int
|
||||
main()
|
||||
{
|
||||
rocprofiler::hsa::activity_functor_t _cb =
|
||||
[](rocprofiler_tracer_activity_domain_t domain, uint32_t operation_id, void* data) {
|
||||
const auto* _name = rocprofiler::hsa::hsa_api_name(operation_id);
|
||||
auto _name_id = rocprofiler::hsa::hsa_api_id_by_name(_name);
|
||||
auto& _data = *static_cast<rocprofiler::hsa::hsa_trace_data_t*>(data);
|
||||
std::cout << "[cb] domain=" << domain << ", op_id=" << operation_id << ", data=" << data
|
||||
<< ", name=" << _name << ", name_id=" << _name_id << ", named_string='"
|
||||
<< rocprofiler::hsa::hsa_api_named_data_string(operation_id, _data) << "'"
|
||||
<< "\n";
|
||||
auto _func = [](const char* name, const char* value) {
|
||||
std::cout << " " << std::setw(20) << name << " = " << value << "\n";
|
||||
return 0;
|
||||
};
|
||||
rocprofiler::hsa::hsa_api_iterate_args(operation_id, _data, _func);
|
||||
return 0;
|
||||
};
|
||||
|
||||
rocprofiler::hsa::report_activity.store(_cb);
|
||||
|
||||
{
|
||||
double val = 40;
|
||||
hsa_code_object_t code_object = {};
|
||||
hsa_code_object_info_t attribute = HSA_CODE_OBJECT_INFO_TYPE;
|
||||
void* value = &val;
|
||||
|
||||
auto _func =
|
||||
rocprofiler::hsa::hsa_api_info<HSA_API_ID_hsa_code_object_get_info>::get_functor();
|
||||
_func(code_object, attribute, value);
|
||||
}
|
||||
|
||||
{
|
||||
bool result = false;
|
||||
uint16_t ext = 1;
|
||||
uint16_t major = 4;
|
||||
uint16_t minor = 2;
|
||||
|
||||
auto _func = rocprofiler::hsa::hsa_api_info<
|
||||
HSA_API_ID_hsa_system_extension_supported>::get_functor();
|
||||
_func(ext, major, minor, &result);
|
||||
}
|
||||
}
|
||||
*/
|
||||
|
||||
@@ -28,204 +28,203 @@ HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, core_)
|
||||
HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, amd_ext_)
|
||||
HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, image_ext_)
|
||||
|
||||
HSA_API_INFO_DEFINITION_0(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_init, hsa_init, hsa_init_fn)
|
||||
HSA_API_INFO_DEFINITION_0(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_shut_down, hsa_shut_down, hsa_shut_down_fn)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_info, hsa_system_get_info, hsa_system_get_info_fn, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_extension_supported, hsa_system_extension_supported, hsa_system_extension_supported_fn, extension, version_major, version_minor, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_extension_table, hsa_system_get_extension_table, hsa_system_get_extension_table_fn, extension, version_major, version_minor, table)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_iterate_agents, hsa_iterate_agents, hsa_iterate_agents_fn, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_info, hsa_agent_get_info, hsa_agent_get_info_fn, agent, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_create, hsa_queue_create, hsa_queue_create_fn, agent, size, type, callback, data, private_segment_size, group_segment_size, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_soft_queue_create, hsa_soft_queue_create, hsa_soft_queue_create_fn, region, size, type, features, doorbell_signal, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_destroy, hsa_queue_destroy, hsa_queue_destroy_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_inactivate, hsa_queue_inactivate, hsa_queue_inactivate_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl_fn, queue, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire_fn, queue, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed_fn, queue, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease_fn, queue, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_regions, hsa_agent_iterate_regions, hsa_agent_iterate_regions_fn, agent, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_region_get_info, hsa_region_get_info, hsa_region_get_info_fn, region, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_exception_policies, hsa_agent_get_exception_policies, hsa_agent_get_exception_policies_fn, agent, profile, mask)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_extension_supported, hsa_agent_extension_supported, hsa_agent_extension_supported_fn, extension, agent, version_major, version_minor, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_register, hsa_memory_register, hsa_memory_register_fn, ptr, size)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_deregister, hsa_memory_deregister, hsa_memory_deregister_fn, ptr, size)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_allocate, hsa_memory_allocate, hsa_memory_allocate_fn, region, size, ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_free, hsa_memory_free, hsa_memory_free_fn, ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_copy, hsa_memory_copy, hsa_memory_copy_fn, dst, src, size)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_assign_agent, hsa_memory_assign_agent, hsa_memory_assign_agent_fn, ptr, agent, access)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_create, hsa_signal_create, hsa_signal_create_fn, initial_value, num_consumers, consumers, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_destroy, hsa_signal_destroy, hsa_signal_destroy_fn, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_relaxed, hsa_signal_load_relaxed, hsa_signal_load_relaxed_fn, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_scacquire, hsa_signal_load_scacquire, hsa_signal_load_scacquire_fn, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_relaxed, hsa_signal_store_relaxed, hsa_signal_store_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_screlease, hsa_signal_store_screlease, hsa_signal_store_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_relaxed, hsa_signal_wait_relaxed, hsa_signal_wait_relaxed_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_scacquire, hsa_signal_wait_scacquire, hsa_signal_wait_scacquire_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_relaxed, hsa_signal_and_relaxed, hsa_signal_and_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacquire, hsa_signal_and_scacquire, hsa_signal_and_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_screlease, hsa_signal_and_screlease, hsa_signal_and_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_relaxed, hsa_signal_or_relaxed, hsa_signal_or_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacquire, hsa_signal_or_scacquire, hsa_signal_or_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_screlease, hsa_signal_or_screlease, hsa_signal_or_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_relaxed, hsa_signal_xor_relaxed, hsa_signal_xor_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacquire, hsa_signal_xor_scacquire, hsa_signal_xor_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_screlease, hsa_signal_xor_screlease, hsa_signal_xor_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_screlease, hsa_signal_exchange_screlease, hsa_signal_exchange_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_relaxed, hsa_signal_add_relaxed, hsa_signal_add_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacquire, hsa_signal_add_scacquire, hsa_signal_add_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_screlease, hsa_signal_add_screlease, hsa_signal_add_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_screlease, hsa_signal_subtract_screlease, hsa_signal_subtract_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_relaxed, hsa_signal_cas_relaxed, hsa_signal_cas_relaxed_fn, signal, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacquire, hsa_signal_cas_scacquire, hsa_signal_cas_scacquire_fn, signal, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_screlease, hsa_signal_cas_screlease, hsa_signal_cas_screlease_fn, signal, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl_fn, signal, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_from_name, hsa_isa_from_name, hsa_isa_from_name_fn, name, isa)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info, hsa_isa_get_info, hsa_isa_get_info_fn, isa, attribute, index, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_compatible, hsa_isa_compatible, hsa_isa_compatible_fn, code_object_isa, agent_isa, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_serialize, hsa_code_object_serialize, hsa_code_object_serialize_fn, code_object, alloc_callback, callback_data, options, serialized_code_object, serialized_code_object_size)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_deserialize, hsa_code_object_deserialize, hsa_code_object_deserialize_fn, serialized_code_object, serialized_code_object_size, options, code_object)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_destroy, hsa_code_object_destroy, hsa_code_object_destroy_fn, code_object)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_info, hsa_code_object_get_info, hsa_code_object_get_info_fn, code_object, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol, hsa_code_object_get_symbol, hsa_code_object_get_symbol_fn, code_object, symbol_name, symbol)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_symbol_get_info, hsa_code_symbol_get_info, hsa_code_symbol_get_info_fn, code_symbol, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols_fn, code_object, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create, hsa_executable_create, hsa_executable_create_fn, profile, executable_state, options, executable)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_destroy, hsa_executable_destroy, hsa_executable_destroy_fn, executable)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_code_object, hsa_executable_load_code_object, hsa_executable_load_code_object_fn, executable, agent, code_object, options)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_freeze, hsa_executable_freeze, hsa_executable_freeze_fn, executable, options)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_info, hsa_executable_get_info, hsa_executable_get_info_fn, executable, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_global_variable_define, hsa_executable_global_variable_define, hsa_executable_global_variable_define_fn, executable, variable_name, address)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define_fn, executable, agent, variable_name, address)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define_fn, executable, agent, variable_name, address)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate, hsa_executable_validate, hsa_executable_validate_fn, executable, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol, hsa_executable_get_symbol, hsa_executable_get_symbol_fn, executable, module_name, symbol_name, agent, call_convention, symbol)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_symbol_get_info, hsa_executable_symbol_get_info, hsa_executable_symbol_get_info_fn, executable_symbol, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_symbols, hsa_executable_iterate_symbols, hsa_executable_iterate_symbols_fn, executable, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_status_string, hsa_status_string, hsa_status_string_fn, status, status_string)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_extension_get_name, hsa_extension_get_name, hsa_extension_get_name_fn, extension, name)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_major_extension_supported, hsa_system_major_extension_supported, hsa_system_major_extension_supported_fn, extension, version_major, version_minor, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_major_extension_table, hsa_system_get_major_extension_table, hsa_system_get_major_extension_table_fn, extension, version_major, table_length, table)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_major_extension_supported, hsa_agent_major_extension_supported, hsa_agent_major_extension_supported_fn, extension, agent, version_major, version_minor, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_cache_get_info, hsa_cache_get_info, hsa_cache_get_info_fn, cache, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_caches, hsa_agent_iterate_caches, hsa_agent_iterate_caches_fn, agent, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_create, hsa_signal_group_create, hsa_signal_group_create_fn, num_signals, signals, num_consumers, consumers, signal_group)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_destroy, hsa_signal_group_destroy, hsa_signal_group_destroy_fn, signal_group)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_isas, hsa_agent_iterate_isas, hsa_agent_iterate_isas_fn, agent, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info_alt, hsa_isa_get_info_alt, hsa_isa_get_info_alt_fn, isa, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_exception_policies, hsa_isa_get_exception_policies, hsa_isa_get_exception_policies_fn, isa, profile, mask)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_round_method, hsa_isa_get_round_method, hsa_isa_get_round_method_fn, isa, fp_type, flush_mode, round_method)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_wavefront_get_info, hsa_wavefront_get_info, hsa_wavefront_get_info_fn, wavefront, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts_fn, isa, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name_fn, code_object, module_name, symbol_name, symbol)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file_fn, file, code_object_reader)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory_fn, code_object, size, code_object_reader)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_destroy, hsa_code_object_reader_destroy, hsa_code_object_reader_destroy_fn, code_object_reader)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create_alt, hsa_executable_create_alt, hsa_executable_create_alt_fn, profile, default_float_rounding_mode, options, executable)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_program_code_object, hsa_executable_load_program_code_object, hsa_executable_load_program_code_object_fn, executable, code_object_reader, options, loaded_code_object)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object_fn, executable, agent, code_object_reader, options, loaded_code_object)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate_alt, hsa_executable_validate_alt, hsa_executable_validate_alt_fn, executable, options, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name_fn, executable, symbol_name, agent, symbol)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols_fn, executable, agent, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols_fn, executable, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_get_type, hsa_amd_coherency_get_type, hsa_amd_coherency_get_type_fn, agent, type)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_set_type, hsa_amd_coherency_set_type, hsa_amd_coherency_set_type_fn, agent, type)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled_fn, queue, enable)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable_fn, enable)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time_fn, agent, signal, time)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time_fn, signal, time)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain_fn, agent, agent_tick, system_tick)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_async_handler, hsa_amd_signal_async_handler, hsa_amd_signal_async_handler_fn, signal, cond, value, handler, arg)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_async_function, hsa_amd_async_function, hsa_amd_async_function_fn, callback, arg)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_wait_any, hsa_amd_signal_wait_any, hsa_amd_signal_wait_any_fn, signal_count, signals, conds, values, timeout_hint, wait_hint, satisfying_value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask_fn, queue, num_cu_mask_count, cu_mask)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info_fn, memory_pool, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools_fn, agent, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate_fn, memory_pool, size, flags, ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_free, hsa_amd_memory_pool_free, hsa_amd_memory_pool_free_fn, ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy, hsa_amd_memory_async_copy, hsa_amd_memory_async_copy_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal, engine_id, force_copy_on_sdma)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status_fn, dst_agent, src_agent, engine_ids_mask)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info_fn, agent, memory_pool, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agents_allow_access, hsa_amd_agents_allow_access, hsa_amd_agents_allow_access_fn, num_agents, agents, flags, ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate_fn, src_memory_pool, dst_memory_pool, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_migrate, hsa_amd_memory_migrate, hsa_amd_memory_migrate_fn, ptr, memory_pool, flags)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock, hsa_amd_memory_lock, hsa_amd_memory_lock_fn, host_ptr, size, agents, num_agent, agent_ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_unlock, hsa_amd_memory_unlock, hsa_amd_memory_unlock_fn, host_ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_fill, hsa_amd_memory_fill, hsa_amd_memory_fill_fn, ptr, value, count)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer_fn, num_agents, agents, interop_handle, flags, size, ptr, metadata_size, metadata)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer_fn, ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_image_create, hsa_amd_image_create, hsa_amd_image_create_fn, agent, image_descriptor, image_layout, image_data, access_permission, image)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info, hsa_amd_pointer_info, hsa_amd_pointer_info_fn, ptr, info, alloc, num_agents_accessible, accessible)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata_fn, ptr, userdata)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create_fn, ptr, len, handle)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach_fn, handle, len, num_agents, mapping_agents, mapped_ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach_fn, mapped_ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_create, hsa_amd_signal_create, hsa_amd_signal_create_fn, initial_value, num_consumers, consumers, attributes, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create_fn, signal, handle)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach_fn, handle, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler_fn, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_set_priority, hsa_amd_queue_set_priority, hsa_amd_queue_set_priority_fn, queue, priority)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect_fn, dst, dst_offset, src, src_offset, range, copy_agent, dir, num_dep_signals, dep_signals, completion_signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool_fn, host_ptr, size, agents, num_agent, pool, flags, agent_ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback_fn, ptr, callback, user_data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback_fn, ptr, callback)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer_fn, signal, value_ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set_fn, ptr, size, attribute_list, attribute_count)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get_fn, ptr, size, attribute_list, attribute_count)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async_fn, ptr, size, agent, num_dep_signals, dep_signals, completion_signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_acquire, hsa_amd_spm_acquire, hsa_amd_spm_acquire_fn, preferred_agent)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_release, hsa_amd_spm_release, hsa_amd_spm_release_fn, preferred_agent)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer_fn, preferred_agent, size_in_bytes, timeout, size_copied, dest, is_data_loss)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask_fn, queue, num_cu_mask_count, cu_mask)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf_fn, ptr, size, dmabuf, offset)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf_fn, dmabuf)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability, hsa_ext_image_get_capability, hsa_ext_image_get_capability_fn, agent, geometry, image_format, capability_mask)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info, hsa_ext_image_data_get_info, hsa_ext_image_data_get_info_fn, agent, image_descriptor, access_permission, image_data_info)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create, hsa_ext_image_create, hsa_ext_image_create_fn, agent, image_descriptor, image_data, access_permission, image)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_import, hsa_ext_image_import, hsa_ext_image_import_fn, agent, src_memory, src_row_pitch, src_slice_pitch, dst_image, image_region)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_export, hsa_ext_image_export, hsa_ext_image_export_fn, agent, src_image, dst_memory, dst_row_pitch, dst_slice_pitch, image_region)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_copy, hsa_ext_image_copy, hsa_ext_image_copy_fn, agent, src_image, src_offset, dst_image, dst_offset, range)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_clear, hsa_ext_image_clear, hsa_ext_image_clear_fn, agent, image, data, image_region)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_destroy, hsa_ext_image_destroy, hsa_ext_image_destroy_fn, agent, image)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_create, hsa_ext_sampler_create, hsa_ext_sampler_create_fn, agent, sampler_descriptor, sampler)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_destroy, hsa_ext_sampler_destroy, hsa_ext_sampler_destroy_fn, agent, sampler)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout_fn, agent, geometry, image_format, image_data_layout, capability_mask)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout_fn, agent, image_descriptor, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image_data_info)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout_fn, agent, image_descriptor, image_data, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create_fn, agent_handle, size, type, callback, data, private_segment_size, group_segment_size, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register_fn, queue, callback, user_data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register_fn, callback, user_data)
|
||||
HSA_API_INFO_DEFINITION_0(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_init, hsa_init, hsa_init_fn)
|
||||
HSA_API_INFO_DEFINITION_0(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_shut_down, hsa_shut_down, hsa_shut_down_fn)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_info, hsa_system_get_info, hsa_system_get_info_fn, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_extension_supported, hsa_system_extension_supported, hsa_system_extension_supported_fn, extension, version_major, version_minor, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_extension_table, hsa_system_get_extension_table, hsa_system_get_extension_table_fn, extension, version_major, version_minor, table)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_iterate_agents, hsa_iterate_agents, hsa_iterate_agents_fn, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_info, hsa_agent_get_info, hsa_agent_get_info_fn, agent, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_create, hsa_queue_create, hsa_queue_create_fn, agent, size, type, callback, data, private_segment_size, group_segment_size, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_soft_queue_create, hsa_soft_queue_create, hsa_soft_queue_create_fn, region, size, type, features, doorbell_signal, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_destroy, hsa_queue_destroy, hsa_queue_destroy_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_inactivate, hsa_queue_inactivate, hsa_queue_inactivate_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed_fn, queue)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl_fn, queue, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire_fn, queue, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed_fn, queue, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease_fn, queue, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease_fn, queue, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_regions, hsa_agent_iterate_regions, hsa_agent_iterate_regions_fn, agent, callback, data)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_region_get_info, hsa_region_get_info, hsa_region_get_info_fn, region, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_exception_policies, hsa_agent_get_exception_policies, hsa_agent_get_exception_policies_fn, agent, profile, mask)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_extension_supported, hsa_agent_extension_supported, hsa_agent_extension_supported_fn, extension, agent, version_major, version_minor, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_register, hsa_memory_register, hsa_memory_register_fn, ptr, size)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_deregister, hsa_memory_deregister, hsa_memory_deregister_fn, ptr, size)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_allocate, hsa_memory_allocate, hsa_memory_allocate_fn, region, size, ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_free, hsa_memory_free, hsa_memory_free_fn, ptr)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_copy, hsa_memory_copy, hsa_memory_copy_fn, dst, src, size)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_assign_agent, hsa_memory_assign_agent, hsa_memory_assign_agent_fn, ptr, agent, access)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_create, hsa_signal_create, hsa_signal_create_fn, initial_value, num_consumers, consumers, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_destroy, hsa_signal_destroy, hsa_signal_destroy_fn, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_relaxed, hsa_signal_load_relaxed, hsa_signal_load_relaxed_fn, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_scacquire, hsa_signal_load_scacquire, hsa_signal_load_scacquire_fn, signal)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_relaxed, hsa_signal_store_relaxed, hsa_signal_store_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_screlease, hsa_signal_store_screlease, hsa_signal_store_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_relaxed, hsa_signal_wait_relaxed, hsa_signal_wait_relaxed_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_scacquire, hsa_signal_wait_scacquire, hsa_signal_wait_scacquire_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_relaxed, hsa_signal_and_relaxed, hsa_signal_and_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacquire, hsa_signal_and_scacquire, hsa_signal_and_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_screlease, hsa_signal_and_screlease, hsa_signal_and_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_relaxed, hsa_signal_or_relaxed, hsa_signal_or_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacquire, hsa_signal_or_scacquire, hsa_signal_or_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_screlease, hsa_signal_or_screlease, hsa_signal_or_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_relaxed, hsa_signal_xor_relaxed, hsa_signal_xor_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacquire, hsa_signal_xor_scacquire, hsa_signal_xor_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_screlease, hsa_signal_xor_screlease, hsa_signal_xor_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_screlease, hsa_signal_exchange_screlease, hsa_signal_exchange_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_relaxed, hsa_signal_add_relaxed, hsa_signal_add_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacquire, hsa_signal_add_scacquire, hsa_signal_add_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_screlease, hsa_signal_add_screlease, hsa_signal_add_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_screlease, hsa_signal_subtract_screlease, hsa_signal_subtract_screlease_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl_fn, signal, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_relaxed, hsa_signal_cas_relaxed, hsa_signal_cas_relaxed_fn, signal, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacquire, hsa_signal_cas_scacquire, hsa_signal_cas_scacquire_fn, signal, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_screlease, hsa_signal_cas_screlease, hsa_signal_cas_screlease_fn, signal, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl_fn, signal, expected, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_from_name, hsa_isa_from_name, hsa_isa_from_name_fn, name, isa)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info, hsa_isa_get_info, hsa_isa_get_info_fn, isa, attribute, index, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_compatible, hsa_isa_compatible, hsa_isa_compatible_fn, code_object_isa, agent_isa, result)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_serialize, hsa_code_object_serialize, hsa_code_object_serialize_fn, code_object, alloc_callback, callback_data, options, serialized_code_object, serialized_code_object_size)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_deserialize, hsa_code_object_deserialize, hsa_code_object_deserialize_fn, serialized_code_object, serialized_code_object_size, options, code_object)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_destroy, hsa_code_object_destroy, hsa_code_object_destroy_fn, code_object)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_info, hsa_code_object_get_info, hsa_code_object_get_info_fn, code_object, attribute, value)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol, hsa_code_object_get_symbol, hsa_code_object_get_symbol_fn, code_object, symbol_name, symbol)
|
||||
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_symbol_get_info, hsa_code_symbol_get_info, hsa_code_symbol_get_info_fn, code_symbol, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols_fn, code_object, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create, hsa_executable_create, hsa_executable_create_fn, profile, executable_state, options, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_destroy, hsa_executable_destroy, hsa_executable_destroy_fn, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_code_object, hsa_executable_load_code_object, hsa_executable_load_code_object_fn, executable, agent, code_object, options)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_freeze, hsa_executable_freeze, hsa_executable_freeze_fn, executable, options)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_info, hsa_executable_get_info, hsa_executable_get_info_fn, executable, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_global_variable_define, hsa_executable_global_variable_define, hsa_executable_global_variable_define_fn, executable, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define_fn, executable, agent, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define_fn, executable, agent, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate, hsa_executable_validate, hsa_executable_validate_fn, executable, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol, hsa_executable_get_symbol, hsa_executable_get_symbol_fn, executable, module_name, symbol_name, agent, call_convention, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_symbol_get_info, hsa_executable_symbol_get_info, hsa_executable_symbol_get_info_fn, executable_symbol, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_symbols, hsa_executable_iterate_symbols, hsa_executable_iterate_symbols_fn, executable, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_status_string, hsa_status_string, hsa_status_string_fn, status, status_string)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_extension_get_name, hsa_extension_get_name, hsa_extension_get_name_fn, extension, name)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_major_extension_supported, hsa_system_major_extension_supported, hsa_system_major_extension_supported_fn, extension, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_major_extension_table, hsa_system_get_major_extension_table, hsa_system_get_major_extension_table_fn, extension, version_major, table_length, table)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_major_extension_supported, hsa_agent_major_extension_supported, hsa_agent_major_extension_supported_fn, extension, agent, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_cache_get_info, hsa_cache_get_info, hsa_cache_get_info_fn, cache, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_caches, hsa_agent_iterate_caches, hsa_agent_iterate_caches_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_create, hsa_signal_group_create, hsa_signal_group_create_fn, num_signals, signals, num_consumers, consumers, signal_group)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_destroy, hsa_signal_group_destroy, hsa_signal_group_destroy_fn, signal_group)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_isas, hsa_agent_iterate_isas, hsa_agent_iterate_isas_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info_alt, hsa_isa_get_info_alt, hsa_isa_get_info_alt_fn, isa, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_exception_policies, hsa_isa_get_exception_policies, hsa_isa_get_exception_policies_fn, isa, profile, mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_round_method, hsa_isa_get_round_method, hsa_isa_get_round_method_fn, isa, fp_type, flush_mode, round_method)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_wavefront_get_info, hsa_wavefront_get_info, hsa_wavefront_get_info_fn, wavefront, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts_fn, isa, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name_fn, code_object, module_name, symbol_name, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file_fn, file, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory_fn, code_object, size, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_destroy, hsa_code_object_reader_destroy, hsa_code_object_reader_destroy_fn, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create_alt, hsa_executable_create_alt, hsa_executable_create_alt_fn, profile, default_float_rounding_mode, options, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_program_code_object, hsa_executable_load_program_code_object, hsa_executable_load_program_code_object_fn, executable, code_object_reader, options, loaded_code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object_fn, executable, agent, code_object_reader, options, loaded_code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate_alt, hsa_executable_validate_alt, hsa_executable_validate_alt_fn, executable, options, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name_fn, executable, symbol_name, agent, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols_fn, executable, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols_fn, executable, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_get_type, hsa_amd_coherency_get_type, hsa_amd_coherency_get_type_fn, agent, type)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_set_type, hsa_amd_coherency_set_type, hsa_amd_coherency_set_type_fn, agent, type)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled_fn, queue, enable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable_fn, enable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time_fn, agent, signal, time)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time_fn, signal, time)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain_fn, agent, agent_tick, system_tick)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_async_handler, hsa_amd_signal_async_handler, hsa_amd_signal_async_handler_fn, signal, cond, value, handler, arg)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_async_function, hsa_amd_async_function, hsa_amd_async_function_fn, callback, arg)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_wait_any, hsa_amd_signal_wait_any, hsa_amd_signal_wait_any_fn, signal_count, signals, conds, values, timeout_hint, wait_hint, satisfying_value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask_fn, queue, num_cu_mask_count, cu_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info_fn, memory_pool, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate_fn, memory_pool, size, flags, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_free, hsa_amd_memory_pool_free, hsa_amd_memory_pool_free_fn, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy, hsa_amd_memory_async_copy, hsa_amd_memory_async_copy_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal, engine_id, force_copy_on_sdma)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status_fn, dst_agent, src_agent, engine_ids_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info_fn, agent, memory_pool, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agents_allow_access, hsa_amd_agents_allow_access, hsa_amd_agents_allow_access_fn, num_agents, agents, flags, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate_fn, src_memory_pool, dst_memory_pool, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_migrate, hsa_amd_memory_migrate, hsa_amd_memory_migrate_fn, ptr, memory_pool, flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock, hsa_amd_memory_lock, hsa_amd_memory_lock_fn, host_ptr, size, agents, num_agent, agent_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_unlock, hsa_amd_memory_unlock, hsa_amd_memory_unlock_fn, host_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_fill, hsa_amd_memory_fill, hsa_amd_memory_fill_fn, ptr, value, count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer_fn, num_agents, agents, interop_handle, flags, size, ptr, metadata_size, metadata)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer_fn, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_image_create, hsa_amd_image_create, hsa_amd_image_create_fn, agent, image_descriptor, image_layout, image_data, access_permission, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info, hsa_amd_pointer_info, hsa_amd_pointer_info_fn, ptr, info, alloc, num_agents_accessible, accessible)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata_fn, ptr, userdata)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create_fn, ptr, len, handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach_fn, handle, len, num_agents, mapping_agents, mapped_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach_fn, mapped_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_create, hsa_amd_signal_create, hsa_amd_signal_create_fn, initial_value, num_consumers, consumers, attributes, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create_fn, signal, handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach_fn, handle, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler_fn, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_set_priority, hsa_amd_queue_set_priority, hsa_amd_queue_set_priority_fn, queue, priority)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect_fn, dst, dst_offset, src, src_offset, range, copy_agent, dir, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool_fn, host_ptr, size, agents, num_agent, pool, flags, agent_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback_fn, ptr, callback, user_data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback_fn, ptr, callback)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer_fn, signal, value_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set_fn, ptr, size, attribute_list, attribute_count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get_fn, ptr, size, attribute_list, attribute_count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async_fn, ptr, size, agent, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_acquire, hsa_amd_spm_acquire, hsa_amd_spm_acquire_fn, preferred_agent)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_release, hsa_amd_spm_release, hsa_amd_spm_release_fn, preferred_agent)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer_fn, preferred_agent, size_in_bytes, timeout, size_copied, dest, is_data_loss)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask_fn, queue, num_cu_mask_count, cu_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf_fn, ptr, size, dmabuf, offset)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf_fn, dmabuf)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability, hsa_ext_image_get_capability, hsa_ext_image_get_capability_fn, agent, geometry, image_format, capability_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info, hsa_ext_image_data_get_info, hsa_ext_image_data_get_info_fn, agent, image_descriptor, access_permission, image_data_info)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create, hsa_ext_image_create, hsa_ext_image_create_fn, agent, image_descriptor, image_data, access_permission, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_import, hsa_ext_image_import, hsa_ext_image_import_fn, agent, src_memory, src_row_pitch, src_slice_pitch, dst_image, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_export, hsa_ext_image_export, hsa_ext_image_export_fn, agent, src_image, dst_memory, dst_row_pitch, dst_slice_pitch, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_copy, hsa_ext_image_copy, hsa_ext_image_copy_fn, agent, src_image, src_offset, dst_image, dst_offset, range)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_clear, hsa_ext_image_clear, hsa_ext_image_clear_fn, agent, image, data, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_destroy, hsa_ext_image_destroy, hsa_ext_image_destroy_fn, agent, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_create, hsa_ext_sampler_create, hsa_ext_sampler_create_fn, agent, sampler_descriptor, sampler)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_destroy, hsa_ext_sampler_destroy, hsa_ext_sampler_destroy_fn, agent, sampler)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout_fn, agent, geometry, image_format, image_data_layout, capability_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout_fn, agent, image_descriptor, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image_data_info)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout_fn, agent, image_descriptor, image_data, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create_fn, agent_handle, size, type, callback, data, private_segment_size, group_segment_size, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register_fn, queue, callback, user_data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register_fn, callback, user_data)
// clang-format on

#if HSA_AMD_EXT_API_TABLE_MAJOR_VERSION >= 0x02
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_address_reserve,
hsa_amd_vmem_address_reserve,
hsa_amd_vmem_address_reserve_fn,
@@ -233,15 +232,13 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
size,
address,
flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_address_free,
hsa_amd_vmem_address_free,
hsa_amd_vmem_address_free_fn,
ptr,
size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_handle_create,
hsa_amd_vmem_handle_create,
hsa_amd_vmem_handle_create_fn,
@@ -250,14 +247,12 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
type,
flags,
memory_handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_handle_release,
hsa_amd_vmem_handle_release,
hsa_amd_vmem_handle_release_fn,
memory_handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_map,
hsa_amd_vmem_map,
hsa_amd_vmem_map_fn,
@@ -266,15 +261,13 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
in_offset,
memory_handle,
flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_unmap,
hsa_amd_vmem_unmap,
hsa_amd_vmem_unmap_fn,
va,
size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_set_access,
hsa_amd_vmem_set_access,
hsa_amd_vmem_set_access_fn,
@@ -282,38 +275,33 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
size,
desc,
desc_cnt)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_get_access,
hsa_amd_vmem_get_access,
hsa_amd_vmem_get_access_fn,
va,
perms,
agent_handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_export_shareable_handle,
hsa_amd_vmem_export_shareable_handle,
hsa_amd_vmem_export_shareable_handle_fn,
dmabuf_fd,
handle,
flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_import_shareable_handle,
hsa_amd_vmem_import_shareable_handle,
hsa_amd_vmem_import_shareable_handle_fn,
dmabuf_fd,
handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_retain_alloc_handle,
hsa_amd_vmem_retain_alloc_handle,
hsa_amd_vmem_retain_alloc_handle_fn,
handle,
addr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_get_alloc_properties_from_handle,
hsa_amd_vmem_get_alloc_properties_from_handle,
hsa_amd_vmem_get_alloc_properties_from_handle_fn,
@@ -29,9 +29,9 @@ namespace rocprofiler
{
namespace hsa
{
using activity_functor_t = int (*)(rocprofiler_tracer_activity_domain_t domain,
uint32_t operation_id,
void* data);
using activity_functor_t = int (*)(rocprofiler_service_callback_tracing_kind_t domain,
uint32_t operation_id,
void* data);

using hsa_api_table_t = HsaApiTable;

@@ -44,14 +44,11 @@ struct hsa_table_lookup;
template <size_t Idx>
struct hsa_api_impl
{
template <typename DataT, typename DataArgsT, typename... Args>
static auto phase_enter(DataT& _data, DataArgsT&, Args... args);
template <typename DataArgsT, typename... Args>
static auto set_data_args(DataArgsT&, Args... args);

template <typename DataT, typename... Args>
static auto phase_exit(DataT& _data);

template <typename DataT, typename FuncT, typename... Args>
static auto exec(DataT& _data, FuncT&&, Args&&... args);
template <typename FuncT, typename... Args>
static auto exec(FuncT&&, Args&&... args);

template <typename... Args>
static auto functor(Args&&... args);
@@ -61,39 +58,27 @@ template <size_t Idx>
struct hsa_api_info;

const char*
hsa_api_name(uint32_t id);
name_by_id(uint32_t id);

uint32_t
hsa_api_id_by_name(const char* name);

std::string
hsa_api_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data);

std::string
hsa_api_named_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data);
id_by_name(const char* name);

void
hsa_api_iterate_args(uint32_t id,
const rocprofiler_hsa_trace_data_t& _data,
int (*_func)(const char*, const char*));
iterate_args(uint32_t id,
const rocprofiler_hsa_api_callback_tracer_data_t& data,
rocprofiler_callback_tracing_operation_args_cb_t callback,
void* user_data);

std::vector<const char*>
hsa_api_get_names();
get_names();

std::vector<uint32_t>
hsa_api_get_ids();
get_ids();

void
hsa_api_set_callback(activity_functor_t _func);
set_callback(activity_functor_t _func);

void
update_table(hsa_api_table_t* _orig);
} // namespace hsa
} // namespace rocprofiler

extern "C" {
using on_load_t = bool (*)(HsaApiTable*, uint64_t, uint64_t, const char* const*);

bool
OnLoad(HsaApiTable* table,
uint64_t runtime_version,
uint64_t failed_tool_count,
const char* const* failed_tool_names) ROCPROFILER_PUBLIC_API;
}
@@ -45,70 +45,47 @@ namespace hsa
|
||||
{
|
||||
namespace utils
|
||||
{
|
||||
template <typename Tp, typename Up = Tp, std::enable_if_t<fmt::is_formattable<Tp>::value, int> = 0>
|
||||
std::string
|
||||
stringize_impl(Tp _v, int)
|
||||
{
|
||||
return fmt::format("{}", _v);
|
||||
}
|
||||
|
||||
template <typename Tp>
|
||||
std::string
|
||||
stringize_impl(Tp _v, long)
|
||||
struct is_pair_impl
|
||||
{
|
||||
auto _ss = std::stringstream{};
|
||||
_ss << _v;
|
||||
return _ss.str();
|
||||
}
|
||||
static constexpr auto value = false;
|
||||
};
|
||||
|
||||
template <typename LhsT, typename RhsT>
|
||||
auto
|
||||
stringize_impl(const std::pair<LhsT, RhsT>& _v, int)
|
||||
struct is_pair_impl<std::pair<LhsT, RhsT>>
|
||||
{
|
||||
return std::make_pair(stringize_impl(_v.first, 0), stringize_impl(_v.second, 0));
|
||||
}
|
||||
|
||||
struct join_args
|
||||
{
|
||||
std::string_view prefix = {};
|
||||
std::string_view suffix = {};
|
||||
std::string_view separator = {};
|
||||
static constexpr auto value = true;
|
||||
};
|
||||
|
||||
template <typename Tp>
|
||||
std::string
|
||||
join_impl(const Tp& _v)
|
||||
{
|
||||
return stringize_impl(_v, 0);
|
||||
}
|
||||
struct is_pair : is_pair_impl<std::remove_cv_t<std::remove_reference_t<std::decay_t<Tp>>>>
|
||||
{};
|
||||
|
||||
template <typename LhsT, typename RhsT>
|
||||
std::string
|
||||
join_impl(const std::pair<LhsT, RhsT>& _v)
|
||||
{
|
||||
return fmt::format("{}={}", join_impl(_v.first), join_impl(_v.second));
|
||||
}
|
||||
|
||||
template <typename... Args>
|
||||
template <typename Tp>
|
||||
auto
|
||||
join(join_args ja, Args... args)
|
||||
stringize_impl(const Tp& _v)
|
||||
{
|
||||
auto _content = std::string{};
|
||||
if constexpr(is_pair<Tp>::value)
|
||||
{
|
||||
return std::make_pair(stringize_impl(_v.first), stringize_impl(_v.second));
|
||||
}
|
||||
else if constexpr(fmt::is_formattable<Tp>::value && !std::is_pointer<Tp>::value)
|
||||
{
|
||||
return fmt::format("{}", _v);
|
||||
}
|
||||
else
|
||||
{
|
||||
auto _ss = std::stringstream{};
|
||||
((_ss << ja.separator << join_impl(args)), ...);
|
||||
auto _v = _ss.str();
|
||||
if(_v.length() > ja.separator.length()) _content = _v.substr(2);
|
||||
_ss << _v;
|
||||
return _ss.str();
|
||||
}
|
||||
|
||||
return (std::stringstream{} << ja.prefix << _content << ja.suffix).str();
|
||||
}
|
||||
|
||||
template <typename... Args>
|
||||
auto
|
||||
stringize(Args... args)
|
||||
{
|
||||
return std::vector<std::pair<std::string, std::string>>{stringize_impl(args, 0)...};
|
||||
return std::vector<std::pair<std::string, std::string>>{stringize_impl(args)...};
|
||||
}
|
||||
|
||||
template <typename Tp>
|
||||
|
||||
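The hunk above replaces the overload-ranked `stringize_impl(Tp, int)` / `stringize_impl(Tp, long)` pair with a single function that branches on an `is_pair` trait via `if constexpr`. The following is a minimal, self-contained sketch of that pattern, not the project's code: `to_strings` is a hypothetical name, and `std::ostringstream` stands in for the `fmt` calls so the example needs only the standard library (C++17).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Primary template: an arbitrary type is not a std::pair.
template <typename Tp>
struct is_pair_impl : std::false_type
{};

// Partial specialization: matches std::pair<LhsT, RhsT>.
template <typename LhsT, typename RhsT>
struct is_pair_impl<std::pair<LhsT, RhsT>> : std::true_type
{};

// Strip references and cv-qualifiers before testing the type.
template <typename Tp>
struct is_pair : is_pair_impl<std::remove_cv_t<std::remove_reference_t<Tp>>>
{};

// Hypothetical stand-in for stringize_impl (assumption: stream insertion
// is used for every non-pair type instead of fmt::format).
template <typename Tp>
auto
to_strings(const Tp& value)
{
    if constexpr(is_pair<Tp>::value)
    {
        // Recurse into both halves; yields std::pair<std::string, std::string>.
        return std::make_pair(to_strings(value.first), to_strings(value.second));
    }
    else
    {
        // Anything streamable becomes a std::string.
        auto ss = std::ostringstream{};
        ss << value;
        return ss.str();
    }
}

static_assert(is_pair<const std::pair<int, int>&>::value, "pairs are detected");
static_assert(!is_pair<int>::value, "non-pairs are rejected");
```

Because the discarded `if constexpr` branch is never instantiated, the two branches may return different types, which is what lets one function template cover both the pair and the scalar case.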
@@ -0,0 +1,279 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#include <rocprofiler/fwd.h>
|
||||
#include <rocprofiler/internal_threading.h>
|
||||
#include <rocprofiler/rocprofiler.h>
|
||||
|
||||
#include "lib/common/container/stable_vector.hpp"
|
||||
#include "lib/rocprofiler/buffer.hpp"
|
||||
#include "lib/rocprofiler/context/context.hpp"
|
||||
#include "lib/rocprofiler/internal_threading.hpp"
|
||||
|
||||
#include <cstdint>
|
||||
#include <mutex>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace internal_threading
|
||||
{
|
||||
namespace
|
||||
{
|
||||
template <rocprofiler_internal_thread_library_t... Idx>
|
||||
using library_sequence_t = std::integer_sequence<rocprofiler_internal_thread_library_t, Idx...>;
|
||||
using creation_notifier_cb_t = void (*)(rocprofiler_internal_thread_library_t, void*);
|
||||
using thread_pool_config_t = PTL::ThreadPool::Config;
|
||||
|
||||
// this is used to loop over the different libraries
|
||||
constexpr auto creation_notifier_library_seq = library_sequence_t<ROCPROFILER_LIBRARY,
|
||||
ROCPROFILER_HSA_LIBRARY,
|
||||
ROCPROFILER_HIP_LIBRARY,
|
||||
ROCPROFILER_MARKER_LIBRARY>{};
|
||||
|
||||
// check that creation_notifier_library_seq is up to date
|
||||
static_assert((1 << (creation_notifier_library_seq.size() - 1)) == ROCPROFILER_LIBRARY_LAST,
|
||||
"Update creation_notifier_library_seq to include new libraries");
|
||||
|
||||
// used to distinguish invoking pre vs. post at compile-time
|
||||
enum class notifier_stage
|
||||
{
|
||||
precreation = 0,
|
||||
postcreation,
|
||||
};
|
||||
|
||||
// data structure holding list of callbacks
|
||||
template <rocprofiler_internal_thread_library_t LibT>
|
||||
struct creation_notifier
|
||||
{
|
||||
static constexpr auto value = LibT;
|
||||
|
||||
std::vector<creation_notifier_cb_t> precreate_callbacks = {};
|
||||
std::vector<creation_notifier_cb_t> postcreate_callbacks = {};
|
||||
std::vector<void*> user_data = {};
|
||||
std::mutex mutex = {};
|
||||
};
|
||||
|
||||
// static accessor for creation_notifier instance
|
||||
template <rocprofiler_internal_thread_library_t LibT>
|
||||
auto&
|
||||
get_creation_notifier()
|
||||
{
|
||||
static auto _v = creation_notifier<LibT>{};
|
||||
return _v;
|
||||
}
|
||||
|
||||
// adds callbacks to creation_notifier instance(s)
|
||||
template <rocprofiler_internal_thread_library_t... Idx>
|
||||
void
|
||||
update_creation_notifiers(creation_notifier_cb_t pre,
|
||||
creation_notifier_cb_t post,
|
||||
int libs,
|
||||
void* data,
|
||||
library_sequence_t<Idx...>)
|
||||
{
|
||||
auto update = [pre, post, libs, data](auto& notifier) {
|
||||
if(libs == 0 || ((libs & notifier.value) == notifier.value))
|
||||
{
|
||||
notifier.mutex.lock();
|
||||
notifier.precreate_callbacks.emplace_back(pre);
|
||||
notifier.postcreate_callbacks.emplace_back(post);
|
||||
notifier.user_data.emplace_back(data);
|
||||
notifier.mutex.unlock();
|
||||
}
|
||||
};
|
||||
|
||||
(update(get_creation_notifier<Idx>()), ...);
|
||||
}
|
||||
|
||||
// invokes creation notifiers
|
||||
template <notifier_stage StageT, rocprofiler_internal_thread_library_t... Idx>
|
||||
void
|
||||
execute_creation_notifiers(rocprofiler_internal_thread_library_t libs,
|
||||
std::integer_sequence<rocprofiler_internal_thread_library_t, Idx...>)
|
||||
{
|
||||
auto execute = [libs](auto& notifier) {
|
||||
if(((libs & notifier.value) == notifier.value))
|
||||
{
|
||||
notifier.mutex.lock();
|
||||
if constexpr(StageT == notifier_stage::precreation)
|
||||
{
|
||||
for(size_t i = 0; i < notifier.precreate_callbacks.size(); ++i)
|
||||
{
|
||||
auto itr = notifier.precreate_callbacks.at(i);
|
||||
if(itr) itr(notifier.value, notifier.user_data.at(i));
|
||||
}
|
||||
}
|
||||
else if constexpr(StageT == notifier_stage::postcreation)
|
||||
{
|
||||
for(size_t i = 0; i < notifier.postcreate_callbacks.size(); ++i)
|
||||
{
|
||||
auto itr = notifier.postcreate_callbacks.at(i);
|
||||
if(itr) itr(notifier.value, notifier.user_data.at(i));
|
||||
}
|
||||
}
|
||||
notifier.mutex.unlock();
|
||||
}
|
||||
};
|
||||
|
||||
(execute(get_creation_notifier<Idx>()), ...);
|
||||
}
|
||||
|
||||
auto&
|
||||
get_thread_pools()
|
||||
{
|
||||
static auto _v = thread_pool_vec_t{};
|
||||
return _v;
|
||||
}
|
||||
|
||||
auto&
|
||||
get_task_groups()
|
||||
{
|
||||
static auto _v = task_group_vec_t{};
|
||||
return _v;
|
||||
}
|
||||
} // namespace
|
||||
|
||||
// initialize the default thread pool
|
||||
void
|
||||
initialize()
|
||||
{
|
||||
static auto _once = std::once_flag{};
|
||||
std::call_once(_once, create_callback_thread);
|
||||
}
|
||||
|
||||
// sync all the task groups and destroy the thread pools
|
||||
void
|
||||
finalize()
|
||||
{
|
||||
for(auto& itr : get_task_groups())
|
||||
{
|
||||
if(itr) itr->join();
|
||||
}
|
||||
|
||||
for(auto& itr : get_thread_pools())
|
||||
{
|
||||
if(itr) itr->destroy_threadpool();
|
||||
}
|
||||
|
||||
for(auto& itr : get_task_groups())
|
||||
itr.reset();
|
||||
|
||||
for(auto& itr : get_thread_pools())
|
||||
itr.reset();
|
||||
}
|
||||
|
||||
void
|
||||
notify_pre_internal_thread_create(rocprofiler_internal_thread_library_t libs)
|
||||
{
|
||||
execute_creation_notifiers<notifier_stage::precreation>(libs, creation_notifier_library_seq);
|
||||
}
|
||||
|
||||
void
|
||||
notify_post_internal_thread_create(rocprofiler_internal_thread_library_t libs)
|
||||
{
|
||||
execute_creation_notifiers<notifier_stage::postcreation>(libs, creation_notifier_library_seq);
|
||||
}
|
||||
|
||||
rocprofiler_callback_thread_t
|
||||
create_callback_thread()
|
||||
{
|
||||
// notify that rocprofiler library is about to create an internal thread
|
||||
notify_pre_internal_thread_create(ROCPROFILER_LIBRARY);
|
||||
|
||||
// this will be the index of the new thread pool after emplace_back
|
||||
auto idx = get_thread_pools().size();
|
||||
|
||||
auto& thr_pool = get_thread_pools().emplace_back(
|
||||
new thread_pool_t{thread_pool_config_t{.pool_size = 1}}, [](thread_pool_t* v) {
|
||||
v->destroy_threadpool();
|
||||
delete v;
|
||||
});
|
||||
|
||||
// construct the task group to use the newly created thread pool
|
||||
get_task_groups().emplace_back(new task_group_t{thr_pool.get()});
|
||||
|
||||
// notify that rocprofiler library finished creating an internal thread
|
||||
notify_post_internal_thread_create(ROCPROFILER_LIBRARY);
|
||||
|
||||
return rocprofiler_callback_thread_t{idx};
|
||||
}
|
||||
|
||||
// returns the task group for the given callback thread identifier
|
||||
task_group_t*
|
||||
get_task_group(rocprofiler_callback_thread_t cb_tid)
|
||||
{
|
||||
return get_task_groups().at(cb_tid.handle).get();
|
||||
}
|
||||
} // namespace internal_threading
|
||||
} // namespace rocprofiler
|
||||
|
||||
extern "C" {
|
||||
rocprofiler_status_t
|
||||
rocprofiler_at_internal_thread_create(rocprofiler_internal_thread_library_cb_t precreate,
|
||||
rocprofiler_internal_thread_library_cb_t postcreate,
|
||||
int libs,
|
||||
void* data)
|
||||
{
|
||||
rocprofiler::internal_threading::update_creation_notifiers(
|
||||
precreate,
|
||||
postcreate,
|
||||
libs,
|
||||
data,
|
||||
rocprofiler::internal_threading::creation_notifier_library_seq);
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_create_callback_thread(rocprofiler_callback_thread_t* cb_thread_id)
|
||||
{
|
||||
rocprofiler::internal_threading::initialize();
|
||||
|
||||
auto cb_tid = rocprofiler::internal_threading::create_callback_thread();
|
||||
if(cb_tid.handle > 0)
|
||||
{
|
||||
*cb_thread_id = cb_tid;
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
return ROCPROFILER_STATUS_ERROR;
|
||||
}
|
||||
|
||||
rocprofiler_status_t ROCPROFILER_API
|
||||
rocprofiler_assign_callback_thread(rocprofiler_buffer_id_t buffer_id,
|
||||
rocprofiler_callback_thread_t cb_thread_id)
|
||||
{
|
||||
if(cb_thread_id.handle >= rocprofiler::internal_threading::get_task_groups().size())
|
||||
return ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND;
|
||||
|
||||
for(auto& bitr : rocprofiler::buffer::get_buffers())
|
||||
{
|
||||
if(bitr && bitr->buffer_id == buffer_id.handle)
|
||||
{
|
||||
bitr->task_group_id = cb_thread_id.handle;
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
}
|
||||
return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
|
||||
}
|
||||
}
|
||||
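The file above keeps, for each library, a list of pre- and post-creation callbacks and invokes them when the bitmask passed to the notifier includes that library (a mask of 0 at registration time means "all libraries"). A simplified sketch of the same idea follows. It is an illustration under stated assumptions rather than the implementation above: `library_t`, `notifier_entry`, and `notifier_registry` are hypothetical names, a single list with a stored mask replaces the per-library template instances, and `std::lock_guard` replaces the manual `lock()` / `unlock()` calls.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical bit flags standing in for rocprofiler_internal_thread_library_t.
enum library_t : int
{
    LIB_CORE   = (1 << 0),
    LIB_HSA    = (1 << 1),
    LIB_HIP    = (1 << 2),
    LIB_MARKER = (1 << 3),
};

using notifier_cb_t = void (*)(library_t, void*);

struct notifier_entry
{
    int           libs;  // bitmask of libraries; 0 means "all libraries"
    notifier_cb_t pre;   // called before the thread is created (may be null)
    notifier_cb_t post;  // called after the thread is created (may be null)
    void*         data;  // user data handed back to the callbacks
};

class notifier_registry
{
public:
    void add(notifier_cb_t pre, notifier_cb_t post, int libs, void* data)
    {
        std::lock_guard<std::mutex> lk{m_mutex};
        m_entries.push_back(notifier_entry{libs, pre, post, data});
    }

    // Invoke the pre- or post-creation callbacks registered for `lib`.
    // The lock is held while callbacks run, so a callback must not call add().
    void notify(library_t lib, bool before_creation)
    {
        std::lock_guard<std::mutex> lk{m_mutex};
        for(const auto& entry : m_entries)
        {
            if(entry.libs != 0 && (entry.libs & lib) != lib) continue;
            auto cb = before_creation ? entry.pre : entry.post;
            if(cb) cb(lib, entry.data);
        }
    }

private:
    std::mutex                  m_mutex;
    std::vector<notifier_entry> m_entries;
};
```

Storing the callback and its user data in one entry avoids keeping three parallel vectors in step, at the cost of checking the mask on every notification instead of once at registration.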
@@ -0,0 +1,66 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#pragma once
|
||||
|
||||
#include <rocprofiler/internal_threading.h>
|
||||
|
||||
#include "lib/common/container/stable_vector.hpp"
|
||||
#include "lib/common/defines.hpp"
|
||||
|
||||
#include <PTL/TaskGroup.hh>
|
||||
#include <PTL/ThreadPool.hh>
|
||||
|
||||
#include <cstdint>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace internal_threading
|
||||
{
|
||||
using thread_pool_t = PTL::ThreadPool;
|
||||
using task_group_t = PTL::TaskGroup<void>;
|
||||
using unique_thread_pool_t = std::unique_ptr<thread_pool_t, void (*)(thread_pool_t*)>;
|
||||
using unique_task_group_t = std::unique_ptr<task_group_t>;
|
||||
using thread_pool_vec_t = std::vector<unique_thread_pool_t>;
|
||||
using task_group_vec_t = std::vector<unique_task_group_t>;
|
||||
|
||||
void notify_pre_internal_thread_create(rocprofiler_internal_thread_library_t);
|
||||
void notify_post_internal_thread_create(rocprofiler_internal_thread_library_t);
|
||||
|
||||
// initialize the default thread pool
|
||||
void
|
||||
initialize();
|
||||
|
||||
// sync all the task groups and destroy the thread pools
|
||||
void
|
||||
finalize();
|
||||
|
||||
// creates a new callback thread (a single-thread pool and its task group)
|
||||
rocprofiler_callback_thread_t
|
||||
create_callback_thread();
|
||||
|
||||
// returns the task group for the given callback thread identifier
|
||||
task_group_t* get_task_group(rocprofiler_callback_thread_t);
|
||||
} // namespace internal_threading
|
||||
} // namespace rocprofiler
|
||||
@@ -0,0 +1,556 @@
|
||||
// MIT License
|
||||
//
|
||||
// Copyright (c) 2023 ROCm Developer Tools
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
|
||||
#include "lib/rocprofiler/registration.hpp"
|
||||
#include "lib/rocprofiler/context/context.hpp"
|
||||
#include "lib/rocprofiler/hsa/hsa.hpp"
|
||||
#include "lib/rocprofiler/internal_threading.hpp"
|
||||
|
||||
#include <rocprofiler/context.h>
|
||||
#include <rocprofiler/fwd.h>
|
||||
#include <rocprofiler/hsa.h>
|
||||
#include <rocprofiler/version.h>
|
||||
|
||||
#include <fmt/format.h>
|
||||
#include <glog/logging.h>
|
||||
|
||||
#include <dlfcn.h>
|
||||
#include <link.h>
|
||||
#include <unistd.h>
|
||||
#include <atomic>
|
||||
#include <cstdint>
|
||||
#include <fstream>
|
||||
#include <iostream>
|
||||
#include <memory>
|
||||
#include <mutex>
|
||||
#include <stdexcept>
|
||||
#include <string>
|
||||
#include <string_view>
|
||||
#include <thread>
|
||||
#include <unordered_set>
|
||||
#include <vector>
|
||||
|
||||
extern "C" {
|
||||
#pragma weak rocprofiler_configure
|
||||
|
||||
extern rocprofiler_tool_configure_result_t*
|
||||
rocprofiler_configure(uint32_t, const char*, uint32_t, rocprofiler_client_id_t*);
|
||||
}
|
||||
|
||||
namespace rocprofiler
|
||||
{
|
||||
namespace registration
|
||||
{
|
||||
namespace
|
||||
{
|
||||
auto&
|
||||
get_status()
|
||||
{
|
||||
static auto _v = std::pair<std::atomic<int>, std::atomic<int>>{0, 0};
|
||||
return _v;
|
||||
}
|
||||
|
||||
auto&
|
||||
get_invoked_configures()
|
||||
{
|
||||
static auto _v = std::unordered_set<rocprofiler_configure_func_t>{};
|
||||
return _v;
|
||||
}
|
||||
|
||||
auto&
|
||||
get_forced_configure()
|
||||
{
|
||||
static rocprofiler_configure_func_t _v = nullptr;
|
||||
return _v;
|
||||
}
|
||||
|
||||
void
|
||||
init_logging()
|
||||
{
|
||||
static auto _once = std::once_flag{};
|
||||
std::call_once(_once, []() {
|
||||
auto get_argv0 = []() {
|
||||
auto ifs = std::ifstream{"/proc/self/cmdline"};
|
||||
auto sarg = std::string{};
|
||||
while(ifs && !ifs.eof())
|
||||
{
|
||||
ifs >> sarg;
|
||||
if(!sarg.empty()) break;
|
||||
}
|
||||
return sarg;
|
||||
};
|
||||
|
||||
static auto argv0 = get_argv0();
|
||||
google::InitGoogleLogging(argv0.c_str());
|
||||
LOG(INFO) << "logging initialized";
|
||||
});
|
||||
}
|
||||
|
||||
std::vector<std::string>
|
||||
get_link_map()
|
||||
{
|
||||
auto chain = std::vector<std::string>{};
|
||||
void* handle = nullptr;
|
||||
handle = dlopen(nullptr, RTLD_LAZY | RTLD_NOLOAD);
|
||||
|
||||
if(handle)
|
||||
{
|
||||
struct link_map* link_map_v = nullptr;
|
||||
dlinfo(handle, RTLD_DI_LINKMAP, &link_map_v);
|
||||
struct link_map* next_link = link_map_v->l_next;
|
||||
while(next_link)
|
||||
{
|
||||
if(next_link->l_name != nullptr && !std::string_view{next_link->l_name}.empty())
|
||||
{
|
||||
chain.emplace_back(next_link->l_name);
|
||||
}
|
||||
next_link = next_link->l_next;
|
||||
}
|
||||
}
|
||||
|
||||
return chain;
|
||||
}
|
||||
|
||||
struct client_library
|
||||
{
|
||||
std::string name = {};
|
||||
void* dlhandle = nullptr;
|
||||
decltype(::rocprofiler_configure)* configure_func = nullptr;
|
||||
std::unique_ptr<rocprofiler_tool_configure_result_t> configure_result = {};
|
||||
rocprofiler_client_id_t internal_client_id = {};
|
||||
rocprofiler_client_id_t mutable_client_id = {};
|
||||
};
|
||||
|
||||
std::vector<client_library>
|
||||
find_clients()
|
||||
{
|
||||
auto data = std::vector<client_library>{};
|
||||
|
||||
if(get_forced_configure())
|
||||
{
|
||||
data.emplace_back(client_library{"(forced)", nullptr, get_forced_configure()});
|
||||
}
|
||||
|
||||
if(!rocprofiler_configure && !get_forced_configure())
|
||||
{
|
||||
LOG(ERROR) << "no rocprofiler_configure function found";
|
||||
return data;
|
||||
}
|
||||
|
||||
if(rocprofiler_configure != &rocprofiler_configure)
|
||||
throw std::runtime_error("rocprofiler_configure != &rocprofiler_configure");
|
||||
|
||||
if(&rocprofiler_configure != get_forced_configure())
|
||||
data.emplace_back(client_library{"unknown", nullptr, &rocprofiler_configure});
|
||||
|
||||
for(const auto& itr : get_link_map())
|
||||
{
|
||||
LOG(INFO) << "searching " << itr << " for rocprofiler_configure";
|
||||
|
||||
void* handle = dlopen(itr.c_str(), RTLD_LAZY | RTLD_NOLOAD);
|
||||
LOG_IF(ERROR, handle == nullptr) << "error dlopening " << itr;
|
||||
|
||||
decltype(::rocprofiler_configure)* _sym = nullptr;
|
||||
*(void**) (&_sym) = dlsym(handle, "rocprofiler_configure");
|
||||
|
||||
// skip the configure function that was forced
|
||||
if(_sym == get_forced_configure())
|
||||
{
|
||||
data.front().name = itr;
|
||||
data.front().dlhandle = handle;
|
||||
data.front().internal_client_id.name = "(forced)";
|
||||
continue;
|
||||
}
|
||||
|
||||
if(!_sym)
|
||||
{
|
||||
LOG(INFO) << "|_" << itr << " did not contain rocprofiler_configure symbol";
|
||||
continue;
|
||||
}
|
||||
|
||||
if(_sym == &rocprofiler_configure && data.size() == 1)
|
||||
{
|
||||
data.front().name = itr;
|
||||
data.front().dlhandle = handle;
|
||||
data.front().internal_client_id.name = "default";
|
||||
}
|
||||
else
|
||||
{
|
||||
uint32_t _prio = data.size();
|
||||
auto& entry =
|
||||
data.emplace_back(client_library{itr,
|
||||
handle,
|
||||
_sym,
|
||||
nullptr,
|
||||
rocprofiler_client_id_t{nullptr, _prio},
|
||||
rocprofiler_client_id_t{nullptr, _prio}});
|
||||
entry.internal_client_id.name = entry.name.c_str();
|
||||
}
|
||||
}
|
||||
|
||||
LOG(ERROR) << __FUNCTION__ << " found " << data.size() << " clients";
|
||||
|
||||
return data;
|
||||
}
|
||||
|
||||
std::vector<client_library>&
|
||||
get_clients()
|
||||
{
|
||||
static auto _v = find_clients();
|
||||
return _v;
|
||||
}
|
||||
|
||||
using mutex_t = std::recursive_mutex;
|
||||
using scoped_lock_t = std::unique_lock<mutex_t>;
|
||||
|
||||
mutex_t&
|
||||
get_registration_mutex()
|
||||
{
|
||||
static auto _v = mutex_t{};
|
||||
return _v;
|
||||
}
|
||||
} // namespace
|
||||
|
||||
int
|
||||
get_init_status()
|
||||
{
|
||||
return get_status().first.load(std::memory_order_acquire);
|
||||
}
|
||||
|
||||
int
|
||||
get_fini_status()
|
||||
{
|
||||
return get_status().second.load(std::memory_order_acquire);
|
||||
}
|
||||
|
||||
void
|
||||
set_init_status(int v)
|
||||
{
|
||||
get_status().first.store(v, std::memory_order_release);
|
||||
}
|
||||
|
||||
void
|
||||
set_fini_status(int v)
|
||||
{
|
||||
get_status().second.store(v, std::memory_order_release);
|
||||
}
|
||||
|
||||
bool
|
||||
invoke_client_configures()
|
||||
{
|
||||
if(get_init_status() > 0) return false;
|
||||
|
||||
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
|
||||
if(_lk.owns_lock()) return false;
|
||||
_lk.lock();
|
||||
|
||||
LOG(ERROR) << __FUNCTION__;
|
||||
|
||||
size_t prio = 0;
|
||||
for(auto& itr : get_clients())
|
||||
{
|
||||
if(get_invoked_configures().find(itr.configure_func) != get_invoked_configures().end())
|
||||
{
|
||||
LOG(ERROR) << "rocprofiler::registration::invoke_client_configures() attempted to "
|
||||
"invoke configure function from "
|
||||
<< itr.name << " (addr="
|
||||
<< fmt::format("{:#018x}", reinterpret_cast<uint64_t>(itr.configure_func))
|
||||
<< ") more than once";
|
||||
continue;
|
||||
}
|
||||
else
|
||||
{
|
||||
LOG(INFO) << "rocprofiler::registration::invoke_client_configures() invoking configure "
|
||||
"function from "
|
||||
<< itr.name << " (addr="
|
||||
<< fmt::format("{:#018x}", reinterpret_cast<uint64_t>(itr.configure_func))
|
||||
<< ")";
|
||||
}
|
||||
|
||||
auto* _result = itr.configure_func(
|
||||
ROCPROFILER_VERSION, ROCPROFILER_VERSION_STRING, prio++, &itr.mutable_client_id);
|
||||
if(_result)
|
||||
itr.configure_result = std::make_unique<rocprofiler_tool_configure_result_t>(*_result);
|
||||
|
||||
get_invoked_configures().emplace(itr.configure_func);
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
bool
|
||||
invoke_client_initializers()
|
||||
{
|
||||
if(get_init_status() > 0) return false;
|
||||
|
||||
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
|
||||
if(_lk.owns_lock()) return false;
|
||||
_lk.lock();
|
||||
|
||||
LOG(ERROR) << __FUNCTION__;
|
||||
|
||||
set_init_status(-1);
|
||||
for(auto& itr : get_clients())
|
||||
{
|
||||
if(itr.configure_result && itr.configure_result->initialize)
|
||||
{
|
||||
context::push_client(itr.internal_client_id.handle);
|
||||
itr.configure_result->initialize(&invoke_client_finalizer,
|
||||
itr.configure_result->tool_data);
|
||||
context::pop_client(itr.internal_client_id.handle);
|
||||
// set to nullptr so initialize only gets called once
|
||||
itr.configure_result->initialize = nullptr;
|
||||
}
|
||||
}
|
||||
|
||||
// initialization is no longer available
|
||||
set_init_status(1);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
bool
|
||||
invoke_client_finalizers()
|
||||
{
|
||||
if(get_fini_status() > 0) return false;
|
||||
|
||||
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
|
||||
if(_lk.owns_lock()) return false;
|
||||
_lk.lock();
|
||||
|
||||
set_fini_status(-1);
|
||||
for(auto& itr : get_clients())
|
||||
{
|
||||
if(itr.configure_result && itr.configure_result->finalize)
|
||||
{
|
||||
itr.configure_result->finalize(itr.configure_result->tool_data);
|
||||
// set to nullptr so finalize only gets called once
|
||||
itr.configure_result->finalize = nullptr;
|
||||
}
|
||||
}
|
||||
|
||||
set_fini_status(1);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
bool
|
||||
invoke_client_initializer(rocprofiler_client_id_t client_id)
|
||||
{
|
||||
if(get_init_status() > 0) return false;
|
||||
|
||||
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
|
||||
if(_lk.owns_lock()) return false;
|
||||
_lk.lock();
|
||||
|
||||
// save the original status
|
||||
auto _restore_status = get_init_status();
|
||||
set_init_status(-1);
|
||||
for(auto& itr : get_clients())
|
||||
{
|
||||
if(itr.internal_client_id.handle == client_id.handle &&
|
||||
itr.mutable_client_id.handle == client_id.handle)
|
||||
{
|
||||
if(itr.configure_result && itr.configure_result->initialize)
|
||||
{
|
||||
context::push_client(itr.internal_client_id.handle);
|
||||
itr.configure_result->initialize(&invoke_client_finalizer,
|
||||
itr.configure_result->tool_data);
|
||||
context::pop_client(itr.internal_client_id.handle);
|
||||
// set to nullptr so initialize only gets called once
|
||||
itr.configure_result->initialize = nullptr;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// we don't want the explicit client initialization to set the init status to 1
|
||||
// we just want to restore what it previously was
|
||||
set_init_status(_restore_status);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
void
|
||||
invoke_client_finalizer(rocprofiler_client_id_t client_id)
|
||||
{
|
||||
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
|
||||
if(_lk.owns_lock()) return;
|
||||
_lk.lock();
|
||||
|
||||
for(auto& itr : get_clients())
|
||||
{
|
||||
if(itr.internal_client_id.handle == client_id.handle &&
|
||||
itr.mutable_client_id.handle == client_id.handle)
|
||||
{
|
||||
if(itr.configure_result && itr.configure_result->finalize)
|
||||
{
|
||||
itr.configure_result->finalize(itr.configure_result->tool_data);
|
||||
// set to nullptr so finalize only gets called once
|
||||
itr.configure_result->finalize = nullptr;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void
|
||||
initialize()
|
||||
{
|
||||
static auto _once = std::once_flag{};
|
||||
static auto _ready = std::atomic<bool>{false};
|
||||
|
||||
std::call_once(_once, []() {
|
||||
init_logging();
|
||||
invoke_client_configures();
|
||||
invoke_client_initializers();
|
||||
internal_threading::initialize();
|
||||
std::atexit(&finalize);
|
||||
_ready.store(true, std::memory_order_release);
|
||||
});
|
||||
|
||||
if(!_ready.load(std::memory_order_acquire))
|
||||
{
|
||||
while(!_ready.load(std::memory_order_acquire))
|
||||
std::this_thread::yield();
|
||||
}
|
||||
}
|
||||
|
||||
void
|
||||
finalize()
|
||||
{
|
||||
hsa_shut_down();
|
||||
invoke_client_finalizers();
|
||||
for(auto& itr : rocprofiler::context::get_active_contexts())
|
||||
itr.store(nullptr, std::memory_order_seq_cst);
|
||||
internal_threading::finalize();
|
||||
}
|
||||
} // namespace registration
|
||||
} // namespace rocprofiler
|
||||
|
||||
extern "C" {
|
||||
rocprofiler_status_t
|
||||
rocprofiler_is_initialized(int* status)
|
||||
{
|
||||
*status = rocprofiler::registration::get_init_status();
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_is_finalized(int* status)
|
||||
{
|
||||
*status = rocprofiler::registration::get_fini_status();
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
rocprofiler_status_t
|
||||
rocprofiler_force_configure(rocprofiler_configure_func_t configure_func)
|
||||
{
|
||||
auto& forced_config = rocprofiler::registration::get_forced_configure();
|
||||
|
||||
// init status may be -1 (currently initializing) or 1 (already initialized).
|
||||
// in either case, ignore this call and report that the configuration is locked
|
||||
if(rocprofiler::registration::get_init_status() != 0)
|
||||
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
|
||||
|
||||
// if another tool forced configure, the init status should be 1, but
|
||||
// let's just make sure that the forced configure function is a nullptr
|
||||
if(forced_config) return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
|
||||
|
||||
forced_config = configure_func;
|
||||
rocprofiler::registration::initialize();
|
||||
|
||||
return ROCPROFILER_STATUS_SUCCESS;
|
||||
}
|
||||
|
||||
int
|
||||
rocprofiler_set_api_table(const char* name,
|
||||
uint64_t lib_version,
|
||||
uint64_t lib_instance,
|
||||
void** tables,
|
||||
uint64_t num_tables)
|
||||
{
|
||||
static auto _once = std::once_flag{};
|
||||
std::call_once(_once, rocprofiler::registration::initialize);
|
||||
|
||||
// validate the arguments common to all libraries
|
||||
LOG_IF(ERROR, num_tables == 0) << " rocprofiler expected " << name
|
||||
<< " library to pass at least one table, not " << num_tables;
|
||||
LOG_IF(ERROR, tables == nullptr) << " rocprofiler expected pointer to array of tables from "
|
||||
<< name << " library, not a nullptr";
|
||||
|
||||
if(std::string_view{name} == "hip")
|
||||
{
|
||||
// pass to hip init
|
||||
LOG_IF(ERROR, num_tables > 1)
|
||||
<< " rocprofiler expected HIP library to pass 1 API table, not " << num_tables;
|
||||
}
|
||||
else if(std::string_view{name} == "hsa")
|
||||
{
|
||||
// pass to hsa init
|
||||
LOG_IF(ERROR, num_tables > 1)
|
||||
<< " rocprofiler expected HSA library to pass 1 API table, not " << num_tables;
|
||||
|
||||
auto* hsa_api_table = static_cast<HsaApiTable*>(*tables);
|
||||
auto& saved_hsa_api_table = rocprofiler::hsa::get_table();
|
||||
::copyTables(hsa_api_table, &saved_hsa_api_table);
|
||||
|
||||
rocprofiler::hsa::update_table(hsa_api_table);
|
||||
}
|
||||
else if(std::string_view{name} == "roctx")
|
||||
{
|
||||
// pass to roctx init
|
||||
LOG_IF(ERROR, num_tables > 1)
|
||||
<< " rocprofiler expected ROCTX library to pass 1 API table, not " << num_tables;
|
||||
}
|
||||
else
|
||||
{
|
||||
LOG(ERROR) << "rocprofiler does not accept API tables from " << name;
|
||||
LOG_ASSERT(false) << " rocprofiler does not accept API tables from " << name;
|
||||
}
|
||||
|
||||
(void) lib_version;
|
||||
(void) lib_instance;
|
||||
(void) tables;
|
||||
(void) num_tables;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
bool
OnLoad(HsaApiTable*       table,
       uint64_t           runtime_version,
       uint64_t           failed_tool_count,
       const char* const* failed_tool_names)
{
    rocprofiler::registration::init_logging();

    (void) runtime_version;
    (void) failed_tool_count;
    (void) failed_tool_names;

    fprintf(stderr, "[%s:%i] %s\n", __FILE__, __LINE__, __FUNCTION__);

    void* table_v = static_cast<void*>(table);
    rocprofiler_set_api_table("hsa", runtime_version, 0, &table_v, 1);

    return true;
}
}
@@ -0,0 +1,95 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.

#pragma once

#include <rocprofiler/registration.h>
#include "lib/common/defines.hpp"

#include <cstdint>
#include <string>
#include <vector>

extern "C" {
struct HsaApiTable;

using on_load_t = bool (*)(HsaApiTable*, uint64_t, uint64_t, const char* const*);

bool
OnLoad(HsaApiTable*       table,
       uint64_t           runtime_version,
       uint64_t           failed_tool_count,
       const char* const* failed_tool_names) ROCPROFILER_PUBLIC_API;

// this is the "hidden" function that rocprofiler-register invokes to pass
// the API tables to rocprofiler
int
rocprofiler_set_api_table(const char* name,
                          uint64_t    lib_version,
                          uint64_t    lib_instance,
                          void**      tables,
                          uint64_t    num_tables) ROCPROFILER_PUBLIC_API;
}

namespace rocprofiler
{
namespace registration
{
// initialize the clients
void
initialize();

// finalize the clients
void
finalize();

// invoke all rocprofiler_configure symbols
bool
invoke_client_configures();

// invoke initialize functions returned from rocprofiler_configure
bool
invoke_client_initializers();

// invoke finalize functions returned from rocprofiler_configure
bool
invoke_client_finalizers();

// explicitly invoke the initialize function of a specific client
bool invoke_client_initializer(rocprofiler_client_id_t);

// explicitly invoke the finalize function of a specific client
void invoke_client_finalizer(rocprofiler_client_id_t);

int
get_init_status();

int
get_fini_status();

void
set_init_status(int);

void
set_fini_status(int);
} // namespace registration
} // namespace rocprofiler
@@ -20,9 +20,16 @@
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.

#include <rocprofiler/fwd.h>
#include <rocprofiler/rocprofiler.h>

#include <algorithm>
#include "lib/common/utility.hpp"
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/context/domain.hpp"
#include "lib/rocprofiler/hsa/hsa.hpp"
#include "lib/rocprofiler/registration.hpp"

#include <atomic>
#include <vector>

namespace
@@ -34,6 +41,22 @@ consume_args(Tp&&...)
} // namespace

extern "C" {
rocprofiler_status_t
rocprofiler_get_version(uint32_t* major, uint32_t* minor, uint32_t* patch)
{
    if(major) *major = ROCPROFILER_VERSION_MAJOR;
    if(minor) *minor = ROCPROFILER_VERSION_MINOR;
    if(patch) *patch = ROCPROFILER_VERSION_PATCH;
    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_get_timestamp(rocprofiler_timestamp_t* ts)
{
    *ts = rocprofiler::common::timestamp_ns();
    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback,
                                   size_t                            agent_size,
@@ -76,54 +99,6 @@ rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback,
    return callback(_agents.data(), _agents.size(), user_data);
}

rocprofiler_status_t
rocprofiler_create_context(rocprofiler_context_id_t* context_id)
{
    consume_args(context_id);
    return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}

rocprofiler_status_t
rocprofiler_start_context(rocprofiler_context_id_t context_id)
{
    consume_args(context_id);
    return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}

rocprofiler_status_t
rocprofiler_stop_context(rocprofiler_context_id_t context_id)
{
    consume_args(context_id);
    return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}

rocprofiler_status_t
rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id)
{
    consume_args(buffer_id);
    return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}

rocprofiler_status_t
rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id)
{
    consume_args(buffer_id);
    return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}

rocprofiler_status_t
rocprofiler_create_buffer(rocprofiler_context_id_t      context,
                          size_t                        size,
                          size_t                        watermark,
                          rocprofiler_buffer_policy_t   action,
                          rocprofiler_buffer_callback_t callback,
                          void*                         callback_data,
                          rocprofiler_buffer_id_t*      buffer_id)
{
    consume_args(context, size, watermark, action, callback, callback_data, buffer_id);
    return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}

rocprofiler_status_t
rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t context_id,
                                          rocprofiler_agent_t agent,
@@ -132,6 +107,9 @@ rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t conte
                                          uint64_t interval,
                                          rocprofiler_buffer_id_t buffer_id)
{
    if(rocprofiler::registration::get_init_status() > 0)
        return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;

    consume_args(context_id, agent, method, unit, interval, buffer_id);
    return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}

@@ -1,701 +0,0 @@

#include <rocprofiler/config.h>
#include <rocprofiler/rocprofiler.h>

#include "config_helpers.hpp"
#include "config_internal.hpp"

#include <roctracer/roctx.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <atomic>
#include <cstddef>
#include <iostream>
#include <mutex>

#include <hsa/hsa.h>
#include <hsa/hsa_api_trace.h>
#include <hsa/hsa_ext_amd.h>
#include <hsa/hsa_ext_image.h>

typedef enum
{
    ACTIVITY_API_PHASE_ENTER = 0,
    ACTIVITY_API_PHASE_EXIT  = 1
} activity_api_phase_t;

typedef struct roctx_api_data_s
{
    union
    {
        struct
        {
            const char*      message;
            roctx_range_id_t id;
        };
        struct
        {
            const char* message;
        } roctxMarkA;
        struct
        {
            const char* message;
        } roctxRangePushA;
        struct
        {
            const char* message;
        } roctxRangePop;
        struct
        {
            const char*      message;
            roctx_range_id_t id;
        } roctxRangeStartA;
        struct
        {
            const char*      message;
            roctx_range_id_t id;
        } roctxRangeStop;
    } args;
} roctx_api_data_t;

// helper macros ensuring C and C++ structs adhere to specific naming convention
#define ROCP_PUBLIC_CONFIG(TYPE)  ::rocprofiler_##TYPE
#define ROCP_PRIVATE_CONFIG(TYPE) ::rocprofiler::internal::TYPE

// Below asserts at compile time that the external C object has the same size as internal
// C++ object, e.g.,
// sizeof(rocprofiler_domain_config) == sizeof(rocprofiler::internal::domain_config)
#define ROCP_ASSERT_CONFIG_ABI(TYPE) \
    static_assert(sizeof(ROCP_PUBLIC_CONFIG(TYPE)) == sizeof(ROCP_PRIVATE_CONFIG(TYPE)), \
                  "Error! rocprofiler_" #TYPE " ABI error");

// Below asserts at compile time that the external C struct members have the same offset as
// internal C++ struct members
#define ROCP_ASSERT_CONFIG_OFFSET_ABI(TYPE, PUB_FIELD, PRIV_FIELD) \
    static_assert(offsetof(ROCP_PUBLIC_CONFIG(TYPE), PUB_FIELD) == \
                      offsetof(ROCP_PRIVATE_CONFIG(TYPE), PRIV_FIELD), \
                  "Error! rocprofiler_" #TYPE "." #PUB_FIELD " ABI offset error"); \
    static_assert(sizeof(ROCP_PUBLIC_CONFIG(TYPE)::PUB_FIELD) == \
                      sizeof(ROCP_PRIVATE_CONFIG(TYPE)::PRIV_FIELD), \
                  "Error! rocprofiler_" #TYPE "." #PUB_FIELD " ABI size error");

// this defines a template specialization for ensuring that the reinterpret_cast is only
// applied between public C structs and private C++ structs which are compatible.
#define ROCP_DEFINE_API_CAST_IMPL(INPUT_TYPE, OUTPUT_TYPE) \
    namespace traits \
    { \
    template <> \
    struct api_cast<INPUT_TYPE> \
    { \
        using input_type  = INPUT_TYPE; \
        using output_type = OUTPUT_TYPE; \
        \
        output_type* operator()(input_type* _v) const \
        { \
            return reinterpret_cast<output_type*>(_v); \
        } \
        \
        const output_type* operator()(const input_type* _v) const \
        { \
            return reinterpret_cast<const output_type*>(_v); \
        } \
    }; \
    }

// define C -> C++ and C++ -> C casting rules
#define ROCP_DEFINE_API_CAST_D(TYPE) \
    ROCP_DEFINE_API_CAST_IMPL(ROCP_PUBLIC_CONFIG(TYPE), ROCP_PRIVATE_CONFIG(TYPE)) \
    ROCP_DEFINE_API_CAST_IMPL(ROCP_PRIVATE_CONFIG(TYPE), ROCP_PUBLIC_CONFIG(TYPE))

// use only when C++ struct is just an alias for C struct
#define ROCP_DEFINE_API_CAST_S(TYPE) \
    ROCP_DEFINE_API_CAST_IMPL(ROCP_PUBLIC_CONFIG(TYPE), ROCP_PRIVATE_CONFIG(TYPE))

namespace
{
namespace traits
{
// left undefined to ensure template specialization
template <typename PublicT>
struct api_cast;

// ensure api_cast<decltype(a)> where decltype(a) is const Tp equates to api_cast<Tp>
template <typename PublicT>
struct api_cast<const PublicT> : api_cast<PublicT>
{};

// ensure api_cast<decltype(a)> where decltype(a) is Tp& equates to api_cast<Tp>
template <typename PublicT>
struct api_cast<PublicT&> : api_cast<PublicT>
{};

// ensure api_cast<decltype(a)> where decltype(a) is Tp* equates to api_cast<Tp>
template <typename PublicT>
struct api_cast<PublicT*> : api_cast<PublicT>
{};
} // namespace traits

// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
//
// SEE BELOW! VERY IMPORTANT!
//
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
//
// EVERY NEW CONFIG AND ALL OF ITS MEMBER FIELDS NEED TO HAVE THESE COMPILE TIME CHECKS!
//
// these checks verify the two structs have the same size and that each
// member field has the same size and offset into the struct
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

ROCP_ASSERT_CONFIG_ABI(config)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, size, size)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, compat_version, compat_version)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, api_version, api_version)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, reserved0, context_idx)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, user_data, user_data)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, buffer, buffer)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, domain, domain)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, filter, filter)

ROCP_ASSERT_CONFIG_ABI(domain_config)
ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, callback, user_sync_callback)
ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, reserved0, domains)
ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, reserved1, opcodes)

ROCP_ASSERT_CONFIG_ABI(buffer_config)
ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, callback, callback)
ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, buffer_size, buffer_size)
// ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, reserved0, buffer)
ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, reserved1, buffer_idx)

ROCP_DEFINE_API_CAST_D(config)
ROCP_DEFINE_API_CAST_D(domain_config)
ROCP_DEFINE_API_CAST_D(buffer_config)
ROCP_DEFINE_API_CAST_S(filter_config)

// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
//
// SEE ABOVE! VERY IMPORTANT!
//
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

/// use this to ensure that reinterpret_cast from public C struct to internal C++ struct
/// is valid, e.g. guard against accidentally casting to wrong type
template <typename Tp>
auto
rocp_cast(Tp* _val)
{
    return traits::api_cast<Tp>{}(_val);
}

/// helper function for making copies of the fields in rocprofiler_config. If the config
/// field needs to be copied in some special way, use a template specialization of the
/// "construct" function in the allocator to handle this, e.g.:
///
///     using special_config = ::rocprofiler::internal::special_config;
///
///     template <>
///     void
///     allocator<special_config, 8>::construct(special_config* const _p,
///                                             const special_config& _v) const
///     {
///         auto _tmp = special_config{};
///         // ... special copy of fields from _v into _tmp
///
///         // placement new of _tmp into _p (_p is a const pointer, so it is not reassigned)
///         new(_p) special_config{ _tmp };
///     }
///
///     template <>
///     void
///     allocator<special_config, 8>::construct(special_config* const _p,
///                                             special_config&& _v) const
///     {
///         auto _tmp = std::move(_v);
///         // ... perform special needs
///
///         // placement new of _tmp into _p (_p is a const pointer, so it is not reassigned)
///         new(_p) special_config{ std::move(_tmp) };
///     }
///
template <typename Tp, typename Up>
Tp*&
copy_config_field(Tp*& _dst, Up* _src_v)
{
    static auto _allocator = allocator<Tp>{};

    if constexpr(!std::is_same<Tp, Up>::value)
    {
        using PrivateT = typename traits::api_cast<Up>::output_type;
        static_assert(std::is_same<PrivateT, Tp>::value, "Error incorrect field copy");

        auto _src = rocp_cast(_src_v);
        if(_src)
        {
            _dst = _allocator.allocate(1);
            _allocator.construct(_dst, *_src);
        }
        return _dst;
    }
    else
    {
        if(_src_v)
        {
            _dst = _allocator.allocate(1);
            _allocator.construct(_dst, *_src_v);
        }
        return _dst;
    }
}

auto&
get_configs_buffer()
{
    static char
        _v[::rocprofiler::internal::max_configs_count * sizeof(rocprofiler::internal::config)];
    return _v;
}

auto&
get_configs_mutex()
{
    static auto _v = std::mutex{};
    return _v;
}

inline uint32_t
get_tid()
{
    return syscall(__NR_gettid);
}

constexpr auto rocp_max_configs = ::rocprofiler::internal::max_configs_count;
} // namespace

namespace rocprofiler
{
namespace internal
{
std::array<rocprofiler::internal::config*, max_configs_count>&
get_registered_configs()
{
    static auto _v = std::array<rocprofiler::internal::config*, max_configs_count>{};
    return _v;
}

std::array<std::atomic<rocprofiler::internal::config*>, max_configs_count>&
get_active_configs()
{
    static auto _v = std::array<std::atomic<rocprofiler::internal::config*>, max_configs_count>{};
    return _v;
}
} // namespace internal
} // namespace rocprofiler

extern "C" {

rocprofiler_status_t
rocprofiler_allocate_config(rocprofiler_config* _inp_cfg)
{
    // perform checks that rocprofiler can be activated

    ::memset(_inp_cfg, 0, sizeof(rocprofiler_config));

    auto* _cfg = rocp_cast(_inp_cfg);

    _cfg->size           = sizeof(::rocprofiler_config);
    _cfg->compat_version = 0;
    _cfg->api_version    = ROCPROFILER_API_VERSION_ID;
    _cfg->context_idx    = std::numeric_limits<decltype(_cfg->context_idx)>::max();

    // initial value checks
    assert(_cfg->size == sizeof(rocprofiler::internal::config));
    assert(_cfg->compat_version == 0);
    assert(_cfg->api_version == ROCPROFILER_API_VERSION_ID);
    assert(_cfg->buffer == nullptr);
    assert(_cfg->domain == nullptr);
    assert(_cfg->filter == nullptr);
    assert(_cfg->context_idx ==
           std::numeric_limits<decltype(rocprofiler::internal::config::context_idx)>::max());

    // ... allocate any internal space needed to handle another config ...
    {
        auto _lk = std::unique_lock<std::mutex>{get_configs_mutex()};
        // ...
    }

    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_validate_config(const rocprofiler_config* cfg_v)
{
    const auto* cfg = rocp_cast(cfg_v);

    if(cfg->buffer == nullptr) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;

    if(cfg->filter == nullptr) return ROCPROFILER_STATUS_ERROR_FILTER_NOT_FOUND;

    if(cfg->domain == nullptr || cfg->domain->domains == 0)
        return ROCPROFILER_STATUS_ERROR_INCORRECT_DOMAIN;

    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_start_config(rocprofiler_config* cfg_v, rocprofiler_context_id_t* context_id)
{
    if(rocprofiler_validate_config(cfg_v) != ROCPROFILER_STATUS_SUCCESS)
    {
        std::cerr << "rocprofiler_start_config() provided an invalid configuration. tool "
                     "should use rocprofiler_validate_config() to check whether the "
                     "config is valid and adapt accordingly to issues before trying to "
                     "start the configuration."
                  << std::endl;
        abort();
    }

    auto* cfg = rocp_cast(cfg_v);

    uint64_t idx = rocp_max_configs;
    {
        auto _lk = std::unique_lock<std::mutex>{get_configs_mutex()};
        for(size_t i = 0; i < rocp_max_configs; ++i)
        {
            if(rocprofiler::internal::get_registered_configs().at(i) == nullptr)
            {
                idx = i;
                break;
            }
        }
    }

    // too many configs already registered
    if(idx == rocp_max_configs) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_ACTIVE;

    cfg->context_idx   = idx;
    context_id->handle = idx;

    // using the context id, compute the location in the buffer of configs
    auto* _offset = get_configs_buffer() + (idx * sizeof(rocprofiler::internal::config));

    // placement new into the buffer
    auto* _copy_cfg = new(_offset) rocprofiler::internal::config{*cfg};

    // make copies of non-null config fields
    copy_config_field(_copy_cfg->buffer, cfg->buffer);
    copy_config_field(_copy_cfg->domain, cfg->domain);
    copy_config_field(_copy_cfg->filter, cfg->filter);

    // store until "deallocation"
    rocprofiler::internal::get_registered_configs().at(idx) = _copy_cfg;

    using config_t = rocprofiler::internal::config;
    // atomic swap the pointer into the "active" array used internally
    config_t* _expected = nullptr;
    bool success = rocprofiler::internal::get_active_configs().at(idx).compare_exchange_strong(
        _expected, rocprofiler::internal::get_registered_configs().at(idx));

    if(!success) return ROCPROFILER_STATUS_ERROR_HAS_ACTIVE_CONTEXT; // need relevant enum

    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_stop_config(rocprofiler_context_id_t idx)
{
    // atomically assign the config pointer to NULL so that it is skipped in future
    // callbacks
    auto* _expected =
        rocprofiler::internal::get_active_configs().at(idx.handle).load(std::memory_order_relaxed);
    bool success = rocprofiler::internal::get_active_configs()
                       .at(idx.handle)
                       .compare_exchange_strong(_expected, nullptr);

    if(!success)
        return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; // compare exchange strong
                                                           // failed

    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_domain_add_domain(struct rocprofiler_domain_config*    _inp_cfg,
                              rocprofiler_tracer_activity_domain_t _domain)
{
    auto* _cfg = rocp_cast(_inp_cfg);
    if(_domain <= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE ||
       _domain >= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST)
        return ROCPROFILER_STATUS_ERROR_INVALID_DOMAIN_ID;

    _cfg->domains |= (1 << _domain);
    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_domain_add_domains(struct rocprofiler_domain_config*     _inp_cfg,
                               rocprofiler_tracer_activity_domain_t* _domains,
                               size_t                                _ndomains)
{
    for(size_t i = 0; i < _ndomains; ++i)
    {
        auto _status = rocprofiler_domain_add_domain(_inp_cfg, _domains[i]);
        if(_status != ROCPROFILER_STATUS_SUCCESS) return _status;
    }
    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_domain_add_op(struct rocprofiler_domain_config*    _inp_cfg,
                          rocprofiler_tracer_activity_domain_t _domain,
                          uint32_t                             _op)
{
    auto* _cfg = rocp_cast(_inp_cfg);
    if(_domain <= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE ||
       _domain >= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST)
        return ROCPROFILER_STATUS_ERROR_INVALID_DOMAIN_ID;

    if(_op >= get_domain_max_op(_domain)) return ROCPROFILER_STATUS_ERROR_INVALID_OPERATION_ID;

    auto _offset = (_domain * rocprofiler::internal::domain_ops_offset);
    _cfg->opcodes.set(_offset + _op, true);
    return ROCPROFILER_STATUS_SUCCESS;
}

rocprofiler_status_t
rocprofiler_domain_add_ops(struct rocprofiler_domain_config*    _inp_cfg,
                           rocprofiler_tracer_activity_domain_t _domain,
                           uint32_t*                            _ops,
                           size_t                               _nops)
{
    for(size_t i = 0; i < _nops; ++i)
    {
        auto _status = rocprofiler_domain_add_op(_inp_cfg, _domain, _ops[i]);
        if(_status != ROCPROFILER_STATUS_SUCCESS) return _status;
    }
    return ROCPROFILER_STATUS_SUCCESS;
}

// ------------------------------------------------------------------------------------ //
//
// demo of internal implementation
//
// ------------------------------------------------------------------------------------ //

void
api_callback(rocprofiler_tracer_activity_domain_t domain,
             uint32_t                             cid,
             const void* /*callback_data*/,
             void*)
{
    for(const auto& aitr : rocprofiler::internal::get_active_configs())
    {
        auto* itr = aitr.load();
        if(!itr) continue;

        // below should be valid so this might need to raise error
        if(!itr->domain) continue;

        // if the given domain + op is not enabled, skip this config
        if(!(*itr->domain)(domain, cid)) continue;

        if(itr->filter)
        {
            if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX)
            {}
            else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API)
            {
                if(itr->filter->hsa_function_id && itr->filter->hsa_function_id(cid) == 0) continue;
            }
            else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API)
            {
                if(itr->filter->hip_function_id && itr->filter->hip_function_id(cid) == 0) continue;
            }
        }

        auto& _domain      = (*itr->domain);
        auto& _correlation = (*itr->correlation_id);

        auto _correlation_id = rocprofiler::internal::correlation_config::get_unique_record_id();
        if(_correlation.external_id_callback)
            _correlation.external_id =
                _correlation.external_id_callback(domain, cid, _correlation_id);

        auto timestamp_ns = []() -> uint64_t {
            return std::chrono::steady_clock::now().time_since_epoch().count();
        };

        (void) _domain;
        (void) timestamp_ns;
        /*
        auto _header = rocprofiler_record_header_t{ROCPROFILER_TRACER_RECORD,
                                                   rocprofiler_record_id_t{_correlation_id}};
        auto _op_id         = rocprofiler_tracer_operation_id_t{cid};
        auto _agent_id      = rocprofiler_agent_id_t{0};
        auto _queue_id      = rocprofiler_queue_id_t{0};
        auto _thread_id     = rocprofiler_thread_id_t{get_tid()};
        auto _context       = rocprofiler_context_id_t{itr->context_idx};
        auto _timestamp_raw = rocprofiler_timestamp_t{timestamp_ns()};
        auto _timestamp = rocprofiler_record_header_timestamp_t{_timestamp_raw, _timestamp_raw};

        if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX)
        {
            auto _api_data = rocprofiler_tracer_api_data_t{};
            const roctx_api_data_t* _data =
                reinterpret_cast<const roctx_api_data_t*>(callback_data);

            if(itr->filter && itr->filter->name && itr->filter->name(_data->args.message) == 0)
                continue;

            _api_data.roctx = _data;

            auto _phase = rocprofiler_api_tracing_phase_t{ROCPROFILER_PHASE_ENTER};
            _timestamp  = {_timestamp_raw, _timestamp_raw};

            auto _external_cid = rocprofiler_tracer_external_id_t{_data ? _data->args.id : 0};
            auto _activity_cid = rocprofiler_tracer_activity_correlation_id_t{0};
            const char* _name  = _data->args.message;

            _domain.user_sync_callback(rocprofiler_record_tracer_t{_header,
                                                                   _external_cid,
                                                                   ACTIVITY_DOMAIN_ROCTX,
                                                                   _op_id,
                                                                   _api_data,
                                                                   _activity_cid,
                                                                   _timestamp,
                                                                   _agent_id,
                                                                   _queue_id,
                                                                   _thread_id,
                                                                   _phase,
                                                                   _name},
                                       _context);
        }
        else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API)
        {
            auto _api_data = rocprofiler_tracer_api_data_t{};
            const hsa_api_data_t* _data = reinterpret_cast<const hsa_api_data_t*>(callback_data);
            _api_data.hsa = _data;

            auto _phase = rocprofiler_api_tracing_phase_t{(_data->phase == ACTIVITY_API_PHASE_ENTER)
                                                              ? ROCPROFILER_PHASE_ENTER
                                                              : ROCPROFILER_PHASE_EXIT};

            if(_phase == ROCPROFILER_PHASE_ENTER)
                _timestamp.begin = _timestamp_raw;
            else
                _timestamp.end = _timestamp_raw;

            auto _external_cid = rocprofiler_tracer_external_id_t{0};
            auto _activity_cid =
                rocprofiler_tracer_activity_correlation_id_t{_data->correlation_id};
            const char* _name = nullptr;

            _domain.user_sync_callback(rocprofiler_record_tracer_t{_header,
                                                                   _external_cid,
                                                                   ACTIVITY_DOMAIN_HSA_API,
                                                                   _op_id,
                                                                   _api_data,
                                                                   _activity_cid,
                                                                   _timestamp,
                                                                   _agent_id,
                                                                   _queue_id,
                                                                   _thread_id,
                                                                   _phase,
                                                                   _name},
                                       _context);
        }
        else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API)
        {
            auto _api_data = rocprofiler_tracer_api_data_t{};
            const hip_api_data_t* _data = reinterpret_cast<const hip_api_data_t*>(callback_data);
            _api_data.hip = _data;

            auto _phase = rocprofiler_api_tracing_phase_t{(_data->phase == ACTIVITY_API_PHASE_ENTER)
                                                              ? ROCPROFILER_PHASE_ENTER
                                                              : ROCPROFILER_PHASE_EXIT};

            if(_phase == ROCPROFILER_PHASE_ENTER)
                _timestamp.begin = _timestamp_raw;
            else
                _timestamp.end = _timestamp_raw;

            auto _external_cid = rocprofiler_tracer_external_id_t{0};
            auto _activity_cid =
                rocprofiler_tracer_activity_correlation_id_t{_data->correlation_id};
            const char* _name = nullptr;

            _domain.user_sync_callback(rocprofiler_record_tracer_t{_header,
                                                                   _external_cid,
                                                                   ACTIVITY_DOMAIN_HIP_API,
                                                                   _op_id,
                                                                   _api_data,
                                                                   _activity_cid,
                                                                   _timestamp,
                                                                   _agent_id,
                                                                   _queue_id,
                                                                   _thread_id,
                                                                   _phase,
                                                                   _name},
                                       _context);
        }
        */
    }
}

void
InitRoctracer()
{
    for(const auto& itr : rocprofiler::internal::get_registered_configs())
    {
        if(!itr) continue;

        // below should be valid so this might need to raise error
        if(!itr->domain) continue;

        for(auto ditr : {ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
                         ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API,
                         ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX})
        {
            if((*itr->domain)(ditr))
            {
                if(itr->domain->user_sync_callback)
                {
                    // ...
                }
                else
                {
                    // ...
                }
            }
        }

        for(auto ditr : {ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_OPS,
                         ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_OPS})
        {
            if((*itr->domain)(ditr))
            {
                if(itr->domain->opcodes.none())
                {
                    // ...
                }
                else
                {
                    for(size_t i = 0; i < itr->domain->opcodes.size(); ++i)
                    {
                        if((*itr->domain)(ditr, i))
                        {
                            // ...
                        }
                    }
                }
            }
        }
    }
}
}
@@ -1,510 +0,0 @@
/* Copyright (c) 2018-2022 Advanced Micro Devices, Inc.

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 in the Software without restriction, including without limitation the rights
 to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:

 The above copyright notice and this permission notice shall be included in
 all copies or substantial portions of the Software.

 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 THE SOFTWARE. */

#pragma once

#include <rocprofiler/rocprofiler.h>

#include <cstddef>
#include <cstdint>

typedef struct
{
    rocprofiler_context_id_t context_id;
    rocprofiler_buffer_id_t buffer_id;
} context_buffer_id_t;

typedef context_buffer_id_t roctracer_pool_t;

/* Correlation id */
typedef uint64_t activity_correlation_id_t;

typedef uint32_t activity_kind_t;
typedef uint32_t activity_op_t;

typedef uint64_t roctracer_timestamp_t;

typedef rocprofiler_tracer_activity_domain_t roctracer_domain_t;
typedef rocprofiler_tracer_activity_domain_t activity_domain_t;

// Prof_Protocol
/* Activity record type */
typedef struct activity_record_s
{
    uint32_t domain; /* activity domain id */
    activity_kind_t kind; /* activity kind */
    activity_op_t op; /* activity op */
    union
    {
        struct
        {
            activity_correlation_id_t correlation_id; /* activity ID */
            roctracer_timestamp_t begin_ns; /* host begin timestamp */
            roctracer_timestamp_t end_ns; /* host end timestamp */
        };
        struct
        {
            uint32_t se; /* sampled SE */
            uint64_t cycle; /* sample cycle */
            uint64_t pc; /* sample PC */
        } pc_sample;
    };
    union
    {
        struct
        {
            int device_id; /* device id */
            uint64_t queue_id; /* queue id */
        };
        struct
        {
            uint32_t process_id; /* device id */
            uint32_t thread_id; /* thread id */
        };
        struct
        {
            activity_correlation_id_t external_id; /* external correlation id */
        };
    };
    union
    {
        size_t bytes; /* data size bytes */
        const char* kernel_name; /* kernel name */
        const char* mark_message;
    };
} activity_record_t;

typedef activity_record_t roctracer_record_t;

/* Activity sync callback type */
typedef void (*activity_sync_callback_t)(activity_domain_t cid,
                                         activity_record_t* record,
                                         const void* data,
                                         void* arg);
/* Activity async callback type */
typedef void (*activity_async_callback_t)(activity_domain_t op, void* record, void* arg);

/* API callback type */
typedef void (*activity_rtapi_callback_t)(activity_domain_t domain,
                                          uint32_t cid,
                                          const void* data,
                                          void* arg);
typedef activity_rtapi_callback_t roctracer_rtapi_callback_t;

typedef roctracer_timestamp_t (*roctracer_get_timestamp_t)();
typedef rocprofiler_timestamp_t (*rocprofiler_get_timestamp_t)();

typedef uint32_t activity_kind_t;
typedef uint32_t activity_op_t;

/* API callback phase */
typedef enum
{
    ACTIVITY_API_PHASE_ENTER = 0,
    ACTIVITY_API_PHASE_EXIT = 1
} activity_api_phase_t;

const char*
roctracer_op_string(uint32_t domain, uint32_t op);

/* Trace record types */

/**
 * Memory pool allocator callback.
 *
 * If \p *ptr is NULL, then allocate memory of \p size bytes and save address
 * in \p *ptr.
 *
 * If \p *ptr is non-NULL and size is non-0, then reallocate the memory at \p
 * *ptr with size \p size and save the address in \p *ptr. The memory will have
 * been allocated by the same callback.
 *
 * If \p *ptr is non-NULL and size is 0, then deallocate the memory at \p *ptr.
 * The memory will have been allocated by the same callback.
 *
 * \p size is the size of the memory allocation or reallocation, or 0 if
 * deallocating.
 *
 * \p arg Argument provided
 */
typedef void (*roctracer_allocator_t)(char** ptr, size_t size, void* arg);

/**
 * Memory pool buffer callback.
 *
 * The callback that will be invoked when a memory pool buffer becomes full or
 * is flushed.
 *
 * \p begin pointer to first entry entry in the buffer.
 *
 * \p end pointer to one past the end entry in the buffer.
 *
 * \p arg the argument specified when the callback was defined.
 */
typedef void (*roctracer_buffer_callback_t)(const char* begin, const char* end, void* arg);

/**
 * Memory pool properties.
 *
 * Defines the properties when a tracer memory pool is created.
 */
typedef struct
{
    /**
     * ROC Tracer mode.
     */
    uint32_t mode;

    /**
     * Size of buffer in bytes.
     */
    size_t buffer_size;

    /**
     * The allocator function to use to allocate and deallocate the buffer. If
     * NULL then \p malloc, \p realloc, and \p free are used.
     */
    roctracer_allocator_t alloc_fun;

    /**
     * The argument to pass when invoking the \p alloc_fun allocator.
     */
    void* alloc_arg;

    /**
     * The function to call when a buffer becomes full or is flushed.
     */
    roctracer_buffer_callback_t buffer_callback_fun;

    /**
     * The argument to pass when invoking the \p buffer_callback_fun callback.
     */
    void* buffer_callback_arg;
} roctracer_properties_t;

/**
 * ROC Tracer API status codes.
 */
typedef enum
{
    /**
     * The function has executed successfully.
     */
    ROCTRACER_STATUS_SUCCESS = 0,
    /**
     * A generic error has occurred.
     */
    ROCTRACER_STATUS_ERROR = -1,
    /**
     * The domain ID is invalid.
     */
    ROCTRACER_STATUS_ERROR_INVALID_DOMAIN_ID = -2,
    /**
     * An invalid argument was given to the function.
     */
    ROCTRACER_STATUS_ERROR_INVALID_ARGUMENT = -3,
    /**
     * No default pool is defined.
     */
    ROCTRACER_STATUS_ERROR_DEFAULT_POOL_UNDEFINED = -4,
    /**
     * The default pool is already defined.
     */
    ROCTRACER_STATUS_ERROR_DEFAULT_POOL_ALREADY_DEFINED = -5,
    /**
     * Memory allocation error.
     */
    ROCTRACER_STATUS_ERROR_MEMORY_ALLOCATION = -6,
    /**
     * External correlation ID pop mismatch.
     */
    ROCTRACER_STATUS_ERROR_MISMATCHED_EXTERNAL_CORRELATION_ID = -7,
    /**
     * The operation is not currently implemented. This error may be reported by
     * any function. Check the \ref known_limitations section to determine the
     * status of the library implementation of the interface.
     */
    ROCTRACER_STATUS_ERROR_NOT_IMPLEMENTED = -8,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_UNINIT = 2,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_BREAK = 3,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_BAD_DOMAIN = ROCTRACER_STATUS_ERROR_INVALID_DOMAIN_ID,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_BAD_PARAMETER = ROCTRACER_STATUS_ERROR_INVALID_ARGUMENT,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_HIP_API_ERR = 6,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_HIP_OPS_ERR = 7,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_HCC_OPS_ERR = ROCTRACER_STATUS_HIP_OPS_ERR,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_HSA_ERR = 7,
    /**
     * Deprecated error code.
     */
    ROCTRACER_STATUS_ROCTX_ERR = 8,
} roctracer_status_t;

/**
 * Query textual name of an operation of a domain.
 * @param[in] domain Domain being queried.
 * @param[in] op Operation within \p domain.
 * @param[in] kind \todo Define kind.
 * @return Returns the NUL terminated string for the operation name, or NULL if
 * the domain or operation are invalid. The string is owned by the ROC Tracer
 * library.
 */
const char*
roctracer_op_string(uint32_t domain, uint32_t op, uint32_t kind);

/**
 * Query the operation code given a domain and the name of an operation.
 * @param[in] domain The domain being queried.
 * @param[in] str The NUL terminated name of the operation name being queried.
 * @param[out] op The operation code.
 * @param[out] kind If not NULL then the operation kind code.
 */
void
roctracer_op_code(uint32_t domain, const char* str, uint32_t* op, uint32_t* kind);

/**
 * Set the properties of a domain.
 * @param[in] domain The domain.
 * @param[in] properties The properties. Each domain defines its own type for
 * the properties. Some domains require the properties to be set before they
 * can be enabled.
 */
void
roctracer_set_properties(roctracer_domain_t domain, void* properties);

/**
 * Enable runtime API callback for a specific operation of a domain.
 * @param domain The domain.
 * @param op The operation ID in \p domain.
 * @param callback The callback to invoke each time the operation is performed
 * on entry and exit.
 * @param pool Value to pass as last argument of \p callback.
 */
void
roctracer_enable_op_callback(roctracer_domain_t domain,
                             uint32_t op,
                             roctracer_rtapi_callback_t callback);

/**
 * Enable runtime API callback for all operations of a domain.
 * @param domain The domain
 * @param callback The callback to invoke each time the operation is performed
 * on entry and exit.
 * @param arg Value to pass as last argument of \p callback.
 */
void
roctracer_enable_domain_callback(roctracer_domain_t domain,
                                 roctracer_rtapi_callback_t callback,
                                 void* user_data = nullptr);

/**
 * Disable runtime API callback for a specific operation of a domain.
 * @param domain The domain
 * @param op The operation in \p domain.
 */
void
roctracer_disable_op_callback(roctracer_domain_t domain, uint32_t op);

/**
 * Disable runtime API callback for all operations of a domain.
 * @param domain The domain
 */
void
roctracer_disable_domain_callback(roctracer_domain_t domain);

/**
 * Enable activity record logging for a specified operation of a domain using
 * the default memory pool.
 * @param[in] domain The domain.
 * @param[in] op The activity operation ID in \p domain.
 */
void
roctracer_enable_op_activity(roctracer_domain_t domain, uint32_t op, roctracer_pool_t pool);

/**
 * Enable activity record logging for all operations of a domain using the
 * default memory pool.
 * @param[in] domain The domain.
 */
void
roctracer_enable_domain_activity(roctracer_domain_t domain, roctracer_pool_t pool);

/**
 * Disable activity record logging for a specified operation of a domain.
 * @param[in] domain The domain.
 * @param[in] op The activity operation ID in \p domain.
 */
void
roctracer_disable_op_activity(roctracer_domain_t domain, uint32_t op);

/**
 * Disable activity record logging for all operations of a domain.
 * @param[in] domain The domain.
 */
void
roctracer_disable_domain_activity(roctracer_domain_t domain);

// HIP Support
typedef enum
{
    HIP_OP_ID_DISPATCH = 0,
    HIP_OP_ID_COPY = 1,
    HIP_OP_ID_BARRIER = 2,
    HIP_OP_ID_NUMBER = 3
} hip_op_id_t;

// HSA Support
// HSA OP ID enumeration
enum hsa_op_id_t
{
    HSA_OP_ID_DISPATCH = 0,
    HSA_OP_ID_COPY = 1,
    HSA_OP_ID_BARRIER = 2,
    HSA_OP_ID_RESERVED1 = 3,
    HSA_OP_ID_NUMBER
};

// HSA EVT ID enumeration
enum hsa_evt_id_t
{
    HSA_EVT_ID_ALLOCATE = 0, // Memory allocate callback
    HSA_EVT_ID_DEVICE = 1, // Device assign callback
    HSA_EVT_ID_MEMCOPY = 2, // Memcopy callback
    HSA_EVT_ID_SUBMIT = 3, // Packet submission callback
    HSA_EVT_ID_KSYMBOL = 4, // Loading/unloading of kernel symbol
    HSA_EVT_ID_CODEOBJ = 5, // Loading/unloading of device code object
    HSA_EVT_ID_NUMBER
};

struct hsa_ops_properties_t
{
    void* reserved1[4];
};

// ROCTx Support
typedef uint64_t roctx_range_id_t;

/**
 * ROCTX API ID enumeration
 */
enum roctx_api_id_t
{
    ROCTX_API_ID_roctxMarkA = 0,
    ROCTX_API_ID_roctxRangePushA = 1,
    ROCTX_API_ID_roctxRangePop = 2,
    ROCTX_API_ID_roctxRangeStartA = 3,
    ROCTX_API_ID_roctxRangeStop = 4,
    ROCTX_API_ID_NUMBER,
};

/**
 * ROCTX callbacks data type
 */
typedef struct roctx_api_data_s
{
    union
    {
        struct
        {
            const char* message;
            roctx_range_id_t id;
        };
        struct
        {
            const char* message;
        } roctxMarkA;
        struct
        {
            const char* message;
        } roctxRangePushA;
        struct
        {
            const char* message;
        } roctxRangePop;
        struct
        {
            const char* message;
            roctx_range_id_t id;
        } roctxRangeStartA;
        struct
        {
            const char* message;
            roctx_range_id_t id;
        } roctxRangeStop;
    } args;
} roctx_api_data_t;

// External Support
/* Extension opcodes */
typedef enum
{
    ACTIVITY_EXT_OP_MARK = 0,
    ACTIVITY_EXT_OP_EXTERN_ID = 1
} activity_ext_op_t;

typedef void (*roctracer_start_cb_t)();
typedef void (*roctracer_stop_cb_t)();
typedef struct
{
    roctracer_start_cb_t start_cb;
    roctracer_stop_cb_t stop_cb;
} roctracer_ext_properties_t;

// Tracing start
void
roctracer_start();

// Tracing stop
void
roctracer_stop();

// Notifies that the calling thread is entering an external region.
// Push an external correlation id for the calling thread.
void
roctracer_activity_push_external_correlation_id(activity_correlation_id_t id);

// Notifies that the calling thread is leaving an external region.
// Pop an external correlation id for the calling thread.
// 'lastId' returns the last external correlation if not NULL
void
roctracer_activity_pop_external_correlation_id(activity_correlation_id_t* last_id);
@@ -154,7 +154,7 @@ validate(const std::vector<rocprofiler_record_header_t*>& _headers)
     auto& _ref_data = get_generated_array<Tp, N>();
     for(auto* itr : _headers)
     {
-        if(itr->kind == typeid(data_type).hash_code())
+        if(itr->hash == typeid(data_type).hash_code())
         {
             auto* _data = static_cast<data_type*>(itr->payload);
             EXPECT_EQ(_ref_data, *_data);

@@ -147,7 +147,7 @@ validate(const std::vector<rocprofiler_record_header_t*>& _headers)
     auto& _ref_data = get_generated_array<Tp, N>();
     for(auto* itr : _headers)
     {
-        if(itr->kind == typeid(data_type).hash_code())
+        if(itr->hash == typeid(data_type).hash_code())
         {
             auto* _data = static_cast<data_type*>(itr->payload);
             ASSERT_TRUE(_data != nullptr);

@@ -54,7 +54,7 @@ template <typename Tp>
 void
 extract_header(std::vector<Tp>& _arr, rocprofiler_record_header_t* _hdr)
 {
-    if(_hdr->kind == typeid(Tp).hash_code())
+    if(_hdr->hash == typeid(Tp).hash_code())
     {
         auto* _v = reinterpret_cast<Tp*>(_hdr->payload);
         _arr.emplace_back(*_v);
@@ -129,17 +129,17 @@ TEST(buffering, serial)
     {
         ASSERT_TRUE(itr->payload) << "nullptr to payload not expected";

-        if(itr->kind == typeid(uint_raw_array_t).hash_code())
+        if(itr->hash == typeid(uint_raw_array_t).hash_code())
         {
             extract_header(_ui_result, itr);
         }
-        else if(itr->kind == typeid(flt_raw_array_t).hash_code())
+        else if(itr->hash == typeid(flt_raw_array_t).hash_code())
         {
             extract_header(_fp_result, itr);
         }
         else
         {
-            GTEST_FAIL() << "unknown type id hash code: " << std::to_string(itr->kind);
+            GTEST_FAIL() << "unknown type id hash code: " << std::to_string(itr->hash);
         }
     }


@@ -105,7 +105,7 @@ def generate_custom(args, cmake_args, ctest_args):
 set(CTEST_CUSTOM_MAXIMUM_NUMBER_OF_ERRORS "100")
 set(CTEST_CUSTOM_MAXIMUM_NUMBER_OF_WARNINGS "100")
 set(CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE "51200")
-set(CTEST_CUSTOM_COVERAGE_EXCLUDE "/usr/.*;/opt/.*;.*external/.*;.*samples/.*;.*tests/.*")
+set(CTEST_CUSTOM_COVERAGE_EXCLUDE "/usr/.*;/opt/.*;.*external/.*;.*samples/.*;.*tests/.*;.*/details/.*")

 set(CTEST_MEMORYCHECK_TYPE "{MEMCHECK_TYPE}")
 set(CTEST_MEMORYCHECK_SUPPRESSIONS_FILE "{MEMCHECK_SUPPRESSION_FILE}")

@@ -7,3 +7,7 @@ thread:libhsa-runtime64.so

 # unlock of an unlocked mutex (or by a wrong thread)
 mutex:librocm_smi64.so
+
+# google logging
+race:google::LogMessageTime::CalcGmtOffset
+race:tzset_internal

@@ -3,8 +3,14 @@
WORK_DIR=$(cd $(dirname ${BASH_SOURCE[0]})/../docs &> /dev/null && pwd)
SOURCE_DIR=$(cd ${WORK_DIR}/../.. &> /dev/null && pwd)

pushd ${SOURCE_DIR}
cmake -B build-docs ${SOURCE_DIR} -DROCPROFILER_INTERNAL_BUILD_DOCS=ON
popd

pushd ${WORK_DIR}
cmake -DSOURCE_DIR=${SOURCE_DIR} -P generate-doxyfile.cmake

doxygen rocprofiler.dox

doxysphinx build ${WORK_DIR} ${WORK_DIR}/_build/html ${WORK_DIR}/_doxygen/html
popd