Contexts, tracing, include reorg, registration, thread-pool (#65)

* Update scripts/update-doxygen.sh

- ensure build-docs folder exists

* Update scripts/run-ci.py

- exclude files in details subdirectory from code coverage

* Update scripts/thread-sanitizer-suppr.txt

- exclude races in glog

* Update docs/rocprofiler.dox.in

- exclude defines in include/rocprofiler/defines.h from doxygen
- Tweak EXCLUDE_PATTERNS and EXAMPLE_PATTERNS

* Update docs workflow

- trigger workflow whenever there is a change to the public headers (which may be doxygen comments)

* Update include/rocprofiler (reorg and overhaul)

- rocprofiler_status_t additions
  - CONTEXT_NOT_FOUND
  - CONTEXT_ERROR
  - INVALID_CONTEXT_ID
  - INVALID_CONTEXT
  - BUFFER_BUSY
- rocprofiler_context_is_active func
- rocprofiler_context_is_valid func
- rocprofiler_service_callback_tracing_kind_t update
  - remove ROCPROFILER_SERVICE_CALLBACK_TRACING_HELPER_THREAD
- Remove rocprofiler_tracing_helper_thread_operation_t
- Remove rocprofiler_helper_thread_callback_tracer_data_t
- Added rocprofiler_internal_thread_library_t
- Added rocprofiler_at_internal_thread_create
- split rocprofiler.h into several smaller headers
- reworked rocprofiler_status_t values
- added doxygen comments for enums
- replaced rocprofiler_trace_record_operation_kind_t with rocprofiler_trace_operation_t
- use @ instead of / in doxygen comment in rocprofiler_plugin.h
- fix ref to ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API
- end group in fwd.h
- remove PROFILE_COUNTING group in dispatch_profile.h
- remove premature group close in callback_tracing.h
- hsa.h: remove rocprofiler_hsa_trace_data_t
- fwd.h: remove rocprofiler_tracer_callback_data_t
- rename rocprofiler_correlation_id_t.handle to rocprofiler_correlation_id_t.id (consistency)
- fwd.h: add rocprofiler_callback_tracing_record_t
- callback_tracing.h: update rocprofiler_hsa_api_callback_tracer_data_t
- callback_tracing.h: add size fields
- simplify rocprofiler_tracer_callback_t
- removed ROCPROFILER_NONNULL from rocprofiler_get_version
- added rocprofiler_get_timestamp
- add ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED to rocprofiler_status_t
- add ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND to rocprofiler_status_t
- add rocprofiler_buffer_category_t
- rocprofiler_trace_operation_t -> rocprofiler_tracing_operation_t
- rocprofiler_user_data_t union
- tweak rocprofiler_callback_tracing_record_t
  - make external_correlation_id non-pointer
  - add rocprofiler_user_data_t data field
- tweak rocprofiler_record_header_t
  - instead of single uint64_t kind field, have union for category + kind (two u32) with u64 hash
- API extensions for kind id <-> kind string
- API extensions for operation id <-> operation string
- rocprofiler_callback_trace_kind_name_cb_t
- rocprofiler_callback_trace_operation_name_cb_t
- rocprofiler_iterate_callback_trace_kind_names
- rocprofiler_iterate_callback_trace_kind_operation_names
- modify rocprofiler_hsa_api_callback_tracer_data_t data members (remove pointers)
- add rocprofiler_callback_trace_operation_args_cb_t function pointer typedef
- add rocprofiler_iterate_callback_trace_operation_args function
- fixed inconsistent use of *_trace_* vs. *_tracing_* (opting for tracing)
- removed rocprofiler_query_callback_trace_kind_name
- removed rocprofiler_query_callback_kind_operation_name
- Add include/rocprofiler/registration.h
  - header dedicated to registering a tool/client with rocprofiler
  - this header is not intended to be included by rocprofiler.h
  - rocprofiler_client_id_t
    - identifier for client tool
  - rocprofiler_client_finalize_t
    - function pointer prototype for tool-initiated finalization
  - rocprofiler_tool_initialize_t
    - function pointer prototype for tool initialization (i.e. configuration)
  - rocprofiler_tool_finalize_t
    - function pointer prototype for tool finalization
  - rocprofiler_tool_configure_result_t
    - struct returned by tool/client to rocprofiler
  - rocprofiler_is_initialized
    - function for querying whether tool-induced initialization is possible
  - rocprofiler_is_finalized
    - function for querying whether rocprofiler has been finalized
  - rocprofiler_configure prototype
    - this is the function tools implement
    - prototype is always marked as having default visibility
    - no implementation in rocprofiler
  - added typedef for rocprofiler_configure function pointer
  - added rocprofiler_force_configure to explicitly invoke rocprofiler_configure instead of relying on lazy init
- made callback typedef names more consistent (_cb_t suffix)
- typedef for rocprofiler_internal_thread_library_cb_t function pointer
- added rocprofiler_at_internal_thread_create function
- added rocprofiler_callback_thread_t struct
- added rocprofiler_create_callback_thread function
- added rocprofiler_assign_callback_thread function
- removed rocprofiler_buffer_tracing_record_header_t in favor of kind and correlation id in each record type
- added rocprofiler_buffer_tracing_kind_name_cb_t typedef
- added rocprofiler_buffer_tracing_operation_name_cb_t typedef
- added rocprofiler_iterate_buffer_tracing_kind_names function
- added rocprofiler_iterate_buffer_tracing_kind_operation_names function
- removed rocprofiler_query_buffer_trace_kind_name function
- removed rocprofiler_query_buffer_kind_operation_name function
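
The rocprofiler_record_header_t change above (two u32 fields, category and kind, sharing storage with a u64 hash) can be illustrated with a minimal sketch. The helper names below are hypothetical, and the bit layout (category in the upper half) is an assumption made for illustration; the real layout and the result of rocprofiler_record_header_compute_hash are defined by the headers and may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of rocprofiler): pack two 32-bit fields into
// one 64-bit key, with category in the upper half and kind in the lower half.
constexpr uint64_t
example_pack(uint32_t category, uint32_t kind)
{
    return (static_cast<uint64_t>(category) << 32) | static_cast<uint64_t>(kind);
}

// Hypothetical helper: recover the category from a packed key.
constexpr uint32_t
example_category(uint64_t key)
{
    return static_cast<uint32_t>(key >> 32);
}

// Hypothetical helper: recover the kind from a packed key.
constexpr uint32_t
example_kind(uint64_t key)
{
    return static_cast<uint32_t>(key & 0xffffffffULL);
}

// Analogous to a consumer checking that a stored key agrees with the two
// fields before trusting the record payload.
inline void
example_validate(uint64_t key, uint32_t category, uint32_t kind)
{
    assert(key == example_pack(category, kind));
    (void) key;
    (void) category;
    (void) kind;
}
```

A single 64-bit comparison then distinguishes (category, kind) pairs, which is presumably why a combined value is kept alongside the two fields.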

* Update lib/common/container/stable_vector.hpp

- include limits header
- reserve_size struct
- overload stable_vector constructor to support reserving as part of construction
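
A minimal sketch of the tag-type idea behind a reserve_size constructor overload, assuming the intent is to separate "reserve capacity for N" from "construct N elements". The example_vector wrapper below is hypothetical and is built on std::vector; it is not the stable_vector implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tag type: wraps a count so that the constructor overload
// can tell "reserve this many" apart from "create this many elements".
struct reserve_size
{
    std::size_t value = 0;
};

template <typename Tp>
class example_vector
{
public:
    example_vector() = default;

    // constructs `count` value-initialized elements
    explicit example_vector(std::size_t count)
    : m_data(count)
    {}

    // reserves storage only; the container stays empty
    explicit example_vector(reserve_size n) { m_data.reserve(n.value); }

    void        push_back(Tp v) { m_data.push_back(std::move(v)); }
    std::size_t size() const { return m_data.size(); }
    std::size_t capacity() const { return m_data.capacity(); }

private:
    std::vector<Tp> m_data = {};
};
```

With this shape, `example_vector<int>(reserve_size{16})` yields an empty container with room for at least 16 elements, while `example_vector<int>(4)` yields four elements.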

* Update lib/common/container/record_header_buffer.{hpp,cpp}

- add emplace member function accepting category and kind (two u32 variables) instead of one u64 kind
- use std::shared_mutex to prevent data-race when reading m_headers
- record_header_buffer is now multiple writer, single reader
- add read_lock member function (shared)
- add read_unlock member function (shared)
- lock member function gets exclusive lock
- unlock member function releases exclusive lock
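
The shared/exclusive split listed above can be sketched with std::shared_mutex. The class below is a hypothetical stand-in, not record_header_buffer itself, and it assumes that code inspecting the stored headers takes the shared lock while code that modifies them takes the exclusive lock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-in for a buffer guarding its header list.
class example_header_buffer
{
public:
    // shared lock: several holders may inspect m_headers concurrently
    void read_lock() { m_mutex.lock_shared(); }
    void read_unlock() { m_mutex.unlock_shared(); }

    // exclusive lock: a single holder, no concurrent shared holders
    void lock() { m_mutex.lock(); }
    void unlock() { m_mutex.unlock(); }

    // modifies m_headers, so it takes the exclusive lock
    void emplace(uint64_t header)
    {
        auto guard = std::unique_lock<std::shared_mutex>{m_mutex};
        m_headers.push_back(header);
    }

    // only reads m_headers, so the shared lock is sufficient
    std::size_t count() const
    {
        auto guard = std::shared_lock<std::shared_mutex>{m_mutex};
        return m_headers.size();
    }

private:
    mutable std::shared_mutex m_mutex   = {};
    std::vector<uint64_t>     m_headers = {};
};
```

Note that a thread must not re-acquire a std::shared_mutex it already owns in either mode, so callers pair each read_lock with exactly one read_unlock before locking again.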

* Rename "config" to "context" + restructure + implement

- Restructure config files + license
  - move config files into lib/rocprofiler/config subfolder
  - rename some files
  - add license to some files which were missing it
- Rename config/helpers.hpp
  - rename to allocator.hpp
  - remove get_domain_max_ops
- Create config/domain.{hpp,cpp}
  - structures for handling tracing domains and ops
- Update config/config.{hpp,cpp}
  - buffer_instance struct
  - callback_tracing_service struct
  - buffer_tracing_service struct
  - config struct
  - allocate_{config,buffer} func
  - {validate,start,stop}_config funcs
  - get_registered_configs func
  - get_active_configs func
  - get_buffers func
- Update rocprofiler.cpp
  - Implement rocprofiler_create_context
  - Implement rocprofiler_start_context
  - Implement rocprofiler_stop_context
  - Implement rocprofiler_context_is_active
  - Implement rocprofiler_context_is_valid
  - Implement rocprofiler_flush_buffer
  - Implement rocprofiler_destroy_buffer
  - Implement rocprofiler_create_buffer
- Update lib/rocprofiler/hsa
  - use rocprofiler_tracer_activity_domain_t instead of rocprofiler_tracer_activity_domain_t
  - remove ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API from HSA_API_INFO_DEFINITION_* macros
- Update lib/rocprofiler/context/domain.*
  - fixes for domain_info (i.e. use correct enums)
  - update rocprofiler_status_t codes
  - fix template instantiations
- Update lib/rocprofiler/context/context.*
  - use rocprofiler_service_callback_tracing_kind_t instead of rocprofiler_tracer_activity_domain_t
  - rename correlation_context to correlation_tracing_service
  - fix domains in callback_tracing_service and buffer_tracing_service
  - unique_ptr for callback_tracer and buffered_tracer in context
- Update lib/rocprofiler/rocprofiler.cpp
  - implement rocprofiler_configure_callback_tracing_service
- Update lib/rocprofiler/hsa/ostream.hpp
  - include rocprofiler.h instead of tracer.hpp
- Update lib/rocprofiler/hsa
  - migration to use rocprofiler_hsa_api_callback_tracer_data_t instead of rocprofiler_hsa_trace_data_t
  - restructure hsa_api_impl<Idx>
    - remove phase_enter and phase_exit
    - add set_data_args (partial replacement for phase_enter)
    - functor handles the contexts
- Update lib/rocprofiler/rocprofiler.cpp
  - implement rocprofiler_get_version
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
  - remove hsa_api_ prefix for functions already in hsa namespace
- Update lib/rocprofiler/context/context.{hpp,cpp}
  - add client_idx to context struct (tool identifier)
  - add push_client function to set client_idx before context is allocated
  - add pop_client function to remove client identifier from future context creations
  - implemented {registered,active}_contexts and buffers to use new container::reserve_size overload to stable_vector
  - fix implementation of start_context
  - fix implementation of stop_context
- Update lib/rocprofiler/rocprofiler.cpp
  - prevent context creation, buffer creation, pc sampling config, etc. after initialization
  - add nullptr checks to rocprofiler_context_is_valid
  - fix rocprofiler_configure_callback_tracing_service
    - was checking size of buffers, not registered context
  - implement rocprofiler_iterate_callback_trace_kind_names
  - implement rocprofiler_iterate_callback_trace_kind_operation_names
- Update lib/rocprofiler/CMakeLists.txt
  - add registration.{hpp,cpp} to rocprofiler-library target sources
- Update lib/rocprofiler/hsa/utils.hpp
  - fix using fmt::format with const char* strings
  - remove join functions (no longer used)
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
  - remove args_string function
  - remove named_args_string function
  - update iterate_args function
    - change callback type
    - accept user data
  - rework the hsa_api_impl<Idx>::functor function
    - save the rocprofiler_callback_tracing_record_t between callbacks
  - update update_table function
    - check buffered_tracer domains
  - remove comments
- Update lib/rocprofiler/hsa/defines.hpp
  - remove MEMBER_<N> macros
  - add ADDR_MEMBER_<N> macros
  - remove doxygen comments for GET_MEMBER_FIELDS
  - add GET_ADDR_MEMBER_FIELDS
  - update HSA_API_INFO_DEFINITION_{0,V}
    - rename domain_idx to callback_domain_idx
    - add buffered_domain_idx
    - add as_arg_addr function
- Update lib/rocprofiler/rocprofiler.cpp
  - implement rocprofiler_iterate_callback_trace_operation_args
- Remove lib/rocprofiler/tracing.{hpp,cpp} and lib/rocprofiler/CMakeLists.txt
  - unused
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
  - support buffered tracing in hsa_api_impl<Idx>::functor
  - rocprofiler_callback_trace_operation_args_cb_t -> rocprofiler_callback_tracing_operation_args_cb_t
    - i.e. trace -> tracing
- Update lib/rocprofiler/context/context.{hpp,cpp}
  - removed buffer_instance struct
  - removed allocate_buffer function
  - removed get_buffers function
  - changed buffer_tracing_service::buffer_array_t
- Update lib/rocprofiler/hsa: hsa.cpp, ostream.hpp, details folder
  - move ostream.hpp into details folder to prevent it from contributing to code coverage
  - update cmake build system for new directory

* Add lib/rocprofiler/registration.{hpp,cpp}

- implements rocprofiler_set_api_table (called by rocprofiler-register)
- miscellaneous functions for client configure/initialize/finalize
- functions for querying the init/fini status
- relocated OnLoad HSA workaround to this file
  - at present, this is used to work around ROCr not having rocprofiler-register integration yet
- implement rocprofiler_force_configure function
- implement rocprofiler_is_initialized function
- implement rocprofiler_is_finalized function
- ensure configure functions only invoked once
- ensure internal thread creation notification functions are invoked
- get_status is pair of atomics
- fix heap-use-after-free in init_logging
- update finalize
  - invoke hsa_shut_down
  - set all active contexts to null pointers
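
One way to get "configure functions only invoked once" together with an atomic status is a compare-exchange guard. The sketch below is an illustration under stated assumptions, not the code in registration.cpp: the names are hypothetical, and the 0/1/2 state encoding is invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical status holder: one atomic for initialization and one for
// finalization. Assumed encoding: 0 = not started, 1 = in progress, 2 = done.
struct example_status
{
    std::atomic<int> init{0};
    std::atomic<int> fini{0};
};

inline example_status&
example_get_status()
{
    static example_status status;
    return status;
}

// Runs `fn` only for the first caller; every later caller returns false
// without invoking it.
template <typename Func>
bool
example_configure_once(Func&& fn)
{
    int expected = 0;
    if(!example_get_status().init.compare_exchange_strong(expected, 1)) return false;
    std::forward<Func>(fn)();
    example_get_status().init.store(2);
    return true;
}

inline bool
example_is_initialized()
{
    return example_get_status().init.load() == 2;
}
```

In this sketch a second caller returns immediately and does not wait for the first caller to finish, so code that needs the configured state should check example_is_initialized rather than the return value alone.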

* Add lib/rocprofiler/buffer_tracing.cpp

- contains implementations of buffer_tracing (i.e. rocprofiler/buffer_tracing.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp

* Add lib/rocprofiler/buffer.{hpp,cpp}

- contains implementations of buffer (i.e. rocprofiler/buffer.h) and misc internal access functions
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp and lib/rocprofiler/context/context.{hpp,cpp}

* Add lib/rocprofiler/callback_tracing.cpp

- contains implementations of callback_tracing (i.e. rocprofiler/callback_tracing.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp

* Add lib/rocprofiler/context.cpp

- contains implementations of context public API functions (i.e. rocprofiler/context.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp

* Add lib/rocprofiler/internal_threading.{hpp,cpp}

- contains implementations of internal_threading (i.e. rocprofiler/internal_threading.h)
- also contains implementations of internal access functions
- update finalize function
  - join all task groups and destroy all thread pools first, then reset unique_ptr

* Update lib/rocprofiler/rocprofiler.cpp

- rocprofiler_get_version returns status
- implement rocprofiler_get_timestamp
- remove misc implementations that were split into other files

* Update lib/rocprofiler/CMakeLists.txt

- compile new implementation files
  - buffer.cpp
  - buffer_tracing.cpp
  - callback_tracing.cpp
  - context.cpp
  - internal_threading.cpp

* Update lib/tests/buffering/buffering-*.cpp

- update to reflect changes to rocprofiler_record_header_t

* Update CMakeLists.txt

- increase minimum cmake version to 3.21 which added HIP support as a language

* Add samples/apps/transpose

- simple HIP application for testing

* Add samples/api_callback_tracing

- HIP application and tool library
- This effectively demos how to set up HSA API tracing
  - For each function called in tool, it stores the func/file/line and prints it during finalization
- client.hpp and client.cpp are the tool library
- Implement use of rocprofiler_iterate_callback_trace_operation_args
- add demo of using rocprofiler_get_version
- add_test
  - remove PASS_REGULAR_EXPRESSION
    - causing false passes during memcheck
  - add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment
- check if rocprofiler is initialized before stopping context

* Add samples/api_buffered_tracing

- Sample demonstrating tracing the HSA API via buffering
- demo rocprofiler_record_header_compute_hash
- throw exceptions for unexpected buffer data
- add_test
  - remove PASS_REGULAR_EXPRESSION
    - causing false passes during memcheck
  - add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment

* Update samples/CMakeLists.txt

- add subdirectory for api_callback_tracing
- add subdirectory api_buffered_tracing

* Update samples/pc_sampling/common.h

- fix processing of headers

* Update lib/rocprofiler/hsa/details/ostream.hpp

- fix data race on HSA_depth_max_cnt and recursion
- HSA_depth_max_cnt and recursion are now thread-local static instead of global static
- replace std::string usage with std::string_view
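
The thread-local fix above follows a common pattern: a counter that tracks per-call nesting is made thread_local so that each thread updates its own copy and no synchronization is needed. The guard below is a hypothetical sketch of that pattern, not the code in ostream.hpp.

```cpp
#include <cassert>

// Hypothetical RAII guard tracking nesting depth per thread.
struct example_depth_guard
{
    example_depth_guard()
    {
        ++depth();
        if(depth() > max_depth()) max_depth() = depth();
    }

    ~example_depth_guard() { --depth(); }

    example_depth_guard(const example_depth_guard&) = delete;
    example_depth_guard& operator=(const example_depth_guard&) = delete;

    // each thread sees its own counter, so concurrent callers never race
    static int& depth()
    {
        static thread_local int value = 0;
        return value;
    }

    // deepest nesting observed on the calling thread
    static int& max_depth()
    {
        static thread_local int value = 0;
        return value;
    }
};
```

The trade-off is that the values are no longer process-wide: a maximum depth observed on one thread is not visible from another.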

* Actions update

- add dependabot.yml
- use actions/checkout@v4
- install latest libasan and libtsan in sanitizer containers

* Add PTL (Parallel Tasking Library) submodule
This commit is contained in:
Jonathan R. Madsen
2023-09-20 19:32:02 -05:00
committed by GitHub
parent 06f7b780f9
commit d3eaacd610
86 changed files with 7293 additions and 3792 deletions
+11
@@ -0,0 +1,11 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates
version: 2
updates:
  - package-ecosystem: "github-actions" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"
+5 -3
@@ -72,7 +72,7 @@ jobs:
needs: get_latest_mainline_build_number
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- name: List Files
shell: bash
@@ -161,7 +161,9 @@ jobs:
needs: get_latest_mainline_build_number
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
with:
submodules: true
- name: List Files
shell: bash
@@ -174,7 +176,7 @@ jobs:
shell: bash
run: |
pip3 install -r requirements.txt
apt install -y cmake libgtest-dev
apt install -y cmake libgtest-dev libasan8 libtsan2
git config --global --add safe.directory '*'
- name: Configure, Build, and Test
+5 -3
@@ -6,18 +6,20 @@ on:
branches: [main]
paths:
- '*.md'
- 'VERSION'
- 'source/docs/**'
- 'source/scripts/update-docs.sh'
- 'source/include/rocprofiler/*'
- '.github/workflows/docs.yml'
- 'VERSION'
pull_request:
branches: [main]
paths:
- '*.md'
- 'VERSION'
- 'source/docs/**'
- 'source/scripts/update-docs.sh'
- 'source/include/rocprofiler/*'
- '.github/workflows/docs.yml'
- 'VERSION'
concurrency:
group: "pages"
@@ -35,7 +37,7 @@ jobs:
id-token: write
steps:
- name: Checkout
uses: actions/checkout@v3
uses: actions/checkout@v4
with:
submodules: true
- name: Install Conda
+3 -3
@@ -19,7 +19,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Extract branch name
shell: bash
@@ -60,7 +60,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Install dependencies
run: |
@@ -105,7 +105,7 @@ jobs:
python-version: ['3.10']
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Extract branch name
shell: bash
+3
@@ -10,3 +10,6 @@
[submodule "source/docs/doxygen-awesome-css"]
path = external/doxygen-awesome-css
url = https://github.com/jothepro/doxygen-awesome-css.git
[submodule "external/ptl"]
path = external/ptl
url = https://github.com/jrmadsen/PTL
+1 -1
@@ -1,4 +1,4 @@
cmake_minimum_required(VERSION 3.16 FATAL_ERROR)
cmake_minimum_required(VERSION 3.21 FATAL_ERROR)
if(CMAKE_SOURCE_DIR STREQUAL CMAKE_BINARY_DIR AND CMAKE_CURRENT_SOURCE_DIR STREQUAL
CMAKE_SOURCE_DIR)
+8
@@ -146,3 +146,11 @@ find_package(
lib/cmake/amd_comgr)
target_link_libraries(rocprofiler-amd-comgr INTERFACE amd_comgr)
# ----------------------------------------------------------------------------------------#
#
# PTL (Parallel Tasking Library)
#
# ----------------------------------------------------------------------------------------#
target_link_libraries(rocprofiler-ptl INTERFACE PTL::ptl-static)
+1
@@ -49,3 +49,4 @@ rocprofiler_add_interface_library(rocprofiler-gtest "Google Test library" INTERN
rocprofiler_add_interface_library(rocprofiler-glog "Google Log library" INTERNAL)
rocprofiler_add_interface_library(rocprofiler-fmt "C++ format string library" INTERNAL)
rocprofiler_add_interface_library(rocprofiler-stdcxxfs "C++ filesystem library" INTERNAL)
rocprofiler_add_interface_library(rocprofiler-ptl "Parallel Tasking Library" INTERNAL)
+42
@@ -88,3 +88,45 @@ else()
find_package(fmt REQUIRED)
target_link_libraries(rocprofiler-fmt INTERFACE fmt::fmt)
endif()
if(NOT TARGET PTL::ptl-static)
rocprofiler_checkout_git_submodule(
RELATIVE_PATH external/ptl
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
REPO_URL https://github.com/jrmadsen/PTL.git
REPO_BRANCH rocprofiler)
set(PTL_BUILD_EXAMPLES OFF)
set(PTL_USE_TBB OFF)
set(PTL_USE_GPU OFF)
set(PTL_DEVELOPER_INSTALL OFF)
if(NOT DEFINED BUILD_OBJECT_LIBS)
set(BUILD_OBJECT_LIBS OFF)
endif()
if(NOT DEFINED BUILD_STATIC_LIBS)
set(BUILD_STATIC_LIBS OFF)
endif()
rocprofiler_save_variables(
BUILD_CONFIG
VARIABLES BUILD_SHARED_LIBS BUILD_STATIC_LIBS BUILD_OBJECT_LIBS
CMAKE_POSITION_INDEPENDENT_CODE CMAKE_CXX_VISIBILITY_PRESET
CMAKE_VISIBILITY_INLINES_HIDDEN)
set(BUILD_SHARED_LIBS OFF)
set(BUILD_STATIC_LIBS ON)
set(BUILD_OBJECT_LIBS OFF)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
set(CMAKE_CXX_VISIBILITY_PRESET "hidden")
set(CMAKE_VISIBILITY_INLINES_HIDDEN ON)
add_subdirectory(ptl EXCLUDE_FROM_ALL)
rocprofiler_restore_variables(
BUILD_CONFIG
VARIABLES BUILD_SHARED_LIBS BUILD_STATIC_LIBS BUILD_OBJECT_LIBS
CMAKE_POSITION_INDEPENDENT_CODE CMAKE_CXX_VISIBILITY_PRESET
CMAKE_VISIBILITY_INLINES_HIDDEN)
endif()
+1
Submodule external/ptl added at 7bbc5a4e66
+2
@@ -5,3 +5,5 @@ project(rocprofiler-samples LANGUAGES C CXX)
# add_subdirectory(api_tracing)
add_subdirectory(pc_sampling)
add_subdirectory(api_callback_tracing)
add_subdirectory(api_buffered_tracing)
+52
@@ -0,0 +1,52 @@
#
#
#
cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)
if(NOT CMAKE_HIP_COMPILER)
find_program(
amdclangpp_EXECUTABLE
NAMES amdclang++
HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATH_SUFFIXES bin llvm/bin NO_CACHE)
mark_as_advanced(amdclangpp_EXECUTABLE)
if(amdclangpp_EXECUTABLE)
set(CMAKE_HIP_COMPILER "${amdclangpp_EXECUTABLE}")
endif()
endif()
project(rocprofiler-samples-buffered-api-tracing LANGUAGES CXX HIP)
foreach(_TYPE DEBUG MINSIZEREL RELEASE RELWITHDEBINFO)
if("${CMAKE_HIP_FLAGS_${_TYPE}}" STREQUAL "")
set(CMAKE_HIP_FLAGS_${_TYPE} "${CMAKE_CXX_FLAGS_${_TYPE}}")
endif()
endforeach()
add_library(buffered-api-tracing-client SHARED)
target_sources(buffered-api-tracing-client PRIVATE client.cpp client.hpp)
target_link_libraries(buffered-api-tracing-client
PRIVATE rocprofiler::rocprofiler-library)
set_source_files_properties(main.cpp PROPERTIES LANGUAGE HIP)
find_package(Threads REQUIRED)
add_executable(buffered-api-tracing)
target_sources(buffered-api-tracing PRIVATE main.cpp)
target_link_libraries(buffered-api-tracing PRIVATE buffered-api-tracing-client
Threads::Threads)
add_test(NAME buffered-api-tracing COMMAND $<TARGET_FILE:buffered-api-tracing>)
set_tests_properties(
buffered-api-tracing
PROPERTIES
TIMEOUT
45
LABELS
"samples"
ENVIRONMENT
"${ROCPROFILER_MEMCHECK_PRELOAD_ENV};HSA_TOOLS_LIB=$<TARGET_FILE:rocprofiler::rocprofiler-library>"
)
+383
@@ -0,0 +1,383 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
// undefine NDEBUG so asserts are implemented
#ifdef NDEBUG
# undef NDEBUG
#endif
/**
* @file samples/api_buffered_tracing/client.cpp
*
* @brief Example rocprofiler client (tool)
*/
#include "client.hpp"
#include <rocprofiler/buffer.h>
#include <rocprofiler/fwd.h>
#include <rocprofiler/internal_threading.h>
#include <rocprofiler/registration.h>
#include <rocprofiler/rocprofiler.h>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <filesystem>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <sstream>
#include <stdexcept>
#include <string>
#include <string_view>
#include <thread>
#include <vector>
#define ROCPROFILER_CALL(result, msg) \
{ \
rocprofiler_status_t CHECKSTATUS = result; \
if(CHECKSTATUS != ROCPROFILER_STATUS_SUCCESS) \
{ \
std::cerr << #result << " failed with error code " << CHECKSTATUS << std::endl; \
throw std::runtime_error(#result " failure"); \
} \
}
namespace client
{
namespace
{
struct source_location
{
std::string function = {};
std::string file = {};
uint32_t line = 0;
std::string context = {};
};
using call_stack_t = std::vector<source_location>;
rocprofiler_client_id_t* client_id = nullptr;
rocprofiler_client_finalize_t client_fini_func = nullptr;
rocprofiler_context_id_t client_ctx = {};
rocprofiler_buffer_id_t client_buffer = {};
void
print_call_stack(const call_stack_t& _call_stack)
{
namespace fs = ::std::filesystem;
size_t n = 0;
for(const auto& itr : _call_stack)
{
std::clog << std::setw(2) << ++n << "/" << std::setw(2) << _call_stack.size() << " ";
std::clog << "[" << fs::path{itr.file}.filename() << ":" << itr.line << "] "
<< std::setw(20) << std::left << itr.function;
if(!itr.context.empty()) std::clog << " :: " << itr.context;
std::clog << "\n";
}
std::clog << std::flush;
}
void
store_buffer_id_names(call_stack_t* tool_data)
{
//
// callback for each kind operation
//
static auto tracing_operation_names_cb = [](rocprofiler_service_buffer_tracing_kind_t /*kindv*/,
uint32_t /*operation*/,
const char* operation_name,
void* data_v) {
static_cast<call_stack_t*>(data_v)->emplace_back(
source_location{"rocprofiler_iterate_buffer_trace_kind_operation_names",
__FILE__,
__LINE__,
std::string{" "} + std::string{operation_name}});
return 0;
};
//
// callback for each buffer kind (i.e. domain)
//
static auto tracing_kind_names_cb =
[](rocprofiler_service_buffer_tracing_kind_t kind, const char* kind_name, void* data) {
// store the buffer kind name
static_cast<call_stack_t*>(data)->emplace_back(
source_location{"rocprofiler_iterate_buffer_trace_kind_names ",
__FILE__,
__LINE__,
kind_name});
// store the operation names for the HSA API
if(kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API)
{
rocprofiler_iterate_buffer_tracing_kind_operation_names(
kind, tracing_operation_names_cb, data);
}
return 0;
};
rocprofiler_iterate_buffer_tracing_kind_names(tracing_kind_names_cb,
static_cast<void*>(tool_data));
}
void
tool_tracing_callback(rocprofiler_context_id_t context,
rocprofiler_buffer_id_t buffer_id,
rocprofiler_record_header_t** headers,
size_t num_headers,
void* user_data,
uint64_t drop_count)
{
assert(user_data != nullptr);
if(num_headers == 0)
throw std::runtime_error{
"rocprofiler invoked a buffer callback with no headers. this should never happen"};
else if(headers == nullptr)
throw std::runtime_error{"rocprofiler invoked a buffer callback with a null pointer to the "
"array of headers. this should never happen"};
for(size_t i = 0; i < num_headers; ++i)
{
auto* header = headers[i];
if(header == nullptr)
{
throw std::runtime_error{
"rocprofiler provided a null pointer to header. this should never happen"};
}
else if(header->hash !=
rocprofiler_record_header_compute_hash(header->category, header->kind))
{
throw std::runtime_error{"rocprofiler_record_header_t (category | kind) != hash"};
}
else if(header->category == ROCPROFILER_BUFFER_CATEGORY_TRACING &&
header->kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API)
{
auto* record =
static_cast<rocprofiler_buffer_tracing_hsa_api_record_t*>(header->payload);
auto info = std::stringstream{};
info << "tid=" << record->thread_id << ", context=" << context.handle
<< ", buffer_id=" << buffer_id.handle << ", cid=" << record->correlation_id.id
<< ", kind=" << record->kind << ", operation=" << record->operation
<< ", drop_count=" << drop_count << ", start=" << record->start_timestamp
<< ", stop=" << record->end_timestamp;
if(record->start_timestamp > record->end_timestamp)
throw std::runtime_error("start > end");
static_cast<call_stack_t*>(user_data)->emplace_back(
source_location{__FUNCTION__, __FILE__, __LINE__, info.str()});
}
else
{
throw std::runtime_error{"unexpected rocprofiler_record_header_t category + kind"};
}
}
}
void
thread_precreate(rocprofiler_internal_thread_library_t lib, void* tool_data)
{
static_cast<call_stack_t*>(tool_data)->emplace_back(
source_location{__FUNCTION__,
__FILE__,
__LINE__,
std::string{"internal thread about to be created by rocprofiler (lib="} +
std::to_string(lib) + ")"});
}
void
thread_postcreate(rocprofiler_internal_thread_library_t lib, void* tool_data)
{
static_cast<call_stack_t*>(tool_data)->emplace_back(
source_location{__FUNCTION__,
__FILE__,
__LINE__,
std::string{"internal thread was created by rocprofiler (lib="} +
std::to_string(lib) + ")"});
}
int
tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
{
assert(tool_data != nullptr);
static_cast<call_stack_t*>(tool_data)->emplace_back(
source_location{__FUNCTION__, __FILE__, __LINE__, ""});
store_buffer_id_names(static_cast<call_stack_t*>(tool_data));
client_fini_func = fini_func;
ROCPROFILER_CALL(rocprofiler_create_context(&client_ctx), "context creation failed");
ROCPROFILER_CALL(rocprofiler_create_buffer(client_ctx,
4096,
2048,
ROCPROFILER_BUFFER_POLICY_LOSSLESS,
tool_tracing_callback,
tool_data,
&client_buffer),
"buffer creation failed");
ROCPROFILER_CALL(
rocprofiler_configure_buffer_tracing_service(
client_ctx, ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, nullptr, 0, client_buffer),
"buffer tracing service failed to configure");
auto client_thread = rocprofiler_callback_thread_t{};
ROCPROFILER_CALL(rocprofiler_create_callback_thread(&client_thread),
"failure creating callback thread");
ROCPROFILER_CALL(rocprofiler_assign_callback_thread(client_buffer, client_thread),
"failed to assign thread for buffer");
int valid_ctx = 0;
ROCPROFILER_CALL(rocprofiler_context_is_valid(client_ctx, &valid_ctx),
"failure checking context validity");
if(valid_ctx == 0)
{
// notify rocprofiler that initialization failed
// and all the contexts, buffers, etc. created
// should be ignored
return -1;
}
ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed");
// no errors
return 0;
}
void
tool_fini(void* tool_data)
{
assert(tool_data != nullptr);
auto* _call_stack = static_cast<call_stack_t*>(tool_data);
_call_stack->emplace_back(source_location{__FUNCTION__, __FILE__, __LINE__, ""});
print_call_stack(*_call_stack);
delete _call_stack;
}
} // namespace
void
setup()
{
ROCPROFILER_CALL(rocprofiler_force_configure(&rocprofiler_configure),
"failed to force configuration");
}
void
shutdown()
{
if(client_id)
{
auto status = ROCPROFILER_STATUS_SUCCESS;
while((status = rocprofiler_flush_buffer(client_buffer)) ==
ROCPROFILER_STATUS_ERROR_BUFFER_BUSY)
{
std::this_thread::yield();
std::this_thread::sleep_for(std::chrono::milliseconds{10});
}
ROCPROFILER_CALL(status, "rocprofiler_flush_buffer failed");
while((status = rocprofiler_flush_buffer(client_buffer)) ==
ROCPROFILER_STATUS_ERROR_BUFFER_BUSY)
{
std::this_thread::yield();
std::this_thread::sleep_for(std::chrono::milliseconds{10});
}
client_fini_func(*client_id);
}
}
void
start()
{
ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed");
}
void
stop()
{
ROCPROFILER_CALL(rocprofiler_stop_context(client_ctx), "rocprofiler context stop failed");
}
} // namespace client
extern "C" rocprofiler_tool_configure_result_t*
rocprofiler_configure(uint32_t version,
const char* runtime_version,
uint32_t priority,
rocprofiler_client_id_t* id)
{
// only activate if main tool
if(priority > 0) return nullptr;
// set the client name
id->name = "ExampleTool";
// store client info
client::client_id = id;
// compute major/minor/patch version info
uint32_t major = version / 10000;
uint32_t minor = (version % 10000) / 100;
uint32_t patch = version % 100;
// generate info string
auto info = std::stringstream{};
info << id->name << " is using rocprofiler v" << major << "." << minor << "." << patch << " ("
<< runtime_version << ")";
std::clog << info.str() << std::endl;
auto* client_tool_data = new std::vector<client::source_location>{};
client_tool_data->emplace_back(
client::source_location{__FUNCTION__, __FILE__, __LINE__, info.str()});
ROCPROFILER_CALL(rocprofiler_at_internal_thread_create(
client::thread_precreate,
client::thread_postcreate,
ROCPROFILER_LIBRARY | ROCPROFILER_HSA_LIBRARY | ROCPROFILER_HIP_LIBRARY |
ROCPROFILER_MARKER_LIBRARY,
static_cast<void*>(client_tool_data)),
"failed to register for thread creation notifications");
// create configure data
static auto cfg =
rocprofiler_tool_configure_result_t{sizeof(rocprofiler_tool_configure_result_t),
&client::tool_init,
&client::tool_fini,
static_cast<void*>(client_tool_data)};
// return pointer to configure data
return &cfg;
}
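The `shutdown()` routine in this file keeps retrying the flush while the buffer reports that it is busy. The retry pattern can be sketched in isolation as below. This is only an illustration under stated assumptions: `flush_status`, `fake_buffer`, and `flush_with_retry` are hypothetical stand-ins and are not part of the rocprofiler API, and the bounded attempt count is an addition of this sketch (the sample itself loops until the buffer is no longer busy).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical status codes standing in for rocprofiler_status_t values.
enum class flush_status
{
    success,
    buffer_busy
};

// Hypothetical buffer that reports "busy" a fixed number of times before succeeding.
struct fake_buffer
{
    std::size_t busy_remaining = 0;
    std::size_t attempts       = 0;

    flush_status flush()
    {
        ++attempts;
        if(busy_remaining > 0)
        {
            --busy_remaining;
            return flush_status::buffer_busy;
        }
        return flush_status::success;
    }
};

// Retry the flush while the buffer is busy, sleeping briefly between attempts.
// Gives up after max_attempts so that a stuck buffer cannot hang shutdown forever.
inline flush_status
flush_with_retry(fake_buffer& buf, std::size_t max_attempts = 100)
{
    assert(max_attempts > 0);
    auto status = flush_status::buffer_busy;
    for(std::size_t i = 0; i < max_attempts; ++i)
    {
        status = buf.flush();
        if(status != flush_status::buffer_busy) break;
        std::this_thread::sleep_for(std::chrono::milliseconds{1});
    }
    return status;
}
```

A caller would check the returned status once, after the loop, in the same way the sample passes the final status to `ROCPROFILER_CALL`.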
+44
@@ -0,0 +1,44 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#ifdef buffered_api_tracing_client_EXPORTS
# define CLIENT_API __attribute__((visibility("default")))
#else
# define CLIENT_API
#endif
namespace client
{
void
setup() CLIENT_API;
void
shutdown() CLIENT_API;
void
start() CLIENT_API;
void
stop() CLIENT_API;
} // namespace client
+244
@@ -0,0 +1,244 @@
/*
Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#include "client.hpp"
#include "hip/hip_runtime.h"
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <random>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>
#define HIP_API_CALL(CALL) \
{ \
hipError_t error_ = (CALL); \
if(error_ != hipSuccess) \
{ \
auto _hip_api_print_lk = auto_lock_t{print_lock}; \
fprintf(stderr, \
"%s:%d :: HIP error : %s\n", \
__FILE__, \
__LINE__, \
hipGetErrorString(error_)); \
throw std::runtime_error("hip_api_call"); \
} \
}
namespace
{
using auto_lock_t = std::unique_lock<std::mutex>;
auto print_lock = std::mutex{};
size_t nthreads = 2;
size_t nitr = 500;
size_t nsync = 10;
constexpr unsigned shared_mem_tile_dim = 32;
void
check_hip_error(void);
void
verify(int* in, int* out, int M, int N);
} // namespace
__global__ void
transpose_a(int* in, int* out, int M, int N);
void
run(int rank, int tid, hipStream_t stream, int argc, char** argv);
int
main(int argc, char** argv)
{
client::setup(); // forces rocprofiler to configure/initialize
client::start(); // starts context before any API tables are available
int rank = 0;
int size = 1;
for(int i = 1; i < argc; ++i)
{
auto _arg = std::string{argv[i]};
if(_arg == "?" || _arg == "-h" || _arg == "--help")
{
fprintf(stderr,
"usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATION (%zu)] "
"[SYNC_EVERY_N_ITERATIONS (%zu)]\n",
nthreads,
nitr,
nsync);
exit(EXIT_SUCCESS);
}
}
if(argc > 1) nthreads = atoll(argv[1]);
if(argc > 2) nitr = atoll(argv[2]);
if(argc > 3) nsync = atoll(argv[3]);
printf("[transpose] Number of threads: %zu\n", nthreads);
printf("[transpose] Number of iterations: %zu\n", nitr);
printf("[transpose] Syncing every %zu iterations\n", nsync);
// this is a temporary workaround in omnitrace when HIP + MPI is enabled
int ndevice = 0;
int devid = rank;
HIP_API_CALL(hipGetDeviceCount(&ndevice));
printf("[transpose] Number of devices found: %i\n", ndevice);
if(ndevice > 0)
{
devid = rank % ndevice;
HIP_API_CALL(hipSetDevice(devid));
printf("[transpose] Rank %i assigned to device %i\n", rank, devid);
}
if(rank == devid && rank < ndevice)
{
std::vector<std::thread> _threads{};
std::vector<hipStream_t> _streams(nthreads);
for(size_t i = 0; i < nthreads; ++i)
HIP_API_CALL(hipStreamCreate(&_streams.at(i)));
for(size_t i = 1; i < nthreads; ++i)
_threads.emplace_back(run, rank, i, _streams.at(i), argc, argv);
run(rank, 0, _streams.at(0), argc, argv);
for(auto& itr : _threads)
itr.join();
for(size_t i = 0; i < nthreads; ++i)
HIP_API_CALL(hipStreamDestroy(_streams.at(i)));
}
HIP_API_CALL(hipDeviceSynchronize());
HIP_API_CALL(hipDeviceReset());
client::stop();
client::shutdown();
return 0;
}
__global__ void
transpose_a(int* in, int* out, int M, int N)
{
__shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim];
int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x;
tile[threadIdx.y][threadIdx.x] = in[idx];
__syncthreads();
idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x;
out[idx] = tile[threadIdx.x][threadIdx.y];
}
void
run(int rank, int tid, hipStream_t stream, int argc, char** argv)
{
unsigned int M = 4960 * 2;
unsigned int N = 4960 * 2;
if(argc > 2) nitr = atoll(argv[2]);
if(argc > 3) nsync = atoll(argv[3]);
auto_lock_t _lk{print_lock};
std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl;
_lk.unlock();
std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)};
std::uniform_int_distribution<int> _dist{0, 1000};
size_t size = sizeof(int) * M * N;
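// allocate one int per matrix element (size is a byte count, not an element count)
int* inp_matrix = new int[size / sizeof(int)];
int* out_matrix = new int[size / sizeof(int)];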
for(size_t i = 0; i < M * N; i++)
{
inp_matrix[i] = _dist(_engine);
out_matrix[i] = 0;
}
int* in = nullptr;
int* out = nullptr;
HIP_API_CALL(hipMalloc(&in, size));
HIP_API_CALL(hipMalloc(&out, size));
HIP_API_CALL(hipMemsetAsync(in, 0, size, stream));
HIP_API_CALL(hipMemsetAsync(out, 0, size, stream));
HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream));
HIP_API_CALL(hipStreamSynchronize(stream));
dim3 grid(M / 32, N / 32, 1);
dim3 block(32, 32, 1); // transpose_a
auto t1 = std::chrono::high_resolution_clock::now();
for(size_t i = 0; i < nitr; ++i)
{
transpose_a<<<grid, block, 0, stream>>>(in, out, M, N);
check_hip_error();
if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream));
}
// wait for all queued kernels to finish before stopping the timer
HIP_API_CALL(hipStreamSynchronize(stream));
auto t2 = std::chrono::high_resolution_clock::now();
HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream));
double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
float GB = (float) size * nitr * 2 / (1 << 30);
print_lock.lock();
std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n"
<< "The average performance of transpose is " << GB / time << " GBytes/sec"
<< std::endl;
print_lock.unlock();
HIP_API_CALL(hipStreamSynchronize(stream));
// cpu_transpose(matrix, out_matrix, M, N);
verify(inp_matrix, out_matrix, M, N);
HIP_API_CALL(hipFree(in));
HIP_API_CALL(hipFree(out));
delete[] inp_matrix;
delete[] out_matrix;
}
namespace
{
void
check_hip_error(void)
{
hipError_t err = hipGetLastError();
if(err != hipSuccess)
{
auto_lock_t _lk{print_lock};
std::cerr << "Error: " << hipGetErrorString(err) << std::endl;
throw std::runtime_error("hip_api_call");
}
}
void
verify(int* in, int* out, int M, int N)
{
for(int i = 0; i < 10; i++)
{
int row = rand() % M;
int col = rand() % N;
if(in[row * N + col] != out[col * M + row])
{
auto_lock_t _lk{print_lock};
std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | "
<< out[col * M + row] << "\n";
}
}
}
} // namespace
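The `verify` function above spot-checks ten random elements using the relation `in[row * N + col] == out[col * M + row]`. A host-side reference transpose that follows the same indexing convention can be sketched as below. The sample only mentions `cpu_transpose` in a commented-out call, so this implementation and the `count_mismatches` helper are assumptions made for illustration, not the author's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Transpose a row-major M x N matrix into a row-major N x M matrix on the host.
inline std::vector<int>
cpu_transpose(const std::vector<int>& in, std::size_t M, std::size_t N)
{
    assert(in.size() == M * N);
    auto out = std::vector<int>(M * N, 0);
    for(std::size_t row = 0; row < M; ++row)
        for(std::size_t col = 0; col < N; ++col)
            out[col * M + row] = in[row * N + col];
    return out;
}

// Count every element that violates out[col][row] == in[row][col].
// Unlike the sampled check in verify(), this visits the whole matrix.
inline std::size_t
count_mismatches(const std::vector<int>& in,
                 const std::vector<int>& out,
                 std::size_t             M,
                 std::size_t             N)
{
    assert(in.size() == M * N && out.size() == M * N);
    std::size_t mismatches = 0;
    for(std::size_t row = 0; row < M; ++row)
        for(std::size_t col = 0; col < N; ++col)
            if(in[row * N + col] != out[col * M + row]) ++mismatches;
    return mismatches;
}
```

An exhaustive check like this costs O(M * N) on the host, which is presumably why the sample only samples a handful of elements per run.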
+52
@@ -0,0 +1,52 @@
#
#
#
cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)
if(NOT CMAKE_HIP_COMPILER)
find_program(
amdclangpp_EXECUTABLE
NAMES amdclang++
HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATH_SUFFIXES bin llvm/bin NO_CACHE)
mark_as_advanced(amdclangpp_EXECUTABLE)
if(amdclangpp_EXECUTABLE)
set(CMAKE_HIP_COMPILER "${amdclangpp_EXECUTABLE}")
endif()
endif()
project(rocprofiler-samples-callback-api-tracing LANGUAGES CXX HIP)
foreach(_TYPE DEBUG MINSIZEREL RELEASE RELWITHDEBINFO)
if("${CMAKE_HIP_FLAGS_${_TYPE}}" STREQUAL "")
set(CMAKE_HIP_FLAGS_${_TYPE} "${CMAKE_CXX_FLAGS_${_TYPE}}")
endif()
endforeach()
add_library(callback-api-tracing-client SHARED)
target_sources(callback-api-tracing-client PRIVATE client.cpp client.hpp)
target_link_libraries(callback-api-tracing-client
PRIVATE rocprofiler::rocprofiler-library)
set_source_files_properties(main.cpp PROPERTIES LANGUAGE HIP)
find_package(Threads REQUIRED)
add_executable(callback-api-tracing)
target_sources(callback-api-tracing PRIVATE main.cpp)
target_link_libraries(callback-api-tracing PRIVATE callback-api-tracing-client
Threads::Threads)
add_test(NAME callback-api-tracing COMMAND $<TARGET_FILE:callback-api-tracing>)
set_tests_properties(
callback-api-tracing
PROPERTIES
TIMEOUT
45
LABELS
"samples"
ENVIRONMENT
"${ROCPROFILER_MEMCHECK_PRELOAD_ENV};HSA_TOOLS_LIB=$<TARGET_FILE:rocprofiler::rocprofiler-library>"
)
+317
@@ -0,0 +1,317 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
// undefine NDEBUG so that asserts remain enabled in release builds
#ifdef NDEBUG
# undef NDEBUG
#endif
/**
* @file samples/api_callback_tracing/client.cpp
*
* @brief Example rocprofiler client (tool)
*/
#include "client.hpp"
#include <rocprofiler/registration.h>
#include <rocprofiler/rocprofiler.h>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <filesystem>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <sstream>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>
#define ROCPROFILER_CALL(result, msg) \
{ \
rocprofiler_status_t CHECKSTATUS = result; \
if(CHECKSTATUS != ROCPROFILER_STATUS_SUCCESS) \
{ \
std::cerr << msg << ": " << #result << " failed with error code " << CHECKSTATUS \
<< std::endl; \
throw std::runtime_error(#result " failure"); \
} \
}
namespace client
{
namespace
{
struct source_location
{
std::string function = {};
std::string file = {};
uint32_t line = 0;
std::string context = {};
};
using call_stack_t = std::vector<source_location>;
rocprofiler_client_id_t* client_id = nullptr;
rocprofiler_client_finalize_t client_fini_func = nullptr;
rocprofiler_context_id_t client_ctx = {};
void
print_call_stack(const call_stack_t& _call_stack)
{
namespace fs = ::std::filesystem;
size_t n = 0;
for(const auto& itr : _call_stack)
{
std::clog << std::setw(2) << ++n << "/" << std::setw(2) << _call_stack.size() << " ";
std::clog << "[" << fs::path{itr.file}.filename() << ":" << itr.line << "] "
<< std::setw(20) << std::left << itr.function;
if(!itr.context.empty()) std::clog << " :: " << itr.context;
std::clog << "\n";
}
std::clog << std::flush;
}
void
store_callback_id_names(call_stack_t* tool_data)
{
//
// callback invoked for each operation of a given callback kind
//
static auto tracing_operation_names_cb =
[](rocprofiler_service_callback_tracing_kind_t /*kindv*/,
uint32_t /*operation*/,
const char* operation_name,
void* data_v) {
static_cast<call_stack_t*>(data_v)->emplace_back(
source_location{"rocprofiler_iterate_callback_tracing_kind_operation_names",
__FILE__,
__LINE__,
std::string{" "} + std::string{operation_name}});
return 0;
};
//
// callback for each callback kind (i.e. domain)
//
static auto tracing_kind_names_cb = [](rocprofiler_service_callback_tracing_kind_t kind,
const char* kind_name,
void* data) {
// store the callback kind name
static_cast<call_stack_t*>(data)->emplace_back(source_location{
"rocprofiler_iterate_callback_tracing_kind_names ", __FILE__, __LINE__, kind_name});
// store the operation names for the HSA API
if(kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API)
{
rocprofiler_iterate_callback_tracing_kind_operation_names(
kind, tracing_operation_names_cb, data);
}
return 0;
};
rocprofiler_iterate_callback_tracing_kind_names(tracing_kind_names_cb,
static_cast<void*>(tool_data));
}
void
tool_tracing_callback(rocprofiler_callback_tracing_record_t record, void* user_data)
{
assert(user_data != nullptr);
auto info = std::stringstream{};
info << "tid=" << record.thread_id << ", cid=" << record.correlation_id.id
<< ", kind=" << record.kind << ", operation=" << record.operation
<< ", phase=" << record.phase;
auto info_data_cb = [](rocprofiler_service_callback_tracing_kind_t,
uint32_t,
uint32_t arg_num,
const char* arg_name,
const char* arg_value_str,
const void* const arg_value_addr,
void* cb_data) -> int {
auto& dss = *static_cast<std::stringstream*>(cb_data);
dss << ((arg_num == 0) ? "(" : ", ");
dss << arg_num << ": " << arg_name << "=" << arg_value_str;
(void) arg_value_addr;
return 0;
};
auto info_data = std::stringstream{};
ROCPROFILER_CALL(rocprofiler_iterate_callback_tracing_operation_args(
record, info_data_cb, static_cast<void*>(&info_data)),
"Failure iterating trace operation args");
auto info_data_str = info_data.str();
if(!info_data_str.empty()) info << " " << info_data_str << ")";
// guard the shared call stack: callbacks may arrive from multiple threads
static auto _mutex = std::mutex{};
std::lock_guard<std::mutex> _lk{_mutex};
static_cast<call_stack_t*>(user_data)->emplace_back(
source_location{__FUNCTION__, __FILE__, __LINE__, info.str()});
}
int
tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
{
assert(tool_data != nullptr);
static_cast<call_stack_t*>(tool_data)->emplace_back(
source_location{__FUNCTION__, __FILE__, __LINE__, ""});
store_callback_id_names(static_cast<call_stack_t*>(tool_data));
client_fini_func = fini_func;
ROCPROFILER_CALL(rocprofiler_create_context(&client_ctx), "context creation failed");
ROCPROFILER_CALL(
rocprofiler_configure_callback_tracing_service(client_ctx,
ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API,
nullptr,
0,
tool_tracing_callback,
tool_data),
"callback tracing service failed to configure");
int valid_ctx = 0;
ROCPROFILER_CALL(rocprofiler_context_is_valid(client_ctx, &valid_ctx),
"failure checking context validity");
if(valid_ctx == 0)
{
// notify rocprofiler that initialization failed
// and all the contexts, buffers, etc. created
// should be ignored
return -1;
}
ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed");
// no errors
return 0;
}
void
tool_fini(void* tool_data)
{
assert(tool_data != nullptr);
auto* _call_stack = static_cast<call_stack_t*>(tool_data);
_call_stack->emplace_back(source_location{__FUNCTION__, __FILE__, __LINE__, ""});
print_call_stack(*_call_stack);
delete _call_stack;
}
} // namespace
void
setup()
{}
void
shutdown()
{
if(client_id) client_fini_func(*client_id);
}
void
start()
{
ROCPROFILER_CALL(rocprofiler_start_context(client_ctx), "rocprofiler context start failed");
}
void
stop()
{
int status = 0;
ROCPROFILER_CALL(rocprofiler_is_initialized(&status), "failed to retrieve init status");
if(status != 0)
{
ROCPROFILER_CALL(rocprofiler_stop_context(client_ctx), "rocprofiler context stop failed");
}
}
} // namespace client
extern "C" rocprofiler_tool_configure_result_t*
rocprofiler_configure(uint32_t version,
const char* runtime_version,
uint32_t priority,
rocprofiler_client_id_t* id)
{
// only activate if main tool
if(priority > 0) return nullptr;
// set the client name
id->name = "ExampleTool";
// store client info
client::client_id = id;
// compute major/minor/patch version info
uint32_t major = version / 10000;
uint32_t minor = (version % 10000) / 100;
uint32_t patch = version % 100;
// generate info string
auto info = std::stringstream{};
info << id->name << " is using rocprofiler v" << major << "." << minor << "." << patch << " ("
<< runtime_version << ")";
std::clog << info.str() << std::endl;
// demonstration of alternative way to get the version info
{
auto version_info = std::array<uint32_t, 3>{};
ROCPROFILER_CALL(
rocprofiler_get_version(&version_info.at(0), &version_info.at(1), &version_info.at(2)),
"failed to get version info");
if(std::array<uint32_t, 3>{major, minor, patch} != version_info)
{
throw std::runtime_error{"version info mismatch"};
}
}
// data passed around all the callbacks
auto* client_tool_data = new std::vector<client::source_location>{};
// add first entry
client_tool_data->emplace_back(
client::source_location{__FUNCTION__, __FILE__, __LINE__, info.str()});
// create configure data
static auto cfg =
rocprofiler_tool_configure_result_t{sizeof(rocprofiler_tool_configure_result_t),
&client::tool_init,
&client::tool_fini,
static_cast<void*>(client_tool_data)};
// return pointer to configure data
return &cfg;
}
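`rocprofiler_configure` above splits the packed `version` argument into major, minor, and patch components with integer division and modulo. The arithmetic can be sketched on its own as below. The helper names `decode_version` and `encode_version` are hypothetical, and the packing `major * 10000 + minor * 100 + patch` is inferred from the decoding shown in the sample rather than taken from the rocprofiler headers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Split a packed version (assumed to be major * 10000 + minor * 100 + patch)
// into {major, minor, patch}, mirroring the arithmetic in rocprofiler_configure.
inline std::array<std::uint32_t, 3>
decode_version(std::uint32_t version)
{
    return std::array<std::uint32_t, 3>{
        {version / 10000, (version % 10000) / 100, version % 100}};
}

// Inverse of decode_version; minor and patch must each fit in two decimal digits.
inline std::uint32_t
encode_version(std::uint32_t major, std::uint32_t minor, std::uint32_t patch)
{
    assert(minor < 100 && patch < 100);
    return major * 10000 + minor * 100 + patch;
}
```

For example, a packed value of 10203 decodes to major 1, minor 2, patch 3, and encoding those three components returns 10203 again.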
+44
@@ -0,0 +1,44 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#ifdef callback_api_tracing_client_EXPORTS
# define CLIENT_API __attribute__((visibility("default")))
#else
# define CLIENT_API
#endif
namespace client
{
void
setup() CLIENT_API;
void
shutdown() CLIENT_API;
void
start() CLIENT_API;
void
stop() CLIENT_API;
} // namespace client
+244
@@ -0,0 +1,244 @@
/*
Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#include "client.hpp"
#include "hip/hip_runtime.h"
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <random>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>
#define HIP_API_CALL(CALL) \
{ \
hipError_t error_ = (CALL); \
if(error_ != hipSuccess) \
{ \
auto _hip_api_print_lk = auto_lock_t{print_lock}; \
fprintf(stderr, \
"%s:%d :: HIP error : %s\n", \
__FILE__, \
__LINE__, \
hipGetErrorString(error_)); \
throw std::runtime_error("hip_api_call"); \
} \
}
namespace
{
using auto_lock_t = std::unique_lock<std::mutex>;
auto print_lock = std::mutex{};
size_t nthreads = 2;
size_t nitr = 500;
size_t nsync = 10;
constexpr unsigned shared_mem_tile_dim = 32;
void
check_hip_error(void);
void
verify(int* in, int* out, int M, int N);
} // namespace
__global__ void
transpose_a(int* in, int* out, int M, int N);
void
run(int rank, int tid, hipStream_t stream, int argc, char** argv);
int
main(int argc, char** argv)
{
client::setup(); // currently does nothing
// client::start(); // currently will fail
int rank = 0;
int size = 1;
for(int i = 1; i < argc; ++i)
{
auto _arg = std::string{argv[i]};
if(_arg == "?" || _arg == "-h" || _arg == "--help")
{
fprintf(stderr,
"usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATION (%zu)] "
"[SYNC_EVERY_N_ITERATIONS (%zu)]\n",
nthreads,
nitr,
nsync);
exit(EXIT_SUCCESS);
}
}
if(argc > 1) nthreads = atoll(argv[1]);
if(argc > 2) nitr = atoll(argv[2]);
if(argc > 3) nsync = atoll(argv[3]);
printf("[transpose] Number of threads: %zu\n", nthreads);
printf("[transpose] Number of iterations: %zu\n", nitr);
printf("[transpose] Syncing every %zu iterations\n", nsync);
// this is a temporary workaround in omnitrace when HIP + MPI is enabled
int ndevice = 0;
int devid = rank;
HIP_API_CALL(hipGetDeviceCount(&ndevice));
printf("[transpose] Number of devices found: %i\n", ndevice);
if(ndevice > 0)
{
devid = rank % ndevice;
HIP_API_CALL(hipSetDevice(devid));
printf("[transpose] Rank %i assigned to device %i\n", rank, devid);
}
if(rank == devid && rank < ndevice)
{
std::vector<std::thread> _threads{};
std::vector<hipStream_t> _streams(nthreads);
for(size_t i = 0; i < nthreads; ++i)
HIP_API_CALL(hipStreamCreate(&_streams.at(i)));
for(size_t i = 1; i < nthreads; ++i)
_threads.emplace_back(run, rank, i, _streams.at(i), argc, argv);
run(rank, 0, _streams.at(0), argc, argv);
for(auto& itr : _threads)
itr.join();
for(size_t i = 0; i < nthreads; ++i)
HIP_API_CALL(hipStreamDestroy(_streams.at(i)));
}
HIP_API_CALL(hipDeviceSynchronize());
HIP_API_CALL(hipDeviceReset());
client::stop();
client::shutdown();
return 0;
}
__global__ void
transpose_a(int* in, int* out, int M, int N)
{
__shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim];
int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x;
tile[threadIdx.y][threadIdx.x] = in[idx];
__syncthreads();
idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x;
out[idx] = tile[threadIdx.x][threadIdx.y];
}
void
run(int rank, int tid, hipStream_t stream, int argc, char** argv)
{
unsigned int M = 4960 * 2;
unsigned int N = 4960 * 2;
if(argc > 2) nitr = atoll(argv[2]);
if(argc > 3) nsync = atoll(argv[3]);
auto_lock_t _lk{print_lock};
std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl;
_lk.unlock();
std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)};
std::uniform_int_distribution<int> _dist{0, 1000};
size_t size = sizeof(int) * M * N;
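// allocate one int per matrix element (size is a byte count, not an element count)
int* inp_matrix = new int[size / sizeof(int)];
int* out_matrix = new int[size / sizeof(int)];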
for(size_t i = 0; i < M * N; i++)
{
inp_matrix[i] = _dist(_engine);
out_matrix[i] = 0;
}
int* in = nullptr;
int* out = nullptr;
HIP_API_CALL(hipMalloc(&in, size));
HIP_API_CALL(hipMalloc(&out, size));
HIP_API_CALL(hipMemsetAsync(in, 0, size, stream));
HIP_API_CALL(hipMemsetAsync(out, 0, size, stream));
HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream));
HIP_API_CALL(hipStreamSynchronize(stream));
dim3 grid(M / 32, N / 32, 1);
dim3 block(32, 32, 1); // transpose_a
auto t1 = std::chrono::high_resolution_clock::now();
for(size_t i = 0; i < nitr; ++i)
{
transpose_a<<<grid, block, 0, stream>>>(in, out, M, N);
check_hip_error();
if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream));
}
// wait for all queued kernels to finish before stopping the timer
HIP_API_CALL(hipStreamSynchronize(stream));
auto t2 = std::chrono::high_resolution_clock::now();
HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream));
double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
float GB = (float) size * nitr * 2 / (1 << 30);
print_lock.lock();
std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n"
<< "The average performance of transpose is " << GB / time << " GBytes/sec"
<< std::endl;
print_lock.unlock();
HIP_API_CALL(hipStreamSynchronize(stream));
// cpu_transpose(matrix, out_matrix, M, N);
verify(inp_matrix, out_matrix, M, N);
HIP_API_CALL(hipFree(in));
HIP_API_CALL(hipFree(out));
delete[] inp_matrix;
delete[] out_matrix;
}
namespace
{
void
check_hip_error(void)
{
hipError_t err = hipGetLastError();
if(err != hipSuccess)
{
auto_lock_t _lk{print_lock};
std::cerr << "Error: " << hipGetErrorString(err) << std::endl;
throw std::runtime_error("hip_api_call");
}
}
void
verify(int* in, int* out, int M, int N)
{
for(int i = 0; i < 10; i++)
{
int row = rand() % M;
int col = rand() % N;
if(in[row * N + col] != out[col * M + row])
{
auto_lock_t _lk{print_lock};
std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | "
<< out[col * M + row] << "\n";
}
}
}
} // namespace
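The `run` function above reports throughput as `size * nitr * 2 / (1 << 30)` divided by the elapsed time, where the factor of 2 accounts for one read and one write of the matrix per iteration. The same arithmetic is sketched below in double precision. The helper names are hypothetical; note that dividing by 2^30 yields GiB, even though the sample labels its output "GBytes/sec".

```cpp
#include <cassert>
#include <cstddef>

// Data moved by the transpose loop, in GiB: every iteration reads the
// whole matrix once and writes it once, hence the factor of 2.
inline double
transferred_gib(std::size_t matrix_bytes, std::size_t iterations)
{
    constexpr double bytes_per_gib = 1024.0 * 1024.0 * 1024.0;
    return static_cast<double>(matrix_bytes) * static_cast<double>(iterations) * 2.0 /
           bytes_per_gib;
}

// Average throughput in GiB per second over the measured interval.
inline double
throughput_gib_per_sec(std::size_t matrix_bytes, std::size_t iterations, double seconds)
{
    assert(seconds > 0.0);
    return transferred_gib(matrix_bytes, iterations) / seconds;
}
```

With the sample defaults (a 9920 x 9920 matrix of 4-byte ints and 500 iterations), `transferred_gib(393625600, 500)` comes to roughly 366.6 GiB per thread.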
+38
@@ -0,0 +1,38 @@
cmake_minimum_required(VERSION 3.21 FATAL_ERROR)
find_program(
HIPCC_EXECUTABLE
NAMES hipcc
HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm NO_CACHE)
mark_as_advanced(HIPCC_EXECUTABLE)
if(HIPCC_EXECUTABLE)
set(CMAKE_CXX_COMPILER ${HIPCC_EXECUTABLE})
endif()
project(rocprofiler-transpose-sample LANGUAGES CXX)
option(TRANSPOSE_USE_MPI "Enable MPI support in transpose exe" OFF)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
add_executable(transpose)
target_sources(transpose PRIVATE transpose.cpp)
target_compile_options(transpose PRIVATE -W -Wall -Wextra -Wpedantic -Wshadow -Werror)
find_package(Threads REQUIRED)
target_link_libraries(transpose PRIVATE Threads::Threads)
if(TRANSPOSE_USE_MPI)
find_package(MPI REQUIRED)
target_compile_definitions(transpose PRIVATE USE_MPI)
target_link_libraries(transpose PRIVATE MPI::MPI_C)
endif()
install(
TARGETS transpose
DESTINATION bin
COMPONENT rocprofiler-samples)
+278
@@ -0,0 +1,278 @@
/*
Copyright (c) 2015-2020 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#include "hip/hip_runtime.h"
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <random>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>
#if defined(USE_MPI)
# include <mpi.h>
#endif
#define HIP_API_CALL(CALL) \
{ \
hipError_t error_ = (CALL); \
if(error_ != hipSuccess) \
{ \
auto _hip_api_print_lk = auto_lock_t{print_lock}; \
fprintf(stderr, \
"%s:%d :: HIP error : %s\n", \
__FILE__, \
__LINE__, \
hipGetErrorString(error_)); \
throw std::runtime_error("hip_api_call"); \
} \
}
namespace
{
using auto_lock_t = std::unique_lock<std::mutex>;
auto print_lock = std::mutex{};
size_t nthreads = 2;
size_t nitr = 500;
size_t nsync = 10;
constexpr unsigned shared_mem_tile_dim = 32;
void
check_hip_error(void);
void
verify(int* in, int* out, int M, int N);
} // namespace
__global__ void
transpose_a(int* in, int* out, int M, int N);
void
run(int rank, int tid, hipStream_t stream, int argc, char** argv);
#if defined(USE_MPI)
void
do_a2a(int rank);
#endif
int
main(int argc, char** argv)
{
int rank = 0;
int size = 1;
for(int i = 1; i < argc; ++i)
{
auto _arg = std::string{argv[i]};
if(_arg == "?" || _arg == "-h" || _arg == "--help")
{
fprintf(stderr,
"usage: transpose [NUM_THREADS (%zu)] [NUM_ITERATION (%zu)] "
"[SYNC_EVERY_N_ITERATIONS (%zu)]\n",
nthreads,
nitr,
nsync);
exit(EXIT_SUCCESS);
}
}
if(argc > 1) nthreads = atoll(argv[1]);
if(argc > 2) nitr = atoll(argv[2]);
if(argc > 3) nsync = atoll(argv[3]);
printf("[transpose] Number of threads: %zu\n", nthreads);
printf("[transpose] Number of iterations: %zu\n", nitr);
printf("[transpose] Syncing every %zu iterations\n", nsync);
#if defined(USE_MPI)
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
#else
(void) size;
#endif
// this is a temporary workaround in omnitrace when HIP + MPI is enabled
int ndevice = 0;
int devid = rank;
HIP_API_CALL(hipGetDeviceCount(&ndevice));
printf("[transpose] Number of devices found: %i\n", ndevice);
if(ndevice > 0)
{
devid = rank % ndevice;
HIP_API_CALL(hipSetDevice(devid));
printf("[transpose] Rank %i assigned to device %i\n", rank, devid);
}
if(rank == devid && rank < ndevice)
{
std::vector<std::thread> _threads{};
std::vector<hipStream_t> _streams(nthreads);
for(size_t i = 0; i < nthreads; ++i)
HIP_API_CALL(hipStreamCreate(&_streams.at(i)));
for(size_t i = 1; i < nthreads; ++i)
_threads.emplace_back(run, rank, i, _streams.at(i), argc, argv);
run(rank, 0, _streams.at(0), argc, argv);
for(auto& itr : _threads)
itr.join();
for(size_t i = 0; i < nthreads; ++i)
HIP_API_CALL(hipStreamDestroy(_streams.at(i)));
}
HIP_API_CALL(hipDeviceSynchronize());
HIP_API_CALL(hipDeviceReset());
#if defined(USE_MPI)
MPI_Barrier(MPI_COMM_WORLD);
do_a2a(rank);
MPI_Finalize();
#endif
return 0;
}
__global__ void
transpose_a(int* in, int* out, int M, int N)
{
__shared__ int tile[shared_mem_tile_dim][shared_mem_tile_dim];
int idx = (blockIdx.y * blockDim.y + threadIdx.y) * M + blockIdx.x * blockDim.x + threadIdx.x;
tile[threadIdx.y][threadIdx.x] = in[idx];
__syncthreads();
idx = (blockIdx.x * blockDim.x + threadIdx.y) * N + blockIdx.y * blockDim.y + threadIdx.x;
out[idx] = tile[threadIdx.x][threadIdx.y];
}
void
run(int rank, int tid, hipStream_t stream, int argc, char** argv)
{
unsigned int M = 4960 * 2;
unsigned int N = 4960 * 2;
if(argc > 2) nitr = atoll(argv[2]);
if(argc > 3) nsync = atoll(argv[3]);
auto_lock_t _lk{print_lock};
std::cout << "[" << rank << "][" << tid << "] M: " << M << " N: " << N << std::endl;
_lk.unlock();
std::default_random_engine _engine{std::random_device{}() * (rank + 1) * (tid + 1)};
std::uniform_int_distribution<int> _dist{0, 1000};
size_t size = sizeof(int) * M * N;
int* inp_matrix = new int[M * N];  // element count, not byte count
int* out_matrix = new int[M * N];
for(size_t i = 0; i < M * N; i++)
{
inp_matrix[i] = _dist(_engine);
out_matrix[i] = 0;
}
int* in = nullptr;
int* out = nullptr;
HIP_API_CALL(hipMalloc(&in, size));
HIP_API_CALL(hipMalloc(&out, size));
HIP_API_CALL(hipMemsetAsync(in, 0, size, stream));
HIP_API_CALL(hipMemsetAsync(out, 0, size, stream));
HIP_API_CALL(hipMemcpyAsync(in, inp_matrix, size, hipMemcpyHostToDevice, stream));
HIP_API_CALL(hipStreamSynchronize(stream));
dim3 grid(M / 32, N / 32, 1);
dim3 block(32, 32, 1); // transpose_a
auto t1 = std::chrono::high_resolution_clock::now();
for(size_t i = 0; i < nitr; ++i)
{
transpose_a<<<grid, block, 0, stream>>>(in, out, M, N);
check_hip_error();
if(i % nsync == (nsync - 1)) HIP_API_CALL(hipStreamSynchronize(stream));
}
auto t2 = std::chrono::high_resolution_clock::now();
HIP_API_CALL(hipStreamSynchronize(stream));
HIP_API_CALL(hipMemcpyAsync(out_matrix, out, size, hipMemcpyDeviceToHost, stream));
double time = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
float GB = (float) size * nitr * 2 / (1 << 30);
print_lock.lock();
std::cout << "[" << rank << "][" << tid << "] Runtime of transpose is " << time << " sec\n"
<< "The average performance of transpose is " << GB / time << " GBytes/sec"
<< std::endl;
print_lock.unlock();
HIP_API_CALL(hipStreamSynchronize(stream));
// cpu_transpose(matrix, out_matrix, M, N);
verify(inp_matrix, out_matrix, M, N);
HIP_API_CALL(hipFree(in));
HIP_API_CALL(hipFree(out));
delete[] inp_matrix;
delete[] out_matrix;
}
namespace
{
void
check_hip_error(void)
{
hipError_t err = hipGetLastError();
if(err != hipSuccess)
{
auto_lock_t _lk{print_lock};
std::cerr << "Error: " << hipGetErrorString(err) << std::endl;
throw std::runtime_error("hip_api_call");
}
}
void
verify(int* in, int* out, int M, int N)
{
for(int i = 0; i < 10; i++)
{
int row = rand() % M;
int col = rand() % N;
if(in[row * N + col] != out[col * M + row])
{
auto_lock_t _lk{print_lock};
std::cout << "mismatch: " << row << ", " << col << " : " << in[row * N + col] << " | "
<< out[col * M + row] << "\n";
}
}
}
} // namespace
#if defined(USE_MPI)
void
do_a2a(int rank)
{
// Define this rank's values
int values[3];
for(int i = 0; i < 3; ++i)
values[i] = rank * 300 + i * 100;
printf("Process %d, values = %d, %d, %d.\n", rank, values[0], values[1], values[2]);
int buffer_recv[3];
MPI_Alltoall(&values, 1, MPI_INT, buffer_recv, 1, MPI_INT, MPI_COMM_WORLD);
printf("Values collected on process %d: %d, %d, %d.\n",
rank,
buffer_recv[0],
buffer_recv[1],
buffer_recv[2]);
}
#endif
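The `verify` function above spot-checks ten random elements against the relation `in[row * N + col] == out[col * M + row]`, and the call to `cpu_transpose` is commented out. A minimal host-side sketch of that same relation follows. The signatures of `cpu_transpose` and `count_mismatches` are assumptions made for illustration; neither is defined in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host reference (signature assumed, not taken from the sample):
// `in` is M rows x N columns, `out` is N rows x M columns, both row-major.
inline void
cpu_transpose(const int* in, int* out, std::size_t M, std::size_t N)
{
    for(std::size_t row = 0; row < M; ++row)
        for(std::size_t col = 0; col < N; ++col)
            out[col * M + row] = in[row * N + col];
}

// Exhaustive counterpart of the sampled check in verify():
// returns how many elements violate in[row * N + col] == out[col * M + row].
inline std::size_t
count_mismatches(const int* in, const int* out, std::size_t M, std::size_t N)
{
    std::size_t mismatches = 0;
    for(std::size_t row = 0; row < M; ++row)
        for(std::size_t col = 0; col < N; ++col)
            if(in[row * N + col] != out[col * M + row]) ++mismatches;
    return mismatches;
}
```

An exhaustive check like this is only practical for small matrices; the sampled check in the sample avoids touching all M * N elements.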
+1 -1
@@ -102,7 +102,7 @@ rocprofiler_pc_sampling_callback(rocprofiler_context_id_t /*context_id*/,
for(size_t i = 0; i < num_headers; i++)
{
auto* cur_header = headers[i];
-if(cur_header->kind == 0)
+if(cur_header->category == ROCPROFILER_BUFFER_CATEGORY_PC_SAMPLING)
{
auto* pc_sample = static_cast<rocprofiler_pc_sampling_record_t*>(cur_header->payload);
printf("--- pc: %lx, dispatch_id: %lx, timestamp: %lu, hardware_id: %lu\n",
+19 -5
@@ -142,14 +142,22 @@ RECURSIVE = YES
EXCLUDE =
EXCLUDE_SYMLINKS = YES
EXCLUDE_PATTERNS = */.git/* \
-@SOURCE_DIR@/samples/* \
@SOURCE_DIR@/**/tests/* \
-@SOURCE_DIR@/source/include/rocprofiler/defines.h \
-@SOURCE_DIR@/source/include/rocprofiler/config.h
+@SOURCE_DIR@/**/scripts/* \
+@SOURCE_DIR@/**/docs/*
EXCLUDE_SYMBOLS = "std::*" \
"ROCPROFILER_ATTRIBUTE" \
"ROCPROFILER_API" \
-"ROCPROFILER_NONNULL"
+"ROCPROFILER_NONNULL" \
+"ROCPROFILER_PUBLIC_API" \
+"ROCPROFILER_HIDDEN_API" \
+"ROCPROFILER_EXPORT_DECORATOR" \
+"ROCPROFILER_IMPORT_DECORATOR" \
+"ROCPROFILER_EXPORT" \
+"ROCPROFILER_IMPORT" \
+"ROCPROFILER_HANDLE_LITERAL" \
+"ROCPROFILER_EXTERN_C_INIT" \
+"ROCPROFILER_EXTERN_C_FINI"
EXAMPLE_PATH = @SOURCE_DIR@/samples
EXAMPLE_PATTERNS = *.h \
*.hh \
@@ -157,7 +165,6 @@ EXAMPLE_PATTERNS = *.h \
*.c \
*.cc \
*.cpp \
-conf.py \
*.txt
EXAMPLE_RECURSIVE = YES
IMAGE_PATH =
@@ -330,6 +337,13 @@ PREDEFINED = "ROCPROFILER_API=" \
"ROCPROFILER_EXPORT=" \
"ROCPROFILER_IMPORT=" \
"ROCPROFILER_NONNULL(...)=" \
+"ROCPROFILER_PUBLIC_API=" \
+"ROCPROFILER_HIDDEN_API=" \
+"ROCPROFILER_EXPORT_DECORATOR=" \
+"ROCPROFILER_IMPORT_DECORATOR=" \
+"ROCPROFILER_HANDLE_LITERAL=" \
+"ROCPROFILER_EXTERN_C_INIT=" \
+"ROCPROFILER_EXTERN_C_FINI=" \
"__attribute__(x)=" \
"__declspec(x)=" \
"size_t=unsigned long" \
+24 -2
@@ -6,8 +6,30 @@
configure_file(${CMAKE_CURRENT_LIST_DIR}/version.h.in
${CMAKE_CURRENT_BINARY_DIR}/version.h @ONLY)
-set(ROCPROFILER_HEADER_FILES config.h defines.h hip.h hsa.h marker.h rocprofiler.h
-rocprofiler_plugin.h ${CMAKE_CURRENT_BINARY_DIR}/version.h)
+set(ROCPROFILER_HEADER_FILES
+# core headers
+rocprofiler.h
+rocprofiler_plugin.h
+# secondary headers
+agent.h
+agent_profile.h
+buffer.h
+buffer_tracing.h
+callback_tracing.h
+context.h
+counters.h
+defines.h
+dispatch_profile.h
+external_correlation.h
+fwd.h
+hip.h
+hsa.h
+internal_threading.h
+marker.h
+pc_sampling.h
+profile_config.h
+spm.h
+${CMAKE_CURRENT_BINARY_DIR}/version.h)
install(FILES ${ROCPROFILER_HEADER_FILES}
DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/rocprofiler)
+72
@@ -0,0 +1,72 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup AGENTS Agent Information
* @{
*/
/**
* @brief Agent.
*/
typedef struct
{
rocprofiler_agent_id_t id;
rocprofiler_agent_type_t type;
const char* name;
rocprofiler_pc_sampling_config_array_t pc_sampling_configs;
} rocprofiler_agent_t;
/**
* @brief Callback function type for querying the available agents
*
* @param [in] agents Array of pointers to agents
* @param [in] num_agents Number of agents in array
* @param [in] user_data Data pointer passback
* @return ::rocprofiler_status_t
*/
typedef rocprofiler_status_t (*rocprofiler_available_agents_cb_t)(rocprofiler_agent_t** agents,
size_t num_agents,
void* user_data);
/**
* @brief Receive a synchronous callback with an array of the agents available at the moment of
* invocation
*
* @param [in] callback Callback function accepting list of agents
* @param [in] agent_size Should be set to sizeof(rocprofiler_agent_t)
* @param [in] user_data Data pointer provided to callback
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback,
size_t agent_size,
void* user_data) ROCPROFILER_NONNULL(1);
/** @} */
ROCPROFILER_EXTERN_C_FINI
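The `agent_size` parameter above ("Should be set to sizeof(rocprofiler_agent_t)") suggests a size-based layout check between the caller and the library. The sketch below illustrates that general pattern together with the `user_data` passback. Every name in it (`demo_agent_t`, `demo_query_agents`, `collect_agent_names`) is a hypothetical stand-in, and the return codes are assumptions; the actual rocprofiler behavior is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an agent description; not the rocprofiler struct.
struct demo_agent_t
{
    int         id;
    const char* name;
};

using demo_agents_cb_t = int (*)(demo_agent_t** agents, std::size_t num_agents, void* user_data);

// Hypothetical query function. Assumed convention: returns 0 on success and -1
// when the callback is null or the caller's struct size does not match.
inline int
demo_query_agents(demo_agents_cb_t callback, std::size_t agent_size, void* user_data)
{
    if(callback == nullptr) return -1;
    if(agent_size != sizeof(demo_agent_t)) return -1;  // caller built against another layout

    static demo_agent_t gpu0{0, "gpu0"};
    static demo_agent_t gpu1{1, "gpu1"};
    demo_agent_t*       agents[] = {&gpu0, &gpu1};
    return callback(agents, 2, user_data);  // synchronous: runs before this function returns
}

// Example callback: appends each agent name to the vector passed through user_data.
inline int
collect_agent_names(demo_agent_t** agents, std::size_t num_agents, void* user_data)
{
    auto* names = static_cast<std::vector<std::string>*>(user_data);
    for(std::size_t i = 0; i < num_agents; ++i)
        names->push_back(agents[i]->name);
    return 0;
}
```

Because the callback runs synchronously, the vector handed in through `user_data` is fully populated by the time the query returns.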
+70
@@ -0,0 +1,70 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup AGENT_PROFILE_COUNTING_SERVICE Agent Profile Counting Service
* @{
*/
/**
* @brief ROCProfiler Agent Profile Counting Data.
*
* Array of counter records, including the identifiers used to look up counter information
* and the counter values.
*/
typedef struct
{
/**
*/
rocprofiler_record_counter_t* counters;
uint64_t counters_count;
} rocprofiler_agent_profile_counting_data_t;
/**
* @brief Configure Profile Counting Service for agent.
*
* @param [in] buffer_id
* @param [in] profile_config_id
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_configure_agent_profile_counting_service(
rocprofiler_buffer_id_t buffer_id,
rocprofiler_profile_config_id_t profile_config_id);
/**
* @brief Sample Profile Counting Service for agent.
*
* @param [out] data Output data; always of size one
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_sample_agent_profile_counting_service(rocprofiler_agent_profile_counting_data_t* data);
/** @} */
ROCPROFILER_EXTERN_C_FINI
+106
@@ -0,0 +1,106 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup BUFFER_HANDLING Buffer
* @{
*
* Every Buffer is associated with a specific service kind.
* OR
* Every Buffer is associated with a specific service ID.
*
*/
/**
* @brief Async callback function.
*
* @code{.cpp}
* for(size_t i = 0; i < num_headers; ++i)
* {
* rocprofiler_record_header_t* hdr = headers[i];
* if(hdr->category == ROCPROFILER_BUFFER_CATEGORY_PC_SAMPLING)
* {
* auto* data = static_cast<rocprofiler_pc_sampling_record_t*>(hdr->payload);
* ...
* }
* }
* @endcode
*/
typedef void (*rocprofiler_buffer_tracing_cb_t)(rocprofiler_context_id_t context,
rocprofiler_buffer_id_t buffer_id,
rocprofiler_record_header_t** headers,
size_t num_headers,
void* data,
uint64_t drop_count);
/**
* @brief Create buffer.
*
* @param [in] context Context identifier associated with buffer
* @param [in] size Size of the buffer in bytes
* @param [in] watermark Watermark size at which the callback is invoked; if set
* to 0, the callback is invoked for every record
* @param [in] policy Behavior policy when buffer is full
* @param [in] callback Callback to invoke when buffer is flushed/full
* @param [in] callback_data Data to provide in callback function
* @param [out] buffer_id Identification handle for buffer
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_create_buffer(rocprofiler_context_id_t context,
size_t size,
size_t watermark,
rocprofiler_buffer_policy_t policy,
rocprofiler_buffer_tracing_cb_t callback,
void* callback_data,
rocprofiler_buffer_id_t* buffer_id) ROCPROFILER_NONNULL(5, 7);
/**
* @brief Destroy buffer.
*
* @param [in] buffer_id
* @return ::rocprofiler_status_t
*
* Note: This destroys the buffer even if it is not empty. Call
* @ref ::rocprofiler_flush_buffer first to make sure the buffer is empty.
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id);
/**
* @brief Flush buffer.
*
* @param [in] buffer_id
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id);
/** @} */
ROCPROFILER_EXTERN_C_FINI
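The `watermark` parameter of `rocprofiler_create_buffer` is documented as the fill level at which the callback fires, with 0 meaning a callback for every record. A minimal sketch of that behavior follows, using a hypothetical `demo_buffer` type that is not part of rocprofiler. It assumes the watermark is measured in bytes and that a flush delivers whatever is still pending, neither of which this diff confirms.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a watermark-driven record buffer.
struct demo_buffer
{
    using flush_cb_t = void (*)(const int* records, std::size_t num_records, void* data);

    std::size_t      watermark;      // assumed to be in bytes; 0 means "deliver every record"
    flush_cb_t       callback;       // invoked with each delivered batch
    void*            callback_data;  // passed back to the callback unchanged
    std::vector<int> records;        // pending, not yet delivered records

    void push(int record)
    {
        records.push_back(record);
        if(watermark == 0 || records.size() * sizeof(int) >= watermark) flush();
    }

    // Delivers any pending records; does nothing when the buffer is empty.
    void flush()
    {
        if(records.empty()) return;
        callback(records.data(), records.size(), callback_data);
        records.clear();
    }
};

// Example callback: counts how many batches were delivered.
inline void
count_batches(const int* /*records*/, std::size_t /*num_records*/, void* data)
{
    ++(*static_cast<int*>(data));
}
```

In this sketch, records below the watermark stay pending until an explicit `flush()`, which mirrors the advice above to flush a buffer before destroying it.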
+278
@@ -0,0 +1,278 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup BUFFER_TRACING_SERVICE Asynchronous Tracing Service
*
* Receive callbacks for batches of records from an internal (background) thread
*
* @{
*/
/**
* @brief ROCProfiler Buffer HSA API Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_tracing_operation_t operation; // rocprofiler/hsa.h
rocprofiler_timestamp_t start_timestamp;
rocprofiler_timestamp_t end_timestamp;
rocprofiler_thread_id_t thread_id;
} rocprofiler_buffer_tracing_hsa_api_record_t;
/**
* @brief ROCProfiler Buffer HIP API Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_tracing_operation_t operation; // rocprofiler/hip.h
rocprofiler_timestamp_t start_timestamp;
rocprofiler_timestamp_t end_timestamp;
rocprofiler_thread_id_t thread_id;
} rocprofiler_buffer_tracing_hip_api_record_t;
/**
* @brief ROCProfiler Buffer Marker Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_tracing_operation_t operation; // rocprofiler/marker.h
rocprofiler_timestamp_t timestamp;
rocprofiler_thread_id_t thread_id;
uint64_t marker_id; // rocprofiler_marker_id_t
// const char* message; // (Need Review?)
} rocprofiler_buffer_tracing_marker_record_t;
/**
* @brief ROCProfiler Buffer Memory Copy Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
/**
* Memory copy operation that can be derived from
* ::rocprofiler_tracing_operation_t
*/
uint32_t operation;
rocprofiler_timestamp_t start_timestamp;
rocprofiler_timestamp_t end_timestamp;
rocprofiler_queue_id_t queue_id;
} rocprofiler_buffer_tracing_memory_copy_record_t;
/**
* @brief ROCProfiler Buffer Kernel Dispatch Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_timestamp_t start_timestamp;
rocprofiler_timestamp_t end_timestamp;
rocprofiler_queue_id_t queue_id;
const char* kernel_name;
} rocprofiler_buffer_tracing_kernel_dispatch_record_t;
/**
* @brief ROCProfiler Buffer Page Migration Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_timestamp_t start_timestamp;
rocprofiler_timestamp_t end_timestamp;
rocprofiler_queue_id_t queue_id;
// TODO: determine which additional fields are needed here
} rocprofiler_buffer_tracing_page_migration_record_t;
/**
* @brief ROCProfiler Buffer Scratch Memory Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_timestamp_t start_timestamp;
rocprofiler_timestamp_t end_timestamp;
rocprofiler_queue_id_t queue_id;
// TODO: determine which additional fields are needed here
} rocprofiler_buffer_tracing_scratch_memory_record_t;
/**
* @brief ROCProfiler Buffer Queue Scheduling Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_timestamp_t start_timestamp;
rocprofiler_timestamp_t end_timestamp;
rocprofiler_queue_id_t queue_id;
// TODO: determine which additional fields are needed here
} rocprofiler_buffer_tracing_queue_scheduling_record_t;
/**
* @brief ROCProfiler Code Object Tracer Buffer Record.
*
* We need to guarantee that these records are in the buffer before the
* corresponding Exit Phase API calls are called.
*/
// typedef struct {
// rocprofiler_buffer_tracing_record_header_t header;
// rocprofiler_tracing_code_object_kind_id_t kind;
// } rocprofiler_buffer_tracing_code_object_header_t;
/**
* @brief ROCProfiler Code Object Load Tracer Buffer Record.
*
*/
// typedef struct {
// rocprofiler_buffer_tracing_code_object_header_t header;
// uint64_t load_base; // code object load base
// uint64_t load_size; // code object load size
// const char *uri; // URI string (NULL terminated)
// rocprofiler_timestamp_t timestamp;
// // uint32_t storage_type; // code object storage type (Need Review?)
// // int storage_file; // origin file descriptor (Need Review?)
// // uint64_t memory_base; // origin memory base (Need Review?)
// // uint64_t memory_size; // origin memory size (Need Review?)
// // uint64_t load_delta; // code object load delta (Need Review?)
// } rocprofiler_buffer_tracing_code_object_load_record_t;
/**
* @brief ROCProfiler Code Object UnLoad Tracer Buffer Record.
*
*/
// typedef struct {
// rocprofiler_buffer_tracing_code_object_header_t header;
// uint64_t load_base; // code object load base
// rocprofiler_timestamp_t timestamp;
// } rocprofiler_buffer_tracing_code_object_unload_record_t;
/**
* @brief ROCProfiler Code Object Kernel Symbol Tracer Buffer Record.
*
*/
// typedef struct {
// rocprofiler_buffer_tracing_code_object_header_t header;
// const char *kernel_name; // kernel name string (NULL terminated)
// uint64_t kernel_descriptor; // kernel descriptor (Need to be changed from
// // uint64_t to ::rocprofiler_address_t)
// // rocprofiler_timestamp_t timestamp; // (Need Review?)
// } rocprofiler_buffer_tracing_code_object_kernel_symbol_record_t;
/**
* @brief ROCProfiler Buffer External Correlation Tracer Record.
*/
typedef struct
{
rocprofiler_service_buffer_tracing_kind_t kind;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_external_correlation_id_t external_correlation_id;
} rocprofiler_buffer_tracing_external_correlation_record_t;
/**
* @brief Callback function for mapping @ref rocprofiler_service_buffer_tracing_kind_t ids to
* string names. @see rocprofiler_iterate_buffer_tracing_kind_names.
*/
typedef int (*rocprofiler_buffer_tracing_kind_name_cb_t)(
rocprofiler_service_buffer_tracing_kind_t kind,
const char* kind_name,
void* data);
/**
* @brief Callback function for mapping the operations of a given @ref
* rocprofiler_service_buffer_tracing_kind_t to string names. @see
* rocprofiler_iterate_buffer_tracing_kind_operation_names.
*/
typedef int (*rocprofiler_buffer_tracing_operation_name_cb_t)(
rocprofiler_service_buffer_tracing_kind_t kind,
uint32_t operation,
const char* operation_name,
void* data);
/**
* @brief Configure Buffer Tracing Service.
*
* @param [in] context_id
* @param [in] kind
* @param [in] operations
* @param [in] operations_count
* @param [in] buffer_id
* @return ::rocprofiler_status_t
*
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_configure_buffer_tracing_service(rocprofiler_context_id_t context_id,
rocprofiler_service_buffer_tracing_kind_t kind,
rocprofiler_tracing_operation_t* operations,
size_t operations_count,
rocprofiler_buffer_id_t buffer_id);
/**
* @brief Iterate over all the mappings of the buffer tracing kinds and get a callback with the id
* mapped to a constant string. The strings provided in the arg will be valid pointers for the
* entire duration of the program. It is recommended to call this function once and cache this data
* in the client instead of making multiple on-demand calls.
*
* @param [in] callback Callback function invoked for each enumeration value in @ref
* rocprofiler_service_buffer_tracing_kind_t with the exception of the `NONE` and `LAST` values.
* @param [in] data User data passed back into the callback
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_iterate_buffer_tracing_kind_names(rocprofiler_buffer_tracing_kind_name_cb_t callback,
void* data) ROCPROFILER_NONNULL(1);
/**
* @brief Iterates over all the mappings of the operations for a given @ref
* rocprofiler_service_buffer_tracing_kind_t and invokes the callback with the kind, operation id,
* and the string mapping to the operation id. The strings provided in the callback arg will be
* valid pointers for the entire duration of the program. It is recommended to call this function
* once per kind, and cache this data in the client instead of making multiple on-demand calls.
*
* @param [in] kind which buffer tracing kind operations to iterate over
* @param [in] callback Callback function invoked for each operation associated with @ref
* rocprofiler_service_buffer_tracing_kind_t with the exception of the `NONE` and `LAST` values.
* @param [in] data User data passed back into the callback
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_iterate_buffer_tracing_kind_operation_names(
rocprofiler_service_buffer_tracing_kind_t kind,
rocprofiler_buffer_tracing_operation_name_cb_t callback,
void* data) ROCPROFILER_NONNULL(2);
/** @} */
ROCPROFILER_EXTERN_C_FINI
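The comments above recommend calling the name-iteration functions once and caching the results in the client. A minimal sketch of that caching pattern follows. The enum, the `demo_iterate_kind_names` function and the "nonzero return stops iteration" rule are hypothetical stand-ins for illustration; only the callback shape (kind, name, user data) mirrors `rocprofiler_buffer_tracing_kind_name_cb_t`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a tracing-kind enumeration with NONE and LAST sentinels.
enum demo_kind_t
{
    DEMO_KIND_NONE = 0,
    DEMO_KIND_HSA_API,
    DEMO_KIND_HIP_API,
    DEMO_KIND_LAST
};

using demo_kind_name_cb_t = int (*)(demo_kind_t kind, const char* kind_name, void* data);

// Hypothetical iteration function: invokes the callback for every kind except the
// NONE and LAST sentinels. Assumption: a nonzero callback return stops the iteration.
inline int
demo_iterate_kind_names(demo_kind_name_cb_t callback, void* data)
{
    static const char* const names[] = {"NONE", "HSA_API", "HIP_API"};
    for(int k = DEMO_KIND_NONE + 1; k < DEMO_KIND_LAST; ++k)
    {
        int ret = callback(static_cast<demo_kind_t>(k), names[k], data);
        if(ret != 0) return ret;
    }
    return 0;
}

using kind_name_cache_t = std::map<int, std::string>;

// Client-side callback: copies each (kind, name) pair into a cache passed via `data`.
inline int
cache_kind_name(demo_kind_t kind, const char* kind_name, void* data)
{
    static_cast<kind_name_cache_t*>(data)->emplace(static_cast<int>(kind), kind_name);
    return 0;
}
```

After one call the client can resolve names from the map without going back through the iteration function for each record.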
+252
@@ -0,0 +1,252 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
#include <rocprofiler/hsa.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup CALLBACK_TRACING_SERVICE Synchronous Tracing Services
*
* Receive immediate callbacks on the calling thread
*
* @{
*/
/**
* @brief ROCProfiler HSA API Callback Data.
*/
typedef struct
{
size_t size; ///< provides the size of this struct
rocprofiler_hsa_api_args_t args;
rocprofiler_hsa_api_retval_t retval;
} rocprofiler_hsa_api_callback_tracer_data_t;
/**
* @brief ROCProfiler HIP API Callback Data.
*
* Depending on the operation kind, the data can be cast to the corresponding
* structure.
*
*/
typedef void* rocprofiler_hip_api_callback_api_data_t;
/**
* @brief ROCProfiler HIP API Tracer Callback Data.
*/
typedef struct
{
size_t size;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_address_t host_kernel_address;
rocprofiler_hip_api_callback_api_data_t data; // Arguments or api_data?
} rocprofiler_hip_api_callback_tracer_data_t;
/**
* @brief ROCProfiler Marker Callback Data.
*
* Depending on the operation kind, the data can be cast to the corresponding
* structure.
*
*/
typedef void* rocprofiler_marker_callback_api_data_t;
/**
* @brief ROCProfiler Marker Tracer Callback Data.
*/
typedef struct
{
size_t size;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_marker_callback_api_data_t data; // Arguments or api_data?
} rocprofiler_marker_callback_tracer_data_t;
/**
* @brief ROCProfiler Code Object Load Tracer Callback Record.
*/
typedef struct
{
uint64_t load_base; // code object load base
uint64_t load_size; // code object load size
const char* uri; // URI string (NULL terminated)
// uint32_t storage_type; // code object storage type (Need Review?)
// int storage_file; // origin file descriptor (Need Review?)
// uint64_t memory_base; // origin memory base (Need Review?)
// uint64_t memory_size; // origin memory size (Need Review?)
// uint64_t load_delta; // code object load delta (Need Review?)
} rocprofiler_callback_tracer_code_object_load_data_t;
/**
* @brief ROCProfiler Code Object UnLoad Tracer Callback Record.
*
*/
typedef struct
{
uint64_t load_base; // code object load base
} rocprofiler_callback_tracer_code_object_unload_data_t;
/**
* @brief ROCProfiler Code Object Device Kernel Symbol Tracer Callback Record.
*
*/
typedef struct
{
const char* kernel_name; // kernel name string (NULL terminated)
rocprofiler_address_t kernel_descriptor; // kernel descriptor
} rocprofiler_callback_tracer_code_object_device_kernel_symbol_data_t;
/**
* @brief ROCProfiler Code Object Register Host Kernel Symbol Tracer Callback
* Record.
*
*/
typedef struct
{
rocprofiler_address_t host_address; // host address
// Should this be nullptr if it is unregister?
const char* kernel_name; // kernel name string (NULL terminated)
rocprofiler_address_t kernel_descriptor; // kernel descriptor
} rocprofiler_callback_tracer_code_object_register_host_kernel_symbol_data_t;
/**
* @brief API Tracing callback function.
*/
typedef void (*rocprofiler_callback_tracing_cb_t)(rocprofiler_callback_tracing_record_t record,
void* user_data);
/**
* @brief Callback function for mapping @ref rocprofiler_service_callback_tracing_kind_t ids to
* string names. @see rocprofiler_iterate_callback_tracing_kind_names.
*/
typedef int (*rocprofiler_callback_tracing_kind_name_cb_t)(
rocprofiler_service_callback_tracing_kind_t kind,
const char* kind_name,
void* data);
/**
* @brief Callback function for mapping the operations of a given @ref
* rocprofiler_service_callback_tracing_kind_t to string names. @see
* rocprofiler_iterate_callback_tracing_kind_operation_names.
*/
typedef int (*rocprofiler_callback_tracing_operation_name_cb_t)(
rocprofiler_service_callback_tracing_kind_t kind,
uint32_t operation,
const char* operation_name,
void* data);
/**
* @brief Callback function for iterating over the function arguments to a traced function.
* This function will be invoked for each argument.
* @see rocprofiler_iterate_callback_tracing_operation_args
*
* @param [in] kind Tracing kind (domain)
* @param [in] operation Operation within the domain
* @param [in] arg_number Argument index, starting at zero
* @param [in] arg_name Name of the argument in the prototype (or rocprofiler union)
* @param [in] arg_value_str String conversion of the argument, e.g. via an operator<< overload
* @param [in] arg_value_addr Address of the argument as stored by rocprofiler
* @param [in] data User data
*/
typedef int (*rocprofiler_callback_tracing_operation_args_cb_t)(
rocprofiler_service_callback_tracing_kind_t kind,
uint32_t operation,
uint32_t arg_number,
const char* arg_name,
const char* arg_value_str,
const void* const arg_value_addr,
void* data);
/**
* @brief Configure Callback Tracing Service.
*
* @param [in] context_id
* @param [in] kind
* @param [in] operations
* @param [in] operations_count
* @param [in] callback
* @param [in] callback_args
* @return ::rocprofiler_status_t
*
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_configure_callback_tracing_service(rocprofiler_context_id_t context_id,
rocprofiler_service_callback_tracing_kind_t kind,
rocprofiler_tracing_operation_t* operations,
size_t operations_count,
rocprofiler_callback_tracing_cb_t callback,
void* callback_args);
/**
* @brief Iterate over all the mappings of the callback tracing kinds and get a callback with the id
* mapped to a constant string. The strings provided in the arg will be valid pointers for the
* entire duration of the program. It is recommended to call this function once and cache this data
* in the client instead of making multiple on-demand calls.
*
* @param [in] callback Callback function invoked for each enumeration value in @ref
* rocprofiler_service_callback_tracing_kind_t with the exception of the `NONE` and `LAST` values.
* @param [in] data User data passed back into the callback
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_iterate_callback_tracing_kind_names(
rocprofiler_callback_tracing_kind_name_cb_t callback,
void* data) ROCPROFILER_NONNULL(1);
/**
* @brief Iterates over all the mappings of the operations for a given @ref
* rocprofiler_service_callback_tracing_kind_t and invokes the callback with the kind, operation id,
* and the string mapping to the operation id. The strings provided in the callback arg will be
* valid pointers for the entire duration of the program. It is recommended to call this function
* once per kind, and cache this data in the client instead of making multiple on-demand calls.
*
* @param [in] kind which tracing callback kind operations to iterate over
* @param [in] callback Callback function invoked for each operation associated with @ref
* rocprofiler_service_callback_tracing_kind_t with the exception of the `NONE` and `LAST` values.
* @param [in] data User data passed back into the callback
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_iterate_callback_tracing_kind_operation_names(
rocprofiler_service_callback_tracing_kind_t kind,
rocprofiler_callback_tracing_operation_name_cb_t callback,
void* data) ROCPROFILER_NONNULL(2);
/**
* @brief Iterates over all the arguments for the traced function (when available). This is
* particularly useful when tools want to annotate traces with the function arguments. See
* @example samples/api_callback_tracing/client.cpp for a usage example.
*
* @param[in] record Record provided by service callback
* @param[in] callback The callback function which will be invoked for each argument
* @param[in] user_data Data to be passed to each invocation of the callback
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_iterate_callback_tracing_operation_args(
rocprofiler_callback_tracing_record_t record,
rocprofiler_callback_tracing_operation_args_cb_t callback,
void* user_data) ROCPROFILER_NONNULL(2);
/** @} */
ROCPROFILER_EXTERN_C_FINI
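The doc comments above recommend iterating the kind names once and caching the result in the client. A minimal sketch of that caching pattern follows. The callback signature, the `example_*` names, the number of kinds, and the placeholder strings are all assumptions made for illustration; the real `rocprofiler_callback_tracing_kind_name_cb_t` typedef is not shown in this section, and the iterator here is a local stand-in rather than the library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_KIND_LAST 3 /* hypothetical kind count, including NONE (0) */

/* Hypothetical callback shape: (kind, name, user data), returns 0 to continue. */
typedef int (*example_kind_name_cb_t)(int kind, const char* name, void* data);

/* Stand-in for the library iterator: visits every kind except NONE and LAST. */
static int example_iterate_kind_names(example_kind_name_cb_t callback, void* data)
{
    /* placeholder strings, not real rocprofiler kind names */
    static const char* const names[EXAMPLE_KIND_LAST] = {NULL, "KIND_A", "KIND_B"};
    for(int kind = 1; kind < EXAMPLE_KIND_LAST; ++kind)
    {
        if(callback(kind, names[kind], data) != 0) return 1;
    }
    return 0;
}

/* Client-side cache. Storing the pointers is enough because the strings are
 * documented to stay valid for the entire duration of the program. */
typedef struct
{
    const char* names[EXAMPLE_KIND_LAST];
} example_name_cache_t;

/* Callback that records one kind -> name mapping in the cache. */
static int example_store_name(int kind, const char* name, void* data)
{
    example_name_cache_t* cache = (example_name_cache_t*) data;
    if(kind <= 0 || kind >= EXAMPLE_KIND_LAST) return 1;
    cache->names[kind] = name;
    return 0;
}

/* Fill the cache once; later lookups are plain array reads. */
static int example_fill_cache(example_name_cache_t* cache)
{
    memset(cache, 0, sizeof(*cache));
    return example_iterate_kind_names(example_store_name, cache);
}
```

A tool would call `example_fill_cache` once during initialization and then read `cache.names[kind]` wherever a label is needed, instead of re-running the iteration on demand.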
-210
@@ -1,210 +0,0 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/rocprofiler.h>
#ifdef __cplusplus
extern "C" {
#endif
#define ROCPROFILER_API_VERSION_ID 1
#define ROCPROFILER_DOMAIN_OPS_MAX 512
#define ROCPROFILER_DOMAIN_OPS_RESERVED \
((ROCPROFILER_DOMAIN_OPS_MAX * ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST / 8))
typedef uint64_t (*rocprofiler_external_cid_cb_t)(rocprofiler_tracer_activity_domain_t,
uint32_t,
uint64_t);
typedef int (*rocprofiler_filter_name_t)(const char*);
typedef int (*rocprofiler_filter_op_id_t)(uint32_t);
typedef int (*rocprofiler_filter_range_t)(uint32_t, uint32_t);
typedef int (*rocprofiler_filter_dispatch_id_t)(uint64_t);
/// permits tools opportunity to modify the correlation id based on the domain, op, and
/// the rocprofiler generated correlation id
struct rocprofiler_correlation_config
{
rocprofiler_external_cid_cb_t external_id_callback;
};
/// how the tools specify the tracing domain and (optionally) which operations in the
/// domain they want to trace
struct rocprofiler_domain_config
{
rocprofiler_tracer_callback_t callback;
char reserved0[sizeof(uint64_t)];
char reserved1[ROCPROFILER_DOMAIN_OPS_RESERVED];
};
/// for buffered callbacks, the tool provides a callback to create a buffer and the size
struct rocprofiler_buffer_config
{
rocprofiler_buffer_callback_t callback;
uint64_t buffer_size;
// void* reserved0;
char reserved1[sizeof(uint64_t)];
};
/// filters are available to make quick decisions about whether rocprofiler should
/// assemble the data necessary for a callback. This is more for convenience and
/// performance -- any decisions here could be made in the callback, but then rocprofiler
/// has to first assemble all the information for the callback before it (eventually) gets
/// discarded because the tool has decided (after configuration) that it no longer
/// wants info meeting certain requirements
struct rocprofiler_filter_config
{
// filter callbacks
rocprofiler_filter_name_t name;
rocprofiler_filter_op_id_t hip_function_id;
rocprofiler_filter_op_id_t hsa_function_id;
rocprofiler_filter_range_t range;
rocprofiler_filter_dispatch_id_t dispatch_id;
// reserved padding
char padding[24 * sizeof(void*)];
};
/// this is the "single source of truth" for the capabilities of rocprofiler.
/// you can have one configuration that activates all the capabilities you want
/// and holistically start/stop the sum of those features. Alternatively,
/// you can have multiple configurations in order to activate certain features
/// modularly.
///
/// The general workflow is:
///
/// 1. invoke rocprofiler_allocate_config(...)
/// - rocprofiler allocates any space internally needed for the config
/// - rocprofiler sets a few initial values:
/// - "size" to the size of the config structure used internally
/// - "api_version" to the version id of the API in the rocprofiler library that
/// is being used.
/// - these two values can be used by the tool to identify any potential
/// incompatibilities that the tool might want to know about
/// - rocprofiler checks whether it is too late to configure the tool, e.g.
/// something went wrong and rocprofiler was not able to set itself up as
/// the interceptor
/// 2. tool sets up the configuration struct and sets the "size" variable to the size of
/// their configuration struct and sets the "compat_version" field to the
/// ROCPROFILER_API_VERSION_ID defined by the rocprofiler headers when the tool was
/// built
/// - in other words, the user can communicate to rocprofiler, don't read
/// past this distance in my configuration struct and I built against X version
/// so assume the default behavior and capabilities of version X.
/// 3. tool passes this struct to rocprofiler_validate_config(...)
/// - this step checks the config in isolation and will communicate any potential
/// warnings/issues with that configuration, e.g. rocprofiler_X_config is needed,
/// HW counters XYZ are not available, etc. The tool then has an opportunity
/// to address these issues however they see fit.
/// 4. tool passes this struct to rocprofiler_start_config(...)
/// - internally, we make a call to rocprofiler_validate_config(...) and if any
/// issues still exist with the config in isolation, rocprofiler tells the app
/// to abort -- mechanisms were provided to prevent aborting prior to this call,
/// aborting the app at this point is to guard against rocprofiler "silently"
/// not working because error codes were ignored
/// - rocprofiler then checks whether this config can actually be activated
/// alongside any other active configuration, e.g. this config wants 4 HW counters
/// and another wants 4 HW counters but we can only activate 6 out of 8 of
/// them in this run. Any issues here will not abort execution but, instead,
/// the features of this configuration will not happen (i.e. config won't be
/// activated) and the issues will be communicated with error codes -- giving
/// the tool the opportunity to address the conflicts (i.e. only request tracing
/// and no HW counters) before attempting to activate the modified config.
/// - once rocprofiler determines all features of a config can be activated, it
/// makes an internal copy of the config and returns an identifier for that
/// configuration. The tool is then free to delete the config and any modification
/// to the config will NOT be reflected in the behavior of rocprofiler.
///
///
struct rocprofiler_config
{
// size is used to ensure that we never read past the end of the version
size_t size; // = sizeof(rocprofiler_config)
uint32_t compat_version; // set by user
uint32_t api_version; // set by rocprofiler
uint64_t reserved0; // internal field
void* user_data; // data passed to callbacks
struct rocprofiler_correlation_config* correlation_id; // = &my_cid_config (optional)
struct rocprofiler_buffer_config* buffer; // = &my_buffer_config (required)
struct rocprofiler_domain_config* domain; // = &my_domain_config (required)
struct rocprofiler_filter_config* filter; // = &my_filter_config (optional)
};
/// \brief returns a properly initialized config struct and allocates any data structures
/// necessary for the config to be used
///
/// \param [out] cfg may adjust config or assign values within structs.
rocprofiler_status_t
rocprofiler_allocate_config(struct rocprofiler_config* cfg);
/// \brief rocprofiler validates config, checks for conflicts, etc. Ensures that
/// the configuration is valid *in isolation*, e.g. it may check that the user
/// set the compat_version field and that required config fields, such as buffer
/// are set. This function will be called before \ref rocprofiler_start_config
/// but is provided to help the user validate one or more configs without starting
/// them
///
/// \param [in] cfg configuration to validate
rocprofiler_status_t
rocprofiler_validate_config(const struct rocprofiler_config* cfg);
/// \brief rocprofiler activates configuration and provides a context identifier
/// \param [in] cfg may adjust config or assign values within structs. If an error
/// occurs, rocprofiler may set the pointers to valid sub-configs to nullptr and
/// leave the pointers to invalid configs
/// \param [out] id the context identifier for this config.
rocprofiler_status_t
rocprofiler_start_config(struct rocprofiler_config*, rocprofiler_context_id_t* id);
/// \brief disable the configuration.
rocprofiler_status_t rocprofiler_stop_config(rocprofiler_context_id_t);
///
///
/// the following 4 functions may be changed to permit removing domain/ops and/or
/// identifying domains and operations via strings
///
///
rocprofiler_status_t
rocprofiler_domain_set_domain(struct rocprofiler_domain_config*,
rocprofiler_tracer_activity_domain_t);
rocprofiler_status_t
rocprofiler_domain_add_domains(struct rocprofiler_domain_config*,
rocprofiler_tracer_activity_domain_t*,
size_t);
rocprofiler_status_t
rocprofiler_domain_add_op(struct rocprofiler_domain_config*,
rocprofiler_tracer_activity_domain_t,
uint32_t);
rocprofiler_status_t
rocprofiler_domain_add_ops(struct rocprofiler_domain_config*,
rocprofiler_tracer_activity_domain_t,
uint32_t*,
size_t);
#ifdef __cplusplus
}
#endif
+91
@@ -0,0 +1,91 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/**
* @defgroup CONTEXT_OPERATIONS Context
* @{
*/
/**
* The NULL Context handle.
*/
#define ROCPROFILER_CONTEXT_NONE ROCPROFILER_HANDLE_LITERAL(rocprofiler_context_id_t, UINT64_MAX)
/**
* @brief Create context.
*
* @param [out] context_id Context identifier
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_create_context(rocprofiler_context_id_t* context_id) ROCPROFILER_NONNULL(1);
/**
* @brief Start context.
*
* @param [in] context_id
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_start_context(rocprofiler_context_id_t context_id);
/**
* @brief Stop context.
*
* @param [in] context_id
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_stop_context(rocprofiler_context_id_t context_id);
/**
* @brief Query whether context is active.
*
* @param [in] context_id
* @param [out] status If context is active, this will be a nonzero value
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_context_is_active(rocprofiler_context_id_t context_id, int* status)
ROCPROFILER_NONNULL(2);
/**
* @brief Query whether the context is valid
*
* @param [in] context_id
* @param [out] status If context is invalid, this will be a nonzero value
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_context_is_valid(rocprofiler_context_id_t context_id, int* status)
ROCPROFILER_NONNULL(2);
/** @} */
ROCPROFILER_EXTERN_C_FINI
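The context functions above are documented tersely, so here is a minimal sketch of the calling pattern they suggest: create, start, query, stop, with every status checked. Everything prefixed `example_` is a local stand-in written only so the sketch compiles without the library; the toy bodies do not describe how rocprofiler implements contexts.

```c
#include <assert.h>
#include <stdint.h>

/* ---- Stand-ins (hypothetical) mirroring the declared signatures ---- */
typedef enum
{
    EXAMPLE_STATUS_SUCCESS = 0,
    EXAMPLE_STATUS_ERROR,
} example_status_t;

typedef struct
{
    uint64_t handle;
} example_context_id_t;

static int example_active = 0; /* toy state: tracks a single context */

static example_status_t example_create_context(example_context_id_t* id)
{
    id->handle = 1;
    return EXAMPLE_STATUS_SUCCESS;
}

static example_status_t example_start_context(example_context_id_t id)
{
    (void) id;
    example_active = 1;
    return EXAMPLE_STATUS_SUCCESS;
}

static example_status_t example_stop_context(example_context_id_t id)
{
    (void) id;
    example_active = 0;
    return EXAMPLE_STATUS_SUCCESS;
}

static example_status_t example_context_is_active(example_context_id_t id, int* status)
{
    (void) id;
    *status = example_active;
    return EXAMPLE_STATUS_SUCCESS;
}

/* ---- The calling pattern itself: returns 0 on success ---- */
static int example_run_lifecycle(void)
{
    example_context_id_t ctx    = {0};
    int                  active = 0;

    if(example_create_context(&ctx) != EXAMPLE_STATUS_SUCCESS) return -1;
    if(example_start_context(ctx) != EXAMPLE_STATUS_SUCCESS) return -2;

    /* "status" is an out-parameter; the return value only says whether the query worked */
    if(example_context_is_active(ctx, &active) != EXAMPLE_STATUS_SUCCESS) return -3;
    if(active == 0) return -4;

    if(example_stop_context(ctx) != EXAMPLE_STATUS_SUCCESS) return -5;
    if(example_context_is_active(ctx, &active) != EXAMPLE_STATUS_SUCCESS) return -6;
    return active == 0 ? 0 : -7;
}
```

The point of the sketch is the separation between the returned status code and the `int*` out-parameter, which is easy to mix up when reading the declarations alone.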
+73
@@ -0,0 +1,73 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/agent.h>
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup COUNTERS Hardware counters
* @{
*/
/**
* @brief Query Counter name.
*
* @param [in] counter_id
* @param [out] name if nullptr, size will be returned
* @param [out] size
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_query_counter_name(rocprofiler_counter_id_t counter_id, const char* name, size_t* size)
ROCPROFILER_NONNULL(3);
/**
* @brief Query Counter Instances Count.
*
* @param [in] counter_id
* @param [out] instance_count
* @return rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_query_counter_instance_count(rocprofiler_counter_id_t counter_id,
size_t* instance_count) ROCPROFILER_NONNULL(2);
/**
* @brief Query Agent Counters Availability.
*
* @param [in] agent
* @param [out] counters_list
* @param [out] counters_count
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_query_agent_supported_counters(rocprofiler_agent_t agent,
rocprofiler_counter_id_t* counters_list,
size_t* counters_count) ROCPROFILER_NONNULL(2, 3);
/** @} */
ROCPROFILER_EXTERN_C_FINI
+31
@@ -22,6 +22,29 @@
#pragma once
/** @defgroup SYMBOL_VERSIONING_GROUP Symbol Versions
*
* The names used for the shared library versioned symbols.
*
* Every function is annotated with one of the version macros defined in this
* section. Each macro specifies a corresponding symbol version string. After
* dynamically loading the shared library with @p dlopen, the address of each
* function can be obtained using @p dlvsym with the name of the function and
* its corresponding symbol version string. An error will be reported by @p
* dlvsym if the installed library does not support the version for the
* function specified in this version of the interface.
*
* @{
*/
/**
* @brief The function was introduced in version 10.0 of the interface and has the
* symbol version string of ``"ROCPROFILER_10.0"``.
*/
#define ROCPROFILER_VERSION_10_0
/** @} */
#if !defined(ROCPROFILER_ATTRIBUTE)
# if defined(_MSC_VER)
# define ROCPROFILER_ATTRIBUTE(...) __declspec(__VA_ARGS__)
@@ -95,3 +118,11 @@
value \
}
#endif
#ifdef __cplusplus
# define ROCPROFILER_EXTERN_C_INIT extern "C" {
# define ROCPROFILER_EXTERN_C_FINI }
#else
# define ROCPROFILER_EXTERN_C_INIT
# define ROCPROFILER_EXTERN_C_FINI
#endif
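As a small illustration of how the two macros above are meant to be used, the sketch below wraps a declaration in an init/fini pair. The macro pair is re-declared under `EXAMPLE_` names so the snippet stands alone, and `example_add` is a hypothetical function, not part of rocprofiler.

```c
#include <assert.h>

/* Same shape as the macros above: a C linkage block when compiled as C++,
 * nothing at all when compiled as C. */
#ifdef __cplusplus
#    define EXAMPLE_EXTERN_C_INIT extern "C" {
#    define EXAMPLE_EXTERN_C_FINI }
#else
#    define EXAMPLE_EXTERN_C_INIT
#    define EXAMPLE_EXTERN_C_FINI
#endif

EXAMPLE_EXTERN_C_INIT

/* Hypothetical declaration; it gets C linkage under both C and C++ compilers. */
int example_add(int lhs, int rhs);

EXAMPLE_EXTERN_C_FINI

/* Definition of the hypothetical function declared above. */
int example_add(int lhs, int rhs)
{
    return lhs + rhs;
}
```

Compared with repeating the `#ifdef __cplusplus` block in every header, as the removed header did, the macro pair keeps each public header to two extra lines.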
+97
@@ -0,0 +1,97 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/agent.h>
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
#include <rocprofiler/hsa.h>
#include <rocprofiler/profile_config.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup DISPATCH_PROFILE_COUNTING_SERVICE Dispatch Profile Counting
* Service
* @{
*/
/**
* @brief ROCProfiler Profile Counting Data.
*
*/
typedef struct
{
rocprofiler_timestamp_t start_timestamp;
rocprofiler_timestamp_t end_timestamp;
/**
* Counters, including identifiers to get counter information and Counters
* values
*
* Should it be a record per counter?
*/
rocprofiler_record_counter_t* counters;
uint64_t counters_count;
rocprofiler_correlation_id_t correlation_id;
} rocprofiler_dispatch_profile_counting_record_t;
/**
* @brief Kernel Dispatch Callback
*
* @param [out] queue_id
* @param [out] agent_id
* @param [out] correlation_id
* @param [out] dispatch_packet It can be used to get the kernel descriptor and then using
* code_object tracing, we can get the kernel name. `dispatch_packet->reserved2` is the
* correlation_id used to correlate the dispatch packet with the corresponding API call.
* @param [out] callback_data_args
* @param [in] config
*/
typedef void (*rocprofiler_profile_counting_dispatch_callback_t)(
rocprofiler_queue_id_t queue_id,
rocprofiler_agent_t agent_id,
rocprofiler_correlation_id_t correlation_id,
const hsa_kernel_dispatch_packet_t* dispatch_packet,
void* callback_data_args,
rocprofiler_profile_config_id_t* config);
/**
* @brief Configure Dispatch Profile Counting Service.
*
* @param [in] context_id
* @param [in] agent_id
* @param [in] buffer_id
* @param [in] callback
* @param [in] callback_data_args
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_configure_dispatch_profile_counting_service(
rocprofiler_context_id_t context_id,
rocprofiler_agent_t agent_id,
rocprofiler_buffer_id_t buffer_id,
rocprofiler_profile_counting_dispatch_callback_t callback,
void* callback_data_args);
/** @} */
ROCPROFILER_EXTERN_C_FINI
+60
@@ -0,0 +1,60 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/**
* @defgroup EXTERNAL_CORRELATION External Correlation IDs
*
* User-defined correlation identifiers to supplement rocprofiler generated correlation ids
*
* @{
*/
/** @} */
/**
* @brief ROCProfiler Push External Correlation ID.
*
* @param external_correlation_id
* @return rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_push_external_correlation_id(
rocprofiler_external_correlation_id_t external_correlation_id);
/**
* @brief ROCProfiler Pop External Correlation ID.
*
* @param external_correlation_id
* @return rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_pop_external_correlation_id(
rocprofiler_external_correlation_id_t* external_correlation_id);
ROCPROFILER_EXTERN_C_FINI
+457
@@ -0,0 +1,457 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <stddef.h>
#include <stdint.h>
ROCPROFILER_EXTERN_C_INIT
//--------------------------------------------------------------------------------------//
//
// ENUMERATIONS
//
//--------------------------------------------------------------------------------------//
/**
* @defgroup BASIC_DATA_TYPES Basic data types
*
* Basic data types and typedefs
* @{
*/
/**
* @brief Status codes.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_STATUS_SUCCESS = 0, ///< No error occurred
ROCPROFILER_STATUS_ERROR, ///< Generalized error
ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND, ///< No valid context for given context id
ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND, ///< No valid buffer for given buffer id
ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND, ///< Domain identifier is invalid
ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND, ///< Operation identifier is invalid for domain
ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND, ///< No valid thread for given thread id
ROCPROFILER_STATUS_ERROR_CONTEXT_ERROR, ///< Generalized context error
ROCPROFILER_STATUS_ERROR_CONTEXT_INVALID, ///< Context configuration is not valid
ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_STARTED, ///< Context was not started (maybe already
///< started or atomic swap into active array
///< failed)
ROCPROFILER_STATUS_ERROR_BUFFER_BUSY, ///< buffer operation failed because it is currently busy
///< handling another request (e.g. flushing)
ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED, ///< service has already been configured
///< in context
ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED, ///< Function call is not valid outside of
///< rocprofiler configuration (i.e.
///< function called post-initialization)
ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED, ///< Function is not implemented
ROCPROFILER_STATUS_LAST,
} rocprofiler_status_t;
/**
* @brief Buffer record categories. This enumeration type is encoded in @ref
* rocprofiler_record_header_t category field
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_BUFFER_CATEGORY_NONE = 0,
ROCPROFILER_BUFFER_CATEGORY_TRACING,
ROCPROFILER_BUFFER_CATEGORY_PC_SAMPLING,
ROCPROFILER_BUFFER_CATEGORY_LAST,
} rocprofiler_buffer_category_t;
/**
* @brief Agent type.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_AGENT_TYPE_NONE = 0, ///< Agent type is unknown
ROCPROFILER_AGENT_TYPE_CPU, ///< Agent type is a CPU
ROCPROFILER_AGENT_TYPE_GPU, ///< Agent type is a GPU
ROCPROFILER_AGENT_TYPE_LAST,
} rocprofiler_agent_type_t;
/**
* @brief Service Callback Phase.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_SERVICE_CALLBACK_PHASE_NONE = 0, ///< Callback has no phase
ROCPROFILER_SERVICE_CALLBACK_PHASE_ENTER, ///< Callback invoked prior to function execution
ROCPROFILER_SERVICE_CALLBACK_PHASE_EXIT, ///< Callback invoked after function execution
ROCPROFILER_SERVICE_CALLBACK_PHASE_LAST,
} rocprofiler_service_callback_phase_t;
/**
* @brief Service Callback Tracing Kind.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_SERVICE_CALLBACK_TRACING_NONE = 0,
ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API, ///< Callbacks for HSA functions
ROCPROFILER_SERVICE_CALLBACK_TRACING_HIP_API, ///< Callbacks for HIP functions
ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API, ///< Callbacks for ROCTx functions
ROCPROFILER_SERVICE_CALLBACK_TRACING_CODE_OBJECT, ///< Callbacks for code object info
ROCPROFILER_SERVICE_CALLBACK_TRACING_KERNEL_DISPATCH, ///< Callbacks for kernel dispatches
ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST,
} rocprofiler_service_callback_tracing_kind_t;
/**
* @brief Service Buffer Tracing Kind.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_SERVICE_BUFFER_TRACING_NONE = 0,
ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, ///< Buffer HSA function calls
ROCPROFILER_SERVICE_BUFFER_TRACING_HIP_API, ///< Buffer HIP function calls
ROCPROFILER_SERVICE_BUFFER_TRACING_MARKER_API, ///< Buffer ROCTx function calls
ROCPROFILER_SERVICE_BUFFER_TRACING_MEMORY_COPY, ///< Buffer memory copy info
ROCPROFILER_SERVICE_BUFFER_TRACING_KERNEL_DISPATCH, ///< Buffer kernel dispatch info
ROCPROFILER_SERVICE_BUFFER_TRACING_PAGE_MIGRATION, ///< Buffer page migration info
ROCPROFILER_SERVICE_BUFFER_TRACING_SCRATCH_MEMORY, ///< Buffer scratch memory reclamation info
ROCPROFILER_SERVICE_BUFFER_TRACING_EXTERNAL_CORRELATION, ///< Buffer external correlation info
// To determine if this is possible to implement?
// ROCPROFILER_SERVICE_BUFFER_TRACING_QUEUE_SCHEDULING,
ROCPROFILER_SERVICE_BUFFER_TRACING_LAST,
} rocprofiler_service_buffer_tracing_kind_t;
/**
* @brief ROCProfiler Code Object Tracer Operation.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_NONE = 0,
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_LOAD,
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_UNLOAD,
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_REGISTER,
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_UNREGISTER,
// next two are part of hipRegisterFunction API.
// ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_HOST_KERNEL_SYMBOL_REGISTER,
// ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_HOST_KERNEL_SYMBOL_UNREGISTER,
ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_LAST,
} rocprofiler_callback_tracing_code_object_operation_t;
/**
* @brief Memory Copy Operation.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_NONE = 0,
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_DEVICE_TO_HOST,
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_HOST_TO_DEVICE,
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_DEVICE_TO_DEVICE,
ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_LAST,
} rocprofiler_buffer_tracing_memory_copy_operation_t;
/**
* @brief PC Sampling Method.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_PC_SAMPLING_METHOD_NONE = 0,
ROCPROFILER_PC_SAMPLING_METHOD_STOCHASTIC,
ROCPROFILER_PC_SAMPLING_METHOD_HOST_TRAP,
ROCPROFILER_PC_SAMPLING_METHOD_LAST,
} rocprofiler_pc_sampling_method_t;
/**
* @brief PC Sampling Unit.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_PC_SAMPLING_UNIT_NONE = 0, ///< Sample interval has unspecified units
ROCPROFILER_PC_SAMPLING_UNIT_INSTRUCTIONS, ///< Sample interval is in instructions
ROCPROFILER_PC_SAMPLING_UNIT_CYCLES, ///< Sample interval is in cycles
ROCPROFILER_PC_SAMPLING_UNIT_TIME, ///< Sample interval is in nanoseconds
ROCPROFILER_PC_SAMPLING_UNIT_LAST,
} rocprofiler_pc_sampling_unit_t;
/**
* @brief Actions when Buffer is full.
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_BUFFER_POLICY_NONE = 0, ///< No policy has been set
ROCPROFILER_BUFFER_POLICY_DISCARD, ///< Drop records when buffer is full
ROCPROFILER_BUFFER_POLICY_LOSSLESS, ///< Block when buffer is full
ROCPROFILER_BUFFER_POLICY_LAST,
} rocprofiler_buffer_policy_t;
//--------------------------------------------------------------------------------------//
//
// ALIASES
//
//--------------------------------------------------------------------------------------//
/**
* @brief ROCProfiler Timestamp.
*/
typedef uint64_t rocprofiler_timestamp_t;
/**
* @brief ROCProfiler Address.
*/
typedef uint64_t rocprofiler_address_t;
/**
* @brief Thread ID. Value will be equivalent to `syscall(__NR_gettid)`
*/
typedef uint64_t rocprofiler_thread_id_t;
/**
* @brief Tracing Operation ID. The set of valid operations depends on the tracing kind.
* If the value is equal to zero, all operations will be considered
* for tracing.
*/
typedef uint32_t rocprofiler_tracing_operation_t;
/**
* @brief Needs non-typedef specification?
*/
typedef uint32_t rocprofiler_counter_instance_id_t;
// forward declaration of struct
typedef struct rocprofiler_pc_sampling_configuration_s rocprofiler_pc_sampling_configuration_t;
//--------------------------------------------------------------------------------------//
//
// UNIONS
//
//--------------------------------------------------------------------------------------//
/**
* @brief User-assignable data type
*
*/
typedef union rocprofiler_user_data_t
{
uint64_t value;
void* ptr;
} rocprofiler_user_data_t;
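The union above lets a tool attach either a plain 64-bit value or a pointer to its own state. The sketch below shows one way a tool might write each member. The union is copied under an `example_` name so the snippet compiles alone, and `example_tool_state_t` is a hypothetical tool-side struct.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the union above so the sketch compiles on its own. */
typedef union example_user_data_t
{
    uint64_t value;
    void*    ptr;
} example_user_data_t;

/* Hypothetical tool-owned state that a pointer member could refer to. */
typedef struct
{
    uint64_t enter_count;
} example_tool_state_t;

/* Store a plain integer, for instance a timestamp or a sequence number. */
static example_user_data_t example_store_value(uint64_t v)
{
    example_user_data_t data;
    data.value = v;
    return data;
}

/* Store a pointer to tool-owned state instead. */
static example_user_data_t example_store_ptr(example_tool_state_t* state)
{
    example_user_data_t data;
    data.value = 0; /* clear all 64 bits first in case pointers are narrower */
    data.ptr   = state;
    return data;
}
```

Only the member written last should be read back; the two members share storage, so treating a stored value as a pointer (or the reverse) is a bug in the tool.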
//--------------------------------------------------------------------------------------//
//
// STRUCTS
//
//--------------------------------------------------------------------------------------//
/**
* @brief Context ID.
*/
typedef struct
{
uint64_t handle;
} rocprofiler_context_id_t;
/**
* @brief Queue ID.
*/
typedef struct
{
uint64_t handle;
} rocprofiler_queue_id_t;
/**
* @brief ROCProfiler Record Correlation ID.
*/
typedef struct
{
uint64_t id;
} rocprofiler_correlation_id_t;
/**
* @brief ROCProfiler External Correlation ID.
*/
typedef struct
{
uint64_t id;
} rocprofiler_external_correlation_id_t;
/**
* @brief Buffer ID.
* @addtogroup BUFFER_HANDLING
*/
typedef struct
{
uint64_t handle;
} rocprofiler_buffer_id_t;
/**
* @brief Agent Identifier
*/
typedef struct
{
uint64_t handle;
} rocprofiler_agent_id_t;
/**
* @brief Counter ID.
*/
typedef struct
{
uint64_t handle;
} rocprofiler_counter_id_t;
/**
* @brief Profile Configurations
*/
typedef struct
{
uint64_t handle;
} rocprofiler_profile_config_id_t;
/**
* @brief Array of PC Sampling Configurations
*/
typedef struct rocprofiler_pc_sampling_config_array_s
{
rocprofiler_pc_sampling_configuration_t* data;
size_t size;
} rocprofiler_pc_sampling_config_array_t;
/**
* @brief Tracing record
*
*/
typedef struct rocprofiler_callback_tracing_record_t
{
rocprofiler_thread_id_t thread_id;
rocprofiler_correlation_id_t correlation_id;
rocprofiler_external_correlation_id_t external_correlation_id;
rocprofiler_service_callback_tracing_kind_t kind;
uint32_t operation;
rocprofiler_service_callback_phase_t phase;
rocprofiler_user_data_t data;
void* payload;
} rocprofiler_callback_tracing_record_t;
/**
* @brief Generic record with type identifier(s) and a pointer to data. This data type is used with
* buffered data.
*
* @code{.cpp}
* void
* tool_tracing_callback(rocprofiler_record_header_t** headers,
* size_t num_headers)
* {
* for(size_t i = 0; i < num_headers; ++i)
* {
* rocprofiler_record_header_t* header = headers[i];
*
* if(header->category == ROCPROFILER_BUFFER_CATEGORY_TRACING &&
* header->kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API)
* {
* // cast to rocprofiler_buffer_tracing_hsa_api_record_t which
* // is type associated with this category + kind
* auto* record =
* static_cast<rocprofiler_buffer_tracing_hsa_api_record_t*>(header->payload);
*
* // trivial test
* assert(record->start_timestamp <= record->end_timestamp);
* }
* }
* }
*
* @endcode
*/
typedef struct
{
union
{
struct
{
uint32_t category; ///< rocprofiler_buffer_category_t
uint32_t kind; ///< domain
};
uint64_t hash; ///< generic identifier. You can compute this via: `uint64_t hash = category
///< | ((uint64_t)(kind) << 32)`, e.g., with @ref rocprofiler_record_header_compute_hash
};
void* payload;
} rocprofiler_record_header_t;
/**
* @brief Function for computing the unsigned 64-bit hash value in @ref rocprofiler_record_header_t
* from a category and kind (two unsigned 32-bit values)
*
* @param category [in] a value from @ref rocprofiler_buffer_category_t
* @param kind [in] depending on the category, this is the domain value, e.g., @ref
* rocprofiler_service_buffer_tracing_kind_t value
* @return uint64_t hash value of category and kind
*/
static inline uint64_t
rocprofiler_record_header_compute_hash(uint32_t category, uint32_t kind)
{
uint64_t value = category;
value |= ((uint64_t)(kind)) << 32;
return value;
}
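The packing performed by `rocprofiler_record_header_compute_hash` can be illustrated in isolation. The sketch below re-implements the same arithmetic in a standalone function and adds two helpers, `hash_category` and `hash_kind`, which are hypothetical names (not part of rocprofiler) that undo the packing. Note that the anonymous struct in `rocprofiler_record_header_t` only overlays `hash` in this way on a little-endian target, which is assumed here.

```cpp
#include <cassert>
#include <cstdint>

// Same arithmetic as rocprofiler_record_header_compute_hash:
// category in the lower 32 bits, kind in the upper 32 bits.
constexpr uint64_t
compute_hash(uint32_t category, uint32_t kind)
{
    return static_cast<uint64_t>(category) | (static_cast<uint64_t>(kind) << 32);
}

// Hypothetical helper: recover the category from a packed hash.
constexpr uint32_t
hash_category(uint64_t hash)
{
    return static_cast<uint32_t>(hash & 0xffffffffULL);
}

// Hypothetical helper: recover the kind from a packed hash.
constexpr uint32_t
hash_kind(uint64_t hash)
{
    return static_cast<uint32_t>(hash >> 32);
}

// Because the functions are constexpr, a packed value can be used as a
// compile-time constant, e.g. as a case label when switching on header->hash.
static_assert(compute_hash(1, 2) == 0x0000000200000001ULL,
              "kind occupies the upper 32 bits");
```

Comparing a single 64-bit `hash` is equivalent to comparing `category` and `kind` separately, which may be convenient when dispatching on many record types.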
/**
* @brief ROCProfiler Profile Counting Counter per instance.
*/
typedef struct
{
rocprofiler_counter_id_t counter_id;
rocprofiler_counter_instance_id_t instance_id;
double counter_value;
} rocprofiler_record_counter_t;
/**
* @brief ROCProfiler PC Sampling Record.
*
*/
typedef struct
{
uint64_t pc;
uint64_t dispatch_id;
uint64_t timestamp;
uint64_t hardware_id;
union
{
uint8_t arb_value;
};
union
{
void* data;
};
} rocprofiler_pc_sampling_record_t;
/**
* @brief ROCProfiler SPM Record.
*
*/
typedef struct
{
/**
* Counters, including the identifiers needed to look up counter information and the
* counter values
*/
rocprofiler_record_counter_t* counters;
uint64_t counters_count;
} rocprofiler_spm_record_t;
/** @} */
ROCPROFILER_EXTERN_C_FINI
-1
@@ -27,7 +27,6 @@
#include <stdint.h>
typedef uint32_t rocprofiler_trace_record_hip_operation_kind_t;
typedef struct rocprofiler_hip_trace_data_s rocprofiler_hip_trace_data_t;
typedef struct rocprofiler_hip_api_data_s rocprofiler_hip_api_data_t;
+5 -24
@@ -30,33 +30,14 @@
#include <stdint.h>
typedef uint32_t rocprofiler_trace_record_hsa_operation_kind_t;
typedef struct hsa_kernel_dispatch_packet_s hsa_kernel_dispatch_packet_t;
typedef struct rocprofiler_hsa_trace_data_s rocprofiler_hsa_trace_data_t;
typedef struct rocprofiler_hsa_api_data_s rocprofiler_hsa_api_data_t;
struct rocprofiler_hsa_api_data_s
{
uint64_t correlation_id;
uint32_t phase;
union
{
uint64_t uint64_t_retval;
uint32_t uint32_t_retval;
hsa_signal_value_t hsa_signal_value_t_retval;
hsa_status_t hsa_status_t_retval;
};
rocprofiler_hsa_api_args_t args;
uint64_t* phase_data;
};
struct rocprofiler_hsa_trace_data_s
{
rocprofiler_hsa_api_data_t api_data;
uint64_t phase_enter_timestamp;
uint64_t phase_exit_timestamp;
uint64_t phase_data;
void (*phase_enter)(rocprofiler_hsa_api_id_t operation_id, rocprofiler_hsa_trace_data_t* data);
void (*phase_exit)(rocprofiler_hsa_api_id_t operation_id, rocprofiler_hsa_trace_data_t* data);
uint64_t correlation_id;
uint32_t phase;
rocprofiler_hsa_api_args_t args;
rocprofiler_hsa_api_retval_t retval;
uint64_t* phase_data;
};
+8
@@ -26,6 +26,14 @@
#include <hsa/hsa_ext_image.h>
#include <rocprofiler/version.h>
typedef union rocprofiler_hsa_api_retval_u
{
uint64_t uint64_t_retval;
uint32_t uint32_t_retval;
hsa_signal_value_t hsa_signal_value_t_retval;
hsa_status_t hsa_status_t_retval;
} rocprofiler_hsa_api_retval_t;
typedef union rocprofiler_hsa_api_args_u
{
// block: CoreApi API
+123
@@ -0,0 +1,123 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup INTERNAL_THREADING Internal thread handling
*
* Callbacks before and after threads created internally by libraries
*
* @{
*/
/**
* @brief Enumeration for specifying the libraries for which you want callbacks before and after the
* library creates an internal thread. These callbacks will be invoked on the thread that is about to create
* the new thread (not on the newly created thread). In thread-aware tools that wrap pthread_create,
* this can be used to disable the wrapper before the pthread_create invocation and re-enable the
* wrapper afterwards. In many cases, tools will want to ignore the thread(s) created by rocprofiler
* since these threads do not exist in the normal application execution, whereas the internal
* threads for HSA, HIP, etc. are created in normal application execution; however, the HIP, HSA,
* etc. internal threads are typically background threads which just monitor kernel completion and
* are unlikely to contribute to any performance issues.
*/
typedef enum
{
ROCPROFILER_LIBRARY = (1 << 0),
ROCPROFILER_HSA_LIBRARY = (1 << 1),
ROCPROFILER_HIP_LIBRARY = (1 << 2),
ROCPROFILER_MARKER_LIBRARY = (1 << 3),
ROCPROFILER_LIBRARY_LAST = ROCPROFILER_MARKER_LIBRARY,
} rocprofiler_internal_thread_library_t;
/**
* @brief Callback type before and after internal thread creation. @see
* rocprofiler_at_internal_thread_create
*
*/
typedef void (*rocprofiler_internal_thread_library_cb_t)(rocprofiler_internal_thread_library_t,
void*);
/**
* @brief Invoke this function to receive callbacks before and after the creation of an internal
* thread by a library. The callbacks are invoked on the thread which is creating the internal
* thread(s). Please note that the postcreate callback is guaranteed to be invoked after the
* underlying system call to create a new thread but it does not guarantee that the new thread has
* been started. Please note that once these callbacks are registered, they cannot be removed, so
* the caller is responsible for ignoring these callbacks if they want to ignore them beyond a
* certain point in the application.
*
* @param precreate [in] Callback invoked immediately before a new internal thread is created
* @param postcreate [in] Callback invoked immediately after a new internal thread is created
* @param libs [in] Bitwise-or of libraries, e.g. `ROCPROFILER_LIBRARY | ROCPROFILER_MARKER_LIBRARY`
* means the callbacks will be invoked whenever rocprofiler and/or the marker library create
* internal threads but not when the HSA or HIP libraries create internal threads.
* @param data [in] Data shared between callbacks
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_at_internal_thread_create(rocprofiler_internal_thread_library_cb_t precreate,
rocprofiler_internal_thread_library_cb_t postcreate,
int libs,
void* data);
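The `libs` argument is a bitwise-or of `rocprofiler_internal_thread_library_t` values. The following standalone sketch shows how such a mask selects which thread creations are bracketed by the pre/post callbacks. The enum values are copied from the header above; the `registration` struct and `create_internal_thread` function are hypothetical stand-ins for illustration and do not describe how rocprofiler implements this internally.

```cpp
#include <cassert>

// Values copied from rocprofiler_internal_thread_library_t.
enum internal_thread_library
{
    LIBRARY        = (1 << 0),
    HSA_LIBRARY    = (1 << 1),
    HIP_LIBRARY    = (1 << 2),
    MARKER_LIBRARY = (1 << 3),
};

using thread_library_cb_t = void (*)(internal_thread_library, void*);

// Hypothetical stand-in for what a registration call would store.
struct registration
{
    thread_library_cb_t precreate;
    thread_library_cb_t postcreate;
    int                 libs;
    void*               data;
};

// Hypothetical: create a thread on behalf of `lib`, invoking the callbacks
// on the creating thread only if `lib` is present in the registered mask.
template <typename Func>
void
create_internal_thread(const registration& reg, internal_thread_library lib, Func&& create_thread)
{
    const bool selected = (reg.libs & lib) != 0;
    if(selected && reg.precreate) reg.precreate(lib, reg.data);
    create_thread();
    if(selected && reg.postcreate) reg.postcreate(lib, reg.data);
}

// Tool-side callbacks which count how often each one fires.
struct counters
{
    int pre  = 0;
    int post = 0;
};

void
on_precreate(internal_thread_library, void* data)
{
    ++static_cast<counters*>(data)->pre;
}

void
on_postcreate(internal_thread_library, void* data)
{
    ++static_cast<counters*>(data)->post;
}
```

With a mask of `LIBRARY | MARKER_LIBRARY`, thread creations attributed to the HSA or HIP libraries pass through without invoking either callback.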
/**
* @brief opaque handle to an internal thread identifier which delivers callbacks for buffers
*/
typedef struct
{
uint64_t handle;
} rocprofiler_callback_thread_t;
/**
* @brief Create a handle to a unique thread (created by rocprofiler) which, when associated with a
* particular buffer, will guarantee those buffered results always get delivered on the same thread.
* This is useful to prevent/control thread-safety issues and/or enable multithreaded processing of
* buffers with non-overlapping data
*
* @param [in] cb_thread_id User-provided pointer to a @ref rocprofiler_callback_thread_t
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_create_callback_thread(rocprofiler_callback_thread_t* cb_thread_id)
ROCPROFILER_NONNULL(1);
/**
* @brief By default, all buffered results are delivered on the same thread. Using @ref
* rocprofiler_create_callback_thread, one or more buffers can be assigned to deliver their results
* on a unique, dedicated thread.
*
* @param [in] buffer_id Buffer identifier
* @param [in] cb_thread_id Callback thread identifier via @ref rocprofiler_create_callback_thread
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_assign_callback_thread(rocprofiler_buffer_id_t buffer_id,
rocprofiler_callback_thread_t cb_thread_id);
/** @} */
ROCPROFILER_EXTERN_C_FINI
-3
@@ -24,9 +24,6 @@
#include <rocprofiler/marker/api_args.h>
#include <stdint.h>
typedef uint32_t rocprofiler_trace_record_marker_operation_kind_t;
typedef struct rocprofiler_roctx_api_data_s rocprofiler_roctx_api_data_t;
struct rocprofiler_roctx_api_data_s
+79
@@ -0,0 +1,79 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/agent.h>
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup PC_SAMPLING_SERVICE PC Sampling Service
* @{
*/
/**
* @brief Create PC Sampling Service.
*
* @param [in] context_id
* @param [in] agent
* @param [in] method
* @param [in] unit
* @param [in] interval
* @param [in] buffer_id
* @return ::rocprofiler_status_t
*
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t context_id,
rocprofiler_agent_t agent,
rocprofiler_pc_sampling_method_t method,
rocprofiler_pc_sampling_unit_t unit,
uint64_t interval,
rocprofiler_buffer_id_t buffer_id);
struct rocprofiler_pc_sampling_configuration_s
{
rocprofiler_pc_sampling_method_t method;
rocprofiler_pc_sampling_unit_t unit;
size_t min_interval;
size_t max_interval;
uint64_t flags;
};
/**
* @brief Query PC Sampling Configuration.
*
* @param [in] agent
* @param [out] config
* @param [out] config_count
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_query_pc_sampling_agent_configurations(rocprofiler_agent_t agent,
rocprofiler_pc_sampling_configuration_t* config,
size_t* config_count) ROCPROFILER_NONNULL(2, 3);
/** @} */
ROCPROFILER_EXTERN_C_FINI
+63
@@ -0,0 +1,63 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/agent.h>
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup PROFILE_CONFIG Profile Configurations
*
* @{
*/
/**
* @brief Create Profile Configuration.
*
* @param [in] agent Agent identifier
* @param [in] counters_list List of GPU counters
* @param [in] counters_count Size of counters list
* @param [out] config_id Identifier for GPU counters group
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_create_profile_config(rocprofiler_agent_t agent,
rocprofiler_counter_id_t* counters_list,
size_t counters_count,
rocprofiler_profile_config_id_t* config_id)
ROCPROFILER_NONNULL(4);
/**
* @brief Destroy Profile Configuration.
*
* @param [in] config_id
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_destroy_profile_config(rocprofiler_profile_config_id_t config_id);
/** @} */
ROCPROFILER_EXTERN_C_FINI
+220
@@ -0,0 +1,220 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/**
* @defgroup REGISTRATION_GROUP Tool registration
*
* Data types and functions for tool registration with rocprofiler
* @{
*/
/**
* @brief A pointer to this data structure is provided to the client tool initialization function.
* The name member can be set by the client to assist with debugging (e.g. rocprofiler cannot start
* your context because there is a conflicting context started by `<name>` -- at least that is the
* plan). The handle member is a unique identifier assigned by rocprofiler for the client and the
* client can store it and pass it to the @ref rocprofiler_client_finalize_t function to force
* finalization (i.e. deactivate all of its contexts) for the client.
*/
typedef struct
{
const char* name; ///< clients should set this value for debugging
const uint32_t handle; ///< internal handle
} rocprofiler_client_id_t;
typedef void (*rocprofiler_client_finalize_t)(rocprofiler_client_id_t);
typedef int (*rocprofiler_tool_initialize_t)(rocprofiler_client_finalize_t finalize_func,
void* tool_data);
typedef void (*rocprofiler_tool_finalize_t)(void* tool_data);
/**
* @brief Data structure containing an initialization function, a finalization function, and tool data
*
*/
typedef struct
{
size_t size; ///< in case of future extensions
rocprofiler_tool_initialize_t initialize; ///< context creation
rocprofiler_tool_finalize_t finalize; ///< cleanup
void* tool_data; ///< data to provide to init and fini callbacks
} rocprofiler_tool_configure_result_t;
/**
* @brief Query whether rocprofiler has already scanned the binary for all the instances of @ref
* rocprofiler_configure (or is currently scanning). If rocprofiler has completed its scan, clients
* can directly register themselves with rocprofiler.
*
* @param [out] status 0 indicates rocprofiler has not been initialized (i.e. configured), 1
* indicates rocprofiler has been initialized, -1 indicates rocprofiler is currently initializing.
* @return rocprofiler_status_t
*/
rocprofiler_status_t
rocprofiler_is_initialized(int* status) ROCPROFILER_API;
/**
* @brief Query rocprofiler finalization status.
*
* @param [out] status 0 indicates rocprofiler has not been finalized, 1 indicates rocprofiler has
* been finalized, -1 indicates rocprofiler is currently finalizing.
* @return rocprofiler_status_t
*/
rocprofiler_status_t
rocprofiler_is_finalized(int* status) ROCPROFILER_API;
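Both `rocprofiler_is_initialized` and `rocprofiler_is_finalized` report a tri-state integer (0, 1, or -1). A tool may find it clearer to map that integer onto a named state before acting on it. The sketch below is one way to do so; `lifecycle_state`, `to_lifecycle_state`, and `can_register_directly` are hypothetical names introduced for illustration and are not part of rocprofiler.

```cpp
#include <cassert>

// Hypothetical tool-side enum; rocprofiler itself only reports the raw int.
enum class lifecycle_state
{
    not_started,
    complete,
    in_progress,
    unknown,
};

// Maps the documented status values (0, 1, -1) onto a named state.
// Any other value is treated as unknown.
constexpr lifecycle_state
to_lifecycle_state(int status)
{
    return (status == 0)    ? lifecycle_state::not_started
           : (status == 1)  ? lifecycle_state::complete
           : (status == -1) ? lifecycle_state::in_progress
                            : lifecycle_state::unknown;
}

// Hypothetical policy based on the documentation above: a client registers
// itself directly only once rocprofiler has completed its scan.
constexpr bool
can_register_directly(int init_status)
{
    return to_lifecycle_state(init_status) == lifecycle_state::complete;
}
```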
/**
* @brief This is the special function that tools define to enable rocprofiler support. The tool
* should return a pointer to
* @ref rocprofiler_tool_configure_result_t which will contain a function pointer to (1) an
* initialization function where all the contexts are created, (2) a finalization function (if
* necessary) which will be invoked when rocprofiler shuts down, and (3) a pointer to any data that
* the tool wants communicated between the @ref rocprofiler_tool_configure_result_t::initialize and
* @ref rocprofiler_tool_configure_result_t::finalize functions. If the user
*
* @param [in] version The version of rocprofiler: `(10000 * major) + (100 * minor) + patch`
* @param [in] runtime_version String descriptor of the rocprofiler version and other relevant info.
* @param [in] priority How many client tools were initialized before this client tool
* @param [in, out] client_id tool identifier value.
* @return rocprofiler_tool_configure_result_t*
*
* @code{.cpp}
* #include <rocprofiler/registration.h>
*
* #include <cassert>
* #include <cstdio>
*
* static rocprofiler_client_id_t my_client_id;
* static rocprofiler_client_finalize_t my_client_fini_func;
* static int my_tool_data = 1234;
*
* static int my_init_func(rocprofiler_client_finalize_t fini_func,
* void* tool_data)
* {
* // store the client finalizer under a name distinct from the tool finalize callback
* my_client_fini_func = fini_func;
*
* assert(*static_cast<int*>(tool_data) == 1234 && "tool_data is wrong");
*
* rocprofiler_context_id_t ctx;
* rocprofiler_create_context(&ctx);
*
* if(int valid_ctx = 0;
* rocprofiler_context_is_valid(ctx, &valid_ctx) != ROCPROFILER_STATUS_SUCCESS ||
* valid_ctx == 0)
* {
* // notify rocprofiler that initialization failed
* // and all the contexts, buffers, etc. created
* // should be ignored
* return -1;
* }
*
* if(rocprofiler_start_context(ctx) != ROCPROFILER_STATUS_SUCCESS)
* {
* // notify rocprofiler that initialization failed
* // and all the contexts, buffers, etc. created
* // should be ignored
* return -1;
* }
*
* // no errors
* return 0;
* }
*
* static void my_fini_func(void* tool_data)
* {
* assert(*static_cast<int*>(tool_data) == 1234 && "tool_data is wrong");
* }
*
* rocprofiler_tool_configure_result_t*
* rocprofiler_configure(uint32_t version,
* const char* runtime_version,
* uint32_t priority,
* rocprofiler_client_id_t* client_id)
* {
* // only activate if main tool
* if(priority > 0) return nullptr;
*
* // set the client name
* client_id->name = "ExampleTool";
*
* // make a copy of client info
* my_client_id = *client_id;
*
* // compute major/minor/patch version info
* uint32_t major = version / 10000;
* uint32_t minor = (version % 10000) / 100;
* uint32_t patch = version % 100;
*
* // print info
* printf("Configuring rocprofiler (v%u.%u.%u) [%s]\n", major, minor, patch, runtime_version);
*
* // create configure data
* static auto cfg = rocprofiler_tool_configure_result_t{ sizeof(rocprofiler_tool_configure_result_t),
* &my_init_func,
* &my_fini_func,
* &my_tool_data };
*
* // return pointer to configure data
* return &cfg;
* }
* @endcode
*/
rocprofiler_tool_configure_result_t*
rocprofiler_configure(uint32_t version,
const char* runtime_version,
uint32_t priority,
rocprofiler_client_id_t* client_id) ROCPROFILER_PUBLIC_API;
// NOTE: we use ROCPROFILER_PUBLIC_API above instead of ROCPROFILER_API because we always
// want the symbol to be visible when the user includes the header for the prototype
/**
* @brief Function pointer typedef for @ref rocprofiler_configure function
* @param [in] version The version of rocprofiler: `(10000 * major) + (100 * minor) + patch`
* @param [in] runtime_version String descriptor of the rocprofiler version and other relevant info.
* @param [in] priority How many client tools were initialized before this client tool
* @param [in, out] client_id tool identifier value.
*/
typedef rocprofiler_tool_configure_result_t* (*rocprofiler_configure_func_t)(
uint32_t version,
const char* runtime_version,
uint32_t priority,
rocprofiler_client_id_t* client_id);
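The `version` parameter is documented as `(10000 * major) + (100 * minor) + patch`. A minimal sketch of encoding and decoding that value follows; `version_triplet`, `encode_version`, and `decode_version` are hypothetical helper names, not rocprofiler functions, and the scheme assumes minor and patch each stay below 100.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the three version components.
struct version_triplet
{
    uint32_t major_ver;
    uint32_t minor_ver;
    uint32_t patch_ver;
};

// Encode as documented: (10000 * major) + (100 * minor) + patch.
constexpr uint32_t
encode_version(uint32_t major_ver, uint32_t minor_ver, uint32_t patch_ver)
{
    return (10000 * major_ver) + (100 * minor_ver) + patch_ver;
}

// Decode using the same arithmetic as the rocprofiler_configure example.
constexpr version_triplet
decode_version(uint32_t version)
{
    return version_triplet{version / 10000, (version % 10000) / 100, version % 100};
}
```

For instance, version 1.2.3 encodes to 10203, and decoding 10203 yields the components 1, 2, and 3 again.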
/**
* @brief Function for explicitly registering a configuration with rocprofiler. This can be invoked
* before any ROCm runtimes (lazily) initialize and context(s) can be started before the runtimes
* initialize.
* @param [in] configure_func Address of @ref rocprofiler_configure function. A null pointer is
* acceptable if the address is not known
* @returns rocprofiler_status_t If rocprofiler has already been configured, or is currently being
* configured, this function will return @ref ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED.
*/
rocprofiler_status_t
rocprofiler_force_configure(rocprofiler_configure_func_t configure_func) ROCPROFILER_API;
/** @} */
ROCPROFILER_EXTERN_C_FINI
File diff not shown because it is too large
+39 -43
@@ -20,7 +20,7 @@
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
/** \section rocprofiler_plugin_api ROCProfiler Plugin API
/** @section rocprofiler_plugin_api ROCProfiler Plugin API
*
* The ROCProfiler Plugin API is used by the ROCProfiler Tool to output all
* profiling information. Different implementations of the ROCProfiler Plugin
@@ -37,7 +37,7 @@
*/
/**
* \file
* @file
* ROCProfiler Tool Plugin API interface.
*/
@@ -47,44 +47,42 @@
#include <stdint.h>
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
ROCPROFILER_EXTERN_C_INIT /* __cplusplus */
/** \defgroup rocprofiler_plugins ROCProfiler Plugin API Specification
* @{
*/
/** @defgroup rocprofiler_plugins ROCProfiler Plugin API Specification
* @{
*/
/** \defgroup initialization_group Initialization and Finalization
* \ingroup rocprofiler_plugins
*
* The ROCProfiler Plugin API must be initialized before using any of the
* operations to report trace data, and finalized after the last trace data has
* been reported.
*
* @{
*/
/** @defgroup initialization_group Initialization and Finalization
* @ingroup rocprofiler_plugins
*
* The ROCProfiler Plugin API must be initialized before using any of the
* operations to report trace data, and finalized after the last trace data has
* been reported.
*
* @{
*/
/**
* Initialize plugin.
* Must be called before any other operation.
*
* @param[in] rocprofiler_major_version The major version of the ROCProfiler API
* being used by the ROCProfiler Tool. An error is reported if this does not
* match the major version of the ROCProfiler API used to build the plugin
* library. This ensures compatibility of the trace data format.
* @param[in] rocprofiler_minor_version The minor version of the ROCProfiler API
* being used by the ROCProfiler Tool. An error is reported if the
* \p rocprofiler_major_version matches and this is greater than the minor
* version of the ROCProfiler API used to build the plugin library. This ensures
* compatibility of the trace data format.
* @param[in] data Pointer to the data passed to the ROCProfiler Plugin by the tool
* @return Returns 0 on success and -1 on error.
*/
ROCPROFILER_EXPORT int
rocprofiler_plugin_initialize(uint32_t rocprofiler_major_version,
uint32_t rocprofiler_minor_version,
void* data);
/**
* Initialize plugin.
* Must be called before any other operation.
*
* @param[in] rocprofiler_major_version The major version of the ROCProfiler API
* being used by the ROCProfiler Tool. An error is reported if this does not
* match the major version of the ROCProfiler API used to build the plugin
* library. This ensures compatibility of the trace data format.
* @param[in] rocprofiler_minor_version The minor version of the ROCProfiler API
* being used by the ROCProfiler Tool. An error is reported if the
* @p rocprofiler_major_version matches and this is greater than the minor
* version of the ROCProfiler API used to build the plugin library. This ensures
* compatibility of the trace data format.
* @param[in] data Pointer to the data passed to the ROCProfiler Plugin by the tool
* @return Returns 0 on success and -1 on error.
*/
ROCPROFILER_EXPORT int
rocprofiler_plugin_initialize(uint32_t rocprofiler_major_version,
uint32_t rocprofiler_minor_version,
void* data);
/**
* Finalize plugin.
@@ -97,8 +95,8 @@ rocprofiler_plugin_finalize();
/** @} */
/** \defgroup profiling_record_write_functions Profiling data reporting
* \ingroup rocprofiler_plugins
/** @defgroup profiling_record_write_functions Profiling data reporting
* @ingroup rocprofiler_plugins
* Operations to output profiling data.
* @{
*/
@@ -128,12 +126,10 @@ rocprofiler_plugin_write_buffer_records(rocprofiler_context_id_t context_id
*/
ROCPROFILER_EXPORT int
rocprofiler_plugin_write_record(rocprofiler_record_tracer_t record);
rocprofiler_plugin_write_record(rocprofiler_record_header_t record);
/** @} */
/** @} */
#ifdef __cplusplus
} /* extern "C" */
#endif /* __cplusplus */
ROCPROFILER_EXTERN_C_FINI
+51
@@ -0,0 +1,51 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/defines.h>
#include <rocprofiler/fwd.h>
ROCPROFILER_EXTERN_C_INIT
/** @defgroup SPM_SERVICE SPM Service
* @{
*/
/**
* @brief Configure SPM Service.
*
* @param [in] context_id
* @param [in] buffer_id
* @param [in] profile_config
* @param [in] interval
* @return ::rocprofiler_status_t
*/
rocprofiler_status_t ROCPROFILER_API
rocprofiler_configure_spm_service(rocprofiler_context_id_t context_id,
rocprofiler_buffer_id_t buffer_id,
rocprofiler_profile_config_id_t profile_config,
uint64_t interval);
/** @} */
ROCPROFILER_EXTERN_C_FINI
+2 -1
@@ -26,6 +26,7 @@ target_link_libraries(
$<BUILD_INTERFACE:rocprofiler::rocprofiler-dl>
$<BUILD_INTERFACE:rocprofiler::rocprofiler-hip>
$<BUILD_INTERFACE:rocprofiler::rocprofiler-amd-comgr>
$<BUILD_INTERFACE:rocprofiler::rocprofiler-hsa-runtime>)
$<BUILD_INTERFACE:rocprofiler::rocprofiler-hsa-runtime>
$<BUILD_INTERFACE:rocprofiler::rocprofiler-ptl>)
set_target_properties(rocprofiler-common-library PROPERTIES OUTPUT_NAME
rocprofiler-common)
+7 -6
@@ -59,7 +59,7 @@ record_header_buffer::operator=(record_header_buffer&& _rhs) noexcept
if(this != &_rhs)
{
auto _lk = rhb_raii_lock{_rhs};
m_index = _rhs.m_index.load(std::memory_order_relaxed);
m_index = _rhs.m_index.load(std::memory_order_acquire);
m_buffer = std::move(_rhs.m_buffer);
m_headers = std::move(_rhs.m_headers);
_rhs.reset();
@@ -74,7 +74,8 @@ record_header_buffer::allocate(size_t num_bytes)
auto _lk = rhb_raii_lock{*this};
m_buffer.init(num_bytes);
m_headers.resize(m_buffer.capacity(), rocprofiler_record_header_t{0, nullptr});
m_headers.resize(m_buffer.capacity(),
rocprofiler_record_header_t{.hash = 0, .payload = nullptr});
return true;
}
@@ -83,13 +84,13 @@ record_header_buffer::get_record_headers(size_t _n)
{
auto _lk = rhb_raii_lock{*this};
auto _sz = m_index.load(std::memory_order_relaxed);
auto _sz = m_index.load(std::memory_order_acquire);
if(_n > _sz) _n = _sz;
auto _ret = record_ptr_vec_t{};
_ret.reserve(_n);
for(size_t i = 0; i < _n; ++i)
{
if(auto& itr = m_headers.at(i); itr.kind > 0 && itr.payload != nullptr)
if(auto& itr = m_headers.at(i); itr.hash > 0 && itr.payload != nullptr)
_ret.emplace_back(&itr);
}
return _ret;
@@ -105,9 +106,9 @@ record_header_buffer::clear()
auto _sz = m_buffer.capacity();
if(!m_buffer.clear(std::nothrow_t{})) return 0;
std::for_each(m_headers.begin(), m_headers.end(), [](auto& itr) {
itr = rocprofiler_record_header_t{0, nullptr};
itr = rocprofiler_record_header_t{.hash = 0, .payload = nullptr};
});
m_headers.resize(_sz, rocprofiler_record_header_t{0, nullptr});
m_headers.resize(_sz, rocprofiler_record_header_t{.hash = 0, .payload = nullptr});
m_index.store(0, std::memory_order_release);
}
+69 -8
@@ -29,6 +29,7 @@
#include <atomic>
#include <limits>
#include <mutex>
#include <shared_mutex>
#include <vector>
namespace rocprofiler
@@ -70,17 +71,29 @@ struct record_header_buffer
template <typename Tp>
bool emplace(uint64_t, Tp&);
/// place an object in the buffer using the specified numerical identifier
template <typename Tp>
bool emplace(uint32_t, uint32_t, Tp&);
/// this function will return a vector of pointers to the record headers
/// at the time of invocation.
record_ptr_vec_t get_record_headers(size_t _n = std::numeric_limits<size_t>::max());
/// prevent emplace
/// record_header_buffer is a multiple writer, single reader data structure so
/// this function prevents writing via emplace
void lock();
/// try to re-enable emplace
/// potentially re-enable emplace if no other readers have locked
void unlock();
/// check if emplace is available
/// record_header_buffer is a multiple writer, single reader data structure so
/// this function prevents reading while emplacing
void read_lock();
/// potentially allow reading after writing via emplace
void read_unlock();
/// check if writing is available
bool is_locked() const;
/// restores to original empty state
@@ -116,6 +129,7 @@ struct record_header_buffer
private:
std::atomic<int32_t> m_locked = {0};
std::atomic<size_t> m_index = {};
std::shared_mutex m_shared = {};
base_buffer_t m_buffer = {};
record_vec_t m_headers = {};
};
@@ -129,13 +143,27 @@ record_header_buffer::is_locked() const
inline void
record_header_buffer::lock()
{
m_locked.fetch_add(1, std::memory_order_release);
auto n = m_locked.fetch_add(1, std::memory_order_release);
if(n == 0) m_shared.lock();
}
inline void
record_header_buffer::unlock()
{
m_locked.fetch_add(-1, std::memory_order_release);
auto n = m_locked.fetch_add(-1, std::memory_order_release);
if(n <= 1) m_shared.unlock();
}
inline void
record_header_buffer::read_lock()
{
m_shared.lock_shared();
}
inline void
record_header_buffer::read_unlock()
{
m_shared.unlock_shared();
}
inline bool
@@ -182,7 +210,7 @@ record_header_buffer::is_full() const
template <typename Tp>
bool
record_header_buffer::emplace(uint64_t _kind, Tp& _v)
record_header_buffer::emplace(uint64_t _hash, Tp& _v)
{
if(is_locked() || m_headers.empty()) return false;
@@ -195,6 +223,7 @@ record_header_buffer::emplace(uint64_t _kind, Tp& _v)
return _ptr;
};
read_lock();
auto _addr = _create_record(m_buffer, _v);
if(_addr)
{
@@ -202,9 +231,41 @@ record_header_buffer::emplace(uint64_t _kind, Tp& _v)
// for where the header record should be placed.
// NOTE: m_headers was resized to be large enough to accommodate
// sizeof(Tp) == 1 for every entry in buffer
auto _idx = m_index++;
m_headers.at(_idx) = rocprofiler_record_header_t{_kind, _addr};
auto idx = m_index.fetch_add(1, std::memory_order_release);
m_headers.at(idx) = rocprofiler_record_header_t{.hash = _hash, .payload = _addr};
}
read_unlock();
return (_addr != nullptr);
}
template <typename Tp>
bool
record_header_buffer::emplace(uint32_t _category, uint32_t _kind, Tp& _v)
{
if(is_locked() || m_headers.empty()) return false;
// request N bytes in the buffer (where N=sizeof(Tp)) and if
// available, copy _v into the buffer region
auto _create_record = [](auto& _buf, auto& _data) {
constexpr auto buffer_sz = sizeof(Tp);
void* _ptr = _buf.request(buffer_sz, false);
if(_ptr) new(_ptr) Tp{_data};
return _ptr;
};
read_lock();
auto _addr = _create_record(m_buffer, _v);
if(_addr)
{
// if there is space in the buffer, atomically get an index
// for where the header record should be placed.
// NOTE: m_headers was resized to be large enough to accommodate
// sizeof(Tp) == 1 for every entry in buffer
auto idx = m_index.fetch_add(1, std::memory_order_release);
m_headers.at(idx) =
rocprofiler_record_header_t{.category = _category, .kind = _kind, .payload = _addr};
}
read_unlock();
return (_addr != nullptr);
}
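The locking scheme in this hunk can be read as an inverted use of `std::shared_mutex`: each `emplace` call takes the mutex in shared mode (via `read_lock`), so writers run concurrently, while the single reader takes it exclusively (via `lock`). Below is a minimal sketch of that idea, not the actual `record_header_buffer`; `mini_buffer`, `drain`, and the fixed-size slot vector are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-in for a multiple-writer, single-reader buffer.
struct mini_buffer
{
    explicit mini_buffer(std::size_t n)
    : m_slots(n, 0)
    {}

    // Writers hold the mutex in *shared* mode so several can emplace at once;
    // each one claims a distinct slot through the atomic index.
    bool emplace(uint64_t value)
    {
        auto lk  = std::shared_lock<std::shared_mutex>{m_shared};
        auto idx = m_index.fetch_add(1, std::memory_order_acq_rel);
        if(idx >= m_slots.size()) return false;  // no free slot left
        m_slots[idx] = value;
        return true;
    }

    // The reader holds the mutex in *exclusive* mode: it waits for in-flight
    // writers to finish and keeps new ones out while the data is copied.
    std::vector<uint64_t> drain()
    {
        auto lk  = std::unique_lock<std::shared_mutex>{m_shared};
        auto n   = std::min(m_index.load(std::memory_order_acquire), m_slots.size());
        auto out = std::vector<uint64_t>(m_slots.begin(),
                                         m_slots.begin() + static_cast<std::ptrdiff_t>(n));
        m_index.store(0, std::memory_order_release);
        return out;
    }

private:
    std::shared_mutex        m_shared = {};
    std::atomic<std::size_t> m_index  = {};
    std::vector<uint64_t>    m_slots  = {};
};
```

Unlike the diff, this sketch blocks writers during a drain rather than rejecting them through an `is_locked()` check; that simplification is an assumption made to keep the example short.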
+17
@@ -29,6 +29,7 @@
#include <algorithm>
#include <initializer_list>
#include <iterator>
#include <limits>
#include <memory>
#include <numeric>
#include <type_traits>
@@ -40,6 +41,15 @@ namespace common
{
namespace container
{
struct reserve_size
{
explicit reserve_size(size_t _v)
: value{_v}
{}
size_t value;
};
template <typename Tp, size_t ChunkSizeV = 64>
class stable_vector
{
@@ -155,6 +165,7 @@ public:
stable_vector() = default;
explicit stable_vector(size_type count, const Tp& value);
explicit stable_vector(size_type count);
explicit stable_vector(reserve_size&& reserve_count);
template <typename InputItrT,
typename = std::enable_if_t<
@@ -247,6 +258,12 @@ stable_vector<Tp, ChunkSizeV>::stable_vector(size_type count)
}
}
template <typename Tp, size_t ChunkSizeV>
stable_vector<Tp, ChunkSizeV>::stable_vector(reserve_size&& reserve_count)
{
reserve(reserve_count.value);
}
template <typename Tp, size_t ChunkSizeV>
template <typename InputItrT, typename>
stable_vector<Tp, ChunkSizeV>::stable_vector(InputItrT first, InputItrT last)
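The `reserve_size` wrapper added above looks like a strong-type tag: because `stable_vector(size_type count)` already means "construct `count` elements", a separate type is needed to say "only reserve capacity". The sketch below shows the same idiom on a hypothetical `tiny_vector` that wraps `std::vector`; it is not the real `stable_vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strong type marking a count as "capacity to reserve", mirroring reserve_size.
struct reserve_size
{
    explicit reserve_size(std::size_t _v)
    : value{_v}
    {}

    std::size_t value;
};

// Hypothetical container used only to illustrate the overload set.
template <typename Tp>
class tiny_vector
{
public:
    // creates `count` value-initialized elements
    explicit tiny_vector(std::size_t count)
    : m_data(count)
    {}

    // creates no elements, only reserves storage for `n.value` of them
    explicit tiny_vector(reserve_size&& n) { m_data.reserve(n.value); }

    std::size_t size() const { return m_data.size(); }
    std::size_t capacity() const { return m_data.capacity(); }

private:
    std::vector<Tp> m_data = {};
};
```

Both constructors are `explicit`, so a bare integer can never be silently interpreted as a reservation request, and the call site documents the intent.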
+5 -2
@@ -1,8 +1,10 @@
#
#
#
set(ROCPROFILER_LIB_HEADERS config_helpers.hpp config_internal.hpp tracer.hpp)
set(ROCPROFILER_LIB_SOURCES config_internal.cpp rocprofiler_config.cpp rocprofiler.cpp)
set(ROCPROFILER_LIB_HEADERS buffer.hpp internal_threading.hpp registration.hpp)
set(ROCPROFILER_LIB_SOURCES
buffer.cpp buffer_tracing.cpp callback_tracing.cpp context.cpp internal_threading.cpp
rocprofiler.cpp registration.cpp)
add_library(rocprofiler-library SHARED)
add_library(rocprofiler::rocprofiler-library ALIAS rocprofiler-library)
@@ -11,6 +13,7 @@ target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_SOURCES}
${ROCPROFILER_LIB_HEADERS})
add_subdirectory(hsa)
add_subdirectory(context)
target_link_libraries(
rocprofiler-library
+203
@@ -0,0 +1,203 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include "lib/rocprofiler/buffer.hpp"
#include <rocprofiler/rocprofiler.h>
#include "lib/common/container/stable_vector.hpp"
#include "lib/common/utility.hpp"
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/context/domain.hpp"
#include "lib/rocprofiler/hsa/hsa.hpp"
#include "lib/rocprofiler/internal_threading.hpp"
#include "lib/rocprofiler/registration.hpp"
#include <atomic>
#include <exception>
#include <mutex>
#include <vector>
namespace rocprofiler
{
namespace buffer
{
namespace
{
using reserve_size_t = common::container::reserve_size;
auto&
get_buffers_mutex()
{
static auto _v = std::mutex{};
return _v;
}
} // namespace
unique_buffer_vec_t&
get_buffers()
{
static auto _v = unique_buffer_vec_t{reserve_size_t{unique_buffer_vec_t::chunk_size}};
return _v;
}
std::optional<rocprofiler_buffer_id_t>
allocate_buffer()
{
// ... allocate any internal space needed to handle another buffer ...
auto _lk = std::unique_lock<std::mutex>{get_buffers_mutex()};
// index of the new buffer, which becomes its identifier
auto _idx = get_buffers().size();
// make space in the registered buffers
get_buffers().emplace_back(nullptr);
// create an entry in the registered buffers
auto& _cfg_v = get_buffers().back();
_cfg_v = std::make_unique<buffer::instance>();
auto* _cfg = _cfg_v.get();
if(!_cfg) return std::nullopt;
return rocprofiler_buffer_id_t{_idx};
}
rocprofiler_status_t
flush(rocprofiler_buffer_id_t buffer_id, bool wait)
{
if(buffer_id.handle >= get_buffers().size()) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
auto& buff = get_buffers().at(buffer_id.handle);
auto* task_group = rocprofiler::internal_threading::get_task_group(
rocprofiler_callback_thread_t{buff->task_group_id});
if(task_group) task_group->wait();
// buffer is currently being flushed or destroyed
if(buff->syncer.test_and_set()) return ROCPROFILER_STATUS_ERROR_BUFFER_BUSY;
auto buff_idx = buff->buffer_idx++;
auto _task = [buff_idx, buffer_id]() {
auto& _buff = get_buffers().at(buffer_id.handle);
auto& buff_v = _buff->buffers.at(buff_idx % _buff->buffers.size());
if(!buff_v.is_empty())
{
// get the array of record headers
auto buff_data = buff_v.get_record_headers();
// invoke buffer callback
try
{
_buff->callback(rocprofiler_context_id_t{_buff->context_id},
rocprofiler_buffer_id_t{_buff->buffer_id},
buff_data.data(),
buff_data.size(),
_buff->callback_data,
_buff->drop_count);
} catch(std::exception& e)
{
LOG(ERROR) << "buffer callback threw an exception: " << e.what();
}
// clear the buffer
buff_v.clear();
}
_buff->syncer.clear();
};
if(task_group)
{
task_group->exec(_task);
if(wait) task_group->wait();
}
else
{
_task();
}
return ROCPROFILER_STATUS_SUCCESS;
}
} // namespace buffer
} // namespace rocprofiler
extern "C" {
rocprofiler_status_t
rocprofiler_create_buffer(rocprofiler_context_id_t context,
size_t size,
size_t watermark,
rocprofiler_buffer_policy_t action,
rocprofiler_buffer_tracing_cb_t callback,
void* callback_data,
rocprofiler_buffer_id_t* buffer_id)
{
if(rocprofiler::registration::get_init_status() > 0)
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
auto opt_buff_id = rocprofiler::buffer::allocate_buffer();
if(!opt_buff_id) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
*buffer_id = *opt_buff_id;
auto& buff = rocprofiler::buffer::get_buffers().at(opt_buff_id->handle);
// allocate the buffers. If the policy is lossless, allocate a second buffer to store data
// while the other buffer is being flushed
buff->buffers.front().allocate(size);
if(action == ROCPROFILER_BUFFER_POLICY_LOSSLESS) buff->buffers.back().allocate(size);
buff->watermark = watermark;
buff->policy = action;
buff->callback = callback;
buff->callback_data = callback_data;
buff->context_id = context.handle;
// store the identifier; buffer_idx is the rotating index into `buffers`, not the buffer id
buff->buffer_id = buffer_id->handle;
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id)
{
return rocprofiler::buffer::flush(buffer_id, true);
}
rocprofiler_status_t
rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id)
{
if(buffer_id.handle >= rocprofiler::buffer::get_buffers().size())
return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
auto& buff = rocprofiler::buffer::get_buffers().at(buffer_id.handle);
// buffer is currently being flushed or destroyed
if(buff->syncer.test_and_set()) return ROCPROFILER_STATUS_ERROR_BUFFER_BUSY;
for(auto& itr : buff->buffers)
itr.reset();
buff->syncer.clear();
return ROCPROFILER_STATUS_SUCCESS;
}
}
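Both `flush` and `rocprofiler_destroy_buffer` above guard the buffer with `syncer.test_and_set()` and return a busy status instead of blocking. A minimal sketch of that try-lock pattern with `std::atomic_flag` follows; `guarded_resource`, `status`, and `try_flush` are hypothetical names, and the explicit memory orders are an assumption (the diff uses the defaults).

```cpp
#include <atomic>
#include <cassert>

enum class status
{
    success,
    busy
};

// Hypothetical resource protected by a non-blocking flag.
struct guarded_resource
{
    std::atomic_flag syncer  = ATOMIC_FLAG_INIT;
    int              flushes = 0;

    // Returns status::busy right away if another operation already owns the
    // flag; otherwise does the work and releases the flag afterwards.
    status try_flush()
    {
        if(syncer.test_and_set(std::memory_order_acquire)) return status::busy;
        ++flushes;
        syncer.clear(std::memory_order_release);
        return status::success;
    }
};
```

The caller decides whether to retry on `busy`, which keeps the flush path free of blocking waits.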
+122
@@ -0,0 +1,122 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/buffer.h>
#include <rocprofiler/fwd.h>
#include "lib/common/container/record_header_buffer.hpp"
#include "lib/common/container/stable_vector.hpp"
#include "lib/common/demangle.hpp"
#include <array>
#include <atomic>
#include <cstdint>
#include <memory>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <typeinfo>
namespace rocprofiler
{
namespace buffer
{
struct instance
{
using buffer_t = common::container::record_header_buffer;
mutable std::array<buffer_t, 2> buffers = {};
mutable std::atomic<unsigned short> buffer_idx = {};
mutable std::atomic_flag syncer = ATOMIC_FLAG_INIT;
mutable std::atomic<uint64_t> drop_count = {};
uint64_t watermark = 0;
uint64_t context_id = 0;
uint64_t buffer_id = 0;
uint64_t task_group_id = 0;
rocprofiler_buffer_tracing_cb_t callback = nullptr;
void* callback_data = nullptr;
rocprofiler_buffer_policy_t policy = ROCPROFILER_BUFFER_POLICY_NONE;
template <typename Tp>
void emplace(uint32_t, uint32_t, Tp&);
};
using unique_buffer_vec_t = common::container::stable_vector<std::unique_ptr<instance>, 4>;
std::optional<rocprofiler_buffer_id_t>
allocate_buffer();
unique_buffer_vec_t&
get_buffers();
rocprofiler_status_t
flush(rocprofiler_buffer_id_t buffer_id, bool wait);
inline rocprofiler_status_t
flush(uint64_t buffer_idx, bool wait)
{
return flush(rocprofiler_buffer_id_t{buffer_idx}, wait);
}
} // namespace buffer
} // namespace rocprofiler
template <typename Tp>
inline void
rocprofiler::buffer::instance::emplace(uint32_t category, uint32_t kind, Tp& value)
{
// get the index of the current buffer
auto get_idx = [this]() { return buffer_idx.load(std::memory_order_acquire) % buffers.size(); };
auto idx = get_idx();
if(!buffers.at(idx).emplace(category, kind, value))
{
if(buffers.at(idx).size() < sizeof(value))
{
auto msg = std::stringstream{};
msg << "buffer " << buffer_id << " too small (size=" << buffers.at(idx).size()
<< ") to hold an object of type " << common::cxx_demangle(typeid(value).name())
<< " with size " << sizeof(value);
throw std::runtime_error(msg.str());
}
if(policy == ROCPROFILER_BUFFER_POLICY_LOSSLESS)
{
// blocks until buffer is flushed
bool success = false;
while(!success)
{
buffer::flush(buffer_id, true);
idx = get_idx();
success = buffers.at(idx).emplace(category, kind, value);
}
}
else
{
++drop_count;
}
}
if(buffers.at(idx).count() >= watermark)
{
// flush without syncing
buffer::flush(buffer_id, false);
}
}
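The policy branch in `instance::emplace` above can be summarized as: when the current buffer is full, a lossless buffer flushes and retries, while any other policy counts a dropped record. The sketch below illustrates only that decision under simplified assumptions; `tiny_instance`, `policy`, and the single `std::vector` store are hypothetical and omit the double buffering, watermark, and size check of the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class policy
{
    discard,
    lossless
};

// Hypothetical single-buffer model of the full-buffer handling.
struct tiny_instance
{
    tiny_instance(std::size_t cap, policy p)
    : capacity{cap}
    , mode{p}
    {}

    void emplace(int value)
    {
        if(try_put(value)) return;
        if(mode == policy::lossless)
        {
            // make room first, then store the record that did not fit
            flush();
            try_put(value);
        }
        else
        {
            // the record is lost; only the counter remembers it
            ++drop_count;
        }
    }

    // stands in for handing the records to the user callback
    void flush()
    {
        flushed += data.size();
        data.clear();
    }

    std::vector<int> data       = {};
    std::size_t      capacity   = 0;
    std::size_t      drop_count = 0;
    std::size_t      flushed    = 0;
    policy           mode       = policy::discard;

private:
    bool try_put(int value)
    {
        if(data.size() >= capacity) return false;
        data.push_back(value);
        return true;
    }
};
```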
+151
@@ -0,0 +1,151 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include <rocprofiler/fwd.h>
#include <rocprofiler/rocprofiler.h>
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/context/domain.hpp"
#include "lib/rocprofiler/hsa/hsa.hpp"
#include "lib/rocprofiler/registration.hpp"
#include <glog/logging.h>
#include <atomic>
#include <limits>
#include <vector>
#define RETURN_STATUS_ON_FAIL(...) \
if(rocprofiler_status_t _status; (_status = __VA_ARGS__) != ROCPROFILER_STATUS_SUCCESS) \
{ \
return _status; \
}
extern "C" {
rocprofiler_status_t
rocprofiler_configure_buffer_tracing_service(rocprofiler_context_id_t context_id,
rocprofiler_service_buffer_tracing_kind_t kind,
rocprofiler_tracing_operation_t* operations,
size_t operations_count,
rocprofiler_buffer_id_t buffer_id)
{
if(rocprofiler::registration::get_init_status() > 0)
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
if(context_id.handle >= rocprofiler::context::get_registered_contexts().size())
{
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
}
auto& ctx = rocprofiler::context::get_registered_contexts().at(context_id.handle);
if(!ctx) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
constexpr auto invalid_buffer_id =
rocprofiler_buffer_id_t{std::numeric_limits<uint64_t>::max()};
if(!ctx->buffered_tracer)
{
ctx->buffered_tracer = std::make_unique<rocprofiler::context::buffer_tracing_service>();
ctx->buffered_tracer->buffer_data.fill(invalid_buffer_id);
}
if(ctx->buffered_tracer->buffer_data.at(kind).handle != invalid_buffer_id.handle)
return ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED;
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain(ctx->buffered_tracer->domains, kind));
ctx->buffered_tracer->buffer_data.at(kind) = buffer_id;
for(size_t i = 0; i < operations_count; ++i)
{
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain_op(
ctx->buffered_tracer->domains, kind, operations[i]));
}
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_iterate_buffer_tracing_kind_names(rocprofiler_buffer_tracing_kind_name_cb_t callback,
void* data)
{
// TODO(jrmadsen): need to add for other kinds
size_t n = 0;
bool premature = false;
using pair_t = std::pair<rocprofiler_service_buffer_tracing_kind_t, const char*>;
for(auto [eitr, sitr] : {
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API, "HSA_API"},
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_HIP_API, "HIP_API"},
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_MARKER_API, "MARKER_API"},
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_MEMORY_COPY, "MEMORY_COPY"},
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_KERNEL_DISPATCH, "KERNEL_DISPATCH"},
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_PAGE_MIGRATION, "PAGE_MIGRATION"},
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_SCRATCH_MEMORY, "SCRATCH_MEMORY"},
pair_t{ROCPROFILER_SERVICE_BUFFER_TRACING_EXTERNAL_CORRELATION, "EXTERNAL_CORRELATION"},
})
{
auto _success = callback(eitr, sitr, data);
if(_success != 0)
{
premature = true;
break;
}
++n;
}
#if defined(ROCPROFILER_CI)
if(!premature)
{
LOG_ASSERT(n == ROCPROFILER_SERVICE_BUFFER_TRACING_LAST - 1)
<< " :: new enumeration value added. Update this function";
}
#else
(void) n;
(void) premature;
#endif
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_iterate_buffer_tracing_kind_operation_names(
rocprofiler_service_buffer_tracing_kind_t kind,
rocprofiler_buffer_tracing_operation_name_cb_t callback,
void* data)
{
if(kind == ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API)
{
auto ops = rocprofiler::hsa::get_ids();
for(const auto& itr : ops)
{
auto _success = callback(kind, itr, rocprofiler::hsa::name_by_id(itr), data);
if(_success != 0) break;
}
return ROCPROFILER_STATUS_SUCCESS;
}
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
}
#undef RETURN_STATUS_ON_FAIL
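The `RETURN_STATUS_ON_FAIL` macro used in this file relies on the C++17 `if` statement with an initializer, so the temporary `_status` stays scoped to each expansion. A self-contained sketch of the same early-return pattern follows; `status_t`, `step`, and `configure` are hypothetical stand-ins for `rocprofiler_status_t` and the `add_domain` / `add_domain_op` calls.

```cpp
#include <cassert>

enum status_t
{
    STATUS_SUCCESS = 0,
    STATUS_ERROR_NOT_FOUND,
    STATUS_ERROR_LOCKED
};

// Evaluates the expression once and returns its status from the enclosing
// function unless it is STATUS_SUCCESS. Requires C++17.
#define RETURN_STATUS_ON_FAIL(...)                                  \
    if(status_t _status; (_status = __VA_ARGS__) != STATUS_SUCCESS) \
    {                                                               \
        return _status;                                             \
    }

// hypothetical operation that either succeeds or reports `on_fail`
inline status_t
step(bool ok, status_t on_fail)
{
    return ok ? STATUS_SUCCESS : on_fail;
}

// runs two steps in order and stops at the first failure
inline status_t
configure(bool first_ok, bool second_ok)
{
    RETURN_STATUS_ON_FAIL(step(first_ok, STATUS_ERROR_NOT_FOUND));
    RETURN_STATUS_ON_FAIL(step(second_ok, STATUS_ERROR_LOCKED));
    return STATUS_SUCCESS;
}

#undef RETURN_STATUS_ON_FAIL
```

Undefining the macro at the end, as the file does, keeps it from leaking into other translation units that include the same headers.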
+161
@@ -0,0 +1,161 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include <rocprofiler/rocprofiler.h>
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/context/domain.hpp"
#include "lib/rocprofiler/hsa/hsa.hpp"
#include "lib/rocprofiler/registration.hpp"
#include <glog/logging.h>
#include <atomic>
#include <vector>
#define RETURN_STATUS_ON_FAIL(...) \
if(rocprofiler_status_t _status; (_status = __VA_ARGS__) != ROCPROFILER_STATUS_SUCCESS) \
{ \
return _status; \
}
extern "C" {
rocprofiler_status_t
rocprofiler_configure_callback_tracing_service(rocprofiler_context_id_t context_id,
rocprofiler_service_callback_tracing_kind_t kind,
rocprofiler_tracing_operation_t* operations,
size_t operations_count,
rocprofiler_callback_tracing_cb_t callback,
void* callback_args)
{
if(rocprofiler::registration::get_init_status() > 0)
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
if(context_id.handle >= rocprofiler::context::get_registered_contexts().size())
{
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
}
auto& ctx = rocprofiler::context::get_registered_contexts().at(context_id.handle);
if(!ctx) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
if(!ctx->callback_tracer)
ctx->callback_tracer = std::make_unique<rocprofiler::context::callback_tracing_service>();
if(ctx->callback_tracer->callback_data.at(kind).callback)
return ROCPROFILER_STATUS_ERROR_SERVICE_ALREADY_CONFIGURED;
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain(ctx->callback_tracer->domains, kind));
ctx->callback_tracer->callback_data.at(kind) = {callback, callback_args};
for(size_t i = 0; i < operations_count; ++i)
{
RETURN_STATUS_ON_FAIL(rocprofiler::context::add_domain_op(
ctx->callback_tracer->domains, kind, operations[i]));
}
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_iterate_callback_tracing_kind_names(
rocprofiler_callback_tracing_kind_name_cb_t callback,
void* data)
{
// TODO(jrmadsen): need to add for other kinds
size_t n = 0;
bool premature = false;
using pair_t = std::pair<rocprofiler_service_callback_tracing_kind_t, const char*>;
for(auto [eitr, sitr] : {
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API, "HSA_API"},
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_HIP_API, "HIP_API"},
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API, "MARKER_API"},
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_CODE_OBJECT, "CODE_OBJECT"},
pair_t{ROCPROFILER_SERVICE_CALLBACK_TRACING_KERNEL_DISPATCH, "KERNEL_DISPATCH"},
})
{
auto _success = callback(eitr, sitr, data);
if(_success != 0)
{
premature = true;
break;
}
++n;
}
#if defined(ROCPROFILER_CI)
if(!premature)
{
LOG_ASSERT(n == ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST - 1)
<< " :: new enumeration value added. Update this function";
}
#else
(void) n;
(void) premature;
#endif
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_iterate_callback_tracing_kind_operation_names(
rocprofiler_service_callback_tracing_kind_t kind,
rocprofiler_callback_tracing_operation_name_cb_t callback,
void* data)
{
if(kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API)
{
auto ops = rocprofiler::hsa::get_ids();
for(const auto& itr : ops)
{
auto _success = callback(kind, itr, rocprofiler::hsa::name_by_id(itr), data);
if(_success != 0) break;
}
return ROCPROFILER_STATUS_SUCCESS;
}
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
rocprofiler_status_t
rocprofiler_iterate_callback_tracing_operation_args(
rocprofiler_callback_tracing_record_t record,
rocprofiler_callback_tracing_operation_args_cb_t callback,
void* user_data)
{
if(record.kind == ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API)
{
rocprofiler::hsa::iterate_args(
record.operation,
*static_cast<rocprofiler_hsa_api_callback_tracer_data_t*>(record.payload),
callback,
user_data);
return ROCPROFILER_STATUS_SUCCESS;
}
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
}
#undef RETURN_STATUS_ON_FAIL
-28
@@ -1,28 +0,0 @@
#include "config_internal.hpp"
namespace rocprofiler
{
namespace internal
{
uint64_t
correlation_config::get_unique_record_id()
{
static auto _v = std::atomic<uint64_t>{};
return _v++;
}
bool
domain_config::operator()(rocprofiler_tracer_activity_domain_t _domain) const
{
return ((1 << _domain) & domains) == (1 << _domain);
}
bool
domain_config::operator()(rocprofiler_tracer_activity_domain_t _domain, uint32_t _op) const
{
auto _offset = (_domain * rocprofiler::internal::domain_ops_offset);
return (*this)(_domain) && (opcodes.none() || opcodes.test(_offset + _op));
}
} // namespace internal
} // namespace rocprofiler
-74
@@ -1,74 +0,0 @@
#pragma once
#include <rocprofiler/config.h>
#include <rocprofiler/rocprofiler.h>
#include <array>
#include <atomic>
#include <bitset>
#include <cstddef>
#include <cstdint>
namespace rocprofiler
{
namespace internal
{
// number of bits to reserve all op codes
constexpr size_t domain_ops_offset = ROCPROFILER_DOMAIN_OPS_MAX;
constexpr size_t reserved_domain_size = ROCPROFILER_DOMAIN_OPS_RESERVED * 8;
constexpr size_t max_configs_count = 8;
struct correlation_config
{
uint64_t id = 0;
uint64_t external_id = 0;
::rocprofiler_external_cid_cb_t external_id_callback = nullptr;
static uint64_t get_unique_record_id();
};
struct domain_config
{
::rocprofiler_tracer_callback_t user_sync_callback = nullptr;
int64_t domains = 0;
std::bitset<reserved_domain_size> opcodes = {};
/// check if domain is enabled
bool operator()(::rocprofiler_tracer_activity_domain_t) const;
/// check if op in a domain is enabled
bool operator()(::rocprofiler_tracer_activity_domain_t, uint32_t) const;
};
struct buffer_config
{
::rocprofiler_buffer_callback_t callback = nullptr;
uint64_t buffer_size;
// Memory::GenericBuffer* buffer = nullptr;
uint64_t buffer_idx = 0;
};
using filter_config = ::rocprofiler_filter_config;
struct config
{
// size is used to ensure that we never read past the end of the version
size_t size = 0; // = sizeof(rocprofiler_config)
uint32_t compat_version = 0; // set by user
uint32_t api_version = 0; // set by rocprofiler
uint64_t context_idx = 0; // context id index
void* user_data = nullptr; // user data passed to callbacks
correlation_config* correlation_id = nullptr; // &my_cid_config (optional)
buffer_config* buffer = nullptr; // = &my_buffer_config (required)
domain_config* domain = nullptr; // = &my_domain_config (required)
filter_config* filter = nullptr; // = &my_filter_config (optional)
};
std::array<rocprofiler::internal::config*, max_configs_count>&
get_registered_configs();
std::array<std::atomic<rocprofiler::internal::config*>, max_configs_count>&
get_active_configs();
} // namespace internal
} // namespace rocprofiler
+89
@@ -0,0 +1,89 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include <rocprofiler/rocprofiler.h>
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/context/domain.hpp"
#include "lib/rocprofiler/hsa/hsa.hpp"
#include "lib/rocprofiler/registration.hpp"
#include <atomic>
#include <vector>
extern "C" {
rocprofiler_status_t
rocprofiler_create_context(rocprofiler_context_id_t* context_id)
{
if(rocprofiler::registration::get_init_status() > 0)
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
auto cfg_id = rocprofiler::context::allocate_context();
if(!cfg_id) return ROCPROFILER_STATUS_ERROR_CONTEXT_ERROR;
*context_id = *cfg_id;
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_start_context(rocprofiler_context_id_t context_id)
{
return rocprofiler::context::start_context(context_id);
}
rocprofiler_status_t
rocprofiler_stop_context(rocprofiler_context_id_t context_id)
{
return rocprofiler::context::stop_context(context_id);
}
rocprofiler_status_t
rocprofiler_context_is_active(rocprofiler_context_id_t context_id, int* status)
{
*status = 0;
for(const auto& itr : rocprofiler::context::get_active_contexts())
{
auto* cfg = itr.load(std::memory_order_relaxed);
if(cfg && cfg->context_idx == context_id.handle)
{
*status = 1;
return ROCPROFILER_STATUS_SUCCESS;
}
}
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
}
rocprofiler_status_t
rocprofiler_context_is_valid(rocprofiler_context_id_t context_id, int* status)
{
*status = 0;
for(const auto& itr : rocprofiler::context::get_registered_contexts())
{
if(itr && itr->context_idx == context_id.handle)
{
auto _ret = rocprofiler::context::validate_context(itr.get());
*status = (_ret == ROCPROFILER_STATUS_SUCCESS) ? 1 : 0;
return _ret;
}
}
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
}
}
+14
@@ -0,0 +1,14 @@
#
# context
#
set(ROCPROFILER_LIB_CONFIG_SOURCES context.cpp domain.cpp)
set(ROCPROFILER_LIB_CONFIG_HEADERS context.hpp domain.hpp allocator.hpp)
target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_CONFIG_SOURCES}
${ROCPROFILER_LIB_CONFIG_HEADERS})
# add_executable(rocr-example hsa.cpp rocr.hpp) target_link_libraries(rocr-example PRIVATE
# rocprofiler-v2) target_include_directories( rocr-example PRIVATE ${PROJECT_SOURCE_DIR}
# ${PROJECT_BINARY_DIR} ${PROJECT_SOURCE_DIR}/src ${PROJECT_BINARY_DIR}/src)
# target_compile_definitions( rocr-example PRIVATE AMD_INTERNAL_BUILD PROF_API_IMPL
# HIP_PROF_HIP_API_STRING=1 __HIP_PLATFORM_AMD__=1)
+26 -24
@@ -1,36 +1,38 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include "rocprofiler/rocprofiler.h"
#include <array>
#include <atomic>
#include <cstddef>
#include <utility>
namespace
namespace rocprofiler
{
inline size_t // NOLINTNEXTLINE
get_domain_max_op(rocprofiler_tracer_activity_domain_t _domain)
namespace context
{
switch(_domain)
{
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE: return -1;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API: return 0;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API: return 0;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_MARKER_API: return 0;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_KFD_API: return -1;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_EXT_API: return -1;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_OPS: return 0;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_OPS: return 0;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_EVT: return 0;
case ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST: return -1;
}
return -1;
}
template <typename Tp, size_t N = 8>
struct allocator
struct locality_allocator
{
void construct(Tp* const _p, const Tp& _v) const { ::new((void*) _p) Tp{_v}; }
void construct(Tp* const _p, Tp&& _v) const { ::new((void*) _p) Tp{std::move(_v)}; }
@@ -103,5 +105,5 @@ struct allocator
void reserve(const size_t) {}
};
} // namespace
} // namespace context
} // namespace rocprofiler
+230
View file
@@ -0,0 +1,230 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include <rocprofiler/fwd.h>
#include <rocprofiler/rocprofiler.h>
#include "lib/common/container/stable_vector.hpp"
#include "lib/rocprofiler/context/context.hpp"
#include <glog/logging.h>
#include <unistd.h>
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <memory>
#include <mutex>
#include <optional>
namespace rocprofiler
{
namespace context
{
namespace
{
auto&
get_contexts_mutex()
{
static auto _v = std::mutex{};
return _v;
}
constexpr auto invalid_client_idx = std::numeric_limits<uint32_t>::max();
auto&
get_client_index()
{
static auto _v = invalid_client_idx;
return _v;
}
} // namespace
uint64_t
correlation_tracing_service::get_unique_record_id()
{
static auto _v = std::atomic<uint64_t>{};
return _v++;
}
using reserve_size_t = common::container::reserve_size;
unique_context_vec_t&
get_registered_contexts()
{
static auto _v = unique_context_vec_t{reserve_size_t{unique_context_vec_t::chunk_size}};
return _v;
}
active_context_vec_t&
get_active_contexts()
{
static auto* _v = new active_context_vec_t{reserve_size_t{active_context_vec_t::chunk_size}};
static auto _once = std::once_flag{};
std::call_once(_once, std::atexit, []() {
for(auto& itr : *_v)
{
itr.store(nullptr);
}
});
return *_v;
}
// set the client index; this needs to be called before allocate_context()
void
push_client(uint32_t value)
{
LOG_ASSERT(get_client_index() == invalid_client_idx)
<< " rocprofiler client index is currently " << get_client_index()
<< "... which means that a new client is initializing before the last client finished "
"initializing. This is an internal error, please file a bug report with a reproducer";
get_client_index() = value;
}
// remove the client index
void
pop_client(uint32_t value)
{
LOG_ASSERT(get_client_index() == value)
<< " rocprofiler client index is currently not " << value
<< "... which means that a new client was initialized before this client finished "
"initializing. This is an internal error, please file a bug report with a reproducer";
get_client_index() = invalid_client_idx;
}
std::optional<rocprofiler_context_id_t>
allocate_context()
{
// ... allocate any internal space needed to handle another context ...
auto _lk = std::unique_lock<std::mutex>{get_contexts_mutex()};
// initial context identifier number
auto _idx = get_registered_contexts().size();
// make space in the registered contexts
get_registered_contexts().emplace_back(nullptr);
// create an entry in the registered contexts
auto& _cfg_v = get_registered_contexts().back();
_cfg_v = std::make_unique<context>();
auto* _cfg = _cfg_v.get();
// ...
if(!_cfg) return std::nullopt;
_cfg->size = sizeof(context);
_cfg->context_idx = _idx;
_cfg->client_idx = get_client_index();
LOG_ASSERT(_cfg->client_idx != invalid_client_idx)
<< " rocprofiler internal error: a context was allocated without an associated tool client "
"identifier";
return rocprofiler_context_id_t{_idx};
}
rocprofiler_status_t
validate_context(const context* cfg)
{
// if(cfg->buffer == nullptr) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
// if(cfg->filter == nullptr) return ROCPROFILER_STATUS_ERROR_FILTER_NOT_FOUND;
return (cfg) ? ROCPROFILER_STATUS_SUCCESS : ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
}
rocprofiler_status_t
start_context(rocprofiler_context_id_t context_id)
{
if(context_id.handle >= get_registered_contexts().size())
{
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
}
context* cfg = get_registered_contexts().at(context_id.handle).get();
if(!cfg)
{
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
}
if(validate_context(cfg) != ROCPROFILER_STATUS_SUCCESS)
{
return ROCPROFILER_STATUS_ERROR_CONTEXT_INVALID;
}
uint64_t rocp_tot_contexts = get_registered_contexts().size();
auto idx = rocp_tot_contexts;
{
// hold a lock here to prevent multiple threads from finding the same nullptr slot
auto _lk = std::unique_lock<std::mutex>{get_contexts_mutex()};
// try to find a nullptr slot first
for(size_t i = 0; i < get_active_contexts().size(); ++i)
{
auto* itr = get_active_contexts().at(i).load(std::memory_order_relaxed);
if(itr == nullptr)
{
idx = i;
break;
}
else if(context_id.handle == itr->context_idx)
{
return ROCPROFILER_STATUS_SUCCESS;
}
}
// if no nullptr slot was found, then create one while lock is held
if(idx == rocp_tot_contexts)
{
idx = get_active_contexts().size();
get_active_contexts().emplace_back();
}
}
// atomic swap the pointer into the "active" array used internally
context* _expected = nullptr;
bool success = get_active_contexts().at(idx).compare_exchange_strong(
_expected, get_registered_contexts().at(context_id.handle).get());
if(!success) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_STARTED;
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
stop_context(rocprofiler_context_id_t idx)
{
// atomically assign the context pointer to NULL so that it is skipped in future
// callbacks
for(auto& itr : get_active_contexts())
{
auto* _expected = itr.load(std::memory_order_relaxed);
if(_expected && _expected->context_idx == idx.handle)
{
bool success = itr.compare_exchange_strong(_expected, nullptr);
if(success) return ROCPROFILER_STATUS_SUCCESS;
}
}
return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND; // compare exchange failed
}
} // namespace context
} // namespace rocprofiler
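
Reviewer note: `start_context` and `stop_context` above publish and retract a context by compare-exchanging its pointer into a slot of the "active" array, where `nullptr` marks a free slot. The following is a minimal standalone sketch of that slot pattern, not the code from this commit. All `sketch_*` names are hypothetical, and a fixed-size `std::array` stands in for `stable_vector`, so unlike the original the table cannot grow.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for rocprofiler::context::context
struct sketch_context
{
    uint64_t context_idx = 0;
};

// Assumption: a fixed number of slots; nullptr marks a free slot
constexpr size_t sketch_max_active = 8;

inline std::array<std::atomic<sketch_context*>, sketch_max_active>&
sketch_active_slots()
{
    static std::array<std::atomic<sketch_context*>, sketch_max_active> slots{};
    return slots;
}

// Activate a context: the scan is serialized by a mutex so two threads cannot
// pick the same free slot, and the pointer is published with a compare-exchange
inline bool
sketch_start(sketch_context* ctx)
{
    assert(ctx != nullptr);
    static std::mutex mtx;
    auto              lk = std::unique_lock<std::mutex>{mtx};
    for(auto& slot : sketch_active_slots())
    {
        if(slot.load() == ctx) return true;  // already active
    }
    for(auto& slot : sketch_active_slots())
    {
        sketch_context* expected = nullptr;
        if(slot.compare_exchange_strong(expected, ctx)) return true;
    }
    return false;  // table is full
}

// Deactivate a context: swap its slot back to nullptr so readers skip it
inline bool
sketch_stop(uint64_t context_idx)
{
    for(auto& slot : sketch_active_slots())
    {
        auto* expected = slot.load();
        if(expected && expected->context_idx == context_idx)
        {
            if(slot.compare_exchange_strong(expected, nullptr)) return true;
        }
    }
    return false;  // not active
}
```

Readers of the table (the API wrappers, in the diff) only need an atomic load per slot and a null check, which is why the slots hold raw pointers while ownership stays with the registered contexts.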
+130
View file
@@ -0,0 +1,130 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/fwd.h>
#include <rocprofiler/rocprofiler.h>
#include "lib/common/container/stable_vector.hpp"
#include "lib/rocprofiler/context/domain.hpp"
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <optional>
namespace rocprofiler
{
namespace context
{
using external_cid_cb_t = uint64_t (*)(rocprofiler_service_callback_tracing_kind_t,
uint32_t,
uint64_t);
/// gives tools the opportunity to modify the correlation id based on the domain, op, and
/// the rocprofiler generated correlation id
struct correlation_tracing_service
{
uint64_t id = 0;
uint64_t external_id = 0;
external_cid_cb_t external_id_callback = nullptr;
static uint64_t get_unique_record_id();
};
struct callback_tracing_service
{
struct callback_data
{
rocprofiler_callback_tracing_cb_t callback = nullptr;
void* data = nullptr;
};
using domain_t = rocprofiler_service_callback_tracing_kind_t;
using callback_array_t = std::array<callback_data, domain_info<domain_t>::last>;
domain_context<domain_t> domains = {};
callback_array_t callback_data = {};
};
struct buffer_tracing_service
{
using domain_t = rocprofiler_service_buffer_tracing_kind_t;
using buffer_array_t = std::array<rocprofiler_buffer_id_t, domain_info<domain_t>::last>;
domain_context<domain_t> domains = {};
buffer_array_t buffer_data = {};
};
struct context
{
// size is used to ensure that we never read past the end of the version
size_t size = 0;
uint64_t context_idx = 0; // context id
uint32_t client_idx = 0; // tool id
correlation_tracing_service correlation_tracer = {};
std::unique_ptr<callback_tracing_service> callback_tracer = {};
std::unique_ptr<buffer_tracing_service> buffered_tracer = {};
};
// set the client index; this needs to be called before allocate_context()
void push_client(uint32_t);
// remove the client index
void pop_client(uint32_t);
/// @brief creates a context struct and returns a handle for locating the context struct
///
std::optional<rocprofiler_context_id_t>
allocate_context();
/// \brief rocprofiler validates context, checks for conflicts, etc. Ensures that
/// the context is valid *in isolation*, e.g. it may check that the user
/// set the compat_version field and that required context fields, such as buffer,
/// are set. This function will be called before \ref start_context
/// but is provided to help the user validate one or more contexts without starting
/// them
///
/// \param [in] cfg context to validate
rocprofiler_status_t
validate_context(const context* cfg);
/// \brief rocprofiler activates the context associated with the given identifier
/// \param [in] id the context identifier to start.
rocprofiler_status_t
start_context(rocprofiler_context_id_t id);
/// \brief disable the context.
rocprofiler_status_t stop_context(rocprofiler_context_id_t);
using unique_context_vec_t = common::container::stable_vector<std::unique_ptr<context>, 8>;
using active_context_vec_t = common::container::stable_vector<std::atomic<context*>, 8>;
unique_context_vec_t&
get_registered_contexts();
active_context_vec_t&
get_active_contexts();
} // namespace context
} // namespace rocprofiler
+99
View file
@@ -0,0 +1,99 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include "lib/rocprofiler/context/domain.hpp"
#include <rocprofiler/rocprofiler.h>
namespace rocprofiler
{
namespace context
{
template <typename DomainT>
bool
domain_context<DomainT>::operator()(DomainT _domain) const
{
return ((1 << _domain) & domains) == (1 << _domain);
}
template <typename DomainT>
bool
domain_context<DomainT>::operator()(DomainT _domain, uint32_t _op) const
{
auto _offset = (_domain * opcode_padding_v);
return (*this)(_domain) && (opcodes.none() || opcodes.test(_offset + _op));
}
template <typename DomainT>
rocprofiler_status_t
add_domain(domain_context<DomainT>& _cfg, DomainT _domain)
{
if(_domain <= domain_info<DomainT>::none || _domain >= domain_info<DomainT>::last)
return ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND;
_cfg.domains |= (1 << _domain);
return ROCPROFILER_STATUS_SUCCESS;
}
template <typename DomainT>
rocprofiler_status_t
add_domain_op(domain_context<DomainT>& _cfg, DomainT _domain, uint32_t _op)
{
if(_domain <= domain_info<DomainT>::none || _domain >= domain_info<DomainT>::last)
return ROCPROFILER_STATUS_ERROR_DOMAIN_NOT_FOUND;
if(_op >= domain_info<DomainT>::padding) return ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND;
auto _offset = (_domain * domain_info<DomainT>::padding);
if(_offset >= _cfg.opcodes.size()) return ROCPROFILER_STATUS_ERROR_OPERATION_NOT_FOUND;
_cfg.opcodes.set(_offset + _op, true);
return ROCPROFILER_STATUS_SUCCESS;
}
// instantiate the templates
template struct domain_context<rocprofiler_service_callback_tracing_kind_t>;
template rocprofiler_status_t
add_domain<rocprofiler_service_callback_tracing_kind_t>(
domain_context<rocprofiler_service_callback_tracing_kind_t>&,
rocprofiler_service_callback_tracing_kind_t);
template rocprofiler_status_t
add_domain<rocprofiler_service_buffer_tracing_kind_t>(
domain_context<rocprofiler_service_buffer_tracing_kind_t>&,
rocprofiler_service_buffer_tracing_kind_t);
template rocprofiler_status_t
add_domain_op<rocprofiler_service_callback_tracing_kind_t>(
domain_context<rocprofiler_service_callback_tracing_kind_t>&,
rocprofiler_service_callback_tracing_kind_t,
uint32_t);
template struct domain_context<rocprofiler_service_buffer_tracing_kind_t>;
template rocprofiler_status_t
add_domain_op<rocprofiler_service_buffer_tracing_kind_t>(
domain_context<rocprofiler_service_buffer_tracing_kind_t>&,
rocprofiler_service_buffer_tracing_kind_t,
uint32_t);
} // namespace context
} // namespace rocprofiler
+89
View file
@@ -0,0 +1,89 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/rocprofiler.h>
#include "lib/common/mpl.hpp"
#include <bitset>
#include <cstddef>
#include <cstdint>
namespace rocprofiler
{
namespace context
{
// number of bits reserved per domain for op codes
constexpr size_t domain_ops_padding = 512;
template <typename Tp>
struct domain_info;
template <>
struct domain_info<rocprofiler_service_callback_tracing_kind_t>
{
static constexpr size_t none = ROCPROFILER_SERVICE_CALLBACK_TRACING_NONE;
static constexpr size_t last = ROCPROFILER_SERVICE_CALLBACK_TRACING_LAST;
static constexpr auto padding = domain_ops_padding;
};
template <>
struct domain_info<rocprofiler_service_buffer_tracing_kind_t>
{
static constexpr size_t none = ROCPROFILER_SERVICE_BUFFER_TRACING_NONE;
static constexpr size_t last = ROCPROFILER_SERVICE_BUFFER_TRACING_LAST;
static constexpr auto padding = domain_ops_padding;
};
/// how the tools specify the tracing domain and (optionally) which operations in the
/// domain they want to trace
template <typename DomainT>
struct domain_context
{
using supported_domains_v = common::mpl::type_list<rocprofiler_service_callback_tracing_kind_t,
rocprofiler_service_buffer_tracing_kind_t>;
static_assert(common::mpl::is_one_of<DomainT, supported_domains_v>::value,
"Unsupported domain type");
static constexpr auto opcode_padding_v = domain_info<DomainT>::padding;
static constexpr auto max_opcodes_v = opcode_padding_v * domain_info<DomainT>::last;
/// check if domain is enabled
bool operator()(DomainT) const;
/// check if op in a domain is enabled
bool operator()(DomainT, uint32_t) const;
int64_t domains = 0;
std::bitset<max_opcodes_v> opcodes = {};
};
template <typename DomainT>
rocprofiler_status_t
add_domain(domain_context<DomainT>&, DomainT);
template <typename DomainT>
rocprofiler_status_t
add_domain_op(domain_context<DomainT>&, DomainT, uint32_t);
} // namespace context
} // namespace rocprofiler
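
Reviewer note: `domain_context` above keeps one bit per enabled domain and a bitset with a fixed `padding`-sized window of opcode bits per domain. The sketch below shows that filter in isolation, under stated assumptions: the `sketch_*` names and the three-value enum are hypothetical stand-ins for the rocprofiler tracing-kind enums, and an extra bounds check on `op` is added in the query so that `std::bitset::test` cannot throw.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical domain enum standing in for the rocprofiler tracing-kind enums
enum sketch_domain : uint32_t
{
    SKETCH_DOMAIN_NONE = 0,
    SKETCH_DOMAIN_HSA_API,
    SKETCH_DOMAIN_HIP_API,
    SKETCH_DOMAIN_LAST,
};

// opcode bits reserved per domain (same value as domain_ops_padding in the diff)
constexpr size_t sketch_padding = 512;

struct sketch_domain_filter
{
    uint64_t                                         domains = 0;
    std::bitset<sketch_padding * SKETCH_DOMAIN_LAST> opcodes = {};

    // enable every operation in a domain
    bool enable(sketch_domain d)
    {
        if(d <= SKETCH_DOMAIN_NONE || d >= SKETCH_DOMAIN_LAST) return false;
        domains |= (uint64_t{1} << d);
        return true;
    }

    // restrict tracing to specific operations
    bool enable_op(sketch_domain d, uint32_t op)
    {
        if(d <= SKETCH_DOMAIN_NONE || d >= SKETCH_DOMAIN_LAST) return false;
        if(op >= sketch_padding) return false;
        opcodes.set(d * sketch_padding + op);
        return true;
    }

    // is the domain enabled?
    bool operator()(sketch_domain d) const { return (domains & (uint64_t{1} << d)) != 0; }

    // is the operation enabled? an empty opcode set means "all operations"
    bool operator()(sketch_domain d, uint32_t op) const
    {
        if(!(*this)(d) || op >= sketch_padding) return false;
        return opcodes.none() || opcodes.test(d * sketch_padding + op);
    }
};
```

As in the diff, `opcodes.none()` is evaluated over the whole bitset, so selecting a single operation in one domain also ends the "all operations" default for every other enabled domain. Whether that is intended is worth confirming with the author.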
+6 -6
View file
@@ -1,10 +1,10 @@
#
#
#
set(ROCPROFILER_LIB_HSA_SOURCES hsa.cpp)
set(ROCPROFILER_LIB_HSA_HEADERS hsa.hpp defines.hpp ostream.hpp types.hpp utils.hpp)
set(ROCPROFILER_LIB_HSA_HEADERS hsa.hpp defines.hpp types.hpp utils.hpp)
target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_HSA_SOURCES}
${ROCPROFILER_LIB_HSA_HEADERS})
# add_executable(rocr-example hsa.cpp rocr.hpp) target_link_libraries(rocr-example PRIVATE
# rocprofiler-v2) target_include_directories( rocr-example PRIVATE ${PROJECT_SOURCE_DIR}
# ${PROJECT_BINARY_DIR} ${PROJECT_SOURCE_DIR}/src ${PROJECT_BINARY_DIR}/src)
# target_compile_definitions( rocr-example PRIVATE AMD_INTERNAL_BUILD PROF_API_IMPL
# HIP_PROF_HIP_API_STRING=1 __HIP_PLATFORM_AMD__=1)
add_subdirectory(details)
+45 -91
View file
@@ -32,30 +32,27 @@
#define IMPL_DETAIL_FOR_EACH(MACRO, PREFIX, ...) \
IMPL_DETAIL_FOR_EACH_(IMPL_DETAIL_FOR_EACH_NARG(__VA_ARGS__), MACRO, PREFIX, __VA_ARGS__)
#define MEMBER_0(...)
#define MEMBER_1(PREFIX, FIELD) PREFIX.FIELD
#define MEMBER_2(PREFIX, A, B) MEMBER_1(PREFIX, A), MEMBER_1(PREFIX, B)
#define MEMBER_3(PREFIX, A, B, C) MEMBER_2(PREFIX, A, B), MEMBER_1(PREFIX, C)
#define MEMBER_4(PREFIX, A, B, C, D) MEMBER_3(PREFIX, A, B, C), MEMBER_1(PREFIX, D)
#define MEMBER_5(PREFIX, A, B, C, D, E) MEMBER_4(PREFIX, A, B, C, D), MEMBER_1(PREFIX, E)
#define MEMBER_6(PREFIX, A, B, C, D, E, F) MEMBER_5(PREFIX, A, B, C, D, E), MEMBER_1(PREFIX, F)
#define MEMBER_7(PREFIX, A, B, C, D, E, F, G) \
MEMBER_6(PREFIX, A, B, C, D, E, F), MEMBER_1(PREFIX, G)
#define MEMBER_8(PREFIX, A, B, C, D, E, F, G, H) \
MEMBER_7(PREFIX, A, B, C, D, E, F, G), MEMBER_1(PREFIX, H)
#define MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I) \
MEMBER_8(PREFIX, A, B, C, D, E, F, G, H), MEMBER_1(PREFIX, I)
#define MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J) \
MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I), MEMBER_1(PREFIX, J)
#define MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K) \
MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J), MEMBER_1(PREFIX, K)
#define MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \
MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), MEMBER_1(PREFIX, L)
#define ADDR_MEMBER_0(...)
#define ADDR_MEMBER_1(PREFIX, FIELD) static_cast<void*>(&PREFIX.FIELD)
#define ADDR_MEMBER_2(PREFIX, A, B) ADDR_MEMBER_1(PREFIX, A), ADDR_MEMBER_1(PREFIX, B)
#define ADDR_MEMBER_3(PREFIX, A, B, C) ADDR_MEMBER_2(PREFIX, A, B), ADDR_MEMBER_1(PREFIX, C)
#define ADDR_MEMBER_4(PREFIX, A, B, C, D) ADDR_MEMBER_3(PREFIX, A, B, C), ADDR_MEMBER_1(PREFIX, D)
#define ADDR_MEMBER_5(PREFIX, A, B, C, D, E) \
ADDR_MEMBER_4(PREFIX, A, B, C, D), ADDR_MEMBER_1(PREFIX, E)
#define ADDR_MEMBER_6(PREFIX, A, B, C, D, E, F) \
ADDR_MEMBER_5(PREFIX, A, B, C, D, E), ADDR_MEMBER_1(PREFIX, F)
#define ADDR_MEMBER_7(PREFIX, A, B, C, D, E, F, G) \
ADDR_MEMBER_6(PREFIX, A, B, C, D, E, F), ADDR_MEMBER_1(PREFIX, G)
#define ADDR_MEMBER_8(PREFIX, A, B, C, D, E, F, G, H) \
ADDR_MEMBER_7(PREFIX, A, B, C, D, E, F, G), ADDR_MEMBER_1(PREFIX, H)
#define ADDR_MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I) \
ADDR_MEMBER_8(PREFIX, A, B, C, D, E, F, G, H), ADDR_MEMBER_1(PREFIX, I)
#define ADDR_MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J) \
ADDR_MEMBER_9(PREFIX, A, B, C, D, E, F, G, H, I), ADDR_MEMBER_1(PREFIX, J)
#define ADDR_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K) \
ADDR_MEMBER_10(PREFIX, A, B, C, D, E, F, G, H, I, J), ADDR_MEMBER_1(PREFIX, K)
#define ADDR_MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \
ADDR_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), ADDR_MEMBER_1(PREFIX, L)
#define NAMED_MEMBER_0(...)
#define NAMED_MEMBER_1(PREFIX, FIELD) std::make_pair(#FIELD, PREFIX.FIELD)
@@ -80,44 +77,10 @@
#define NAMED_MEMBER_12(PREFIX, A, B, C, D, E, F, G, H, I, J, K, L) \
NAMED_MEMBER_11(PREFIX, A, B, C, D, E, F, G, H, I, J, K), NAMED_MEMBER_1(PREFIX, L)
/// @def GET_MEMBER_FIELDS
/// @param VAR some struct instance
/// @param ... The member fields of the struct
///
/// @brief this macro is used to expand one variable (VAR) + one or more member fields (FIELDS)
/// into a sequence of something like: `(VAR.FIELD, ...)`
/// For example, `GET_MEMBER_FIELDS(foo, a, b, c)` would transform into `foo.a, foo.b, foo.c`:
///
/// @code{.cpp}
///
/// struct Foo
/// {
/// int a;
/// float b;
/// double c;
/// };
///
/// // some function taking int, float, and double
/// void some_function(int, float, double);
///
/// // overload to some_function accepting Foo instance and using
/// // the args to invoke "real" function
/// void some_function(Foo _foo_v)
/// {
/// some_function(GET_MEMBER_FIELDS(_foo_v, a, b, c));
/// }
///
/// int main()
/// {
/// Foo _foo_v = {-1, 0.5f, 2.0};
/// some_function(_foo_v);
/// }
///
/// @endcode
#define GET_MEMBER_FIELDS(VAR, ...) IMPL_DETAIL_FOR_EACH(MEMBER_, VAR, __VA_ARGS__)
#define GET_ADDR_MEMBER_FIELDS(VAR, ...) IMPL_DETAIL_FOR_EACH(ADDR_MEMBER_, VAR, __VA_ARGS__)
#define GET_NAMED_MEMBER_FIELDS(VAR, ...) IMPL_DETAIL_FOR_EACH(NAMED_MEMBER_, VAR, __VA_ARGS__)
#define HSA_API_INFO_DEFINITION_0(HSA_DOMAIN, HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR) \
#define HSA_API_INFO_DEFINITION_0(HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR) \
namespace rocprofiler \
{ \
namespace hsa \
@@ -125,10 +88,11 @@
template <> \
struct hsa_api_info<HSA_API_ID> \
{ \
static constexpr auto domain_idx = HSA_DOMAIN; \
static constexpr auto table_idx = HSA_TABLE; \
static constexpr auto operation_idx = HSA_API_ID; \
static constexpr auto name = #HSA_FUNC; \
static constexpr auto callback_domain_idx = ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API; \
static constexpr auto buffered_domain_idx = ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API; \
static constexpr auto table_idx = HSA_TABLE; \
static constexpr auto operation_idx = HSA_API_ID; \
static constexpr auto name = #HSA_FUNC; \
\
using this_type = hsa_api_info<operation_idx>; \
using base_type = hsa_api_impl<operation_idx>; \
@@ -160,7 +124,7 @@
template <typename DataT> \
static auto& get_api_data_args(DataT& _data) \
{ \
return _data.api_data.args.HSA_FUNC; \
return _data.HSA_FUNC; \
} \
\
template <typename RetT, typename... Args> \
@@ -174,18 +138,13 @@
\
static auto get_functor() { return get_functor(get_table_func()); } \
\
static std::string as_string(rocprofiler_hsa_trace_data_t) \
static std::vector<void*> as_arg_addr(rocprofiler_hsa_api_callback_tracer_data_t) \
{ \
return std::string{name} + "()"; \
} \
\
static std::string as_named_string(rocprofiler_hsa_trace_data_t) \
{ \
return std::string{name} + "()"; \
return std::vector<void*>{}; \
} \
\
static std::vector<std::pair<std::string, std::string>> as_arg_list( \
rocprofiler_hsa_trace_data_t) \
rocprofiler_hsa_api_callback_tracer_data_t) \
{ \
return {}; \
} \
@@ -193,7 +152,7 @@
} \
}
#define HSA_API_INFO_DEFINITION_V(HSA_DOMAIN, HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR, ...) \
#define HSA_API_INFO_DEFINITION_V(HSA_TABLE, HSA_API_ID, HSA_FUNC, HSA_FUNC_PTR, ...) \
namespace rocprofiler \
{ \
namespace hsa \
@@ -201,10 +160,11 @@
template <> \
struct hsa_api_info<HSA_API_ID> \
{ \
static constexpr auto domain_idx = HSA_DOMAIN; \
static constexpr auto table_idx = HSA_TABLE; \
static constexpr auto operation_idx = HSA_API_ID; \
static constexpr auto name = #HSA_FUNC; \
static constexpr auto callback_domain_idx = ROCPROFILER_SERVICE_CALLBACK_TRACING_HSA_API; \
static constexpr auto buffered_domain_idx = ROCPROFILER_SERVICE_BUFFER_TRACING_HSA_API; \
static constexpr auto table_idx = HSA_TABLE; \
static constexpr auto operation_idx = HSA_API_ID; \
static constexpr auto name = #HSA_FUNC; \
\
using this_type = hsa_api_info<operation_idx>; \
using base_type = hsa_api_impl<operation_idx>; \
@@ -236,7 +196,7 @@
template <typename DataT> \
static auto& get_api_data_args(DataT& _data) \
{ \
return _data.api_data.args.HSA_FUNC; \
return _data.HSA_FUNC; \
} \
\
template <typename RetT, typename... Args> \
@@ -250,23 +210,17 @@
\
static auto get_functor() { return get_functor(get_table_func()); } \
\
static std::string as_string(rocprofiler_hsa_trace_data_t trace_data) \
static std::vector<void*> as_arg_addr( \
rocprofiler_hsa_api_callback_tracer_data_t trace_data) \
{ \
return utils::join(utils::join_args{std::string{name} + "(", ")", ", "}, \
GET_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \
return std::vector<void*>{ \
GET_ADDR_MEMBER_FIELDS(get_api_data_args(trace_data.args), __VA_ARGS__)}; \
} \
\
static std::string as_named_string(rocprofiler_hsa_trace_data_t trace_data) \
{ \
return utils::join( \
utils::join_args{std::string{name} + "(", ")", ", "}, \
GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \
} \
\
static auto as_arg_list(rocprofiler_hsa_trace_data_t trace_data) \
static auto as_arg_list(rocprofiler_hsa_api_callback_tracer_data_t trace_data) \
{ \
return utils::stringize( \
GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data), __VA_ARGS__)); \
GET_NAMED_MEMBER_FIELDS(get_api_data_args(trace_data.args), __VA_ARGS__)); \
} \
}; \
} \
+8
View file
@@ -0,0 +1,8 @@
#
#
#
set(ROCPROFILER_LIB_HSA_DETAILS_SOURCES)
set(ROCPROFILER_LIB_HSA_DETAILS_HEADERS ostream.hpp)
target_sources(rocprofiler-library PRIVATE ${ROCPROFILER_LIB_HSA_DETAILS_SOURCES}
${ROCPROFILER_LIB_HSA_DETAILS_HEADERS})
File diff suppressed because it is too large
+266 -203
View file
@@ -19,12 +19,20 @@
// THE SOFTWARE.
#include "lib/rocprofiler/hsa/hsa.hpp"
#include "lib/common/defines.hpp"
#include "lib/rocprofiler/hsa/ostream.hpp"
#include "lib/common/utility.hpp"
#include "lib/rocprofiler/buffer.hpp"
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/hsa/details/ostream.hpp"
#include "lib/rocprofiler/hsa/types.hpp"
#include "lib/rocprofiler/hsa/utils.hpp"
#include <rocprofiler/buffer.h>
#include <rocprofiler/callback_tracing.h>
#include <rocprofiler/fwd.h>
#include <glog/logging.h>
#include <atomic>
#include <cstddef>
#include <cstdint>
@@ -46,7 +54,12 @@ template <typename DataT, typename Tp>
void
set_data_retval(DataT& _data, Tp _val)
{
if constexpr(std::is_same<Tp, hsa_signal_value_t>::value)
if constexpr(std::is_same<Tp, null_type>::value)
{
(void) _data;
(void) _val;
}
else if constexpr(std::is_same<Tp, hsa_signal_value_t>::value)
{
_data.hsa_signal_value_t_retval = _val;
}
@@ -100,65 +113,35 @@ get_table()
}
template <size_t Idx>
template <typename DataT, typename DataArgsT, typename... Args>
template <typename DataArgsT, typename... Args>
auto
hsa_api_impl<Idx>::phase_enter(DataT& _data, DataArgsT& _data_args, Args... args)
hsa_api_impl<Idx>::set_data_args(DataArgsT& _data_args, Args... args)
{
using info_type = hsa_api_info<Idx>;
activity_functor_t _func = report_activity.load(std::memory_order_relaxed);
if(_func)
if constexpr(Idx == ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect)
{
if constexpr(Idx == ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect)
{
auto _tuple = std::make_tuple(args...);
_data.api_data.args.hsa_amd_memory_async_copy_rect.dst = std::get<0>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.dst_offset = std::get<1>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.src = std::get<2>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.src_offset = std::get<3>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.range = std::get<4>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.range__val = *(std::get<4>(_tuple));
_data.api_data.args.hsa_amd_memory_async_copy_rect.copy_agent = std::get<5>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.dir = std::get<6>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.num_dep_signals =
std::get<7>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.dep_signals = std::get<8>(_tuple);
_data.api_data.args.hsa_amd_memory_async_copy_rect.completion_signal =
std::get<9>(_tuple);
}
else
{
_data_args = DataArgsT{args...};
}
if(_func(info_type::domain_idx, info_type::operation_idx, &_data) == 0)
{
if(_data.phase_enter != nullptr) _data.phase_enter(info_type::operation_idx, &_data);
return true;
}
return false;
auto _tuple = std::make_tuple(args...);
_data_args.dst = std::get<0>(_tuple);
_data_args.dst_offset = std::get<1>(_tuple);
_data_args.src = std::get<2>(_tuple);
_data_args.src_offset = std::get<3>(_tuple);
_data_args.range = std::get<4>(_tuple);
_data_args.range__val = *(std::get<4>(_tuple));
_data_args.copy_agent = std::get<5>(_tuple);
_data_args.dir = std::get<6>(_tuple);
_data_args.num_dep_signals = std::get<7>(_tuple);
_data_args.dep_signals = std::get<8>(_tuple);
_data_args.completion_signal = std::get<9>(_tuple);
}
else
{
_data_args = DataArgsT{args...};
}
return false;
}
template <size_t Idx>
template <typename DataT, typename... Args>
template <typename FuncT, typename... Args>
auto
hsa_api_impl<Idx>::phase_exit(DataT& _data)
{
using info_type = hsa_api_info<Idx>;
if(_data.phase_exit != nullptr)
{
_data.phase_exit(info_type::operation_idx, &_data);
return true;
}
return false;
}
template <size_t Idx>
template <typename DataT, typename FuncT, typename... Args>
auto
hsa_api_impl<Idx>::exec(DataT& _data, FuncT&& _func, Args&&... args)
hsa_api_impl<Idx>::exec(FuncT&& _func, Args&&... args)
{
using return_type = std::decay_t<std::invoke_result_t<FuncT, Args...>>;
@@ -175,9 +158,7 @@ hsa_api_impl<Idx>::exec(DataT& _data, FuncT&& _func, Args&&... args)
}
else
{
auto _ret = _func(std::forward<Args>(args)...);
set_data_retval(_data.api_data, _ret);
return _ret;
return _func(std::forward<Args>(args)...);
}
}
@@ -194,14 +175,161 @@ hsa_api_impl<Idx>::functor(Args&&... args)
{
using info_type = hsa_api_info<Idx>;
auto trace_data = rocprofiler_hsa_trace_data_t{};
LOG(INFO) << __PRETTY_FUNCTION__;
auto _enabled = phase_enter(
trace_data, info_type::get_api_data_args(trace_data), std::forward<Args>(args)...);
struct callback_context_data
{
context::context* ctx = nullptr;
rocprofiler_callback_tracing_record_t record = {};
};
auto _ret = exec(trace_data, info_type::get_table_func(), std::forward<Args>(args)...);
struct buffered_context_data
{
context::context* ctx = nullptr;
};
if(_enabled) phase_exit(trace_data);
auto callback_contexts = std::vector<callback_context_data>{};
auto buffered_contexts = std::vector<buffered_context_data>{};
for(const auto& aitr : context::get_active_contexts())
{
auto* itr = aitr.load();
if(!itr) continue;
if(itr->callback_tracer)
{
// if the given domain + op is not enabled, skip this context
if(!itr->callback_tracer->domains(info_type::callback_domain_idx,
info_type::operation_idx))
continue;
callback_contexts.emplace_back(
callback_context_data{itr, rocprofiler_callback_tracing_record_t{}});
}
if(itr->buffered_tracer)
{
// if the given domain + op is not enabled, skip this context
if(!itr->buffered_tracer->domains(info_type::buffered_domain_idx,
info_type::operation_idx))
continue;
buffered_contexts.emplace_back(buffered_context_data{itr});
}
}
if(callback_contexts.empty() && buffered_contexts.empty())
{
auto _ret = exec(info_type::get_table_func(), std::forward<Args>(args)...);
if constexpr(!std::is_same<decltype(_ret), null_type>::value)
return _ret;
else
return HSA_STATUS_SUCCESS;
}
auto buffer_record = rocprofiler_buffer_tracing_hsa_api_record_t{};
auto tracer_data = rocprofiler_hsa_api_callback_tracer_data_t{};
auto corr_id = context::correlation_tracing_service::get_unique_record_id();
auto thr_id = common::get_tid();
// construct the buffered info before the callback so the callbacks are as closely wrapped
// around the function call as possible
if(!buffered_contexts.empty())
{
buffer_record.kind = info_type::buffered_domain_idx;
buffer_record.correlation_id = rocprofiler_correlation_id_t{corr_id};
buffer_record.operation = info_type::operation_idx;
buffer_record.thread_id = thr_id;
}
// invoke the callbacks
if(!callback_contexts.empty())
{
tracer_data.size = sizeof(rocprofiler_hsa_api_callback_tracer_data_t);
set_data_args(info_type::get_api_data_args(tracer_data.args), std::forward<Args>(args)...);
for(auto& itr : callback_contexts)
{
auto& ctx = itr.ctx;
auto& record = itr.record;
uint64_t extern_corr_id = 0;
auto& _correlation = ctx->correlation_tracer;
if(_correlation.external_id_callback)
{
_correlation.external_id = _correlation.external_id_callback(
info_type::callback_domain_idx, info_type::operation_idx, corr_id);
extern_corr_id = _correlation.external_id;
}
auto user_data = rocprofiler_user_data_t{.value = 0};
record = rocprofiler_callback_tracing_record_t{
thr_id,
rocprofiler_correlation_id_t{corr_id},
rocprofiler_external_correlation_id_t{extern_corr_id},
info_type::callback_domain_idx,
info_type::operation_idx,
ROCPROFILER_SERVICE_CALLBACK_PHASE_ENTER,
user_data,
static_cast<void*>(&tracer_data)};
auto& callback_info =
ctx->callback_tracer->callback_data.at(info_type::callback_domain_idx);
callback_info.callback(record, callback_info.data);
}
}
// record the start timestamp as close to the function call as possible
if(!buffered_contexts.empty())
{
buffer_record.start_timestamp = common::timestamp_ns();
}
auto _ret = exec(info_type::get_table_func(), std::forward<Args>(args)...);
// record the end timestamp as close to the function call as possible
if(!buffered_contexts.empty())
{
buffer_record.end_timestamp = common::timestamp_ns();
}
if(!callback_contexts.empty())
{
set_data_retval(tracer_data.retval, _ret);
for(auto& itr : callback_contexts)
{
auto& ctx = itr.ctx;
auto& record = itr.record;
record.phase = ROCPROFILER_SERVICE_CALLBACK_PHASE_EXIT;
record.payload = static_cast<void*>(&tracer_data);
auto& callback_info =
ctx->callback_tracer->callback_data.at(info_type::callback_domain_idx);
callback_info.callback(record, callback_info.data);
}
}
if(!buffered_contexts.empty())
{
for(auto& itr : buffered_contexts)
{
assert(itr.ctx->buffered_tracer);
auto buffer_id =
itr.ctx->buffered_tracer->buffer_data.at(info_type::buffered_domain_idx);
for(auto& bitr : buffer::get_buffers())
{
if(bitr && bitr->context_id == itr.ctx->context_idx &&
bitr->buffer_id == buffer_id.handle)
{
bitr->emplace(ROCPROFILER_BUFFER_CATEGORY_TRACING,
info_type::buffered_domain_idx,
buffer_record);
break;
}
}
}
}
if constexpr(!std::is_same<decltype(_ret), null_type>::value)
return _ret;
@@ -222,74 +350,59 @@ namespace
{
template <size_t Idx, size_t... IdxTail>
const char*
hsa_api_name(const uint32_t id, std::index_sequence<Idx, IdxTail...>)
name_by_id(const uint32_t id, std::index_sequence<Idx, IdxTail...>)
{
if(Idx == id) return hsa_api_info<Idx>::name;
if constexpr(sizeof...(IdxTail) > 0)
return hsa_api_name(id, std::index_sequence<IdxTail...>{});
return name_by_id(id, std::index_sequence<IdxTail...>{});
else
return nullptr;
}
template <size_t Idx, size_t... IdxTail>
uint32_t
hsa_api_id_by_name(const char* name, std::index_sequence<Idx, IdxTail...>)
id_by_name(const char* name, std::index_sequence<Idx, IdxTail...>)
{
if(std::string_view{hsa_api_info<Idx>::name} == std::string_view{name})
return hsa_api_info<Idx>::operation_idx;
if constexpr(sizeof...(IdxTail) > 0)
return hsa_api_id_by_name(name, std::index_sequence<IdxTail...>{});
return id_by_name(name, std::index_sequence<IdxTail...>{});
else
return ROCPROFILER_HSA_API_ID_NONE;
}
template <size_t Idx, size_t... IdxTail>
std::string
hsa_api_data_string(const uint32_t id,
const rocprofiler_hsa_trace_data_t& _data,
std::index_sequence<Idx, IdxTail...>)
{
if(Idx == id) return hsa_api_info<Idx>::as_string(_data);
if constexpr(sizeof...(IdxTail) > 0)
return hsa_api_data_string(id, _data, std::index_sequence<IdxTail...>{});
else
return std::string{};
}
template <size_t Idx, size_t... IdxTail>
std::string
hsa_api_named_data_string(const uint32_t id,
const rocprofiler_hsa_trace_data_t& _data,
std::index_sequence<Idx, IdxTail...>)
{
if(Idx == id) return hsa_api_info<Idx>::as_named_string(_data);
if constexpr(sizeof...(IdxTail) > 0)
return hsa_api_named_data_string(id, _data, std::index_sequence<IdxTail...>{});
else
return std::string{};
}
template <size_t Idx, size_t... IdxTail>
void
hsa_api_iterate_args(const uint32_t id,
const rocprofiler_hsa_trace_data_t& _data,
int (*_func)(const char*, const char*),
std::index_sequence<Idx, IdxTail...>)
iterate_args(const uint32_t id,
const rocprofiler_hsa_api_callback_tracer_data_t& data,
rocprofiler_callback_tracing_operation_args_cb_t func,
void* user_data,
std::index_sequence<Idx, IdxTail...>)
{
if(Idx == id)
{
for(auto&& itr : hsa_api_info<Idx>::as_arg_list(_data))
using info_type = hsa_api_info<Idx>;
auto&& arg_list = info_type::as_arg_list(data);
auto&& arg_addr = info_type::as_arg_addr(data);
for(size_t i = 0; i < std::min(arg_list.size(), arg_addr.size()); ++i)
{
_func(itr.first.c_str(), itr.second.c_str());
auto ret = func(info_type::callback_domain_idx, // kind
id, // operation
i, // arg_number
arg_list.at(i).first.c_str(), // arg_name
arg_list.at(i).second.c_str(), // arg_value_str
arg_addr.at(i), // arg_value_addr
user_data);
if(ret != 0) break;
}
}
if constexpr(sizeof...(IdxTail) > 0)
hsa_api_iterate_args(id, _data, _func, std::index_sequence<IdxTail...>{});
iterate_args(id, data, func, user_data, std::index_sequence<IdxTail...>{});
}
template <size_t... Idx>
void
hsa_api_get_ids(std::vector<uint32_t>& _id_list, std::index_sequence<Idx...>)
get_ids(std::vector<uint32_t>& _id_list, std::index_sequence<Idx...>)
{
auto _emplace = [](auto& _vec, uint32_t _v) {
if(_v < ROCPROFILER_HSA_API_ID_LAST) _vec.emplace_back(_v);
@@ -300,7 +413,7 @@ hsa_api_get_ids(std::vector<uint32_t>& _id_list, std::index_sequence<Idx...>)
template <size_t... Idx>
void
hsa_api_get_names(std::vector<const char*>& _name_list, std::index_sequence<Idx...>)
get_names(std::vector<const char*>& _name_list, std::index_sequence<Idx...>)
{
auto _emplace = [](auto& _vec, const char* _v) {
if(_v != nullptr && strnlen(_v, 1) > 0) _vec.emplace_back(_v);
@@ -311,9 +424,42 @@ hsa_api_get_names(std::vector<const char*>& _name_list, std::index_sequence<Idx.
template <size_t... Idx>
void
hsa_api_update_table(hsa_api_table_t* _orig, std::index_sequence<Idx...>)
update_table(hsa_api_table_t* _orig, std::index_sequence<Idx...>)
{
static auto _should_wrap_functor =
[](auto _callback_domain, auto _buffered_domain, auto _operation) {
for(const auto& itr : context::get_registered_contexts())
{
if(!itr) continue;
if(itr->callback_tracer)
{
// domain not enabled so skip to next callback_tracer
if(!itr->callback_tracer->domains(_callback_domain)) continue;
// if the given domain + op is enabled, we need to wrap
if(itr->callback_tracer->domains(_callback_domain, _operation)) return true;
}
if(itr->buffered_tracer)
{
// domain not enabled so skip to next callback_tracer
if(!itr->buffered_tracer->domains(_buffered_domain)) continue;
// if the given domain + op is enabled, we need to wrap
if(itr->buffered_tracer->domains(_buffered_domain, _operation)) return true;
}
}
return false;
};
(void) _should_wrap_functor;
auto _update = [](hsa_api_table_t* _orig_v, auto _info) {
// check to see if there are any contexts which enable this operation in the HSA API domain
if(!_should_wrap_functor(
_info.callback_domain_idx, _info.buffered_domain_idx, _info.operation_idx))
return;
// 1. get the sub-table containing the function pointer
// 2. get reference to function pointer in sub-table
// 3. update function pointer with functor
@@ -328,140 +474,57 @@ hsa_api_update_table(hsa_api_table_t* _orig, std::index_sequence<Idx...>)
// check out the assembly here... this compiles to a switch statement
const char*
hsa_api_name(uint32_t id)
name_by_id(uint32_t id)
{
return hsa_api_name(id, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
return name_by_id(id, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
}
uint32_t
hsa_api_id_by_name(const char* name)
id_by_name(const char* name)
{
return hsa_api_id_by_name(name, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
}
std::string
hsa_api_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data)
{
return hsa_api_data_string(id, _data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
}
std::string
hsa_api_named_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data)
{
return hsa_api_named_data_string(
id, _data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
return id_by_name(name, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
}
void
hsa_api_iterate_args(uint32_t id,
const rocprofiler_hsa_trace_data_t& _data,
int (*_func)(const char*, const char*))
iterate_args(uint32_t id,
const rocprofiler_hsa_api_callback_tracer_data_t& data,
rocprofiler_callback_tracing_operation_args_cb_t callback,
void* user_data)
{
if(_func)
hsa_api_iterate_args(
id, _data, _func, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
if(callback)
iterate_args(
id, data, callback, user_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
}
std::vector<uint32_t>
hsa_api_get_ids()
get_ids()
{
auto _data = std::vector<uint32_t>{};
_data.reserve(ROCPROFILER_HSA_API_ID_LAST);
hsa_api_get_ids(_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
get_ids(_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
return _data;
}
std::vector<const char*>
hsa_api_get_names()
get_names()
{
auto _data = std::vector<const char*>{};
_data.reserve(ROCPROFILER_HSA_API_ID_LAST);
hsa_api_get_names(_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
get_names(_data, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
return _data;
}
void
hsa_api_set_callback(activity_functor_t _func)
set_callback(activity_functor_t _func)
{
auto&& _v = report_activity.load();
report_activity.compare_exchange_strong(_v, _func);
}
void
hsa_api_update_table(hsa_api_table_t* _orig)
update_table(hsa_api_table_t* _orig)
{
if(_orig) hsa_api_update_table(_orig, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
if(_orig) update_table(_orig, std::make_index_sequence<ROCPROFILER_HSA_API_ID_LAST>{});
}
} // namespace hsa
} // namespace rocprofiler
extern "C" {
bool
OnLoad(HsaApiTable* table,
uint64_t runtime_version,
uint64_t failed_tool_count,
const char* const* failed_tool_names)
{
(void) runtime_version;
(void) failed_tool_count;
(void) failed_tool_names;
fprintf(stderr, "[%s:%i] %s\n", __FILE__, __LINE__, __FUNCTION__);
auto& _saved = rocprofiler::hsa::get_table();
::copyTables(table, &_saved);
rocprofiler::hsa::hsa_api_update_table(table);
return true;
}
}
/*
#include <iomanip>
int
main()
{
rocprofiler::hsa::activity_functor_t _cb =
[](rocprofiler_tracer_activity_domain_t domain, uint32_t operation_id, void* data) {
const auto* _name = rocprofiler::hsa::hsa_api_name(operation_id);
auto _name_id = rocprofiler::hsa::hsa_api_id_by_name(_name);
auto& _data = *static_cast<rocprofiler::hsa::hsa_trace_data_t*>(data);
std::cout << "[cb] domain=" << domain << ", op_id=" << operation_id << ", data=" << data
<< ", name=" << _name << ", name_id=" << _name_id << ", named_string='"
<< rocprofiler::hsa::hsa_api_named_data_string(operation_id, _data) << "'"
<< "\n";
auto _func = [](const char* name, const char* value) {
std::cout << " " << std::setw(20) << name << " = " << value << "\n";
return 0;
};
rocprofiler::hsa::hsa_api_iterate_args(operation_id, _data, _func);
return 0;
};
rocprofiler::hsa::report_activity.store(_cb);
{
double val = 40;
hsa_code_object_t code_object = {};
hsa_code_object_info_t attribute = HSA_CODE_OBJECT_INFO_TYPE;
void* value = &val;
auto _func =
rocprofiler::hsa::hsa_api_info<HSA_API_ID_hsa_code_object_get_info>::get_functor();
_func(code_object, attribute, value);
}
{
bool result = false;
uint16_t ext = 1;
uint16_t major = 4;
uint16_t minor = 2;
auto _func = rocprofiler::hsa::hsa_api_info<
HSA_API_ID_hsa_system_extension_supported>::get_functor();
_func(ext, major, minor, &result);
}
}
*/
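The renamed `name_by_id` and `id_by_name` helpers in this file walk a `std::index_sequence` one index at a time and compare against a per-index info struct. The following is a minimal, standalone sketch of that lookup technique, not the project's code. The `api_info` specializations, `api_id_last`, and `api_id_none` are hypothetical stand-ins for the macro-generated `hsa_api_info<Idx>`, `ROCPROFILER_HSA_API_ID_LAST`, and `ROCPROFILER_HSA_API_ID_NONE`.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string_view>
#include <utility>

// Hypothetical stand-in for hsa_api_info<Idx>; the real specializations are
// generated by the HSA_API_INFO_DEFINITION_* macros.
template <size_t Idx>
struct api_info;

template <>
struct api_info<0>
{
    static constexpr const char* name = "hsa_init";
};

template <>
struct api_info<1>
{
    static constexpr const char* name = "hsa_shut_down";
};

template <>
struct api_info<2>
{
    static constexpr const char* name = "hsa_system_get_info";
};

// Hypothetical equivalents of the *_ID_LAST and *_ID_NONE enumerators.
constexpr size_t   api_id_last = 3;
constexpr uint32_t api_id_none = std::numeric_limits<uint32_t>::max();

// Compare the runtime id against the head index, then recurse on the tail.
template <size_t Idx, size_t... IdxTail>
const char*
name_by_id(uint32_t id, std::index_sequence<Idx, IdxTail...>)
{
    if(Idx == id) return api_info<Idx>::name;
    if constexpr(sizeof...(IdxTail) > 0)
        return name_by_id(id, std::index_sequence<IdxTail...>{});
    else
        return nullptr;  // id is outside the known range
}

// Same walk, but matching on the name and returning the index.
template <size_t Idx, size_t... IdxTail>
uint32_t
id_by_name(std::string_view name, std::index_sequence<Idx, IdxTail...>)
{
    if(std::string_view{api_info<Idx>::name} == name) return static_cast<uint32_t>(Idx);
    if constexpr(sizeof...(IdxTail) > 0)
        return id_by_name(name, std::index_sequence<IdxTail...>{});
    else
        return api_id_none;  // no entry with this name
}

// Non-template entry points that expand the full index range.
const char*
name_by_id(uint32_t id)
{
    return name_by_id(id, std::make_index_sequence<api_id_last>{});
}

uint32_t
id_by_name(std::string_view name)
{
    return id_by_name(name, std::make_index_sequence<api_id_last>{});
}
```

With these stand-ins, `name_by_id(1)` yields `"hsa_shut_down"` and `id_by_name("hsa_system_get_info")` yields `2`, while an unknown id or name falls through to `nullptr` or `api_id_none`. The sketch takes a `std::string_view` where the diff takes a `const char*`; that is a simplification for the example only.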
+205 -217
@@ -28,204 +28,203 @@ HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, core_)
HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, amd_ext_)
HSA_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, image_ext_)
HSA_API_INFO_DEFINITION_0(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_init, hsa_init, hsa_init_fn)
HSA_API_INFO_DEFINITION_0(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_shut_down, hsa_shut_down, hsa_shut_down_fn)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_info, hsa_system_get_info, hsa_system_get_info_fn, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_extension_supported, hsa_system_extension_supported, hsa_system_extension_supported_fn, extension, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_extension_table, hsa_system_get_extension_table, hsa_system_get_extension_table_fn, extension, version_major, version_minor, table)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_iterate_agents, hsa_iterate_agents, hsa_iterate_agents_fn, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_info, hsa_agent_get_info, hsa_agent_get_info_fn, agent, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_create, hsa_queue_create, hsa_queue_create_fn, agent, size, type, callback, data, private_segment_size, group_segment_size, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_soft_queue_create, hsa_soft_queue_create, hsa_soft_queue_create_fn, region, size, type, features, doorbell_signal, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_destroy, hsa_queue_destroy, hsa_queue_destroy_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_inactivate, hsa_queue_inactivate, hsa_queue_inactivate_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl_fn, queue, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire_fn, queue, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed_fn, queue, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease_fn, queue, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_regions, hsa_agent_iterate_regions, hsa_agent_iterate_regions_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_region_get_info, hsa_region_get_info, hsa_region_get_info_fn, region, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_exception_policies, hsa_agent_get_exception_policies, hsa_agent_get_exception_policies_fn, agent, profile, mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_extension_supported, hsa_agent_extension_supported, hsa_agent_extension_supported_fn, extension, agent, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_register, hsa_memory_register, hsa_memory_register_fn, ptr, size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_deregister, hsa_memory_deregister, hsa_memory_deregister_fn, ptr, size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_allocate, hsa_memory_allocate, hsa_memory_allocate_fn, region, size, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_free, hsa_memory_free, hsa_memory_free_fn, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_copy, hsa_memory_copy, hsa_memory_copy_fn, dst, src, size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_assign_agent, hsa_memory_assign_agent, hsa_memory_assign_agent_fn, ptr, agent, access)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_create, hsa_signal_create, hsa_signal_create_fn, initial_value, num_consumers, consumers, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_destroy, hsa_signal_destroy, hsa_signal_destroy_fn, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_relaxed, hsa_signal_load_relaxed, hsa_signal_load_relaxed_fn, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_scacquire, hsa_signal_load_scacquire, hsa_signal_load_scacquire_fn, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_relaxed, hsa_signal_store_relaxed, hsa_signal_store_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_screlease, hsa_signal_store_screlease, hsa_signal_store_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_relaxed, hsa_signal_wait_relaxed, hsa_signal_wait_relaxed_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_scacquire, hsa_signal_wait_scacquire, hsa_signal_wait_scacquire_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_relaxed, hsa_signal_and_relaxed, hsa_signal_and_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacquire, hsa_signal_and_scacquire, hsa_signal_and_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_screlease, hsa_signal_and_screlease, hsa_signal_and_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_relaxed, hsa_signal_or_relaxed, hsa_signal_or_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacquire, hsa_signal_or_scacquire, hsa_signal_or_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_screlease, hsa_signal_or_screlease, hsa_signal_or_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_relaxed, hsa_signal_xor_relaxed, hsa_signal_xor_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacquire, hsa_signal_xor_scacquire, hsa_signal_xor_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_screlease, hsa_signal_xor_screlease, hsa_signal_xor_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_screlease, hsa_signal_exchange_screlease, hsa_signal_exchange_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_relaxed, hsa_signal_add_relaxed, hsa_signal_add_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacquire, hsa_signal_add_scacquire, hsa_signal_add_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_screlease, hsa_signal_add_screlease, hsa_signal_add_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_screlease, hsa_signal_subtract_screlease, hsa_signal_subtract_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_relaxed, hsa_signal_cas_relaxed, hsa_signal_cas_relaxed_fn, signal, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacquire, hsa_signal_cas_scacquire, hsa_signal_cas_scacquire_fn, signal, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_screlease, hsa_signal_cas_screlease, hsa_signal_cas_screlease_fn, signal, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl_fn, signal, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_from_name, hsa_isa_from_name, hsa_isa_from_name_fn, name, isa)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info, hsa_isa_get_info, hsa_isa_get_info_fn, isa, attribute, index, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_compatible, hsa_isa_compatible, hsa_isa_compatible_fn, code_object_isa, agent_isa, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_serialize, hsa_code_object_serialize, hsa_code_object_serialize_fn, code_object, alloc_callback, callback_data, options, serialized_code_object, serialized_code_object_size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_deserialize, hsa_code_object_deserialize, hsa_code_object_deserialize_fn, serialized_code_object, serialized_code_object_size, options, code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_destroy, hsa_code_object_destroy, hsa_code_object_destroy_fn, code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_info, hsa_code_object_get_info, hsa_code_object_get_info_fn, code_object, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol, hsa_code_object_get_symbol, hsa_code_object_get_symbol_fn, code_object, symbol_name, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_symbol_get_info, hsa_code_symbol_get_info, hsa_code_symbol_get_info_fn, code_symbol, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols_fn, code_object, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create, hsa_executable_create, hsa_executable_create_fn, profile, executable_state, options, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_destroy, hsa_executable_destroy, hsa_executable_destroy_fn, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_code_object, hsa_executable_load_code_object, hsa_executable_load_code_object_fn, executable, agent, code_object, options)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_freeze, hsa_executable_freeze, hsa_executable_freeze_fn, executable, options)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_info, hsa_executable_get_info, hsa_executable_get_info_fn, executable, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_global_variable_define, hsa_executable_global_variable_define, hsa_executable_global_variable_define_fn, executable, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define_fn, executable, agent, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define_fn, executable, agent, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate, hsa_executable_validate, hsa_executable_validate_fn, executable, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol, hsa_executable_get_symbol, hsa_executable_get_symbol_fn, executable, module_name, symbol_name, agent, call_convention, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_symbol_get_info, hsa_executable_symbol_get_info, hsa_executable_symbol_get_info_fn, executable_symbol, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_symbols, hsa_executable_iterate_symbols, hsa_executable_iterate_symbols_fn, executable, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_status_string, hsa_status_string, hsa_status_string_fn, status, status_string)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_extension_get_name, hsa_extension_get_name, hsa_extension_get_name_fn, extension, name)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_major_extension_supported, hsa_system_major_extension_supported, hsa_system_major_extension_supported_fn, extension, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_major_extension_table, hsa_system_get_major_extension_table, hsa_system_get_major_extension_table_fn, extension, version_major, table_length, table)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_major_extension_supported, hsa_agent_major_extension_supported, hsa_agent_major_extension_supported_fn, extension, agent, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_cache_get_info, hsa_cache_get_info, hsa_cache_get_info_fn, cache, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_caches, hsa_agent_iterate_caches, hsa_agent_iterate_caches_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_create, hsa_signal_group_create, hsa_signal_group_create_fn, num_signals, signals, num_consumers, consumers, signal_group)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_destroy, hsa_signal_group_destroy, hsa_signal_group_destroy_fn, signal_group)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_isas, hsa_agent_iterate_isas, hsa_agent_iterate_isas_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info_alt, hsa_isa_get_info_alt, hsa_isa_get_info_alt_fn, isa, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_exception_policies, hsa_isa_get_exception_policies, hsa_isa_get_exception_policies_fn, isa, profile, mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_round_method, hsa_isa_get_round_method, hsa_isa_get_round_method_fn, isa, fp_type, flush_mode, round_method)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_wavefront_get_info, hsa_wavefront_get_info, hsa_wavefront_get_info_fn, wavefront, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts_fn, isa, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name_fn, code_object, module_name, symbol_name, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file_fn, file, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory_fn, code_object, size, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_destroy, hsa_code_object_reader_destroy, hsa_code_object_reader_destroy_fn, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create_alt, hsa_executable_create_alt, hsa_executable_create_alt_fn, profile, default_float_rounding_mode, options, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_program_code_object, hsa_executable_load_program_code_object, hsa_executable_load_program_code_object_fn, executable, code_object_reader, options, loaded_code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object_fn, executable, agent, code_object_reader, options, loaded_code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate_alt, hsa_executable_validate_alt, hsa_executable_validate_alt_fn, executable, options, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name_fn, executable, symbol_name, agent, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols_fn, executable, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols_fn, executable, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_get_type, hsa_amd_coherency_get_type, hsa_amd_coherency_get_type_fn, agent, type)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_set_type, hsa_amd_coherency_set_type, hsa_amd_coherency_set_type_fn, agent, type)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled_fn, queue, enable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable_fn, enable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time_fn, agent, signal, time)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time_fn, signal, time)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain_fn, agent, agent_tick, system_tick)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_async_handler, hsa_amd_signal_async_handler, hsa_amd_signal_async_handler_fn, signal, cond, value, handler, arg)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_async_function, hsa_amd_async_function, hsa_amd_async_function_fn, callback, arg)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_wait_any, hsa_amd_signal_wait_any, hsa_amd_signal_wait_any_fn, signal_count, signals, conds, values, timeout_hint, wait_hint, satisfying_value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask_fn, queue, num_cu_mask_count, cu_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info_fn, memory_pool, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate_fn, memory_pool, size, flags, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_free, hsa_amd_memory_pool_free, hsa_amd_memory_pool_free_fn, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy, hsa_amd_memory_async_copy, hsa_amd_memory_async_copy_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal, engine_id, force_copy_on_sdma)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status_fn, dst_agent, src_agent, engine_ids_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info_fn, agent, memory_pool, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agents_allow_access, hsa_amd_agents_allow_access, hsa_amd_agents_allow_access_fn, num_agents, agents, flags, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate_fn, src_memory_pool, dst_memory_pool, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_migrate, hsa_amd_memory_migrate, hsa_amd_memory_migrate_fn, ptr, memory_pool, flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock, hsa_amd_memory_lock, hsa_amd_memory_lock_fn, host_ptr, size, agents, num_agent, agent_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_unlock, hsa_amd_memory_unlock, hsa_amd_memory_unlock_fn, host_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_fill, hsa_amd_memory_fill, hsa_amd_memory_fill_fn, ptr, value, count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer_fn, num_agents, agents, interop_handle, flags, size, ptr, metadata_size, metadata)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer_fn, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_image_create, hsa_amd_image_create, hsa_amd_image_create_fn, agent, image_descriptor, image_layout, image_data, access_permission, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info, hsa_amd_pointer_info, hsa_amd_pointer_info_fn, ptr, info, alloc, num_agents_accessible, accessible)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata_fn, ptr, userdata)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create_fn, ptr, len, handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach_fn, handle, len, num_agents, mapping_agents, mapped_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach_fn, mapped_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_create, hsa_amd_signal_create, hsa_amd_signal_create_fn, initial_value, num_consumers, consumers, attributes, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create_fn, signal, handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach_fn, handle, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler_fn, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_set_priority, hsa_amd_queue_set_priority, hsa_amd_queue_set_priority_fn, queue, priority)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect_fn, dst, dst_offset, src, src_offset, range, copy_agent, dir, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool_fn, host_ptr, size, agents, num_agent, pool, flags, agent_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback_fn, ptr, callback, user_data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback_fn, ptr, callback)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer_fn, signal, value_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set_fn, ptr, size, attribute_list, attribute_count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get_fn, ptr, size, attribute_list, attribute_count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async_fn, ptr, size, agent, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_acquire, hsa_amd_spm_acquire, hsa_amd_spm_acquire_fn, preferred_agent)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_release, hsa_amd_spm_release, hsa_amd_spm_release_fn, preferred_agent)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer_fn, preferred_agent, size_in_bytes, timeout, size_copied, dest, is_data_loss)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask_fn, queue, num_cu_mask_count, cu_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf_fn, ptr, size, dmabuf, offset)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf_fn, dmabuf)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability, hsa_ext_image_get_capability, hsa_ext_image_get_capability_fn, agent, geometry, image_format, capability_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info, hsa_ext_image_data_get_info, hsa_ext_image_data_get_info_fn, agent, image_descriptor, access_permission, image_data_info)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create, hsa_ext_image_create, hsa_ext_image_create_fn, agent, image_descriptor, image_data, access_permission, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_import, hsa_ext_image_import, hsa_ext_image_import_fn, agent, src_memory, src_row_pitch, src_slice_pitch, dst_image, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_export, hsa_ext_image_export, hsa_ext_image_export_fn, agent, src_image, dst_memory, dst_row_pitch, dst_slice_pitch, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_copy, hsa_ext_image_copy, hsa_ext_image_copy_fn, agent, src_image, src_offset, dst_image, dst_offset, range)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_clear, hsa_ext_image_clear, hsa_ext_image_clear_fn, agent, image, data, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_destroy, hsa_ext_image_destroy, hsa_ext_image_destroy_fn, agent, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_create, hsa_ext_sampler_create, hsa_ext_sampler_create_fn, agent, sampler_descriptor, sampler)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_destroy, hsa_ext_sampler_destroy, hsa_ext_sampler_destroy_fn, agent, sampler)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout_fn, agent, geometry, image_format, image_data_layout, capability_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout_fn, agent, image_descriptor, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image_data_info)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout_fn, agent, image_descriptor, image_data, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create_fn, agent_handle, size, type, callback, data, private_segment_size, group_segment_size, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register_fn, queue, callback, user_data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API, ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register_fn, callback, user_data)
HSA_API_INFO_DEFINITION_0(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_init, hsa_init, hsa_init_fn)
HSA_API_INFO_DEFINITION_0(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_shut_down, hsa_shut_down, hsa_shut_down_fn)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_info, hsa_system_get_info, hsa_system_get_info_fn, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_extension_supported, hsa_system_extension_supported, hsa_system_extension_supported_fn, extension, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_extension_table, hsa_system_get_extension_table, hsa_system_get_extension_table_fn, extension, version_major, version_minor, table)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_iterate_agents, hsa_iterate_agents, hsa_iterate_agents_fn, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_info, hsa_agent_get_info, hsa_agent_get_info_fn, agent, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_create, hsa_queue_create, hsa_queue_create_fn, agent, size, type, callback, data, private_segment_size, group_segment_size, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_soft_queue_create, hsa_soft_queue_create, hsa_soft_queue_create_fn, region, size, type, features, doorbell_signal, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_destroy, hsa_queue_destroy, hsa_queue_destroy_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_inactivate, hsa_queue_inactivate, hsa_queue_inactivate_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire, hsa_queue_load_read_index_scacquire_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed, hsa_queue_load_read_index_relaxed_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire, hsa_queue_load_write_index_scacquire_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed, hsa_queue_load_write_index_relaxed_fn, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed, hsa_queue_store_write_index_relaxed_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease, hsa_queue_store_write_index_screlease_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl, hsa_queue_cas_write_index_scacq_screl_fn, queue, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire, hsa_queue_cas_write_index_scacquire_fn, queue, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed, hsa_queue_cas_write_index_relaxed_fn, queue, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease, hsa_queue_cas_write_index_screlease_fn, queue, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl, hsa_queue_add_write_index_scacq_screl_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire, hsa_queue_add_write_index_scacquire_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed, hsa_queue_add_write_index_relaxed_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease, hsa_queue_add_write_index_screlease_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed, hsa_queue_store_read_index_relaxed_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease, hsa_queue_store_read_index_screlease_fn, queue, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_regions, hsa_agent_iterate_regions, hsa_agent_iterate_regions_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_region_get_info, hsa_region_get_info, hsa_region_get_info_fn, region, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_get_exception_policies, hsa_agent_get_exception_policies, hsa_agent_get_exception_policies_fn, agent, profile, mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_extension_supported, hsa_agent_extension_supported, hsa_agent_extension_supported_fn, extension, agent, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_register, hsa_memory_register, hsa_memory_register_fn, ptr, size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_deregister, hsa_memory_deregister, hsa_memory_deregister_fn, ptr, size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_allocate, hsa_memory_allocate, hsa_memory_allocate_fn, region, size, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_free, hsa_memory_free, hsa_memory_free_fn, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_copy, hsa_memory_copy, hsa_memory_copy_fn, dst, src, size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_memory_assign_agent, hsa_memory_assign_agent, hsa_memory_assign_agent_fn, ptr, agent, access)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_create, hsa_signal_create, hsa_signal_create_fn, initial_value, num_consumers, consumers, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_destroy, hsa_signal_destroy, hsa_signal_destroy_fn, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_relaxed, hsa_signal_load_relaxed, hsa_signal_load_relaxed_fn, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_load_scacquire, hsa_signal_load_scacquire, hsa_signal_load_scacquire_fn, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_relaxed, hsa_signal_store_relaxed, hsa_signal_store_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_store_screlease, hsa_signal_store_screlease, hsa_signal_store_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_relaxed, hsa_signal_wait_relaxed, hsa_signal_wait_relaxed_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_wait_scacquire, hsa_signal_wait_scacquire, hsa_signal_wait_scacquire_fn, signal, condition, compare_value, timeout_hint, wait_state_hint)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_relaxed, hsa_signal_and_relaxed, hsa_signal_and_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacquire, hsa_signal_and_scacquire, hsa_signal_and_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_screlease, hsa_signal_and_screlease, hsa_signal_and_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl, hsa_signal_and_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_relaxed, hsa_signal_or_relaxed, hsa_signal_or_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacquire, hsa_signal_or_scacquire, hsa_signal_or_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_screlease, hsa_signal_or_screlease, hsa_signal_or_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl, hsa_signal_or_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_relaxed, hsa_signal_xor_relaxed, hsa_signal_xor_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacquire, hsa_signal_xor_scacquire, hsa_signal_xor_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_screlease, hsa_signal_xor_screlease, hsa_signal_xor_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl, hsa_signal_xor_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed, hsa_signal_exchange_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire, hsa_signal_exchange_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_screlease, hsa_signal_exchange_screlease, hsa_signal_exchange_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl, hsa_signal_exchange_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_relaxed, hsa_signal_add_relaxed, hsa_signal_add_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacquire, hsa_signal_add_scacquire, hsa_signal_add_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_screlease, hsa_signal_add_screlease, hsa_signal_add_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl, hsa_signal_add_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed, hsa_signal_subtract_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire, hsa_signal_subtract_scacquire_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_screlease, hsa_signal_subtract_screlease, hsa_signal_subtract_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl, hsa_signal_subtract_scacq_screl_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_relaxed, hsa_signal_cas_relaxed, hsa_signal_cas_relaxed_fn, signal, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacquire, hsa_signal_cas_scacquire, hsa_signal_cas_scacquire_fn, signal, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_screlease, hsa_signal_cas_screlease, hsa_signal_cas_screlease_fn, signal, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl, hsa_signal_cas_scacq_screl_fn, signal, expected, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_from_name, hsa_isa_from_name, hsa_isa_from_name_fn, name, isa)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info, hsa_isa_get_info, hsa_isa_get_info_fn, isa, attribute, index, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_compatible, hsa_isa_compatible, hsa_isa_compatible_fn, code_object_isa, agent_isa, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_serialize, hsa_code_object_serialize, hsa_code_object_serialize_fn, code_object, alloc_callback, callback_data, options, serialized_code_object, serialized_code_object_size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_deserialize, hsa_code_object_deserialize, hsa_code_object_deserialize_fn, serialized_code_object, serialized_code_object_size, options, code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_destroy, hsa_code_object_destroy, hsa_code_object_destroy_fn, code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_info, hsa_code_object_get_info, hsa_code_object_get_info_fn, code_object, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol, hsa_code_object_get_symbol, hsa_code_object_get_symbol_fn, code_object, symbol_name, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_symbol_get_info, hsa_code_symbol_get_info, hsa_code_symbol_get_info_fn, code_symbol, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols, hsa_code_object_iterate_symbols_fn, code_object, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create, hsa_executable_create, hsa_executable_create_fn, profile, executable_state, options, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_destroy, hsa_executable_destroy, hsa_executable_destroy_fn, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_code_object, hsa_executable_load_code_object, hsa_executable_load_code_object_fn, executable, agent, code_object, options)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_freeze, hsa_executable_freeze, hsa_executable_freeze_fn, executable, options)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_info, hsa_executable_get_info, hsa_executable_get_info_fn, executable, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_global_variable_define, hsa_executable_global_variable_define, hsa_executable_global_variable_define_fn, executable, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define, hsa_executable_agent_global_variable_define_fn, executable, agent, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define, hsa_executable_readonly_variable_define_fn, executable, agent, variable_name, address)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate, hsa_executable_validate, hsa_executable_validate_fn, executable, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol, hsa_executable_get_symbol, hsa_executable_get_symbol_fn, executable, module_name, symbol_name, agent, call_convention, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_symbol_get_info, hsa_executable_symbol_get_info, hsa_executable_symbol_get_info_fn, executable_symbol, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_symbols, hsa_executable_iterate_symbols, hsa_executable_iterate_symbols_fn, executable, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_status_string, hsa_status_string, hsa_status_string_fn, status, status_string)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_extension_get_name, hsa_extension_get_name, hsa_extension_get_name_fn, extension, name)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_major_extension_supported, hsa_system_major_extension_supported, hsa_system_major_extension_supported_fn, extension, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_system_get_major_extension_table, hsa_system_get_major_extension_table, hsa_system_get_major_extension_table_fn, extension, version_major, table_length, table)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_major_extension_supported, hsa_agent_major_extension_supported, hsa_agent_major_extension_supported_fn, extension, agent, version_major, version_minor, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_cache_get_info, hsa_cache_get_info, hsa_cache_get_info_fn, cache, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_caches, hsa_agent_iterate_caches, hsa_agent_iterate_caches_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed, hsa_signal_silent_store_relaxed_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease, hsa_signal_silent_store_screlease_fn, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_create, hsa_signal_group_create, hsa_signal_group_create_fn, num_signals, signals, num_consumers, consumers, signal_group)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_destroy, hsa_signal_group_destroy, hsa_signal_group_destroy_fn, signal_group)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire, hsa_signal_group_wait_any_scacquire_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed, hsa_signal_group_wait_any_relaxed_fn, signal_group, conditions, compare_values, wait_state_hint, signal, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_agent_iterate_isas, hsa_agent_iterate_isas, hsa_agent_iterate_isas_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_info_alt, hsa_isa_get_info_alt, hsa_isa_get_info_alt_fn, isa, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_exception_policies, hsa_isa_get_exception_policies, hsa_isa_get_exception_policies_fn, isa, profile, mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_get_round_method, hsa_isa_get_round_method, hsa_isa_get_round_method_fn, isa, fp_type, flush_mode, round_method)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_wavefront_get_info, hsa_wavefront_get_info, hsa_wavefront_get_info_fn, wavefront, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts, hsa_isa_iterate_wavefronts_fn, isa, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name, hsa_code_object_get_symbol_from_name_fn, code_object, module_name, symbol_name, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file, hsa_code_object_reader_create_from_file_fn, file, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory, hsa_code_object_reader_create_from_memory_fn, code_object, size, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_code_object_reader_destroy, hsa_code_object_reader_destroy, hsa_code_object_reader_destroy_fn, code_object_reader)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_create_alt, hsa_executable_create_alt, hsa_executable_create_alt_fn, profile, default_float_rounding_mode, options, executable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_program_code_object, hsa_executable_load_program_code_object, hsa_executable_load_program_code_object_fn, executable, code_object_reader, options, loaded_code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object, hsa_executable_load_agent_code_object_fn, executable, agent, code_object_reader, options, loaded_code_object)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_validate_alt, hsa_executable_validate_alt, hsa_executable_validate_alt_fn, executable, options, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name, hsa_executable_get_symbol_by_name_fn, executable, symbol_name, agent, symbol)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols, hsa_executable_iterate_agent_symbols_fn, executable, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_CoreApi, ROCPROFILER_HSA_API_ID_hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols, hsa_executable_iterate_program_symbols_fn, executable, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_get_type, hsa_amd_coherency_get_type, hsa_amd_coherency_get_type_fn, agent, type)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_coherency_set_type, hsa_amd_coherency_set_type, hsa_amd_coherency_set_type_fn, agent, type)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled, hsa_amd_profiling_set_profiler_enabled_fn, queue, enable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable, hsa_amd_profiling_async_copy_enable_fn, enable)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time, hsa_amd_profiling_get_dispatch_time_fn, agent, signal, time)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time, hsa_amd_profiling_get_async_copy_time_fn, signal, time)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain, hsa_amd_profiling_convert_tick_to_system_domain_fn, agent, agent_tick, system_tick)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_async_handler, hsa_amd_signal_async_handler, hsa_amd_signal_async_handler_fn, signal, cond, value, handler, arg)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_async_function, hsa_amd_async_function, hsa_amd_async_function_fn, callback, arg)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_wait_any, hsa_amd_signal_wait_any, hsa_amd_signal_wait_any_fn, signal_count, signals, conds, values, timeout_hint, wait_hint, satisfying_value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask, hsa_amd_queue_cu_set_mask_fn, queue, num_cu_mask_count, cu_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info, hsa_amd_memory_pool_get_info_fn, memory_pool, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools, hsa_amd_agent_iterate_memory_pools_fn, agent, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate, hsa_amd_memory_pool_allocate_fn, memory_pool, size, flags, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_free, hsa_amd_memory_pool_free, hsa_amd_memory_pool_free_fn, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy, hsa_amd_memory_async_copy, hsa_amd_memory_async_copy_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine, hsa_amd_memory_async_copy_on_engine_fn, dst, dst_agent, src, src_agent, size, num_dep_signals, dep_signals, completion_signal, engine_id, force_copy_on_sdma)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status, hsa_amd_memory_copy_engine_status_fn, dst_agent, src_agent, engine_ids_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info, hsa_amd_agent_memory_pool_get_info_fn, agent, memory_pool, attribute, value)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_agents_allow_access, hsa_amd_agents_allow_access, hsa_amd_agents_allow_access_fn, num_agents, agents, flags, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate, hsa_amd_memory_pool_can_migrate_fn, src_memory_pool, dst_memory_pool, result)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_migrate, hsa_amd_memory_migrate, hsa_amd_memory_migrate_fn, ptr, memory_pool, flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock, hsa_amd_memory_lock, hsa_amd_memory_lock_fn, host_ptr, size, agents, num_agent, agent_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_unlock, hsa_amd_memory_unlock, hsa_amd_memory_unlock_fn, host_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_fill, hsa_amd_memory_fill, hsa_amd_memory_fill_fn, ptr, value, count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer, hsa_amd_interop_map_buffer_fn, num_agents, agents, interop_handle, flags, size, ptr, metadata_size, metadata)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer, hsa_amd_interop_unmap_buffer_fn, ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_image_create, hsa_amd_image_create, hsa_amd_image_create_fn, agent, image_descriptor, image_layout, image_data, access_permission, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info, hsa_amd_pointer_info, hsa_amd_pointer_info_fn, ptr, info, alloc, num_agents_accessible, accessible)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata, hsa_amd_pointer_info_set_userdata_fn, ptr, userdata)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create, hsa_amd_ipc_memory_create_fn, ptr, len, handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach, hsa_amd_ipc_memory_attach_fn, handle, len, num_agents, mapping_agents, mapped_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach, hsa_amd_ipc_memory_detach_fn, mapped_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_create, hsa_amd_signal_create, hsa_amd_signal_create_fn, initial_value, num_consumers, consumers, attributes, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create, hsa_amd_ipc_signal_create_fn, signal, handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach, hsa_amd_ipc_signal_attach_fn, handle, signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler, hsa_amd_register_system_event_handler_fn, callback, data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_set_priority, hsa_amd_queue_set_priority, hsa_amd_queue_set_priority_fn, queue, priority)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect, hsa_amd_memory_async_copy_rect_fn, dst, dst_offset, src, src_offset, range, copy_agent, dir, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool, hsa_amd_memory_lock_to_pool_fn, host_ptr, size, agents, num_agent, pool, flags, agent_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback, hsa_amd_register_deallocation_callback_fn, ptr, callback, user_data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback, hsa_amd_deregister_deallocation_callback_fn, ptr, callback)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer, hsa_amd_signal_value_pointer_fn, signal, value_ptr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set, hsa_amd_svm_attributes_set_fn, ptr, size, attribute_list, attribute_count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get, hsa_amd_svm_attributes_get_fn, ptr, size, attribute_list, attribute_count)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async, hsa_amd_svm_prefetch_async_fn, ptr, size, agent, num_dep_signals, dep_signals, completion_signal)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_acquire, hsa_amd_spm_acquire, hsa_amd_spm_acquire_fn, preferred_agent)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_release, hsa_amd_spm_release, hsa_amd_spm_release_fn, preferred_agent)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer, hsa_amd_spm_set_dest_buffer_fn, preferred_agent, size_in_bytes, timeout, size_copied, dest, is_data_loss)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask, hsa_amd_queue_cu_get_mask_fn, queue, num_cu_mask_count, cu_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf, hsa_amd_portable_export_dmabuf_fn, ptr, size, dmabuf, offset)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf, hsa_amd_portable_close_dmabuf_fn, dmabuf)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability, hsa_ext_image_get_capability, hsa_ext_image_get_capability_fn, agent, geometry, image_format, capability_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info, hsa_ext_image_data_get_info, hsa_ext_image_data_get_info_fn, agent, image_descriptor, access_permission, image_data_info)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create, hsa_ext_image_create, hsa_ext_image_create_fn, agent, image_descriptor, image_data, access_permission, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_import, hsa_ext_image_import, hsa_ext_image_import_fn, agent, src_memory, src_row_pitch, src_slice_pitch, dst_image, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_export, hsa_ext_image_export, hsa_ext_image_export_fn, agent, src_image, dst_memory, dst_row_pitch, dst_slice_pitch, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_copy, hsa_ext_image_copy, hsa_ext_image_copy_fn, agent, src_image, src_offset, dst_image, dst_offset, range)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_clear, hsa_ext_image_clear, hsa_ext_image_clear_fn, agent, image, data, image_region)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_destroy, hsa_ext_image_destroy, hsa_ext_image_destroy_fn, agent, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_create, hsa_ext_sampler_create, hsa_ext_sampler_create_fn, agent, sampler_descriptor, sampler)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_sampler_destroy, hsa_ext_sampler_destroy, hsa_ext_sampler_destroy_fn, agent, sampler)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout, hsa_ext_image_get_capability_with_layout_fn, agent, geometry, image_format, image_data_layout, capability_mask)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout, hsa_ext_image_data_get_info_with_layout_fn, agent, image_descriptor, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image_data_info)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_ImageExt, ROCPROFILER_HSA_API_ID_hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout, hsa_ext_image_create_with_layout_fn, agent, image_descriptor, image_data, access_permission, image_data_layout, image_data_row_pitch, image_data_slice_pitch, image)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create, hsa_amd_queue_intercept_create_fn, agent_handle, size, type, callback, data, private_segment_size, group_segment_size, queue)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register, hsa_amd_queue_intercept_register_fn, queue, callback, user_data)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt, ROCPROFILER_HSA_API_ID_hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register, hsa_amd_runtime_queue_create_register_fn, callback, user_data)
// clang-format on
#if HSA_AMD_EXT_API_TABLE_MAJOR_VERSION >= 0x02
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_address_reserve,
hsa_amd_vmem_address_reserve,
hsa_amd_vmem_address_reserve_fn,
@@ -233,15 +232,13 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
size,
address,
flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_address_free,
hsa_amd_vmem_address_free,
hsa_amd_vmem_address_free_fn,
ptr,
size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_handle_create,
hsa_amd_vmem_handle_create,
hsa_amd_vmem_handle_create_fn,
@@ -250,14 +247,12 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
type,
flags,
memory_handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_handle_release,
hsa_amd_vmem_handle_release,
hsa_amd_vmem_handle_release_fn,
memory_handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_map,
hsa_amd_vmem_map,
hsa_amd_vmem_map_fn,
@@ -266,15 +261,13 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
in_offset,
memory_handle,
flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_unmap,
hsa_amd_vmem_unmap,
hsa_amd_vmem_unmap_fn,
va,
size)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_set_access,
hsa_amd_vmem_set_access,
hsa_amd_vmem_set_access_fn,
@@ -282,38 +275,33 @@ HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
size,
desc,
desc_cnt)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_get_access,
hsa_amd_vmem_get_access,
hsa_amd_vmem_get_access_fn,
va,
perms,
agent_handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_export_shareable_handle,
hsa_amd_vmem_export_shareable_handle,
hsa_amd_vmem_export_shareable_handle_fn,
dmabuf_fd,
handle,
flags)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_import_shareable_handle,
hsa_amd_vmem_import_shareable_handle,
hsa_amd_vmem_import_shareable_handle_fn,
dmabuf_fd,
handle)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_retain_alloc_handle,
hsa_amd_vmem_retain_alloc_handle,
hsa_amd_vmem_retain_alloc_handle_fn,
handle,
addr)
HSA_API_INFO_DEFINITION_V(ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
HSA_API_INFO_DEFINITION_V(ROCPROFILER_HSA_API_TABLE_ID_AmdExt,
ROCPROFILER_HSA_API_ID_hsa_amd_vmem_get_alloc_properties_from_handle,
hsa_amd_vmem_get_alloc_properties_from_handle,
hsa_amd_vmem_get_alloc_properties_from_handle_fn,
+19 -34
@@ -29,9 +29,9 @@ namespace rocprofiler
{
namespace hsa
{
using activity_functor_t = int (*)(rocprofiler_tracer_activity_domain_t domain,
uint32_t operation_id,
void* data);
using activity_functor_t = int (*)(rocprofiler_service_callback_tracing_kind_t domain,
uint32_t operation_id,
void* data);
using hsa_api_table_t = HsaApiTable;
@@ -44,14 +44,11 @@ struct hsa_table_lookup;
template <size_t Idx>
struct hsa_api_impl
{
template <typename DataT, typename DataArgsT, typename... Args>
static auto phase_enter(DataT& _data, DataArgsT&, Args... args);
template <typename DataArgsT, typename... Args>
static auto set_data_args(DataArgsT&, Args... args);
template <typename DataT, typename... Args>
static auto phase_exit(DataT& _data);
template <typename DataT, typename FuncT, typename... Args>
static auto exec(DataT& _data, FuncT&&, Args&&... args);
template <typename FuncT, typename... Args>
static auto exec(FuncT&&, Args&&... args);
template <typename... Args>
static auto functor(Args&&... args);
@@ -61,39 +58,27 @@ template <size_t Idx>
struct hsa_api_info;
const char*
hsa_api_name(uint32_t id);
name_by_id(uint32_t id);
uint32_t
hsa_api_id_by_name(const char* name);
std::string
hsa_api_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data);
std::string
hsa_api_named_data_string(uint32_t id, const rocprofiler_hsa_trace_data_t& _data);
id_by_name(const char* name);
void
hsa_api_iterate_args(uint32_t id,
const rocprofiler_hsa_trace_data_t& _data,
int (*_func)(const char*, const char*));
iterate_args(uint32_t id,
const rocprofiler_hsa_api_callback_tracer_data_t& data,
rocprofiler_callback_tracing_operation_args_cb_t callback,
void* user_data);
std::vector<const char*>
hsa_api_get_names();
get_names();
std::vector<uint32_t>
hsa_api_get_ids();
get_ids();
void
hsa_api_set_callback(activity_functor_t _func);
set_callback(activity_functor_t _func);
void
update_table(hsa_api_table_t* _orig);
} // namespace hsa
} // namespace rocprofiler
extern "C" {
using on_load_t = bool (*)(HsaApiTable*, uint64_t, uint64_t, const char* const*);
bool
OnLoad(HsaApiTable* table,
uint64_t runtime_version,
uint64_t failed_tool_count,
const char* const* failed_tool_names) ROCPROFILER_PUBLIC_API;
}
+21 -44
@@ -45,70 +45,47 @@ namespace hsa
{
namespace utils
{
template <typename Tp, typename Up = Tp, std::enable_if_t<fmt::is_formattable<Tp>::value, int> = 0>
std::string
stringize_impl(Tp _v, int)
{
return fmt::format("{}", _v);
}
template <typename Tp>
std::string
stringize_impl(Tp _v, long)
struct is_pair_impl
{
auto _ss = std::stringstream{};
_ss << _v;
return _ss.str();
}
static constexpr auto value = false;
};
template <typename LhsT, typename RhsT>
auto
stringize_impl(const std::pair<LhsT, RhsT>& _v, int)
struct is_pair_impl<std::pair<LhsT, RhsT>>
{
return std::make_pair(stringize_impl(_v.first, 0), stringize_impl(_v.second, 0));
}
struct join_args
{
std::string_view prefix = {};
std::string_view suffix = {};
std::string_view separator = {};
static constexpr auto value = true;
};
template <typename Tp>
std::string
join_impl(const Tp& _v)
{
return stringize_impl(_v, 0);
}
struct is_pair : is_pair_impl<std::remove_cv_t<std::remove_reference_t<std::decay_t<Tp>>>>
{};
template <typename LhsT, typename RhsT>
std::string
join_impl(const std::pair<LhsT, RhsT>& _v)
{
return fmt::format("{}={}", join_impl(_v.first), join_impl(_v.second));
}
template <typename... Args>
template <typename Tp>
auto
join(join_args ja, Args... args)
stringize_impl(const Tp& _v)
{
auto _content = std::string{};
if constexpr(is_pair<Tp>::value)
{
return std::make_pair(stringize_impl(_v.first), stringize_impl(_v.second));
}
else if constexpr(fmt::is_formattable<Tp>::value && !std::is_pointer<Tp>::value)
{
return fmt::format("{}", _v);
}
else
{
auto _ss = std::stringstream{};
((_ss << ja.separator << join_impl(args)), ...);
auto _v = _ss.str();
if(_v.length() > ja.separator.length()) _content = _v.substr(2);
_ss << _v;
return _ss.str();
}
return (std::stringstream{} << ja.prefix << _content << ja.suffix).str();
}
template <typename... Args>
auto
stringize(Args... args)
{
return std::vector<std::pair<std::string, std::string>>{stringize_impl(args, 0)...};
return std::vector<std::pair<std::string, std::string>>{stringize_impl(args)...};
}
template <typename Tp>
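The hunk above replaces the overload-ranking trick (`int` vs. `long` tag arguments) with a single `stringize_impl` that branches with `if constexpr`. A minimal sketch of that dispatch pattern follows. The names `to_string_sketch` and `stringize_sketch` are hypothetical, and `std::ostringstream` stands in for `fmt::format` so the sketch needs no third-party library; it is an illustration of the technique, not the code in this commit.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// trait that is true only for std::pair specializations
template <typename Tp>
struct is_pair_impl : std::false_type
{};

template <typename LhsT, typename RhsT>
struct is_pair_impl<std::pair<LhsT, RhsT>> : std::true_type
{};

template <typename Tp>
struct is_pair : is_pair_impl<std::decay_t<Tp>>
{};

// hypothetical stand-in for stringize_impl: pairs are converted member-wise,
// everything else is streamed into a string (assumption: operator<< exists)
template <typename Tp>
auto
to_string_sketch(const Tp& value)
{
    if constexpr(is_pair<Tp>::value)
    {
        // only this branch is instantiated for pairs, so the deduced
        // return type is std::pair<std::string, std::string>
        return std::make_pair(to_string_sketch(value.first), to_string_sketch(value.second));
    }
    else
    {
        auto ss = std::ostringstream{};
        ss << value;
        return ss.str();
    }
}

// converts a pack of (name, value) pairs into a vector of string pairs
template <typename... Args>
auto
stringize_sketch(const Args&... args)
{
    return std::vector<std::pair<std::string, std::string>>{to_string_sketch(args)...};
}
```

Calling `stringize_sketch(std::make_pair("size", 64))` would yield one entry holding the strings `size` and `64`. Because the discarded `if constexpr` branch is never instantiated, each instantiation has a single return type and no tag argument is needed to rank overloads.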
+279
@@ -0,0 +1,279 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include <rocprofiler/fwd.h>
#include <rocprofiler/internal_threading.h>
#include <rocprofiler/rocprofiler.h>
#include "lib/common/container/stable_vector.hpp"
#include "lib/rocprofiler/buffer.hpp"
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/internal_threading.hpp"
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>
namespace rocprofiler
{
namespace internal_threading
{
namespace
{
template <rocprofiler_internal_thread_library_t... Idx>
using library_sequence_t = std::integer_sequence<rocprofiler_internal_thread_library_t, Idx...>;
using creation_notifier_cb_t = void (*)(rocprofiler_internal_thread_library_t, void*);
using thread_pool_config_t = PTL::ThreadPool::Config;
// this is used to loop over the different libraries
constexpr auto creation_notifier_library_seq = library_sequence_t<ROCPROFILER_LIBRARY,
ROCPROFILER_HSA_LIBRARY,
ROCPROFILER_HIP_LIBRARY,
ROCPROFILER_MARKER_LIBRARY>{};
// check that creation_notifier_library_seq is up to date
static_assert((1 << (creation_notifier_library_seq.size() - 1)) == ROCPROFILER_LIBRARY_LAST,
"Update creation_notifier_library_seq to include new libraries");
// used to distinguish invoking pre vs. post at compile-time
enum class notifier_stage
{
precreation = 0,
postcreation,
};
// data structure holding list of callbacks
template <rocprofiler_internal_thread_library_t LibT>
struct creation_notifier
{
static constexpr auto value = LibT;
std::vector<creation_notifier_cb_t> precreate_callbacks = {};
std::vector<creation_notifier_cb_t> postcreate_callbacks = {};
std::vector<void*> user_data = {};
std::mutex mutex = {};
};
// static accessor for creation_notifier instance
template <rocprofiler_internal_thread_library_t LibT>
auto&
get_creation_notifier()
{
static auto _v = creation_notifier<LibT>{};
return _v;
}
// adds callbacks to creation_notifier instance(s)
template <rocprofiler_internal_thread_library_t... Idx>
void
update_creation_notifiers(creation_notifier_cb_t pre,
creation_notifier_cb_t post,
int libs,
void* data,
library_sequence_t<Idx...>)
{
auto update = [pre, post, libs, data](auto& notifier) {
if(libs == 0 || ((libs & notifier.value) == notifier.value))
{
notifier.mutex.lock();
notifier.precreate_callbacks.emplace_back(pre);
notifier.postcreate_callbacks.emplace_back(post);
notifier.user_data.emplace_back(data);
notifier.mutex.unlock();
}
};
(update(get_creation_notifier<Idx>()), ...);
}
// invokes creation notifiers
template <notifier_stage StageT, rocprofiler_internal_thread_library_t... Idx>
void
execute_creation_notifiers(rocprofiler_internal_thread_library_t libs,
std::integer_sequence<rocprofiler_internal_thread_library_t, Idx...>)
{
auto execute = [libs](auto& notifier) {
if(((libs & notifier.value) == notifier.value))
{
notifier.mutex.lock();
if constexpr(StageT == notifier_stage::precreation)
{
for(size_t i = 0; i < notifier.precreate_callbacks.size(); ++i)
{
auto itr = notifier.precreate_callbacks.at(i);
if(itr) itr(notifier.value, notifier.user_data.at(i));
}
}
else if constexpr(StageT == notifier_stage::postcreation)
{
for(size_t i = 0; i < notifier.postcreate_callbacks.size(); ++i)
{
auto itr = notifier.postcreate_callbacks.at(i);
if(itr) itr(notifier.value, notifier.user_data.at(i));
}
}
notifier.mutex.unlock();
}
};
(execute(get_creation_notifier<Idx>()), ...);
}
auto&
get_thread_pools()
{
static auto _v = thread_pool_vec_t{};
return _v;
}
auto&
get_task_groups()
{
static auto _v = task_group_vec_t{};
return _v;
}
} // namespace
// initialize the default thread pool
void
initialize()
{
static auto _once = std::once_flag{};
std::call_once(_once, create_callback_thread);
}
// sync all the task groups and destroy the thread pools
void
finalize()
{
for(auto& itr : get_task_groups())
{
if(itr) itr->join();
}
for(auto& itr : get_thread_pools())
{
if(itr) itr->destroy_threadpool();
}
for(auto& itr : get_task_groups())
itr.reset();
for(auto& itr : get_thread_pools())
itr.reset();
}
void
notify_pre_internal_thread_create(rocprofiler_internal_thread_library_t libs)
{
execute_creation_notifiers<notifier_stage::precreation>(libs, creation_notifier_library_seq);
}
void
notify_post_internal_thread_create(rocprofiler_internal_thread_library_t libs)
{
execute_creation_notifiers<notifier_stage::postcreation>(libs, creation_notifier_library_seq);
}
rocprofiler_callback_thread_t
create_callback_thread()
{
// notify that rocprofiler library is about to create an internal thread
notify_pre_internal_thread_create(ROCPROFILER_LIBRARY);
// this will be the index of the new thread pool after emplace_back
auto idx = get_thread_pools().size();
auto& thr_pool = get_thread_pools().emplace_back(
new thread_pool_t{thread_pool_config_t{.pool_size = 1}}, [](thread_pool_t* v) {
v->destroy_threadpool();
delete v;
});
// construct the task group to use the newly created thread pool
get_task_groups().emplace_back(new task_group_t{thr_pool.get()});
// notify that rocprofiler library finished creating an internal thread
notify_post_internal_thread_create(ROCPROFILER_LIBRARY);
return rocprofiler_callback_thread_t{idx};
}
// returns the task group for the given callback thread identifier
task_group_t*
get_task_group(rocprofiler_callback_thread_t cb_tid)
{
return get_task_groups().at(cb_tid.handle).get();
}
} // namespace internal_threading
} // namespace rocprofiler
extern "C" {
rocprofiler_status_t
rocprofiler_at_internal_thread_create(rocprofiler_internal_thread_library_cb_t precreate,
rocprofiler_internal_thread_library_cb_t postcreate,
int libs,
void* data)
{
rocprofiler::internal_threading::update_creation_notifiers(
precreate,
postcreate,
libs,
data,
rocprofiler::internal_threading::creation_notifier_library_seq);
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_create_callback_thread(rocprofiler_callback_thread_t* cb_thread_id)
{
rocprofiler::internal_threading::initialize();
auto cb_tid = rocprofiler::internal_threading::create_callback_thread();
if(cb_tid.handle > 0)
{
*cb_thread_id = cb_tid;
return ROCPROFILER_STATUS_SUCCESS;
}
return ROCPROFILER_STATUS_ERROR;
}
rocprofiler_status_t ROCPROFILER_API
rocprofiler_assign_callback_thread(rocprofiler_buffer_id_t buffer_id,
rocprofiler_callback_thread_t cb_thread_id)
{
if(cb_thread_id.handle >= rocprofiler::internal_threading::get_task_groups().size())
return ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND;
for(auto& bitr : rocprofiler::buffer::get_buffers())
{
if(bitr && bitr->buffer_id == buffer_id.handle)
{
bitr->task_group_id = cb_thread_id.handle;
return ROCPROFILER_STATUS_SUCCESS;
}
}
return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
}
}
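The file above registers pre- and post-creation callbacks per library and filters them with a bitmask, where `libs == 0` means "all libraries". A minimal sketch of that idea, under stated assumptions: the flag names (`LIB_CORE`, `LIB_HSA`, `LIB_HIP`), the `creation_notifiers` class, and the single-registry layout are hypothetical and simpler than the per-library `creation_notifier<LibT>` instances in the commit. The mask is stored per entry and checked at notification time instead of at registration time.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// hypothetical library flags, one bit per library
enum library_flag : int
{
    LIB_CORE = 1 << 0,
    LIB_HSA  = 1 << 1,
    LIB_HIP  = 1 << 2,
};

using notifier_cb_t = void (*)(int lib, void* user_data);

struct notifier_entry
{
    notifier_cb_t pre       = nullptr;
    notifier_cb_t post      = nullptr;
    int           libs      = 0;
    void*         user_data = nullptr;
};

class creation_notifiers
{
public:
    // libs == 0 subscribes to every library
    void add(notifier_cb_t pre, notifier_cb_t post, int libs, void* user_data)
    {
        std::lock_guard<std::mutex> lk{m_mutex};
        m_entries.push_back(notifier_entry{pre, post, libs, user_data});
    }

    void notify_pre(int lib) { notify(lib, true); }
    void notify_post(int lib) { notify(lib, false); }

private:
    void notify(int lib, bool is_pre)
    {
        // the lock is held while callbacks run, so a callback must not
        // call add() or notify_*() on the same object
        std::lock_guard<std::mutex> lk{m_mutex};
        for(const auto& entry : m_entries)
        {
            // skip entries whose mask does not include this library
            if(entry.libs != 0 && (entry.libs & lib) != lib) continue;
            auto cb = is_pre ? entry.pre : entry.post;
            if(cb) cb(lib, entry.user_data);
        }
    }

    std::vector<notifier_entry> m_entries;
    std::mutex                  m_mutex;
};
```

A tool would call `add` once during setup and the library would bracket each internal thread creation with `notify_pre` and `notify_post`. The scoped `std::lock_guard` releases the mutex on every exit path, which the explicit `lock()`/`unlock()` pairs in the commit do not guarantee if a callback throws.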
+66
@@ -0,0 +1,66 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/internal_threading.h>
#include "lib/common/container/stable_vector.hpp"
#include "lib/common/defines.hpp"
#include <PTL/TaskGroup.hh>
#include <PTL/ThreadPool.hh>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>
namespace rocprofiler
{
namespace internal_threading
{
using thread_pool_t = PTL::ThreadPool;
using task_group_t = PTL::TaskGroup<void>;
using unique_thread_pool_t = std::unique_ptr<thread_pool_t, void (*)(thread_pool_t*)>;
using unique_task_group_t = std::unique_ptr<task_group_t>;
using thread_pool_vec_t = std::vector<unique_thread_pool_t>;
using task_group_vec_t = std::vector<unique_task_group_t>;
void notify_pre_internal_thread_create(rocprofiler_internal_thread_library_t);
void notify_post_internal_thread_create(rocprofiler_internal_thread_library_t);
// initialize the default thread pool
void
initialize();
// sync all the task groups and destroy the thread pools
void
finalize();
// creates a new callback thread (a single-thread pool plus its task group)
rocprofiler_callback_thread_t
create_callback_thread();
// returns the task group for the given callback thread identifier
task_group_t* get_task_group(rocprofiler_callback_thread_t);
} // namespace internal_threading
} // namespace rocprofiler
+556
@@ -0,0 +1,556 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include "lib/rocprofiler/registration.hpp"
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/hsa/hsa.hpp"
#include "lib/rocprofiler/internal_threading.hpp"
#include <rocprofiler/context.h>
#include <rocprofiler/fwd.h>
#include <rocprofiler/hsa.h>
#include <rocprofiler/version.h>
#include <fmt/format.h>
#include <glog/logging.h>
#include <dlfcn.h>
#include <link.h>
#include <unistd.h>
#include <atomic>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <string>
#include <string_view>
#include <thread>
#include <unordered_set>
#include <vector>
extern "C" {
#pragma weak rocprofiler_configure
extern rocprofiler_tool_configure_result_t*
rocprofiler_configure(uint32_t, const char*, uint32_t, rocprofiler_client_id_t*);
}
namespace rocprofiler
{
namespace registration
{
namespace
{
auto&
get_status()
{
static auto _v = std::pair<std::atomic<int>, std::atomic<int>>{0, 0};
return _v;
}
auto&
get_invoked_configures()
{
static auto _v = std::unordered_set<rocprofiler_configure_func_t>{};
return _v;
}
auto&
get_forced_configure()
{
static rocprofiler_configure_func_t _v = nullptr;
return _v;
}
void
init_logging()
{
static auto _once = std::once_flag{};
std::call_once(_once, []() {
auto get_argv0 = []() {
auto ifs = std::ifstream{"/proc/self/cmdline"};
auto sarg = std::string{};
while(ifs && !ifs.eof())
{
ifs >> sarg;
if(!sarg.empty()) break;
}
return sarg;
};
static auto argv0 = get_argv0();
google::InitGoogleLogging(argv0.c_str());
LOG(INFO) << "logging initialized";
});
}
std::vector<std::string>
get_link_map()
{
auto chain = std::vector<std::string>{};
void* handle = nullptr;
handle = dlopen(nullptr, RTLD_LAZY | RTLD_NOLOAD);
if(handle)
{
struct link_map* link_map_v = nullptr;
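// dlinfo returns -1 on failure, in which case link_map_v stays null
if(dlinfo(handle, RTLD_DI_LINKMAP, &link_map_v) != 0 || !link_map_v) return chain;
struct link_map* next_link = link_map_v->l_next;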
while(next_link)
{
if(next_link->l_name != nullptr && !std::string_view{next_link->l_name}.empty())
{
chain.emplace_back(next_link->l_name);
}
next_link = next_link->l_next;
}
}
return chain;
}
struct client_library
{
std::string name = {};
void* dlhandle = nullptr;
decltype(::rocprofiler_configure)* configure_func = nullptr;
std::unique_ptr<rocprofiler_tool_configure_result_t> configure_result = {};
rocprofiler_client_id_t internal_client_id = {};
rocprofiler_client_id_t mutable_client_id = {};
};
std::vector<client_library>
find_clients()
{
auto data = std::vector<client_library>{};
if(get_forced_configure())
{
data.emplace_back(client_library{"(forced)", nullptr, get_forced_configure()});
}
if(!rocprofiler_configure && !get_forced_configure())
{
LOG(ERROR) << "no rocprofiler_configure function found";
return data;
}
if(rocprofiler_configure != &rocprofiler_configure)
throw std::runtime_error("rocprofiler_configure != &rocprofiler_configure");
if(&rocprofiler_configure != get_forced_configure())
data.emplace_back(client_library{"unknown", nullptr, &rocprofiler_configure});
for(const auto& itr : get_link_map())
{
LOG(INFO) << "searching " << itr << " for rocprofiler_configure";
void* handle = dlopen(itr.c_str(), RTLD_LAZY | RTLD_NOLOAD);
// a null handle would make dlsym search the default scope instead of this library
if(handle == nullptr)
{
LOG(ERROR) << "error dlopening " << itr;
continue;
}
decltype(::rocprofiler_configure)* _sym = nullptr;
*(void**) (&_sym) = dlsym(handle, "rocprofiler_configure");
// skip the configure function that was forced
if(_sym == get_forced_configure())
{
data.front().name = itr;
data.front().dlhandle = handle;
data.front().internal_client_id.name = "(forced)";
continue;
}
if(!_sym)
{
LOG(INFO) << "|_" << itr << " did not contain rocprofiler_configure symbol";
continue;
}
if(_sym == &rocprofiler_configure && data.size() == 1)
{
data.front().name = itr;
data.front().dlhandle = handle;
data.front().internal_client_id.name = "default";
}
else
{
uint32_t _prio = data.size();
auto& entry =
data.emplace_back(client_library{itr,
handle,
_sym,
nullptr,
rocprofiler_client_id_t{nullptr, _prio},
rocprofiler_client_id_t{nullptr, _prio}});
entry.internal_client_id.name = entry.name.c_str();
}
}
LOG(ERROR) << __FUNCTION__ << " found " << data.size() << " clients";
return data;
}
std::vector<client_library>&
get_clients()
{
static auto _v = find_clients();
return _v;
}
using mutex_t = std::recursive_mutex;
using scoped_lock_t = std::unique_lock<mutex_t>;
mutex_t&
get_registration_mutex()
{
static auto _v = mutex_t{};
return _v;
}
} // namespace
int
get_init_status()
{
return get_status().first.load(std::memory_order_acquire);
}
int
get_fini_status()
{
return get_status().second.load(std::memory_order_acquire);
}
void
set_init_status(int v)
{
get_status().first.store(v, std::memory_order_release);
}
void
set_fini_status(int v)
{
get_status().second.store(v, std::memory_order_release);
}
bool
invoke_client_configures()
{
if(get_init_status() > 0) return false;
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
if(_lk.owns_lock()) return false;
_lk.lock();
LOG(ERROR) << __FUNCTION__;
size_t prio = 0;
for(auto& itr : get_clients())
{
if(get_invoked_configures().find(itr.configure_func) != get_invoked_configures().end())
{
LOG(ERROR) << "rocprofiler::registration::invoke_client_configures() attempted to "
"invoke configure function from "
<< itr.name << " (addr="
<< fmt::format("{:#018x}", reinterpret_cast<uint64_t>(itr.configure_func))
<< ") more than once";
continue;
}
else
{
LOG(INFO) << "rocprofiler::registration::invoke_client_configures() invoking configure "
"function from "
<< itr.name << " (addr="
<< fmt::format("{:#018x}", reinterpret_cast<uint64_t>(itr.configure_func))
<< ")";
}
auto* _result = itr.configure_func(
ROCPROFILER_VERSION, ROCPROFILER_VERSION_STRING, prio++, &itr.mutable_client_id);
if(_result)
itr.configure_result = std::make_unique<rocprofiler_tool_configure_result_t>(*_result);
get_invoked_configures().emplace(itr.configure_func);
}
return true;
}
bool
invoke_client_initializers()
{
if(get_init_status() > 0) return false;
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
if(_lk.owns_lock()) return false;
_lk.lock();
LOG(ERROR) << __FUNCTION__;
set_init_status(-1);
for(auto& itr : get_clients())
{
if(itr.configure_result && itr.configure_result->initialize)
{
context::push_client(itr.internal_client_id.handle);
itr.configure_result->initialize(&invoke_client_finalizer,
itr.configure_result->tool_data);
context::pop_client(itr.internal_client_id.handle);
// set to nullptr so initialize only gets called once
itr.configure_result->initialize = nullptr;
}
}
// initialization is no longer available
set_init_status(1);
return true;
}
bool
invoke_client_finalizers()
{
if(get_fini_status() > 0) return false;
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
if(_lk.owns_lock()) return false;
_lk.lock();
set_fini_status(-1);
for(auto& itr : get_clients())
{
if(itr.configure_result && itr.configure_result->finalize)
{
itr.configure_result->finalize(itr.configure_result->tool_data);
// set to nullptr so finalize only gets called once
itr.configure_result->finalize = nullptr;
}
}
set_fini_status(1);
return true;
}
bool
invoke_client_initializer(rocprofiler_client_id_t client_id)
{
if(get_init_status() > 0) return false;
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
if(_lk.owns_lock()) return false;
_lk.lock();
// save the original status
auto _restore_status = get_init_status();
set_init_status(-1);
for(auto& itr : get_clients())
{
if(itr.internal_client_id.handle == client_id.handle &&
itr.mutable_client_id.handle == client_id.handle)
{
if(itr.configure_result && itr.configure_result->initialize)
{
context::push_client(itr.internal_client_id.handle);
itr.configure_result->initialize(&invoke_client_finalizer,
itr.configure_result->tool_data);
context::pop_client(itr.internal_client_id.handle);
// set to nullptr so initialize only gets called once
itr.configure_result->initialize = nullptr;
}
}
}
// we don't want the explicit client initialization to set the init status to 1
// we just want to restore what it previously was
set_init_status(_restore_status);
return true;
}
void
invoke_client_finalizer(rocprofiler_client_id_t client_id)
{
auto _lk = scoped_lock_t{get_registration_mutex(), std::defer_lock};
if(_lk.owns_lock()) return;
_lk.lock();
for(auto& itr : get_clients())
{
if(itr.internal_client_id.handle == client_id.handle &&
itr.mutable_client_id.handle == client_id.handle)
{
if(itr.configure_result && itr.configure_result->finalize)
{
itr.configure_result->finalize(itr.configure_result->tool_data);
// set to nullptr so finalize only gets called once
itr.configure_result->finalize = nullptr;
}
}
}
}
void
initialize()
{
static auto _once = std::once_flag{};
static auto _ready = std::atomic<bool>{false};
std::call_once(_once, []() {
init_logging();
invoke_client_configures();
invoke_client_initializers();
internal_threading::initialize();
std::atexit(&finalize);
_ready.store(true, std::memory_order_release);
});
// std::call_once blocks concurrent callers until the callable has returned,
// so this loop is normally a no-op
while(!_ready.load(std::memory_order_acquire))
std::this_thread::yield();
}
void
finalize()
{
hsa_shut_down();
invoke_client_finalizers();
for(auto& itr : rocprofiler::context::get_active_contexts())
itr.store(nullptr, std::memory_order_seq_cst);
internal_threading::finalize();
}
} // namespace registration
} // namespace rocprofiler
extern "C" {
rocprofiler_status_t
rocprofiler_is_initialized(int* status)
{
*status = rocprofiler::registration::get_init_status();
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_is_finalized(int* status)
{
*status = rocprofiler::registration::get_fini_status();
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_force_configure(rocprofiler_configure_func_t configure_func)
{
auto& forced_config = rocprofiler::registration::get_forced_configure();
// init status may be -1 (currently initializing) or 1 (already initialized).
// in either case, ignore this function call
if(rocprofiler::registration::get_init_status() != 0)
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
// if another tool forced configure, the init status should be 1, but
// let's just make sure that the forced configure function is a nullptr
if(forced_config) return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
forced_config = configure_func;
rocprofiler::registration::initialize();
return ROCPROFILER_STATUS_SUCCESS;
}
int
rocprofiler_set_api_table(const char* name,
uint64_t lib_version,
uint64_t lib_instance,
void** tables,
uint64_t num_tables)
{
static auto _once = std::once_flag{};
std::call_once(_once, rocprofiler::registration::initialize);
// validate the arguments common to all registering libraries
LOG_IF(ERROR, num_tables == 0) << " rocprofiler expected " << name
<< " library to pass at least one table, not " << num_tables;
LOG_IF(ERROR, tables == nullptr) << " rocprofiler expected pointer to array of tables from "
<< name << " library, not a nullptr";
if(std::string_view{name} == "hip")
{
// pass to hip init
LOG_IF(ERROR, num_tables > 1)
<< " rocprofiler expected HIP library to pass 1 API table, not " << num_tables;
}
else if(std::string_view{name} == "hsa")
{
// pass to hsa init
LOG_IF(ERROR, num_tables > 1)
<< " rocprofiler expected HSA library to pass 1 API table, not " << num_tables;
// NOTE: *tables is dereferenced even if the checks above reported a nullptr
auto* hsa_api_table = static_cast<HsaApiTable*>(*tables);
auto& saved_hsa_api_table = rocprofiler::hsa::get_table();
::copyTables(hsa_api_table, &saved_hsa_api_table);
rocprofiler::hsa::update_table(hsa_api_table);
}
else if(std::string_view{name} == "roctx")
{
// pass to roctx init
LOG_IF(ERROR, num_tables > 1)
<< " rocprofiler expected ROCTX library to pass 1 API table, not " << num_tables;
}
else
{
LOG(ERROR) << "rocprofiler does not accept API tables from " << name;
LOG_ASSERT(false) << " rocprofiler does not accept API tables from " << name;
}
(void) lib_version;
(void) lib_instance;
(void) tables;
(void) num_tables;
return 0;
}
bool
OnLoad(HsaApiTable* table,
uint64_t runtime_version,
uint64_t failed_tool_count,
const char* const* failed_tool_names)
{
rocprofiler::registration::init_logging();
(void) runtime_version;
(void) failed_tool_count;
(void) failed_tool_names;
fprintf(stderr, "[%s:%i] %s\n", __FILE__, __LINE__, __FUNCTION__);
void* table_v = static_cast<void*>(table);
rocprofiler_set_api_table("hsa", runtime_version, 0, &table_v, 1);
return true;
}
}
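Review note on the `owns_lock()` guards in the `invoke_client_*` functions above: a `std::unique_lock` constructed with `std::defer_lock` never owns its mutex, so checking `owns_lock()` before calling `lock()` cannot tell whether the current thread already holds the recursive mutex. The sketch below demonstrates that, and shows one possible alternative based on a thread-local depth counter. The names `deferred_lock_reports_ownership`, `reentry_guard`, and `guarded_call` are hypothetical and not part of rocprofiler.

```cpp
#include <cassert>
#include <mutex>

using mutex_t       = std::recursive_mutex;
using scoped_lock_t = std::unique_lock<mutex_t>;

// Returns what owns_lock() reports for a freshly constructed deferred lock
// while the calling thread already holds the same recursive mutex.
inline bool
deferred_lock_reports_ownership(mutex_t& mtx)
{
    auto outer = scoped_lock_t{mtx};                   // this thread holds the mutex
    auto inner = scoped_lock_t{mtx, std::defer_lock};  // new lock object, not locked
    return inner.owns_lock();                          // false: ownership is per lock object
}

// Hypothetical alternative: track re-entrance per thread explicitly.
struct reentry_guard
{
    reentry_guard() { ++depth(); }
    ~reentry_guard() { --depth(); }
    reentry_guard(const reentry_guard&) = delete;
    reentry_guard& operator=(const reentry_guard&) = delete;

    // true when this guard was created inside another guarded call on this thread
    bool reentered() const { return depth() > 1; }

private:
    static int& depth()
    {
        thread_local int _v = 0;
        return _v;
    }
};

// Runs one guarded call; nested = true simulates a client calling back in.
// Returns false when the innermost call detected re-entrance.
inline bool
guarded_call(mutex_t& mtx, bool nested)
{
    reentry_guard guard;
    if(guard.reentered()) return false;
    auto lk = scoped_lock_t{mtx};
    return nested ? guarded_call(mtx, false) : true;
}
```

With this shape, a top-level `guarded_call(mtx, false)` succeeds, while a nested call on the same thread is rejected before it tries to take the lock again.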
+95
@@ -0,0 +1,95 @@
// MIT License
//
// Copyright (c) 2023 ROCm Developer Tools
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler/registration.h>
#include "lib/common/defines.hpp"
#include <cstdint>
#include <string>
#include <vector>
extern "C" {
struct HsaApiTable;
using on_load_t = bool (*)(HsaApiTable*, uint64_t, uint64_t, const char* const*);
bool
OnLoad(HsaApiTable* table,
uint64_t runtime_version,
uint64_t failed_tool_count,
const char* const* failed_tool_names) ROCPROFILER_PUBLIC_API;
// this is the "hidden" function that rocprofiler-register invokes to pass
// the API tables to rocprofiler
int
rocprofiler_set_api_table(const char* name,
uint64_t lib_version,
uint64_t lib_instance,
void** tables,
uint64_t num_tables) ROCPROFILER_PUBLIC_API;
}
namespace rocprofiler
{
namespace registration
{
// initialize the clients
void
initialize();
// finalize the clients
void
finalize();
// invoke all rocprofiler_configure symbols
bool
invoke_client_configures();
// invoke initialize functions returned from rocprofiler_configure
bool
invoke_client_initializers();
// invoke finalize functions returned from rocprofiler_configure
bool
invoke_client_finalizers();
// explicitly invoke the initialize function of a specific client
bool invoke_client_initializer(rocprofiler_client_id_t);
// explicitly invoke the finalize function of a specific client
void invoke_client_finalizer(rocprofiler_client_id_t);
int
get_init_status();
int
get_fini_status();
void
set_init_status(int);
void
set_fini_status(int);
} // namespace registration
} // namespace rocprofiler
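Review note on the status accessors declared above: the implementation follows a three-valued convention, 0 before initialization, -1 while it is in progress, and 1 once it has completed. A minimal standalone sketch of that convention follows, using a hypothetical `status_flag` type and `run_once_guarded` helper rather than the real `get_status()` storage.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for one half of the status pair used by registration.
struct status_flag
{
    int  get() const { return m_value.load(std::memory_order_acquire); }
    void set(int v) { m_value.store(v, std::memory_order_release); }

private:
    std::atomic<int> m_value{0};
};

// Mirrors the shape of invoke_client_initializers(): refuse once finished,
// mark "in progress" while running, and mark "done" afterwards.
template <typename FuncT>
bool
run_once_guarded(status_flag& status, FuncT&& func)
{
    if(status.get() > 0) return false;  // already completed
    status.set(-1);                     // in progress
    func(status.get());                 // the callee can observe the -1 state
    status.set(1);                      // completed, no longer available
    return true;
}
```

A caller that sees -1 from `get()` is running inside the guarded function, which is the situation `rocprofiler_force_configure` rejects with its `!= 0` check.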
+27 -49
@@ -20,9 +20,16 @@
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include <rocprofiler/fwd.h>
#include <rocprofiler/rocprofiler.h>
#include <algorithm>
#include "lib/common/utility.hpp"
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/context/domain.hpp"
#include "lib/rocprofiler/hsa/hsa.hpp"
#include "lib/rocprofiler/registration.hpp"
#include <atomic>
#include <vector>
namespace
@@ -34,6 +41,22 @@ consume_args(Tp&&...)
} // namespace
extern "C" {
rocprofiler_status_t
rocprofiler_get_version(uint32_t* major, uint32_t* minor, uint32_t* patch)
{
if(major) *major = ROCPROFILER_VERSION_MAJOR;
if(minor) *minor = ROCPROFILER_VERSION_MINOR;
if(patch) *patch = ROCPROFILER_VERSION_PATCH;
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_get_timestamp(rocprofiler_timestamp_t* ts)
{
*ts = rocprofiler::common::timestamp_ns();
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback,
size_t agent_size,
@@ -76,54 +99,6 @@ rocprofiler_query_available_agents(rocprofiler_available_agents_cb_t callback,
return callback(_agents.data(), _agents.size(), user_data);
}
rocprofiler_status_t
rocprofiler_create_context(rocprofiler_context_id_t* context_id)
{
consume_args(context_id);
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
rocprofiler_status_t
rocprofiler_start_context(rocprofiler_context_id_t context_id)
{
consume_args(context_id);
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
rocprofiler_status_t
rocprofiler_stop_context(rocprofiler_context_id_t context_id)
{
consume_args(context_id);
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
rocprofiler_status_t
rocprofiler_flush_buffer(rocprofiler_buffer_id_t buffer_id)
{
consume_args(buffer_id);
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
rocprofiler_status_t
rocprofiler_destroy_buffer(rocprofiler_buffer_id_t buffer_id)
{
consume_args(buffer_id);
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
rocprofiler_status_t
rocprofiler_create_buffer(rocprofiler_context_id_t context,
size_t size,
size_t watermark,
rocprofiler_buffer_policy_t action,
rocprofiler_buffer_callback_t callback,
void* callback_data,
rocprofiler_buffer_id_t* buffer_id)
{
consume_args(context, size, watermark, action, callback, callback_data, buffer_id);
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
rocprofiler_status_t
rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t context_id,
rocprofiler_agent_t agent,
@@ -132,6 +107,9 @@ rocprofiler_configure_pc_sampling_service(rocprofiler_context_id_t conte
uint64_t interval,
rocprofiler_buffer_id_t buffer_id)
{
if(rocprofiler::registration::get_init_status() > 0)
return ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED;
consume_args(context_id, agent, method, unit, interval, buffer_id);
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
-701
@@ -1,701 +0,0 @@
#include <rocprofiler/config.h>
#include <rocprofiler/rocprofiler.h>
#include "config_helpers.hpp"
#include "config_internal.hpp"
#include <roctracer/roctx.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <atomic>
#include <cstddef>
#include <iostream>
#include <mutex>
#include <hsa/hsa.h>
#include <hsa/hsa_api_trace.h>
#include <hsa/hsa_ext_amd.h>
#include <hsa/hsa_ext_image.h>
typedef enum
{
ACTIVITY_API_PHASE_ENTER = 0,
ACTIVITY_API_PHASE_EXIT = 1
} activity_api_phase_t;
typedef struct roctx_api_data_s
{
union
{
struct
{
const char* message;
roctx_range_id_t id;
};
struct
{
const char* message;
} roctxMarkA;
struct
{
const char* message;
} roctxRangePushA;
struct
{
const char* message;
} roctxRangePop;
struct
{
const char* message;
roctx_range_id_t id;
} roctxRangeStartA;
struct
{
const char* message;
roctx_range_id_t id;
} roctxRangeStop;
} args;
} roctx_api_data_t;
// helper macros ensuring C and C++ structs adhere to specific naming convention
#define ROCP_PUBLIC_CONFIG(TYPE) ::rocprofiler_##TYPE
#define ROCP_PRIVATE_CONFIG(TYPE) ::rocprofiler::internal::TYPE
// The macro below asserts at compile time that the external C object has the same size
// as the internal C++ object, e.g.,
// sizeof(rocprofiler_domain_config) == sizeof(rocprofiler::internal::domain_config)
#define ROCP_ASSERT_CONFIG_ABI(TYPE) \
static_assert(sizeof(ROCP_PUBLIC_CONFIG(TYPE)) == sizeof(ROCP_PRIVATE_CONFIG(TYPE)), \
"Error! rocprofiler_" #TYPE " ABI error");
// The macro below asserts at compile time that the external C struct members have the
// same offset and size as the internal C++ struct members
#define ROCP_ASSERT_CONFIG_OFFSET_ABI(TYPE, PUB_FIELD, PRIV_FIELD) \
static_assert(offsetof(ROCP_PUBLIC_CONFIG(TYPE), PUB_FIELD) == \
offsetof(ROCP_PRIVATE_CONFIG(TYPE), PRIV_FIELD), \
"Error! rocprofiler_" #TYPE "." #PUB_FIELD " ABI offset error"); \
static_assert(sizeof(ROCP_PUBLIC_CONFIG(TYPE)::PUB_FIELD) == \
sizeof(ROCP_PRIVATE_CONFIG(TYPE)::PRIV_FIELD), \
"Error! rocprofiler_" #TYPE "." #PUB_FIELD " ABI size error");
// this defines a template specialization for ensuring that the reinterpret_cast is only
// applied between public C structs and private C++ structs which are compatible.
#define ROCP_DEFINE_API_CAST_IMPL(INPUT_TYPE, OUTPUT_TYPE) \
namespace traits \
{ \
template <> \
struct api_cast<INPUT_TYPE> \
{ \
using input_type = INPUT_TYPE; \
using output_type = OUTPUT_TYPE; \
\
output_type* operator()(input_type* _v) const \
{ \
return reinterpret_cast<output_type*>(_v); \
} \
\
const output_type* operator()(const input_type* _v) const \
{ \
return reinterpret_cast<const output_type*>(_v); \
} \
}; \
}
// define C -> C++ and C++ -> C casting rules
#define ROCP_DEFINE_API_CAST_D(TYPE) \
ROCP_DEFINE_API_CAST_IMPL(ROCP_PUBLIC_CONFIG(TYPE), ROCP_PRIVATE_CONFIG(TYPE)) \
ROCP_DEFINE_API_CAST_IMPL(ROCP_PRIVATE_CONFIG(TYPE), ROCP_PUBLIC_CONFIG(TYPE))
// use only when C++ struct is just an alias for C struct
#define ROCP_DEFINE_API_CAST_S(TYPE) \
ROCP_DEFINE_API_CAST_IMPL(ROCP_PUBLIC_CONFIG(TYPE), ROCP_PRIVATE_CONFIG(TYPE))
namespace
{
namespace traits
{
// left undefined to ensure template specialization
template <typename PublicT>
struct api_cast;
// ensure api_cast<decltype(a)> where decltype(a) is const Tp equates to api_cast<Tp>
template <typename PublicT>
struct api_cast<const PublicT> : api_cast<PublicT>
{};
// ensure api_cast<decltype(a)> where decltype(a) is Tp& equates to api_cast<Tp>
template <typename PublicT>
struct api_cast<PublicT&> : api_cast<PublicT>
{};
// ensure api_cast<decltype(a)> where decltype(a) is Tp* equates to api_cast<Tp>
template <typename PublicT>
struct api_cast<PublicT*> : api_cast<PublicT>
{};
} // namespace traits
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
//
// SEE BELOW! VERY IMPORTANT!
//
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
//
// EVERY NEW CONFIG AND ALL OF ITS MEMBER FIELDS NEED TO HAVE THESE COMPILE TIME CHECKS!
//
// these checks verify the two structs have the same size and that each
// member field has the same size and offset into the struct
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
ROCP_ASSERT_CONFIG_ABI(config)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, size, size)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, compat_version, compat_version)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, api_version, api_version)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, reserved0, context_idx)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, user_data, user_data)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, buffer, buffer)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, domain, domain)
ROCP_ASSERT_CONFIG_OFFSET_ABI(config, filter, filter)
ROCP_ASSERT_CONFIG_ABI(domain_config)
ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, callback, user_sync_callback)
ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, reserved0, domains)
ROCP_ASSERT_CONFIG_OFFSET_ABI(domain_config, reserved1, opcodes)
ROCP_ASSERT_CONFIG_ABI(buffer_config)
ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, callback, callback)
ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, buffer_size, buffer_size)
// ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, reserved0, buffer)
ROCP_ASSERT_CONFIG_OFFSET_ABI(buffer_config, reserved1, buffer_idx)
ROCP_DEFINE_API_CAST_D(config)
ROCP_DEFINE_API_CAST_D(domain_config)
ROCP_DEFINE_API_CAST_D(buffer_config)
ROCP_DEFINE_API_CAST_S(filter_config)
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
//
// SEE ABOVE! VERY IMPORTANT!
//
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
/// use this to ensure that reinterpret_cast from public C struct to internal C++ struct
/// is valid, e.g. guard against accidentally casting to wrong type
template <typename Tp>
auto
rocp_cast(Tp* _val)
{
return traits::api_cast<Tp>{}(_val);
}
/// helper function for making copies of the fields in rocprofiler_config. If the config
/// field needs to be copied in some special way, use a template specialization of the
/// "construct" function in the allocator to handle this, e.g.:
///
/// using special_config = ::rocprofiler::internal::special_config;
///
/// template <>
/// void
/// allocator<special_config, 8>::construct(special_config* const _p,
/// const special_config& _v) const
/// {
/// auto _tmp = special_config{};
/// // ... special copy of fields from _v into _tmp
///
/// // placement new of _tmp into _p
/// new(_p) special_config{ _tmp };
/// }
///
/// template <>
/// void
/// allocator<special_config, 8>::construct(special_config* const _p,
/// special_config&& _v) const
/// {
/// auto _tmp = std::move(_v);
/// // ... perform special needs
///
/// // placement new of _tmp into _p
/// new(_p) special_config{ std::move(_tmp) };
/// }
///
template <typename Tp, typename Up>
Tp*&
copy_config_field(Tp*& _dst, Up* _src_v)
{
static auto _allocator = allocator<Tp>{};
if constexpr(!std::is_same<Tp, Up>::value)
{
using PrivateT = typename traits::api_cast<Up>::output_type;
static_assert(std::is_same<PrivateT, Tp>::value, "Error incorrect field copy");
auto _src = rocp_cast(_src_v);
if(_src)
{
_dst = _allocator.allocate(1);
_allocator.construct(_dst, *_src);
}
return _dst;
}
else
{
if(_src_v)
{
_dst = _allocator.allocate(1);
_allocator.construct(_dst, *_src_v);
}
return _dst;
}
}
auto&
get_configs_buffer()
{
static char
_v[::rocprofiler::internal::max_configs_count * sizeof(rocprofiler::internal::config)];
return _v;
}
auto&
get_configs_mutex()
{
static auto _v = std::mutex{};
return _v;
}
inline uint32_t
get_tid()
{
return syscall(__NR_gettid);
}
constexpr auto rocp_max_configs = ::rocprofiler::internal::max_configs_count;
} // namespace
namespace rocprofiler
{
namespace internal
{
std::array<rocprofiler::internal::config*, max_configs_count>&
get_registered_configs()
{
static auto _v = std::array<rocprofiler::internal::config*, max_configs_count>{};
return _v;
}
std::array<std::atomic<rocprofiler::internal::config*>, max_configs_count>&
get_active_configs()
{
static auto _v = std::array<std::atomic<rocprofiler::internal::config*>, max_configs_count>{};
return _v;
}
} // namespace internal
} // namespace rocprofiler
extern "C" {
rocprofiler_status_t
rocprofiler_allocate_config(rocprofiler_config* _inp_cfg)
{
// perform checks that rocprofiler can be activated
::memset(_inp_cfg, 0, sizeof(rocprofiler_config));
auto* _cfg = rocp_cast(_inp_cfg);
_cfg->size = sizeof(::rocprofiler_config);
_cfg->compat_version = 0;
_cfg->api_version = ROCPROFILER_API_VERSION_ID;
_cfg->context_idx = std::numeric_limits<decltype(_cfg->context_idx)>::max();
// initial value checks
assert(_cfg->size == sizeof(rocprofiler::internal::config));
assert(_cfg->compat_version == 0);
assert(_cfg->api_version == ROCPROFILER_API_VERSION_ID);
assert(_cfg->buffer == nullptr);
assert(_cfg->domain == nullptr);
assert(_cfg->filter == nullptr);
assert(_cfg->context_idx ==
std::numeric_limits<decltype(rocprofiler::internal::config::context_idx)>::max());
// ... allocate any internal space needed to handle another config ...
{
auto _lk = std::unique_lock<std::mutex>{get_configs_mutex()};
// ...
}
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_validate_config(const rocprofiler_config* cfg_v)
{
const auto* cfg = rocp_cast(cfg_v);
if(cfg->buffer == nullptr) return ROCPROFILER_STATUS_ERROR_BUFFER_NOT_FOUND;
if(cfg->filter == nullptr) return ROCPROFILER_STATUS_ERROR_FILTER_NOT_FOUND;
if(cfg->domain == nullptr || cfg->domain->domains == 0)
return ROCPROFILER_STATUS_ERROR_INCORRECT_DOMAIN;
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_start_config(rocprofiler_config* cfg_v, rocprofiler_context_id_t* context_id)
{
if(rocprofiler_validate_config(cfg_v) != ROCPROFILER_STATUS_SUCCESS)
{
std::cerr << "rocprofiler_start_config() provided an invalid configuration. tool "
"should use rocprofiler_validate_config() to check whether the "
"config is valid and adapt accordingly to issues before trying to "
"start the configuration."
<< std::endl;
abort();
}
auto* cfg = rocp_cast(cfg_v);
uint64_t idx = rocp_max_configs;
{
auto _lk = std::unique_lock<std::mutex>{get_configs_mutex()};
for(size_t i = 0; i < rocp_max_configs; ++i)
{
if(rocprofiler::internal::get_registered_configs().at(i) == nullptr)
{
idx = i;
break;
}
}
}
// too many configs already registered
if(idx == rocp_max_configs) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_ACTIVE;
cfg->context_idx = idx;
context_id->handle = idx;
// using the context id, compute the location in the buffer of configs
auto* _offset = get_configs_buffer() + (idx * sizeof(rocprofiler::internal::config));
// placement new into the buffer
auto* _copy_cfg = new(_offset) rocprofiler::internal::config{*cfg};
// make copies of non-null config fields
copy_config_field(_copy_cfg->buffer, cfg->buffer);
copy_config_field(_copy_cfg->domain, cfg->domain);
copy_config_field(_copy_cfg->filter, cfg->filter);
// store until "deallocation"
rocprofiler::internal::get_registered_configs().at(idx) = _copy_cfg;
using config_t = rocprofiler::internal::config;
// atomic swap the pointer into the "active" array used internally
config_t* _expected = nullptr;
bool success = rocprofiler::internal::get_active_configs().at(idx).compare_exchange_strong(
_expected, rocprofiler::internal::get_registered_configs().at(idx));
if(!success) return ROCPROFILER_STATUS_ERROR_HAS_ACTIVE_CONTEXT; // need relevant enum
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_stop_config(rocprofiler_context_id_t idx)
{
// atomically assign the config pointer to NULL so that it is skipped in future
// callbacks
auto* _expected =
rocprofiler::internal::get_active_configs().at(idx.handle).load(std::memory_order_relaxed);
bool success = rocprofiler::internal::get_active_configs()
.at(idx.handle)
.compare_exchange_strong(_expected, nullptr);
// compare_exchange_strong failed
if(!success) return ROCPROFILER_STATUS_ERROR_CONTEXT_NOT_FOUND;
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_domain_add_domain(struct rocprofiler_domain_config* _inp_cfg,
rocprofiler_tracer_activity_domain_t _domain)
{
auto* _cfg = rocp_cast(_inp_cfg);
if(_domain <= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE ||
_domain >= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST)
return ROCPROFILER_STATUS_ERROR_INVALID_DOMAIN_ID;
_cfg->domains |= (1 << _domain);
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_domain_add_domains(struct rocprofiler_domain_config* _inp_cfg,
rocprofiler_tracer_activity_domain_t* _domains,
size_t _ndomains)
{
for(size_t i = 0; i < _ndomains; ++i)
{
auto _status = rocprofiler_domain_add_domain(_inp_cfg, _domains[i]);
if(_status != ROCPROFILER_STATUS_SUCCESS) return _status;
}
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_domain_add_op(struct rocprofiler_domain_config* _inp_cfg,
rocprofiler_tracer_activity_domain_t _domain,
uint32_t _op)
{
auto* _cfg = rocp_cast(_inp_cfg);
if(_domain <= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_NONE ||
_domain >= ROCPROFILER_TRACER_ACTIVITY_DOMAIN_LAST)
return ROCPROFILER_STATUS_ERROR_INVALID_DOMAIN_ID;
if(_op >= get_domain_max_op(_domain)) return ROCPROFILER_STATUS_ERROR_INVALID_OPERATION_ID;
auto _offset = (_domain * rocprofiler::internal::domain_ops_offset);
_cfg->opcodes.set(_offset + _op, true);
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
rocprofiler_domain_add_ops(struct rocprofiler_domain_config* _inp_cfg,
rocprofiler_tracer_activity_domain_t _domain,
uint32_t* _ops,
size_t _nops)
{
for(size_t i = 0; i < _nops; ++i)
{
auto _status = rocprofiler_domain_add_op(_inp_cfg, _domain, _ops[i]);
if(_status != ROCPROFILER_STATUS_SUCCESS) return _status;
}
return ROCPROFILER_STATUS_SUCCESS;
}
// ------------------------------------------------------------------------------------ //
//
// demo of internal implementation
//
// ------------------------------------------------------------------------------------ //
void
api_callback(rocprofiler_tracer_activity_domain_t domain,
uint32_t cid,
const void* /*callback_data*/,
void*)
{
for(const auto& aitr : rocprofiler::internal::get_active_configs())
{
auto* itr = aitr.load();
if(!itr) continue;
// below should be valid so this might need to raise error
if(!itr->domain) continue;
// if the given domain + op is not enabled, skip this config
if(!(*itr->domain)(domain, cid)) continue;
if(itr->filter)
{
if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX)
{}
else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API)
{
if(itr->filter->hsa_function_id && itr->filter->hsa_function_id(cid) == 0) continue;
}
else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API)
{
if(itr->filter->hip_function_id && itr->filter->hip_function_id(cid) == 0) continue;
}
}
auto& _domain = (*itr->domain);
auto& _correlation = (*itr->correlation_id);
auto _correlation_id = rocprofiler::internal::correlation_config::get_unique_record_id();
if(_correlation.external_id_callback)
_correlation.external_id =
_correlation.external_id_callback(domain, cid, _correlation_id);
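// convert explicitly: the native tick period of steady_clock is not guaranteed to be ns
auto timestamp_ns = []() -> uint64_t {
return std::chrono::duration_cast<std::chrono::nanoseconds>(
std::chrono::steady_clock::now().time_since_epoch())
.count();
};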
(void) _domain;
(void) timestamp_ns;
/*
auto _header = rocprofiler_record_header_t{ROCPROFILER_TRACER_RECORD,
rocprofiler_record_id_t{_correlation_id}};
auto _op_id = rocprofiler_tracer_operation_id_t{cid};
auto _agent_id = rocprofiler_agent_id_t{0};
auto _queue_id = rocprofiler_queue_id_t{0};
auto _thread_id = rocprofiler_thread_id_t{get_tid()};
auto _context = rocprofiler_context_id_t{itr->context_idx};
auto _timestamp_raw = rocprofiler_timestamp_t{timestamp_ns()};
auto _timestamp = rocprofiler_record_header_timestamp_t{_timestamp_raw, _timestamp_raw};
if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX)
{
auto _api_data = rocprofiler_tracer_api_data_t{};
const roctx_api_data_t* _data =
reinterpret_cast<const roctx_api_data_t*>(callback_data);
if(itr->filter && itr->filter->name && itr->filter->name(_data->args.message) == 0)
continue;
_api_data.roctx = _data;
auto _phase = rocprofiler_api_tracing_phase_t{ROCPROFILER_PHASE_ENTER};
_timestamp = {_timestamp_raw, _timestamp_raw};
auto _external_cid = rocprofiler_tracer_external_id_t{_data ? _data->args.id : 0};
auto _activity_cid = rocprofiler_tracer_activity_correlation_id_t{0};
const char* _name = _data->args.message;
_domain.user_sync_callback(rocprofiler_record_tracer_t{_header,
_external_cid,
ACTIVITY_DOMAIN_ROCTX,
_op_id,
_api_data,
_activity_cid,
_timestamp,
_agent_id,
_queue_id,
_thread_id,
_phase,
_name},
_context);
}
else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API)
{
auto _api_data = rocprofiler_tracer_api_data_t{};
const hsa_api_data_t* _data = reinterpret_cast<const hsa_api_data_t*>(callback_data);
_api_data.hsa = _data;
auto _phase = rocprofiler_api_tracing_phase_t{(_data->phase == ACTIVITY_API_PHASE_ENTER)
? ROCPROFILER_PHASE_ENTER
: ROCPROFILER_PHASE_EXIT};
if(_phase == ROCPROFILER_PHASE_ENTER)
_timestamp.begin = _timestamp_raw;
else
_timestamp.end = _timestamp_raw;
auto _external_cid = rocprofiler_tracer_external_id_t{0};
auto _activity_cid =
rocprofiler_tracer_activity_correlation_id_t{_data->correlation_id};
const char* _name = nullptr;
_domain.user_sync_callback(rocprofiler_record_tracer_t{_header,
_external_cid,
ACTIVITY_DOMAIN_HSA_API,
_op_id,
_api_data,
_activity_cid,
_timestamp,
_agent_id,
_queue_id,
_thread_id,
_phase,
_name},
_context);
}
else if(domain == ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API)
{
auto _api_data = rocprofiler_tracer_api_data_t{};
const hip_api_data_t* _data = reinterpret_cast<const hip_api_data_t*>(callback_data);
_api_data.hip = _data;
auto _phase = rocprofiler_api_tracing_phase_t{(_data->phase == ACTIVITY_API_PHASE_ENTER)
? ROCPROFILER_PHASE_ENTER
: ROCPROFILER_PHASE_EXIT};
if(_phase == ROCPROFILER_PHASE_ENTER)
_timestamp.begin = _timestamp_raw;
else
_timestamp.end = _timestamp_raw;
auto _external_cid = rocprofiler_tracer_external_id_t{0};
auto _activity_cid =
rocprofiler_tracer_activity_correlation_id_t{_data->correlation_id};
const char* _name = nullptr;
_domain.user_sync_callback(rocprofiler_record_tracer_t{_header,
_external_cid,
ACTIVITY_DOMAIN_HIP_API,
_op_id,
_api_data,
_activity_cid,
_timestamp,
_agent_id,
_queue_id,
_thread_id,
_phase,
_name},
_context);
}
*/
}
}
void
InitRoctracer()
{
for(const auto& itr : rocprofiler::internal::get_registered_configs())
{
if(!itr) continue;
// below should be valid so this might need to raise error
if(!itr->domain) continue;
for(auto ditr : {ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API,
ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_API,
ROCPROFILER_TRACER_ACTIVITY_DOMAIN_ROCTX})
{
if((*itr->domain)(ditr))
{
if(itr->domain->user_sync_callback)
{
// ...
}
else
{
// ...
}
}
}
for(auto ditr : {ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_OPS,
ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HIP_OPS})
{
if((*itr->domain)(ditr))
{
if(itr->domain->opcodes.none())
{
// ...
}
else
{
for(size_t i = 0; i < itr->domain->opcodes.size(); ++i)
{
if((*itr->domain)(ditr, i))
{
// ...
}
}
}
}
}
}
}
}
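Review note on the file removed above: its ABI macros and `api_cast` trait combine two ideas. Compile-time `sizeof`/`offsetof` checks confirm that a public C struct and its private C++ counterpart share a layout, and a trait that is only specialized for compatible pairs keeps `reinterpret_cast` from being applied to an unrelated type. A reduced sketch follows, with hypothetical `public_config`, `private_config`, and `checked_cast` names (not the real rocprofiler types).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical public C-style struct with an opaque reserved field.
struct public_config
{
    uint64_t size;
    uint64_t reserved0;
};

// Hypothetical private counterpart that gives the reserved field a meaning.
struct private_config
{
    uint64_t size;
    uint64_t context_idx;
};

// Layout checks: same total size, same offset and size for each paired member.
static_assert(sizeof(public_config) == sizeof(private_config), "ABI size error");
static_assert(offsetof(public_config, reserved0) == offsetof(private_config, context_idx),
              "ABI offset error");
static_assert(sizeof(public_config::reserved0) == sizeof(private_config::context_idx),
              "ABI member size error");

namespace traits
{
// Left undefined so that casting an unregistered type fails to compile.
template <typename Tp>
struct api_cast;

// Registers the one permitted conversion: public_config* -> private_config*.
template <>
struct api_cast<public_config>
{
    using output_type = private_config;

    output_type* operator()(public_config* v) const
    {
        return reinterpret_cast<output_type*>(v);
    }
};
}  // namespace traits

// Converts only pointer types that have an api_cast specialization.
template <typename Tp>
auto
checked_cast(Tp* v)
{
    return traits::api_cast<Tp>{}(v);
}
```

Calling `checked_cast` on any type without a specialization is a compile error, which is the guard the original `rocp_cast` relied on. The sketch stops at the pointer conversion; whether dereferencing the result is well defined depends on aliasing rules and is not addressed here.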
-510
@@ -1,510 +0,0 @@
/* Copyright (c) 2018-2022 Advanced Micro Devices, Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE. */
#pragma once
#include <rocprofiler/rocprofiler.h>
#include <cstddef>
#include <cstdint>
typedef struct
{
rocprofiler_context_id_t context_id;
rocprofiler_buffer_id_t buffer_id;
} context_buffer_id_t;
typedef context_buffer_id_t roctracer_pool_t;
/* Correlation id */
typedef uint64_t activity_correlation_id_t;
typedef uint32_t activity_kind_t;
typedef uint32_t activity_op_t;
typedef uint64_t roctracer_timestamp_t;
typedef rocprofiler_tracer_activity_domain_t roctracer_domain_t;
typedef rocprofiler_tracer_activity_domain_t activity_domain_t;
// Prof_Protocol
/* Activity record type */
typedef struct activity_record_s
{
uint32_t domain; /* activity domain id */
activity_kind_t kind; /* activity kind */
activity_op_t op; /* activity op */
union
{
struct
{
activity_correlation_id_t correlation_id; /* activity ID */
roctracer_timestamp_t begin_ns; /* host begin timestamp */
roctracer_timestamp_t end_ns; /* host end timestamp */
};
struct
{
uint32_t se; /* sampled SE */
uint64_t cycle; /* sample cycle */
uint64_t pc; /* sample PC */
} pc_sample;
};
union
{
struct
{
int device_id; /* device id */
uint64_t queue_id; /* queue id */
};
struct
{
uint32_t process_id; /* process id */
uint32_t thread_id; /* thread id */
};
struct
{
activity_correlation_id_t external_id; /* external correlation id */
};
};
union
{
size_t bytes; /* data size bytes */
const char* kernel_name; /* kernel name */
const char* mark_message;
};
} activity_record_t;
typedef activity_record_t roctracer_record_t;
/* Activity sync callback type */
typedef void (*activity_sync_callback_t)(activity_domain_t cid,
activity_record_t* record,
const void* data,
void* arg);
/* Activity async callback type */
typedef void (*activity_async_callback_t)(activity_domain_t op, void* record, void* arg);
/* API callback type */
typedef void (*activity_rtapi_callback_t)(activity_domain_t domain,
uint32_t cid,
const void* data,
void* arg);
typedef activity_rtapi_callback_t roctracer_rtapi_callback_t;
typedef roctracer_timestamp_t (*roctracer_get_timestamp_t)();
typedef rocprofiler_timestamp_t (*rocprofiler_get_timestamp_t)();
typedef uint32_t activity_kind_t;
typedef uint32_t activity_op_t;
/* API callback phase */
typedef enum
{
ACTIVITY_API_PHASE_ENTER = 0,
ACTIVITY_API_PHASE_EXIT = 1
} activity_api_phase_t;
const char*
roctracer_op_string(uint32_t domain, uint32_t op);
/* Trace record types */
/**
* Memory pool allocator callback.
*
* If \p *ptr is NULL, then allocate memory of \p size bytes and save address
* in \p *ptr.
*
* If \p *ptr is non-NULL and size is non-0, then reallocate the memory at \p
* *ptr with size \p size and save the address in \p *ptr. The memory will have
* been allocated by the same callback.
*
* If \p *ptr is non-NULL and size is 0, then deallocate the memory at \p *ptr.
* The memory will have been allocated by the same callback.
*
* \p size is the size of the memory allocation or reallocation, or 0 if
* deallocating.
*
* \p arg Argument provided
*/
typedef void (*roctracer_allocator_t)(char** ptr, size_t size, void* arg);
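A minimal sketch of a callback that follows the three cases documented above, using `realloc` and `free`. The typedef is repeated so the snippet stands alone. `example_allocator` and the use of `arg` as a size recorder are illustrative assumptions, not part of the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef void (*roctracer_allocator_t)(char** ptr, size_t size, void* arg);

// Hypothetical allocator: arg, if non-null, points at a size_t that records the last requested size
void
example_allocator(char** ptr, size_t size, void* arg)
{
    if(arg) *static_cast<size_t*>(arg) = size;

    if(size == 0)
    {
        // deallocate: *ptr was produced by this same callback (free(NULL) is a no-op)
        std::free(*ptr);
        *ptr = nullptr;
        return;
    }

    // realloc(NULL, n) behaves like malloc(n), so one call covers both the
    // allocate (*ptr == NULL) and reallocate (*ptr != NULL) cases
    void* mem = std::realloc(*ptr, size);
    if(mem) *ptr = static_cast<char*>(mem);
    // on failure *ptr is left unchanged; the header does not say how errors are reported
}
```

Because the callback returns `void`, a caller can only detect a failed first allocation by checking that `*ptr` is still NULL afterwards.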
/**
* Memory pool buffer callback.
*
* The callback that will be invoked when a memory pool buffer becomes full or
* is flushed.
*
* \p begin pointer to first entry in the buffer.
*
* \p end pointer to one past the end entry in the buffer.
*
* \p arg the argument specified when the callback was defined.
*/
typedef void (*roctracer_buffer_callback_t)(const char* begin, const char* end, void* arg);
/**
* Memory pool properties.
*
* Defines the properties when a tracer memory pool is created.
*/
typedef struct
{
/**
* ROC Tracer mode.
*/
uint32_t mode;
/**
* Size of buffer in bytes.
*/
size_t buffer_size;
/**
* The allocator function to use to allocate and deallocate the buffer. If
* NULL then \p malloc, \p realloc, and \p free are used.
*/
roctracer_allocator_t alloc_fun;
/**
* The argument to pass when invoking the \p alloc_fun allocator.
*/
void* alloc_arg;
/**
* The function to call when a buffer becomes full or is flushed.
*/
roctracer_buffer_callback_t buffer_callback_fun;
/**
* The argument to pass when invoking the \p buffer_callback_fun callback.
*/
void* buffer_callback_arg;
} roctracer_properties_t;
/**
* ROC Tracer API status codes.
*/
typedef enum
{
/**
* The function has executed successfully.
*/
ROCTRACER_STATUS_SUCCESS = 0,
/**
* A generic error has occurred.
*/
ROCTRACER_STATUS_ERROR = -1,
/**
* The domain ID is invalid.
*/
ROCTRACER_STATUS_ERROR_INVALID_DOMAIN_ID = -2,
/**
* An invalid argument was given to the function.
*/
ROCTRACER_STATUS_ERROR_INVALID_ARGUMENT = -3,
/**
* No default pool is defined.
*/
ROCTRACER_STATUS_ERROR_DEFAULT_POOL_UNDEFINED = -4,
/**
* The default pool is already defined.
*/
ROCTRACER_STATUS_ERROR_DEFAULT_POOL_ALREADY_DEFINED = -5,
/**
* Memory allocation error.
*/
ROCTRACER_STATUS_ERROR_MEMORY_ALLOCATION = -6,
/**
* External correlation ID pop mismatch.
*/
ROCTRACER_STATUS_ERROR_MISMATCHED_EXTERNAL_CORRELATION_ID = -7,
/**
* The operation is not currently implemented. This error may be reported by
* any function. Check the \ref known_limitations section to determine the
* status of the library implementation of the interface.
*/
ROCTRACER_STATUS_ERROR_NOT_IMPLEMENTED = -8,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_UNINIT = 2,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_BREAK = 3,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_BAD_DOMAIN = ROCTRACER_STATUS_ERROR_INVALID_DOMAIN_ID,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_BAD_PARAMETER = ROCTRACER_STATUS_ERROR_INVALID_ARGUMENT,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_HIP_API_ERR = 6,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_HIP_OPS_ERR = 7,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_HCC_OPS_ERR = ROCTRACER_STATUS_HIP_OPS_ERR,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_HSA_ERR = 7,
/**
* Deprecated error code.
*/
ROCTRACER_STATUS_ROCTX_ERR = 8,
} roctracer_status_t;
/**
* Query textual name of an operation of a domain.
* @param[in] domain Domain being queried.
* @param[in] op Operation within \p domain.
* @param[in] kind \todo Define kind.
* @return Returns the NUL terminated string for the operation name, or NULL if
* the domain or operation are invalid. The string is owned by the ROC Tracer
* library.
*/
const char*
roctracer_op_string(uint32_t domain, uint32_t op, uint32_t kind);
/**
* Query the operation code given a domain and the name of an operation.
* @param[in] domain The domain being queried.
* @param[in] str The NUL-terminated name of the operation being queried.
* @param[out] op The operation code.
* @param[out] kind If not NULL then the operation kind code.
*/
void
roctracer_op_code(uint32_t domain, const char* str, uint32_t* op, uint32_t* kind);
/**
* Set the properties of a domain.
* @param[in] domain The domain.
* @param[in] properties The properties. Each domain defines its own type for
* the properties. Some domains require the properties to be set before they
* can be enabled.
*/
void
roctracer_set_properties(roctracer_domain_t domain, void* properties);
/**
* Enable runtime API callback for a specific operation of a domain.
* @param domain The domain.
* @param op The operation ID in \p domain.
* @param callback The callback to invoke each time the operation is performed
* on entry and exit.
* @param pool Value to pass as last argument of \p callback.
*/
void
roctracer_enable_op_callback(roctracer_domain_t domain,
uint32_t op,
roctracer_rtapi_callback_t callback);
/**
* Enable runtime API callback for all operations of a domain.
* @param domain The domain
* @param callback The callback to invoke each time the operation is performed
* on entry and exit.
* @param arg Value to pass as last argument of \p callback.
*/
void
roctracer_enable_domain_callback(roctracer_domain_t domain,
roctracer_rtapi_callback_t callback,
void* user_data = nullptr);
/**
* Disable runtime API callback for a specific operation of a domain.
* @param domain The domain
* @param op The operation in \p domain.
*/
void
roctracer_disable_op_callback(roctracer_domain_t domain, uint32_t op);
/**
* Disable runtime API callback for all operations of a domain.
* @param domain The domain
*/
void
roctracer_disable_domain_callback(roctracer_domain_t domain);
/**
* Enable activity record logging for a specified operation of a domain using
* the default memory pool.
* @param[in] domain The domain.
* @param[in] op The activity operation ID in \p domain.
*/
void
roctracer_enable_op_activity(roctracer_domain_t domain, uint32_t op, roctracer_pool_t pool);
/**
* Enable activity record logging for all operations of a domain using the
* default memory pool.
* @param[in] domain The domain.
*/
void
roctracer_enable_domain_activity(roctracer_domain_t domain, roctracer_pool_t pool);
/**
* Disable activity record logging for a specified operation of a domain.
* @param[in] domain The domain.
* @param[in] op The activity operation ID in \p domain.
*/
void
roctracer_disable_op_activity(roctracer_domain_t domain, uint32_t op);
/**
* Disable activity record logging for all operations of a domain.
* @param[in] domain The domain.
*/
void
roctracer_disable_domain_activity(roctracer_domain_t domain);
// HIP Support
typedef enum
{
HIP_OP_ID_DISPATCH = 0,
HIP_OP_ID_COPY = 1,
HIP_OP_ID_BARRIER = 2,
HIP_OP_ID_NUMBER = 3
} hip_op_id_t;
// HSA Support
// HSA OP ID enumeration
enum hsa_op_id_t
{
HSA_OP_ID_DISPATCH = 0,
HSA_OP_ID_COPY = 1,
HSA_OP_ID_BARRIER = 2,
HSA_OP_ID_RESERVED1 = 3,
HSA_OP_ID_NUMBER
};
// HSA EVT ID enumeration
enum hsa_evt_id_t
{
HSA_EVT_ID_ALLOCATE = 0, // Memory allocate callback
HSA_EVT_ID_DEVICE = 1, // Device assign callback
HSA_EVT_ID_MEMCOPY = 2, // Memcopy callback
HSA_EVT_ID_SUBMIT = 3, // Packet submission callback
HSA_EVT_ID_KSYMBOL = 4, // Loading/unloading of kernel symbol
HSA_EVT_ID_CODEOBJ = 5, // Loading/unloading of device code object
HSA_EVT_ID_NUMBER
};
struct hsa_ops_properties_t
{
void* reserved1[4];
};
// ROCTx Support
typedef uint64_t roctx_range_id_t;
/**
* ROCTX API ID enumeration
*/
enum roctx_api_id_t
{
ROCTX_API_ID_roctxMarkA = 0,
ROCTX_API_ID_roctxRangePushA = 1,
ROCTX_API_ID_roctxRangePop = 2,
ROCTX_API_ID_roctxRangeStartA = 3,
ROCTX_API_ID_roctxRangeStop = 4,
ROCTX_API_ID_NUMBER,
};
/**
* ROCTX callbacks data type
*/
typedef struct roctx_api_data_s
{
union
{
struct
{
const char* message;
roctx_range_id_t id;
};
struct
{
const char* message;
} roctxMarkA;
struct
{
const char* message;
} roctxRangePushA;
struct
{
const char* message;
} roctxRangePop;
struct
{
const char* message;
roctx_range_id_t id;
} roctxRangeStartA;
struct
{
const char* message;
roctx_range_id_t id;
} roctxRangeStop;
} args;
} roctx_api_data_t;
// External Support
/* Extension opcodes */
typedef enum
{
ACTIVITY_EXT_OP_MARK = 0,
ACTIVITY_EXT_OP_EXTERN_ID = 1
} activity_ext_op_t;
typedef void (*roctracer_start_cb_t)();
typedef void (*roctracer_stop_cb_t)();
typedef struct
{
roctracer_start_cb_t start_cb;
roctracer_stop_cb_t stop_cb;
} roctracer_ext_properties_t;
// Tracing start
void
roctracer_start();
// Tracing stop
void
roctracer_stop();
// Notifies that the calling thread is entering an external region.
// Push an external correlation id for the calling thread.
void
roctracer_activity_push_external_correlation_id(activity_correlation_id_t id);
// Notifies that the calling thread is leaving an external region.
// Pop an external correlation id for the calling thread.
// 'last_id' returns the last external correlation id if not NULL
void
roctracer_activity_pop_external_correlation_id(activity_correlation_id_t* last_id);
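The push/pop comments above describe a per-thread stack of external correlation ids. A minimal sketch of that behavior follows, assuming a `thread_local` vector as storage. The `example_` names and the boolean return on pop are hypothetical and say nothing about how the library implements these functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef uint64_t activity_correlation_id_t;

namespace
{
// one stack per calling thread
std::vector<activity_correlation_id_t>&
example_external_id_stack()
{
    thread_local std::vector<activity_correlation_id_t> stack;
    return stack;
}
}  // namespace

// Entering an external region: push the id for the calling thread
void
example_push_external_correlation_id(activity_correlation_id_t id)
{
    example_external_id_stack().push_back(id);
}

// Leaving an external region: pop the most recent id for the calling thread.
// Returns false on a pop without a matching push; writes the popped id to
// last_id only when last_id is not NULL.
bool
example_pop_external_correlation_id(activity_correlation_id_t* last_id)
{
    auto& stack = example_external_id_stack();
    if(stack.empty()) return false;
    if(last_id) *last_id = stack.back();
    stack.pop_back();
    return true;
}
```

The unmatched-pop case is the situation that `ROCTRACER_STATUS_ERROR_MISMATCHED_EXTERNAL_CORRELATION_ID` in the status enum appears to describe, although the declarations here return `void`.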
+1 -1
@@ -154,7 +154,7 @@ validate(const std::vector<rocprofiler_record_header_t*>& _headers)
auto& _ref_data = get_generated_array<Tp, N>();
for(auto* itr : _headers)
{
-if(itr->kind == typeid(data_type).hash_code())
+if(itr->hash == typeid(data_type).hash_code())
{
auto* _data = static_cast<data_type*>(itr->payload);
EXPECT_EQ(_ref_data, *_data);
+1 -1
@@ -147,7 +147,7 @@ validate(const std::vector<rocprofiler_record_header_t*>& _headers)
auto& _ref_data = get_generated_array<Tp, N>();
for(auto* itr : _headers)
{
-if(itr->kind == typeid(data_type).hash_code())
+if(itr->hash == typeid(data_type).hash_code())
{
auto* _data = static_cast<data_type*>(itr->payload);
ASSERT_TRUE(_data != nullptr);
+4 -4
@@ -54,7 +54,7 @@ template <typename Tp>
void
extract_header(std::vector<Tp>& _arr, rocprofiler_record_header_t* _hdr)
{
-if(_hdr->kind == typeid(Tp).hash_code())
+if(_hdr->hash == typeid(Tp).hash_code())
{
auto* _v = reinterpret_cast<Tp*>(_hdr->payload);
_arr.emplace_back(*_v);
@@ -129,17 +129,17 @@ TEST(buffering, serial)
{
ASSERT_TRUE(itr->payload) << "nullptr to payload not expected";
-if(itr->kind == typeid(uint_raw_array_t).hash_code())
+if(itr->hash == typeid(uint_raw_array_t).hash_code())
{
extract_header(_ui_result, itr);
}
-else if(itr->kind == typeid(flt_raw_array_t).hash_code())
+else if(itr->hash == typeid(flt_raw_array_t).hash_code())
{
extract_header(_fp_result, itr);
}
else
{
-GTEST_FAIL() << "unknown type id hash code: " << std::to_string(itr->kind);
+GTEST_FAIL() << "unknown type id hash code: " << std::to_string(itr->hash);
}
}
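The hunks above switch the type check from `kind` to `hash`. A minimal sketch of the pattern the tests rely on is below: a stored `typeid(T).hash_code()` is compared against the expected type before the payload is cast. `example_record_header` and `make_header` are hypothetical stand-ins, since the real `rocprofiler_record_header_t` layout is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>
#include <vector>

// Hypothetical header: a type hash plus an untyped payload pointer
struct example_record_header
{
    std::size_t hash;
    void*       payload;
};

// Tags a value with the hash of its static type
template <typename Tp>
example_record_header
make_header(Tp& value)
{
    return example_record_header{typeid(Tp).hash_code(), &value};
}

// Copies the payload into out only when the stored hash matches Tp.
// Note: the standard does not guarantee distinct hash codes for distinct
// types, so this check is a practical filter rather than a proof of type.
template <typename Tp>
bool
extract_header(std::vector<Tp>& out, const example_record_header& hdr)
{
    if(hdr.hash != typeid(Tp).hash_code() || !hdr.payload) return false;
    out.emplace_back(*static_cast<Tp*>(hdr.payload));
    return true;
}
```

Naming the field `hash` rather than `kind` makes it clearer that the value identifies a C++ type and not a record category.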
+1 -1
@@ -105,7 +105,7 @@ def generate_custom(args, cmake_args, ctest_args):
set(CTEST_CUSTOM_MAXIMUM_NUMBER_OF_ERRORS "100")
set(CTEST_CUSTOM_MAXIMUM_NUMBER_OF_WARNINGS "100")
set(CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE "51200")
-set(CTEST_CUSTOM_COVERAGE_EXCLUDE "/usr/.*;/opt/.*;.*external/.*;.*samples/.*;.*tests/.*")
+set(CTEST_CUSTOM_COVERAGE_EXCLUDE "/usr/.*;/opt/.*;.*external/.*;.*samples/.*;.*tests/.*;.*/details/.*")
set(CTEST_MEMORYCHECK_TYPE "{MEMCHECK_TYPE}")
set(CTEST_MEMORYCHECK_SUPPRESSIONS_FILE "{MEMCHECK_SUPPRESSION_FILE}")
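To see which paths the added `.*/details/.*` entry affects, the exclude list can be tried against sample paths. The sketch below uses `std::regex` and assumes each semicolon-separated entry is applied as an unanchored regular expression. The helper names and the sample paths in the note that follows are hypothetical, and CTest's own matcher may differ in details.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Patterns copied from the new CTEST_CUSTOM_COVERAGE_EXCLUDE value
const std::vector<std::string>&
example_exclude_patterns()
{
    static const std::vector<std::string> patterns = {
        "/usr/.*", "/opt/.*", ".*external/.*", ".*samples/.*", ".*tests/.*", ".*/details/.*"};
    return patterns;
}

// Assumption: a file is excluded when any pattern matches somewhere in its path
bool
example_is_excluded(const std::string& path)
{
    for(const auto& pattern : example_exclude_patterns())
    {
        if(std::regex_search(path, std::regex{pattern})) return true;
    }
    return false;
}
```

Under these assumptions, a path such as `source/lib/rocprofiler/details/foo.hpp` is excluded by the new entry, while `source/lib/rocprofiler/context.cpp` is still counted.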
+4
@@ -7,3 +7,7 @@ thread:libhsa-runtime64.so
# unlock of an unlocked mutex (or by a wrong thread)
mutex:librocm_smi64.so
+# google logging
+race:google::LogMessageTime::CalcGmtOffset
+race:tzset_internal
+6
@@ -3,8 +3,14 @@
WORK_DIR=$(cd $(dirname ${BASH_SOURCE[0]})/../docs &> /dev/null && pwd)
SOURCE_DIR=$(cd ${WORK_DIR}/../.. &> /dev/null && pwd)
pushd ${SOURCE_DIR}
cmake -B build-docs ${SOURCE_DIR} -DROCPROFILER_INTERNAL_BUILD_DOCS=ON
popd
pushd ${WORK_DIR}
cmake -DSOURCE_DIR=${SOURCE_DIR} -P generate-doxyfile.cmake
doxygen rocprofiler.dox
doxysphinx build ${WORK_DIR} ${WORK_DIR}/_build/html ${WORK_DIR}/_doxygen/html
popd