Commit graph

8 Commits

Autor SHA1 Nachricht Datum
Madsen, Jonathan adc4e6995d [SDK] Support HIP 7.0 API changes (#432)
* [Do not merge] Make changes to api_args

* Support HIP 7.0 API changes

- Provide ROCPROFILER_SDK_ variants of ROCPROFILER_ version defines
- Provide ROCPROFILER_SDK_COMPUTE_VERSION
- hipCtxGetApiVersion changes parameter from int* to unsigned int*
- hipMemcpyHtoD and hipMemcpyHtoDAsync changed void* to const void*

* Fix comment

---------

Co-authored-by: Jatin Chaudhary <jatchaud@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: caf1a2174e]
2025-06-03 21:50:50 -05:00
Rawat, Swati edb51fc861 update copyright date to 2025 (#102)
* Update LICENSE

* Update conf.py

* Update copyright year

* [fix] Update copyright year

* Update copyright year "ROCm Developer Tools"

* Add license headers to c++ files

* Add license to *.py

* Update licenses in rocdecode sources

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 97b7a6315d]
2025-01-22 19:11:20 -06:00
Jonathan R. Madsen 407fc57ede Shared Library Constructor (rocprofv3 deadlock fix) (#599)
* Moved tests/apps to tests/bin

* Renamed cmake project in tests/bin

* Update samples

- Use ROCPROFILER_DEFAULT_FAIL_REGEX
- tweaks to stdout messages

* Update tests

- Use ROCPROFILER_DEFAULT_FAIL_REGEX

* Add tests/lib

- libraries with HIP code

* Update PTL submodule

- remove atexit delete of thread_id_map

* Update cmake/rocprofiler_options.cmake

- Set ROCPROFILER_DEFAULT_FAIL_REGEX

* Update common lib: env + logging

- improved customization of logging settings
- default to disabling logging to files
- install failure handler for rocprofv3
- set_env support in environment.*

* Add lib/rocprofiler-sdk/shared_library.cpp

- shared library constructor

* Update lib/rocprofiler-sdk-tool/tool.cpp

- destructor thread safety
- convert callback_name_info and buffered_name_info to pointers
- install failure handler for logging

* Add tests/bin/hip-in-libraries

- hip-in-libraries is an exe which uses two shared libraries where each shared library contains HIP kernels
  - used for testing deadlocking within __hipRegisterFatBinary

* Update bin/rocprofv3

- reorganized the env variables
- use exec to launch command
- set ROCPROFILER_LIBRARY_CTOR=1

* Add tests/rocprofv3/tracing-hip-in-libraries

- uses hip-in-libraries exe for exe which uses shared libraries to launch HIP kernels

* Update bin/rocprofv3

- fix counter collection (no exec)

* Update lib/rocprofiler-sdk-tool/tool.cpp

- replace "Kernel-Name" with "Kernel_Name"

* Update lib/rocprofiler-sdk/registration.cpp

Use RTLD_LOCAL instead of RTLD_GLOBAL for env libraries

* Update tests/rocprofv3

- replace "Kernel-Name" with "Kernel_Name"

* Update tests

- vector-ops (bin) stream syncs + runs with 4 queues per device
- improve counter-collection/input1 validation
- rocprofv3/tracing-hip-in-libraries does not do sys-trace
- improved validation script for tracing-hip-in-libraries
- updated dispatch_callback in json-tool.cpp following reworking of prototypes for counter collection

* Update samples/counter_collection

- updated dispatch_callback(s) and record_callback(s) following reworking of prototypes

* Update bin/rocprofv3

- reorganized help menu
- added options for sub-HSA tables
- added --hip-runtime-trace
- changed --hip-trace to include --hip-compiler-trace

* Update lib/rocprofiler-sdk-tool

- improved kernel filtering
- removed arch_vgpr, accum_vgpr, sgpr code (in rocprofiler-sdk)
- fixed issue with counter-collection w/o tracing
- added support for fine grained HSA API tracing
- removed directly linking to HSA-runtime

* Update lib/rocprofiler-sdk/agent.cpp

- rocp_agents != hsa_agents is non-fatal when ROCPROFILER_BUILD_CI=OFF (CMake option)

* GPR (vector and scalar) info in kernel symbol data

- rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t contains general purpose register info

* Header include order fix

- Include repo headers first
- Third party library headers next
- standard library headers last

* Update dispatch profiling public API

- introduce rocprofiler_profile_counting_dispatch_data_t
- change signature of rocprofiler_profile_counting_dispatch_callback_t and rocprofiler_profile_counting_record_callback_t
- provide rocprofiler_user_data_t pointer in dispatch callback
- provide rocprofiler_user_data_t value (from dispatch cb) in record callback

* Update tests/bin/CMakeLists.txt

- fix add_subdirectory(hip-in-libraries) order

* Update VERSION

- bump to 0.2.0 in prep for AFAR

[ROCm/rocprofiler-sdk commit: 7b6d3c70bd]
2024-03-07 22:21:26 -06:00
Jonathan R. Madsen 20b236f2c2 Counter API and Samples Updates (#410)
* Update include/rocprofiler-sdk/{counters,profile_config}.h

- use rocprofiler_agent_id_t instead of rocprofiler_agent_t

* Update samples

- use rocprofiler-sdk::rocprofiler-sdk instead of rocprofiler::rocprofiler in cmake
- api_callback_tracing sample roctxProfiler{Pause,Resume}
- api_callback_tracing sample uses ROCTx
- updates to use rocprofiler_agent_id_t

* Update run-ci.py

- exclude rocprofiler-sdk-tool from samples (no sample uses that code)

* Update lib/rocprofiler-sdk-tool/tool.cpp

- Update rocprofiler_iterate_agent_supported_counters to use agent ID

* Update lib/rocprofiler-sdk/counters/core.*

- profile_config has pointer to agent instead of copy

* Update lib/rocprofiler-sdk/agent.*

- provide get_agent(...) func via rocp agent id

* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp

- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED for enums missing implementation

* Update lib/rocprofiler-sdk/counters.cpp

- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t

* Update lib/rocprofiler-sdk/profile_config.cpp

- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t

* Update source/docs

- requirements.txt + install reqs in cmake

* Bump version to 0.1.0

* Update samples/api_callback_tracing/CMakeLists.txt

- LD_LIBRARY_PATH for test

* Update test/rocprofv3/tracing/CMakeLists.txt

- reorder validation files so memory copy comes first

* Update lib/rocprofiler-sdk-tool/tool.cpp

- logging for flushing buffers
- variables for buffer_size and buffer_watermark
  - increase the watermark to a full buffer
- use dedicated threads for each buffer

* Update lib/rocprofiler-sdk-tool/CMakeLists.txt

- test sets ROCPROF_LOG_LEVEL and ROCPROFILER_LOG_LEVEL to info

* Remove lib/rocprofiler-sdk-tool/trace_buffer.hpp

* Update lib/rocprofiler-sdk-tool/CMakeLists.txt

- drop log level to warning when leak sanitizer is enabled (produces small memory leak)

[ROCm/rocprofiler-sdk commit: 9a8b6f6b7b]
2024-01-25 23:47:40 -06:00
Jonathan R. Madsen cefb7bc8d6 HIP API Tracing (#357)
* Update include/rocprofiler-sdk/hip*

- updates for intercept table

* Update lib/common/units.hpp

- clang-tidy fixes

* Add lib/rocprofiler-sdk/hip

- tracing implementation for the HIP intercept table

* Update source/lib/rocprofiler-sdk/CMakeLists.txt

- add_subdirectory(hip)

* Update source/lib/rocprofiler-sdk/hsa

- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION

* Update lib/rocprofiler-sdk/hip

- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible

* Update lib/rocprofiler-sdk/hsa/utils.hpp

- stringize_impl print dereferenced pointers when possible

* Update lib/rocprofiler-sdk/tests/intercept_table.cpp

- remove failures for intercepting HIP API tables

* Update include/rocprofiler-sdk/fwd.h

- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args

* Update lib/rocprofiler-sdk/intercept_table.cpp

- support HipDispatchTable and HipCompilerDispatchTable

* Update lib/rocprofiler-sdk/internal_threading.cpp

- Support ROCPROFILER_HIP_COMPILER_LIBRARY

* Update lib/rocprofiler-sdk/registration.cpp

- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging

* Update samples/api_{buffered,callback}_tracing

- Modifications to demonstrate HIP API tracing

* Update tests/kernel-tracing

- Modifications to handle/test HIP API tracing

* Separate HIP tracing from HIP compiler tracing

* Fix installation of include/rocprofiler-sdk/hip/*

- add compiler and table headers to install

* Fixes to HIP interception

- hip_api_trace.hpp was updated a bit
  - removed hipGetDeviceProperties (generic)
  - added hipGetDevicePropertiesR0600
  - added hipGetDevicePropertiesR0000
  - removed hipRegisterTracerCallback
  - reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
  - added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers

* Update lib/rocprofiler-sdk/hip/hip.*

- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)

* Update lib/rocprofiler-sdk/hsa/hsa.*

- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)

* Update test/kernel-tracing/validate.py

- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register

* Update tests/tools/json-tool.cpp

- fix context associated with "HIP_API_CALLBACK"

* Update external/CMakeLists.txt

- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
  - BUILD_TESTING (OFF)
  - BUILD_SHARED_LIBS (OFF)
  - BUILD_OBJECT_LIBS (OFF)
  - BUILD_STATIC_LIBS (ON)
  - CMAKE_POSITION_INDEPENDENT_CODE (ON)
  - CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
  - CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog

* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt

- remove explicit setting of SKIP_BUILD_RPATH

* Update CMakeLists.txt

- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH

* Update tests/CMakeLists.txt

- include(GNUInstallDirs)

* Update samples/CMakeLists.txt

- include(GNUInstallDirs)

* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h

- remove extern "C" due to incompatibility b/t empty struct in C (size 0) vs. empty struct in C++ (size 1)

* Update lib/rocprofiler-sdk/hip/details/ostream.hpp

- clang-tidy fixes

* Update cmake/rocprofiler_linting.cmake

- add a feature for clang tidy exe

* Update lib/rocprofiler-sdk/hip/hip.cpp

- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- fix merge

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- fix merge

* Update bin/rocprofv3

- args for marker, HIP runtime, and HIP compiler tracing

* Update tests/apps/simple-transpose

- use roctx

* Update tests/rocprofv3/tracing

- validate marker API data

* Update lib/rocprofiler-sdk-tool

- support for HIP runtime, HIP compiler, marker API

* Update queue/queue_controller/registration/utility

- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
  - implements a thread yield + sleep
- add a sync function to Queue class
- add a iterate_queues member function to QueueController
  - this is used to sync each queue during queue_controller_fini()

* Fix data races: queue/context/stable_vector

- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array

* Update lib/rocprofiler-sdk/hsa/hsa.*

- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables

* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp

- use HSA subtable accessors

* Update rocprofiler_memcheck and CI workflow

- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
  - GCC 13 uses libtsan.so.2

* Update CI workflow

* Update lib/rocprofiler-sdk/counters/{metrics,counters}

- fix possibly dangling reference to a temporary from gcc-13

* Update thread-sanitizer-suppr.txt

- Ignore data races originating in hsa-runtime library

* Update cmake/rocprofiler_memcheck.cmake

- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library

* Update tests/rocprofv3/tracing/CMakeLists.txt

- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test

* Update lib/common/container/record_header_buffer.hpp

- fix data race identified by gcc v13 and libtsan.so.2

* Update hip API id, args, and def

- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0

* Update lib/common/container/record_header_buffer.hpp

- fix deadlock in save/read/reset

* Update source/docs/CMakeLists.txt

- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr

* Update lib/rocprofiler-sdk/hip/details/ostream.hpp

- remove overloads for HIP_MEMSET_NODE_PARAMS

* Update docs/CMakeLists.txt

- use find_program for shell instead of hardcoded /bin/bash

[ROCm/rocprofiler-sdk commit: c641749fe6]
2024-01-24 16:32:54 -06:00
Jonathan R. Madsen 35c6c82025 Fixes licensing in files (#206)
* Update LICENSE

- fix inconsistencies

* Revert lib/rocprofiler/counters/parser/scanner.cpp

* Update lib/rocprofiler/counters/tests/dimension.cpp

- revert ending curly brace

* Revert missing curly braces

- missing curly braces when file did not end with a new line

[ROCm/rocprofiler-sdk commit: 086218c2eb]
2023-11-14 10:58:33 -06:00
Jonathan R. Madsen 218666ebe9 Linting workflow and clang-tidy fixes (#72)
* Update source/{bin,lib/{common,rocprofiler}}/CMakeLists.txt

- activate clang-tidy

* Update PTL submodule

- clang-tidy fixes

* Update .clang-tidy

- ignore performance-enum-size

* Update CI workflow

- update paths-ignore

* Add linting workflow

- runs clang-tidy

* Update cmake/rocprofiler_build_settings.cmake

- minor modification of flags not recognized by clang-tidy

* Update samples (all of them)

- rocprofiler-samples-build-flags target with -W -Wall -Wextra -Wshadow [-Werror]
- Link samples targets to rocprofiler-samples-build-flags if target exists
- Remove unused variable in main.cpp of api_{buffered,callback}_tracing
- Update samples/pc_sampling
  - single-user-multiple-agents.cpp ends up with unused function find_first_gpu_agent() error
  - change find_first_gpu_agent to return std::optional<rocprofiler_agent_t>
  - change usage after call to find_first_gpu_agent()
  - use find_first_gpu_agent() in single-user-multiple-agents.cpp to determine if there are any GPUs

* Update linting workflow

- fix path to run-ci.py script

* Update linting workflow

- install cmake

* Update common/container/stable_vector.hpp

- fix clang-tidy warning for readability-container-size-empty

[ROCm/rocprofiler-sdk commit: 34505943b2]
2023-09-21 14:35:20 -05:00
Jonathan R. Madsen 18da0bd49d Contexts, tracing, include reorg, registration, thread-pool (#65)
* Update scripts/update-doxygen.sh

- ensure build-docs folder exists

* Update scripts/run-ci.py

- exclude files in details subdirectory from code coverage

* Update scripts/thread-sanitizer-suppr.txt

- exclude races in glog

* Update docs/rocprofiler.dox.in

- exclude defines in include/rocprofiler/defines.h from doxygen
- Tweak EXCLUDE_PATTERNS and EXAMPLE_PATTERNS

* Update docs workflow

- trigger workflow whenever there is a change to the public headers (which may be doxygen comments)

* Update include/rocprofiler (reorg and overhaul)

- rocprofiler_status_t additions
  - CONTEXT_NOT_FOUND
  - CONTEXT_ERROR
  - INVALID_CONTEXT_ID
  - INVALID_CONTEXT
  - BUFFER_BUSY
- rocprofiler_context_is_active func
- rocprofiler_context_is_valid func
- rocprofiler_service_callback_tracing_kind_t update
  - remove ROCPROFILER_SERVICE_CALLBACK_TRACING_HELPER_THREAD
- Remove rocprofiler_tracing_helper_thread_operation_t
- Remove rocprofiler_helper_thread_callback_tracer_data_t
- Added rocprofiler_internal_thread_library_t
- Added rocprofiler_at_internal_thread_create
- split rocprofiler.h into several smaller headers
- reworked rocprofiler_status_t values
- added doxygen comments for enums
- replaced rocprofiler_trace_record_operation_kind_t with rocprofiler_trace_operation_t
- use @ instead of / in doxygen comment in rocprofiler_plugin.h
- fix ref to ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API
- end group in fwd.h
- remove PROFILE_COUNTING group in dispatch_profile.h
- remove premature group close in callback_tracing.h
- hsa.h: remove rocprofiler_hsa_trace_data_t
- fwd.h: remove rocprofiler_tracer_callback_data_t
- rename rocprofiler_correlation_id_t.handle to rocprofiler_correlation_id_t.id (consistency)
- fwd.h: add rocprofiler_callback_tracing_record_t
- callback_tracing.h: update rocprofiler_hsa_api_callback_tracer_data_t
- callback_tracing.h: add size fields
- simplify rocprofiler_tracer_callback_t
- removed ROCPROFILER_NONNULL from rocprofiler_get_version
- added rocprofiler_get_timestamp
- ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED in rocprofiler_status_t
- add ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND rocprofiler_status_t
- add rocprofiler_buffer_category_t
- rocprofiler_trace_operation_t -> rocprofiler_tracing_operation_t
- rocprofiler_user_data_t union
- tweak rocprofiler_callback_tracing_record_t
  - make external_correlation_id non-pointer
  - add rocprofiler_user_data_t data field
- tweak rocprofiler_record_header_t
  - instead of single uint64_t kind field, have union for category + kind (two u32) with u64 hash
- API extensions for kind id <-> kind string
- API extensions for operation id <-> operation string
- rocprofiler_callback_trace_kind_name_cb_t
- rocprofiler_callback_trace_operation_name_cb_t
- rocprofiler_iterate_callback_trace_kind_names
- rocprofiler_iterate_callback_trace_kind_operation_names
- modify rocprofiler_hsa_api_callback_tracer_data_t data members (remove pointers)
- add rocprofiler_callback_trace_operation_args_cb_t function pointer typedef
- add rocprofiler_iterate_callback_trace_operation_args function
- fixed inconsistent use of *_trace_* vs. *_tracing_* (opting for tracing)
- removed rocprofiler_query_callback_trace_kind_name
- removed rocprofiler_query_callback_kind_operation_name
- Add include/rocprofiler/registration.h
  - header dedicated to registering a tool/client with rocprofiler
  - this header is not intended to be included by rocprofiler.h
  - rocprofiler_client_id_t
    - identifier for client tool
  - rocprofiler_client_finalize_t
    - function pointer prototype for tool-initiated finalization
  - rocprofiler_tool_initialize_t
    - function pointer prototype for tool initialization (i.e. configuration)
  - rocprofiler_tool_finalize_t
    - function pointer prototype for tool finalization
  - rocprofiler_tool_configure_result_t
    - struct returned by tool/client to rocprofiler
  - rocprofiler_is_initialized
    - function for querying whether tool-induced initialization is possible
  - rocprofiler_is_finalized
    - function for querying whether rocprofiler has been finalized
  - rocprofiler_configure prototype
    - this is the function tools implement
    - prototype is always marked as having default visibility
    - no implementation in rocprofiler
  - added typedef for rocprofiler_configure function pointer
  - added rocprofiler_force_configure to explicitly invoke rocprofiler_configure instead of relying on lazy init
- made callback typedef names more consistent (_cb_t suffix)
- typedef for rocprofiler_internal_thread_library_cb_t function pointer
- added rocprofiler_at_internal_thread_create function
- added rocprofiler_callback_thread_t struct
- added rocprofiler_create_callback_thread function
- added rocprofiler_assign_callback_thread function
- removed rocprofiler_buffer_tracing_record_header_t in favor of kind and correlation id in each record type
- added rocprofiler_buffer_tracing_kind_name_cb_t typedef
- added rocprofiler_buffer_tracing_operation_name_cb_t typedef
- added rocprofiler_iterate_buffer_tracing_kind_names function
- added rocprofiler_iterate_buffer_tracing_kind_operation_names function
- removed rocprofiler_query_buffer_trace_kind_name function
- removed rocprofiler_query_buffer_kind_operation_name function

* Update lib/common/container/stable_vector.hpp

- include limits header
- reserve_size struct
- overload stable_vector constructor to support reserving as part of construction

* Update lib/common/container/record_header_buffer.{hpp,cpp}

- add emplace member function accepting category and kind (two u32 variables) instead of one u64 kind
- use std::shared_mutex to prevent data-race when reading m_headers
- record_header_buffer is now multiple writer, single reader
- add read_lock member function (shared)
- add read_unlock member function (shared)
- lock member function gets exclusive lock
- unlock member function releases exclusive lock

* Rename "config" to "context" + restructure + implement

- Restructure config files + license
  - move config files into lib/rocprofiler/config subfolder
  - rename some files
  - add license to some files which were missing it
- Rename config/helpers.hpp
  - rename to allocator.hpp
  - remove get_domain_max_ops
- Create config/domain.{hpp,cpp}
  - structures for handling tracing domains and ops
- Update config/config.{hpp,cpp}
  - buffer_instance struct
  - callback_tracing_service struct
  - buffer_tracing_service struct
  - config struct
  - allocate_{config,buffer} func
  - {validate,start,stop}_config funcs
  - get_registered_configs func
  - get_active_configs func
  - get_buffers func
- Update rocprofiler.cpp
  - Implement rocprofiler_create_context
  - Implement rocprofiler_start_context
  - Implement rocprofiler_stop_context
  - Implement rocprofiler_context_is_active
  - Implement rocprofiler_context_is_valid
  - Implement rocprofiler_flush_buffer
  - Implement rocprofiler_destroy_buffer
  - Implement rocprofiler_create_buffer
- Update lib/rocprofiler/hsa
  - use rocprofiler_tracer_activity_domain_t instead of rocprofiler_tracer_activity_domain_t
  - remove ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API fromHSA_API_INFO_DEFINITION_* macros
- Update lib/rocprofiler/context/domain.*
  - fixes for domain_info (i.e. use correct enums)
  - update rocprofiler_status_t codes
  - fix template instantiations
- Update lib/rocprofiler/context/context.*
  - use rocprofiler_service_callback_tracing_kind_t instead of rocprofiler_tracer_activity_domain_t
  - rename correlation_context to correlation_tracing_service
  - fix domains in callback_tracing_service and buffer_tracing_service
  - unique_ptr for callback_tracer and buffered_tracer in context
- Update lib/rocprofiler/rocprofiler.cpp
  - implement rocprofiler_configure_callback_tracing_service
- Update lib/rocprofiler/hsa/ostream.hpp
  - include rocprofiler.h instead of tracer.hpp
- Update lib/rocprofiler/hsa
  - migration to use rocprofiler_hsa_api_callback_tracer_data_t instead of rocprofiler_hsa_trace_data_t
  - restructure hsa_api_impl<Idx>
    - remove phase_enter and phase_exit
    - add set_data_args (partial replacement for phase_enter)
    - functor handles the contexts
- Update lib/rocprofiler/rocprofiler.cpp
  - implement rocprofiler_get_version
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
  - remove hsa_api_ prefix for functions already in hsa namespace
- Update lib/rocprofiler/context/context.{hpp,cpp}
  - add client_idx to context struct (tool identifier)
  - add push_client function to set client_idx before context is allocated
  - add pop_client function to remove client identifier from future context creations
  - implemented {registered,active}_contexts and buffers to use new container::reserve_size overload to stable_vector
  - fix implementation of start_context
  - fix implementation of stop_context
- Update lib/rocprofiler/rocprofiler.cpp
  - prevent context creation, buffer creation, pc sampling config, etc. after initialization
  - add nullptr checks to rocprofiler_context_is_valid
  - fix rocprofiler_configure_callback_tracing_service
    - was checking size of buffers, not registered context
  - implement rocprofiler_iterate_callback_trace_kind_names
  - implement rocprofiler_iterate_callback_trace_kind_operation_names
- Update lib/rocprofiler/CMakeLists.txt
  - add registration.{hpp,cpp} to rocprofiler-library target sources
- Update lib/rocprofiler/hsa/utils.hpp
  - fix using fmt::formt with const char* strings
  - remove join functions (no longer used)
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
  - remove args_string function
  - remove named_args_string function
  - update iterate_args function
    - change callback type
    - accept user data
  - rework the hsa_api_impl<Idx>::functor function
    - save the rocprofiler_callback_tracing_record_t between callbacks
  - update update_table function
    - check buffered_tracer domains
  - remove comments
- Update lib/rocprofiler/hsa/defines.hpp
  - remove MEMBER_<N> macros
  - add ADDR_MEMBER_<N> macros
  - remove doxygen comments for GET_MEMBER_FIELDS
  - add GET_ADDR_MEMBER_FIELDS
  - update HSA_API_INFO_DEFINITION_{0,V}
    - rename domain_idx to callback_domain_idx
    - add buffered_domain_idx
    - add as_arg_addr function
- Update lib/rocprofiler/rocprofiler.cpp
  - implement rocprofiler_iterate_callback_trace_operation_args
- Remove lib/rocprofiler/tracing.{hpp,cpp} and lib/rocprofiler/CMakeLists.txt
  - unused
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
  - support buffered tracing in hsa_api_impl<Idx>::functor
  - rocprofiler_callback_trace_operation_args_cb_t -> rocprofiler_callback_tracing_operation_args_cb_t
    - i.e. trace -> tracing
- Update lib/rocprofiler/context/context.{hpp,cpp}
  - removed buffer_instance struct
  - removed allocate_buffer function
  - removed get_buffers function
  - changed buffer_tracing_service::buffer_array_t
- Update lib/rocprofiler/hsa: hsa.cpp, ostream.hpp, details folder
  - move ostream.hpp into details folder to prevent from contributing to code coverage
  - update cmake build system for new directory

* Add lib/rocprofiler/registration.{hpp,cpp}

- implements rocprofiler_set_api_table (called by rocprofiler-register)
- miscellaneous functions for client configure/initialize/finalize
- functions for querying the init/fini status
- relocated OnLoad HSA workaround to this file
  - at present, this is used to workaround ROCr not having rocprofiler-register integration yet
- implement rocprofiler_force_configure function
- implement rocprofiler_is_initialized function
- implement rocprofiler_is_finalized function
- ensure configure functions only invoked once
- ensure internal thread creation notification functions are invoked
- get_status is pair of atomics
- fix heap-use-after-free in init_logging
- update finalize
  - invoke hsa_shut_down
  - set all active contexts to null pointers

* Add lib/rocprofiler/buffer_tracing.cpp

- contains implementations of buffer_tracing (i.e. rocprofiler/buffer_tracing.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp

* Add lib/rocprofiler/buffer.{hpp,cpp}

- contains implementations of buffer (i.e. rocprofiler/buffer.h) and misc internal access functions
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp and lib/rocprofiler/context/context.{hpp,cpp}

* Add lib/rocprofiler/callback_tracing.cpp

- contains implementations of callback_tracing (i.e. rocprofiler/callback_tracing.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp

* Add lib/rocprofiler/context.cpp

- contains implementations of context public API functions (i.e. rocprofiler/context.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp

* Add lib/rocprofiler/internal_threading.{hpp,cpp}

- contains implementations of internal_threading (i.e. rocprofiler/internal_threading.h)
- also contains implementations of internal access functions
- update finalize function
  - join all task groups and destroy all thread pools first, then reset unique_ptr

* Update lib/rocprofiler/rocprofiler.cpp

- rocprofiler_get_version returns status
- implement rocprofiler_get_timestamp
- remove misc implementations that were split into other files

* Update lib/rocprofiler/CMakeLists.txt

- compile new implementation files
  - buffer.cpp
  - buffer_tracing.cpp
  - callback_tracing.cpp
  - context.cpp
  - internal_threading.cpp

* Update lib/tests/buffering/buffering-*.cpp

- update to reflect changes to rocprofiler_record_header_t

* Update CMakeLists.txt

- increase minimum cmake version to 3.21 which added HIP support as a language

* Add samples/apps/transpose

- simple HIP application for testing

* Add samples/api_callback_tracing

- HIP application and tool library
- This effectively demos how to setup HSA API tracing
  - For each function called in tool, it stores the func/file/line and prints it during finalization
- client.hpp and client.cpp are the tool library
- Implement use of rocprofiler_iterate_callback_trace_operation_args
- add demo of using rocprofiler_get_version
- add_test
  - remove PASS_REGULAR_EXPRESSION
    - causing false passes during memcheck
  - add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment
- check if rocprofiler is initialized before stopping context

* Add samples/api_buffered_tracing

- Sample demonstrating tracing the HSA API via buffering
- demo rocprofiler_record_header_compute_hash
- throw exceptions for unexpected buffer data
- add_test
  - remove PASS_REGULAR_EXPRESSION
    - causing false passes during memcheck
  - add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment

* Update samples/CMakeLists.txt

- add subdirectory for api_callback_tracing
- add subdirectory api_buffered_tracing

* Update samples/pc_sampling/common.h

- fix processing of headers

* Update lib/rocprofiler/hsa/details/ostream.hpp

- fix data race on HSA_depth_max_cnt and recursion
- HSA_depth_max_cnt and recursion is now thread-local static instead of global static
- replace std::string usage with std::string_view

* Actions update

- add dependabot.yml
- use actions/checkout@v4
- install latest libasan and libtsan in sanitizer containers

* Add PTL (Parallel Tasking Library) submodule

[ROCm/rocprofiler-sdk commit: d3eaacd610]
2023-09-20 19:32:02 -05:00