39db6d842fc1c398082ba94de18b81cbc0eab156
22 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
97b7a6315d |
update copyright date to 2025 (#102)
* Update LICENSE
* Update conf.py
* Update copyright year
* [fix] Update copyright year
* Update copyright year "ROCm Developer Tools"
* Add license headers to c++ files
* Add license to *.py
* Update licenses in rocdecode sources
---------
Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
e307b89ca4 |
rocDecode API Tracing Support (#49)
* rocDecode API Tracing support
* Test bin file added to rocdecode. Need to add validate python methods
* Added option to not make rocDecode tests
* Added rocdecode and rocprofv3 tests
* Added csv test
* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependency. Changed cmake option to disable tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI
* Add option to avoid building rocdecode tests
* Added option to avoid building rocdecode bin file
* Merge conflict error
* CMake files changed in response to review comments. Attempting to implement callbacks.
* Turned off test building for rocdecode
* Minor fixes for review comments
* Review comments
* Updated formatting
* Document changes and format.hpp reversion. Need to remove iterate args support for now for later update.
* Remove iterate args support
* Remove iterate-args
* enforce abi versioning in macro if
* Fix doc error
* removed spaces to fix indentation error
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com> |
||
|
|
00c46fd5e5 |
SDK: OMPT Support (#22)
* Ability to select alternative compiler per file
Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.
Misc updates
Update OpenMP target sample
- samples/ompt -> samples/openmp_target
- fix sample test of openmp-target
- reorganize files
Rework OpenMP implementation
Minor OpenMP implementation cleanup
Rename samples/openmp_target CMake targets
Add tests/bin/openmp
- OpenMP target test app in tests/bin/openmp/target
Format samples/openmp_target CMakeLists.txt
Misc lib/rocprofiler-sdk/openmp cleanup
- fix includes
- convert_arg
Update openmp.def.cpp
- tweak includes
- remove lots of temporary variables
Update samples
- common::get_callback_id_names() -> common::get_callback_tracing_names()
- add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample
Fix code object operation names
- add "CODE_OBJECT_" prefix
Update include/rocprofiler-sdk/openmp/api_id.h
- remove spurious comment
Miscellaneous openmp updates
- similar API for openmp_begin and openmp_end
- move implementations of ompt callbacks to openmp.cpp
- ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events
[SWDEV-484495] Fix int truncation in CSV output (#1098)
CSV output truncates doubles to ints when it shouldn't. Derived metrics
are (mostly) doubles and lose precision (or become worthless) if treated
as an int. Converted these to double to match the format we return from
rocprof-sdk.
Co-authored-by: Benjamin Welton <ben@amd.com>
Update limit for max counter records in rocprof-tool (#1073)
A fixed sized std::array is used to store counter records in rocprofiler SDK. This limit was breached in SWDEV-484742. Upping the limit to 512 to be less likely to reach this limit again.
adding proxy ompt_data_t * arguments
fixes for proxy pointers
- Implement proxy ompt_data_t* pointers for clients
- Add ompt_data_t* arguments back to callback API
- Modify openmp sample to illustrate use of proxy pointers
formatting
SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083)
Fixing some accumulate metrics (#1089)
* Fixing some accumulate metrics
* Fixing some more accumulate metrics
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
updating rocprofv3 help options (#1113)
* updating rocprofv3 help options
* updating CHANGELOG
Fixing installed package tests in CI (#1119)
* Fixing installed package tests in CI
* Formatted rocprofv3.py with black formatter
SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112)
* SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests.
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Adding backlog for codeobj changes
* Formatting
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
---------
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
SWDEV-487621: Fixes for metric definitions (#1118)
* Fixes for metric definitions
* Removing gfx8
* Update changelog
* Fixing unit tests
* Small fixes
* Fix for write size
Fix PSDB change (#1120)
Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit
|
||
|
|
249c50fc40 |
Runtime Initialization Tracing (#1105)
* Runtime initialization tracing
- callbacks and buffer entries notifying when a runtime has been initialized
* Minor cleanup to registration.cpp
* JSON tool implementation
* Increase perfetto_reader timeout
* Handle perfetto_reader timeout when attr doesn't exist
* clang-tidy fixes to memory_allocation.cpp |
||
|
|
3bd7773cf7 |
Memory Allocation Tracking (#1142)
* Initial commit: Need to implement wrapper function to collect data and test that wrapper function is correctly replacing core HSA functions
* Attempted to implement wrapper implementation for hsa memory allocation functions. Need to modify generate record files and test if implementation is working as expected
* Debugging and implementing generateCSV function
* Memory allocation size and starting address outputted to csv and json file formats
* Formatting
* Initial setup for OTF2 and Perfetto generation
* Collecting agent id for memory_allocation and formatting
* Modified memory_allocation.cpp to set up code for AMD_EXT commands
* Support for memory_pool_allocate added
* Removed accidentally added file
* Made flag optional and added more OTF2 and Perfetto code. Needs testing to ensure perfetto and OTF2 works
* Formatting
* Fixed perfetto and otf2 output
* Fixed flag issue due to incorrect buffer use
* Updated documentation
* Small cleaning and comments
* Added test for HSA memory allocation tracing
* Fixed summary test validation errors due to allocation tracing. Added type to location_base to create unique event ids for allocation due to OTF2 trace error
* Decreased lower limit of hip calls for test
* Modified summary tests to vary number of allocate requests
* Minor fixes to address comments. Still need to address OTF2 comments
* Fix docs and changed OTF2 to use enum for type specified in location_base construction
* Fixed schema error
* Added vmem command tracking. Need to add test
* Updated test to work with vmem command and updated generateCSV to output int instead of hex string.
* OTF2 enum update and misspelling fix
* CI does not support Virtual Memory API. Removed vmem test. Will add back if CI is modified to support vmem API
* Update CMakeLists.txt for memory allocation test
* Updated summary test
* Minor fixes to address comments
* Moved domain_type.hpp enum to before LAST
* Fixed compile errors and formatting
* Fixed stats summary domain name error
* Added rocprofv3 test
* Page migration test fix
* Undo page migration test changes. Failures do not appear to have to do with memory allocation |
||
|
|
62e0a9c1a3 |
SDK: OMPT Support part 1: include file and print formatters for OMPT support (#1175)
* include file and print formatters for OMPT support
* Apply suggestions from code review
* Remove rocprofiler_ompt_set_callbacks
* Reorder ROCPROFILER_EXTERNAL_CORRELATION_REQUEST_OPENMP
---------
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
8c1382fceb |
Package RCCL headers to support adding RCCL support w/o installed headers (#1075)
- in ROCm CI, rocprofiler-sdk gets built before RCCL is installed, this is a workaround for this issue |
||
|
|
2a146259c7 |
Add support for RCCL tracing (#1047)
* [Draft]: Add support for RCCL tracing
Address comments
* [Draft]: Add support for RCCL tracing
Address PR comments, changes from RCCL upstream
* Add RCCL library table registration
Working on adding support to rocprofiler-register
* Support compilation w/o <rccl/amd_detail/api_trace.h>
- dummy api_trace.h header
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED when RCCL does not have api_trace.h header
* RCCL API tracing tool support
- add to rocprofv3
- add to json-tool
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
bb25376480 |
Misc API cleanup and consistency fixes (#1023)
- ROCPROFILER_API after function
- use rocprofiler_tracing_operation_t in lieu of uint32_t where appropriate
- rocprofiler_tracing_operation_t is now int32_t typedef (formerly uint32_t)
- use const T* instead of T* where appropriate |
||
|
|
4d5b71b0e7 |
Update logging (#838)
* Update logging
* Remove unused function
* Fix lib/rocprofiler-sdk/hsa/pc_sampling.cpp logging compilation
* Fix logging FLAGS_vmodule string leak and numerical log level
* Update logging
* Update glog submodule
* Leak fixes
* format |
||
|
|
fd3d97287c |
Page migration reporting (#651)
* Page migration reporting support
* Page migration: Update parser and reporting
Container does not have latest KFD header, so CI might fail
* Add kfd_ioctl.h
* Formatting
* Update get_key
- get key was not used (and shouldn't be), so delete it
* clang-tidy fixes
* Tests for page migration
* Apply suggestions from code review
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update tests/bin/page-migration/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update page-migration test app
- add hipHostRegister to register mmap'ed allocation with HIP
- misc cleanup and reorg
- remove HSA_XNACK=1 from test env
* Update lib/rocprofiler-sdk/tests/page_migration.cpp
- fix compilation error
* Minor updates (reorg, rename)
* Page migration reporting support
* Page migration: Update parser and reporting
Container does not have latest KFD header, so CI might fail
* Update page migration tests, fix trigger types
* Page Migration Tracing Support Refactoring (#753)
* Reorganization
* Update page migration init/fini
* Formatting
* Update page_migration.cpp
- change logging severity
* Skip test if KFD does not support page migration reporting
* Rework skipping test if KFD does not support page migration
* Fix event trigger enum values
* Fix clang-diagnostic-unused-const-variable
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> |
||
|
|
56030018dc |
Callback tracing for kernel dispatches + External correlation ID request service (#682)
* Support ROCPROFILER_CALLBACK_TRACING_KERNEL_DISPATCH
* Fix doxygen
* Update callback tracing
- temporary hacks for kind operation name and iterate kind operations
* Update source/include/rocprofiler-sdk
- introduce sequence id for kernel dispatches
* Update lib/rocprofiler-sdk (seq id)
- support sequence id passing
* Update tests (seq id)
- testing for sequence ids
* Cleanup include/rocprofiler-sdk/fwd.h
* Misc cleanup
* External Correlation ID Request Service (#699)
* External correlation ID request service
- callback requesting an external correlation ID instead of fetching from top of pushed external correlation ID stack
* Update external correlation id request support
- pass internal correlation ID in callback
- async copy generates a correlation ID if none already exists
- added external correlation ID request support for scratch memory tracing
- updated scratch memory tracing to use tracing:: functions
* Update hsa/queue.hpp
- new line at EOF
* Misc tweaks
- remove unnecessary logging in agent.cpp
- correlation_id::add_ref_count check for retirement
- finalization check in HSA queue AsyncSignalHandler
* Improve assertion failure logging in misc tests
* Update include/rocprofiler-sdk/fwd.h
- remove rocprofiler_record_counter_header_t
* Move lib/rocprofiler-sdk/tracing.hpp into lib/rocprofiler-sdk/tracing/ folder
* Update lib/rocprofiler-sdk/hsa/*
- hsa::get_hsa_status_string
- queue_info_session.hpp header
- rocprofiler_packet.hpp
* Update lib/rocprofiler-sdk/{counters,hip,marker}
- execute_phase_exit_callbacks tweaks
- queue_info_session tweaks
* Move rocprofiler_kernel_dispatch_operation_t to include/rocprofiler-sdk/fwd.h
* Update rocprofiler_buffer_tracing_kernel_dispatch_record_t
- add operation field and thread_id field
* Add lib/rocprofiler-sdk/kernel_dispatch
- enum <-> string mapping for kernel dispatch
- tracing implementations
* Update lib/rocprofiler-sdk/CMakeLists.txt
- tracing and kernel dispatch sub-directories
* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp
- invoke rocprofiler::kernel_tracing functions
* Update tests/common/serialization.hpp
- support operation and thread_id fields for rocprofiler_buffer_tracing_kernel_dispatch_record_t
* Update tests/tools/json-tool.cpp
- use external correlation id request service
* Rename sequence_id to dispatch_id |
||
|
|
4fa165ec1a |
Add support for scratch reporting (#523)
* Add ToolsApiTable
Add ToolsApiTable wrapping for scratch memory tracking
* Add initial support for scratch memory tracking
Buffering is implemented
* cmake formatting (cmake-format) (#525)
Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>
* source formatting (clang-format v11) (#524)
Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>
* Add callback tracing for scratch
Fixed the error where scratch tracking init was called irrespective of whether any client requested for it
* Apply suggestions from code review
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
* Fix tools api copy/update
Table were saved/updated incorrectly in previous commit. Also adds passing user data through the callback
* Fix OpKind sequence for scratch tracking
Previously scratch was using OpKind from rocprofiler-sdk, but templates were instantiated using API ID. These differ by 1
* Integration tests for scratch reporting
Added buffer and callback integration tests for scratch reporting
* source formatting (clang-format v11) (#550)
Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>
* cmake formatting (cmake-format) (#551)
Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>
* python formatting (black) (#549)
Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>
* CI fixes
* source formatting (clang-format v11) (#554)
Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>
* Update api
Rebase on main and updates based on PR feedback
* Update scratch reporting and address PR comments
- Added agent id to buffer records
- Updated `test_internal_correlation_ids`
- Is almost identical to one in async-copy
- Updated scratch test to check for agent id
- Updated queue id serialization in callback records (prints handle as nested key)
- Remove `marker_api_traces` from scratch `test_internal_correlation_ids` validation test
- Rename `amd_tools_api` to `scratch_memory`
- Added doxygen comments
- Remove scratch callback from `tool.cpp`
- Replace assert with `LOF_IF` in `scratch_memory.cpp`
* Update tools table
Changed to match up with changes to hsa tables in main branch
* Rework scratch memory structure
* Update tests
- Added suggestions from PR review, and updated tests accordingly
* Misc cleanup
* Update scratch test
As of Apr 4th, `hsa_amd_agent_set_async_scratch_limit` is disabled. Note,
> This API: `hsa_amd_agent_set_async_scratch_limit` is currently
> disabled. We need some changes in CP firmware to be able to do this
> and these changes are not ready yet.
> With the current code, you will also not get notifications for
> alternate-scratch allocations because this feature has been disabled
> while CP firmware is making additional changes
> We are hoping to have that feature enabled by ROCm-6.3
* Minor update to lib/rocprofiler-sdk/internal_threading.*
- delay destruction of shared_ptrs of the tasks to prevent rare (but possible) data race on the destruction of the shared_ptr
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
875f53b608 |
Correlation ID Retirement + misc (#527)
* Correlation ID Retirement
- include/rocprofiler-sdk/buffer_tracing.h
- add rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- include/rocprofiler-sdk/fwd.h
- ROCPROFILER_BUFFER_TRACING_CORRELATION_ID_RETIREMENT
- lib/rocprofiler-sdk/buffer_tracing.cpp
- kind string for correlation id retirement
- lib/rocprofiler-sdk/buffer.hpp
- emplace returns bool
- lib/rocprofiler-sdk/registration.cpp
- pass lib_instance to copy_table functions
- lib/rocprofiler-sdk/context/context.*
- update correlation_id struct
- make ref_count private
- {get,add,sub}_ref_count() functions
- sub_ref_count() performs correlation id retirement
- use stack for "latest" thread-local correlation id
- lib/rocprofiler-sdk/hip/hip.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/hsa.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/marker/marker.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/async_copy.cpp
- migrate to new {get,add,sub}_ref_count() for correlation ids
- handle table instance in async_copy_init / async_copy_save
- lib/rocprofiler-sdk/hsa/queue.cpp
- migrate to new {get,add,sub}_ref_count() for correlation ids
- tweak to external correlation id mapping in WriteInterceptor
- tests/async-copy-tracing/validate.py
- check retired_correlation_ids
- tests/common/serialization.hpp
- support rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- tests/kernel-tracing/validate.py
- check retired_correlation_ids
- tests/common/CMakeLists.txt
- perfetto external project
- tests/common/perfetto.hpp
- perfetto categories + aliases
- add_perfetto_annotation
- metaprogramming helpers
- tests/tools/CMakeLists.txt
- link to tests-perfetto
- tests/tools/json-tool.cpp
- demangling functions
- serialization of marker API callback args
- reduce parallel bottleneck in tool_tracing_callback
- support correlation id retirement
- Multiple threads for buffers
- Support ROCPROFILER_TOOL_CONTEXTS_EXCLUDE env variable
- write_perfetto() function
* Update tests/rocprofv3/tracing/validate.py
- tweak test_hsa_api_trace
* Update PTL submodule
- fixes for data race during destruction of task
* Update lib/rocprofiler-sdk/buffer.*
- unique_buffer_vec_t uses std::unique_ptr instead of allocator::unique_static_ptr_t
* Reduce timeouts in counter collection samples [skip ci]
* Update tests/tools/json-tool.cpp
- tweak demangle(string_view, int*) -> demangle(string_view, int&)
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- move sub_ref_count() to later in async_copy_handler to delay retirement slightly more
|
||
|
|
3f39339926 |
API Tracing Overhaul (#437)
* Update include/rocprofiler-sdk/hsa/*
- split HSA API IDs into separate enumerations
- add support for finalize ext table
* Update include/rocprofiler-sdk/hip/*
- remove compiler_api_args.h
- rocprofiler_hip_api_args_t contains all for HIP runtime and HIP compiler
- ROCPROFILER_HIP_API_ID_ -> ROCPROFILER_HIP_RUNTIME_API_ID_
* Update include/rocprofiler-sdk/marker/table_api_id.h
- ROCPROFILER_MARKER_API_TABLE_ID_ -> ROCPROFILER_MARKER_TABLE_ID_
* Update include/rocprofiler-sdk/*/table_api_id.h
- table_api_id.h -> table_id.h
* Update include/rocprofiler-sdk/*/table_api_id.h
- table_api_id.h -> table_id.h
* Update include/rocprofiler-sdk/fwd.h
- ROCPROFILER_CALLBACK_TRACING_HSA_API split into 4 enum values:
- ROCPROFILER_CALLBACK_TRACING_HSA_CORE_API
- ROCPROFILER_CALLBACK_TRACING_HSA_AMD_EXT_API
- ROCPROFILER_CALLBACK_TRACING_HSA_IMAGE_EXT_API
- ROCPROFILER_CALLBACK_TRACING_HSA_FINALIZE_EXT_API
- ROCPROFILER_BUFFER_TRACING_HSA_API split into 4 enum values:
- ROCPROFILER_BUFFER_TRACING_HSA_CORE_API
- ROCPROFILER_BUFFER_TRACING_HSA_AMD_EXT_API
- ROCPROFILER_BUFFER_TRACING_HSA_IMAGE_EXT_API
- ROCPROFILER_BUFFER_TRACING_HSA_FINALIZE_EXT_API
- rocprofiler_callback_tracing_code_object_operation_t renamed to rocprofiler_code_object_operation_t (more consistent)
- doxygen updates
* Update include/rocprofiler-sdk/buffer_tracing.h
- improved doxygen comments
- removed unused rocprofiler_buffer_tracing_queue_scheduling_record_t
- removed unused rocprofiler_buffer_tracing_correlation_record_t
* Update include/rocprofiler-sdk/callback_tracing.h
- removed rocprofiler_callback_tracing_hip_compiler_api_data_t
- rocprofiler_hip_api_args_t and rocprofiler_hip_compiler_api_args_t were combined
- rocprofiler_hsa_api_retval_t and rocprofiler_hsa_compiler_api_retval_t were combined
* Update lib/rocprofiler-sdk/hsa/*
- utils.hpp
- formatters for hsa_ext_program_t and hsa_ext_control_directives_t
- defines.hpp
- removed variadic macros from lib/common/defines.hpp
- HSA_API_META_DEFINITION, HSA_API_INFO_DEFINITION_0, HSA_API_INFO_DEFINITION_V specialize on table id
- async_copy.cpp
- ROCPROFILER_HSA_API_ID_* -> ROCPROFILER_HSA_AMD_EXT_API_ID_*
- add table id to templates
- improve async_copy_fini
- hsa.hpp
- add hsa_table_id_lookup
- add hsa_domain_info
- add table id to templates
- add copy_table function
- hsa.cpp
- add table id to templates
- require hsa tables to be trivial and standard layout
- remove set_data_args specialization for hsa_amd_memory_async_copy_rect
- implement copy_table function
- hsa.def.cpp
- update enums
* Update lib/rocprofiler-sdk/hip/*
- defines.hpp
- use lib/common/defines.hpp
- add hip_table_id_lookup to HIP_API_TABLE_LOOKUP_DEFINITION
- hip.hpp
- hip_table_id_lookup
- template iterate_args on table id
- templated copy_table and update_table
- hip.cpp
- replaced api_id_bounds with hip_domain_info
- templated iterate_args on table id
- templated copy_table and update_table
* Update lib/rocprofiler-sdk/marker/*
- defines.hpp
- use lib/common/defines.hpp
- marker.cpp
- updated enums
- marker.def.cpp
- updated enums
* Update lib/rocprofiler-sdk/tests
- common.hpp
- ROCPROFILER_CALL_EXPECT
- callback_data_ext
- update get_callback_tracing_names with new enums
- update get_buffer_tracing_names with new enums
- external_correlation.cpp
- support new HSA API enums
- intercept_table.cpp
- use test/common.hpp
- update to new HSA API enums
- registration.cpp
- support new HSA API enums
- naming.cpp
- validation for all get_ids(), get_names(), name_by_id(), id_by_name(), etc.
* Update lib/common
- defines.hpp
- Move IMPL_DETAIL_FOR_EACH_NARG, GET_ADDR_MEMBER_FIELDS, and GET_NAMED_MEMBER_FIELDS here
- used by HSA, HIP, and Marker
- static_object.hpp
- is_trivial_standard_layout static constexpr member function
- suppress register_static_dtor when is_trivial_standard_layout
* Update lib/rocprofiler-sdk/hsa/code_object.*
- name_by_id
- id_by_name
- get_names
- get_ids
* Update lib/rocprofiler-sdk/registration.cpp
- Update rocprofiler_set_api_table for HSA
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- Update for new HSA enums
- Rework to use switch statement
- rocprofiler_query_callback_tracing_kind_operation_name
- rocprofiler_iterate_callback_tracing_kind_operations
- rocprofiler_iterate_callback_tracing_kind_operation_args
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- Update for new HSA enums
- Rework to use switch statement
- rocprofiler_query_buffer_tracing_kind_operation_name
- rocprofiler_iterate_buffer_tracing_kind_operations
* Update lib/rocprofiler-sdk-tool
- helper.cpp
- update get_buffer_id_names with new enums
- update get_callback_id_names with new enums
- tools.cpp
- update to use new HSA enums
* Update samples/common
- added call_stack.hpp
- source_location struct
- call_stack_t alias
- print_call_stack function
- added name_info.hpp
- utils for getting buffer/callback domain and operation names
* Update samples/api_buffered_tracing/client.cpp
- use samples/common/call_stack.hpp
- use samples/common/name_info.hpp
- update for new HSA enums
* Update samples/api_callback_tracing/client.cpp
- use samples/common/call_stack.hpp
- use samples/common/name_info.hpp
- update for new HSA enums
* Update tests/tools/json-tool.cpp
- update for new HSA enums
* Update tests/rocprofv3/tracing/validate.py
- update for new HSA domain names
* Update samples/counter_collection/main.cpp
- reduce number of kernels to 50,000 since 200,000 causes issues with thread sanitizer
|
||
|
|
9efafc4d23 |
Split ROCTx API tables and update intercept table API (#421)
* Update include/rocprofiler-sdk
- buffer_tracing.h
- fix doxygen for rocprofiler_buffer_tracing_hip_api_record_t
- update doxygen for rocprofiler_buffer_tracing_marker_api_record_t
- remove unused marker_id field
- fwd.h
- Split ROCPROFILER_CALLBACK_TRACING_MARKER_API into ROCPROFILER_CALLBACK_TRACING_MARKER_{CORE,CONTROL,NAME}_API
- Split ROCPROFILER_BUFFER_TRACING_MARKER_API into ROCPROFILER_BUFFER_TRACING_MARKER_{CORE,CONTROL,NAME}_API
- split rocprofiler_runtime_library_t into rocprofiler_runtime_library_t and rocprofiler_intercept_table_t
- after split of ROCTx into 3 tables, specifying rocprofiler_at_internal_thread_create became confusing
* Update include/rocprofiler-sdk-roctx/api_trace.h
- Split into three tables: core, control, and name
- core: what it sounds like
- control: functions for controlling the profiler
- name: functions for giving resources names
* Update lib/rocprofiler-sdk-roctx/roctx.cpp
- modifications following split into multiple tables
* Update lib/rocprofiler-sdk/marker/*
- modifications following split of ROCTx API into multiple intercept tables
* Update lib/rocprofiler-sdk/tests
- common.hpp
- add enums to get_callback_tracing_names() and get_buffer_tracing_names()
- intercept_table.cpp
- update test to use rocprofiler_intercept_table_t (and enums) instead of rocprofiler_runtime_library_t
- update OR combos tested
- roctx.cpp
- updates following split of ROCTx API table into multiple tables
- use simplified specification of control API
* Update lib/rocprofiler-sdk
- buffer_tracing.cpp
- Updates for ROCPROFILER_BUFFER_TRACING_MARKER_{CORE,CONTROL,NAME}_API enum values
- callback_tracing.cpp
- Updates for ROCPROFILER_CALLBACK_TRACING_MARKER_{CORE,CONTROL,NAME}_API enum values
- intercept_table.hpp
- notify_runtime_api_registration -> notify_intercept_table_registration
- intercept_table.cpp
- updates for new rocprofiler_intercept_table_t enum and new ROCTx tables
- registration.cpp
- updates for new rocprofiler_intercept_table_t enum and new ROCTx tables
- updates for notify_runtime_api_registration -> notify_intercept_table_registration
* Update lib/rocprofiler-sdk-tool
- helper.cpp
- Updates for new enums in get_callback_id_names() and get_buffer_id_names()
- tool.cpp
- migrate to new enums for split ROCTx tables
- use simplified split for control table vs. core+name tables
* Update samples/{api_callback_tracing,intercept_table}
- intercept_table/client.cpp
- rocprofiler_runtime_library_t -> rocprofiler_intercept_table_t
- api_callback_tracing/client.cpp
- Updates for new enums in get_callback_id_names()
- use simplified split for control table vs. core+name tables
- migrate to new enums for split ROCTx tables
* Update tests
- rocprofv3/tracing/validate.py
- handle new marker domain names
- tools/json-tool.cpp
- Updates for new enums in get_callback_id_names() and get_buffer_id_names()
- use simplified split for control table vs. core+name tables
- migrate to new enums for split ROCTx tables
* Update tests/rocprofv3/tracing/CMakeLists.txt
- fix FAIL_REGULAR_EXPRESSION for rocprofv3-test-trace-execute
* Update lib/rocprofiler-sdk-tool/{output_file,tool}.*
- logging in output_file dtor
- support stdout/stderr
* Update lib/common/container/record_header_buffer.hpp
- reduce probability of is_empty() returning true while emplace is happening
* Update lib/rocprofiler-sdk-tool/tool.cpp
- logging for buffered_tracing_callback
- counter collection uses CSV encoder
* Update bin/rocprofv3
- remove -i flag from help menu
|
||
|
|
9a8b6f6b7b |
Counter API and Samples Updates (#410)
* Update include/rocprofiler-sdk/{counters,profile_config}.h
- use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update samples
- use rocprofiler-sdk::rocprofiler-sdk instead of rocprofiler::rocprofiler in cmake
- api_callback_tracing sample roctxProfiler{Pause,Resume}
- api_callback_tracing sample uses ROCTx
- updates to use rocprofiler_agent_id_t
* Update run-ci.py
- exclude rocprofiler-sdk-tool from samples (no sample uses that code)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- Update rocprofiler_iterate_agent_supported_counters to use agent ID
* Update lib/rocprofiler-sdk/counters/core.*
- profile_config has pointer to agent instead of copy
* Update lib/rocprofiler-sdk/agent.*
- provide get_agent(...) func via rocp agent id
* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED for enums missing implementation
* Update lib/rocprofiler-sdk/counters.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update lib/rocprofiler-sdk/profile_config.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update source/docs
- requirements.txt + install reqs in cmake
* Bump version to 0.1.0
* Update samples/api_callback_tracing/CMakeLists.txt
- LD_LIBRARY_PATH for test
* Update test/rocprofv3/tracing/CMakeLists.txt
- reorder validation files so memory copy comes first
* Update lib/rocprofiler-sdk-tool/tool.cpp
- logging for flushing buffers
- variables for buffer_size and buffer_watermark
- increase the watermark to a full buffer
- use dedicated threads for each buffer
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- test sets ROCPROF_LOG_LEVEL and ROCPROFILER_LOG_LEVEL to info
* Remove lib/rocprofiler-sdk-tool/trace_buffer.hpp
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- drop log level to warning when leak sanitizer is enabled (produces small memory leak)
|
||
|
|
c641749fe6 |
HIP API Tracing (#357)
* Update include/rocprofiler-sdk/hip*
- updates for intercept table
* Update lib/common/units.hpp
- clang-tidy fixes
* Add lib/rocprofiler-sdk/hip
- tracing implementation for the HIP intercept table
* Update source/lib/rocprofiler-sdk/CMakeLists.txt
- add_subdirectory(hip)
* Update source/lib/rocprofiler-sdk/hsa
- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION
* Update lib/rocprofiler-sdk/hip
- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/hsa/utils.hpp
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/tests/intercept_table.cpp
- remove failures for intercepting HIP API tables
* Update include/rocprofiler-sdk/fwd.h
- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args
* Update lib/rocprofiler-sdk/intercept_table.cpp
- support HipDispatchTable and HipCompilerDispatchTable
* Update lib/rocprofiler-sdk/internal_threading.cpp
- Support ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/registration.cpp
- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging
* Update samples/api_{buffered,callback}_tracing
- Modifications to demonstrate HIP API tracing
* Update tests/kernel-tracing
- Modifications to handle/test HIP API tracing
* Separate HIP tracing from HIP compiler tracing
* Fix installation of include/rocprofiler-sdk/hip/*
- add compiler and table headers to install
* Fixes to HIP interception
- hip_api_trace.hpp was updated a bit
- removed hipGetDeviceProperties (generic)
- added hipGetDevicePropertiesR0600
- added hipGetDevicePropertiesR0000
- removed hipRegisterTracerCallback
- reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
- added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers
* Update lib/rocprofiler-sdk/hip/hip.*
- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update lib/rocprofiler-sdk/hsa/hsa.*
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update test/kernel-tracing/validate.py
- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register
* Update tests/tools/json-tool.cpp
- fix context associated with "HIP_API_CALLBACK"
* Update external/CMakeLists.txt
- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
- BUILD_TESTING (OFF)
- BUILD_SHARED_LIBS (OFF)
- BUILD_OBJECT_LIBS (OFF)
- BUILD_STATIC_LIBS (ON)
- CMAKE_POSITION_INDEPENDENT_CODE (ON)
- CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
- CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog
* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt
- remove explicit setting of SKIP_BUILD_RPATH
* Update CMakeLists.txt
- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH
* Update tests/CMakeLists.txt
- include(GNUInstallDirs)
* Update samples/CMakeLists.txt
- include(GNUInstallDirs)
* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h
- remove extern "C" due to incompatibility between an empty struct in C (size 0) and an empty struct in C++ (size 1)
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- clang-tidy fixes
* Update cmake/rocprofiler_linting.cmake
- add a feature for clang tidy exe
* Update lib/rocprofiler-sdk/hip/hip.cpp
- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- fix merge
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- fix merge
* Update bin/rocprofv3
- args for marker, HIP runtime, and HIP compiler tracing
* Update tests/apps/simple-transpose
- use roctx
* Update tests/rocprofv3/tracing
- validate marker API data
* Update lib/rocprofiler-sdk-tool
- support for HIP runtime, HIP compiler, marker API
* Update queue/queue_controller/registration/utility
- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
- implements a thread yield + sleep
- add a sync function to Queue class
- add an iterate_queues member function to QueueController
- this is used to sync each queue during queue_controller_fini()
* Fix data races: queue/context/stable_vector
- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array
* Update lib/rocprofiler-sdk/hsa/hsa.*
- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables
* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp
- use HSA subtable accessors
* Update rocprofiler_memcheck and CI workflow
- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
- GCC 13 uses libtsan.so.2
* Update CI workflow
* Update lib/rocprofiler-sdk/counters/{metrics,counters}
- fix possibly dangling reference to a temporary from gcc-13
* Update thread-sanitizer-suppr.txt
- Ignore data races originating in hsa-runtime library
* Update cmake/rocprofiler_memcheck.cmake
- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library
* Update tests/rocprofv3/tracing/CMakeLists.txt
- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test
* Update lib/common/container/record_header_buffer.hpp
- fix data race identified by gcc v13 and libtsan.so.2
* Update hip API id, args, and def
- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0)
* Update lib/common/container/record_header_buffer.hpp
- fix deadlock in save/read/reset
* Update source/docs/CMakeLists.txt
- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- remove overloads for HIP_MEMSET_NODE_PARAMS
* Update docs/CMakeLists.txt
- use find_program for shell instead of hardcoded /bin/bash
|
||
|
|
1f4cf1aa39 |
Tools update (#397)
* Srnagara/tool counters collect (#331)
* Adding counter collection capability to tools
* Adding counter collection feature to tools
* Adding counter collection capability to tools
* Fixing merge down issues
* Small tool fixes for build + prevent profile realloc
* Reproducing the counter name query issue in buffered callback
* Minor fix for init order + sample that directly uses sdk-tool for debug purposes
* Adding a temporary fix to print the counter names
* Fixing the output file name and reverting the changes of caching the profile config
* Fixing SGPR_Count value
* cleaning up debug prints
* Adding header to counter collection file
* Adding kernel filtering support
* Remove threading
* Cleaning up the code
* Removing redundant prints
* Revert "Remove threading"
This reverts commit 05c58fb9de826e92cf8d2e3d1c31d5578525dcb4.
* Revert "Cleaning up the code"
This reverts commit 1d964882bf2396dee8ad020cbb6c83b36e0674e9.
* Changing the tools code to align with init-order fix
* cmake formatting (cmake-format) (#335)
Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>
* source formatting (clang-format v11) (#336)
Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>
* Adding support for async memory copy
* source formatting (clang-format v11) (#391)
Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>
* Fixing header typo
* Fixing tool_fini
* Replacing the direction and kind fields values with description
* Update lib/rocprofiler-sdk-tool/helper.cpp
- Remove use of VLA
* Update lib/rocprofiler-sdk-tool/tool.cpp
- Formatting
* Migrate common/config.* to rocprofiler-sdk-tool
* Update lib/rocprofiler-sdk-tool/tool.cpp
- fix clang-tidy issues
* source formatting (clang-format v11) (#392)
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>
* Update lib/common/mpl.hpp
- is_string_type / is_string_type_impl for deducing if type is a string type
* Update include/rocprofiler-sdk/fwd.h
- ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_NONE starts at zero
* Update lib/rocprofiler-sdk/hsa/async_copy.*
- functions for operation ids and names
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- support iterating and getting names for ROCPROFILER_BUFFER_TRACING_MEMORY_COPY
* Update lib/rocprofiler-sdk-tool/config.*
- env ROCPROFILER_ prefix -> ROCPROF_ prefix
- add support for memory copy tracing, counter collection, etc.
* Update lib/rocprofiler-sdk-tool/helper.*
- removed TracerFlushRecord
- removed cxa_demangle (use one in common library)
- removed GetCounterNames (handled in config)
- removed GetKernelNames (handled in config)
* Add lib/rocprofiler-sdk-tool/output_file.*
- separate out get_output_stream function and output_file struct from tool.cpp
* Add lib/rocprofiler-sdk-tool/csv.hpp
- write_csv_entry automatically quotes strings
- csv_encoder struct enforces correct number of columns
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- add new files
* Update lib/rocprofiler-sdk-tool/tool.cpp
- update construction of output_file class
- add kernel_symbol_data for serializing kernel trace data
- use config instead of env lookups
- optimize counter collection profile config lookup/creation
* Update bin/rocprofv3
- rocprofv3 --help exits with 0 (as it should)
- command-line arg for memory copy tracing
- command-line arg for mangled kernels
- command-line arg for truncated kernels
- env ROCPROFILER_ prefix -> env ROCPROF_ prefix
* Update tests/async-copy-tracing/validate.py
- update test_async_copy_direction to new enum values
* Update tests/kernel-tracing/validate.py
- update test_async_copy_direction to new enum values
* Update tests/tools/json-tool.cpp
- add ROCPROFILER_BUFFER_TRACING_MEMORY_COPY to supported buffer_name_info
* Update samples/counter_collection/{CMakeLists.txt,main.cpp}
- remove counter-collection-sdk-tool
* Update .github/workflows/docs.yml
- fix paths triggering running the workflow
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>
* adding counter collection support
* Adding counter collection test
* changing directory structure of counter collection tests
* Fixing test path for rocprofv3
* Adding hsa-tracing basic test
* cmake formatting (cmake-format) (#362)
Co-authored-by: bgopesh <bgopesh@users.noreply.github.com>
* counter collection tests drop2
* fixing hsa-trace test for rocprofv3 path
* python formatting (black) (#371)
Co-authored-by: bgopesh <bgopesh@users.noreply.github.com>
* both counter collection and tracing should work together
* Fixing rocprofv3 path
* Attempt to fix Segfault with AddressSanitizer
* fixing sanitizer segfault
* Update rocprofv3
* Update lib/rocprofiler-sdk-tool/README.md
- update env variables
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- return ROCPROFILER_STATUS_BUFFER_NOT_FOUND if buffer tracing service is configured with invalid buffer
* Update lib/rocprofiler-sdk-tool/tool.cpp
- designated hsa API trace buffer
* Update tests/hsa-tracing/CMakeLists.txt
- Fix environment
* Update rocprofv3
- do not override HSA_TOOLS_LIB
- support ROCPROF_PRELOAD
- LD_PRELOAD librocprofiler-sdk.so
* Restructure tests directory
- move all rocprofv3 integration tests into subfolder
* Update cmake/Templates/rocprofiler-sdk/config.cmake.in
- create rocprofiler-sdk::rocprofv3 cmake target
* Update tests/rocprofv3/hsa-tracing
- improve validate.py
- convert input to dict via csv.DictReader
* Update tests/apps/CMakeLists.txt
- fix build rpath for simple-transpose
* Update cmake/rocprofiler_memcheck.cmake
- prefer libtsan.so.0
* Update tests/rocprofv3/hsa-tracing
- move to tests/rocprofv3/tracing
- include kernel tracing and memory copy tracing
* Update lib/rocprofiler-sdk-tool/tool.cpp
- normalize "_ID" vs. "_Id" in CSV column names (use "_Id")
* Update lib/rocprofiler-sdk/buffer.{hpp,cpp}
- change signature of buffer::get_buffers()
- buffer::get_buffers() uses static_object
* Update lib/rocprofiler-sdk/context/context.cpp
- update usage of buffer::get_buffers()
- now returns pointer
* Update lib/rocprofiler-sdk/tests/buffer.cpp
- update to change for signature of buffer::get_buffers()
* Update tests/rocprofv3/tracing/CMakeLists.txt
- use %argt% with -d argument
* Update lib/rocprofiler-sdk-tool/tool.cpp
- use atexit for finalization
* Update tests/rocprofv3/tracing/CMakeLists.txt
- tweaked name of tests
* Update lib/rocprofiler-sdk/hsa/async_copy.*
- async_copy_fini + reference counting signals
* Update lib/rocprofiler-sdk/registration.cpp
- invoke hsa::async_copy_fini() to prevent data race on signals
---------
Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>
Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com>
Co-authored-by: bgopesh <bgopesh@users.noreply.github.com>
|
||
|
|
21dd088c8e |
ROCTx Library Tracing (#390)
* Update include/rocprofiler-sdk/marker/*
- Update rocprofiler_marker_api_args_t for all API functions
- Add ROCPROFILER_MARKER_API_ID_roctxGetThreadId to rocprofiler_marker_api_id_t
* Update include/rocprofiler-sdk/marker/api_args.h
- fix include
* Update lib/common/mpl.hpp
- is_pair
- is_type_complete_v
* Update include/rocprofiler-sdk/marker/*
- fix rocprofiler_marker_api_retval_t
- add roctxGetThreadId to rocprofiler_marker_api_args_t
- fix type in enum: HsaDevice -> HsaAgent
- add table_api_id.h
* Update include/rocprofiler-sdk/marker.h
- include marker/table_api_id.h
* Update include/rocprofiler-sdk/buffer_tracing.h
- Buffer marker tracer records have begin and end timestamp
* Add lib/rocprofiler-sdk/marker
- tracing implementation for marker (roctx) library
* Update include/rocprofiler-sdk/{buffer_tracing,marker/table_api_id}.h
- rocprofiler_buffer_tracing_marker_record_t -> rocprofiler_buffer_tracing_marker_api_record_t
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- support for ROCPROFILER_BUFFER_TRACING_MARKER_API
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- support for ROCPROFILER_CALLBACK_TRACING_MARKER_API
* Update lib/rocprofiler-sdk/intercept_table.cpp
- template instantiation for notify_runtime_api_registration
* Update lib/rocprofiler-sdk/registration.cpp
- enable roctx in rocprofiler_set_api_table
* Update lib/rocprofiler-sdk/marker/marker.cpp
- rocprofiler_buffer_tracing_marker_record_t -> rocprofiler_buffer_tracing_marker_api_record_t
* Update lib/rocprofiler/tests for roctx testing
- add roctx.cpp
- unit tests for roctx callback and buffer tracing
- support marker API in get_{buffer,callback}_tracing_names()
* Update lib/common/logging.cpp
- logging initialized message mentions env variable
* Update lib/common/mpl.hpp
- NOLINT for misc-definitions-in-headers
* Update lib/rocprofiler-sdk/tests/CMakeLists.txt
- include LD_LIBRARY_PATH in rocprofiler-lib-tests-shared tests
* Update lib/rocprofiler-sdk/registration.cpp
- client_library_vec_t is now vector of option<client_library>
- enables resetting the client_library after finalization
- removed acquiring registration lock when invoke_client_finalizers called via atexit
- this was causing some lock-order-inversion warnings (potential deadlock)
* Update lib/rocprofiler-sdk/agent.cpp
- model name for agent supports spaces
* Update tests/common/serialization.hpp
- add serialization support for marker tracing data structures
* Update tests/apps
- Add ROCTx markers into reproducible-runtime and transpose
* Update tests/tools/json-tools.cpp
- add marker tracing support
- remove strdup (no longer necessary)
* Update tests/kernel-tracing/validate.py
- validate marker API tracing data
* Update tests/async-copy-tracing/validate.py
- validate marker API tracing data
* Update cmake for load path resolution during testing
* Update tests/async-copy-tracing/CMakeLists.txt
- fix test LD_LIBRARY_PATH
* Update cmake/Templates/rocprofiler-sdk-roctx/config.cmake.in
- fix constructing rocprofiler-sdk-roctx::rocprofiler-sdk-roctx
|
||
|
|
199f0b5421 |
Contexts update + buffer flushing + cleanup (#338)
* Update lib/rocprofiler-sdk/context/context.*
- get_registered_contexts functions (local copy)
* Update lib/rocprofiler-sdk/hsa/{queue,queue_controller}.cpp
- remove ROCPROFILER_BUFFER_TRACING_MEMORY_COPY code
* Update tests/kernel-tracing/kernel-tracing.cpp
- move stop() and flush() in tool_fini to before reporting of sizes of data collected
* Update lib/rocprofiler-sdk/hsa/hsa.*
- remove stale set_callback / activity_functor_t code
* Update lib/rocprofiler-sdk/buffer.cpp
- full wait instead of returning busy when buffer is busy
- use task_group::join instead of task_group::wait to fully wait for tasks to finish (bug fix)
* Update lib/rocprofiler-sdk/agent.cpp
- support agent mapping for CPU agents
* Remove direct access to vector of registered contexts
|
||
|
|
9a0c84efa6 |
Use -sdk suffix and reset VERSION to 0.0.0 (#263)
* Fix find_package(rocprofiler) in build tree
* Move include/rocprofiler to include/rocprofiler-sdk
* Update include/CMakeLists.txt
- add_subdirectory(rocprofiler-sdk)
* Move lib/rocprofiler to lib/rocprofiler-sdk
* Move lib/rocprofiler-tool to lib/rocprofiler-sdk-tool
* Update lib/CMakeLists.txt
- add_subdirectory(rocprofiler-sdk)
- add_subdirectory(rocprofiler-sdk-tool)
* Update lib/rocprofiler-sdk/CMakeLists.txt
* Rename rocprofiler-tool to rocprofiler-sdk-tool
* Replace include rocprofiler/ with include rocprofiler-sdk/
* Replace include lib/rocprofiler/ with include lib/rocprofiler-sdk/
* Set VERSION to 0.0.0 and finish install to rocprofiler-sdk
* More fixes for rocprofiler -> rocprofiler-sdk
- fix issue with rocprofiler-sdk-config.cmake.in
- fix counters xml install path
* Fix documentation generation
* Create rocprofiler_LIB_ROCPROFILER_SDK_DIR for build tree
* cmake formatting (cmake-format) (#264)
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
|