Commit graph

16 Commits

Autor SHA1 Nachricht Datum
Rawat, Swati 97b7a6315d update copyright date to 2025 (#102)
* Update LICENSE

* Update conf.py

* Update copyright year

* [fix] Update copyright year

* Update copyright year "ROCm Developer Tools"

* Add license headers to c++ files

* Add license to *.py

* Update licenses in rocdecode sources

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 19:11:20 -06:00
Jonathan R. Madsen 60b1dbfb6f Update HIP API tracing (#958)
- support HipDispatchTable additions for HIP_RUNTIME_API_TABLE_STEP_VERSION 1 thru 4
2024-07-08 17:12:53 -05:00
Jonathan R. Madsen b62ba5f096 Remove fatal error when callback and buffer tracing API in one context (#952)
- one context for callback and buffer tracing of same API produces erroneous fatal error -- this is a valid use case
2024-06-25 02:51:41 -05:00
Jonathan R. Madsen 5525b400c3 Miscellanous AFAR 5 Updates (#891)
* Dispatch table copy/update uses ROCP_TRACE instead of ROCP_INFO

* Update rocprofiler-sdk CMake config

- rocprofiler::rocprofiler is alias to rocprofiler-sdk::rocprofiler-sdk instead of other way around

* Prefer rocprofiler-sdk::rocprofiler-sdk over rocprofiler::rocprofiler

* Fix WITH_UNWIND for glog

- requires a value of "none" instead of boolean now

* Update include/rocprofiler-sdk/registration.h

- explicit struct names to permit forward decl

* Update include/rocprofiler-sdk/cxx/serialization.hpp

- ROCPROFILER_SDK_CEREAL_NAMESPACE_BEGIN and ROCPROFILER_SDK_CEREAL_NAMESPACE_END to enable customized namespace
2024-05-29 16:45:56 -05:00
Jonathan R. Madsen 92b7326910 Adding JSON support (#860)
* Adding json support

minor bugs

Fixing tests

Fixing formatting issues

Fixing test

test fix

Misc testing fixes

Use rocprofiler/cxx/name_info in rocprofiler-sdk-tool

fixes to reduce the Json file size

Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/tool.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/tool.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/helper.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/helper.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/generateJSON.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/tool.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/tool.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/helper.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/tool.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

misc fixes

Removing int cast for JSON tests

formatting

removing a condition test on Navi3

adding debug info

Misc fix

* CSV updates

- fix stats
- numerical formatter support for customizing write_csv_entry
- misc formatting
- get_marker_stats_file

* Misc tests/rocprofv3/counter-collection/input2 fixes

- rocprofiler_configure_pytest_files in rocprofv3/counter-collection/input2
- removed state code from merge in rocprofv3/counter-collection/input2

* Tool: "Agent-id" -> "Agent_Id"

- consistency

* Tool update

- remove rocprofiler_tool_marker_record_t
- add marker_tracing_kind_conversion
- fix memory leak in write_json
- minor update to get_output_stream
- rework handling of marker records

* Update tests/pytest-packages/pytest_utils/__init__.py

- add collapse_dict_list function for converting a dictionary value that is a list of length one into a directly mapped value

* Update tests/rocprofv3/**/conftest.py

- use collapse_dict_list when reading in JSONs

* Update tests/rocprofv3/counter-collection/input1/validate.py

- relax testing requirements gfx1102 (AQLProfile bugs)
  - in addition to relaxed testing requirements for gfx1101

* Update tests/rocprofv3/tracing/validate.py

- fix removal of PID in every marker record

* Update tests/rocprofv3/tracing-plus-cc

- remove test design that relies on iterating subdirectories

* Wrapper around __libc_start_main

- Ensures finalization happens before main returns
- Update tests/rocprofv3/tracing/validate.py
  - wrapper around __libc_start_main changed roctx calls

* Combine include/rocprofiler-sdk/cxx/serialization.hpp and include/rocprofiler-sdk/external/serialization.hpp

- tests/common/serialization.hpp simply includes include/rocprofiler-sdk/cxx/serialization.hpp now

* Update lib/rocprofiler-sdk/hip/hip.cpp

- tracing function immediately returns when fini_status is non-zero

* Update lib/rocprofiler-sdk/hsa/hsa.cpp

- remove logging of tracing function when fini_status is non-zero

* Update lib/rocprofiler-sdk-tool/CMakeLists.txt

- remove rocprofv3_trigger_list_metrics.cpp from TOOL_SOURCES

* Update tests/rocprofv3/tracing-plus-cc/CMakeLists.txt

- fix depends

* Domain statistics

* Update tests/rocprofv3/tracing-plus-cc/CMakeLists.txt

- do not set ROCP_LOG_LEVEL in env

* Remove erroneous <bits/utility.h> include

* Restructure tool source + reduce tool table + support multiple formats

- buffered_output struct for handling output
- support multiple output formats, e.g. --output-format csv,json
- rename buffer_type_t -> domain_type
- simplified generation of CSV output files
- removed rocprofiler_tool_marker_record_t

* Update lib/common/container/ring_buffer.hpp

- value_type alias in ring_buffer<Tp>

* Remove all but one json-execute tests

- generate CSV and JSON in same run

* Fix include for domain_type.cpp

* Update tests/rocprofv3/tracing-plus-cc/input.txt

- only specify counters which can be found on gfx8, gfx9, gfx10, gfx11, etc.
- use :device= syntax

* Update lib/rocprofiler-sdk-tool/config.cpp

- support :device=N syntax for counters file
- improve stripping comments in PMC files
- only read after pmc:

* Rework tool library counter collection

- fatal error if all requested counters for device are not found
- support :device= syntax

* Update tests/rocprofv3/tracing-plus-cc/input.txt

- removed L2CacheHit (not supported on mi300)

* Disable JSON tests in tests/rocprofv3

* Update include/rocprofiler-sdk/cxx/serialization.hpp

- support rocprofiler_record_dimension_info_t

* Update tool JSON schema

- remove domain_type::CODE_OBJECT
- rocprofiler_tool_agent_v0_t
  - rocprofiler_agent_v0_t + counters
- rocprofiler_tool_counter_info_t
- get_code_object_data()

* Update JSON schema for tool

* Update lib/rocprofiler-sdk-tool/tool.cpp

- fix ROCP_WARNING_IF

* rocprofv3 -> rocprofv3.sh

- install rocprofv3.sh into sbin
- configure_file <source-tree>/rocprofv3.sh -> <binary-tree>/bin/rocprofv3

* Update tool counter collection

- rocprofiler_tool_record_counter_t
- rocprofiler_tool_counter_collection_record_t

* Update tests/rocprofv3/counter-collection/input1/CMakeLists.txt

- use rocprofiler_configure_pytest_files for validate.py, conftest.py, and input.txt

* Update tests/rocprofv3/counter-collection/input1/validate.py

- re-enable test_validate_counter_collection_pmc1_json

* Update tests/rocprofv3/counter-collection/input2/validate.py

- remove unused code

* Update tests/rocprofv3/counter-collection/input2/validate.py

- remove unused code

* Update tests/rocprofv3/hsa-queue-dependency/validate.py

- re-enable JSON tests

* Misc tests/rocprofv3 CMake updates

* Update tests/rocprofv3/tracing/validate.py

- re-enable JSON tests

* Update tests/rocprofv3/tracing-hip-in-libraries/validate.py

- re-enable JSON tests

* Update tests/rocprofv3/tracing/validate.py

- remove unused node_exists function

* Update tests/rocprofv3/tracing/validate.py

- fix test_marker_api_trace_json

---------

Co-authored-by: Sriraksha Nagaraj <Sriraksha.Nagaraj@amd.com>
2024-05-22 00:53:42 -05:00
Jonathan R. Madsen 4d5b71b0e7 Update logging (#838)
* Update logging

* Remove unused function

* Fix lib/rocprofiler-sdk/hsa/pc_sampling.cpp logging compilation

* Fix logging FLAGS_vmodule string leak and numerical log level

* Update logging

* Update glog submodule

* Leak fixes

* format
2024-05-20 15:38:18 -05:00
Jonathan R. Madsen 56030018dc Callback tracing for kernel dispatches + External correlation ID request service (#682)
* Support ROCPROFILER_CALLBACK_TRACING_KERNEL_DISPATCH

* Fix doxygen

* Update callback tracing

- temporary hacks for kind operation name and iterate kind operations

* Update source/include/rocprofiler-sdk

- introduce sequence id for kernel dispatches

* Update lib/rocprofiler-sdk (seq id)

- support sequence id passing

* Update tests (seq id)

- testing for sequence ids

* Cleanup include/rocprofiler-sdk/fwd.h

* Misc cleanup

* External Correlation ID Request Service (#699)

* External correlation ID request service

- callback requesting an external correlation ID instead of fetching from top of pushed external correlation ID stack

* Update external correlation id request support

- pass internal correlation ID in callback
- async copy generates a correlation ID if none already exists
- added external correlation ID request support for scratch memory tracing
- updated scratch memory tracing to use tracing:: functions

* Update hsa/queue.hpp

- new line at EOF

* Misc tweaks

- remove unnecessary logging in agent.cpp
- correlation_id::add_ref_count check for retirement
- finalization check in HSA queue AsyncSignalHandler

* Improve assertion failure logging in misc tests

* Update include/rocprofiler-sdk/fwd.h

- remove rocprofiler_record_counter_header_t

* Move lib/rocprofiler-sdk/tracing.hpp into lib/rocprofiler-sdk/tracing/ folder

* Update lib/rocprofiler-sdk/hsa/*

- hsa::get_hsa_status_string
- queue_info_session.hpp header
- rocprofiler_packet.hpp

* Update lib/rocprofiler-sdk/{counters,hip,marker}

- execute_phase_exit_callbacks tweaks
- queue_info_session tweaks

* Move rocprofiler_kernel_dispatch_operation_t to include/rocprofiler-sdk/fwd.h

* Update rocprofiler_buffer_tracing_kernel_dispatch_record_t

- add operation field and thread_id field

* Add lib/rocprofiler-sdk/kernel_dispatch

- enum <-> string mapping for kernel dispatch
- tracing implementations

* Update lib/rocprofiler-sdk/CMakeLists.txt

- tracing and kernel dispatch sub-directories

* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp

- invoke rocprofiler::kernel_tracing functions

* Update tests/common/serialization.hpp

- support operation and thread_id fields for rocprofiler_buffer_tracing_kernel_dispatch_record_t

* Update tests/tools/json-tool.cpp

- use external correlation id request service

* Rename sequence_id to dispatch_id
2024-04-11 19:49:49 -05:00
SrirakshaNag bef14ad1b2 rocprofiler-sdk-tool library intermediate binary output (#734)
* Support for binary temporary files

* clang formatting

* formating ring buffer.hpp

* Update source/lib/common/container/ring_buffer.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fixing bugs

* fix loop range

* Fix for v3 test failures

* bug fix

* fix bug

* fix memory leaks

* destructing agent_info

* Update CMakeLists.txt

* clang-tidy fixes

* Fix data race on destructor of rocprofiler_agent_t map in rocprofiler-sdk-tool library

* Create lib/rocproifler-sdk-tool/tmp_file.*

- move tmp_file class into separate header/implementation

* Agent Info CSV in rocprofiler-sdk-tool

- update tests to use agent_info.csv instead of rocminfo

* Update lib/rocprofiler-sdk-tool/tool.cpp

- use logical_node_id instead of node_id

* Adding stats file

* Adding tests for stats

* Update scratch memory support

- convert scratch memory support to use binary output

* Tool Update: scratch memory stats + extended statistics

- replace generate_*_csv with generate_csv overloads
- added generate_csv for scratch memory
- enable stats for scratch memory
- replace ROCPROF_*_STATS env variables with ROCPROF_STATS env variable

* rocprofv3 update

- simple --stats option
- add scratch memory trace to --sys-trace

* Update tests/rocprofv3/tracing-hip-in-libraries

- extend validate.py to test stats data
- fix conftest.py for memory_copy_stats_data

* Code coverage fixes

- invoke __gcov_dump to ensure that code coverage is flushed after finalization

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-04-09 05:25:28 -05:00
Benjamin Welton 41c0ddd72d Convert LOG() -> ROCP_X logging macros. (#695)
* Convert LOG() -> ROCP_X logging macros.

This patch converts the LOG() macro to the ROCP_X logging macros.
There are the following levels of logs.

Logs whos expressions are not evaluated unless the log level is enabled:

ROCP_TRACE - VLOG(2) (enabeled by env variable GLOG_v=2)
ROCP_INFO - VLOG(1) (enabeled by env variable GLOG_v=1)

Logs whos expressions are always evaluated:

ROCP_WARNING - LOG(WARNING)
ROCP_ERROR - LOG(ERROR)
ROCP_FATAL - LOG(FATAL)
ROCP_DFATAL - DLOG(FATAL) (only fatal in debug mode)

* source formatting (clang-format v11) (#696)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* Minor fix

* Fixes for VLOG before main

* fix vmodule

* source formatting (clang-format v11) (#718)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* memory leak fix

* Vlog change

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
2024-04-02 17:15:30 -07:00
Jonathan R. Madsen 2f9b1767e9 Handle hsa_queue_destroy after finalization (#679)
* Handle hsa_queue_destroy after finalization

- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue

* Update HIP/HSA/marker update_table logging

* Update rocprofv3 tests

- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them

* Disable thread sanitizer deadlock detection

* Update CI workflow

- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers

* Update run-ci.py

- set gcovr html medium and high threshold

* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp

- remove this capture from enable/disable serialization

* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*

- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map

* Logging for HIP/HSA/marker/profile_serializer

* Logging for HIP/HSA/marker/queue_controller

* Improve test_retired_correlation_ids asserts

* Fix tests/counter-collection/validate.py

- scale expected SQ_WAVES counter value based on warp size of GPU

* Tweak github comment for code coverage

* Remove gcovr html high/medium threshold args

* Fix tests/counter-collection/validate.py

- round before casting to int in test_counter_values

* operator bool for profile_serializer

- only wait on CV if profile_serializer is used

* Logging updates (profile_serializer + code_object)

* Update counter-collection validate.py

* QueueController does not wait on CV if finalizing/finalized

* Update CI workflow

- remove navi32 from core job

* Improve HIP/HSA/marker tracing get_functor/functor

- remove lambda wrapper around functor

* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp

- do not acquire cvmutex lock during finalization

* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*

- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized

* Update CI workflow

- remove navi32 runners

* bwelton fixes for hangs

* CMake improvements + simplified demangle

- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
2024-03-21 17:52:15 -05:00
Jonathan R. Madsen 8591ed1c96 Use small_vector for API iterate_args (#597)
* Use small_vector for API iterate_args

- replace dim3 value arguments with rocprofiler_dim3_t
  - dim3 has a non-trivial destructor
- common::mpl::unqualified_type
- common::stringified_argument_array_t<N> alias
- assert_public_data_type_properties()
- common::container::small_vector<T>::at function
- stringize returns small_vector<stringified_argument>
  - stack allocated vector
- remove has_pc_sampling condition (HSA, HIP)
  - this will be handled in queue interception

* Misc tweaks
2024-03-13 07:36:55 -05:00
Jonathan R. Madsen 1bb94add11 Fix rocprofiler_iterate_callback_tracing_kind_operation_args for HIP compiler callbacks (#532)
* Fix HIP compiler iterate args

- `include/rocprofiler-sdk/hip/api_args.h`
  - replace struct fields named "f" with "func"
  - replace hip stream fields named "hStream" with "stream"
- `lib/rocprofiler-sdk/callback_tracing.cpp`
  - iterate_args for HIP compiler table
- `lib/rocprofiler-sdk/registration.cpp`
  - fix warning about roctx num_tables
- `lib/rocprofiler-sdk/hip/hip.def.cpp`
  - replace struct fields named "f" with "func"
  - replace hip stream fields named "hStream" with "stream"
- `lib/rocprofiler-sdk/{hip,hsa,marker}/utils.hpp`
  - improve `stringize_impl`
- `lib/rocprofiler-sdk/hsa/code_object.cpp`
  - remove stale commented out code
- `lib/rocprofiler-sdk/hsa/queue_controller.*`
  - destory_queue -> destroy_queue
- `tests/tools/json-tool.cpp`
  - improve parallelism in tool_tracing_callback
  - serialize the marker api args
  - only invoke rocprofiler_iterate_callback_tracing_kind_operation_args in exit phase
- `samples/counter_collection/CMakeLists.txt`
  - reduce timeout on tests to 120 seconds

* Update lib/rocprofiler-sdk/hsa/utils.hpp

- disable dereference of double pointer in stringize_impl

* Update lib/common

- indirection_level in mpl.hpp
- stringize_arg.hpp

* Rework rocprofiler_iterate_callback_tracing_kind_operation_args

- provide more information in rocprofiler_callback_tracing_operation_args_cb_t
- support specifying the dereference level to account for output paramters
2024-03-01 01:46:07 -06:00
Jonathan R. Madsen a1267e1fd2 C compatibility for public headers (#566)
* C compatibility for public headers

- add tests/tools/c-tool.c
  - builds a tool (which does nothing) with C language
  - ensures that tool can be compiled in C
- add tests/c-tool/CMakeLists.txt
  - ensures that tool library build from C is a valid tool
- rocprofiler_counter_info_v0_t is_derived is int instead of bool
  - C does not have bool unless <stdbool.h> is included
- add `include/rocprofiler-sdk/hsa/api_trace_version.h
  - handles providing HSA_*_TABLE_(MAJOR|STEP)_VERSION values if compiled from C
- cmake define in version.h.in for ROCPROFILER_HSA_*_TABLE_(MAJOR|STEP)_VERSION
  - HSA table versions compiled with
- use rocprofiler_(hsa|hip|marker)_api_no_args struct to handle incompatibility b/t empty structs in C vs. C++ (size of 0 vs. size of 1)
- extern "C" in include/rocprofiler-sdk/{hsa,hip,marker}/api_args.h
- fixed spelling error: derrived -> derived
- scope YY_NO_INPUT compile definition to lib/rocprofiler-sdk/counters/parser/*

* Revert CDash dashboard
2024-02-29 23:49:54 -06:00
Jonathan R. Madsen 875f53b608 Correlation ID Retirement + misc (#527)
* Correlation ID Retirement

- include/rocprofiler-sdk/buffer_tracing.h
  - add rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- include/rocprofiler-sdk/fwd.h
  - ROCPROFILER_BUFFER_TRACING_CORRELATION_ID_RETIREMENT
- lib/rocprofiler-sdk/buffer_tracing.cpp
  - kind string for correlation id retirement
- lib/rocprofiler-sdk/buffer.hpp
  - emplace returns bool
- lib/rocprofiler-sdk/registration.cpp
  - pass lib_instance to copy_table functions
- lib/rocprofiler-sdk/context/context.*
  - update correlation_id struct
    - make ref_count private
    - {get,add,sub}_ref_count() functions
      - sub_ref_count() performs correlation id retirement
    - use stack for "latest" thread-local correlation id
- lib/rocprofiler-sdk/hip/hip.*
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - return in iterate_args
  - handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/hsa.*
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - return in iterate_args
  - handle table instance in copy_table
- lib/rocprofiler-sdk/marker/marker.*
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - return in iterate_args
  - handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/async_copy.cpp
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - handle table instance in async_copy_init / async_copy_save
- lib/rocprofiler-sdk/hsa/queue.cpp
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - tweak to external correlation id mapping in WriteInterceptor
- tests/async-copy-tracing/validate.py
  - check retired_correlation_ids
- tests/common/serialization.hpp
  - support rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- tests/kernel-tracing/validate.py
  - check retired_correlation_ids
- tests/common/CMakeLists.txt
  - perfetto external project
- tests/common/perfetto.hpp
  - perfetto categories + aliases
  - add_perfetto_annotation
  - metaprogramming helpers
- tests/tools/CMakeLists.txt
  - link to tests-perfetto
- tests/tools/json-tool.cpp
  - demangling functions
  - serialization of marker API callback args
  - reduce parallel bottleneck in tool_tracing_callback
  - support correlation id retirement
  - Multiple threads for buffers
  - Support ROCPROFILER_TOOL_CONTEXTS_EXCLUDE env variable
  - write_perfetto() function

* Update tests/rocprofv3/tracing/validate.py

- tweak test_hsa_api_trace

* Update PTL submodule

- fixes for data race during destruction of task

* Update lib/rocprofiler-sdk/buffer.*

- unique_buffer_vec_t uses std::unique_ptr instead of allocator::unique_static_ptr_t

* Reduce timeouts in counter collection samples [skip ci]

* Update tests/tools/json-tool.cpp

- tweak demangle(string_view, int*) -> demangle(string_view, int&)

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- move sub_ref_count() to later in async_copy_handler to delay retirement slightly more
2024-02-23 10:30:33 -06:00
Jonathan R. Madsen 3f39339926 API Tracing Overhaul (#437)
* Update include/rocprofiler-sdk/hsa/*

- split HSA API IDs into separate enumerations
- add support for finalize ext table

* Update include/rocprofiler-sdk/hip/*

- remove compiler_api_args.h
- rocprofiler_hip_api_args_t contains all for HIP runtime and HIP compiler
- ROCPROFILER_HIP_API_ID_ -> ROCPROFILER_HIP_RUNTIME_API_ID_

* Update include/rocprofiler-sdk/marker/table_api_id.h

- ROCPROFILER_MARKER_API_TABLE_ID_ -> ROCPROFILER_MARKER_TABLE_ID_

* Update include/rocprofiler-sdk/*/table_api_id.h

- table_api_id.h -> table_id.h

* Update include/rocprofiler-sdk/*/table_api_id.h

- table_api_id.h -> table_id.h

* Update include/rocprofiler-sdk/fwd.h

- ROCPROFILER_CALLBACK_TRACING_HSA_API split into 4 enum values:
  - ROCPROFILER_CALLBACK_TRACING_HSA_CORE_API
  - ROCPROFILER_CALLBACK_TRACING_HSA_AMD_EXT_API
  - ROCPROFILER_CALLBACK_TRACING_HSA_IMAGE_EXT_API
  - ROCPROFILER_CALLBACK_TRACING_HSA_FINALIZE_EXT_API
- ROCPROFILER_BUFFER_TRACING_HSA_API split into 4 enum values:
  - ROCPROFILER_BUFFER_TRACING_HSA_CORE_API
  - ROCPROFILER_BUFFER_TRACING_HSA_AMD_EXT_API
  - ROCPROFILER_BUFFER_TRACING_HSA_IMAGE_EXT_API
  - ROCPROFILER_BUFFER_TRACING_HSA_FINALIZE_EXT_API
- rocprofiler_callback_tracing_code_object_operation_t renamed to rocprofiler_code_object_operation_t (more consistent)
- doxygen updates

* Update include/rocprofiler-sdk/buffer_tracing.h

- improved doxygen comments
- removed unused rocprofiler_buffer_tracing_queue_scheduling_record_t
- removed unused rocprofiler_buffer_tracing_correlation_record_t

* Update include/rocprofiler-sdk/callback_tracing.h

- removed rocprofiler_callback_tracing_hip_compiler_api_data_t
  - rocprofiler_hip_api_args_t and rocprofiler_hip_compiler_api_args_t were combined
  - rocprofiler_hsa_api_retval_t and rocprofiler_hsa_compiler_api_retval_t were combined

* Update lib/rocprofiler-sdk/hsa/*

- utils.hpp
  - formatters for hsa_ext_program_t and hsa_ext_control_directives_t
- defines.hpp
  - removed variadic macros from lib/common/defines.hpp
  - HSA_API_META_DEFINITION, HSA_API_INFO_DEFINITION_0, HSA_API_INFO_DEFINITION_V specialize on table id
- async_copy.cpp
  - ROCPROFILER_HSA_API_ID_* -> ROCPROFILER_HSA_AMD_EXT_API_ID_*
  - add table id to templates
  - improve async_copy_fini
- hsa.hpp
  - add hsa_table_id_lookup
  - add hsa_domain_info
  - add table id to templates
  - add copy_table function
- hsa.cpp
  - add table id to templates
  - require hsa tables to be trivial and standard layout
  - remove set_data_args specialization for hsa_amd_memory_async_copy_rect
  - implement copy_table function
- hsa.def.cpp
  - update enums

* Update lib/rocprofiler-sdk/hip/*

- defines.hpp
  - use lib/common/defines.hpp
  - add hip_table_id_lookup to HIP_API_TABLE_LOOKUP_DEFINITION
- hip.hpp
  - hip_table_id_lookup
  - template iterate_args on table id
  - templated copy_table and update_table
- hip.cpp
  - replaced api_id_bounds with hip_domain_info
  - templated iterate_args on table id
  - templated copy_table and update_table

* Update lib/rocprofiler-sdk/marker/*

- defines.hpp
  - use lib/common/defines.hpp
- marker.cpp
  - updated enums
- marker.def.cpp
  - updated enums

* Update lib/rocprofiler-sdk/tests

- common.hpp
  - ROCPROFILER_CALL_EXPECT
  - callback_data_ext
  - update get_callback_tracing_names with new enums
  - update get_buffer_tracing_names with new enums
- external_correlation.cpp
  - support new HSA API enums
- intercept_table.cpp
  - use test/common.hpp
  - update to new HSA API enums
- registration.cpp
  - support new HSA API enums
- naming.cpp
  - validation for all get_ids(), get_names(), name_by_id(), id_by_name(), etc.

* Update lib/common

- defines.hpp
  - Move IMPL_DETAIL_FOR_EACH_NARG, GET_ADDR_MEMBER_FIELDS, and GET_NAMED_MEMBER_FIELDS here
    - used by HSA, HIP, and Marker
- static_object.hpp
  - is_trivial_standard_layout static constexpr member function
  - suppress register_static_dtor when is_trivial_standard_layout

* Update lib/rocprofiler-sdk/hsa/code_object.*

- name_by_id
- id_by_name
- get_names
- get_ids

* Update lib/rocprofiler-sdk/registration.cpp

- Update rocprofiler_set_api_table for HSA

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- Update for new HSA enums
- Rework to use switch statement
  - rocprofiler_query_callback_tracing_kind_operation_name
  - rocprofiler_iterate_callback_tracing_kind_operations
  - rocprofiler_iterate_callback_tracing_kind_operation_args

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- Update for new HSA enums
- Rework to use switch statement
  - rocprofiler_query_buffer_tracing_kind_operation_name
  - rocprofiler_iterate_buffer_tracing_kind_operations

* Update lib/rocprofiler-sdk-tool

- helper.cpp
  - update get_buffer_id_names with new enums
  - update get_callback_id_names with new enums
- tools.cpp
  - update to use new HSA enums

* Update samples/common

- added call_stack.hpp
  - source_location struct
  - call_stack_t alias
  - print_call_stack function
- added name_info.hpp
  - utils for getting buffer/callback domain and operation names

* Update samples/api_buffered_tracing/client.cpp

- use samples/common/call_stack.hpp
- use samples/common/name_info.hpp
- update for new HSA enums

* Update samples/api_callback_tracing/client.cpp

- use samples/common/call_stack.hpp
- use samples/common/name_info.hpp
- update for new HSA enums

* Update tests/tools/json-tool.cpp

- update for new HSA enums

* Update tests/rocprofv3/tracing/validate.py

- update for new HSA domain names

* Update samples/counter_collection/main.cpp

- reduce number of kernels to 50,000 since 200,000 causes issues with thread sanitizer
2024-01-30 12:14:26 -06:00
Jonathan R. Madsen c641749fe6 HIP API Tracing (#357)
* Update include/rocprofiler-sdk/hip*

- updates for intercept table

* Update lib/common/units.hpp

- clang-tidy fixes

* Add lib/rocprofiler-sdk/hip

- tracing implementation for the HIP intercept table

* Update source/lib/rocprofiler-sdk/CMakeLists.txt

- add_subdirectory(hip)

* Update source/lib/rocprofiler-sdk/hsa

- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION

* Update lib/rocprofiler-sdk/hip

- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible

* Update lib/rocprofiler-sdk/hsa/utils.hpp

- stringize_impl print dereferenced pointers when possible

* Update lib/rocprofiler-sdk/tests/intercept_table.cpp

- remove failures for intercepting HIP API tables

* Update include/rocprofiler-sdk/fwd.h

- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args

* Update lib/rocprofiler-sdk/intercept_table.cpp

- support HipDispatchTable and HipCompilerDispatchTable

* Update lib/rocprofiler-sdk/internal_threading.cpp

- Support ROCPROFILER_HIP_COMPILER_LIBRARY

* Update lib/rocprofiler-sdk/registration.cpp

- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging

* Update samples/api_{buffered,callback}_tracing

- Modifications to demonstrate HIP API tracing

* Update tests/kernel-tracing

- Modifications to handle/test HIP API tracing

* Separate HIP tracing from HIP compiler tracing

* Fix installation of include/rocprofiler-sdk/hip/*

- add compiler and table headers to install

* Fixes to HIP interception

- hip_api_trace.hpp was updated a bit
  - removed hipGetDeviceProperties (generic)
  - added hipGetDevicePropertiesR0600
  - added hipGetDevicePropertiesR0000
  - removed hipRegisterTracerCallback
  - reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
  - added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers

* Update lib/rocprofiler-sdk/hip/hip.*

- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)

* Update lib/rocprofiler-sdk/hsa/hsa.*

- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)

* Update test/kernel-tracing/validate.py

- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register

* Update tests/tools/json-tool.cpp

- fix context associated with "HIP_API_CALLBACK"

* Update external/CMakeLists.txt

- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
  - BUILD_TESTING (OFF)
  - BUILD_SHARED_LIBS (OFF)
  - BUILD_OBJECT_LIBS (OFF)
  - BUILD_STATIC_LIBS (ON)
  - CMAKE_POSITION_INDEPENDENT_CODE (ON)
  - CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
  - CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog

* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt

- remove explicit setting of SKIP_BUILD_RPATH

* Update CMakeLists.txt

- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH

* Update tests/CMakeLists.txt

- include(GNUInstallDirs)

* Update samples/CMakeLists.txt

- include(GNUInstallDirs)

* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h

- remove extern "C" due to incompatibility b/t empty struct in C (size 0) vs. empty struct in C++ (size 1)

* Update lib/rocprofiler-sdk/hip/details/ostream.hpp

- clang-tidy fixes

* Update cmake/rocprofiler_linting.cmake

- add a feature for clang tidy exe

* Update lib/rocprofiler-sdk/hip/hip.cpp

- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- fix merge

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- fix merge

* Update bin/rocprofv3

- args for marker, HIP runtime, and HIP compiler tracing

* Update tests/apps/simple-transpose

- use roctx

* Update tests/rocprofv3/tracing

- validate marker API data

* Update lib/rocprofiler-sdk-tool

- support for HIP runtime, HIP compiler, marker API

* Update queue/queue_controller/registration/utility

- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
  - implements a thread yield + sleep
- add a sync function to Queue class
- add a iterate_queues member function to QueueController
  - this is used to sync each queue during queue_controller_fini()

* Fix data races: queue/context/stable_vector

- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array

* Update lib/rocprofiler-sdk/hsa/hsa.*

- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables

* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp

- use HSA subtable accessors

* Update rocprofiler_memcheck and CI workflow

- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
  - GCC 13 uses libtsan.so.2

* Update CI workflow

* Update lib/rocprofiler-sdk/counters/{metrics,counters}

- fix possibly dangling reference to a temporary from gcc-13

* Update thread-sanitizer-suppr.txt

- Ignore data races originating in hsa-runtime library

* Update cmake/rocprofiler_memcheck.cmake

- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library

* Update tests/rocprofv3/tracing/CMakeLists.txt

- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test

* Update lib/common/container/record_header_buffer.hpp

- fix data race identified by gcc v13 and libtsan.so.2

* Update hip API id, args, and def

- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0

* Update lib/common/container/record_header_buffer.hpp

- fix deadlock in save/read/reset

* Update source/docs/CMakeLists.txt

- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr

* Update lib/rocprofiler-sdk/hip/details/ostream.hpp

- remove overloads for HIP_MEMSET_NODE_PARAMS

* Update docs/CMakeLists.txt

- use find_program for shell instead of hardcoded /bin/bash
2024-01-24 16:32:54 -06:00