نمودار کامیت

17 کامیت‌ها

مولف SHA1 پیام تاریخ
Rawat, Swati 97b7a6315d update copyright date to 2025 (#102)
* Update LICENSE

* Update conf.py

* Update copyright year

* [fix] Update copyright year

* Update copyright year "ROCm Developer Tools"

* Add license headers to c++ files

* Add license to *.py

* Update licenses in rocdecode sources

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 19:11:20 -06:00
Jakaraddi, Manjunath 78d8f4b8ea SWDEV-492623: Hip Host Function to Device Symbols Mapping (#18)
* Adding changes to register and read symbols from the hip fat binary

* adding json output for host_functions

* added error handling

* adding json tool support

* Adding tests

* formatting changes

* Adding documentation

* refactoring as per amd-staging

* Adding intializers and changing macros

* Fix page-migration background thread on fork (#31)

* Fix page-migration background thread on fork

After falling off main in the forked child, all the children
try to join on on the parent's monitoring thread. This results
in a deadlock. Parent is waiting for the child to exit, but
the child is trying to join the parent's thread which is
signaled from the parent's static destructors.

Even with just one parent and child, due to copy-on-write
semantics, a child signalling the background thread to join
will still block (thread's updated state is not visible
in the child).

This fix creates background treads on fork per-child with a
pthread_atfork handler, ensuring that each child has its own
monitoring thread.

* Formatting fixes

* Detach page-migration background thread and update test timeout

* Attach files with ctest

* Update corr-id assert

* Tweak on-fork, simplify background thread

* Revert thread detach

* Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)

* Adding Trace Period feature to rocprofv3

* Adding feature documentation

* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format

* Moving to Collection Period and changing the input params

* Format Fixes

* Fixing rebasing issues

* Removing atomic include from the tool

* Adding more options for units, optimizing the code

* Fixing rocprofv3.py

* Fixing time conv & adding time controlled app

* Fixing format

* Changing to shared memory testing methodology

* use of shmem use

* Fix include headers for transpose-time-controlled.cpp

* Format upload-image-to-github.py

* Removing shmem and using only env var to dump timestamps from the tool

* Tool Fixes + Test Config

* Adding Tests

* Fixing Review comments

* Update trace period implementation

* Update trace period tests

* check between start and stop timestamps

* Merge Fix

* Update validate.py

* Improve safety of rocprofiler_stop_context after finalization

* Pass context id to collection_period_cntrl by value

* Adding 20 us error margin

* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- move error code check macros to implementation
- fix macros which check error code
- use constexpr values instead of #define

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- debugging for error that cannot be locally reproduced

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- improve error handling and logging

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- tweak to non-fatal logging messages

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- cleanup of logging messages

* Update host kernel symbol register data fields

* Update source/lib/rocprofiler-sdk/code_object/hip/code_object.hpp

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-12-06 11:42:37 +00:00
Jonathan R. Madsen b4dc4f8e92 JSON schema + JSON node vs. Perfetto category consistency (#869)
* Fix schema differences b/t JSON and perfetto

- "kernel_dispatches" (JSON) vs. "kernel_dispatch" (Perfetto) -> "kernel_dispatch"
- "scratch_api" (JSON) -> "scratch_memory"

* Update include/rocprofiler-sdk/cxx/serialization.hpp

- remove unnecessary includes causing circular dependency

* rocprofv3 docs

- Output formats
- JSON schema

* Spelling fix

* Update schema descriptions

* Improve assert log in test_perfetto_data
2024-05-22 20:36:27 -05:00
Jonathan R. Madsen 1f96593b4f Test using HIP Graphs (#835)
* Test using hip graphs

* Remove assert for api_end < async_end

* Update rocprofv3/tracing-hip-in-libraries::test_api_trace

* Update rocprofv3/tracing-hip-in-libraries::test_api_trace

* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout

* Update rocprofv3/tracing-hip-in-libraries::test_api_trace

* Remove submit retry

* Update rocprofv3/tracing-hip-in-libraries::test_api_trace

* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout

* Update lib/common/container/record_header_buffer.hpp

- minor tweaks

* Update lib/rocprofiler-sdk/buffer.hpp

- tweak ROCPROFILER_BUFFER_POLICY_LOSSLESS flush behavior

* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout

* Update rocprofv3/tracing-hip-in-libraries::test_api_trace

* Revert rocprofv3-test-trace-hip-in-libraries-validate timeout

* Update run-ci.py

- RETRY_COUNT set to zero
2024-05-07 15:10:22 -05:00
Jonathan R. Madsen 12c836f95f Async memory copy callback tracing + memory copy size (#791)
* Async memory copy tracing update

- rocprofiler_buffer_tracing_memory_copy_record_t: thread_id and bytes
- support ROCPROFILER_CALLBACK_TRACING_MEMORY_COPY
- init_public_api_struct can fully construct

* Testing for callback async copy tracing
2024-04-18 04:31:59 -05:00
Jonathan R. Madsen 32bc339789 Simplify json-tool JSON schema for callback records (#790)
- formerly, the rocprofiler_callback_tracing_record_t data was stored in itr["record"], e.g. itr["record"]["correlation_id"]
  - dropped "record" key, e.g. itr["correlation_id"]
2024-04-18 03:58:10 -05:00
Jonathan R. Madsen 07537b6231 rocprofiler_kernel_dispatch_info_t + header record for buffered counter collection (#758)
* Update include/rocprofiler-sdk

- defines.h
  - ROCPROFILER_VERSION_10_0 -> ROCPROFILER_SDK_VERSION_0_0
- fwd.h
  - rocprofiler_counter_record_kind_t
  - rocprofiler_kernel_dispatch_info_t
  - rocprofiler_record_counter_t
    - has dispatch id instead of correlation id
  - rocprofiler_counter_info_v0_t
    - added rocprofiler_counter_id_t field
    - added is_constant field
    - reordered better packing
- dispatch_profile.h
  - added rocprofiler_profile_counting_dispatch_record_t for use as a header record for rocprofiler_profile_counting_dispatch_data_t
- callback_tracing.h
  - rocprofiler_callback_tracing_kernel_dispatch_data_t uses rocprofiler_kernel_dispatch_info_t
- buffer_tracing.h
  - rocprofiler_buffer_tracing_kernel_dispatch_record_t uses rocprofiler_kernel_dispatch_info_t

* Update lib/rocprofiler-sdk/*

- transition to rocprofiler_kernel_dispatch_info_t
- set id and is_constant values for rocprofiler_counter_info_v0_t in rocprofiler_query_counter_info

* Update lib/rocprofiler-sdk-tool

- transition to rocprofiler_kernel_dispatch_info_t

* Update lib/rocprofiler-sdk/counters/tests/core.cpp

- transition to rocprofiler_kernel_dispatch_info_t

* Update samples

- transition to rocprofiler_kernel_dispatch_info_t
- transition to rocprofiler_counter_record_kind_t

* Update tests

- transition to rocprofiler_kernel_dispatch_info_t
- transition to rocprofiler_counter_record_kind_t
- improve integration test validation for counter-collection
- update serialization for new/additional types

* Fix tests/counter-collection/validate.py

- loosen restrictions on the length of counter description

* Update include/rocprofiler-sdk/buffer_tracing.h

- remove accidental packed attribute

* Update lib/rocprofiler-sdk/counters/xml/derived_counters.xml

- Add description for TCC_TAG_STALL_sum (reference: https://rocm.docs.amd.com/en/develop/conceptual/gpu-arch/mi300-mi200-performance-counters.html)

* Update tests/page-migration/validate.py
2024-04-12 17:30:34 -05:00
Jonathan R. Madsen 56030018dc Callback tracing for kernel dispatches + External correlation ID request service (#682)
* Support ROCPROFILER_CALLBACK_TRACING_KERNEL_DISPATCH

* Fix doxygen

* Update callback tracing

- temporary hacks for kind operation name and iterate kind operations

* Update source/include/rocprofiler-sdk

- introduce sequence id for kernel dispatches

* Update lib/rocprofiler-sdk (seq id)

- support sequence id passing

* Update tests (seq id)

- testing for sequence ids

* Cleanup include/rocprofiler-sdk/fwd.h

* Misc cleanup

* External Correlation ID Request Service (#699)

* External correlation ID request service

- callback requesting an external correlation ID instead of fetching from top of pushed external correlation ID stack

* Update external correlation id request support

- pass internal correlation ID in callback
- async copy generates a correlation ID if none already exists
- added external correlation ID request support for scratch memory tracing
- updated scratch memory tracing to use tracing:: functions

* Update hsa/queue.hpp

- new line at EOF

* Misc tweaks

- remove unnecessary logging in agent.cpp
- correlation_id::add_ref_count check for retirement
- finalization check in HSA queue AsyncSignalHandler

* Improve assertion failure logging in misc tests

* Update include/rocprofiler-sdk/fwd.h

- remove rocprofiler_record_counter_header_t

* Move lib/rocprofiler-sdk/tracing.hpp into lib/rocprofiler-sdk/tracing/ folder

* Update lib/rocprofiler-sdk/hsa/*

- hsa::get_hsa_status_string
- queue_info_session.hpp header
- rocprofiler_packet.hpp

* Update lib/rocprofiler-sdk/{counters,hip,marker}

- execute_phase_exit_callbacks tweaks
- queue_info_session tweaks

* Move rocprofiler_kernel_dispatch_operation_t to include/rocprofiler-sdk/fwd.h

* Update rocprofiler_buffer_tracing_kernel_dispatch_record_t

- add operation field and thread_id field

* Add lib/rocprofiler-sdk/kernel_dispatch

- enum <-> string mapping for kernel dispatch
- tracing implementations

* Update lib/rocprofiler-sdk/CMakeLists.txt

- tracing and kernel dispatch sub-directories

* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp

- invoke rocprofiler::kernel_tracing functions

* Update tests/common/serialization.hpp

- support operation and thread_id fields for rocprofiler_buffer_tracing_kernel_dispatch_record_t

* Update tests/tools/json-tool.cpp

- use external correlation id request service

* Rename sequence_id to dispatch_id
2024-04-11 19:49:49 -05:00
Jonathan R. Madsen 2f9b1767e9 Handle hsa_queue_destroy after finalization (#679)
* Handle hsa_queue_destroy after finalization

- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue

* Update HIP/HSA/marker update_table logging

* Update rocprofv3 tests

- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them

* Disable thread sanitizer deadlock detection

* Update CI workflow

- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers

* Update run-ci.py

- set gcovr html medium and high threshold

* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp

- remove this capture from enable/disable serialization

* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*

- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map

* Logging for HIP/HSA/marker/profile_serializer

* Logging for HIP/HSA/marker/queue_controller

* Improve test_retired_correlation_ids asserts

* Fix tests/counter-collection/validate.py

- scale expected SQ_WAVES counter value based on warp size of GPU

* Tweak github comment for code coverage

* Remove gcovr html high/medium threshold args

* Fix tests/counter-collection/validate.py

- round before casting to int in test_counter_values

* operator bool for profile_serializer

- only wait on CV if profile_serializer is used

* Logging updates (profile_serializer + code_object)

* Update counter-collection validate.py

* QueueController does not wait on CV if finalizing/finalized

* Update CI workflow

- remove navi32 from core job

* Improve HIP/HSA/marker tracing get_functor/functor

- remove lambda wrapper around functor

* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp

- do not acquire cvmutex lock during finalization

* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*

- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized

* Update CI workflow

- remove navi32 runners

* bwelton fixes for hangs

* CMake improvements + simplified demangle

- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
2024-03-21 17:52:15 -05:00
Jonathan R. Madsen 875f53b608 Correlation ID Retirement + misc (#527)
* Correlation ID Retirement

- include/rocprofiler-sdk/buffer_tracing.h
  - add rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- include/rocprofiler-sdk/fwd.h
  - ROCPROFILER_BUFFER_TRACING_CORRELATION_ID_RETIREMENT
- lib/rocprofiler-sdk/buffer_tracing.cpp
  - kind string for correlation id retirement
- lib/rocprofiler-sdk/buffer.hpp
  - emplace returns bool
- lib/rocprofiler-sdk/registration.cpp
  - pass lib_instance to copy_table functions
- lib/rocprofiler-sdk/context/context.*
  - update correlation_id struct
    - make ref_count private
    - {get,add,sub}_ref_count() functions
      - sub_ref_count() performs correlation id retirement
    - use stack for "latest" thread-local correlation id
- lib/rocprofiler-sdk/hip/hip.*
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - return in iterate_args
  - handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/hsa.*
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - return in iterate_args
  - handle table instance in copy_table
- lib/rocprofiler-sdk/marker/marker.*
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - return in iterate_args
  - handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/async_copy.cpp
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - handle table instance in async_copy_init / async_copy_save
- lib/rocprofiler-sdk/hsa/queue.cpp
  - migrate to new {get,add,sub}_ref_count() for correlation ids
  - tweak to external correlation id mapping in WriteInterceptor
- tests/async-copy-tracing/validate.py
  - check retired_correlation_ids
- tests/common/serialization.hpp
  - support rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- tests/kernel-tracing/validate.py
  - check retired_correlation_ids
- tests/common/CMakeLists.txt
  - perfetto external project
- tests/common/perfetto.hpp
  - perfetto categories + aliases
  - add_perfetto_annotation
  - metaprogramming helpers
- tests/tools/CMakeLists.txt
  - link to tests-perfetto
- tests/tools/json-tool.cpp
  - demangling functions
  - serialization of marker API callback args
  - reduce parallel bottleneck in tool_tracing_callback
  - support correlation id retirement
  - Multiple threads for buffers
  - Support ROCPROFILER_TOOL_CONTEXTS_EXCLUDE env variable
  - write_perfetto() function

* Update tests/rocprofv3/tracing/validate.py

- tweak test_hsa_api_trace

* Update PTL submodule

- fixes for data race during destruction of task

* Update lib/rocprofiler-sdk/buffer.*

- unique_buffer_vec_t uses std::unique_ptr instead of allocator::unique_static_ptr_t

* Reduce timeouts in counter collection samples [skip ci]

* Update tests/tools/json-tool.cpp

- tweak demangle(string_view, int*) -> demangle(string_view, int&)

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- move sub_ref_count() to later in async_copy_handler to delay retirement slightly more
2024-02-23 10:30:33 -06:00
Jonathan R. Madsen 0d939edbba Updates/fixes for CI, docs, tests, samples, and common library (#528)
- .github/workflows/continuous_integration.yml
  - apt-get update before apt-get install
  - remove libgtest-dev
  - actions-comment-pull-request: v2.4.3 -> v2.5.0
- .github/workflows/formatting.yml
  - create-pull-request: v5 -> v6
- cmake/rocprofiler_options.cmake
  - remove unused ROCPROFILER_DEBUG_TRACE and ROCPROFILER_LD_AQLPROFILE options
- samples/counter_collection/callback_client.cpp
  - corr_id field renamed to correlation_id
- samples/counter_collection/client.cpp
  - corr_id field renamed to correlation_id
- include/rocprofiler-sdk/fwd.h
  - In rocprofiler_record_counter_t: rename corr_id field to correlation_id
  - doxygen fixes
- lib/common/utility.*
  - remove get_accurate_clock_id_impl
  - timestamp_ns() defaults to CLOCK_BOOTTIME
- lib/rocprofiler-sdk/counters/core.cpp
  - fix spelling mistake: extrenal -> external
  - corr_id field renamed to correlation_id
- lib/rocprofiler-sdk-tool/tool.cpp
  - fix destruction of static tool::output_file before finalization
- scripts/update-docs.sh
  - define PROJECT_NAME
- tests/async-copy-tracing/validate.py
  - init_time and fini_time checks
  - hip_api_traces, marker_api_tracing
- tests/common/serialization.hpp
  - fix save function for rocprofiler_record_counter_t following rename of corr_id to correlation_id
- tests/kernel-tracing/validate.py
  - init_time and fini_time checks
  - relax test_total_runtime range
- tests/rocprofv3/tracing/CMakeLists.txt
  - remove -M from rocprofv3-test-systrace-execute
  - exclude test_hsa_api_trace in rocprofv3-test-systrace-validate due to HIP API tracing
- tests/rocprofv3/tracing/validate.py
  - update test_kernel_trace to accept mangled or demangled
- tests/tools/json-tool.cpp
  - remove use of GLOG
  - include init_time and fini_time
  - write_json(...) function
2024-02-22 00:16:43 -06:00
Jonathan R. Madsen aaff4976d2 Kernel Tracing Fix (#439)
* Update lib/rocprofiler-sdk/hsa/queue.cpp

- switch using the kernel_pkt.kernel_dispatch.completion_signal instead of interrupt signal for getting the dispatch time

* Update tests/kernel-tracing/validate.py

- add verification of total runtime collected in test_timestamps
  - the sum of the runtime of all the kernels in reproducible-runtime should be ~1 sec +/- 10%

* Remove include/rocprofiler-sdk/rocprofiler_plugin.h

* Update CI workflow

- update actions/cache@v3 -> v4
- actions/cache/save@v3 -> v4
- thollander/actions-comment-pull-request@v2 -> v2.4.3

* Update pytest.ini

- change default options to one that is more verbose

* Update tests/kernel-tracing/CMakeLists.txt

- skip test_total_runtime when Address or Thread Sanitizer enabled
  - overhead skews the results

* Update tests/kernel-tracing/validate.py

- separate test_total_runtime test
2024-01-30 14:52:17 -06:00
Jonathan R. Madsen c641749fe6 HIP API Tracing (#357)
* Update include/rocprofiler-sdk/hip*

- updates for intercept table

* Update lib/common/units.hpp

- clang-tidy fixes

* Add lib/rocprofiler-sdk/hip

- tracing implementation for the HIP intercept table

* Update source/lib/rocprofiler-sdk/CMakeLists.txt

- add_subdirectory(hip)

* Update source/lib/rocprofiler-sdk/hsa

- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION

* Update lib/rocprofiler-sdk/hip

- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible

* Update lib/rocprofiler-sdk/hsa/utils.hpp

- stringize_impl print dereferenced pointers when possible

* Update lib/rocprofiler-sdk/tests/intercept_table.cpp

- remove failures for intercepting HIP API tables

* Update include/rocprofiler-sdk/fwd.h

- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args

* Update lib/rocprofiler-sdk/intercept_table.cpp

- support HipDispatchTable and HipCompilerDispatchTable

* Update lib/rocprofiler-sdk/internal_threading.cpp

- Support ROCPROFILER_HIP_COMPILER_LIBRARY

* Update lib/rocprofiler-sdk/registration.cpp

- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging

* Update samples/api_{buffered,callback}_tracing

- Modifications to demonstrate HIP API tracing

* Update tests/kernel-tracing

- Modifications to handle/test HIP API tracing

* Separate HIP tracing from HIP compiler tracing

* Fix installation of include/rocprofiler-sdk/hip/*

- add compiler and table headers to install

* Fixes to HIP interception

- hip_api_trace.hpp was updated a bit
  - removed hipGetDeviceProperties (generic)
  - added hipGetDevicePropertiesR0600
  - added hipGetDevicePropertiesR0000
  - removed hipRegisterTracerCallback
  - reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
  - added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers

* Update lib/rocprofiler-sdk/hip/hip.*

- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)

* Update lib/rocprofiler-sdk/hsa/hsa.*

- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)

* Update test/kernel-tracing/validate.py

- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register

* Update tests/tools/json-tool.cpp

- fix context associated with "HIP_API_CALLBACK"

* Update external/CMakeLists.txt

- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
  - BUILD_TESTING (OFF)
  - BUILD_SHARED_LIBS (OFF)
  - BUILD_OBJECT_LIBS (OFF)
  - BUILD_STATIC_LIBS (ON)
  - CMAKE_POSITION_INDEPENDENT_CODE (ON)
  - CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
  - CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog

* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt

- remove explicit setting of SKIP_BUILD_RPATH

* Update CMakeLists.txt

- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH

* Update tests/CMakeLists.txt

- include(GNUInstallDirs)

* Update samples/CMakeLists.txt

- include(GNUInstallDirs)

* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h

- remove extern "C" due to incompatibility b/t empty struct in C (size 0) vs. empty struct in C++ (size 1)

* Update lib/rocprofiler-sdk/hip/details/ostream.hpp

- clang-tidy fixes

* Update cmake/rocprofiler_linting.cmake

- add a feature for clang tidy exe

* Update lib/rocprofiler-sdk/hip/hip.cpp

- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- fix merge

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- fix merge

* Update bin/rocprofv3

- args for marker, HIP runtime, and HIP compiler tracing

* Update tests/apps/simple-transpose

- use roctx

* Update tests/rocprofv3/tracing

- validate marker API data

* Update lib/rocprofiler-sdk-tool

- support for HIP runtime, HIP compiler, marker API

* Update queue/queue_controller/registration/utility

- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
  - implements a thread yield + sleep
- add a sync function to Queue class
- add a iterate_queues member function to QueueController
  - this is used to sync each queue during queue_controller_fini()

* Fix data races: queue/context/stable_vector

- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array

* Update lib/rocprofiler-sdk/hsa/hsa.*

- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables

* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp

- use HSA subtable accessors

* Update rocprofiler_memcheck and CI workflow

- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
  - GCC 13 uses libtsan.so.2

* Update CI workflow

* Update lib/rocprofiler-sdk/counters/{metrics,counters}

- fix possibly dangling reference to a temporary from gcc-13

* Update thread-sanitizer-suppr.txt

- Ignore data races originating in hsa-runtime library

* Update cmake/rocprofiler_memcheck.cmake

- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library

* Update tests/rocprofv3/tracing/CMakeLists.txt

- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test

* Update lib/common/container/record_header_buffer.hpp

- fix data race identified by gcc v13 and libtsan.so.2

* Update hip API id, args, and def

- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0

* Update lib/common/container/record_header_buffer.hpp

- fix deadlock in save/read/reset

* Update source/docs/CMakeLists.txt

- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr

* Update lib/rocprofiler-sdk/hip/details/ostream.hpp

- remove overloads for HIP_MEMSET_NODE_PARAMS

* Update docs/CMakeLists.txt

- use find_program for shell instead of hardcoded /bin/bash
2024-01-24 16:32:54 -06:00
Jonathan R. Madsen 1f4cf1aa39 Tools update (#397)
* Srnagara/tool counters collect (#331)

* Adding counter collection capability to tools

* Adding counter collection feature to tools

* Adding counter collection capability to tools

* Fixing merge down issues

* Small tool fixes for build + prevent profile realloc

* Reproducing the counter name query issue in buffered callback

* Minor fix for init order + sample that directly uses sdk-tool for debug purposes

* Adding a temporary fix to print the counter names

* Fixing the output file name and reverting the changes of caching the profile config

* Fixing SGPR_Count value

* cleaning up debug prints

* Adding header to counter collection file

* Adding kernel filtering support

* Remove threading

* Cleaning up the code

* Removing redundant prints

* Revert "Remove threading"

This reverts commit 05c58fb9de826e92cf8d2e3d1c31d5578525dcb4.

* Revert "Cleaning up the code"

This reverts commit 1d964882bf2396dee8ad020cbb6c83b36e0674e9.

* Changing the tools code to align with init-order fix

* cmake formatting (cmake-format) (#335)

Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>

* source formatting (clang-format v11) (#336)

Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>

* Adding support for async memory copy

* source formatting (clang-format v11) (#391)

Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>

* Fixing header typo

* Fixing tool_fini

* Replaceing the direction and kind fields values with description

* Update lib/rocprofiler-sdk-tool/helper.cpp

- Remove use of VLA

* Update lib/rocprofiler-sdk-tool/tool.cpp

- Formatting

* Migrate common/config.* to rocprofiler-sdk-tool

* Update lib/rocprofiler-sdk-tool/tool.cpp

- fix clang-tidy issues

* source formatting (clang-format v11) (#392)

Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>

* Update lib/common/mpl.hpp

- is_string_type / is_string_type_impl for deducing if type is a string type

* Update include/rocprofiler-sdk/fwd.h

- ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_NONE starts at zero

* Update lib/rocprofiler-sdk/hsa/async_copy.*

- functions for operation ids and names

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- support iterating and getting names for ROCPROFILER_BUFFER_TRACING_MEMORY_COPY

* Update lib/rocprofiler-sdk-tool/config.*

- env ROCPROFILER_ prefix -> ROCPROF_ prefix
- add support for memory copy tracing, counter collection, etc.

* Update lib/rocprofiler-sdk-tool/helper.*

- removed TracerFlushRecord
- removed cxa_demangle (use one in common library)
- removed GetCounterNames (handled in config)
- removed GetKernelNames (handled in config)

* Add lib/rocprofiler-sdk-tool/output_file.*

- separate out get_output_stream function and output_file struct from tool.cpp

* Add lib/rocprofiler-sdk-tool/csv.hpp

- write_csv_entry automatically quotes strings
- csv_encoder struct enforces correct number of columns

* Update lib/rocprofiler-sdk-tool/CMakeLists.txt

- add new files

* Update lib/rocprofiler-sdk-tool/tool.cpp

- update construction of output_file class
- add kernel_symbol_data for serializing kernel trace data
- use config instead of env lookups
- optimize counter collection profile config lookup/creation

* Update bin/rocprofv3

- rocprofv3 --help exits with 0 (as it should)
- command-line arg for memory copy tracing
- command-line arg for mangled kernels
- command-line arg for truncated kernels
- env ROCPROFILER_ prefix -> env ROCPROF_ prefix

* Update tests/async-copy-tracing/validate.py

- update test_async_copy_direction to new enum values

* Update tests/kernel-tracing/validate.py

- update test_async_copy_direction to new enum values

* Update tests/tools/json-tool.cpp

- add ROCPROFILER_BUFFER_TRACING_MEMORY_COPY to supported buffer_name_info

* Update samples/counter_collection/{CMakeLists.txt,main.cpp}

- remove counter-collection-sdk-tool

* Update .github/workflows/docs.yml

- fix paths triggering running the workflow

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>

* adding counter collection support

* Adding counter collection test

* changing directory structure of counter collection tests

* Fixing test path for rocprofv3

* Adding hsa-tracing basic test

* cmake formatting (cmake-format) (#362)

Co-authored-by: bgopesh <bgopesh@users.noreply.github.com>

* counter collection tests drop2

* fixing hsa-trace test for rocprofv3 path

* python formatting (black) (#371)

Co-authored-by: bgopesh <bgopesh@users.noreply.github.com>

* both counter colleciton and tracing should work together

* Fixing rocprofv3 path

* Attempt to fix Segfault with AddressSanitizer

* fixing sanitizer segfault

* Update rocprofv3

* Update lib/rocprofiler-sdk-tool/README.md

- update env variables

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- return ROCPROFILER_STATUS_BUFFER_NOT_FOUND if buffer tracing service is configured with invalid buffer

* Update lib/rocprofiler-sdk-tool/tool.cpp

- designated hsa API trace buffer

* Update tests/hsa-tracing/CMakeLists.txt

- Fix environment

* Update rocprofv3

- do not override HSA_TOOLS_LIB
- support ROCPROF_PRELOAD
- LD_PRELOAD librocprofiler-sdk.so

* Restructure tests directory

- move all rocprofv3 integration tests into subfolder

* Update cmake/Templates/rocprofiler-sdk/config.cmake.in

- create rocprofiler-sdk::rocprofv3 cmake target

* Update tests/rocprofv3/hsa-tracing

- improve validate.py
- convert input to dict via csv.DictReader

* Update tests/apps/CMakeLists.txt

- fix build rpath for simple-transpose

* Update  cmake/rocprofiler_memcheck.cmake

- prefer libtsan.so.0

* Update tests/rocprofv3/hsa-tracing

- move to tests/rocprofv3/tracing
- include kernel tracing and memory copy tracing

* Update lib/rocprofiler-sdk-tool/tool.cpp

- normalize "_ID" vs. "_Id" in CSV column names (use "_Id")

* Update lib/rocprofiler-sdk/buffer.{hpp,cpp}

- change signature of buffer::get_buffers()
- buffer::get_buffers() uses static_object

* Update lib/rocprofiler-sdk/context/context.cpp

- update usage of buffer::get_buffers()
  - now returns pointer

* Update lib/rocprofiler-sdk/tests/buffer.cpp

- update to change for signature of buffer::get_buffers()

* Update tests/rocprofv3/tracing/CMakeLists.txt

- use %argt% with -d argument

* Update lib/rocprofiler-sdk-tool/tool.cpp

- use atexit for finalization

* Update tests/rocprofv3/tracing/CMakeLists.txt

- tweaked name of tests

* Update lib/rocprofiler-sdk/hsa/async_copy.*

- async_copy_fini + reference counting signals

* Update lib/rocprofiler-sdk/registration.cpp

- invoke hsa::async_copy_fini() to prevent data race on signals

---------

Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>
Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com>
Co-authored-by: bgopesh <bgopesh@users.noreply.github.com>
2024-01-22 19:06:25 -06:00
Jonathan R. Madsen 21dd088c8e ROCTx Library Tracing (#390)
* Update include/rocprofiler-sdk/marker/*

- Update rocprofiler_marker_api_args_t for all API functions
- Add ROCPROFILER_MARKER_API_ID_roctxGetThreadId to rocprofiler_marker_api_id_t

* Update include/rocprofiler-sdk/marker/api_args.h

- fix include

* Update lib/common/mpl.hpp

- is_pair
- is_type_complete_v

* Update include/rocprofiler-sdk/marker/*

- fix rocprofiler_marker_api_retval_t
- add roctxGetThreadId to rocprofiler_marker_api_args_t
- fix type in enum: HsaDevice -> HsaAgent
- add table_api_id.h

* Update include/rocprofiler-sdk/marker.h

- include marker/table_api_id.h

* Update include/rocprofiler-sdk/buffer_tracing.h

- Buffer marker tracer records have begin and end timestamp

* Add lib/rocprofiler-sdk/marker

- tracing implementation for marker (roctx) library

* Update include/rocprofiler-sdk/{buffer_tracing,marker/table_api_id}.h

- rocprofiler_buffer_tracing_marker_record_t -> rocprofiler_buffer_tracing_marker_api_record_t

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- support for ROCPROFILER_BUFFER_TRACING_MARKER_API

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- support for ROCPROFILER_CALLBACK_TRACING_MARKER_API

* Update lib/rocprofiler-sdk/intercept_table.cpp

- template instantiation for notify_runtime_api_registration

* Update lib/rocprofiler-sdk/registration.cpp

- enable roctx in rocprofiler_set_api_table

* Update lib/rocprofiler-sdk/marker/marker.cpp

- rocprofiler_buffer_tracing_marker_record_t -> rocprofiler_buffer_tracing_marker_api_record_t

* Update lib/rocprofiler/tests for roctx testing

- add roctx.cpp
  - unit tests for roctx callback and buffer tracing
- support marker API in get_{buffer,callback}_tracing_names()

* Update lib/common/logging.cpp

- logging initialized message mentions env variable

* Update lib/common/mpl.hpp

- NOLINT for misc-definitions-in-headers

* Update lib/rocprofiler-sdk/tests/CMakeLists.txt

- include LD_LIBRARY_PATH in rocprofiler-lib-tests-shared tests

* Update lib/rocprofiler-sdk/registration.cpp

- client_library_vec_t is now vector of option<client_library>
  - enables resetting the client_library after finalization
- removed acquiring registration lock when invoke_client_finalizers called via atexit
  - this was causing some lock-order-inversion warnings (potential deadlock)

* Update lib/rocprofiler-sdk/agent.cpp

- model name for agent supports spaces

* Update tests/common/serialization.hpp

- add serialization support for marker tracing data structures

* Update tests/apps

- Add ROCTx markers into reproducible-runtime and transpose

* Update tests/tools/json-tools.cpp

- add marker tracing support
- remove strdup (no longer necessary)

* Update tests/kernel-tracing/validate.py

- validate marker API tracing data

* Update tests/async-copy-tracing/validate.py

- validate marker API tracing data

* Update cmake for load path resolution during testing

* Update tests/async-copy-tracing/CMakeLists.txt

- fix test LD_LIBRARY_PATH

* Update cmake/Templates/rocprofiler-sdk-roctx/config.cmake.in

- fix constructing rocprofiler-sdk-roctx::rocprofiler-sdk-roctx
2024-01-18 09:48:06 -06:00
Jonathan R. Madsen 936816f762 Async memory copy tracing (#317)
* Update samples/api_buffered_tracing/client.cpp

- support ROCPROFILER_BUFFER_TRACING_MEMORY_COPY

* Update include/rocprofiler-sdk/{buffer_tracing,fwd}.h

- update rocprofiler_buffer_tracing_memory_copy_record_t
- add ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_HOST_TO_HOST to rocprofiler_memory_copy_operation_t

* Update lib/rocprofiler-sdk/context/context.*

- get_registered_contexts functions (local copy)

* Update tests/apps/reproducible-runtime/reproducible-runtime.cpp

- include some memory allocations and memory copies for better testing

* Update tests/common/serialization.hpp

- update serialization save function for rocprofiler_buffer_tracing_memory_copy_record_t

* Update lib/rocprofiler-sdk/hsa/hsa.*

- remove stale set_callback / activity_functor_t code
- forward decl hsa_api_meta
- template struct hsa_api_func for getting function return type and args

* Update tests/kernel-tracing/validate.py

- enforce memory_copies data size
- test timestamps in memory copies data
- improve internal and external correlation id validation

* Update lib/rocprofiler-sdk/hsa/defines.hpp

- HSA_API_META_DEFINITION macro

* Update lib/rocprofiler/hsa/rocprofiler-sdk/hsa/hsa.def.cpp

- HSA_API_META_DEFINITION specializations for async copy functions

* Add lib/rocprofiler-sdk/hsa/async_copy.{hpp,cpp}

- implements buffer memory tracing

* Update lib/rocprofiler-sdk/registration.cpp

- invoke rocprofiler::hsa::async_copy_init

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- logging improvements
- improve hsa <-> rocp agent mapping

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- load original signal in async signal handler before store_screlease

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- use store_relaxed instead of store_screlease

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- logging

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- logging

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- misc changes

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- misc changes

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- misc changes

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- return function pointer instead of lambda

* Update reproducible-runtime.cpp

- device sync

* Update tests/apps/reproducible-runtime/reproducible-runtime.cpp

- use *Async variants of hipMalloc and hipMemcpy

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- populate async data properly

* Update tests/kernel-tracing/validate.py

- verification of async copy direction

* Update tests/apps/reproducible-runtime/reproducible-runtime.cpp

- temporarily disable async memcpy functions

* Create tests/tools

- directory containing tool libraries used for collecting data in integration tests

* Update tests/kernel-tracing

- remove kernel-tracing-test-tool library (now rocprofiler-sdk-json-tool)
- update cmake, validate.py, conftest.py accordingly

* Add tests/async-copy-tracing

- integration test validating async copy tracing in transpose example

* Update tests/CMakeLists.txt

- updates for restructuring

* Revert tests/apps/reproducible-runtime

- restore code to semi-original state (no memory copying)

* Update tests/async-copy-tracing/validate.py

- fix comment in test_async_copy_direction

* Fix building tests against installation
2024-01-09 11:34:46 -06:00
Jonathan R. Madsen cf5e4b4b1b Integration Testing (#211)
* Add external/cereal submodule

- used for integration testing

* Update lib/common/container/small_vector.hpp

- documentation notes

* Update tests/apps

- update transpose app (fix build)
- add reproducible-runtime app

* Update include/rocprofiler/fwd.h

- rocprofiler_service_callback_phase_t -> rocprofiler_callback_phase_t

* Update PTL submodule

- fix for task group: submitting tasks from different thread

* Update lib/rocprofiler/hsa/queue.cpp

- CHECK_NOTNULL(_buffer)

* Update lib/rocprofiler/hsa/hsa.cpp

- use buffer::get_buffer instead of manually looking for buffer

* Update lib/rocprofiler/internal_threading.cpp

- use buffer::get_buffer instead of manually looking for buffer

* Update lib/rocprofiler/buffer.cpp

- offset the buffer id
- properly handle rocprofiler_create_buffer reusing rocprofiler_buffer_id_t on a different context

* Update tests

- kernel tracing library for integration testing

* Add cereal submodule

* Update lib/rocprofiler/registration.*

- OnUnload
- Support ROCP_TOOL_LIBRARIES for python usage
- improve finalize function
- remove calling hsa_shut_down in finalize function

* Update lib/rocprofiler/buffer.*

- allocate_buffer sets the buffer id value
- expose (internally) is_valid_buffer_id
- update test

* Update tests/kernel-tracing

- installation
- better organization of JSON groups
- improved messaging

* Update lib/rocprofiler/registration.cpp

- add workaround for hsa-runtime supporting rocprofiler-register

* Update tests/kernel-tracing/kernel-tracing.cpp

- fix memory leaks

* cereal support for minimal JSON

- update cereal submodule to rocprofiler branch
- change REPO_BRANCH in rocprofiler_checkout_git_submodule for cereal
- update tests/kernel-tracing/kernel-tracing.cpp
  - use minimal json
  - slight tweak putting giving contexts name in storing name + context pointer pair in map

* Update tests/kernel-tracing/kernel-tracing.cpp

- support runtime selection of contexts via KERNEL_TRACING_CONTEXTS environment variable

* Update tests

- tests/CMakeLists.txt
  - find_package(Python3 REQUIRED)
- tests/kernel-tracing
  - pytest validation

* Update CI workflow

- install pytest
- add checks for test labels

* Update scripts/run-ci.py

- change --coverage options
  - replace 'unittests' with 'tests'
- replace test label regex '-L unittests' with '-L tests'

* Update requirements.txt

- this is now an empty file since none of the packages are required for this repo
2023-11-16 03:21:39 -06:00