* Renamed agent profiling service to device counting service
Name more aptly represents what agent profiling did (device wide
counter collection). Conversion of existing user code can be
performed by the following find/sed command:
find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} +
* Converted dispatch profile to dispatch counting service
* Debug for functioal counters test
* Minor changes for CI
* Minor fix
* More fixes for CI
* Update evaluate_ast.cpp
---------
Co-authored-by: Benjamin Welton <ben@amd.com>
* Add kernel profiling time info to counter collection records
- lib/rocprofiler-sdk/kernel_dispatch
- added profiling_time.{hpp,cpp}
- restructured tracing.cpp
- updated queue.cpp AsyncSignalHandler
- gets kernel dispatch profiling time and passes to dispatch_complete and signal callbacks
- structured some header includes to reduce cyclic include probability
- originally, including kernel_dispatch/tracing.hpp in hsa/queue.hpp created a lot of cyclic includes
* Fix kernel_dispatch.cpp includes
* Fix kernel_dispatch.cpp
- include <cstring>
- replace use of ROCPROFILER_HSA_AMD_EXT_API_ID_NONE with ROCPROFILER_KERNEL_DISPATCH_LAST
Added public API call to setup agent counter collection on a context.
Refactored the return types internally for dispatch counter collection
to use rocprofiler_status_t (allow for more verbose failures to be
surfaced via the API)
Subsequent commits will fill out the sampling functionality for agent
counter collection.
Co-authored-by: Benjamin Welton <ben@amd.com>
* Minor fix
Removal of HSA from counter collection
Tests for AQL
Updated counter collection client to build profiles in tool init
* Rebased
* Debug printing
* Formatting
* More format
* fix shadowing
---------
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
* Added first ATT API
* Finalizing thread trace API
* Fixing more rebase conflicts
* Added codeobj disassembly sample
* Fixing merge issues with rebase [2]
* Adding ATT packets
* Implemented thread trace intercept
* Moved codeobj parser to same repo as rocprofiler
* Moved thread trace to new API
* Fixing merge conflicts
* Fixing more merge conflicts
* Adding thread trace packet reuse
* Merged aql_profile_v2 headers
* Linked ATT sample to aqlprofile
* Updated decoder to include non-loaded codeobjs
* Implemented ISA decoder into ATT sample
* Added marker_id to vaddr
* Updating aql_profile_v2 API to memcpy
* Updating thread trace API to include 64bit markers. Using the result of ISA matching.
* Added instruction type and cycles summary
* Updated sample with selection of kernel by kernel_object
* Added option to copy from memory kernels
* Moved tool_data in thread_trace to dynamic alloc
* Restoring hsa.cpp
* Fixed ATT sample crash. General improvements.
* Moved codeobj library to outside src/
* Updated license header
* Moved codeobj_capture to camelcase
* Solving some more merge conflicts
* Update samples/advanced_thread_trace/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update samples/advanced_thread_trace/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update samples/code_object_isa_decode/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/thread_trace/CMakeLists.txt
* Removing unused parameter check
* Adding const to isEmpty
* Removing unused warning
* Adding libdw-dev to requirements
* Running clang-format
* Commenting out new aql calls
* Clang format
* Unused variable fix
* Adding codeobj-decoder coverage
* Commenting out threadtrace
* Update samples/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* P
* WOverloaded
* Addressing clang-tidy
* Virtual destructor on ttracer class
* Corr id
* Fixing code source format
* Update CMakeLists.txt
* Build fixes
* Update source/lib/rocprofiler-sdk-codeobj/code_object_track.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Fix shadowing
* Update CMakeLists.txt
* Update samples/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* Handle hsa_queue_destroy after finalization
- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue
* Update HIP/HSA/marker update_table logging
* Update rocprofv3 tests
- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them
* Disable thread sanitizer deadlock detection
* Update CI workflow
- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers
* Update run-ci.py
- set gcovr html medium and high threshold
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- remove this capture from enable/disable serialization
* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*
- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map
* Logging for HIP/HSA/marker/profile_serializer
* Logging for HIP/HSA/marker/queue_controller
* Improve test_retired_correlation_ids asserts
* Fix tests/counter-collection/validate.py
- scale expected SQ_WAVES counter value based on warp size of GPU
* Tweak github comment for code coverage
* Remove gcovr html high/medium threshold args
* Fix tests/counter-collection/validate.py
- round before casting to int in test_counter_values
* operator bool for profile_serializer
- only wait on CV if profile_serializer is used
* Logging updates (profile_serializer + code_object)
* Update counter-collection validate.py
* QueueController does not wait on CV if finalizing/finalized
* Update CI workflow
- remove navi32 from core job
* Improve HIP/HSA/marker tracing get_functor/functor
- remove lambda wrapper around functor
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- do not acquire cvmutex lock during finalization
* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*
- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized
* Update CI workflow
- remove navi32 runners
* bwelton fixes for hangs
* CMake improvements + simplified demangle
- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* Moved tests/apps to tests/bin
* Renamed cmake project in tests/bin
* Update samples
- Use ROCPROFILER_DEFAULT_FAIL_REGEX
- tweaks to stdout messages
* Update tests
- Use ROCPROFILER_DEFAULT_FAIL_REGEX
* Add tests/lib
- libraries with HIP code
* Update PTL submodule
- remove atexit delete of thread_id_map
* Update cmake/rocprofiler_options.cmake
- Set ROCPROFILER_DEFAULT_FAIL_REGEX
* Update common lib: env + logging
- improved customization of logging settings
- default to disabling logging to files
- install failure handler for rocprofv3
- set_env support in environment.*
* Add lib/rocprofiler-sdk/shared_library.cpp
- shared library constructor
* Update lib/rocprofiler-sdk-tool/tool.cpp
- destructor thread safety
- convert callback_name_info and buffered_name_info to pointers
- install failure handler for logging
* Add tests/bin/hip-in-libraries
- hip-in-libraries is an exe which uses two shared libraries where each shared library contains HIP kernels
- used for testing deadlocking within __hipRegisterFatBinary
* Update bin/rocprofv3
- reorganized the env variables
- use exec to launch command
- set ROCPROFILER_LIBRARY_CTOR=1
* Add tests/rocprofv3/tracing-hip-in-libraries
- uses hip-in-libraries exe for exe which uses shared libraries to launch HIP kernels
* Update bin/rocprofv3
- fix counter collection (no exec)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- replace "Kernel-Name" with "Kernel_Name"
* Update lib/rocprofiler-sdk/registration.cpp
Use RTLD_LOCAL instead of RTLD_GLOBAL for env libraries
* Update tests/rocprofv3
- replace "Kernel-Name" with "Kernel_Name"
* Update tests
- vector-ops (bin) stream syncs + runs with 4 queues per device
- improve counter-collection/input1 validation
- rocprofv3/tracing-hip-in-libraries does not do sys-trace
- improved validation script for tracing-hip-in-libraries
- updated dispatch_callback in json-tool.cpp following reworking of prototypes for counter collection
* Update samples/counter_collection
- updated dispatch_callback(s) and record_callback(s) following reworking of prototypes
* Update bin/rocprofv3
- reorganized help menu
- added options for sub-HSA tables
- added --hip-runtime-trace
- changed --hip-trace to include --hip-compiler-trace
* Update lib/rocprofiler-sdk-tool
- improved kernel filtering
- removed arch_vgpr, accum_vgpr, sgpr code (in rocprofiler-sdk)
- fixed issue with counter-collection w/o tracing
- added support for fine grained HSA API tracing
- removed directly linking to HSA-runtime
* Update lib/rocprofiler-sdk/agent.cpp
- rocp_agents != hsa_agents is non-fatal when ROCPROFILER_BUILD_CI=OFF (CMake option)
* GPR (vector and scalar) info in kernel symbol data
- rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t contains general purpose register info
* Header include order fix
- Include repo headers first
- Third party library headers next
- standard library headers last
* Update dispatch profiling public API
- introduce rocprofiler_profile_counting_dispatch_data_t
- change signature of rocprofiler_profile_counting_dispatch_callback_t and rocprofiler_profile_counting_record_callback_t
- provide rocprofiler_user_data_t pointer in dispatch callback
- provide rocprofiler_user_data_t value (from dispatch cb) in record callback
* Update tests/bin/CMakeLists.txt
- fix add_subdirectory(hip-in-libraries) order
* Update VERSION
- bump to 0.2.0 in prep for AFAR
- .github/workflows/continuous_integration.yml
- apt-get update before apt-get install
- remove libgtest-dev
- actions-comment-pull-request: v2.4.3 -> v2.5.0
- .github/workflows/formatting.yml
- create-pull-request: v5 -> v6
- cmake/rocprofiler_options.cmake
- remove unused ROCPROFILER_DEBUG_TRACE and ROCPROFILER_LD_AQLPROFILE options
- samples/counter_collection/callback_client.cpp
- corr_id field renamed to correlation_id
- samples/counter_collection/client.cpp
- corr_id field renamed to correlation_id
- include/rocprofiler-sdk/fwd.h
- In rocprofiler_record_counter_t: rename corr_id field to correlation_id
- doxygen fixes
- lib/common/utility.*
- remove get_accurate_clock_id_impl
- timestamp_ns() defaults to CLOCK_BOOTTIME
- lib/rocprofiler-sdk/counters/core.cpp
- fix spelling mistake: extrenal -> external
- corr_id field renamed to correlation_id
- lib/rocprofiler-sdk-tool/tool.cpp
- fix destruction of static tool::output_file before finalization
- scripts/update-docs.sh
- define PROJECT_NAME
- tests/async-copy-tracing/validate.py
- init_time and fini_time checks
- hip_api_traces, marker_api_tracing
- tests/common/serialization.hpp
- fix save function for rocprofiler_record_counter_t following rename of corr_id to correlation_id
- tests/kernel-tracing/validate.py
- init_time and fini_time checks
- relax test_total_runtime range
- tests/rocprofv3/tracing/CMakeLists.txt
- remove -M from rocprofv3-test-systrace-execute
- exclude test_hsa_api_trace in rocprofv3-test-systrace-validate due to HIP API tracing
- tests/rocprofv3/tracing/validate.py
- update test_kernel_trace to accept mangled or demangled
- tests/tools/json-tool.cpp
- remove use of GLOG
- include init_time and fini_time
- write_json(...) function
* Add support for AQL dimension changes
Adds support for returning dimensions from AQLProfile through rocprofiler
to tools. Includes a much larger expanded test suite that covers nearly
all files in counter collection.
Specific changes below:
samples/counter_collection/print_functional_counters: Modified to check
the validity of dimensions returned in comparison to the actual underlying
data obtained from a kernel execution.
rocprofiler-sdk/aql/helpers: adds function calls to support fetching
dimension information from AQLProfile.
rocprofiler-sdk/aql/packet_construct: modified to allow for events
to be exported to aid evaluate_ast in decoding the output buffer.
lib/rocprofiler-sdk/counters: Instance count now derived from dimension
sizes. rocprofiler_query_counter_dimensions now moved to a callback format
to improve usability.
rocprofiler-sdk/counters/core: Code migrations and exports of functions
for testing.
rocprofiler-sdk/counters/dimensions: Generates a dimension cache to be
used when querying dimension information for a counter id.
rocprofiler-sdk/counters/evaluate_ast: Modified to pass back correct
dimension information and to check/determine output dimensions for derived
counters.
rocprofiler-sdk/counters/id_decode: Modified to have a map between
dimension name -> dimension along with a conversion from the aql profile
id for a dimension (string) -> integer based id (happens only once during
init).
rocprofiler-sdk/hsa/queue: Modified to allow for making testing easier.
Specifically to allow Queue to now be mocked in unit tests for counter
collection.
* Merge with changes for serialization
* Added suggestions
* source formatting (clang-format v11) (#457)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Minor fix
* Test change
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Update include/rocprofiler-sdk/{counters,profile_config}.h
- use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update samples
- use rocprofiler-sdk::rocprofiler-sdk instead of rocprofiler::rocprofiler in cmake
- api_callback_tracing sample roctxProfiler{Pause,Resume}
- api_callback_tracing sample uses ROCTx
- updates to use rocprofiler_agent_id_t
* Update run-ci.py
- exclude rocprofiler-sdk-tool from samples (no sample uses that code)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- Update rocprofiler_iterate_agent_supported_counters to use agent ID
* Update lib/rocprofiler-sdk/counters/core.*
- profile_config has pointer to agent instead of copy
* Update lib/rocprofiler-sdk/agent.*
- provide get_agent(...) func via rocp agent id
* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED for enums missing implementation
* Update lib/rocprofiler-sdk/counters.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update lib/rocprofiler-sdk/profile_config.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update source/docs
- requirements.txt + install reqs in cmake
* Bump version to 0.1.0
* Update samples/api_callback_tracing/CMakeLists.txt
- LD_LIBRARY_PATH for test
* Update test/rocprofv3/tracing/CMakeLists.txt
- reorder validation files so memory copy comes first
* Update lib/rocprofiler-sdk-tool/tool.cpp
- logging for flushing buffers
- variables for buffer_size and buffer_watermark
- increase the watermark to a full buffer
- use dedicated threads for each buffer
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- test sets ROCPROF_LOG_LEVEL and ROCPROFILER_LOG_LEVEL to info
* Remove lib/rocprofiler-sdk-tool/trace_buffer.hpp
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- drop log level to warning when leak sanitizer is enabled (produces small memory leak)
* Fix init order for getMetricIdMap
Ensure that getMetricIdMap() stays valid until the destruction
of the counting service. Adds a test to ensure that this behavior
is maintained (this test fails without this change).
* source formatting (clang-format v11) (#343)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* cmake formatting (cmake-format) (#342)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Remove threading
* Update usage of rocprofiler::counters::get{Metric,MetricId}Map()
- uses common::static_object
* Update counter init_order testing
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* Update lib/rocprofiler-sdk/context/context.*
- get_registered_contexts functions (local copy)
* Update lib/rocprofiler-sdk/hsa/{queue,queue_controller}.cpp
- remove ROCPROFILER_BUFFER_TRACING_MEMORY_COPY code
* Update tests/kernel-tracing/kernel-tracing.cpp
- move stop() and flush() in tool_fini to before reporting of sizes of data collected
* Update lib/rocprofiler-sdk/hsa/hsa.*
- remove stale set_callback / activity_functor_t code
* Update lib/rocprofiler-sdk/buffer.cpp
- full wait instead of returning busy when buffer is busy
- use task_group::join instead of task_group::wait to fully wait for tasks to finish (bug fix)
* Update lib/rocprofiler-sdk/agent.cpp
- support agent mapping for CPU agents
* Remove direct access to vector of registered contexts
* Fix find_package(rocprofiler) in build tree
* Move include/rocprofiler to include/rocprofiler-sdk
* Update include/CMakeLists.txt
- add_subdirectory(rocprofiler-sdk)
* Move lib/rocprofiler to lib/rocprofiler-sdk
* Move lib/rocprofiler-tool to lib/rocprofiler-sdk-tool
* Update lib/CMakeLists.txt
- add_subdirectory(rocprofiler-sdk)
- add_subdirectory(rocprofiler-sdk-tool)
* Update lib/rocprofiler-sdk/CMakeLists.txt
* Rename rocprofiler-tool to rocprofiler-sdk-tool
* Replace include rocprofiler/ with include rocprofiler-sdk/
* Replace include lib/rocprofiler/ with include lib/rocprofiler-sdk/
* Set VERSION to 0.0.0 and finish install to rocprofiler-sdk
* More fixes for rocprofiler -> rocprofiler-sdk
- fix issue with rocprofiler-sdk-config.cmake.in
- fix counters xml install path
* Fix documentation generation
* Create rocprofiler_LIB_ROCPROFILER_SDK_DIR for build tree
* cmake formatting (cmake-format) (#264)
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>