* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX
- add UBSAN_OPTIONS to setup-sanitizer-env.sh
* Improve ROCPROFILER_DEFAULT_FAIL_REGEX
* Use -fno-sanitize-recover=undefined flag
- this compiler flag makes the program exit on the first undefined behavior error instead of reporting it and continuing
* Revert ROCPROFILER_DEFAULT_FAIL_REGEX
* fix for shift overflow
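
Illustration of the kind of shift overflow that UBSan reports and that -fno-sanitize-recover=undefined turns into a hard failure. This is a minimal sketch, not the actual fix from this commit: `low_bits_mask` is a hypothetical helper and does not come from the rocprofiler sources.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (assumption, not from the rocprofiler sources):
// returns a mask with the lowest `nbits` bits set without ever shifting
// by an amount >= the width of the type.
constexpr std::uint64_t
low_bits_mask(unsigned nbits)
{
    constexpr unsigned width = std::numeric_limits<std::uint64_t>::digits;
    // `std::uint64_t{1} << 64` is undefined behavior (shift exponent too
    // large), so the full-width case is handled explicitly.
    return (nbits >= width) ? ~std::uint64_t{0}
                            : ((std::uint64_t{1} << nbits) - 1);
}
```

Guarding the shift amount keeps the expression well defined for every input, so a sanitizer build has nothing to report.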
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
* Adding source snapshot
* Adding option to serialize only on target kernel
* Fix for tidy
* Formatting
* Testing the new flag
---------
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Migrates the profiler_serializer class in QueueController from a single global instance to one instance per agent. The other changes in this commit allow maps of the queues associated with each agent to be passed to profiler_serializer when it is turned on or off. Existing test cases (multistream app) cover whether the kernels are serialized. A new test case shows that serialization only occurs at the per-device level: a kernel launched on one device waits for a value to be set on the other.
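
The per-agent idea can be sketched as follows. This is a simplified model under stated assumptions, not the QueueController implementation: `serializer_sketch`, `controller_sketch`, and the integer agent and kernel ids are all hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <optional>
#include <unordered_map>

using agent_id_t  = std::uint64_t;  // hypothetical agent identifier
using kernel_id_t = std::uint64_t;  // hypothetical kernel identifier

// One serializer: at most one kernel in flight, the rest wait in FIFO order.
struct serializer_sketch
{
    bool                    busy = false;
    std::deque<kernel_id_t> waiting;

    // Returns true if the kernel may launch immediately.
    bool submit(kernel_id_t kernel)
    {
        if(!busy)
        {
            busy = true;
            return true;
        }
        waiting.push_back(kernel);
        return false;
    }

    // Called when the in-flight kernel completes; returns the next kernel
    // to release, if any.
    std::optional<kernel_id_t> complete()
    {
        if(waiting.empty())
        {
            busy = false;
            return std::nullopt;
        }
        auto next = waiting.front();
        waiting.pop_front();
        return next;  // stays busy: the released kernel is now in flight
    }
};

// One serializer per agent instead of a single global instance.
struct controller_sketch
{
    std::unordered_map<agent_id_t, serializer_sketch> serializers;

    bool submit(agent_id_t agent, kernel_id_t kernel)
    {
        return serializers[agent].submit(kernel);
    }

    std::optional<kernel_id_t> complete(agent_id_t agent)
    {
        return serializers[agent].complete();
    }
};
```

With this layout a kernel queued behind another on agent 0 does not block a launch on agent 1, which is the behavior the new test case checks.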
* Change all rocprofiler-X target names to rocprofiler-sdk-X
* Update rocprofiler-sdk-config.cmake
- fix install tree target names
- simplify logic for using find w/ components and find w/o components
* Update rocprofiler-sdk-roctx-config.cmake
- simplify logic for using find w/ components and find w/o components
* Update samples/intercept_table/CMakeLists.txt
- demonstrate/test use of `find_package(rocprofiler-sdk ... COMPONENTS ...)`
* Check to force tool to initialize the ctx id to zero.
* initialize rocprofiler_context_id_t with 0 in unit tests
* changelog
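
A rough sketch of what such a check could look like. All names here are stand-ins: `context_id_sketch` assumes the context id wraps a single 64-bit handle, and `create_context_sketch` and `status_sketch` are hypothetical and do not reproduce the rocprofiler-sdk API or its status codes.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for rocprofiler_context_id_t (assumed to wrap a 64-bit handle).
struct context_id_sketch
{
    std::uint64_t handle;
};

// Hypothetical status codes for this sketch only.
enum class status_sketch
{
    success,
    invalid_argument
};

// Rejects ids that the caller did not zero-initialize, so an uninitialized
// or already-created id is never silently overwritten.
inline status_sketch
create_context_sketch(context_id_sketch* id)
{
    static std::uint64_t next_handle = 0;
    if(id == nullptr || id->handle != 0) return status_sketch::invalid_argument;
    id->handle = ++next_handle;
    return status_sketch::success;
}
```

Under this assumption a tool writes `context_id_sketch ctx{0};` before requesting a context, which mirrors the zero initialization added to the unit tests.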
---------
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
- ROCPROFILER_API after function
- use rocprofiler_tracing_operation_t in lieu of uint32_t where appropriate
- rocprofiler_tracing_operation_t is now an int32_t typedef (formerly uint32_t)
- use const T* instead of T* where appropriate
* Add kernel profiling time info to counter collection records
- lib/rocprofiler-sdk/kernel_dispatch
- added profiling_time.{hpp,cpp}
- restructured tracing.cpp
- updated queue.cpp AsyncSignalHandler
- gets kernel dispatch profiling time and passes to dispatch_complete and signal callbacks
- structured some header includes to reduce cyclic include probability
- originally, including kernel_dispatch/tracing.hpp in hsa/queue.hpp created a lot of cyclic includes
* Fix kernel_dispatch.cpp includes
* Fix kernel_dispatch.cpp
- include <cstring>
- replace use of ROCPROFILER_HSA_AMD_EXT_API_ID_NONE with ROCPROFILER_KERNEL_DISPATCH_LAST
* Tidying ATT dispatch API. ATT Agent to be initialized with rest of profiler. Removing read_index-based wait.
* Formatting
* Adding some input validation
* Add perf test for agent
* Removing async
- missing-new-line CI job: ensures all source files end with new line
- logging updates
- add new line to the end of many files
- fix header include ordering in misc places
- transition to use hsa::get_core_table() and hsa::get_amd_ext_table() in various places instead of making copies
* General fixes to ATT, packets and event ID retrieval
* Update source/lib/rocprofiler-sdk/hsa/aql_packet.hpp
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* SWDEV-465322: Adding support for Perfcounter SIMD Mask in ATT
* Apply suggestions from code review
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* Adding unit tests
* Adding counters check for gfx9 and SQ block only
* Addressing review comments
* changing the struct size
* fixing header includes
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* Update lib/rocprofiler-sdk/context/*
- create correlation_id.{hpp,cpp} and moved implementation into these files instead of in context.{hpp,cpp}
* Update lib/rocprofiler-sdk/thread_trace/att_core.hpp
- fixed header includes
* Update lib/common/utility.hpp (runtime sizeof)
- added compute_runtime_sizeof<T>() function to set the "size" field to be the offset of the "reserved_padding" field if one exists
* Fix to compute_runtime_sizeof
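
A minimal sketch of the runtime-sizeof idea described above, assuming C++17 and standard-layout structs. It is not the code in lib/common/utility.hpp: `compute_runtime_sizeof_sketch`, `has_reserved_padding`, and the two example structs are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Detects whether T has a data member named `reserved_padding`.
template <typename T, typename = void>
struct has_reserved_padding : std::false_type
{};

template <typename T>
struct has_reserved_padding<T, std::void_t<decltype(&T::reserved_padding)>>
: std::true_type
{};

// Returns the offset of `reserved_padding` if the member exists, else
// sizeof(T). The result is the value a caller would store in the struct's
// "size" field.
template <typename T>
constexpr std::size_t
compute_runtime_sizeof_sketch()
{
    if constexpr(has_reserved_padding<T>::value)
        return offsetof(T, reserved_padding);
    else
        return sizeof(T);
}

// Hypothetical record types used only to exercise the sketch.
struct with_padding
{
    std::uint64_t size;
    std::uint64_t value;
    std::uint8_t  reserved_padding[48];
};

struct without_padding
{
    std::uint64_t size;
    std::uint32_t value;
};
```

Reporting the offset of the padding rather than sizeof(T) means the "size" field only covers the members that are actually in use, so fields later carved out of the padding change the reported size without changing the struct's layout.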
* Added first ATT API
* Finalizing thread trace API
* Fixing more rebase conflicts
* Added codeobj disassembly sample
* Fixing merge issues with rebase [2]
* Adding ATT packets
* Implemented thread trace intercept
* Moved codeobj parser to same repo as rocprofiler
* Moved thread trace to new API
* Fixing merge conflicts
* Fixing more merge conflicts
* Adding thread trace packet reuse
* Merged aql_profile_v2 headers
* Linked ATT sample to aqlprofile
* Updated decoder to include non-loaded codeobjs
* Implemented ISA decoder into ATT sample
* Added marker_id to vaddr
* Updating aql_profile_v2 API to memcpy
* Updating thread trace API to include 64bit markers. Using the result of ISA matching.
* Added instruction type and cycles summary
* Updated sample with selection of kernel by kernel_object
* Added option to copy from memory kernels
* Moved tool_data in thread_trace to dynamic alloc
* Restoring hsa.cpp
* Fixed ATT sample crash. General improvements.
* Moved codeobj library to outside src/
* Updated license header
* Moved codeobj_capture to camelcase
* Solving some more merge conflicts
* Update samples/advanced_thread_trace/CMakeLists.txt
* Update samples/advanced_thread_trace/CMakeLists.txt
* Update samples/code_object_isa_decode/CMakeLists.txt
* Update source/lib/rocprofiler-sdk/thread_trace/CMakeLists.txt
* Removing unused parameter check
* Adding const to isEmpty
* Removing unused warning
* Adding libdw-dev to requirements
* Running clang-format
* Commenting out new aql calls
* Clang format
* Unused variable fix
* Adding codeobj-decoder coverage
* Commenting out threadtrace
* Update samples/CMakeLists.txt
* P
* WOverloaded
* Addressing clang-tidy
* Virtual destructor on ttracer class
* Corr id
* Fixing code source format
* Update CMakeLists.txt
* Build fixes
* Update source/lib/rocprofiler-sdk-codeobj/code_object_track.cpp
* Fix shadowing
* Update CMakeLists.txt
* Update samples/CMakeLists.txt
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>