develop
100 Коммитов
| Автор | SHA1 | Сообщение | Дата | |
|---|---|---|---|---|
|
|
1517a398bf |
[rocprofiler-sdk] Buffer finalization fixes and HSA ABI 0x09 support (#2318)
* [rocprofiler-sdk] Fix buffer flush ordering and sanitizer CI improvements Buffer Pool Design ------------------ Replace the fixed array-based double buffer with a dynamic pool design to fix race conditions that caused "internal correlation id was retired prematurely" errors. The original design had a race where flush callbacks could be delivered out-of-order: when buffer 0 fills and begins flushing, writes go to buffer 1. If buffer 1 fills before buffer 0's flush completes, the buffer index wraps back to 0 (which may still be flushing). Independent flush tasks submitted to the thread pool can complete out of order. The new pool design: - Uses a std::deque of buffer instances that grows as needed - Allocates buffers from the pool when the current buffer needs to flush - Serializes flushes with a mutex to ensure FIFO callback ordering - Returns buffers to the pool after flush completion - Eliminates the race between buffer selection and write operations New Unit Tests -------------- - buffer_correlation_ordering.cpp: Tests that API records are always delivered before their corresponding retirement records - buffer_ordering_stress.cpp: Stress tests buffer flush ordering under high contention with multiple threads rapidly filling buffers HSA Tool Hooks -------------- Added hsa_tool_hooks.cpp/hpp to register an HSA OnUnload callback that waits for pending flush tasks before tool finalization, preventing "retired prematurely" errors during HSA shutdown. Sanitizer Improvements ---------------------- - LSAN: Set fast_unwind_on_malloc=1 to prevent deadlock in libgcc unwinder - LSAN: Added suppressions for external tools (liblzma, liblsan, seq, strdup) - TSAN: Added suppression for false positive on C++11 thread-safe static initialization in create_write_functor - ASAN/UBSAN: Added patterns for known issues in HSA runtime, HIP, perfetto - Disabled attachment tests for sanitizers due to library preloading issues Other Fixes ----------- - Thread-trace agent test: Use heap-allocated callback state - Correlation ID: Refactored reference counting and finalization ordering * [rocprofiler-sdk] Revert buffer pool design changes Revert buffer.cpp and buffer.hpp to the original double-buffer design from develop branch. The pool-based redesign introduced concerns about: - Signal safety (mutex vs atomic_flag) - API changes (flush() return type) - Complexity of the new design This revert removes: - Dynamic buffer pool with std::deque - std::mutex/condition_variable synchronization - buffer_correlation_ordering.cpp test - buffer_ordering_stress.cpp test The underlying buffer flush ordering issue will need to be addressed with a different approach that preserves the original API and synchronization characteristics. * [rocprofiler-sdk] Consistent fini_status checks to prevent correlation ID creation during finalization - Revert TOCTOU CAS loop change in sub_ref_count() - not needed with consistent checks - Add fini_status check in correlation_tracing_service::construct() with ROCP_CI_LOG warning - Add nullptr checks at all construct() call sites (queue.cpp, async_copy.cpp, memory_allocation.cpp) - Change all 'get_fini_status() > 0' to '!= 0' for consistent behavior: - hsa/queue.cpp (lines 105, 210) - hsa/async_copy.cpp (line 344) - hsa/hsa_barrier.cpp (line 43) - buffer.cpp (lines 107, 138, 185) This ensures no correlation IDs are created once finalization starts (fini_status != 0), preventing races between finalization and ongoing tracing operations. * [rocprofiler-sdk] Replace arrival-order checks with timestamp-based temporal validation Buffer records are not guaranteed to arrive in any specific order. Tests and samples should use timestamps for temporal ordering validation instead. Changes: - samples/external_correlation_id_request: Replace 'retired prematurely' arrival order check with timestamp-based validation that retirement timestamp >= max(end_timestamps) for records with the same correlation ID - tests/external_correlation.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check - tests/registration.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check - tests/roctx.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check Correlation IDs are not guaranteed to be monotonically increasing when records are sorted by timestamp. Temporal ordering should be validated using the timestamp fields in each record. * [rocprofiler-sdk] Revert external/CMakeLists.txt SYSTEM keyword removal Restore the SYSTEM keyword to target_include_directories for rocprofiler-sdk-fmt to match develop branch. * [rccl] Remove orphaned rocSHMEM gitlink Remove orphaned submodule reference that was introduced during a merge but never had a corresponding .gitmodules entry, causing CI failures with "fatal: no submodule mapping found in .gitmodules". * [rocprofiler-sdk] Add HSA ABI version 0x09 support Add ABI checks for HSA_AMD_EXT_API_TABLE_STEP_VERSION 0x09 which introduces hsa_amd_counted_queue_acquire and hsa_amd_counted_queue_release functions (added in rocr-runtime SWDEV-561708). * [rocprofiler-sdk] Handle finalized status gracefully in buffer flush operations This commit consolidates fixes for handling the finalization status during buffer flush operations across the SDK. Changes: - Tool and samples: Handle ROCPROFILER_STATUS_ERROR_FINALIZED gracefully when flushing buffers, as this indicates buffers were already flushed during finalization (not an error condition) - HSA handlers (queue.cpp, async_copy.cpp, hsa_barrier.cpp): Use > 0 check for fini_status to allow operations during finalization process - buffer.cpp: Revert fini_status checks to use > 0 for consistency - correlation_id.cpp: Add fini_status > 0 check with ROCP_TRACE logging to prevent correlation ID creation after finalization starts Files modified: - source/lib/rocprofiler-sdk-tool/tool.cpp - tests/tools/json-tool.cpp - source/lib/rocprofiler-sdk/tests/registration.cpp - source/lib/rocprofiler-sdk/tests/roctx.cpp - samples/api_buffered_tracing/client.cpp - samples/counter_collection/buffered_client.cpp - samples/counter_collection/device_counting_async_client.cpp - samples/external_correlation_id_request/client.cpp - samples/pc_sampling/client.cpp - source/lib/rocprofiler-sdk/buffer.cpp - source/lib/rocprofiler-sdk/context/correlation_id.cpp - source/lib/rocprofiler-sdk/hsa/queue.cpp - source/lib/rocprofiler-sdk/hsa/async_copy.cpp - source/lib/rocprofiler-sdk/hsa/hsa_barrier.cpp * [rocprofiler-sdk] Remove hsa_tool_hooks and simplify buffer flush handling Remove the hsa_tool_hooks infrastructure and simplify buffer flush calls in samples and tools. The ERROR_FINALIZED handling was overly complex and the hsa_tool_hooks OnUnload synchronization is no longer needed. Changes: - Remove hsa_tool_hooks.cpp/hpp and related registration.cpp code - Simplify buffer flush calls in samples to use direct ROCPROFILER_CALL - Simplify buffer flush in tool.cpp and json-tool.cpp - Remove ERROR_FINALIZED special handling from test files Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Fix output_stream move semantics to null source pointers The default move constructor and move assignment operator for output_stream did not null out the source's pointers after the move. This caused double-close when the moved-from temporary was destroyed, leading to use-after-free crashes (SIGSEGV in std::ostream::sentry). Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Improve Perfetto trace writer and sanitizer configuration - generatePerfetto.cpp: Move output_stream into shared_state to prevent use-after-free race conditions during Perfetto callback execution - run-ci.py: Simplify and consolidate sanitizer environment variable configuration for better maintainability Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Revert run-ci.py changes that broke sanitizer suppressions The previous changes removed MEMCHECK_SANITIZER_OPTIONS which is required for CTest to properly pass suppression files to the sanitizers during memcheck runs. Co-Authored-By: Claude <noreply@anthropic.com> * Revert "[rccl] Remove orphaned rocSHMEM gitlink" This reverts commit 1ad21003941355658fff8114fa27768f11a948f7. * [rocprofiler-sdk] Revert registration.cpp changes Revert changes to registration.cpp to match develop branch. Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Remove suppression file content printing from run-ci.py Co-Authored-By: Claude <noreply@anthropic.com> * Fix output_stream move ctor/assignment operator * Fix erroneous revert of registration.cpp * Fix handling of fini status in correlation ID construction * [rocprofiler-sdk] Fix OMPT segfault during finalization Add nullptr checks in OMPT tracing code to handle the case where correlation_tracing_service::construct() returns nullptr during finalization. This fixes segfaults in openmp-target-sample and tests.integration.execute.openmp-tools. The correlation ID construction now returns nullptr when fini_status > 0, but the OMPT callbacks were not checking for this, causing crashes when dereferencing the null pointer during OpenMP runtime shutdown. Changes: - event_common(): Return nullptr early if correlation ID is null - event(): Check for nullptr before calling sub_ref_count() - ompt_task_create_callback(): Return early if correlation ID is null - ompt_task_schedule_callback(): Return early if correlation ID is null * [rocprofiler-sdk] Fix HSA API tracing segfault during finalization Add nullptr check in hsa_api_impl::functor after correlation ID construction. During finalization, correlation_service::construct() returns nullptr, and without this check the code would dereference the null pointer when accessing corr_id->internal. This fixes the SEGV at address 0x000000000008 (null + 8 byte offset) that occurs when HSA async event threads call hsa_signal_destroy during runtime shutdown after finalization has started. --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
8760fb4976 |
attach: Formalize ROCAttach API (#1653)
* attach: Formalize ROCAttach API - Make ROCAttach public with public headers - Change detach to take a PID - attach and detach are now reentrant - Cleanup of states and signal handling in ptrace session - Fixes mixed up definition of ROCPROF_ATTACH_TOOL_LIBRARY - ROCPROF_ATTACH_TOOL_LIBRARY now always means the tool library loaded by the attachment target - ROCPROF_ATTACH_LIBRARY refers to the library used to perform attachment - Add direct call of rocprof-attach - Fix python library call of rocprof-attach - Function now named attach(), changed from main() * attach: rocprof-compute ROCAttach updates - Update to new library names - Correct usage of C lib detach * attach: add test for rocattach - Disable ASan, TSan, and UBSan for the new parallel-attach test - Lower log level for LSan tests, existing behavior from other tests --------- Co-authored-by: Ammar ELWazir <aelwazir@amd.com> |
||
|
|
36d9d33d90 |
Users/mkuriche/rocprofiler sdk fmt build fix memory header (#2537)
* [rocprofiler-sdk] Fix fmt::join build errors - remedy use of fmt::join without include <fmt/ranges.h> * include memory header * Disable FMT build for SDK CI * Add -DROCPROFILER_BUILD_FMT=OFF to sanitizer steps * Add temporary workaround for rccl.h issue * Add ROCPROFILER_INTERNAL_RCCL_API_TRACE to SDK CI builds * disable clang-tidy for vendored includes --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: jbonnell-amd <jason.bonnell@amd.com> |
||
|
|
65c048e918 |
Change rocprofiler-sdk CMake compatibility to AnyNewerVersion (#1632)
* Change rocprofiler-sdk CMake compatibility to AnyNewerVersion Update CMake package version compatibility from SameMinorVersion to AnyNewerVersion to allow downstream packages (like RDC) to use newer versions of rocprofiler-sdk without requiring exact minor version match. This fixes compatibility issues where RDC requests 1.0.0 but finds 1.1.0. * Update projects/rocprofiler-sdk/cmake/rocprofiler_config_install.cmake Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Change rocpd and roctx CMake compatibility to SameMajorVersion Update COMPATIBILITY setting from SameMinorVersion to SameMajorVersion for both rocpd and roctx packages to allow compatibility across major version boundaries. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |
||
|
|
a2288eb50b |
[rocprofiler-sdk] Install unit tests and helper functions for integration tests (#921)
* [rocprofiler-sdk] Install unit tests and helper functions for integration tests * Fix rocprofiler-sdk-tests-target export * Fix handling of cmake policy CMP0174 * Remove -vv from new pytest.ini files * add unit tests and integration tests. * add path to ci workflow. * misc. fixes. * pc sampling tests. * bug fixes. * pc sampling tests fix. * misc. * Update CMakeLists.txt * Update rocprofiler_config_install_tests.cmake, correct license name * fix units tests install issues. * fix counters_def file path. * fix bug, arg shifting. * vendor pytest-cmake. * cmake config fix. missing endfunction() * disable tests, 1.rocprofv3-trace-hip-libs. 2.kernel-tracing. 3.external_correlation 4.rocpd. * disable buffered-tracing test and remove pytest-cmake from requirements.txt. * disable hip-graph-tracing test. * fix building standalone tests to load rocprofiler-sdk cmake package first and then find rocprofiler_sdk_pytest module. * addressed comments: 1.add local bin path to code cov workflow. 2.add to cmake prefix path local bin. 3.use ROCPROFILER_MEMCHECK_PRELOAD_ENV_VALUE 4.misc. fix * enabled back tests api_buffered, external_correlation_id, hip-graph, kernel-tracing, rocpd, tracing-hip-in-libraries. and misc fixes(formating, extra fixtures for agent-index tests.) * cpack to use llvm bin for .hsaco debug symbols. * psdb tests fixes. * EOL. * misc. fixes and Disable api_buffered_tracing, external_correlation_id, hip-graph-tracing, kernel-tracing, rocpd, summary, tracing-hip-libraries, tracing-plus-counter-collection. * fix incorrect cmakelists file. * strip smallkernel.bin * format. * revert disabled tests commit. * misc. fix in counter tests. * misc. * search codeobj unit test assets in curr bin and install bin. * refactor newly added rocpd tests. * modify tests for newly added hip-host-tracing. * add LD LIB path to units, psdb is failing due to libs not being found. --------- Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> Co-authored-by: Venkateshwar Reddy Kandula <Venkateshwarreddy.Kandula@amd.com> Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com> |
||
|
|
12718139fe |
[rocprofiler-sdk] rename librocprofv3-attach.so (#1342)
* attach: rename librocprofv3-attach - Renames library to librocprofiler-sdk-rocattach - ROCAttach library will be formalized and documented in future commit * Address review comments - Rename rocprofv3-attach.py to rocprof-attach.py - Use common filesystem.hpp in rocattach * Fix component name typo * Doc fixup --------- Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com> |
||
|
|
6e195ded9b |
Update rocprofiler_config_interfaces.cmake to use different elf naming (#1722)
* Update rocprofiler_config_interfaces.cmake to use different elf naming * try out conditional for libelf * run cmake-format to fix formatting issue * Remove libelf.patch file from therock-ci-windows.yml * Remove libelf patch from therock-ci-linux.yml as well |
||
|
|
9cf8a5e0b5 | [ROCProfiler-SDK] Remove Python library dependency from Python bindings (#1451) | ||
|
|
4cca398b56 | [rocprofiler-sdk] Update rocprofiler-sdk CONTRIBUTING.md (#1371) | ||
|
|
6b109c11c4 |
[rocprofv3] Reorganize rocprofv3.avail python package (#175)
* Reorganize rocprofv3 python package adding Python version candidates review fix fix test fix remove extra line fix the exception handle fix Lint fail fix installation adding checks to check version format disable test for address sanitizer * review comments * Removing extra lines * fix format * Add lib/python3/site-packages to PYTHONPATH in setup-env.sh * rocprof-compute update rocprofv3 avail lib path * Make rocprofv3 python binding build commands consistent with other python bindings * fix cmake * fix rocprof-compute * revert cmake changes * fix rocprofv3 avail python library * fix cmake * fix cmake --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: Sriraksha Nagaraj <Sriraksha.Nagaraj@amd.com> Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com> Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com> Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com> |
||
|
|
026a4e82a3 |
Rollup of build changes needed for compat with TheRock. (#1086)
* Rollup of build changes needed for compat with TheRock. * When built for a non-default ROCM location, the HIP headers can't be found by a few targets. * Uses pkg_check for DRM libraries like ROCR-Runtime does (which avoids accidental fallback to system versions). * Robust fix for nolink targets * nolink targets essentially exist for include directories * all nolink targets are automatically added to rocprofiler-sdk-headers with a $<BUILD_INTERFACE:...> generator expression * Re-add previously used mechanism to find drm libs --------- Co-authored-by: Marius Brehler <marius.brehler@amd.com> Co-authored-by: Stella Laurenzo <stellaraccident@gmail.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
63a723a287 |
GFX12 PC Sampling support (#186)
The GFX12 host-trap PC sampling support in SDK and V3. Introducing parser tests specific to GFX12. Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com> |
||
|
|
bf49039005 |
[rocprofiler-sdk][rocprofiler-register] Initial Attachment Support (#316)
* attach: milestone: API tracing - This pairs with another commit in rocprofiler-sdk to fully function - Add ptrace entry points for tool attachment - API tracing works at this commit - Queue tracing not supported yet * attach: cleanup - Remove hardcode for loading of tool library - Make invoke registration functions public again * attach: proxy queue first draft - Adds ability to trace with queues during attachment - Must be paired with updated rocprofiler-sdk * attach: prestore overhaul - Must be paired with commit in rocprofiler-sdk * attach: add dispatch table rework - Register will load the prestore library and provide entrypoints to sdk * attach: formatting and cleanup * attach: revise dispatch table scheme * attach: formatting * attach: milestone: API tracing - This change must be paired with a change in rocprofiler-register to fully function. - API tracing works at this commit - Queue tracing not supported yet * attach: cleanup and comments * attach: Formatting and crash fixes * attach: add attach duration - Add option attach-duration-msec for attachment * Formatting + sglang hang fix via signal handling * Changed FATAL_IF to DFATAL_IF for scratch_memory due to persistent crash when iterating queues * attach: proxy queue first draft - Adds ability to trace with queues during attachment - Must be paired with updated rocprofiler-register * Allow null agents for scratch output * attach: improve queue library interface - Significant changes to force exported interfaces back to C - Fixes bug with unknown agents at attachment - Code objects' names may still be incorrect * attach: add code_object support - Kernel traces will now have names and all other information for launches - Add capture of hsa_executable to the queue library - Various logging improvements * attach: rename queue library to prestore * attach: prestore overhaul - Must be paired with commit from rocprofiler-register - Massive overhaul of code organization in prestore library - Separates registrations for different object types - Sets up future changes for initialization * attach: add prestore dispatch table - Removes linkage to prestore library from sdk * attach: cleanup * attach: formatting * attach: fix input prompt not appearing * attach: fix component name in cmake * attach: revert change to export level * Make prestore API public * attach: update sdk attachment library WIP - This commit is NONFUNCTIONAL - Changes around structure to remove classes - Seperate C linkage where needed - Still needs updates to register for correct usage * attach: update register with dispatch table WIP - This commit is NONFUNCTIONAL - Changes rocprofiler_register to handle dispatch table from attach library. - Still needs changes in SDK with dispatch table usage * attach: dispatch table wip - This commit is NONFUNCTIONAL * attach: move attach component into core * attach: rename to rocprofv3-attach * attach: add callbacks for new queues and code objects * attach: finish dispatch table implementation - Fixes kernel tracing * attach: add cmake variable for attachment support * feat: Add --attach alias for rocprofv3 with comprehensive attachment tests - Add `--attach` as an alias to existing `-p/--pid` functionality in rocprofv3.py - Create comprehensive attachment test suite with CSV and JSON output validation: - New attachment-test application for testing dynamic profiling scenarios - Unified test script supporting both CSV and JSON output formats - Pytest-based validation for kernel traces, memory copies, HSA API calls, and agent info - Add CMake integration for automated attachment testing - Support parameterized output directory and filename specification - Implement proper environment setup for attachment queue registration Tests verify successful attachment to running processes and capture of: - Kernel dispatch traces with workgroup/grid dimensions - Memory copy operations (H2D/D2H) with size validation - HSA API call traces across multiple domains - GPU/CPU agent information and capabilities * Documentation Update * attach: make attach script callable * Added ROCPROFILER_REGISTER_ATTACHMENT_TOOL_LIB to remove hardcoded name * attach: revert metrics library path changes * Generic Attachment in Register (#942) Remove tool references in register * Add second param to attach call in rocprof register * Add experimental reattachment support for ROCprofiler-SDK This commit introduces experimental reattachment functionality allowing tools to dynamically reattach to running processes with comprehensive design changes to support multiple attach/detach cycles: **Core Reattachment API:** - Add rocprofiler_tool_configure_result_experimental_t with tool_reattach/tool_detach callbacks - Add rocprofiler_call_client_reattach and rocprofiler_call_client_detach C exports - Implement reattachment tracking in rocprofiler_register_attach to differentiate initial attachment from reattachment cycles - Add rocprofiler_register_invoke_reattach for handling reattachment requests **Design Changes - Registration System Flow:** The registration system now supports a dual-path initialization: 1. Initial Attachment Flow: - rocprofiler_register_attach() -> rocprofiler_register_invoke_all_registrations() - Full tool initialization with complete context setup - Sets prev_attached atomic flag to track state 2. Reattachment Flow: - rocprofiler_register_attach() detects prev_attached=true -> rocprofiler_register_invoke_reattach() - Bypasses full re-initialization, calls client reattach callbacks instead - Preserves existing contexts and buffers, only reactivates profiling services **Design Changes - Tool Library Loading:** Enhanced rocprofiler-register library loading with function pointer resolution: - Extended rocp_set_api_table_data_t tuple to include reattach/detach function pointers - Automatic symbol resolution for rocprofiler_call_client_reattach/detach functions - Support for both LD_PRELOAD and dlopen scenarios with consistent callback availability **Design Changes - Context Management:** Introduced dual context systems for attachment scenarios: - get_contexts() - Original contexts for standard tool initialization - get_attach_contexts() - Separate context map for attachment-specific lifecycle - attach_init() - Creates contexts for ALL buffer tracing services using existing buffers - attach_start() - Selectively starts contexts based on configuration options - attach_detach() - Cleanly stops and destroys attachment contexts **Design Changes - Buffer Management:** Added reset_tmp_file_buffer() template for clean reattachment state: - Properly closes and removes old temporary files - Deletes existing file_buffer instances to prevent stale file position tracking - Creates fresh file_buffer instances for clean reattachment cycles - Addresses core issue where file position metadata becomes stale between cycles **Design Changes - Environment Variable Injection:** Added ROCP_REGISTERED_TOOL_ATTACH environment variable: - Distinguishes attachment-loaded tools from LD_PRELOAD scenarios - Enables registration system to apply attachment-specific logic - Helps tools adapt behavior for attachment vs standard initialization **Attachment Context Management:** - Add attach_init/attach_start/attach_detach functions for dynamic context lifecycle - Add reset_tmp_file_buffer template for clean reattachment state management - Implement get_attach_contexts() for tracking active attachment contexts **Test Infrastructure:** - Add projects/rocprofiler-sdk/tests/rocprofv3/reattach/ comprehensive test suite - Include reattachment test scripts with unified attachment/detachment cycles - Add validate.py with trace data validation for kernel, memory copy, HSA API, and agent info - Add conftest.py for JSON and CSV data loading utilities **Configuration Updates:** - Update CMakeLists.txt to include reattachment tests in build system - Add environment variable ROCP_REGISTERED_TOOL_ATTACH for attachment state tracking - Enhance rocprofiler-register library loading with reattach/detach function resolution **Flow Impact Analysis:** This design enables robust multi-cycle attachment by: 1. Preventing duplicate initialization on reattachment 2. Maintaining separate context lifecycles for attachment vs standard operation 3. Ensuring clean temporary file state between attachment cycles 4. Providing tools with explicit reattach/detach callback hooks 5. Supporting both programmatic and environment-based tool configuration The experimental nature allows for iteration on the API while establishing the foundation for production-ready dynamic profiling capabilities. * Fix misc clang-tidy warnings/errors * CMake Option and Environment Variable Updates - CMake: ROCPROFILER_REGISTER_ALWAYS_SUPPORT_ATTACH -> ROCPROFILER_REGISTER_BUILD_DEFAULT_ATTACHMENT - Env: ROCPROFILER_REGISTER_ATTACHMENT_ENABLED -> * Source reorganization * Formatting + new lines at EOF * Fix flake8 F841: local variable is assigned to but never used * Update attachment test - get rid of 5 second start delay - add roctx * Rework implementation - Remove rocprofiler_tool_configure_result_experimental_t in lieu of rocprofiler_configure_attach - Add <rocprofiler-sdk/experimental/registration.h> - TODO: Update process_attachment.rst * Handle re-attachment options - inherit options from previous attachment - check previous options do not modify data collection services * Fix support for tools w/o rocprofiler_configure_attach - fix segfault when rocprofiler_configure_attach does not exist - fix naming convention for functions accepting attach dispatch table - cleanup rocprofiler_configure_attach implementation in rocprofv3 tool * attach: remove unknown agent handling - Change was from earlier commit, no longer needed * attach: add error for attaching without library loaded * attach: revise version numbering * attach: register header revisions * attach: clang format register * attach: formatting * attach: fix build failure - Remove cross dependency into rocprofiler-sdk, fixes build on some systems * attach: revise register library detection * Update rocprofiler-register and attach library - formatting - proper signature of register_functor for rocprofiler-sdk-attach library callback - remove get_dispatch_registration_table() * Bump rocprofiler-register version to 0.6.0 + AnyNewerVersion * Fix output support for rocprofiler-sdk-tool * Fix formatting * Fix clang tidy errors * Misc rocprofiler-sdk-attach fixes * attach: add sigint handling to attach python * tool README.md formatting Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> * Fix buffered output issue * attach: add errors for tool attach * CI Fixes * Rework tests * attach: improve library loading in rocprofv3 attach * formatting * Update tests to use pytest framework * Fix test_attachment_hsa_api_trace * attach: catch ctypes exceptions * attach: fix leak in registration * attach: fix sanitizer tests * attach: fix sanitizer tests further * attach: disable attach asan tests * attach: disable ubsan test * attach: fix permissions in installed test package * attach: formatting --------- Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com> Co-authored-by: Tim Gu <Tim.Gu@amd.com> Co-authored-by: Claude Code <claude@anthropic.com> Co-authored-by: Benjamin Welton <bwelton@amd.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Co-authored-by: Benjamin Welton <bewelton@amd.com> |
||
|
|
069d5ecce2 |
[ROCProfiler SDK] Updating README Building & Installing Instructions (#931)
* Updating ROCProfiler SDK README * Fixing ROCProfiler SDK License * Fixing ROCProfiler SDK Installation Steps --------- Co-authored-by: Joseph Macaranas <145489236+jayhawk-commits@users.noreply.github.com> |
||
|
|
696881ae82 |
LICENSE clean up (#919)
- Clean up and standardization of MIT licenses after discussion with legal team. - Update README.md with blurb for top-level files. - MIT License explicitly mentioned for relevant projects. - Removal of years. - Copyright attribution should be to `Advanced Micro Devices, Inc.` and not `AMD ROCm(TM) Software` - Removal of `All rights reserved.` - Reduce line width of the text for readability. - Add clear visual separators for additional licenses. - Convert text files to markdown format for aforementioned separators. - Update build scripts to point to renamed files. - Fixed SMI doc references Co-authored-by: Maisam Arif <Maisam.Arif@amd.com> |
||
|
|
839c07c4aa |
[CI] Testing stability (#486)
* [CI] Testing Stability
- CMake option ROCPROFILER_DISABLE_UNSTABLE_CTESTS
- used for tests which periodically fail around 1 out of every 10 runs
- set to ON while instability remains, this needs to set to OFF in ROCm 7.1 or, ideally, ROCm 7.0.1
- Use FIXTURES_SETUP and FIXTURES_REQUIRED for some tests
- replace "threw an exception" with "${ROCPROFILER_DEFAULT_FAIL_REGEX}" for misc FAIL_REGULAR_EXPRESSIONS
* Remove contents of all EXCLUDE_{TESTS,LABEL}_REGEX from CI workflow
* Disable patch git step in code-coverage run
* Tweak spin time of reproducible runtime
* Removed patch git step in code-coverage run
* Update ROCPROFILER_DEFAULT_FAIL_REGEX
* Mark test-counter-collection tests as unstable
- add fixtures setup/required
* Remove ATTACHED_FILES_ON_FAIL
- CDash doesn't store enable downloading these properly anyway
* Relax collection-period fuzzing window
* Disable unstable collection-period test
- too unstable
* formatting
* Disable unstable device_counting_service_test.async_counters
* Suppress perfetto internal data race errors
* Switch code-coverage CI jobs to mi300 runner
* Timeout increases
* rocprofv3-test-rocpd updates
- add fixtures
- switch executable
- redefine input/output paths
* Revert code-coverage job to mi300a runner
* Update rocprofv3-test-rocpd-execute-multiproc
- reduce problem size
* disable multiproc rocpd
* Split code-coverage into separate workflow
- network issues cause this job to fail frequently
- when in a separate workflow, it can be restarted easily
* Fixtures for rocprofv3-test-trace-hip-in-libraries
* Disable unstable device_counting_service_test.sync_counters
* Potential fix for code scanning alert no. 171: Workflow does not contain permissions
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Switch code-coverage to run on rocprof-azure
- mi300a EMU runner set is unstable (network issues)
* tests/rocprofv3/pc-sampling SKIP_REGULAR_EXPRESSION
* Update rocprofv3-test-list-avail-trace-execute
- reduce log level and increase timeout
* rocprofv3: Prevent recursive call to rocprofv3_error_signal_handler + log chaining
* rocprofv3: Use ROCP_ERROR + std::exit instead of ROCP_FATAL
- should help with SKIP_REGULAR_EXPRESSION
---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
4d79e1df30 |
[SDK] Support CMake option for using internal RCCL tracing + (temporary) enable in CI (#457)
* Temp: disable RCCL tracing
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Adding option to disable rccl tracing from CMake
* Update codeql.yml
* Misc updates
- ROCPROFILER_BUILD_RCCL -> ROCPROFILER_INTERNAL_RCCL_API_TRACE
- env.EXTRA_TEMP_CMAKE_OPTIONS -> env.GLOBAL_CMAKE_OPTIONS
- add (advanced) option ROCPROFILER_INTERNAL_RCCL_API_TRACE
* Fix rocprofiler::sdk::get_enum_label
- missing enum labels for HIP_RUNTIME_API_TABLE_STEP_VERSION > 8
* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt
- improve various aspect of cmake -- particularly echoing where attdecoder_LIBRARY was found
* Use CMAKE_MESSAGE_INDENT
- add prefix to cmake messages to help indicate where messages are coming from
- make find_package(Python3 ...) QUIET for bindings
* Fix rocprofiler::sdk::get_enum_label
- handle HSA_AMD_EXT_API_TABLE_MAJOR_VERSION
* Fix rocprofv3 message for att library path
* Fix tests/rocprofv3/advanced-thread-trace/att_input.yml config
* Fix rocprofv3 check_att_capability + soversion/version library resolution
- Account for ROCPROF_ATT_LIBRARY_PATH in env in check_att_capability
- Add resolve_library_path
- supports resolution of library names to SOVERSION and VERSION paths
* Fix python linting error (unused import)
---------
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
7482372c0b |
[CMake] Removing the mention of failed while running rocminfo (#409)
* Making sure if GPU_TARGETS is given as input that we ignore the usage of rocminfo
* Update rocprofiler-sdk-utilities.cmake
* Update rocprofiler-sdk-utilities.cmake
[ROCm/rocprofiler-sdk commit:
|
||
|
|
3a62fee4ac |
[rocprofv3-avail] Rework rocprofv3-avail tool (#312)
---------
Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
e5097d6a36 |
[ROCm 7.0] [PC sampling] Disable PC sampling tests if the GFX arch doesn't support it (#436)
Disable PC sampling tests if the GFX arch doesn't support it
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0e93099fd7 |
[rocprofv3] SQLite3 database output (rocpd) support + rocprofiler-sdk-rocpd (#403)
* [rocprofv3] rocpd SQLite3 database output support
* Move counters xml and yaml to source/share/rocprofiler-sdk
- more representative of install hierarchy
* Add share/rocprofiler-sdk/rocpd SQL files
* Experimental rocprofiler-sdk SQL API
* rocprofv3 default output format is rocpd
* Fix rocpd event ids for counter collection w/o kernel dispatch
* Remove fktable entries from rocpd_tables.sql
* Fix rocpd schema path
* Fix install component for roctx python bindings
* rocprofiler-sdk-rocpd
- create include/rocprofiler-sdk-rocpd
- create rocprofiler-sdk-rocpd library, package, etc.
- default all "guid" fields to "{{guid}}" in tables
- remove "{{view_uuid}}" support (always unused)
* Migrate rocprofv3 to use rocprofiler-sdk-rocpd
* Fix missing foreign key reference
* Revert change
* Fix cmake comment
* Fix maybe-uninitialized compiler warning
* Fix maybe-uninitialized compiler warning
* Add logging to rocpd_sql_load_schema
* Improve string sanitization when inserting json strings
* Initialize rocpd logging on rocprofiler-sdk-rocpd library load
* Revert lib/output/generatePerfetto.cpp changes
* [temporary] Tweak rocprofv3-test-list-avail-trace-execute test log level
* Update get_install_path for lib/rocprofiler-sdk-rocpd/sql.cpp
- try to resolve issues on RHEL/SLES for dladdr
* Update lib/common/logging.cpp
- enable environ overrides
* dlsym for rocpd_sql_load_schema
* Make dl_info.dli_fname lexically normal
* Implement node_info alternatives if /etc/machine-id does not exist
* Misc include fixes
* SHA256 and UUIDv7 support
* Implement UUIDv7 in generateRocpd.cpp
* Support push/pop environment variables
* Minor tweak
* Fix glog segfaults when unsetting glog env
* Updated CHANGELOG
* Updates tests/pytest-packages
- rocpd_reader.py: RocpdReader
* Update tests / marker_views.sql
- add test_rocpd_data
* Update rocpd_tables.sql
- Use AUTOINCREMENT
- insert "uuid" and "guid" into rocpd_metadata
* Minor updates to generateRocpd.cpp
- don't quote GUID
- use sqlite3_open_v2
- use sqlite3_close_v2
* Update execute_raw_sql_statements_impl
- uses sqlite3_last_insert_rowid for autoincrement
* Update SQL deferred_transaction
- CI check for nullptr to connection
* Apply suggestions from code review
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
* Code review updates
- formatting
- replace if with switch
- remove loop for {{uuid}}
* Fix pmc_groups handling in rocprofv3
* Address code review feedback
- Include rocm_version in rocprofv3 version info
- Note `--version` option for `rocprofv3` in CHANGELOG.md
- remove commented out code
* Fix packaging dependencies
* Fix install package step of CI workflow
* Fix install package step of CI workflow
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
6efbc492f1 |
[SDK] Update HIP support for ROCm 7.0 (#413)
* Update HIP VERSION to 7.0.0, HIP_RUNTIME_API_TABLE_STEP_VERSION to 13
* Changed rocprofiler_config_interfaces.cmake to check version after finding package
* Update cmake/rocprofiler_config_interfaces.cmake
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
b097e276a9 |
[rocprofv3] Add rocpd output support (part 1: prelude) (#401)
* [rocprofv3] Add rocpd output support (part 1: prelude)
- git submodules for sqlite3, GOTCHA, and pybind11
- HIP stream data
- rocprofiler_query_intercept_table_name(...)
- serialization load
- rocprofiler::sdk::get_perfetto_category(KindT)
- rocprofiler::sdk::parse::strip
- common library updates
- md5sum
- hasher
- simple_timer
- static_tl_object
- get_process_start_time_ns(pid_t)
- output library updates
- node_info
- file_generator (generator is now virtual base class)
- stream info updates
* Added submodules
* Code review updates
* Minor unused-but-set-X warning fixes
* Update CI
- install libsqlite3-dev package
* Update CI
- install libsqlite3-dev package
* Fix static thread-local object memory leak
- also fix signal handler chaining
* Remove URL from comment
* Remove page migration exception
* Enable ROCPROFILER_BUILD_SQLITE3 by default
- try find_package(SQLite3) first and then build when ROCPROFILER_BUILD_SQLITE3=ON
* Fix gotcha installation
- make install of target optional
* Validate tracing + counter collection dispatch data
- i.e. correlation ids, thread ids, timestamps
* Make find_package(SQLite3) optional
- ROCm CI does not have SQLite3 dev package installed and cannot build from source (missing tclsh)
* Fixes to tracing + counter collection test
* get_process_start_time_ns update
- original implementation did not work
* Fix pytest-packages test_perfetto_data for counter collection
- erroneous failure when used with same PMC + multiple agents
* cmake policy: option() honors normal variables
- for GOTCHA submodule
* Improve samples/api_buffered_tracing stability
- reduce likelihood of sporadic exception throw
* Update gotcha submodule
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
c47e5838f1 |
[rocprofv3-benchmark] SDK and rocprofv3 Benchmarking Suite (#157)
* Adding Benchmarking Stg1
* config fix
* reset
* add jpeg and decode traces in iteration
* address comments benchmark config files.
* address comments.
* address comments.
* address comments: revert cntrl ctx.
* address comments: revert csv output.
* resolve merge conflits.
* format.
* build fix.
* fix hip runtime api traces.
* loop cb services.
* format.
* bug fix.
* Fix operator>
- public C++ comparison operator
* Update configuration options
- support selected regions (--selected-regions)
- support writing output config json (--output-config)
- update serialization data
* rocprofv3 tool library misc updates
- lambda for starting context
- support for writing config json
* Tool library updates
- Finished support for all benchmarking modes
- Added build spec support to config json
* Fix ROCPROFILER_SOVERSION
- this value should not be multiplied by 10,000
* Minor tweak to rocprofv3
* Benchmarking scripts
* formatting
* Fix duplicate include
* Add reproducible-dispatch-count test app
- used in benchmarking
* registration logging
- report number of registered contexts and active contexts after client initialization
* Serialize environment in rocprofv3 output config
* ROCPROFILER_BUILD_BENCHMARK CMake option
* Update benchmark SQL schema
- hash_id is text
- add md5sum to benchmarked_app
- remove app_id from benchmarked_sdk
- add sdk_id to benchmark_config
- separate hip_trace into hip_runtime_trace and hip_compiler_trace
- use INT instead of INTEGER for MySQL compatibility
- add count column in benchmark_statistics
- allow std_dev to be NULL in benchmark_statistics
* Update rocprofv3-benchmark.py
- use md5 instead of python hash (which includes random seed)
- use args.mysql_database
- compute md5sum of executable
- fix insert_benchmark_config
- marker trace fixes
- memory allocation fixes
- split hip_trace into hip_{runtime,compiler}_trace
- remove app_id from benchmarked_sdk
- support warmup runs
- count field in benchmark_statistics
* Support launcher and environment in YAML
* Update reproducible-dispatch-count.cpp
- support mode which doesn't use hip event timing
* Misc rocprofv3-benchmark.py updates
- fix some MySQL support
- remove some unnecessary logging
* support mysql db.
* Format.
* Updated SQL input files
- moved benchmark_schema.sql to benchmark_table.sql
- added benchmark_views.sql
- uses {{metric}} syntax for variable substitution
* cmake formatting
* update rocprofv3-benchmark.py
- benchmark config labels
- overhead views
* Encode rocprofv3-benchmark PID in rocprofv3 and timem output files
* Minor tweak to benchmark_views.sql
- include count
- reorder fields for readability
* split statements and use IS if values is NONE.
* use backtick instead of double quotes and add IS before NOT NULL.:
* Adding Mandelbrot Benchmark App
* Adding Dockerfile example
* Update dockerfile
* Update dockerfile
* [SDK] rocprofiler_query_external_correlation_id_request_kind_name
* Execution-profile benchmark mode
* Execution profile SQL support
* Rename mandlebrot folder + misc clang-tidy
* [rocprofv3-benchmark] Execution profile support
* Update installation
* add work dir when setting git revision, useful when building outside src.
* Set FULL_VERSION_STRING and ROCPROFILER_SDK_GIT_REVISION
- when benchmark folder is top-level
* Remove unused python packages from requirements.txt
* Use ldd/pyelftools to include linked libs for md5sum
- also add --filter-benchmark and --filter-rocprofv3 options
- support labeling the rocprofv3 options
- use more argparse groups
- more generic application of filters
- support variable substitution in environment, e.g. PATH=/some/path:$PATH
* Environment improvements
- improve reproducibility when env set via input file vs. shell
- support "environment-ignore" to remove environment variables
* Misc formatting
* Misc. fix
* use backticks for defining new columns name
* Support shuffling the order of benchmark modes/rocprofv3 args
* Address review comments
* Update Dockerfile
- rename to Dockerfile
- reduce to one layer
* Support docker build arg BRANCH
---------
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
11fdedb874 |
[Build System] Install FindrocDecode.cmake and FindrocJPEG.cmake (#382)
Install FindrocDecode.cmake and FindrocJPEG.cmake
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
9f7703f918 |
Build system (libdw), correlation ID, and shebang fixes (#354)
* Fix compilation for output library
- link to targets for ATT (amd-comgr, dw, elf)
* Relax correlation ID retirement log failures
- only fail for correlation ID retirement underflow when building in CI mode
* Fix shebang for several files
- license was inserted before shebang in several places
* Update code coverage exclude folders for samples
* Tweak to agent tests
- test to make sure hsa agent is not the old value instead of testing that it is the new value
* Fix libdw include/link
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
692d041316 |
[SDK] Release 1.0 Public API Modifications (#277)
* Make sure all structs/enums can be forward declared
* Updates to counter collection
- consistency updates and cleanup
* Conversion of dimension information to info struct
* Added deprecated folder
* Testing changes
* merge changes
* Fix shadowed variable
* Source code formatting
* Fix shadowed variable
* Update rocprofiler_counter_info_v1_t member names
* Split version.h into version.h and ext_version.h
- ext_version.h contains external version info, e.g. ROCPROFILER_HSA_API_TABLE_MAJOR_VERSION, ROCPROFILER_HSA_RUNTIME_VERSION
- this reduces amount of recompilation after a commit since version.h gets updated with the git revision
* profile_config -> counter_config
* EOF new line
* [Samples] Reduce header includes + reorg counter collection samples
* Misc compilation fixes
- shadowed variables
- use of [[deprecated("...")]] in C code
- unused variables
* Minor misc modifications
- use common:: instead of rocprofiler::common:: when inside rocprofiler namespace
- counters.cpp
- move local anon namespace functions into rocprofiler::counters:: anon namespace
- use std::string_view for get_static_string
- const ref for get_static_ptr
- misc namespace shortening
* [Public API] rocprofiler_get_version_triplet + rocprofiler_version_triplet_t
- struct rocprofiler_version_triplet_t containing fields for the major, minor, and patch version
- public API function: rocprofiler_get_version_triplet
- define C++ operators for rocprofiler_version_triplet_t
- C++ function compute_version_triplet
* [Tests] Improve async-copy-testing test
- relax constraints
- improve logging
* Update counter_config.h doxygen docs
* ROCPROFILER_SDK_BETA_COMPAT
- ppdef which helps with renaming when set to 1
* Remove spurious include
* Fix includes for cxx/version.hpp
* Doxygen fixes for rocprofiler_get_version and rocprofiler_get_version_triplet
* Public API Experimental Designation
- ROCPROFILER_SDK_EXPERIMENTAL added to experimental function
- "(experimental)" added to doxygen @brief entries
* Fix use of assert instead of static_assert in hip/stream.cpp
* Use typedef instead of define for rocprofiler_profile_config_id_t
* Use inline rocprofiler_{create,destroy}_profile_config instead of ppdef
- added <rocprofiler-sdk/deprecated/profile_config.h>
* Doxygen for rocprofiler_{create,destroy}_profile_config
* ROCPROFILER_SDK_DEPRECATED_WARNINGS
* Temporarily comment out ROCPROFILER_SDK_DEPRECATED_WARNINGS=1
* cmake formatting
* Misc variable renaming in samples and tests
* Fix declarations of types
* Fix hip stream tracing service struct name
- rocprofiler_callback_tracing_stream_handle_data_t renamed to rocprofiler_callback_tracing_hip_stream_api_data_t
* Rename "HIP_STREAM_API" to "HIP_STREAM"
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
36f4788ad5 |
[CI] Miscellaneous Testing Updates (#305)
* Add rocprofiler-sdk-utilities.cmake
- contains cmake function rocprofiler_sdk_get_gfx_architectures
* Update perfetto_reader.py
- fix hash collision
* Update project names in tests folders
- rocprofiler-tests -> rocprofiler-sdk-tests
* Fix incorrect allocation-error handling
* [CI] Disable openmp tests for navi2, navi3, and navi4
* Suppress leaks by omptarget and llvm
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
3d1b5eec37 |
[CPack] Prevent the modification of interpreter directives (#278)
* SWDEV-521309 - Prevent the modification of interpreter directives
CPACK is converting /usr/bin/env python3 to /usr/libexec/platform-python in RHEL8.
Undefining __brp_mangle_shebangs will prevent the same
* Correct the cmake format
[ROCm/rocprofiler-sdk commit:
|
||
|
|
17b280e171 |
[CI] Update scratch-memory-tracing test (#304)
* Update scratch-memory-tracing test
* Update cmake/rocprofiler_{formatting,linting}.cmake
- fix typo: ROCPROFILE_CLANG_{FORMAT,TIDY}_EXE -> ROCPROFILER_CLANG_{FORMAT,TIDY}_EXE
* Disable assertion in tracing-hip-in-libraries validation
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
17272d5df1 |
Re-enable OpenMP target and testing (#126)
* Re-enable OpenMP target and testing
* Enable openmp target tests on mi200 jobs
* Fix direct self-inclusion of header file
* Enable openmp-target testing on vega20
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
74efdad1aa |
rocJPEG API Tracing (#73)
* rocDecode API Tracing support
* Test bin file added to rocdecode. Need to add validate python methods
* Added option to not make rocDecode tests
* Added rocdecode and rocprofv3 tests
* Added csv test
* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI
* Add option to avoid building rocdecode tests
* Added option to avoid building rocdecode bin file
* Support for rocJPEG API Trace
* Added newline to rocjpeg_version.h
* json-tool code added, initial test/bin commit
* Formatting
* Resolved rocjpeg bin test compilation errors
* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed
* Formatting and compilation errors
* Minor fixes
* Copyright year update and minor fixes
* Doc update fix
* Added rocjpeg csv file in data
* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes
* Added rocdecode and rocjpeg to CI
* Removed rocdecode and rocjpeg from CI and added back build tests option
* Updated Cmake Files
* Added rocDecode and rocJPEG to CI
* Remove cmake line added in error
* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes
* Added find_package for test
* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path
* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests
* Resolve merge conflicts and formatting
* Added regex find and replace instead of include for CI
* VAAPI package causing errors on Vega20
* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved
* Removed workflows regex
* Formatting and minor test modification
* Modified test for vega20
* Update rocDecode and rocJPEG cmake and tests
* Changelog
* Fix merge conflict
* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing
* Removed if found statements, replaced with TARGET:EXISTS
* Skip json file for rocjpeg and rocdecode tests if not supported
* Add os import
---------
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
95ac740f25 |
Fix install for conversion-script (#211)
[ROCm/rocprofiler-sdk commit:
|
||
|
|
e677801859 |
Undefined behavior warnings caught by ROCPROFILER_DEFAULT_FAIL_REGEX (#23)
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX
- add UBSAN_OPTIONS to setup-sanitizer-env.sh
* Improve ROCPROFILER_DEFAULT_FAIL_REGEX
* Use -fno-sanitize-recover=undefined flag
- this compiler flag causes all undefined behavior errors to exit
* Revert ROCPROFILER_DEFAULT_FAIL_REGEX
* fix for shift overflow
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0a735d1684 |
Partial fix of legacy rocprofiler project name (#110)
* Partial fix of legacy rocprofiler project name
* Formatting fix
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
66465c3954 |
Enhance CMAKE install instructions with std install location/destination (#85)
* Enhancement - usage of package name flags commonly across for getting unique folder name
* Enhancements - updating libexec/pkg usage, avoid sbin
* CMAKE Format Update
* Python Format Update
* Revert "Enhancement - usage of package name flags commonly across for getting unique folder name"
This reverts commit 2dcd1ac5f22ab90112d90648e4b5dab5c54bc639.
* REview Comments - Revert PACKAGE_NAME usage
* Review Comments - Update source folders accordingly to new cmake install locations
[ROCm/rocprofiler-sdk commit:
|
||
|
|
3e1b8ba4ec |
rocDecode API Tracing Support (#49)
* rocDecode API Tracing support
* Test bin file added to rocdecode. Need to add validate python methods
* Added option to not make rocDecode tests
* Added rocdecode and rocprofv3 tests
* Added csv test
* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI
* Add option to avoid building rocdecode tests
* Added option to avoid building rocdecode bin file
* Merge conflict error
* CMake files changed in response to review comments. Attempting to implement callbacks.
* Turned off test building for rocdecode
* Minor fixes for review comments
* Review comments
* Updated formatting
* Document changes and format.hpp reversion. Need to remove iterate args support for now for later update.
* Remove iterate args support
* Remove iterate-args
* enforce abi versioning in macro if
* Fix doc error
* removed spaces to fix indentation error
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
f8442415f8 |
SWDEV-492607: Adding ATT wrapper (#40)
* Adding att parser wrapper
* Adding ATT tests as optional
* Adding decoder API for query capability
* Removed samples
* Formatting
* adding new line
* Removed perfetto and moved to static library
* using default search for lib
* Updated to SDK
* Namespace changes
* Added tests
* Small refactor
* Updated API to receive agent_id
* Fixing tests
* Tidy fixes
* Not write to file
* Switch to filesystem.hpp
* Compilation fixes
* Formatting
* Tidy fix
* Removed likely
* Adding tests
* Added gfx9 test
* Adding gfx12 tests
* Formatting
* Enable tidy
* Fix tests
* Fix deadlock on agent test
* Workaround ASAN
* Moving query outside class.
* Fix standalone tool
* Addressing comments
* Formatting
* Change query name
* Fixed some tests. Updated PR comments.
* Formatting
* Improved coverage
* Formatting
* Fix for comments
* Formatting
* Adding some description. Fix error type.
---------
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
b2aa29eb0a |
Skip Building of OpenMP Samples and Tests (#77)
* Add option to disable openmp samples
* Skip building openmp tests and samples for now
[ROCm/rocprofiler-sdk commit:
|
||
|
|
a79f8a0198 |
SDK: OMPT Support (#22)
* Ability to select alternative compiler per file
Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.
Misc updates
Update OpenMP target sample
- samples/ompt -> samples/openmp_target
- fix sample test of openmp-target
- reorganize files
Rework OpenMP implementation
Minor OpenMP implementation cleanup
Rename samples/openmp_target CMake targets
Add tests/bin/openmp
- OpenMP target test app in tests/bin/openmp/target
Format samples/openmp_target CMakeLists.txt
Misc lib/rocprofiler-sdk/openmp cleanup
- fix includes
- convert_arg
Update openmp.def.cpp
- tweak includes
- remove lots of temporary variables
Update samples
- common::get_callback_id_names() -> common::get_callback_tracing_names()
- add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample
Fix code object operation names
- add "CODE_OBJECT_" prefix
Update include/rocprofiler-sdk/openmp/api_id.h
- remove spurious comment
Miscellaneous openmp updates
- similar API for openmp_begin and openmp_end
- move implementations of ompt callbacks to openmp.cpp
- ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events
[SWDEV-484495] Fix int truncation in CSV output (#1098)
CSV output truncates doubles to ints when it shouldn't. Derived metrics
are (mostly) doubles and lose precision (or become worthless) if treated
as an int. Converted these to double to match the format we return from
rocprof-sdk.
Co-authored-by: Benjamin Welton <ben@amd.com>
Update limit for max counter records in rocprof-tool (#1073)
A fixed sized std::array is used to store counter records in rocprofiler SDK. This limit was breached in SWDEV-484742. Upping the limit to 512 to be less likely to reach this limit again.
adding proxy ompt_data_t * arguments
fixes for proxy pointers
- Implement proxy ompt_data_t* pointers for clients
- Add ompt_data_t* arguments back to callback API
- Modify openmp sample to illustrate use of proxy pointers
formatting
SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083)
Fixing some accumulate metrics (#1089)
* Fixing some accumulate metrics
* Fixing some more accumulate metrics
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
updating rocprofv3 help options (#1113)
* updating rocprofv3 help options
* updating CHANGELOG
Fixing installed pacakge tests in CI (#1119)
* Fixing installed pacakge tests in CI
* Formatted rocprofv3.py with black formatter
SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112)
* SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests.
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Adding backlog for codeobj changes
* Formatting
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
---------
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
SWDEV-487621: Fixes for metric definitions (#1118)
* Fixes for metric definitions
* Removing gfx8
* Update changelog
* Fixing unit tests
* Small fixes
* Fix for write size
Fix PSDB change (#1120)
Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit
|
||
|
|
9805ea599c |
Ability to select alternative compiler per file (#1086)
[ROCm/rocprofiler-sdk commit:
|
||
|
|
e6a4ad787a |
SWDEV-466390: Adding License for roctx, docs, tests packages (#1174)
* Adding License for roctx, docs, tests packages
* Fixing Docs/ROCTx packages
* Fixing roctx path
[ROCm/rocprofiler-sdk commit:
|
||
|
|
3e64cedc0c |
SDK: create CMake option for strict checks on CPU vs. GPU timestamps (#1159)
* SDK: create CMake option for strict checks on CPU vs. GPU timestamps
- Configurating CMake with `ROCPROFILER_BUILD_CI_STRICT_TIMESTAMPS=ON` will enable fatal errors if dispatch/memcpy timestamps on GPU are outside of the start/end time from the CPU
- `ROCPROFIELR_BUILD_CI_STRICT_TIMESTAMPS` defaults to the value of `ROCPROFILER_BUILD_CI`
* Formatting
* Disable async_copy frequency scaling
* Disable profiling dispatch time frequency scaling
* Support runtime configuration via env variables
- ROCPROFILER_CI_FREQ_SCALE_TIMESTAMPS env variable will enable scaling the timestamps based on the hsa timestamp period
- ROCPROFILER_CI_STRICT_TIMESTAMPS env variable will enable strict timestamp checks
- when cmake is configured with ROCPROFILER_BUILD_CI_STRICT_TIMESTAMPS=ON, this env variable defaults to true
* ROCPROFILER_BUILD_CI_STRICT_TIMESTAMPS defaults to OFF
* Update cmake-target
* Common tracing::adjust_profiling_time
---------
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
07f698a532 |
CMake: Consistently name CMake Targets (#1082)
* Change all rocprofiler-X target names to rocprofiler-sdk-X
* Update rocprofiler-sdk-config.cmake
- fix install tree target names
- simplify logic for using find w/ components and find w/o components
* Update rocprofiler-sdk-roctx-config.cmake
- simplify logic for using find w/ components and find w/o components
* Update samples/intercept_table/CMakeLists.txt
- demonstrate/test use of `find_package(rocprofiler-sdk ... COMPONENTS ...)`
[ROCm/rocprofiler-sdk commit:
|
||
|
|
d87319ba1c |
Miscellaneous Testing Fixes + CMake find_package include guard (#1106)
* Improve ROCPROFILER_DEFAULT_FAIL_REGEX
* Support find_package called twice
* Skip iteration of ROCPROFILER_AGENT_TYPE_CPU
* Relax tests/rocprofv3/summary
* Tweak to rocprofiler-sdk-tool/tool.cpp
* Move rocprofv3-trigger-list-metrics to libexec
* Remove PC_SAMPLING_TESTS_REGEX from CI workflow
* Update packaging
- core component depends on roctx component
* Increase verbosity for failing tests
* Fix RPATH of rocprofv3-trigger-list-metrics
---------
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
3a65aabe8b |
Testing updated RPM dockers (#1136)
* Testing updated RPM dockers
* Trying to fix PSDB for test package dependency
[ROCm/rocprofiler-sdk commit:
|
||
|
|
b610c50913 |
Fixing kokkosp tool library packaging (#1121)
* Fixing kokkosp tool library packaging
* Update source/lib/rocprofiler-sdk-tool/kokkosp/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update CMakeLists.txt
* Update CMakeLists.txt
* Component Requirement in CPack
* Adding package dependency
* Update CMakeLists.txt
* Update rocprofiler_config_packaging.cmake
* Fix rocprofiler-sdk-tool-kokkosp BUILD/INSTALL RPATH
- CMAKE_INSTALL_LIBDIR doesn't help
* Add BUILD/INSTALL RPATH to rocprofv3-trigger-list-metrics
- fixes packaging issues
* Update packaging
- core depends on rocprofiler-sdk-roctx
- add CPACK_DEBIAN_PACKAGE_SHLIBDEPS_PRIVATE_DIRS to resolve inter-package dependencies
* Fix package depends version format
* Improve tests/rocprofv3/summary/validate logging
* Update CI workflow
- prioritize roctx package in Install Packages step
* Remove setting <package-name>_VERSION in config.cmake.in
- this is automatically handled by existence of <package-name>-config-version.cmake
* Update rocprofiler-sdk-config.cmake
- relax find_package versioning requirements to same major and minor version
* Update rocprofiler-sdk-config.cmake
- relax find_package versioning requirements (remove EXACT, specify range)
* Tweak CI workflow
* Update perfetto_reader.py
- better handle failure to load trace processor
* Misc cleanup for config packaging
* Update config packaging
* Update config packaging
* Revert perfetto for core-rpm packages
* Revert perfetto for core-rpm packages
- perfetto < 0.9.0
* Tweak tests/rocprofv3/summary/validate.py
- reorder some checks
---------
Co-authored-by: Ammar Elwazir <aelwazir@useocpm2m-387-013.amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
3df0605296 |
Correct the package name of rocprofiler-sdk (#1126)
* Correct the package name of rocprofiler-sdk
ROCM VERSION(for ex: 60300) was missing in the package name.
Added the same
* Use cmake cache string while setting the variable for ROCm Version
* correct the cmake-format
---------
Co-authored-by: Ranjith Ramakrishnan <Ranjith.Ramakrishnan@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
fcd6cc45bd |
Package RCCL headers to support adding RCCL support w/o installed headers (#1075)
- in ROCm CI, rocprofiler-sdk gets built before RCCL is installed, this is a workaround for this issue
[ROCm/rocprofiler-sdk commit:
|
||
|
|
c47a128941 |
Add support for RCCL tracing (#1047)
* [Draft]: Add support for RCCL tracing
Address comments
* [Draft]: Add support for RCCL tracing
Address PR comments, changes from RCCL upstream
* Add RCCL library table registration
Working on adding support to rocprofiler-register
* Support compilation w/o <rccl/amd_detail/api_trace.h>
- dummy api_trace.h header
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED when RCCL does not have api_trace.h header
* RCCL API tracing tool support
- add to rocprofv3
- add to json-tool
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|