develop
222 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
94c246eb9e | attach: fix typos and older names in documentation (#2684) | ||
|
|
b509e9bd77 |
[rocprofiler-sdk] Fix domain_ops_padding for 515+ HIP operations (#2941)
* [rocprofiler-sdk] Fix domain_ops_padding for 515+ HIP operations

  The HIP runtime API now has 515+ operations (as of ROCm 7.x), but domain_ops_padding was set to 512. This caused std::out_of_range exceptions when checking operations >= 512 via std::bitset::test().

  Changes:
  - Increase domain_ops_padding from 512 to 1024
  - Add compile-time static_assert to validate padding is sufficient for all API domains (HIP, HSA, marker, RCCL, rocDecode, rocJPEG)

  Co-Authored-By: Claude (claude-opus-4.5) <noreply@anthropic.com>
* Update projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/context/domain.cpp

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* [rocprofiler-sdk] Apply clang-format-11 to domain.cpp

  Co-Authored-By: Claude (claude-opus-4.5) <noreply@anthropic.com>
* Rework implementation to ensure coverage of all operation enums
* Fix compiler error in unit test for enum_string.cpp
* Fix data types of domain_ops_padding values
* Revert some changes in domain.cpp

---------

Co-authored-by: Claude (claude-opus-4.5) <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
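The failure mode and the compile-time guard described in the commit above (#2941) can be sketched roughly as follows. This is an illustration only: `hip_operation_count`, `old_padding`, `new_padding`, `test_throws`, and `padding_demo` are hypothetical names, and the count of 515 is an assumed value taken from the "515+" wording of the commit message. Only the values 512 and 1024, the use of `std::bitset::test()`, and the `static_assert` idea come from the commit itself.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Assumed values for illustration; the SDK derives the real counts from its API enums.
constexpr std::size_t hip_operation_count = 515;   // "515+" per the commit message
constexpr std::size_t old_padding         = 512;   // value before the fix
constexpr std::size_t new_padding         = 1024;  // value after the fix

// Compile-time guard in the spirit of the commit: the build fails if the padding is too small.
static_assert(hip_operation_count <= new_padding,
              "padding is too small for the number of HIP operations");

// Returns true if std::bitset<N>::test(pos) throws std::out_of_range.
template <std::size_t N>
bool test_throws(std::size_t pos)
{
    std::bitset<N> ops{};
    try
    {
        (void) ops.test(pos);
    } catch(const std::out_of_range&)
    {
        return true;
    }
    return false;
}

// Usage: the highest assumed operation index overflows the old padding but not the new one.
inline void padding_demo()
{
    constexpr std::size_t last_op = hip_operation_count - 1;
    assert(test_throws<old_padding>(last_op));
    assert(!test_throws<new_padding>(last_op));
}
```

`std::bitset::test()` is bounds-checked, which is why the undersized padding surfaced as an exception rather than as silent memory corruption.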
|
|
1517a398bf |
[rocprofiler-sdk] Buffer finalization fixes and HSA ABI 0x09 support (#2318)
* [rocprofiler-sdk] Fix buffer flush ordering and sanitizer CI improvements Buffer Pool Design ------------------ Replace the fixed array-based double buffer with a dynamic pool design to fix race conditions that caused "internal correlation id was retired prematurely" errors. The original design had a race where flush callbacks could be delivered out-of-order: when buffer 0 fills and begins flushing, writes go to buffer 1. If buffer 1 fills before buffer 0's flush completes, the buffer index wraps back to 0 (which may still be flushing). Independent flush tasks submitted to the thread pool can complete out of order. The new pool design: - Uses a std::deque of buffer instances that grows as needed - Allocates buffers from the pool when the current buffer needs to flush - Serializes flushes with a mutex to ensure FIFO callback ordering - Returns buffers to the pool after flush completion - Eliminates the race between buffer selection and write operations New Unit Tests -------------- - buffer_correlation_ordering.cpp: Tests that API records are always delivered before their corresponding retirement records - buffer_ordering_stress.cpp: Stress tests buffer flush ordering under high contention with multiple threads rapidly filling buffers HSA Tool Hooks -------------- Added hsa_tool_hooks.cpp/hpp to register an HSA OnUnload callback that waits for pending flush tasks before tool finalization, preventing "retired prematurely" errors during HSA shutdown. 
Sanitizer Improvements ---------------------- - LSAN: Set fast_unwind_on_malloc=1 to prevent deadlock in libgcc unwinder - LSAN: Added suppressions for external tools (liblzma, liblsan, seq, strdup) - TSAN: Added suppression for false positive on C++11 thread-safe static initialization in create_write_functor - ASAN/UBSAN: Added patterns for known issues in HSA runtime, HIP, perfetto - Disabled attachment tests for sanitizers due to library preloading issues Other Fixes ----------- - Thread-trace agent test: Use heap-allocated callback state - Correlation ID: Refactored reference counting and finalization ordering * [rocprofiler-sdk] Revert buffer pool design changes Revert buffer.cpp and buffer.hpp to the original double-buffer design from develop branch. The pool-based redesign introduced concerns about: - Signal safety (mutex vs atomic_flag) - API changes (flush() return type) - Complexity of the new design This revert removes: - Dynamic buffer pool with std::deque - std::mutex/condition_variable synchronization - buffer_correlation_ordering.cpp test - buffer_ordering_stress.cpp test The underlying buffer flush ordering issue will need to be addressed with a different approach that preserves the original API and synchronization characteristics. 
* [rocprofiler-sdk] Consistent fini_status checks to prevent correlation ID creation during finalization - Revert TOCTOU CAS loop change in sub_ref_count() - not needed with consistent checks - Add fini_status check in correlation_tracing_service::construct() with ROCP_CI_LOG warning - Add nullptr checks at all construct() call sites (queue.cpp, async_copy.cpp, memory_allocation.cpp) - Change all 'get_fini_status() > 0' to '!= 0' for consistent behavior: - hsa/queue.cpp (lines 105, 210) - hsa/async_copy.cpp (line 344) - hsa/hsa_barrier.cpp (line 43) - buffer.cpp (lines 107, 138, 185) This ensures no correlation IDs are created once finalization starts (fini_status != 0), preventing races between finalization and ongoing tracing operations. * [rocprofiler-sdk] Replace arrival-order checks with timestamp-based temporal validation Buffer records are not guaranteed to arrive in any specific order. Tests and samples should use timestamps for temporal ordering validation instead. Changes: - samples/external_correlation_id_request: Replace 'retired prematurely' arrival order check with timestamp-based validation that retirement timestamp >= max(end_timestamps) for records with the same correlation ID - tests/external_correlation.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check - tests/registration.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check - tests/roctx.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check Correlation IDs are not guaranteed to be monotonically increasing when records are sorted by timestamp. Temporal ordering should be validated using the timestamp fields in each record. * [rocprofiler-sdk] Revert external/CMakeLists.txt SYSTEM keyword removal Restore the SYSTEM keyword to target_include_directories for rocprofiler-sdk-fmt to match develop branch. 
* [rccl] Remove orphaned rocSHMEM gitlink Remove orphaned submodule reference that was introduced during a merge but never had a corresponding .gitmodules entry, causing CI failures with "fatal: no submodule mapping found in .gitmodules". * [rocprofiler-sdk] Add HSA ABI version 0x09 support Add ABI checks for HSA_AMD_EXT_API_TABLE_STEP_VERSION 0x09 which introduces hsa_amd_counted_queue_acquire and hsa_amd_counted_queue_release functions (added in rocr-runtime SWDEV-561708). * [rocprofiler-sdk] Handle finalized status gracefully in buffer flush operations This commit consolidates fixes for handling the finalization status during buffer flush operations across the SDK. Changes: - Tool and samples: Handle ROCPROFILER_STATUS_ERROR_FINALIZED gracefully when flushing buffers, as this indicates buffers were already flushed during finalization (not an error condition) - HSA handlers (queue.cpp, async_copy.cpp, hsa_barrier.cpp): Use > 0 check for fini_status to allow operations during finalization process - buffer.cpp: Revert fini_status checks to use > 0 for consistency - correlation_id.cpp: Add fini_status > 0 check with ROCP_TRACE logging to prevent correlation ID creation after finalization starts Files modified: - source/lib/rocprofiler-sdk-tool/tool.cpp - tests/tools/json-tool.cpp - source/lib/rocprofiler-sdk/tests/registration.cpp - source/lib/rocprofiler-sdk/tests/roctx.cpp - samples/api_buffered_tracing/client.cpp - samples/counter_collection/buffered_client.cpp - samples/counter_collection/device_counting_async_client.cpp - samples/external_correlation_id_request/client.cpp - samples/pc_sampling/client.cpp - source/lib/rocprofiler-sdk/buffer.cpp - source/lib/rocprofiler-sdk/context/correlation_id.cpp - source/lib/rocprofiler-sdk/hsa/queue.cpp - source/lib/rocprofiler-sdk/hsa/async_copy.cpp - source/lib/rocprofiler-sdk/hsa/hsa_barrier.cpp * [rocprofiler-sdk] Remove hsa_tool_hooks and simplify buffer flush handling Remove the hsa_tool_hooks infrastructure and 
simplify buffer flush calls in samples and tools. The ERROR_FINALIZED handling was overly complex and the hsa_tool_hooks OnUnload synchronization is no longer needed. Changes: - Remove hsa_tool_hooks.cpp/hpp and related registration.cpp code - Simplify buffer flush calls in samples to use direct ROCPROFILER_CALL - Simplify buffer flush in tool.cpp and json-tool.cpp - Remove ERROR_FINALIZED special handling from test files Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Fix output_stream move semantics to null source pointers The default move constructor and move assignment operator for output_stream did not null out the source's pointers after the move. This caused double-close when the moved-from temporary was destroyed, leading to use-after-free crashes (SIGSEGV in std::ostream::sentry). Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Improve Perfetto trace writer and sanitizer configuration - generatePerfetto.cpp: Move output_stream into shared_state to prevent use-after-free race conditions during Perfetto callback execution - run-ci.py: Simplify and consolidate sanitizer environment variable configuration for better maintainability Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Revert run-ci.py changes that broke sanitizer suppressions The previous changes removed MEMCHECK_SANITIZER_OPTIONS which is required for CTest to properly pass suppression files to the sanitizers during memcheck runs. Co-Authored-By: Claude <noreply@anthropic.com> * Revert "[rccl] Remove orphaned rocSHMEM gitlink" This reverts commit 1ad21003941355658fff8114fa27768f11a948f7. * [rocprofiler-sdk] Revert registration.cpp changes Revert changes to registration.cpp to match develop branch. 
Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Remove suppression file content printing from run-ci.py Co-Authored-By: Claude <noreply@anthropic.com> * Fix output_stream move ctor/assignment operator * Fix erroneous revert of registration.cpp * Fix handling of fini status in correlation ID construction * [rocprofiler-sdk] Fix OMPT segfault during finalization Add nullptr checks in OMPT tracing code to handle the case where correlation_tracing_service::construct() returns nullptr during finalization. This fixes segfaults in openmp-target-sample and tests.integration.execute.openmp-tools. The correlation ID construction now returns nullptr when fini_status > 0, but the OMPT callbacks were not checking for this, causing crashes when dereferencing the null pointer during OpenMP runtime shutdown. Changes: - event_common(): Return nullptr early if correlation ID is null - event(): Check for nullptr before calling sub_ref_count() - ompt_task_create_callback(): Return early if correlation ID is null - ompt_task_schedule_callback(): Return early if correlation ID is null * [rocprofiler-sdk] Fix HSA API tracing segfault during finalization Add nullptr check in hsa_api_impl::functor after correlation ID construction. During finalization, correlation_service::construct() returns nullptr, and without this check the code would dereference the null pointer when accessing corr_id->internal. This fixes the SEGV at address 0x000000000008 (null + 8 byte offset) that occurs when HSA async event threads call hsa_signal_destroy during runtime shutdown after finalization has started. --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
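The `output_stream` move-semantics fix mentioned in the commit above (#2318) follows a common C++ pattern: a defaulted move copies raw pointers, so the moved-from object still "owns" the resource and closes it a second time. A minimal sketch of the corrected pattern is shown below. The types `stream_handle` and `owned_stream` are hypothetical stand-ins and do not reflect the SDK's actual `output_stream` definition.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for an underlying stream; counts how often it is closed.
struct stream_handle
{
    int close_calls = 0;
};

// Hypothetical owner type. The move operations null out the source pointer
// so that only one object ever closes the handle.
class owned_stream
{
public:
    explicit owned_stream(stream_handle* h)
    : m_handle{h}
    {}

    ~owned_stream() { close(); }

    owned_stream(const owned_stream&) = delete;
    owned_stream& operator=(const owned_stream&) = delete;

    // std::exchange returns the old pointer and leaves nullptr in the source.
    owned_stream(owned_stream&& rhs) noexcept
    : m_handle{std::exchange(rhs.m_handle, nullptr)}
    {}

    owned_stream& operator=(owned_stream&& rhs) noexcept
    {
        if(this != &rhs)
        {
            close();  // release whatever this object currently owns
            m_handle = std::exchange(rhs.m_handle, nullptr);
        }
        return *this;
    }

    bool is_open() const { return m_handle != nullptr; }

private:
    void close()
    {
        if(m_handle) ++m_handle->close_calls;
        m_handle = nullptr;
    }

    stream_handle* m_handle = nullptr;
};

// Usage: moving transfers ownership, so the handle is closed exactly once.
inline int count_closes_after_move()
{
    stream_handle h{};
    {
        owned_stream a{&h};
        owned_stream b{std::move(a)};
        assert(!a.is_open());
        assert(b.is_open());
    }
    return h.close_calls;
}
```

With a defaulted move constructor, both `a` and `b` would hold the same pointer at scope exit and the counter would reach 2, which corresponds to the double-close described in the commit message.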
|
|
9a8942a89c |
SWDEV-558836, SWDEV-558837 - Add hipMemSetMemPool and hipMemGetMemPoo… (#1349)
* SWDEV-558836, SWDEV-558837 - Add hipMemSetMemPool and hipMemGetMemPool implementation
* Add managed allocation type for mem pools
* Update rocprofiler-sdk with API declarations |
||
|
|
99c3a06f4e |
SWDEV-549518 - Enable logging dynamically through HIP APIS. (#1079)
* SWDEV-549518 - Enable logging dynamically through HIP APIS.
* SWDEV-549518 - Adding ROCProfiler related new API changes.
* rocprofiler-sdk changes for hip api additions.

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com> |
||
|
|
8760fb4976 |
attach: Formalize ROCAttach API (#1653)
* attach: Formalize ROCAttach API
  - Make ROCAttach public with public headers
  - Change detach to take a PID
  - attach and detach are now reentrant
  - Cleanup of states and signal handling in ptrace session
  - Fixes mixed up definition of ROCPROF_ATTACH_TOOL_LIBRARY
  - ROCPROF_ATTACH_TOOL_LIBRARY now always means the tool library loaded by the attachment target
  - ROCPROF_ATTACH_LIBRARY refers to the library used to perform attachment
  - Add direct call of rocprof-attach
  - Fix python library call of rocprof-attach
  - Function now named attach(), changed from main()
* attach: rocprof-compute ROCAttach updates
  - Update to new library names
  - Correct usage of C lib detach
* attach: add test for rocattach
  - Disable ASan, TSan, and UBSan for the new parallel-attach test
  - Lower log level for LSan tests, existing behavior from other tests

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com> |
||
|
|
07dd4c85e7 | SWDEV-546308 - Implement hipKernelGetParamInfo API (#1783) | ||
|
|
cf536a8c1a | SWDEV-554372 - Add 3 HIP_GET_PROC_ADDRESS_xxx flags (#1771) | ||
|
|
5feec0513d | Fix clang format (#1715) | ||
|
|
9f940c7265 |
Add missing API calls to rocprofiler (#1599)
Signed-off-by: Sebastian Luzynski <Sebastian.Luzynski@amd.com> |
||
|
|
d496bcef18 |
Fix dimension mismatch for multi-GPU systems with identical architect… (#1440)
* Fix dimension mismatch for multi-GPU systems with identical architectures

  This change addresses an issue where counter dimensions were incorrectly shared across all GPU agents with the same architecture name, even when those agents had different hardware configurations (e.g., different CU counts).

  Changes:
  - Updated getBlockDimensions() to accept agent ID instead of architecture name
  - Made dimension cache agent-specific instead of architecture-specific
  - Updated set_dimensions() in AST evaluation to use specific agent ID
  - Modified all API functions to handle agent-specific dimension lookups
  - Updated tests to work with agent-specific dimensions

  This fix ensures that dimensions accurately reflect the actual hardware configuration of each individual GPU agent, preventing dimension mismatches in multi-GPU systems where GPUs share the same architecture but have different physical configurations.

  Counter ID Representation Changes:
  - Modified counter_id encoding to include agent information in bits 37-32
  - Agent logical_node_id is encoded as (value + 1) to ensure agent 0 is detectable
  - Counter records internally store only 16-bit base metric IDs (bits 15-0)
  - Tool reconstructs agent-encoded counter IDs from base metric ID & agent info
  - Instance record counter_id field uses bitwise AND mask to extract base metric ID (counter_id.handle & 0xFFFF) to fit in 16-bit storage
  - Output generators (CSV, JSON, Perfetto) use agent-encoded IDs for consistency
  - Updated counter_config.cpp and metrics.cpp to extract base metric ID when needed
  - All counter lookups now properly handle agent-encoded vs base metric IDs

  This ensures counter IDs are consistent between metadata and output records while maintaining compact storage in instance records. |
||
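The counter ID layout described in the commit above (#1440) can be sketched as a pair of encode/decode helpers. The bit positions (agent field in bits 37-32, base metric ID in bits 15-0), the `+ 1` offset, and the `0xFFFF` mask come from the commit message. The function and constant names below are hypothetical and are not the SDK's API.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t base_metric_mask = 0xFFFF;  // bits 15-0: base metric ID
constexpr unsigned      agent_shift      = 32;      // agent field starts at bit 32
constexpr std::uint64_t agent_field_mask = 0x3F;    // six bits wide: bits 37-32

// Packs the base metric ID together with (logical_node_id + 1).
// The +1 means a zero agent field can be read as "no agent encoded",
// so agent 0 remains detectable.
constexpr std::uint64_t encode_counter_id(std::uint16_t base_metric, std::uint64_t logical_node_id)
{
    return (((logical_node_id + 1) & agent_field_mask) << agent_shift) | base_metric;
}

// Equivalent of `counter_id.handle & 0xFFFF` from the commit message.
constexpr std::uint16_t base_metric_id(std::uint64_t handle)
{
    return static_cast<std::uint16_t>(handle & base_metric_mask);
}

// True if the handle carries agent information.
constexpr bool has_agent(std::uint64_t handle)
{
    return ((handle >> agent_shift) & agent_field_mask) != 0;
}

// Recovers the logical node ID; only meaningful when has_agent(handle) is true.
inline std::uint64_t logical_node_id(std::uint64_t handle)
{
    assert(has_agent(handle));
    return ((handle >> agent_shift) & agent_field_mask) - 1;
}
```

For example, under these assumptions `encode_counter_id(0x2A, 0)` yields `0x10000002A`: agent 0 is stored as 1 in the agent field, and masking with `0xFFFF` recovers the base metric ID `0x2A` for compact storage in instance records.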
|
|
79076c4ad5 |
attach: Cleanup docs from initial commit (#1302)
- Remove unimplemented older API functions
- Remove mentions of reattach API
- Remove details on implementing a process attachment library
- This will return later as a theory of operation |
||
|
|
46e683d41a |
SWDEV-545950 - Add hipStreamCopyAttributes API Implementation (#914)
* SWDEV-545950 - Add hipStreamCopyAttributes API Implementation
* Add unit test for hipStreamCopyAttributes API
* Add ChangeLog and nvidia mapping for the API
* Update rocprofiler-sdk with new HIP API details
* [rocprofiler-sdk] handle hipStreamCopyAttributes in stream tracing service
  - this new HIP function has multiple stream arguments and needs to be skipped because it does not have an explicit create/destroy/set functionality
* Update HIP_RUNTIME_API_TABLE_STEP_VERSION in clr and rocprofiler-sdk
* Resolve merge conflicts

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
27ec19116d |
SWDEV-557828 - fix hip-tests on cuda (#1152)
Co-authored-by: Rahul Manocha <rmanocha@amd.com> |
||
|
|
952d1dabe2 |
[ROCProfiler-SDK][ROCR] HSA New API changes for HSA_AMD_EXT_API_TABLE_STEP_VERSION 8 (#1182)
* add new hsa ext api for version 8.
* use fmt instead of ostream.
* override rccl from therock
* Update rocprofiler-sdk-continuous_integration.yml
* Update rocprofiler-sdk-continuous_integration.yml
* Update rocprofiler-sdk-continuous_integration.yml
* enable rocr-build
* format
* disable att consecutive-kernels tests.
* Enable ROCR build in code coverage workflow

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> |
||
|
|
c441a87a00 |
[rocprofiler-sdk][RCCL] RCCL New API changes for RCCL_API_TRACE_VERSION_PATCH = 2 (#985)
- Address build issue with RCCL sync with NCCL commit: ROCm/rccl@08a7be2
- Patch Version Bump-up PR: ROCm/rccl#1916 |
||
|
|
aece11079c |
SWDEV-553006: Fix slow lookup of debug symbols (#821)
* SWDEV-553006: Fix slow lookup of debug symbols
* Refactor
* Better docs
* Update projects/rocprofiler-sdk/source/include/rocprofiler-sdk/cxx/codeobj/code_printing.hpp |
||
|
|
63a723a287 |
GFX12 PC Sampling support (#186)
Adds GFX12 host-trap PC sampling support in the SDK and V3, and introduces parser tests specific to GFX12. Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com> |
||
|
|
e79eaaa8a5 | SWDEV-546287 - Implement hipLibrary load/unload (#975) | ||
|
|
bf49039005 |
[rocprofiler-sdk][rocprofiler-register] Initial Attachment Support (#316)
* attach: milestone: API tracing - This pairs with another commit in rocprofiler-sdk to fully function - Add ptrace entry points for tool attachment - API tracing works at this commit - Queue tracing not supported yet * attach: cleanup - Remove hardcode for loading of tool library - Make invoke registration functions public again * attach: proxy queue first draft - Adds ability to trace with queues during attachment - Must be paired with updated rocprofiler-sdk * attach: prestore overhaul - Must be paired with commit in rocprofiler-sdk * attach: add dispatch table rework - Register will load the prestore library and provide entrypoints to sdk * attach: formatting and cleanup * attach: revise dispatch table scheme * attach: formatting * attach: milestone: API tracing - This change must be paired with a change in rocprofiler-register to fully function. - API tracing works at this commit - Queue tracing not supported yet * attach: cleanup and comments * attach: Formatting and crash fixes * attach: add attach duration - Add option attach-duration-msec for attachment * Formatting + sglang hang fix via signal handling * Changed FATAL_IF to DFATAL_IF for scratch_memory due to persistent crash when iterating queues * attach: proxy queue first draft - Adds ability to trace with queues during attachment - Must be paired with updated rocprofiler-register * Allow null agents for scratch output * attach: improve queue library interface - Significant changes to force exported interfaces back to C - Fixes bug with unknown agents at attachment - Code objects' names may still be incorrect * attach: add code_object support - Kernel traces will now have names and all other information for launches - Add capture of hsa_executable to the queue library - Various logging improvements * attach: rename queue library to prestore * attach: prestore overhaul - Must be paired with commit from rocprofiler-register - Massive overhaul of code organization in prestore library - Separates 
registrations for different object types - Sets up future changes for initialization * attach: add prestore dispatch table - Removes linkage to prestore library from sdk * attach: cleanup * attach: formatting * attach: fix input prompt not appearing * attach: fix component name in cmake * attach: revert change to export level * Make prestore API public * attach: update sdk attachment library WIP - This commit is NONFUNCTIONAL - Changes around structure to remove classes - Separate C linkage where needed - Still needs updates to register for correct usage * attach: update register with dispatch table WIP - This commit is NONFUNCTIONAL - Changes rocprofiler_register to handle dispatch table from attach library. - Still needs changes in SDK with dispatch table usage * attach: dispatch table wip - This commit is NONFUNCTIONAL * attach: move attach component into core * attach: rename to rocprofv3-attach * attach: add callbacks for new queues and code objects * attach: finish dispatch table implementation - Fixes kernel tracing * attach: add cmake variable for attachment support * feat: Add --attach alias for rocprofv3 with comprehensive attachment tests - Add `--attach` as an alias to existing `-p/--pid` functionality in rocprofv3.py - Create comprehensive attachment test suite with CSV and JSON output validation: - New attachment-test application for testing dynamic profiling scenarios - Unified test script supporting both CSV and JSON output formats - Pytest-based validation for kernel traces, memory copies, HSA API calls, and agent info - Add CMake integration for automated attachment testing - Support parameterized output directory and filename specification - Implement proper environment setup for attachment queue registration Tests verify successful attachment to running processes and capture of: - Kernel dispatch traces with workgroup/grid dimensions - Memory copy operations (H2D/D2H) with size validation - HSA API call traces across multiple domains - GPU/CPU
agent information and capabilities * Documentation Update * attach: make attach script callable * Added ROCPROFILER_REGISTER_ATTACHMENT_TOOL_LIB to remove hardcoded name * attach: revert metrics library path changes * Generic Attachment in Register (#942) Remove tool references in register * Add second param to attach call in rocprof register * Add experimental reattachment support for ROCprofiler-SDK This commit introduces experimental reattachment functionality allowing tools to dynamically reattach to running processes with comprehensive design changes to support multiple attach/detach cycles: **Core Reattachment API:** - Add rocprofiler_tool_configure_result_experimental_t with tool_reattach/tool_detach callbacks - Add rocprofiler_call_client_reattach and rocprofiler_call_client_detach C exports - Implement reattachment tracking in rocprofiler_register_attach to differentiate initial attachment from reattachment cycles - Add rocprofiler_register_invoke_reattach for handling reattachment requests **Design Changes - Registration System Flow:** The registration system now supports a dual-path initialization: 1. Initial Attachment Flow: - rocprofiler_register_attach() -> rocprofiler_register_invoke_all_registrations() - Full tool initialization with complete context setup - Sets prev_attached atomic flag to track state 2. 
Reattachment Flow: - rocprofiler_register_attach() detects prev_attached=true -> rocprofiler_register_invoke_reattach() - Bypasses full re-initialization, calls client reattach callbacks instead - Preserves existing contexts and buffers, only reactivates profiling services **Design Changes - Tool Library Loading:** Enhanced rocprofiler-register library loading with function pointer resolution: - Extended rocp_set_api_table_data_t tuple to include reattach/detach function pointers - Automatic symbol resolution for rocprofiler_call_client_reattach/detach functions - Support for both LD_PRELOAD and dlopen scenarios with consistent callback availability **Design Changes - Context Management:** Introduced dual context systems for attachment scenarios: - get_contexts() - Original contexts for standard tool initialization - get_attach_contexts() - Separate context map for attachment-specific lifecycle - attach_init() - Creates contexts for ALL buffer tracing services using existing buffers - attach_start() - Selectively starts contexts based on configuration options - attach_detach() - Cleanly stops and destroys attachment contexts **Design Changes - Buffer Management:** Added reset_tmp_file_buffer() template for clean reattachment state: - Properly closes and removes old temporary files - Deletes existing file_buffer instances to prevent stale file position tracking - Creates fresh file_buffer instances for clean reattachment cycles - Addresses core issue where file position metadata becomes stale between cycles **Design Changes - Environment Variable Injection:** Added ROCP_REGISTERED_TOOL_ATTACH environment variable: - Distinguishes attachment-loaded tools from LD_PRELOAD scenarios - Enables registration system to apply attachment-specific logic - Helps tools adapt behavior for attachment vs standard initialization **Attachment Context Management:** - Add attach_init/attach_start/attach_detach functions for dynamic context lifecycle - Add reset_tmp_file_buffer template 
for clean reattachment state management - Implement get_attach_contexts() for tracking active attachment contexts **Test Infrastructure:** - Add projects/rocprofiler-sdk/tests/rocprofv3/reattach/ comprehensive test suite - Include reattachment test scripts with unified attachment/detachment cycles - Add validate.py with trace data validation for kernel, memory copy, HSA API, and agent info - Add conftest.py for JSON and CSV data loading utilities **Configuration Updates:** - Update CMakeLists.txt to include reattachment tests in build system - Add environment variable ROCP_REGISTERED_TOOL_ATTACH for attachment state tracking - Enhance rocprofiler-register library loading with reattach/detach function resolution **Flow Impact Analysis:** This design enables robust multi-cycle attachment by: 1. Preventing duplicate initialization on reattachment 2. Maintaining separate context lifecycles for attachment vs standard operation 3. Ensuring clean temporary file state between attachment cycles 4. Providing tools with explicit reattach/detach callback hooks 5. Supporting both programmatic and environment-based tool configuration The experimental nature allows for iteration on the API while establishing the foundation for production-ready dynamic profiling capabilities. 
* Fix misc clang-tidy warnings/errors * CMake Option and Environment Variable Updates - CMake: ROCPROFILER_REGISTER_ALWAYS_SUPPORT_ATTACH -> ROCPROFILER_REGISTER_BUILD_DEFAULT_ATTACHMENT - Env: ROCPROFILER_REGISTER_ATTACHMENT_ENABLED -> * Source reorganization * Formatting + new lines at EOF * Fix flake8 F841: local variable is assigned to but never used * Update attachment test - get rid of 5 second start delay - add roctx * Rework implementation - Remove rocprofiler_tool_configure_result_experimental_t in lieu of rocprofiler_configure_attach - Add <rocprofiler-sdk/experimental/registration.h> - TODO: Update process_attachment.rst * Handle re-attachment options - inherit options from previous attachment - check previous options do not modify data collection services * Fix support for tools w/o rocprofiler_configure_attach - fix segfault when rocprofiler_configure_attach does not exist - fix naming convention for functions accepting attach dispatch table - cleanup rocprofiler_configure_attach implementation in rocprofv3 tool * attach: remove unknown agent handling - Change was from earlier commit, no longer needed * attach: add error for attaching without library loaded * attach: revise version numbering * attach: register header revisions * attach: clang format register * attach: formatting * attach: fix build failure - Remove cross dependency into rocprofiler-sdk, fixes build on some systems * attach: revise register library detection * Update rocprofiler-register and attach library - formatting - proper signature of register_functor for rocprofiler-sdk-attach library callback - remove get_dispatch_registration_table() * Bump rocprofiler-register version to 0.6.0 + AnyNewerVersion * Fix output support for rocprofiler-sdk-tool * Fix formatting * Fix clang tidy errors * Misc rocprofiler-sdk-attach fixes * attach: add sigint handling to attach python * tool README.md formatting Co-authored-by: Jonathan R. 
Madsen <jrmadsen@users.noreply.github.com> * Fix buffered output issue * attach: add errors for tool attach * CI Fixes * Rework tests * attach: improve library loading in rocprofv3 attach * formatting * Update tests to use pytest framework * Fix test_attachment_hsa_api_trace * attach: catch ctypes exceptions * attach: fix leak in registration * attach: fix sanitizer tests * attach: fix sanitizer tests further * attach: disable attach asan tests * attach: disable ubsan test * attach: fix permissions in installed test package * attach: formatting --------- Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com> Co-authored-by: Tim Gu <Tim.Gu@amd.com> Co-authored-by: Claude Code <claude@anthropic.com> Co-authored-by: Benjamin Welton <bwelton@amd.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Co-authored-by: Benjamin Welton <bewelton@amd.com> |
||
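The dual-path registration flow described in the commit above (#316), where an atomic `prev_attached` flag distinguishes the initial attachment from later reattachments, can be sketched as below. This is a simplified illustration under stated assumptions: `attach_tracker`, `attach_path`, and `on_attach` are hypothetical names, and using `exchange` to test and set the flag in one step is a choice made for this sketch, not something the commit message confirms. The same commit later reworks the experimental API, so the real code may differ.

```cpp
#include <atomic>
#include <cassert>

// Which of the two flows an attach request should take.
enum class attach_path
{
    initial,   // full tool initialization with complete context setup
    reattach,  // skip re-initialization, only invoke reattach callbacks
};

// Hypothetical tracker mirroring the role of the prev_attached flag.
class attach_tracker
{
public:
    // exchange(true) sets the flag and returns its previous value atomically,
    // so exactly one caller ever observes the initial path.
    attach_path on_attach()
    {
        return m_prev_attached.exchange(true) ? attach_path::reattach : attach_path::initial;
    }

private:
    std::atomic<bool> m_prev_attached{false};
};

// Usage: the first attach initializes, every later attach takes the reattach path.
inline void attach_cycle_demo()
{
    attach_tracker tracker{};
    assert(tracker.on_attach() == attach_path::initial);
    assert(tracker.on_attach() == attach_path::reattach);
    assert(tracker.on_attach() == attach_path::reattach);
}
```

Testing and setting the flag in a single atomic operation avoids a window in which two concurrent attach requests could both run the full initialization path.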
|
|
9849073836 |
SWDEV-540648: Adding realtime clock to v3 tool. Update decoder header. (#666)
* SWDEV-540648: Adding realtime clock to v3 tool. Update header for decoder.
* Adding tests
* Review comments
* Review comment |
||
|
|
a697941150 |
[ROCProfiler SDK CI] Runners Update & Workflow Cache Improvement (#722)
Overriding checks/reviewers as CODEOWNER changes are pending * Runners Update Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Testing ROCProfiler-SDK Testing ROCProfiler-SDK Changing CDash Fixing ROCProfiler-SDK Moving AQLProfile Navi3 and Navi4 to DIND Moving AQLProfile Navi3 and Navi4 to DIND Moving AQLProfile Navi3 and Navi4 to DIND Moving AQLProfile Navi3 and Navi4 to DIND Moving AQLProfile Navi3 and Navi4 to DIND Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating images Updating images Updating images Updating images Updating RHEL and SLES for AQLProfile Fixing RPM OSes AQLprofile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK 
Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK * Fixing ENV for ROCProfiler-SDK Fixing ENV for ROCProfiler-SDK Temp workaround for OpenMP targets Fixing ROCProfiler-SDK for Ubuntu * Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Update rocprofiler-sdk-continuous_integration.yml Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Adding RPM Package Adding RPM Package Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Update rocprofiler-sdk-continuous_integration.yml Update rocprofiler-sdk-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update rocprofiler-sdk-continuous_integration.yml Fixing AQLProfile * [rocprofiler-sdk][CI] add latest aqlprofile to rocprofiler-sdk workflow (#352) * add aqlprofile * misc. 
* format * add sudo to install * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml --------- Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com> Update aqlprofile-continuous_integration.yml Removing extra packages Removing extra packages Fixing ROCM Path Issues Fixing ROCM Path Issues Fixing ROCM Path Issues Fixing RHEL Fixing RHEL Fixing RHEL Fixing RHEL Fixing RHEL Fixing Sanitizers * General Fixes * Fixing ROCProfiler-SDK CI * Fixing ROCProfiler-SDK CI * Update projects/aqlprofile/dashboard.cmake Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * General Fixes * Update Readme.txt * Fix ROCProfiler SDK CI * Fix ROCProfiler SDK CI * Fix ROCProfiler SDK CI * Fix ROCProfiler SDK CI * Update rocprofiler-sdk-continuous_integration.yml * Fix ROCProfiler SDK CI * Fix ROCProfiler SDK CI * Fix for RHEL and Sanitizers for ROCProfiler-SDK * Fix for RHEL and Sanitizers for ROCProfiler-SDK * Fix for RHEL and Sanitizers for ROCProfiler-SDK * Fix for RHEL and Sanitizers for ROCProfiler-SDK * Upgrade ROCm Release & Fix for RHEL & SLES - ROCProfiler SDK CI * Fix for RHEL & SLES - ROCProfiler SDK CI * Fix for RHEL & SLES & Sanitizers - ROCProfiler SDK CI * Fix for RHEL & SLES & Sanitizers - ROCProfiler SDK CI * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Adding ROCR Installation * Adding ROCR Installation * Adding ROCR Installation * Adding ROCR Installation * Adding ROCR Installation * Adding ROCR Installation * Update run-ci.py * Fix for Sanitizers & Fix for RHEL 8.8 * Updating Code Coverage Workflow * Updating Code Coverage Workflow * 
Formatting Fix * Formatting Fix * Fix for Code Coverage & Sanitizers * Fix for Code Coverage & Sanitizers * Fix for Code Coverage & Sanitizers * Caching Docker * Caching Docker * Caching Docker * Changing Runner for CI Builder * Adding CCache * Fixing Core * Fixing Core * Fixing Core * Fixing Core * Fixing Core * Update rocprofiler-sdk-continuous_integration.yml * Update ROCm and amdgpu repository configurations * Refactor repository configuration commands in CI * Fix installation commands in CI workflow * Remove unnecessary packages from installation commands * Update ROCm and amdgpu repository paths in CI config * Update pip installation commands to handle errors * Install AWS CLI in CI workflow * Update rocprofiler-sdk-continuous_integration.yml * Remove awscli installation from CI workflow * Modify PATH and pipx install commands in CI config * Refactor ROCm SDK CI workflow to eliminate redundancy * Add safe.directory configuration for git * Update rocprofiler-sdk-continuous_integration.yml * Fix CMake install prefix in CI workflow * Add variant option to ccache configuration * Change compiler launcher from ccache to sccache * Set up Python virtual environment in CI workflow * Remove ccache launcher from CMake build * Add environment setup for building projects * Add Curl installation step for RHEL 8.8 * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Fixing RPM * Fixing RPM & Code Coverage * Fixing RPM * Fixing CI * Lowering the size of the docker image * Update aqlprofile-continuous_integration.yml * Updating paths in AQLProfile * Splitting the Build CI Docker Images from Main CI * Create Dockerfile.ci, update ci docker workflow to reference it * Splitting the Build CI Docker Images from Main CI * Add new line to Dockerfile.ci * Remove on schedule logic from ci docker workflow, change cdash project name in run-ci.py * Update file path in build_ci_docker_images.yml * Remove context from docker step * 
Update file path in build_ci_docker_images * more path changes * remove context again * Update rocprofiler-sdk-build_ci_docker_images.yml * Update rocprofiler-sdk-code_coverage.yml * Update rocprofiler-sdk-continuous_integration.yml * Remove env variables from rocprofiler-sdk-build_ci_docker_images.yml * Rename docker images file * Rename KEY to FILE_NAME for Docker tarball * [rocprofiler-sdk][CI] lint fixes (#830) * lint fixes. * Updating Code Coverage Workflow * Update rocprofiler-sdk-code_coverage.yml * Update format.hpp * Update format.hpp --------- Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com> * TEMP: Removing ROCR build from develop * [rocprofiler-sdk][SDK] Add new HIP API changes for ROCm 7.1 (#856) * Add new HIP 7.1 changes. * bug fix. * bug fix. * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fix typo in hipDriverEntryPoint case statement --------- Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Venkateshwar Reddy Kandula <Venkateshwarreddy.Kandula@amd.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: jbonnell-amd <jason.bonnell@amd.com> Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> |
||
|
|
9b4c12a357 |
Revert "[rocprofiler-sdk][SDK] Update to address new API changes for HIP ROCm…" (#850)
This reverts commit
|
||
|
|
5ac738150a |
[rocprofiler-sdk][SDK] Update to address new API changes for HIP ROCm 7.1 (#793)
* Add new HIP 7.1 changes. * bug fix. * bug fix. * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |
||
|
|
43ac6b2ef5 |
[rocprofiler-sdk] Add support for new RCCL API (#771)
* [rocprofiler-sdk] Add support for new RCCL API Add support for `ncclAllReduceWithBias` * Move func to be in sync with rccl header |
||
|
|
ff43893902 | Fix decoder description (#513) | ||
|
|
cb77f5af5c |
Adding new trace decoder record types and new ATT parameters (#195)
* Adding new trace decoder record types and new ATT parameters * Add compatibility with decoder 0.1.2 * Added RT * Format * Add logging to sdata values * Review comment * Review comments * Update projects/rocprofiler-sdk/source/include/rocprofiler-sdk/experimental/thread-trace/trace_decoder_types.h |
||
|
|
a5db496d63 |
Include installation sql header in rocpd library (#576)
Include installation of rocpd sql header
[ROCm/rocprofiler-sdk commit:
|
||
|
|
4ca156e572 |
Thread trace and Trace Decoder API tests and samples (#416)
* Adding test and samples to decoder
* Fix sample
* Formatting
* Fix multi test
* Disable sample
* Fix tests
* Format
* Version fix
* Locking the decoder
* Add atomic
* Review comments
* Format
* Adding readme
* merge conflict and adding PCS+ATT test
* Review comments
* Properly disable PCS test
* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt
* Adding back env var test
* Name fix
* Preload sample
* Addressing review comments
* Update docs
---------
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0ff0ffffa2 |
[SDK] Expose counter dims in rocprofiler_counter_info_v1_t and only show counters being profiled in metadata. (#325)
* expose dimensional info in rocprofiler_counter_info_v1_t.
* add counter_id in dim info.
* address review comments
* format.
* address comments.
* use array of pointers for dimensions_instaces.
* format and comments.
* address comments.
* new line.
* Update counter_defs.yaml
* Update counter_defs.yaml
* Update counter_defs.yaml
* counter_defs.
* format counter defs.
* format counter defs.
* format counter defs.
* show only counters being profiled in metadata.
* Format.
* use config for counters and fix warnings.
* add version for rocprofiler_counter_dimension_info_v1_t struct.
* rename rocprofiler_counter_record_dimension_instance_v1_info_t.
* account device id from pmc for counters metadata.
* move dim structs to counters.h.
* address comments to compare value.
* fix tests.
* Address comments. use pointer of arrays for ABI.
* rebase.
* fix build error.
* use separate metadata::init() for rocprofv3.
* also print not found counters.
* precompute all the perf counters needed to be in metadata.
* Misc.
* format
* Format.
* rocprofiler::sdk::container::c_array
* Address comments.
* source/lib/output/metadata.cpp
* lint.
* add unit test for c_array.
* add unit test and serialization support for c_array container.
* Misc.
* Clean files.
* Format.
* clang-tidy.
* add more checks to c_array.
* misc. typo
* Addr comments.
---------
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
620924b15f |
Adding inline callstack information to disassembly (#468)
* Adding callstack information to disassembly
* changelog
* Cleanup
* Fix snapshots.json
* Clang tidy fixes
* Fix infinite recursion
* Apply suggestions from code review
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
* Remove sibling traversal
* Added docstrings
* Apply suggestions from code review
* Update source/include/rocprofiler-sdk/cxx/codeobj/code_printing.hpp
* Review comments
* Format + comments
* Fmt
* Add class name
* Format
* Fix static linkage
* Making funcs inline
---------
Co-authored-by: Giovanni <gbaraldi@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
990946e956 |
[SDK] Fix null handles (#474)
* Fix null handle
- use .handle=0, not .handle=numeric_limits<>::max()
* Update lib.common.hasher
* Fix ROCPROFILER_CONTEXT_NONE
* Use context operator==
* Update CHANGELOG
* Updated null handle for scratch memory and changed allocation test so that free ops account for null agent
---------
Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
483a10f53a |
[SDK] Update UUID (rocprofiler_uuid_t) (#390)
* changing uuid abi
* fix
* review comments
* fix CI fail
* review comments
* fix
* adding static asserts
* making constructor constexpr
* fix CI fail
* update UUID length to 16 bytes
* fixing value64
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* Update CHANGELOG.md
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
f2a5139a37 |
[CI] add hip api table version of 13 to enum string (#509)
Set hip table version to 13. API_ID_LAST is unchanged from version=12 since no new struct has been added.
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
7243889d6a |
Add perfetto support for scratch memory (#303)
* Add perfetto support for scratch memory
* Updated tests and docs.
* Update docs data
* Added underflow check
* Record all free events to 0 bytes
* Add format
* Address review comment
* updated tests for scratch memory
* update scratch-memory tests.
[ROCm/rocprofiler-sdk commit:
|
||
|
|
fbf17a42d4 |
[SWDEV-516561][1/2] Add MARKER_RANGE_EXTENT to capture ROCTX ranges (#363)
* [SWDEV-516561][1/2] Add MARKER_RANGE_EXTENT to capture ROCTX ranges
Range extent to capture all work between roctxpush/pop operations. The entry callback takes place during roctxpush and the exit callback takes place in roctxpop. This is primarily to allow us to keep an ancestor id on the ancestor stack such that all operations that take place within the push/pop context can be annotated as being a part of this range. With the current setup (where push and pop are two separate operations that need to be combined externally), we cannot keep an ancestor id on the stack and thus cannot tie tracing events to particular ranges.
Correlation id information is inherited from the push operation. Ancestor id needs to be added in a future commit that also outputs this ancestor to CSV.
Output:
```
[ctest] {'size': 64, 'kind': 7, 'operation': 1, 'correlation_id': {'internal': 1525, 'external': 0, 'ancestor': 1524}, 'start_timestamp': 2932551479402642, 'end_timestamp': 2932551491178449, 'thread_id': 3254861}
[ctest] {'size': 64, 'kind': 8, 'operation': 2, 'correlation_id': {'internal': 1525, 'external': 0, 'ancestor': 1524}, 'start_timestamp': 2932551479405878, 'end_timestamp': 2932551491181214, 'thread_id': 3254861}
```
Note: Kind 8 = range extent op.
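The push/pop pairing described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the SDK's implementation: the `RangeTracker` class, its methods, and the record fields are hypothetical names chosen for illustration. It shows how keeping open ranges on a stack lets a single record span push to pop while exposing an ancestor id to any work done in between.

```python
import itertools


class RangeTracker:
    """Hypothetical model: one record per push/pop pair, with an ancestor stack."""

    def __init__(self):
        self._ids = itertools.count(1)  # stand-in for internal correlation ids
        self._stack = []                # open ranges, innermost last
        self.records = []               # completed range-extent records

    def current_ancestor(self):
        # Id of the innermost open range, or 0 when no range is open.
        return self._stack[-1]["id"] if self._stack else 0

    def push(self, name, now):
        # Entry side: the new range inherits the current ancestor.
        entry = {
            "id": next(self._ids),
            "ancestor": self.current_ancestor(),
            "name": name,
            "start": now,
        }
        self._stack.append(entry)
        return entry["id"]

    def pop(self, now):
        # Exit side: close the innermost range and emit a single record.
        entry = self._stack.pop()
        entry["end"] = now
        self.records.append(entry)
        return entry
```

Any operation traced between `push` and `pop` could call `current_ancestor()` to tag itself with the enclosing range, which is the property the commit message says separate push and pop records cannot provide.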
* Merge fix
Revert several changes
source/lib/rocprofiler-sdk/marker/range_marker.*
- separate out range marker implementation from standard marker implementation
Update public API with marker core range
Support marker core range in sdk (source/lib/rocprofiler-sdk)
Transition rocprofiler-sdk-tool and output lib to use marker core range
Misc fixes for tests
Fix logic in lib/output/generate{CSV,Stats}.cpp
Update tests/rocprofv3/tracing-hip-in-libraries (marker validation)
Fix test_otf2_data
* Test fixes
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
ce7d811719 |
[SDK] Fix buffer tracing stringify of stack-allocated char* buffer (#429)
* [SDK] Fix buffer tracing stringify of stack-allocated char* buffer
* Formatting
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
bde07e7baa |
[SDK] KFD new events API (#321)
* Remove page-migration
* Add KFD events API
* Address review comments
* Move assert checks
* Update enum-string utils
* Update codeowners
* Update KFD header
* Add perfetto category
[ROCm/rocprofiler-sdk commit:
|
||
|
|
8de9854a62 |
[SDK] [CI] Update HSA EXT Step Version (#460)
Update HSA EXT step version
[ROCm/rocprofiler-sdk commit:
|
||
|
|
4d79e1df30 |
[SDK] Support CMake option for using internal RCCL tracing + (temporary) enable in CI (#457)
* Temp: disable RCCL tracing
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Adding option to disable rccl tracing from CMake
* Update codeql.yml
* Misc updates
- ROCPROFILER_BUILD_RCCL -> ROCPROFILER_INTERNAL_RCCL_API_TRACE
- env.EXTRA_TEMP_CMAKE_OPTIONS -> env.GLOBAL_CMAKE_OPTIONS
- add (advanced) option ROCPROFILER_INTERNAL_RCCL_API_TRACE
* Fix rocprofiler::sdk::get_enum_label
- missing enum labels for HIP_RUNTIME_API_TABLE_STEP_VERSION > 8
* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt
- improve various aspects of cmake -- particularly echoing where attdecoder_LIBRARY was found
* Use CMAKE_MESSAGE_INDENT
- add prefix to cmake messages to help indicate where messages are coming from
- make find_package(Python3 ...) QUIET for bindings
* Fix rocprofiler::sdk::get_enum_label
- handle HSA_AMD_EXT_API_TABLE_MAJOR_VERSION
* Fix rocprofv3 message for att library path
* Fix tests/rocprofv3/advanced-thread-trace/att_input.yml config
* Fix rocprofv3 check_att_capability + soversion/version library resolution
- Account for ROCPROF_ATT_LIBRARY_PATH in env in check_att_capability
- Add resolve_library_path
- supports resolution of library names to SOVERSION and VERSION paths
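Resolving a library name to its SOVERSION or VERSION file can be sketched as follows. This is a minimal illustration, not the rocprofv3 code: `find_versioned_library` and its parameters are hypothetical, and the matching rule (the bare name followed by zero or more numeric suffixes) is an assumption about what such a resolver might accept.

```python
import os
import re


def find_versioned_library(name, search_dirs):
    """Return the first path matching name, name.<SOVERSION>, or name.<VERSION>.

    Hypothetical helper: accepts e.g. libfoo.so, libfoo.so.1, libfoo.so.1.2.0.
    """
    pattern = re.compile(re.escape(name) + r"(\.\d+)*")
    for directory in search_dirs:
        if not os.path.isdir(directory):
            continue
        # Lexicographic sort puts the shortest (least specific) name first.
        matches = sorted(f for f in os.listdir(directory) if pattern.fullmatch(f))
        if matches:
            return os.path.join(directory, matches[0])
    return None
```

A caller could build `search_dirs` from an environment variable such as `ROCPROF_ATT_LIBRARY_PATH` (named in the commit message) before falling back to default locations. That ordering is a design choice left to the caller here.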
* Fix python linting error (unused import)
---------
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
bb942ef500 |
[rocprofv3] rocpd Python package (#384)
* Squashed commit of the following:
commit f764eb6f4a45baa25eb8f1b50b1035c84578c200
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu May 1 09:09:37 2025 -0500
Misc post rebase fixes
commit 447418b0765819eb2fb5c8b5c3ca9128a091d37e
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu May 1 07:22:07 2025 -0500
Formatting
commit 975661f5e498cde99f8c3ce5486c47db03856d1b
Author: Young Hui <young.hui@amd.com>
Date: Wed Apr 30 21:19:30 2025 -0400
Reorganize rocpd command line and grouped Required Arguments together
- had to add --input to each output.py file again since it was moved out of output_config.py
- ran formatter
commit 9322328611a332c3979f040b652a9e9a9482200e
Author: acanadas <acanadas@amd.com>
Date: Wed Apr 30 17:00:58 2025 +0000
corrected indices on some operation, and kind for kernel dispatch
commit 6c146cd0c508dca6f2453e3844e09e1ed3f9978a
Author: acanadas <acanadas@amd.com>
Date: Wed Apr 30 16:01:26 2025 +0000
some corrections on pf trace output: added categories
commit 4e02d3f8617324c95e4a449243ab9ab3f4695471
Author: acanadas <acanadas@amd.com>
Date: Wed Apr 30 14:35:19 2025 +0000
fixed perfetto cpp with adding stack id and parent stack id to views:tests
commit d7efd9334361cd7d6a842d083a3f8ca51efe72d3
Author: acanadas <acanadas@amd.com>
Date: Wed Apr 30 14:16:32 2025 +0000
fixed perfetto cpp with adding stack id and parent stack id to views
commit cdd0e2ec0788d44fdf2d5833822e055c43cddec6
Author: acanadas <acanadas@amd.com>
Date: Wed Apr 30 09:47:07 2025 -0500
restore output_config, add output_file arg for generate csv
commit 5f9b7d93dcbefd55e0ba6e2674602e809aa61632
Author: JIn Tao <jintao12@amd.com>
Date: Wed Apr 30 09:30:32 2025 +0000
add ROCDecode and ROCJpeg API calls
commit 7724de1263c5f960cc64c5b0e7afb3834d797f87
Author: JIn Tao <jintao12@amd.com>
Date: Wed Apr 30 09:05:14 2025 +0000
Json output: add counters_collection
commit a13930d6d2b87605ca1ece58291172f79d81d91f
Author: JIn Tao <jintao12@amd.com>
Date: Wed Apr 30 09:01:17 2025 +0000
Json output: add scratch_memory
commit 54e62e25c6d89e718324ce3bc51eb80c25756c48
Author: JIn Tao <jintao12@amd.com>
Date: Wed Apr 30 08:00:08 2025 +0000
Json output: add marker_api
commit ab920196c7ddb68a9a1fdc121f43528e513d5a67
Author: root <root@smc300x-ccs-aus-GPUF2C5.cs-aus.dcgpu>
Date: Tue Apr 29 10:48:53 2025 -0500
csv refactor, fix output-format argument for script
commit e033d18356f397e3a684e255dcffd0c0d64ec19e
Author: JIn Tao <jintao12@amd.com>
Date: Tue Apr 29 11:30:16 2025 +0000
minor revision
commit 748f6754ac0238eca63bb12b26f62b514de65a0d
Author: Jin Tao <jintao12@amd.com>
Date: Tue Apr 29 10:09:54 2025 +0000
Json output: complete structures
commit 52c8d77e0eeb8dca7476814ff03b5cdf88055fd6
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Mon Apr 28 15:44:47 2025 +0000
forced tests upd
commit 7fabc80d3b8db7d137b05a958c633ad5bf8dbae9
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Mon Apr 28 15:43:34 2025 +0000
fixed the relative-type index issue (missing load) for agent info and related parameter adjustments for python
commit f8f5bffc010ad6d43a9f8fee90a79e4342fb9562
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Mon Apr 28 12:28:35 2025 +0000
added 3 convenience views, some refactoring and fixing kernel-rename option
commit 831cd336115153d1e73f01c9120a67c904478f89
Author: Jin Tao <jintao12@amd.com>
Date: Mon Apr 28 14:45:31 2025 +0000
Json output: add kernel_dispatch, hip_api, hsa_api
commit 4c414a1abce51fbdd6d5856b2e36e6272279c671
Author: Jin Tao <jintao12@amd.com>
Date: Fri Apr 25 13:30:01 2025 +0000
optimize the json output code; certain problems still need to be addressed, e.g., empty counters and strings
commit ceecd7cc5b81f014766199c0a57645386ade23dd
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Fri Apr 25 10:51:16 2025 +0000
removed unused variable session from write_otf
commit 29fdb2db4fe0cc930cd6b3172092604ee5409242
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Fri Apr 25 10:14:48 2025 +0000
added tests for generated csv and otf2
commit 5091d2d51e7e4d68fcdc95a97a82a0df41f28350
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Fri Apr 25 10:12:30 2025 +0000
run formatting from command line
commit abbb7637b1704ea904540c5ff717102bf450c76d
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Fri Apr 25 10:09:25 2025 +0000
added check that csv is not broken after other changes
commit 9ff614d6d8e87fc3647d8f3b0120425c24213f3b
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu Apr 24 15:09:40 2025 +0000
updated new csv.cpp and otf2.cpp to fit the string_view fix
commit e94ea0f3668c9b972f2dd4144cb4152c1b202f93
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu Apr 24 14:51:55 2025 +0000
fixed string view reporting which works at least for generateOTF2.cpp
commit 5c5ea532279fba0b7ef5abcd1916d20d0b7fb7b8
Author: Jin Tao <jintao12@amd.com>
Date: Thu Apr 24 14:32:10 2025 +0000
Json output: add strings.marker_api
commit d28d1c18c9693421f7676d6de82c2c20af11eaa0
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu Apr 24 13:10:13 2025 +0000
small upd on cmake for tests
commit 325cb3719517ad514291ab620dd85fb04daeb906
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu Apr 24 13:08:58 2025 +0000
fixed abs_index for connecting data and handles for otf2 location reporting
commit e9a648ade545795646f6aca61fdbece5a39fea5c
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Tue Apr 22 14:00:12 2025 +0000
commit after fixing warnings, executable name, validate works
commit 1cd2d63501b8f951996c9dfcd1d0af6d6f16c006
Author: Jin Tao <jintao12@amd.com>
Date: Thu Apr 24 12:38:03 2025 +0000
Json output: add marker_api in buffer_records
commit f01ab23d2f8f6c524568e2c453fe26a2e4320a1c
Author: Young Hui <young.hui@amd.com>
Date: Wed Apr 23 21:37:36 2025 -0400
Add python binding for agent-index-value to output_config
- command line passes correctly for csv, perfetto needs to be fixed
commit a92dd0c060dd398db365ec37af905dcca25c8a7e
Author: acanadas <acanadas@amd.com>
Date: Wed Apr 23 09:08:43 2025 -0500
provisional fix in json.cpp
- For now using absolute index for agents
commit fddacacbb54f5678a40d552ec8a3a2f9de65381b
Author: acanadas <acanadas@amd.com>
Date: Wed Apr 23 08:55:49 2025 -0500
adding agent indexes (abs, log, type) to types for agent_index_value options
- Fixing agent_id populated in rocpd_memory_allocate table
commit 3b0414ba5271d25996564eddc6d757b1afc637af
Author: Jin Tao <jintao12@amd.com>
Date: Wed Apr 23 11:22:59 2025 +0000
minor revision regarding json output
commit a84bc3d0f7dd9bf1014d024ef15eeb7c7ec990c5
Author: Jin Tao <jintao12@amd.com>
Date: Wed Apr 23 11:16:48 2025 +0000
add json output
commit e6c0dd98de0b5f24492ac4396cf8d59bd20d58ad
Author: Young Hui <young.hui@amd.com>
Date: Tue Apr 22 17:22:45 2025 -0400
Add rocpd commandline input param file check, to ensure DB exists
- added OTF2 script
- added placeholder JSON script
commit 1d482257b8f23bf4d64d57d8bd36775b38254026
Author: Young Hui <young.hui@amd.com>
Date: Mon Apr 21 12:47:52 2025 -0400
Clean up some rocpd python files
- removed some unused files
- cleanup __main__.py imports and duplicate main
commit c15af2aac9935ffce92d9d6ce35ab5e9eabed57c
Author: Young Hui <young.hui@amd.com>
Date: Thu Apr 17 18:40:13 2025 -0400
Add rocpd command line support
- right now pftrace and csv are supported
- also removed some otf2 test files, to fix cmake configure
- formatting edits
commit 10bee3bcf496edd8e1ad9521498c926915a33f07
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu Apr 17 15:17:15 2025 +0000
experimented with roctxA, formatted
commit 11ff50882bcbedbe516c6461c5f8d65e38d0aae5
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu Apr 17 15:17:05 2025 +0000
experimented with roctxA, formatted
commit 7cc87cbf56f6d7117df10dc2cfe45174bedff22a
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu Apr 17 14:16:16 2025 +0000
small refactoring on api-calls preprocessing
commit 421bb11d5b97a4b888c0f9a0b46fca229e4abf25
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu Apr 17 14:03:39 2025 +0000
added rccl, rocdecode, rocjpeg, corrected markers
commit 71c1122ec001ce2548aa1e6d7b0d4bbd5ac16d79
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Wed Apr 16 15:59:39 2025 +0000
integration tests for otf2 generation will stay till otf2 gen becomes stable
commit b8ff32bb269a4efec001804eb0064a8c6c7f8f6d
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Wed Apr 16 15:07:19 2025 +0000
intermediate local commit of functioning otf2 output after refactoring writer code
commit 4d6140fbad00a713aed20b72d41fa62219f9aed7
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Wed Apr 16 09:18:40 2025 +0000
intermediate local commit of functioning otf2 output
commit 96f40ebce93ff3a27c01d5e5267eda67c3ab68ec
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Fri Apr 11 17:58:24 2025 +0000
first working commit on generating otf2
commit 75ddceb4bd3dc6c32cd8a60450bd1c70bf4d3193
Author: acanadas <acanadas@amd.com>
Date: Wed Apr 16 11:24:30 2025 +0000
Add CSV export functionality to ROCm Profiler Data
Implement complete CSV export pipeline with the following components:
- Add write_csv function to libpyrocpd.cpp for core CSV generation
- Create csv.hpp and csv.cpp for CSV formatting and management
- Implement stats_summary.hpp and stats_summary.cpp for performance metrics
- Add comprehensive type definitions for markers, counters and statistics
- Create SQL views in schema_data for efficient data extraction
- Add csv.py module similar to pftrace.py for Python API consistency
- Implement convert.py script with multiple format support (CSV, Perfetto)
This change enables exporting profiling data in CSV format for easier analysis
and integration with external tools.
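The view-then-export flow described in this commit can be sketched with the Python standard library alone. This is an illustrative assumption, not the rocpd implementation: the table `demo_kernels`, the view `demo_kernel_durations`, and the function `export_view_to_csv` are hypothetical names, and the real schema and `write_csv` binding may differ.

```python
import csv
import io
import sqlite3


def export_view_to_csv(conn, view_name, out):
    """Write every row of a SQL view to `out` as CSV, with a header row."""
    cursor = conn.execute(f'SELECT * FROM "{view_name}"')
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Hypothetical data: a table of kernel timings and a view deriving durations.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE demo_kernels (name TEXT, start_ns INTEGER, end_ns INTEGER);
    INSERT INTO demo_kernels VALUES ('transpose', 100, 250), ('copy', 300, 340);
    CREATE VIEW demo_kernel_durations AS
        SELECT name, end_ns - start_ns AS duration_ns
        FROM demo_kernels ORDER BY start_ns;
    """
)
buffer = io.StringIO()
export_view_to_csv(conn, "demo_kernel_durations", buffer)
```

Putting the derivation in a view keeps the export function generic: it only needs a view name, so new CSV outputs can be added by defining more views rather than more export code.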
commit 953223e32faa862e79bd1f61e28a55874efa0589
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Tue Apr 1 19:58:18 2025 -0500
[output] Update generateRocpd.cpp to new schema + misc
- support guid, {{upid}}, {{view_upid}}, etc.
- improve env support for MPI in format_path
commit 54bd3b0def48d91a81045676fb2f5f549b813880
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Tue Apr 1 19:54:41 2025 -0500
[rocpd] Updates + cleanup
- remove stale {autograd,call_stacks,strings,subclass}.py
- reorganize the SQL schema files
- add source/functions.{hpp,cpp}
- custom SQL function
- update interop for defining functions
- misc improvements to "write_perfetto" function
- added rocpd.pftrace
- added rocpd.output_config
commit 32f668b019c961f0797eec9f613cf5dfea0aa377
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Tue Apr 1 14:35:31 2025 -0500
[common] md5sum calculator
commit b6ae75ba270ea92661f6cfe75647531a4d6202f3
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Sun Mar 30 01:57:50 2025 -0500
Optimize sql_generator when query requires ORDER BY
commit 51d6f33b0b1f80dd09de70c91592f928e31a730f
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Sat Mar 29 01:49:52 2025 -0500
Minimal support for merged pftrace from multiple database
commit 90c4add9001cad2af85d14783ac1fb35c89c7770
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Sat Mar 29 00:46:59 2025 -0500
Update cereal submodule
- fix recursive include
commit a5d75dcb5de9c0667af03ec7a34ac484ff864bac
Author: Ian Trowbridge <Ian.Trowbridge@amd.com>
Date: Fri Mar 28 14:30:32 2025 -0500
Formatting
commit 7345810d5ea5d76b9a4ed9bd548399cb8df1feda
Author: Ian Trowbridge <Ian.Trowbridge@amd.com>
Date: Fri Mar 28 13:44:24 2025 -0500
Fixed interpolation bug
commit 289739669a84ac83b920f417ef94310dd9ee40c6
Author: Ian Trowbridge <Ian.Trowbridge@amd.com>
Date: Fri Mar 28 13:32:29 2025 -0500
Initial memory copy implementation created
commit 91416d784bae05984b8e9670d6cd22231fbc8bed
Author: Ian Trowbridge <Ian.Trowbridge@amd.com>
Date: Fri Mar 28 13:12:43 2025 -0500
Added memory allocation counter tracks, removed midpoint interpolation temporarily due to strange outputs
commit 7e733e393d06fecc198ae4f7891edecb90882136
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Fri Mar 28 13:22:34 2025 -0500
Support multiple database connections
commit e82d7b3ee24a855258dee50ea3a4ee8e52f70509
Author: Mark Meserve <Mark.Meserve@amd.com>
Date: Fri Mar 28 17:44:05 2025 +0000
fix tracing_session init
commit d29ac6270e7466794ca43ffdf061b7514a29ad94
Author: Mark Meserve <Mark.Meserve@amd.com>
Date: Fri Mar 28 17:00:08 2025 +0000
separate perfetto init to RAII class
commit 0538261898fdc83a05dc3835ec07d90c4b8dd937
Author: Ian Trowbridge <Ian.Trowbridge@amd.com>
Date: Fri Mar 28 11:03:45 2025 -0500
Kernel trace in perfetto now functional
commit 3eb63de679e8dfbc3bc551302ca097356f17de7d
Author: Ian Trowbridge <Ian.Trowbridge@amd.com>
Date: Thu Mar 27 17:43:43 2025 -0500
Added memory copy and allocation types, need to test kernel perfetto output
commit c70ce8340a5e4603bed27d6f3f0d95bc77aad196
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Mar 27 15:16:18 2025 -0500
Misc libpyrocpd updates
- support conditions on read
commit 189226fb3aeaf3485137335392b271f4f1271040
Author: Ian Trowbridge <Ian.Trowbridge@amd.com>
Date: Thu Mar 27 14:43:25 2025 -0500
kernel_dispatch type added, added load function for serialize
commit 07f6af65733f34c067365c94500b03a9ffff6b6b
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Mar 27 14:08:30 2025 -0500
ROCPROFILER_BUILD_DEPRECATED_WARNINGS OFF by default
commit b3df97af8fee651d20030dbfc8d6c635774030a7
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Mar 27 14:06:42 2025 -0500
Fix include
commit 49baef7d173385154d346120313e6c9511665b68
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Mar 27 13:16:50 2025 -0500
Python interface improvements
commit 7d614ed3ab07836c420e216261e80e0b629739a4
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Mar 27 13:16:35 2025 -0500
Output keys: support {...} format in addition to %...% format
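Accepting both key syntaxes can be sketched with a single regular expression. This is a hedged illustration rather than the tool's actual substitution code: `format_output_path` and the key names used with it are hypothetical, and leaving unknown keys untouched is an assumption made for this example.

```python
import re

# Matches either %key% or {key}; exactly one of the two groups is set.
_KEY = re.compile(r"%(\w+)%|\{(\w+)\}")


def format_output_path(template, values):
    """Replace %key% and {key} placeholders using the `values` mapping."""

    def substitute(match):
        key = match.group(1) or match.group(2)
        # Unknown keys are left as written instead of raising an error.
        return str(values[key]) if key in values else match.group(0)

    return _KEY.sub(substitute, template)
```

For example, `format_output_path("out/%pid%/{hostname}.csv", {"pid": 42, "hostname": "node0"})` yields `out/42/node0.csv`.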
commit afdb63a0814f0954dfb7500abe5c95fbacdddbc2
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Mar 27 00:43:50 2025 -0500
[Release] Replace implementation usage
- rocprofiler_record_dimension_info_t -> rocprofiler_counter_record_dimension_info_t
commit d60ecf99334d96a75b08f99d7a5d8556588258e3
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Mar 27 00:43:04 2025 -0500
[CMake] ROCPROFILER_BUILD_DEPRECATED_WARNINGS option
commit a89de9f205d500ffc9fdbef400b8b712b167782b
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Mar 27 00:30:52 2025 -0500
[python rocpd] bindings updates
- read_{code_objects,kernel_symbols,nodes,processes,threads}
- support writing perfetto output for regions
- support fallback casting of python object
- defined various data types for python
commit bb048b42f5828ce742947e1d3b72a35c578c0b0c
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 22:18:56 2025 -0500
[SDK] hip stream tracing update
- disable hip stream tracing support for HIP compiler API functions
- add "stream_value" field to rocprofiler_callback_tracing_hip_stream_data_t
- data type: rocprofiler_address_t
commit 392a12b20ee795ab025097066444504aac3ddd88
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 22:02:22 2025 -0500
[output library] generateRocpd.cpp updates
- generic functions for accessing fields which may/may not exist
- simple timer logging
- populate parent_stack_id
- misc cleanup
commit 86ae21d178a0c52ec47e27414b63edbd2a62a94d
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 21:46:34 2025 -0500
[public API] update cxx/serialization/load.hpp
- ROCPROFILER_SDK_CXX_SERIALIZATION_LOAD_DEBUG
- load definitions
- rocprofiler_address_t
- rocprofiler_callback_tracing_code_object_load_data_t
- rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t
commit 4ba6232c23296791df484d47db8268e4bc997c0d
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 21:44:52 2025 -0500
[public API] update cxx/perfetto.hpp
- function: get_perfetto_category for callback tracing and buffer tracing kinds
commit 4c0fcf4395f6c337cf3b955ec45b0567f9b3a477
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 21:26:32 2025 -0500
[python] Rename rocpd/schema_data/* files
- {tables,views,indexes}.sql
commit 47b862ece9ed3f16d7a5ecd6b632f13b0086bc01
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 21:23:35 2025 -0500
[external] cleanup external/CMakeLists.txt
commit 4e2a71db9b224b11f4144104b4569a5efb45b6c2
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 21:23:05 2025 -0500
[samples] Minor tweak to counter collection sample app
commit c62dd55c5e82b4105bdce374dc45c6adf65a0cc4
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 21:22:08 2025 -0500
[tool library] Rework kernel rename and stream id data
- simplify and improve memory management
- support stream information for memory allocation
commit 2521d499664bb01c0e2f9f9454b5da5c38b29cfc
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Wed Mar 26 21:18:46 2025 -0500
[output library] Misc reorg and updates
- dropped "with_stream" from various data types
- add stream support to memory allocation records
- generator<T> base class resize function
- serialization load functions for kernel symbols, node_info
- reorganization of SQL code
- cereal::SQLite3InputArchive
commit 70a76d6352dca84241f1749da660f8af8e89c469
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Mon Mar 24 03:52:58 2025 -0500
Misc fixes after rebase
commit 4811201c3e07ac8b5b4edc4028df9cdbc3481bd1
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Nov 14 22:30:43 2024 -0600
Python Bindings
Fix rocpd graph support
- remove uniqueness constraint from subclass
- replace assumption of "graphExec" with "pGraphExec" in the parsed API arguments
pybind11 submodule
Initial roctx python bindings
Fix testing
Add rest of python bindings for ROCTX
Add decorators and context manager for roctx
Fix: restore rocprofiler_reset_python3_cache from CMakeLists
work in progress! added stub test for pybind
added subdirectory for pybind-test
Fixing CMakeLists for pybind-test
Pybind-test
- Move simple-transpose-pybind.py to tests/bin
- Execute test in CMakeLists for pybind-test
Fix python format
Update rocprofv3 python bindings testing
- move to tests/rocprofv3/python-bindings
- remove misc code
- add marker.py
Update roctx python bindings
Update rocprofv3 python bindings test validation
Update rocprofv3 python bindings test validation
- Add checks for marker_api in json format
- Unify checks for csv and json
patch removing tmp trace_processor_python_api file after validate is run, otherwise it blocks the work of others
correction of previous commit: correct patch
removed patch because the sanitizing will be added to perfetto reader
perf reader: removing tmp user-owned file after the data are read
robust handling of removing tmp file via a context
minor clean up perfetto reader, fixing hip-in-libraries-test
minor clean up perfetto reader, fixing hip-in-libraries-test
Fix generateRocpd for older compilers
Formatting
Remove rocpd-db-to-chrome-tracing console script via pip
Fix PYTHONPATH for rocprofv3-trace-roctx-python-bindings-*
replaced context by PerfettoReader destructor to remove tmp perfetto dir after the test run
formatting python
formatted cmake for pybind-tests
Update contexts and decorators roctx python bindings
added stub Jupyter notebook for python analysis
formatting notebook
Update Jupyter notebook for python analysis
- Create summary tables as views : memory_copy, kernel_dispatch, hip_api
upd the generate_rocpd.py by schema reading and rocpd_node table and queue to rocpd_kernelapi
comments to discuss on generateJson and generateRocpd.cpp
comments to discuss on generateJson and generateRocpd.cpp
remove comments because they block pull
remove comments because they block pull
format generate_rocpd
formatted generateRocpd and generateJSON cpps
formatted generateRocpd and generateJSON cpps
formatted the analysis notebook
reset the Rocpd and JSON generators to the state before I added comments because my local formatting with clang-format fails
add using pandas dataframe to generate summaries
added basic pytorch trace example in tests
upd execute toy pytorch test
pretty printing 3 temporary views bypassing pandas frames
set instrument cells (functions, constants) in front of usage cells
added experimental routines for creating time slots via markers and counting given api calls within this time slot
formatting
simplified utilities
more simplified utilities, added label usage in report
more simplified utilities, added label usage in report
forgotten commit: example matrix mult
added extracting copy operations by kind, added second db to compare, moved looping over database outside basic procedures
marker around print
replaced push.pop with just markers
added duration and matrixmult flops, lists of kernel names and api names
add bin directory variable to pass to generateRocpd
correcting schema
add bin directory variable to pass to generateRocpd
debug upd for matrixmult
create default timed views as copy of the original db
correct table naming in the original db
removing rocpd_string, rocpd_metadata from views
removing rocpd_string, rocpd_metadata from views
removing rocpd_string, rocpd_metadata from views
implemented timed views clean up and creation in notebook
formatted python
formatted python
formatted files via make format and commented pytorch test
added new line at the end of the tableSchema
Add tracing script for rocpd time slicing views
- Start and end options: timestamps/percentage , markers
rebase and format one cpp file
fix source formatting
Rocpd: Time slice data
- Rename tracing.py > time_slice.py
- Adapt time_slice.py to use RocpdImportData, fix typo 'rocpd_api_ops', add output argument (overwrite input otherwise), update rocpd_kernelapi view
- Simplified chrome_tracing.py: remove start+end arguments, avoid manipulating the time window/time slice
Fix rebase issues
[DO NOT MERGE] rocpd-schema.md
Schema updates
rocpd schema v3 updates/fixes
rocpd schema v3 updates/fixes
- samples view
- kernels view
- insert zero duration regions as samples
generateRocpd update rocpd_memory_allocate table fill
Misc compilation cleanup
tableSchema.sql updates
Add core performance analysis views:
- Add busy view for GPU utilization metrics
- Add top view for overall performance summary
- Add top_kernels view for kernel performance analysis
- Add memory_copies view for memory transfer tracking
- Add memory_allocates view for memory allocation stats
- Add kernel_summary view for aggregate kernel metrics
utilitySchema.sql clean up TOP view and some consistency changes
added marker views
Update rocpd schema
- new: rocpd_pmc_event
- modified: rocpd_pmc
- modified: _rocpd_memory_copy
- name_id
- add rocdecode API
- populate rocpd_pmc and rocpd_pmc_event
- slight modifications to column names for consistency
memory allocation insert for db
memory allocation insert for db
memory allocation insert for db
commit after rebase
autoincrement for memory allocation
stringsanitizer applied for pmc description
replaced inner join with left join for stream id, queue id, region name id
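The effect of swapping an inner join for a left join can be sketched with Python's sqlite3 module. This is a minimal illustration only: the `kernel` and `stream` tables and their columns are hypothetical stand-ins, not the actual rocpd schema. It shows that a row whose foreign key is NULL disappears under INNER JOIN but survives under LEFT JOIN.

```python
import sqlite3

# Hypothetical tables for illustration; not the real rocpd schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stream (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE kernel (id INTEGER PRIMARY KEY, name TEXT, stream_id INTEGER);
    INSERT INTO stream VALUES (1, 'stream-1');
    INSERT INTO kernel VALUES (1, 'with_stream', 1);
    INSERT INTO kernel VALUES (2, 'without_stream', NULL);
    """
)

# INNER JOIN keeps only kernels that have a matching stream row.
inner_rows = conn.execute(
    "SELECT K.name, S.name FROM kernel K "
    "INNER JOIN stream S ON S.id = K.stream_id ORDER BY K.id"
).fetchall()

# LEFT JOIN keeps every kernel; the stream name is NULL (None) when missing.
left_rows = conn.execute(
    "SELECT K.name, S.name FROM kernel K "
    "LEFT JOIN stream S ON S.id = K.stream_id ORDER BY K.id"
).fetchall()

print(inner_rows)
print(left_rows)
```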
Update chrome_tracing.py
- chrome_tracing script updated using new schema
- importer.py pending update with new schema, currently commented out
Misc compilation cleanup after rebase
Update rocpd/time_window.py
- time_window.py script updated using new schema
- Adapt chrome_tracing.py to use time windows
added rocpd_arg and the join view with events, populated
set back default python version to 3.6 for remote
added dereferencing stream ids and usage, some formatting
refine error messages and help arguments in time_windows.py
test if removing exact python version fixes integration test
removed unused variable
formatting and added back exact 3.6 to fix python linting (try)
removed the exact python version again because it causes advanced analysis to fail
revise microseconds to nanoseconds
made type and name text again, put back strict version for python
Reorganization of public cxx serialization headers
Rework generator<T> (abstract base) + file_generator<T>
- return file_generator<T> from buffered_output
Add gotcha submodule
Update cxx serialization headers
Revert some HIP stream changes
[output library] Misc updates
- fix serialization of agent_info
- add parseRocpd.{hpp,cpp}
- move common SQL utils to sql.{hpp,cpp}
[python libraries] Bindings for rocpd + reorg
- move common cmake code into source/lib/python/utilities.cmake
- build python bindings for rocpd (primitive implementation)
- Require ROCPROFILER_BUILD_SQLITE3=OFF for rocpd python bindings
- wrap sqlite3_open / sqlite3_close / sqlite3_open_v2 / sqlite3_close_v2
Update rocpd/importer.py
- importer.py updated using new schema
commit aff19818a5c9f9f6004013144ae00e2c31b21739
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Nov 14 22:27:39 2024 -0600
[SEPARATE PR] HIP API buffer records with args (ext)
- Needs to be in separate PR
- New buffer tracing domain(s) for HIP APIs which include the arguments and the return value in the buffer records
Update HIP stream support for extended HIP buffer tracing
Update rocprofv3 tool library and output library to use extended HIP buffer tracing records
commit 43c3f0ddd5a104346d6db77b8a1b66fd9ec2f797
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Nov 14 18:47:52 2024 -0600
[SEPARATE PR] Combo of several separate PRs
[SEPARATE PR] Update correlation ID retirement
- Needs to be in separate PR
- correlation_id_finalize
[SEPARATE PR] rocprofiler_query_intercept_table_name
- Needs to be in separate PR
- Function to get the name of intercept tables
[SEPARATE PR] Memory copy and memory alloc updates
commit e3da9738b06f974fb6b935893f4172852819b6bc
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Mon Nov 4 17:22:30 2024 -0600
Add SQLite3 build support
Initial SQL writing to rocpd schema in C++
task code
Implement kernel dispatch and memory copy writing to table
Update generateRocpd.cpp
- SQLITE3_CHECK macro
- Error messages for sqlite3_exec errors
- Tweaked rocpd_kernelcodeobject to use kernel ids as id
- fixed some issues:
- use string_entries.at(...) instead of (SELECT id FROM ... WHERE string = ...)
- use custom op_idx instead of relying on correlation ID (one correlation ID can map to multiple ops)
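The idea of looking ids up in an in-memory map rather than issuing a `SELECT id FROM ... WHERE string = ...` subquery per insert can be sketched as follows. This is an assumption-laden Python analogue, not the generateRocpd.cpp code: the `strings` table, the `intern_string` helper, and the id numbering scheme are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical string table; the real schema may differ.
conn.execute("CREATE TABLE strings (id INTEGER PRIMARY KEY, string TEXT UNIQUE)")

# In-memory map from string to id, playing the role of string_entries.
string_entries = {}


def intern_string(value):
    """Return the id for value, inserting it into the table only once."""
    if value not in string_entries:
        new_id = len(string_entries) + 1  # ids assigned by the writer, not looked up
        conn.execute("INSERT INTO strings (id, string) VALUES (?, ?)", (new_id, value))
        string_entries[value] = new_id
    return string_entries[value]


a = intern_string("hipMemcpy")
b = intern_string("hipLaunchKernel")
c = intern_string("hipMemcpy")  # served from the map, no SQL lookup needed
row_count = conn.execute("SELECT COUNT(*) FROM strings").fetchone()[0]
```

The trade-off in this sketch is memory for speed: every distinct string is held in the map for the lifetime of the writer.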
Update generate-rocpd.py
- misc tweaks to mirror generateRocpd.cpp implementation
- add strings for counter names
- use kernel id for rocpd_kernelcodeobject
Tweak tests/bin/hip-graph/hip-graph.cpp
- use stream sync
Update generateRocpd.cpp
- made table and view schema more readable
- sql_exec_callback
Add source/scripts/rocprofv3-db-to-tracing.py
- Script which reads a rocprofv3 rocpd database and outputs a chrome tracing format JSON
Common library updates
- static_tl_object: similar to static_object but for thread-local objects
- additional template metaprogramming constructs
- reverse
- function_traits
rocprofiler_stream_id_t: opaque handle for a stream
- e.g. HIP stream
- the same HIP stream may map to different HSA queues at different points in the application
- added to:
- rocprofiler_buffer_tracing_hip_api_record_t
- rocprofiler_buffer_tracing_memory_copy_record_t
- rocprofiler_callback_tracing_hip_api_data_t
- rocprofiler_callback_tracing_memory_copy_data_t
rocprofiler_stream_id_t: output support
- use stream_id in generatePerfetto.cpp
- use stream_id in generateRocpd.cpp
rocprofiler_stream_id_t: output support
- rocpd_kernelapi and rocpd_copyapi encode stream/queue as integer instead of string
Update source/scripts/rocprofv3-db-to-tracing.py
- Create temporary view from multiple db file
- Modify write_chrome_tracing_json to use the tmp views
Update rocprofv3-db-to-tracing.py
- retain rocpd tables names in create_temp_view instead of appending "_tmp"
- improve the GPU/Agent and Queue identifiers
- make the SQL statements easier to read
- remove rocpd_hsaApi usage (not necessary as HSA data resides in rocpd_api table)
- directly connect to input if only one database
Update generateRocpd.cpp
- use dispatch id for rocpd_kernelapi id (primary key)
- use dispatch_id for rocpd_op sequenceId
- create copy_id for rocpd_copyapi id (primary key)
- use copy_id for rocpd_op sequenceId
Update tests/{async-copy-tracing,kernel-tracing,page-migration}
- expand reading the RCCL and scratch memory trace data
Add sdk::parse::strip function
- utility function for stripping characters from beginning and end of string
Preliminary rocpd_node table
- rocpd_node is a meta-table for process ids
- added lib/output/node_info.{hpp,cpp}
- add tool::node_info instance to tool::metadata
Update page-migration test
Python packaging of rocpd
add temporary view rocpd_copyapi
Update rocpd/importer.py
- Add rocpd_kernelapi and rocpd_copyapi
- Create meta temporary views using RocpdSchema
JSON tool handling of HIP compiler API data
Misc rocpd updates
- SQL formatting
- remove unused imports
- fix relative import
lib/output metadata updates
- add process_init_ns and process_fini_ns
- add command_line member
rocprofv3 updates: tool.cpp
- simplify buffer service configuration
- rocprofiler_at_intercept_table_registration -> api_timestamps_callback -> start time
- record init, fini, start, and end times
Added simple_timer to common library
Add missing new lines at end of files
f
added labels for multiprocessed, tested kernel-rename, grouped by queue, type-relative of but bug in run-label
- fixed static marker name to dynamic
rocpd fix python linting errors
Squashed commit of JSON changes, small CSV, OTF2, pftrace fixes, and rocpd commandline params
Contains these original commits:
commit 2c7a10771d60ad0b93073f94f6226a6e92ade4cb
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Wed May 14 18:57:30 2025 +0000
removed run markers reporting from otf2 as they duplicate api calls
commit 826c0c13a164e7a5d2c7f1963cc437ee53416658
Author: Young Hui <young.hui@amd.com>
Date: Wed May 14 12:55:51 2025 -0400
rocpd command line updates
- some stats-summary params moved to generic params since applies to json and csv
- added --group-by-queue for perfetto
commit 33a82dcc9c05869f70219b75c76e4d0e6ae84a39
Author: acanadas <acanadas@amd.com>
Date: Wed May 14 07:12:45 2025 -0500
Fix corr_id for kernel in write_csv, fix std dev in summary views
commit b89dfbd02a243c3e1b5d1a4a968ab5b7c9ecb3a3
Author: acanadas <acanadas@amd.com>
Date: Wed May 14 04:03:56 2025 -0500
Fix generate json merged for multiple db
commit 4968f65f51a6539c95a801c1340487a5675b1f45
Author: Jin Tao <jin.tao@amd.com>
Date: Wed May 14 08:02:10 2025 +0000
clean the code for json: remove host function, but keep the basic structure
commit 0f5ed2011a6da5a6046974c03c64569a4747102e
Author: Jin Tao <jin.tao@amd.com>
Date: Wed May 14 07:12:23 2025 +0000
small fix
commit 152c149905e98f33688fdb0cfc5ca88b3c61694d
Author: Young Hui <young.hui@amd.com>
Date: Tue May 13 22:13:35 2025 -0400
Schema revert rocpd_event.correlation_id back to INT
- removed duplicate K.tid from kernel views
- re-enabled some json.cpp code to build
commit 8af5ff79b1f85cd1f2ac4c61af580dc891e6dd70
Author: Jin Tao <jin.tao@amd.com>
Date: Tue May 13 11:13:56 2025 +0000
add host_functions
commit 2a3e0307ab7571be0f46ffd2f706d2c9f34cdce1
Author: acanadas <acanadas@amd.com>
Date: Tue May 13 06:08:10 2025 -0500
Adding summary to write_json, add extra information for sql column type error
commit d339f1cc5eb856e9e3bd7461ac4d776e8265b7cd
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Mon May 12 17:42:47 2025 +0000
fixed counter annotations for kernel dispatch AND shadowed names in json.cpp
commit d3a8ece3135a8cb5a78b05f7ca653793d89d5d5a
Author: acanadas <acanadas@amd.com>
Date: Mon May 12 09:18:56 2025 -0500
Fix node for counters in write_json
commit 5153841b8d047bb4c78aeda5675848f2887aef29
Author: acanadas <acanadas@amd.com>
Date: Mon May 12 08:26:22 2025 -0500
Adding counters. code_objects and kernel_symbols in write_json
commit e979a1ab87d6b148d2b90ae25fc60d5e766e3251
Author: Jin Tao <jin.tao@amd.com>
Date: Mon May 12 13:10:13 2025 +0000
add json counter_collection.records
commit 026d022052122ee36f83022e270110840dc38aa3
Author: Young Hui <young.hui@amd.com>
Date: Sat May 10 00:29:33 2025 -0400
Fix kernel_dispatch thread_id in Perfetto traces, more pftrace args supported and formatting
- perfetto-backend=system results in 0KB pftrace file, but probably need the traced and perfetto daemons (have not tested with them)
commit f1341487723672b407fa89d2d383944d92663c55
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Fri May 9 17:08:40 2025 +0000
fixed counters for perfetto but need more testing
commit 33612f9bc6121e38ad9b41d2aecd035d1d629efb
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Fri May 9 13:34:50 2025 +0000
rearchitected generator tests into a dedicated rocpd-generators suite (to be extended) and json file-name input made a parameter
commit da1e386b77bc1bbbbdc1ca8ee2ffd8c09a99ae5b
Author: acanadas <acanadas@amd.com>
Date: Fri May 9 08:36:00 2025 -0500
fixing buffer_record missing values in json, fix using clang_tidy
commit f7adb8b28ab64127ef20ff642ced40ccace7fef8
Author: acanadas <acanadas@amd.com>
Date: Fri May 9 10:08:28 2025 +0000
revise callback_records.counters_collection, still has 3 TBDs
commit 49d4057e7b7e295953915b90cdeb7359948c5034
Author: acanadas <acanadas@amd.com>
Date: Thu May 8 11:10:32 2025 -0500
Fixing correlation_id for csv and json
commit bb37556c9d1da7fe1ccb0bc5bf0e05d4f0989476
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu May 8 13:27:01 2025 +0000
This is a combination of 3 commits.
tree node indexing updated, added extended agent struct for otf2 for effective handling agents, agent indexes and labeled agent names
rolled back tree id as process counter, tested well (source and test)
commit 38dc36da08a5a505b6d7d47840690b4d6c29b429
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Thu May 8 13:26:46 2025 +0000
tree node indexing updated, added extended agent struct for otf2 for effective handling agents, agent indexes and labeled agent names
commit 2958ba37abc86e5b03bc03f5d529c0efcfd2ce45
Author: acanadas <acanadas@amd.com>
Date: Thu May 8 08:08:10 2025 -0500
Adding size, kind and operation for buffer_records in write_json
commit 654c42d8a1c341ef1bfbb724eaf6b24e64fb5475
Author: acanadas <acanadas@amd.com>
Date: Thu May 8 07:40:30 2025 -0500
Fixing merge with schema change
commit 01f133bc0fed66d60cb5fe05d9e0b256938c7fb0
Author: acanadas <acanadas@amd.com>
Date: Thu May 8 06:48:58 2025 -0500
fixing write_json merge
commit dd3bbd5ed6faabc2207d9bf933c5c8efb8a2fe16
Author: root <root@smc300x-ccs-aus-GPUF2C5.cs-aus.dcgpu>
Date: Thu May 8 04:45:56 2025 -0500
update write_json adding strings, buffer_records and counters
commit 6a991305bf75f4f2e944b40e7194b43fcf1ec340
Author: Young Hui <young.hui@amd.com>
Date: Thu May 8 01:29:08 2025 -0400
Schema changed rocpd_event.correlation_id to TEXT type [WIP]
- when --kernel-rename is used, SQL stores correlation_id.external.value value with precision error. To store accurately, need to store as TEXT.
- work-in-progress commit, still getting a few rocpd SQL conversion warnings due to TEXT type change.
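One way such a precision error can arise in SQLite is sketched below. This assumes, and the commit message does not confirm, that the affected value exceeds the signed 64-bit range: SQLite then treats the numeric literal as a floating-point value, while a TEXT column keeps every digit. The table names are hypothetical.

```python
import sqlite3

# Assumption: a value above the signed 64-bit maximum (2**63 - 1).
value = 2**64 - 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE as_int (correlation_id INTEGER)")
conn.execute("CREATE TABLE as_text (correlation_id TEXT)")

# The literal is embedded in the SQL text, as a generated statement would do.
conn.execute(f"INSERT INTO as_int VALUES ({value})")
conn.execute(f"INSERT INTO as_text VALUES ('{value}')")

stored_int, int_type = conn.execute(
    "SELECT correlation_id, typeof(correlation_id) FROM as_int"
).fetchone()
stored_text, text_type = conn.execute(
    "SELECT correlation_id, typeof(correlation_id) FROM as_text"
).fetchone()

# The INTEGER column ends up holding a REAL, which cannot represent the value exactly.
print(int_type, int(stored_int) == value)
# The TEXT column round-trips exactly.
print(text_type, int(stored_text) == value)
```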
commit d50b058dc764394f9fc9cfc264866f43f9354da0
Author: oshkarav_amdeng <oshkarav@amd.com>
Date: Fri May 2 09:32:12 2025 +0000
This is a combination of 2 commits.
fixed clang tidy complaint on otf2 and perfetto, added guid-s t info names, but still 2-process DBs fail
fixed multiprocess otf2, but agents still need to be labeled by guid
Misc fixes
- flush less often for perfetto
- metadata create_agent_index
- remove sqlite3_close from write_otf2
- warning if gotcha_wrap fails
- fix rocpd-generators tests
- fix multi-python build config
Whitespace cleanup
Revert tests/rocprofv3/summary/CMakeLists.txt
Remove unused scripts
- generate-rocpd.py is outdated
- rocprofv3-db-to-tracing.py is scratch
- simple-transpose-pybind.py is no longer used
Fix maybe-uninitialized compiler warning
Fix maybe-uninitialized compiler warning
Update rocpd.write_X functions to require RocpdImportData instance
- create libpyrocpd.RocpdImportData
- rocpd.importer.RocpdImportData inherits from libpyrocpd.RocpdImportData
- write_csv, write_json, etc. all require instance of RocpdImportData instead of list of sqlite3.Connection
output/rocpd source reorganization
- move lib/output/parseRocpd.* to lib/python/rocpd/source/common.*
- move lib/output/serialization to lib/python/rocpd/source/serialization
- move lib/output/sql/generator.hpp to lib/python/rocpd/source/sql_generator.hpp
Minor refactor of lib/output/sql/*
Remove lib/output/sql.hpp
- this file just included other headers
Fix rocpd source reorg
Update schema files
- replace quotes with backticks
remove short guid from perfetto
fix string sanitization in generateRocpd.cpp
* Update cereal submodule
* Updates following rebase
- remove MANIFEST.in
- remove rocpd/schema_data/*.sql
- use rocprofiler-sdk-rocpd
* rocpd command line update to pass importData instead of connection
* Update get_perfetto_category for ROCDECODE_API_EXT
* data_views.sql updates
- remove kernels_renamed view
- remove rccl view
- remove rocjpeg view
- remove rocdecode view
- remove api_regions view
- remove api_threads view
* Perfetto, OTF2, output config updates
- Support kernel rename for output config
- Combine memory copy and kernels into same stream track
- Fix get_category_string in perfetto
- Support kernel renaming in OTF2 and Perfetto
* Add samples to perfetto
* ORDER BY for perfetto regions/samples
* Fix busy view, fix integer overflow in summary views
* CSV adding --stats-per-rank and --stats-summary-per-rank reports
Squash of these original commits:
commit 4c0ded4efb6ef273953faf7fd100a54ce16ae00b
Author: Jin Tao <jintao12@amd.com>
Date: Mon Jun 2 07:25:06 2025 +0000
change all rank output to PID instead of GUID
commit 969950b009ad18a156a31bfee43f64e032720262
Author: Jin Tao <jintao12@amd.com>
Date: Wed May 28 11:22:29 2025 +0000
Change --stats-per-rank and --stats-summary-per-rank to csv.py
commit 970f88dec3bb83663d8a84d8bf4dfcfb8857e903
Author: Jin Tao <jintao12@amd.com>
Date: Thu May 22 11:39:51 2025 +0000
refactor and small bug fix
commit a4c5ab4290c0a9b7b45421c3fc21150914d12110
Author: Jin Tao <jintao12@amd.com>
Date: Thu May 22 10:59:13 2025 +0000
add args for --stats-per-node and --stats-summary-per-node
commit 432907d0a54de886d01f962f3084c6708bdb1456
Author: Jin Tao <jintao12@amd.com>
Date: Thu May 22 09:22:55 2025 +0000
add summary for marker, rccl, rocdecode, rocjpeg_calls
commit ab53c9c8f4a6f5f7bf40a32bf1f93add1dbd10f5
Author: Jin Tao <jintao12@amd.com>
Date: Wed May 21 13:33:04 2025 +0000
add hsa_api_stats_by_node, hip_api_stats_by_nod
commit b26974bdd6d0047759b4014f9c226cf771f0a2ad
Author: Jin Tao <jintao12@amd.com>
Date: Wed May 21 11:14:07 2025 +0000
Add memory allocation stats by node and memory copy stats by node
commit d02819c2c4bd57bfeb0537d3a42d9f22866e8600
Author: Jin Tao <jintao12@amd.com>
Date: Wed May 21 08:33:08 2025 +0000
Kernels (or categories) by node completed
* JSON fixes for agent in memory allocation if free in write_json, add group by nid for _summary_node views
Squash of these original fixes:
commit 25d8d985b319585fab16eb6d8e9ec5c91e767a91
Author: acanadas <acanadas@amd.com>
Date: Thu May 22 14:00:20 2025 +0000
fix agent for memory allocation if free in write_json, add group by nid for _summary_node views
commit d64905418a307db9e56b816e4b9f4de587bc3a14
Author: acanadas <acanadas@amd.com>
Date: Thu May 22 08:55:21 2025 +0000
Fix typo json.cpp for memory copy src agent
* CSV add domain_stats_per_rank and PID header fix
Squash of these 3 commits:
commit 10ec1cf40cc92a9d7ce7db84a8c6bbe20c51a340
Author: Jin Tao <jintao12@amd.com>
Date: Tue Jun 3 10:06:33 2025 +0000
small fix PID header rename
commit 6775fd3b2031b30ae389b165cf44b321026c1e80
Author: Jin Tao <jintao12@amd.com>
Date: Mon May 26 11:47:23 2025 +0000
Add domain_stats_per_rank
commit 9f3eb1793e354c1f04bab1fbc837e86f1f5b5030
Author: Jin Tao <jintao12@amd.com>
Date: Thu May 22 12:11:28 2025 +0000
add domain_summary_node view
CSV PID header formatting
* upd otf2 generate validation test went well
* fix Cmake formatting
* In data_views, change P.id/T.id to P.pid/T.tid
* Remove unapproved views for this PR
* Remove rocpd JSON output support
* Fix clang-tidy errors
* Revert output_config changes + cmake + remove chrome_tracing.py
* Revert
* Misc rocpd cleanup
- remove all CsvType per-node and stats enum values
- remove rocpd::types::marker
- remove rocpd/source/stats_summary.{hpp,cpp}
- add `sample_regions` view
- add `regions_and_samples` view
- move tests/rocprofv3/rocpd-generators to tests/rocprofv3/rocpd
- merge validate_perfetto.py and validate_otf2.py into single validate.py
* Remove stats options from rocpd.output_config
* Remove json from rocpd.__main__
* Remove rocpd.csv summary/stats options
* Remove generate_from_rocpd.py
* Add rocpd subparser (convert) + migrate tests to use python3 -m rocpd
* Add additional tests for the rocpd command line
- Check the --help flag works for:
- python3 -m rocpd --help
- python3 -m rocpd convert --help
- python3 -m rocpd.csv --help
- python3 -m rocpd.otf2 --help
- python3 -m rocpd.pftrace --help
* adding rocpd python shebangs and main function parameter ordering
* Fix sanitizer tests + remove read_code_objects and read_kernel_symbols
* Misc updates
- Update CHANGELOG.md and source/lib/python/rocpd/README.md
- update time_window.py
- find min/max in regions_and_samples/kernels/memory_allocations/memory_copies
- RocpdImportData supports all the same functions as sqlite3.Connection
* Improve time_window.py
- find the tables with start/end dynamically
- find the tables with timestamp dynamically
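Discovering tables by their columns at runtime might look like the sketch below. It is not the time_window.py implementation: the helper name `tables_with_columns` and the sample tables are hypothetical, and it relies only on `sqlite_master` and `PRAGMA table_info`, which SQLite provides.

```python
import sqlite3


def tables_with_columns(conn, required):
    """Return the names of tables that contain every column in required."""
    required = set(required)
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    matches = []
    for name in names:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = {row[1] for row in conn.execute(f'PRAGMA table_info("{name}")')}
        if required <= columns:
            matches.append(name)
    return matches


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE regions (id INTEGER, start INTEGER, "end" INTEGER);
    CREATE TABLE samples (id INTEGER, timestamp INTEGER);
    CREATE TABLE strings (id INTEGER, string TEXT);
    """
)

interval_tables = tables_with_columns(conn, ["start", "end"])
instant_tables = tables_with_columns(conn, ["timestamp"])
```

Because the lookup is driven by the schema itself, a table added later with `start`/`end` columns would be picked up without changing the script.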
* Minor revert to time_window.py
* Remove tests/rocprofv3/pytorch-tests
* Fix python 3.6 error in time_window.py
- 'type' object is not subscriptable
* Fix rocpd installation
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Young Hui <young.hui@amd.com>
Co-authored-by: acanadas <acanadas@amd.com>
Co-authored-by: Jin Tao <jintao12@amd.com>
Co-authored-by: oshkarav_amdeng <oshkarav@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
adc4e6995d |
[SDK] Support HIP 7.0 API changes (#432)
* [Do not merge] Make changes to api_args
* Support HIP 7.0 API changes
- Provide ROCPROFILER_SDK_ variants of ROCPROFILER_ version defines
- Provide ROCPROFILER_SDK_COMPUTE_VERSION
- hipCtxGetApiVersion changes parameter from int* to unsigned int*
- hipMemcpyHtoD and hipMemcpyHtoDAsync changed void* to const void*
* Fix comment
---------
Co-authored-by: Jatin Chaudhary <jatchaud@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0e93099fd7 |
[rocprofv3] SQLite3 database output (rocpd) support + rocprofiler-sdk-rocpd (#403)
* [rocprofv3] rocpd SQLite3 database output support
* Move counters xml and yaml to source/share/rocprofiler-sdk
- more representative of install hierarchy
* Add share/rocprofiler-sdk/rocpd SQL files
* Experimental rocprofiler-sdk SQL API
* rocprofv3 default output format is rocpd
* Fix rocpd event ids for counter collection w/o kernel dispatch
* Remove fktable entries from rocpd_tables.sql
* Fix rocpd schema path
* Fix install component for roctx python bindings
* rocprofiler-sdk-rocpd
- create include/rocprofiler-sdk-rocpd
- create rocprofiler-sdk-rocpd library, package, etc.
- default all "guid" fields to "{{guid}}" in tables
- remove "{{view_uuid}}" support (always unused)
* Migrate rocprofv3 to use rocprofiler-sdk-rocpd
* Fix missing foreign key reference
* Revert change
* Fix cmake comment
* Fix maybe-uninitialized compiler warning
* Fix maybe-uninitialized compiler warning
* Add logging to rocpd_sql_load_schema
* Improve string sanitization when inserting json strings
* Initialize rocpd logging on rocprofiler-sdk-rocpd library load
* Revert lib/output/generatePerfetto.cpp changes
* [temporary] Tweak rocprofv3-test-list-avail-trace-execute test log level
* Update get_install_path for lib/rocprofiler-sdk-rocpd/sql.cpp
- try to resolve issues on RHEL/SLES for dladdr
* Update lib/common/logging.cpp
- enable environ overrides
* dlsym for rocpd_sql_load_schema
* Make dl_info.dli_fname lexically normal
* Implement node_info alternatives if /etc/machine-id does not exist
* Misc include fixes
* SHA256 and UUIDv7 support
* Implement UUIDv7 in generateRocpd.cpp
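For reference, the UUIDv7 bit layout from RFC 9562 (48-bit Unix millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits) can be sketched in Python. This illustrates the format only and makes no claim about how generateRocpd.cpp builds its identifiers; the `uuid7` helper is hypothetical.

```python
import os
import time
import uuid


def uuid7(ms=None):
    """Build a UUIDv7 from a Unix timestamp in milliseconds plus random bits."""
    if ms is None:
        ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                 # top 12 bits
    rand_b = rand & ((1 << 62) - 1)     # low 62 bits

    value = (ms & ((1 << 48) - 1)) << 80  # unix_ts_ms in bits 127..80
    value |= 0x7 << 76                    # version 7 in bits 79..76
    value |= rand_a << 64                 # rand_a in bits 75..64
    value |= 0b10 << 62                   # RFC variant in bits 63..62
    value |= rand_b                       # rand_b in bits 61..0
    return uuid.UUID(int=value)


u = uuid7(ms=1_700_000_000_000)
print(u)
```

Since the timestamp occupies the most significant bits, identifiers generated in later milliseconds sort after earlier ones.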
* Support push/pop environment variables
* Minor tweak
* Fix glog segfaults when unsetting glog env
* Updated CHANGELOG
* Updates tests/pytest-packages
- rocpd_reader.py: RocpdReader
* Update tests / marker_views.sql
- add test_rocpd_data
* Update rocpd_tables.sql
- Use AUTOINCREMENT
- insert "uuid" and "guid" into rocpd_metadata
* Minor updates to generateRocpd.cpp
- don't quote GUID
- use sqlite3_open_v2
- use sqlite3_close_v2
* Update execute_raw_sql_statements_impl
- uses sqlite3_last_insert_rowid for autoincrement
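The pattern of reading back an AUTOINCREMENT id right after an insert can be sketched with Python's sqlite3 module, where `Cursor.lastrowid` plays the role of `sqlite3_last_insert_rowid`. The `event` and `arg` tables here are hypothetical and the snippet is not the `execute_raw_sql_statements_impl` code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables for illustration only.
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")
conn.execute(
    "CREATE TABLE arg (id INTEGER PRIMARY KEY AUTOINCREMENT, event_id INTEGER, value TEXT)"
)

# Insert the parent row and capture the id that SQLite generated for it.
cur = conn.execute("INSERT INTO event (name) VALUES (?)", ("hipMalloc",))
event_id = cur.lastrowid

# Use the captured id as the foreign key of a dependent row.
conn.execute(
    "INSERT INTO arg (event_id, value) VALUES (?, ?)", (event_id, "size=1024")
)

row = conn.execute(
    "SELECT E.name, A.value FROM arg A JOIN event E ON E.id = A.event_id"
).fetchone()
```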
* Update SQL deferred_transaction
- CI check for nullptr to connection
* Apply suggestions from code review
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
* Code review updates
- formatting
- replace if with switch
- remove loop for {{uuid}}
* Fix pmc_groups handling in rocprofv3
* Address code review feedback
- Include rocm_version in rocprofv3 version info
- Note `--version` option for `rocprofv3` in CHANGELOG.md
- remove commented out code
* Fix packaging dependencies
* Fix install package step of CI workflow
* Fix install package step of CI workflow
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
d093527471 |
Fix to PR #413 HIP_RUNTIME_API_TABLE_STEP_VERSION 9 vs. 10 (#419)
Fix HIP_RUNTIME_API_TABLE_STEP_VERSION 9 vs. 10
- in rocm-6.4.x, hipLinkAddData, hipLinkAddFile, hipLinkComplete, hipLinkCreate, hipLinkDestroy are HIP_RUNTIME_API_TABLE_STEP_VERSION == 9
- in amd-staging, hipEventRecordWithFlags is HIP_RUNTIME_API_TABLE_STEP_VERSION == 9
- this should be fixed in amd-staging such that HIP_RUNTIME_API_TABLE_STEP_VERSION == 10
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
6efbc492f1 |
[SDK] Update HIP support for ROCm 7.0 (#413)
* Update HIP VERSION to 7.0.0, HIP_RUNTIME_API_TABLE_STEP_VERSION to 13
* Changed rocprofiler_config_interfaces.cmake to check version after finding package
* Update cmake/rocprofiler_config_interfaces.cmake
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
abf3f869e9 |
Making ROCTx API doxygen generated document more readable (#385)
* Making ROCTx API doxygen generated document more readable
* fixing build
* Fix linking errors
* Fixing header
* Fixing Topics and Types
* doxygen configuration fixes
* Fixing build
* Fix unnecessary doc parsing warnings
* formatting and linting fixes
* rebasing SDK modular PR
* Fixing missing line
* Fixing ROCtx documentation after merge
* Removing flake changes
* changed back WARN_IF_DOC_ERROR to Yes
[ROCm/rocprofiler-sdk commit:
|
||
|
|
cfea25a13a |
ATT Doc updates. Fix trace-decode return error. (#406)
* Doc updates. Some cleanup.
* Formatting
---------
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
de6c6af917 |
HSA API Addition in EXT Table (#407)
* HSA API Addition in EXT Table
* Adding the real new API trace capability
* Formatting fix
[ROCm/rocprofiler-sdk commit:
|
||
|
|
b097e276a9 |
[rocprofv3] Add rocpd output support (part 1: prelude) (#401)
* [rocprofv3] Add rocpd output support (part 1: prelude)
- git submodules for sqlite3, GOTCHA, and pybind11
- HIP stream data
- rocprofiler_query_intercept_table_name(...)
- serialization load
- rocprofiler::sdk::get_perfetto_category(KindT)
- rocprofiler::sdk::parse::strip
- common library updates
- md5sum
- hasher
- simple_timer
- static_tl_object
- get_process_start_time_ns(pid_t)
- output library updates
- node_info
- file_generator (generator is now virtual base class)
- stream info updates
* Added submodules
* Code review updates
* Minor unused-but-set-X warning fixes
* Update CI
- install libsqlite3-dev package
* Update CI
- install libsqlite3-dev package
* Fix static thread-local object memory leak
- also fix signal handler chaining
* Remove URL from comment
* Remove page migration exception
* Enable ROCPROFILER_BUILD_SQLITE3 by default
- try find_package(SQLite3) first and then build when ROCPROFILER_BUILD_SQLITE3=ON
* Fix gotcha installation
- make install of target optional
* Validate tracing + counter collection dispatch data
- i.e. correlation ids, thread ids, timestamps
* Make find_package(SQLite3) optional
- ROCm CI does not have SQLite3 dev package installed and cannot build from source (missing tclsh)
* Fixes to tracing + counter collection test
* get_process_start_time_ns update
- original implementation did not work
* Fix pytest-packages test_perfetto_data for counter collection
- erroneous failure when used with same PMC + multiple agents
* cmake policy: option() honors normal variables
- for GOTCHA submodule
* Improve samples/api_buffered_tracing stability
- reduce likelihood of sporadic exception throw
* Update gotcha submodule
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
682b9967e0 |
[RSERP-1802] Add trace decoder to API (#398)
* Add trace decoder to API.
* Cleanup and activity
* Rename
* Minor fix
* Replace tt/TT with thread_trace/THREAD_TRACE
- public API types are not abbreviated
* Fix aliases
* Build system updates
- activate clang-tidy for all subfolders in lib
- fix addition of sources for att-tool
* Fix clang-tidy issues with lib/att-tool/counters.{hpp,cpp}
* Delete counters.cpp
* Formatting
---------
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|