* attach: milestone: API tracing
- This change must be paired with a change in rocprofiler-sdk to
fully function.
- Add ptrace entry points for tool attachment
- API tracing works at this commit
- Queue tracing not supported yet
* attach: cleanup
- Remove hardcode for loading of tool library
- Make invoke registration functions public again
* attach: proxy queue first draft
- Adds ability to trace with queues during attachment
- Must be paired with updated rocprofiler-sdk
* attach: prestore overhaul
- Must be paired with commit in rocprofiler-sdk
* attach: add dispatch table rework
- Register will load the prestore library and provide entrypoints to sdk
* attach: formatting and cleanup
* attach: revise dispatch table scheme
* attach: formatting
* attach: milestone: API tracing
- This change must be paired with a change in rocprofiler-register to
fully function.
- API tracing works at this commit
- Queue tracing not supported yet
* attach: cleanup and comments
* attach: Formatting and crash fixes
* attach: add attach duration
- Add option attach-duration-msec for attachment
* Formatting + sglang hang fix via signal handling
* Changed FATAL_IF to DFATAL_IF for scratch_memory due to persistent crash when iterating queues
* attach: proxy queue first draft
- Adds ability to trace with queues during attachment
- Must be paired with updated rocprofiler-register
* Allow null agents for scratch output
* attach: improve queue library interface
- Significant changes to force exported interfaces back to C
- Fixes bug with unknown agents at attachment
- Code objects' names may still be incorrect
* attach: add code_object support
- Kernel traces will now have names and all other information for launches
- Add capture of hsa_executable to the queue library
- Various logging improvements
* attach: rename queue library to prestore
* attach: prestore overhaul
- Must be paired with commit from rocprofiler-register
- Massive overhaul of code organization in prestore library
- Separates registrations for different object types
- Sets up future changes for initialization
* attach: add prestore dispatch table
- Removes linkage to prestore library from sdk
* attach: cleanup
* attach: formatting
* attach: fix input prompt not appearing
* attach: fix component name in cmake
* attach: revert change to export level
* Make prestore API public
* attach: update sdk attachment library WIP
- This commit is NONFUNCTIONAL
- Changes around structure to remove classes
- Separate C linkage where needed
- Still needs updates to register for correct usage
* attach: update register with dispatch table WIP
- This commit is NONFUNCTIONAL
- Changes rocprofiler_register to handle dispatch table from attach
library.
- Still needs changes in SDK with dispatch table usage
* attach: dispatch table wip
- This commit is NONFUNCTIONAL
* attach: move attach component into core
* attach: rename to rocprofv3-attach
* attach: add callbacks for new queues and code objects
* attach: finish dispatch table implementation
- Fixes kernel tracing
* attach: add cmake variable for attachment support
* feat: Add --attach alias for rocprofv3 with comprehensive attachment tests
- Add `--attach` as an alias to existing `-p/--pid` functionality in rocprofv3.py
- Create comprehensive attachment test suite with CSV and JSON output validation:
- New attachment-test application for testing dynamic profiling scenarios
- Unified test script supporting both CSV and JSON output formats
- Pytest-based validation for kernel traces, memory copies, HSA API calls, and agent info
- Add CMake integration for automated attachment testing
- Support parameterized output directory and filename specification
- Implement proper environment setup for attachment queue registration
Tests verify successful attachment to running processes and capture of:
- Kernel dispatch traces with workgroup/grid dimensions
- Memory copy operations (H2D/D2H) with size validation
- HSA API call traces across multiple domains
- GPU/CPU agent information and capabilities
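
The `--attach` alias described above can be pictured with a minimal `argparse` sketch. This is illustrative only: `build_parser` and the `rocprofv3-sketch` program name are hypothetical, this is not the actual parser in rocprofv3.py, and it assumes the PID is parsed as an integer.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the rocprofv3.py argument parser.
    parser = argparse.ArgumentParser(prog="rocprofv3-sketch")
    # Listing several option strings on one argument makes them aliases:
    # all three spellings store into the same destination ("pid").
    parser.add_argument(
        "-p",
        "--pid",
        "--attach",
        dest="pid",
        type=int,
        default=None,
        help="PID of the running process to attach to",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--attach", "1234"])
    print(args.pid)
```

Declaring the alias on the same argument (rather than adding a second option) keeps a single destination, so downstream code does not need to know which spelling was used.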
* Documentation Update
* attach: make attach script callable
* Added ROCPROFILER_REGISTER_ATTACHMENT_TOOL_LIB to remove hardcoded name
* attach: revert metrics library path changes
* Generic Attachment in Register (#942)
Remove tool references in register
* Add second param to attach call in rocprof register
* Add experimental reattachment support for ROCprofiler-SDK
This commit introduces experimental reattachment functionality allowing tools
to dynamically reattach to running processes with comprehensive design changes
to support multiple attach/detach cycles:
**Core Reattachment API:**
- Add rocprofiler_tool_configure_result_experimental_t with tool_reattach/tool_detach callbacks
- Add rocprofiler_call_client_reattach and rocprofiler_call_client_detach C exports
- Implement reattachment tracking in rocprofiler_register_attach to differentiate
initial attachment from reattachment cycles
- Add rocprofiler_register_invoke_reattach for handling reattachment requests
**Design Changes - Registration System Flow:**
The registration system now supports a dual-path initialization:
1. Initial Attachment Flow:
- rocprofiler_register_attach() -> rocprofiler_register_invoke_all_registrations()
- Full tool initialization with complete context setup
- Sets prev_attached atomic flag to track state
2. Reattachment Flow:
- rocprofiler_register_attach() detects prev_attached=true -> rocprofiler_register_invoke_reattach()
- Bypasses full re-initialization, calls client reattach callbacks instead
- Preserves existing contexts and buffers, only reactivates profiling services
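
The two flows above can be modeled in a few lines. The following is a toy Python model of the control flow only, not the rocprofiler-register implementation (which is C/C++): the `RegistrationState` class and its `calls` list are hypothetical, and a lock stands in for the atomic `prev_attached` flag.

```python
import threading


class RegistrationState:
    """Toy model of the attach entry point; method names mirror the text above."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stands in for the atomic flag
        self.prev_attached = False
        self.calls = []  # records which path each attach request took

    def invoke_all_registrations(self) -> None:
        # Initial path: full tool initialization and context setup.
        self.calls.append("invoke_all_registrations")

    def invoke_reattach(self) -> None:
        # Reattach path: skip re-initialization, call reattach callbacks.
        self.calls.append("invoke_reattach")

    def attach(self) -> None:
        # Test-and-set under the lock so exactly one caller sees
        # "not previously attached".
        with self._lock:
            first = not self.prev_attached
            self.prev_attached = True
        if first:
            self.invoke_all_registrations()
        else:
            self.invoke_reattach()
```

Calling `attach()` three times on one instance takes the initial path once and the reattach path twice, which is the behavior the flag is meant to guarantee.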
**Design Changes - Tool Library Loading:**
Enhanced rocprofiler-register library loading with function pointer resolution:
- Extended rocp_set_api_table_data_t tuple to include reattach/detach function pointers
- Automatic symbol resolution for rocprofiler_call_client_reattach/detach functions
- Support for both LD_PRELOAD and dlopen scenarios with consistent callback availability
**Design Changes - Context Management:**
Introduced dual context systems for attachment scenarios:
- get_contexts() - Original contexts for standard tool initialization
- get_attach_contexts() - Separate context map for attachment-specific lifecycle
- attach_init() - Creates contexts for ALL buffer tracing services using existing buffers
- attach_start() - Selectively starts contexts based on configuration options
- attach_detach() - Cleanly stops and destroys attachment contexts
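
As a rough illustration of the attachment context lifecycle listed above, here is a small Python model. It is a sketch under stated assumptions: the real functions live in the C++ tool, the `Context` class and the service names in `ALL_SERVICES` are hypothetical, and `attach_start` taking a collection of enabled services is an assumed signature.

```python
from dataclasses import dataclass


@dataclass
class Context:
    service: str
    active: bool = False


# Hypothetical service names, for illustration only.
ALL_SERVICES = ("kernel_dispatch", "memory_copy", "hsa_api")

# Separate map for attachment contexts, distinct from the standard contexts.
_attach_contexts = {}


def get_attach_contexts():
    return _attach_contexts


def attach_init() -> None:
    # Create one context per service, whether or not it will be started.
    for service in ALL_SERVICES:
        _attach_contexts.setdefault(service, Context(service))


def attach_start(enabled) -> None:
    # Start only the contexts whose service was requested by the options.
    for service, ctx in _attach_contexts.items():
        ctx.active = service in enabled


def attach_detach() -> None:
    # Stop every context, then destroy them so the next cycle starts clean.
    for ctx in _attach_contexts.values():
        ctx.active = False
    _attach_contexts.clear()
```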
**Design Changes - Buffer Management:**
Added reset_tmp_file_buffer() template for clean reattachment state:
- Properly closes and removes old temporary files
- Deletes existing file_buffer instances to prevent stale file position tracking
- Creates fresh file_buffer instances for clean reattachment cycles
- Addresses core issue where file position metadata becomes stale between cycles
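
The stale-position problem and the reset described above can be sketched as follows. This is a Python analogy, not the C++ `reset_tmp_file_buffer()` template: the `FileBuffer` class, its `offsets` list, and the `attach-sketch-` file prefix are hypothetical stand-ins for a file_buffer that tracks where each record was written.

```python
import os
import tempfile


class FileBuffer:
    """Hypothetical stand-in for a file_buffer: a temp file plus record offsets."""

    def __init__(self) -> None:
        fd, self.path = tempfile.mkstemp(prefix="attach-sketch-")
        self.stream = os.fdopen(fd, "w+b")
        self.offsets = []  # start position of each record written so far

    def write(self, record: bytes) -> None:
        self.offsets.append(self.stream.tell())
        self.stream.write(record)


def reset_tmp_file_buffer(old: FileBuffer) -> FileBuffer:
    # Close and remove the old temporary file...
    old.stream.close()
    if os.path.exists(old.path):
        os.remove(old.path)
    # ...then return a brand-new buffer so no stale offsets survive.
    return FileBuffer()
```

Replacing the whole object, instead of truncating the file in place, means the offsets recorded in one attachment cycle can never be applied to the file of the next cycle.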
**Design Changes - Environment Variable Injection:**
Added ROCP_REGISTERED_TOOL_ATTACH environment variable:
- Distinguishes attachment-loaded tools from LD_PRELOAD scenarios
- Enables registration system to apply attachment-specific logic
- Helps tools adapt behavior for attachment vs standard initialization
**Attachment Context Management:**
- Add attach_init/attach_start/attach_detach functions for dynamic context lifecycle
- Add reset_tmp_file_buffer template for clean reattachment state management
- Implement get_attach_contexts() for tracking active attachment contexts
**Test Infrastructure:**
- Add projects/rocprofiler-sdk/tests/rocprofv3/reattach/ comprehensive test suite
- Include reattachment test scripts with unified attachment/detachment cycles
- Add validate.py with trace data validation for kernel, memory copy, HSA API, and agent info
- Add conftest.py for JSON and CSV data loading utilities
**Configuration Updates:**
- Update CMakeLists.txt to include reattachment tests in build system
- Add environment variable ROCP_REGISTERED_TOOL_ATTACH for attachment state tracking
- Enhance rocprofiler-register library loading with reattach/detach function resolution
**Flow Impact Analysis:**
This design enables robust multi-cycle attachment by:
1. Preventing duplicate initialization on reattachment
2. Maintaining separate context lifecycles for attachment vs standard operation
3. Ensuring clean temporary file state between attachment cycles
4. Providing tools with explicit reattach/detach callback hooks
5. Supporting both programmatic and environment-based tool configuration
The experimental nature allows for iteration on the API while establishing
the foundation for production-ready dynamic profiling capabilities.
* Fix misc clang-tidy warnings/errors
* CMake Option and Environment Variable Updates
- CMake: ROCPROFILER_REGISTER_ALWAYS_SUPPORT_ATTACH -> ROCPROFILER_REGISTER_BUILD_DEFAULT_ATTACHMENT
- Env: ROCPROFILER_REGISTER_ATTACHMENT_ENABLED ->
* Source reorganization
* Formatting + new lines at EOF
* Fix flake8 F841: local variable is assigned to but never used
* Update attachment test
- get rid of 5 second start delay
- add roctx
* Rework implementation
- Remove rocprofiler_tool_configure_result_experimental_t in favor of rocprofiler_configure_attach
- Add <rocprofiler-sdk/experimental/registration.h>
- TODO: Update process_attachment.rst
* Handle re-attachment options
- inherit options from previous attachment
- check that previous options do not modify data collection services
* Fix support for tools w/o rocprofiler_configure_attach
- fix segfault when rocprofiler_configure_attach does not exist
- fix naming convention for functions accepting attach dispatch table
- cleanup rocprofiler_configure_attach implementation in rocprofv3 tool
* attach: remove unknown agent handling
- Change was from earlier commit, no longer needed
* attach: add error for attaching without library loaded
* attach: revise version numbering
* attach: register header revisions
* attach: clang format register
* attach: formatting
* attach: fix build failure
- Remove cross dependency into rocprofiler-sdk, fixes build on some systems
* attach: revise register library detection
* Update rocprofiler-register and attach library
- formatting
- proper signature of register_functor for rocprofiler-sdk-attach library callback
- remove get_dispatch_registration_table()
* Bump rocprofiler-register version to 0.6.0 + AnyNewerVersion
* Fix output support for rocprofiler-sdk-tool
* Fix formatting
* Fix clang tidy errors
* Misc rocprofiler-sdk-attach fixes
* attach: add sigint handling to attach python
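
One common way to handle SIGINT in a Python attach script is to record the request and let the normal detach path run. The sketch below shows that pattern only; it is an assumption about the general approach, not the code in the attach script, and `stop_requested`, `install_sigint_handler`, and `wait_for_detach` are hypothetical names.

```python
import signal
import threading

# Set when the user presses Ctrl-C; the wait loop checks it.
stop_requested = threading.Event()


def _on_sigint(signum, frame):
    # Do not raise here: just record the request so detach can run normally.
    stop_requested.set()


def install_sigint_handler():
    # Must be called from the main thread. Returns the previous handler
    # so the caller can restore it afterwards.
    return signal.signal(signal.SIGINT, _on_sigint)


def wait_for_detach(duration_sec=None) -> bool:
    """Block until Ctrl-C or until the optional duration elapses.

    Returns True if interrupted by SIGINT, False on timeout.
    """
    return stop_requested.wait(timeout=duration_sec)
```

A caller would install the handler, attach, call `wait_for_detach(...)`, and then detach unconditionally, so the target process is released whether the wait ended by timeout or by Ctrl-C.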
* tool README.md formatting
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
* Fix buffered output issue
* attach: add errors for tool attach
* CI Fixes
* Rework tests
* attach: improve library loading in rocprofv3 attach
* formatting
* Update tests to use pytest framework
* Fix test_attachment_hsa_api_trace
* attach: catch ctypes exceptions
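
For context, `ctypes` reports a library that cannot be loaded as `OSError` and a missing symbol as `AttributeError`. A minimal sketch of guarding both follows; which exceptions the attach script actually catches is not stated here, and `load_library` and `find_symbol` are hypothetical helper names.

```python
import ctypes
from typing import Optional


def load_library(path: str) -> Optional[ctypes.CDLL]:
    # ctypes.CDLL raises OSError when the shared object cannot be loaded.
    try:
        return ctypes.CDLL(path)
    except OSError as err:
        print(f"failed to load {path}: {err}")
        return None


def find_symbol(lib: ctypes.CDLL, name: str):
    # Looking up a symbol the library does not export raises AttributeError.
    try:
        return getattr(lib, name)
    except AttributeError:
        print(f"symbol {name} not found")
        return None
```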
* attach: fix leak in registration
* attach: fix sanitizer tests
* attach: fix sanitizer tests further
* attach: disable attach asan tests
* attach: disable ubsan test
* attach: fix permissions in installed test package
* attach: formatting
---------
Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
Co-authored-by: Tim Gu <Tim.Gu@amd.com>
Co-authored-by: Claude Code <claude@anthropic.com>
Co-authored-by: Benjamin Welton <bwelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* expose dimensional info in rocprofiler_counter_info_v1_t.
* add counter_id in dim info.
* address review comments
* format.
* address comments.
* use array of pointers for dimensions_instances.
* format and comments.
* address comments.
* new line.
* Update counter_defs.yaml
* Update counter_defs.yaml
* Update counter_defs.yaml
* counter_defs.
* format counter defs.
* format counter defs.
* format counter defs.
* show only counters being profiled in metadata.
* Format.
* use config for counters and fix warnings.
* add version for rocprofiler_counter_dimension_info_v1_t struct.
* rename rocprofiler_counter_record_dimension_instance_v1_info_t.
* account device id from pmc for counters metadata.
* move dim structs to counters.h.
* address comments to compare value.
* fix tests.
* Address comments. use pointer of arrays for ABI.
* rebase.
* fix build error.
* use separate metadata::init() for rocprofv3.
* also print not found counters.
* precompute all the perf counters needed to be in metadata.
* Misc.
* format
* Format.
* rocprofiler::sdk::container::c_array
* Address comments.
* source/lib/output/metadata.cpp
* lint.
* add unit test for c_array.
* add unit test and serialization support for c_array container.
* Misc.
* Clean files.
* Format.
* clang-tidy.
* add more checks to c_array.
* misc. typo
* Addr comments.
---------
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit: bf0fad1d54]
* Add trace decoder to API.
* Cleanup and activity
* Rename
* Minor fix
* Replace tt/TT with thread_trace/THREAD_TRACE
- public API types are not abbreviated
* Fix aliases
* Build system updates
- activate clang-tidy for all subfolders in lib
- fix addition of sources for att-tool
* Fix clang-tidy issues with lib/att-tool/counters.{hpp,cpp}
* Delete counters.cpp
* Formatting
---------
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit: 65786f619d]
* Doc updates for AFAR V
* doc updates
* Updating All AFAR history
* updating table with --output-format
* updating doc for yaml and json support
* Update source/docs/pc_sampling.md
* Update README.md
* Update source/docs/pc_sampling.md
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update pc_sampling.md
---------
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
[ROCm/rocprofiler-sdk commit: 6c41b8d73a]
* Update lib/rocprofiler-sdk/counters/{tests,parser/tests}/CMakeLists.txt
- use rocprofiler-static-library instead of rocprofiler-object-library
* Update scripts/run-ci.py
- support gcovr and pycobertura
* Update CI workflow for code coverage
- load/save cache for XML code coverage (via gcovr)
- generate and write code coverage comment
- archive code coverage HTML report
- fix name for sanitizer jobs
* Update CI workflow
- tweaks to env for PATH and LD_LIBRARY_PATH
* Add scripts/upload-image-to-github.py
- script for saving images to orphan branches to be used in markdown links
* Update CI workflow
- fix upload artifact conflict
- use upload-image-to-github.py
* Update CI workflow
- install extra packages for wkhtmltopdf/wkhtmltoimage
* Update CI workflow (code coverage)
- install more recent git
- tweak package installs for wkhtmltopdf/wkhtmltoimage
* Update CI workflow (code coverage)
- remove duplicate --cap-add=SYS_PTRACE
* Update CI and upload-image-to-github.py
- print versions
* Update upload-image-to-github.py
- check exit code of some subprocesses
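
Checking a subprocess exit code in Python usually looks like the sketch below. It is a generic illustration, not the code in upload-image-to-github.py; `run_checked` is a hypothetical helper name.

```python
import subprocess
import sys


def run_checked(cmd):
    # Capture output so a failure can be reported with its stderr text.
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd!r} exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


if __name__ == "__main__":
    print(run_checked([sys.executable, "-c", "print('ok')"]).strip())
```

Passing `check=True` to `subprocess.run` gives similar behavior by raising `subprocess.CalledProcessError`; the explicit check above is shown because it makes the error message easy to customize.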
* Update CI workflow
- fix GITHUB_PATH ordering
- fix LD_LIBRARY_PATH
* Update CI workflow
- fix code coverage cache keys (use SHAs)
- copy .codecov to .codecov.ref if a cached .codecov exists
* Update upload-image-to-github.py
- Update git pull/push commands
* Update upload-image-to-github.py
- git fetch before pulling
- git pull before committing
* Update upload-image-to-github.py
- git fetch after committing
- git pull after committing
* Update CI workflow
- list files before cat
* Update upload-image-to-github.py
- output messages
* Update CI workflow and upload-image-to-github.py
- fix output directory path for script to work with CI workflow
* Update CI workflow
- finishing touches/fixes on the code coverage comment generation
* Reproducible filenames
* Update CI workflow
- fix archive of code coverage data
* Fix relative path of reproducible file loc
* Update upload-image-to-github.py
- change update method
* rocprofiler-v2-internal -> rocprofiler-sdk-internal
[ROCm/rocprofiler-sdk commit: c5e45803e9]