.. meta::
  :description: Technical guide for implementing ROCprofiler-SDK process attachment
  :keywords: ROCprofiler-SDK, process attachment, ptrace, dynamic profiling, tool development

.. _process_attachment_implementation:

********************************************************************************
Implementing Process Attachment Tools
********************************************************************************

Overview
========

This document provides the technical details needed to implement a process attachment tool similar to ``rocprofv3 --attach``. Process attachment allows profiling tools to attach dynamically to running GPU applications without restarting them.

The implementation relies on a small set of exported C functions and involves low-level process manipulation with ptrace, environment variable injection, library loading, and coordination with the ROCprofiler-SDK registration system.
|
|
|
|
Exported C Functions for Attachment
|
|
===================================
|
|
|
|
The attachment functionality provides the following exported C functions that tools can use:
|
|
|
|
ROCprofiler-Attach Functions
|
|
-----------------------------
|
|
|
|
These functions are exported from the ``rocprofiler-attach`` binary:
|
|
|
|
.. code-block:: cpp
|
|
|
|
extern "C" {
|
|
// Start attachment to a target process
|
|
void attach(uint32_t pid) ROCPROFILER_EXPORT;
|
|
|
|
// Detach from target process and cleanup
|
|
void detach() ROCPROFILER_EXPORT;
|
|
}
|
|
|
|
**Function Details:**

- ``attach(uint32_t pid)``: Main entry point for starting attachment to a process

  - Takes the target process ID as its parameter
  - Initiates the ptrace-based attachment sequence
  - Spawns a background thread for ptrace operations

- ``detach()``: Entry point for detaching from the target process

  - Cleans up attachment resources and terminates profiling
  - Joins the ptrace thread and releases resources
|
|
|
|
ROCprofiler-Register Functions
|
|
------------------------------
|
|
|
|
These functions are exported from the ``librocprofiler-register.so`` library and are called via ptrace:
|
|
|
|
.. code-block:: cpp
|
|
|
|
extern "C" {
|
|
// Activate profiling in target process (called via ptrace)
|
|
rocprofiler_register_error_code_t
|
|
rocprofiler_register_attach(const char* environment_buffer, const char* tool_lib_path)
|
|
ROCPROFILER_REGISTER_PUBLIC_API;
|
|
|
|
// Deactivate profiling in target process (called via ptrace)
|
|
rocprofiler_register_error_code_t
|
|
rocprofiler_register_detach()
|
|
ROCPROFILER_REGISTER_PUBLIC_API;
|
|
|
|
// Reattach to previously attached process (experimental)
|
|
rocprofiler_register_error_code_t
|
|
rocprofiler_register_invoke_reattach()
|
|
ROCPROFILER_REGISTER_PUBLIC_API;
|
|
|
|
// Client callback functions for reattachment support
|
|
void rocprofiler_call_client_reattach(void)
|
|
ROCPROFILER_REGISTER_PUBLIC_API;
|
|
void rocprofiler_call_client_detach(void)
|
|
ROCPROFILER_REGISTER_PUBLIC_API;
|
|
}
|
|
|
|
**Function Details:**

- ``rocprofiler_register_attach(const char* environment_buffer, const char* tool_lib_path)``:

  - Called via ptrace from the attachment system
  - Receives serialized environment variables for profiling configuration
  - Receives the tool library path to load (defaults to ``librocprofiler-sdk-tool.so`` if ``NULL``)
  - Loads the specified tool library and activates profiling services
  - Returns a ``rocprofiler_register_error_code_t`` status

- ``rocprofiler_register_detach()``:

  - Called via ptrace to stop profiling in the target process
  - Calls the tool's detach function and cleans up resources
  - Returns a ``rocprofiler_register_error_code_t`` status

- ``rocprofiler_register_invoke_reattach()`` (experimental):

  - Called to reattach profiling to a previously attached process
  - Invokes client reattach callbacks without full re-initialization
  - Used for resuming profiling after temporary detachment
  - Returns a ``rocprofiler_register_error_code_t`` status

- ``rocprofiler_call_client_reattach()`` and ``rocprofiler_call_client_detach()``:

  - C wrapper functions for client tool reattachment callbacks
  - Automatically resolved and called by the registration system
  - Enable tools to handle dynamic attach/detach cycles
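
The ``NULL`` fallback for ``tool_lib_path`` described in the list above can be pictured with a short sketch. The helper name ``resolve_tool_library`` is hypothetical and is not part of rocprofiler-register; only the default library name is taken from this guide, and treating an empty string like ``NULL`` is an assumption of the sketch.

.. code-block:: cpp

    #include <string>

    // Hypothetical helper (not a rocprofiler-register API): mirrors the documented
    // rule that a NULL tool_lib_path selects the default tool library.
    inline std::string resolve_tool_library(const char* tool_lib_path)
    {
        constexpr const char* default_tool_lib = "librocprofiler-sdk-tool.so";
        // Assumption of this sketch: an empty string is treated like NULL
        if(tool_lib_path == nullptr || *tool_lib_path == '\0') return default_tool_lib;
        return tool_lib_path;
    }
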
|
|
|
|
Function Call Sequence
|
|
======================
|
|
|
|
Initial Attachment Sequence
|
|
---------------------------
|
|
|
|
The initial attachment process follows this sequence:
|
|
|
|
.. code-block:: text
|
|
|
|
Tool Implementation
|
|
|
|
|
v
|
|
attach(pid) ← Your tool calls this
|
|
|
|
|
v
|
|
Ptrace attachment & environment setup
|
|
|
|
|
v
|
|
rocprofiler_register_attach(env_buffer, tool_lib_path) ← Called via ptrace in target
|
|
|
|
|
v
|
|
Profiling active in target process
|
|
|
|
|
v
|
|
[Profiling data collection...]
|
|
|
|
|
v
|
|
rocprofiler_register_detach() ← Called via ptrace in target
|
|
|
|
|
v
|
|
detach() ← Your tool calls this
|
|
|
|
|
v
|
|
Cleanup complete
|
|
|
|
Reattachment Sequence (Experimental)
|
|
------------------------------------
|
|
|
|
For reattachment to a previously attached process:
|
|
|
|
.. code-block:: text
|
|
|
|
Tool Implementation
|
|
|
|
|
v
|
|
attach(pid) ← Your tool calls this again
|
|
|
|
|
v
|
|
Ptrace attachment & environment setup
|
|
|
|
|
v
|
|
rocprofiler_register_attach(env_buffer, tool_lib_path) ← Detects previous attachment
|
|
|
|
|
v
|
|
rocprofiler_register_invoke_reattach() ← Calls client reattach callbacks
|
|
|
|
|
v
|
|
Profiling resumed in target process
|
|
|
|
|
v
|
|
[Continued profiling data collection...]
|
|
|
|
|
v
|
|
rocprofiler_register_detach() ← Called via ptrace in target
|
|
|
|
|
v
|
|
detach() ← Your tool calls this
|
|
|
|
|
v
|
|
Cleanup complete
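
The difference between the two sequences is a single decision: the first attach request takes the full initialization path, and every later request takes the reattach path. The sketch below shows one way such a decision could be tracked with an atomic flag. The class ``AttachState`` and its member names are hypothetical and only illustrate the idea; they are not symbols from rocprofiler-register.

.. code-block:: cpp

    #include <atomic>
    #include <string>

    // Hypothetical stand-in for the registration state; names are illustrative only.
    class AttachState
    {
    public:
        // Returns which path an attach request would take.
        std::string on_attach()
        {
            // exchange() stores true and returns the previous value in one atomic
            // step, so only the very first caller observes "false"
            const bool was_attached = m_prev_attached.exchange(true);
            return was_attached ? "reattach" : "initial";
        }

    private:
        std::atomic<bool> m_prev_attached{false};
    };

Using an atomic exchange rather than a separate load and store keeps the decision race-free if two attach requests ever arrive concurrently.
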
|
|
|
|
Using the Attachment Functions
|
|
==============================
|
|
|
|
Here's how to use these functions in your own attachment tool:
|
|
|
|
Basic Attachment Tool Implementation
------------------------------------
|
|
|
|
.. code-block:: cpp
|
|
|
|
#include <dlfcn.h>
|
|
#include <iostream>
|
|
#include <thread>
|
|
#include <chrono>
#include <cstdint>      // uint32_t
#include <signal.h>     // kill()
#include <sys/types.h>  // pid_t
|
|
|
|
class ROCprofilerAttachmentTool {
|
|
private:
|
|
void* attach_lib_handle = nullptr;
|
|
void (*attach_func)(uint32_t) = nullptr;
|
|
void (*detach_func)() = nullptr;
|
|
|
|
public:
|
|
bool initialize() {
|
|
// Load the rocprofiler-attach library/binary
|
|
attach_lib_handle = dlopen("librocprofiler-attach.so", RTLD_NOW);
|
|
if (!attach_lib_handle) {
|
|
std::cerr << "Failed to load rocprofiler-attach: " << dlerror() << std::endl;
|
|
return false;
|
|
}
|
|
|
|
// Get the attachment function pointers
|
|
attach_func = (void(*)(uint32_t))dlsym(attach_lib_handle, "attach");
|
|
detach_func = (void(*)())dlsym(attach_lib_handle, "detach");
|
|
|
|
if (!attach_func || !detach_func) {
|
|
std::cerr << "Failed to find attachment functions" << std::endl;
|
|
return false;
|
|
}
|
|
|
|
return true;
|
|
}
|
|
|
|
bool attach_to_process(pid_t pid, uint32_t duration_ms = 0) {
|
|
// Validate the target process
|
|
if (kill(pid, 0) != 0) {
|
|
std::cerr << "Target process " << pid << " is not accessible" << std::endl;
|
|
return false;
|
|
}
|
|
|
|
std::cout << "Attaching to process " << pid << std::endl;
|
|
|
|
// Start attachment - this will handle all ptrace operations
|
|
attach_func(pid);
|
|
|
|
if (duration_ms > 0) {
|
|
// Profile for specified duration
|
|
std::cout << "Profiling for " << duration_ms << " milliseconds..." << std::endl;
|
|
std::this_thread::sleep_for(std::chrono::milliseconds(duration_ms));
|
|
|
|
// Stop profiling
|
|
detach_func();
|
|
} else {
|
|
std::cout << "Profiling until process ends or manual detach..." << std::endl;
|
|
// Monitor process or wait for external signal to detach
|
|
while (kill(pid, 0) == 0) {
|
|
std::this_thread::sleep_for(std::chrono::seconds(1));
|
|
}
|
|
detach_func();
|
|
}
|
|
|
|
std::cout << "Profiling completed" << std::endl;
|
|
return true;
|
|
}
|
|
|
|
~ROCprofilerAttachmentTool() {
|
|
if (attach_lib_handle) {
|
|
dlclose(attach_lib_handle);
|
|
}
|
|
}
|
|
};
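
The class above calls ``detach_func()`` explicitly on each path. A small scope guard is one way to make sure detachment also happens when an exception or an early return leaves the function. The sketch below is a generic guard under that assumption; ``ScopedDetach`` is a hypothetical name and is not provided by the SDK.

.. code-block:: cpp

    #include <functional>
    #include <utility>

    // Hypothetical helper: runs a cleanup callable (for example the resolved
    // detach function pointer) exactly once when the guard goes out of scope.
    class ScopedDetach
    {
    public:
        explicit ScopedDetach(std::function<void()> detach_fn)
        : m_detach{std::move(detach_fn)}
        {}

        // Non-copyable so the cleanup cannot run twice
        ScopedDetach(const ScopedDetach&) = delete;
        ScopedDetach& operator=(const ScopedDetach&) = delete;

        ~ScopedDetach()
        {
            if(m_detach) m_detach();
        }

    private:
        std::function<void()> m_detach;
    };

Inside ``attach_to_process``, constructing ``ScopedDetach guard{detach_func};`` right after ``attach_func(pid)`` would take the place of the two explicit ``detach_func()`` calls.
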
|
|
|
|
Complete Tool Example
---------------------
|
|
|
|
.. code-block:: cpp
|
|
|
|
#include <iostream>
|
|
#include <vector>
|
|
#include <string>
|
|
#include <cstdlib>
#include <cstdint>      // uint32_t
#include <sys/types.h>  // pid_t
|
|
|
|
int main(int argc, char* argv[]) {
|
|
if (argc < 2) {
|
|
std::cerr << "Usage: " << argv[0] << " <PID> [duration_ms]" << std::endl;
|
|
std::cerr << " PID: Process ID to attach to" << std::endl;
|
|
std::cerr << " duration_ms: Optional profiling duration in milliseconds" << std::endl;
|
|
return 1;
|
|
}
|
|
|
|
pid_t target_pid = std::stoi(argv[1]);
|
|
uint32_t duration = (argc > 2) ? std::stoi(argv[2]) : 0;
|
|
|
|
// Set up profiling environment variables before attachment
|
|
setenv("ROCP_TOOL_ATTACH", "1", 1);
|
|
|
|
// Note: the tool library is chosen by the tool_lib_path argument of
// rocprofiler_register_attach ("librocprofiler-sdk-tool.so" when NULL),
// not by an environment variable
|
|
|
|
setenv("ROCPROF_HIP_API_TRACE", "1", 1);
|
|
setenv("ROCPROF_KERNEL_TRACE", "1", 1);
|
|
setenv("ROCPROF_MEMORY_COPY_TRACE", "1", 1);
|
|
setenv("ROCPROF_OUTPUT_PATH", "./attachment-output", 1);
|
|
setenv("ROCPROF_OUTPUT_FILE_NAME", "attached_profile", 1);
|
|
|
|
// Initialize and run attachment tool
|
|
ROCprofilerAttachmentTool tool;
|
|
if (!tool.initialize()) {
|
|
std::cerr << "Failed to initialize attachment tool" << std::endl;
|
|
return 1;
|
|
}
|
|
|
|
if (!tool.attach_to_process(target_pid, duration)) {
|
|
std::cerr << "Attachment failed" << std::endl;
|
|
return 1;
|
|
}
|
|
|
|
std::cout << "Attachment completed successfully" << std::endl;
|
|
return 0;
|
|
}
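
The example parses its arguments with ``std::stoi``, which throws on non-numeric input and accepts trailing garbage such as ``12abc``. A stricter parser might look like the sketch below. ``parse_pid`` is a hypothetical helper written for this guide, and it assumes C++17 for ``std::from_chars``.

.. code-block:: cpp

    #include <charconv>
    #include <limits>
    #include <optional>
    #include <string_view>
    #include <system_error>
    #include <sys/types.h>

    // Hypothetical helper: parses a strictly positive decimal PID.
    // Returns std::nullopt for empty, non-numeric, out-of-range or non-positive input.
    inline std::optional<pid_t> parse_pid(std::string_view text)
    {
        long        value = 0;
        const char* first = text.data();
        const char* last  = text.data() + text.size();

        auto [ptr, ec] = std::from_chars(first, last, value);
        if(ec != std::errc{} || ptr != last) return std::nullopt;  // not fully numeric
        if(value <= 0) return std::nullopt;                         // PIDs are positive
        if(value > std::numeric_limits<pid_t>::max()) return std::nullopt;
        return static_cast<pid_t>(value);
    }

In ``main``, ``parse_pid(argv[1])`` could then replace the ``std::stoi`` call, with the usage message printed when it returns ``std::nullopt``.
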
|
|
|
|
Experimental Reattachment API
|
|
=============================
|
|
|
|
ROCprofiler-SDK now provides experimental support for reattachment, allowing tools to handle dynamic attach/detach cycles more efficiently.
|
|
|
|
Tool Configuration for Reattachment
|
|
-----------------------------------
|
|
|
|
Tools that support reattachment should implement the experimental configuration structure:
|
|
|
|
.. code-block:: cpp
|
|
|
|
#include <rocprofiler-sdk/registration.h>
|
|
|
|
// Experimental reattachment callbacks
|
|
void tool_reattach(void* tool_data) {
|
|
// Reinitialize contexts and resume profiling
|
|
// This is called when reattaching to a previously profiled process
|
|
}
|
|
|
|
void tool_detach(void* tool_data) {
|
|
// Suspend profiling operations temporarily
|
|
// This is called during detachment, but contexts may be preserved
|
|
}
|
|
|
|
extern "C" rocprofiler_tool_configure_result_experimental_t*
|
|
rocprofiler_configure_experimental(uint32_t version,
|
|
const char* runtime_version,
|
|
uint32_t prio,
|
|
rocprofiler_client_id_t* client_id)
|
|
{
|
|
static auto cfg = rocprofiler_tool_configure_result_experimental_t {
|
|
.size = sizeof(rocprofiler_tool_configure_result_experimental_t),
|
|
.initialize = &tool_init,
|
|
.finalize = &tool_fini,
|
|
.tool_data = nullptr,
|
|
.tool_reattach = &tool_reattach, // Experimental reattachment support
|
|
.tool_detach = &tool_detach // Experimental detachment support
|
|
};
|
|
|
|
return &cfg;
|
|
}
|
|
|
|
Client Callback Functions
|
|
-------------------------
|
|
|
|
The registration system automatically provides C wrapper functions:
|
|
|
|
.. code-block:: cpp
|
|
|
|
// These are automatically generated and called by rocprofiler-register
|
|
extern "C" void rocprofiler_call_client_reattach(void) {
|
|
// Calls the tool's reattach callback with stored tool_data
|
|
}
|
|
|
|
extern "C" void rocprofiler_call_client_detach(void) {
|
|
// Calls the tool's detach callback with stored tool_data
|
|
}
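
To make the comment bodies above more concrete, the sketch below shows one way a C wrapper can forward to a stored callback together with its ``tool_data``. Every name in it (``store_client_callbacks``, ``example_call_client_reattach``, ``example_call_client_detach``) is hypothetical; it illustrates the forwarding pattern only and does not reproduce the SDK's implementation.

.. code-block:: cpp

    // Hypothetical sketch: all names below are illustrative, not SDK symbols.
    using client_callback_t = void (*)(void* tool_data);

    namespace
    {
    struct client_callbacks
    {
        client_callback_t reattach  = nullptr;
        client_callback_t detach    = nullptr;
        void*             tool_data = nullptr;
    };

    // Single process-wide slot holding the callbacks of the loaded tool
    client_callbacks& stored_callbacks()
    {
        static client_callbacks instance{};
        return instance;
    }
    }  // namespace

    // Records the callbacks and the opaque pointer passed back to them
    void store_client_callbacks(client_callback_t reattach,
                                client_callback_t detach,
                                void*             tool_data)
    {
        stored_callbacks() = client_callbacks{reattach, detach, tool_data};
    }

    // C-linkage wrappers: forward to the stored callback if one was registered
    extern "C" void example_call_client_reattach(void)
    {
        auto& cb = stored_callbacks();
        if(cb.reattach) cb.reattach(cb.tool_data);
    }

    extern "C" void example_call_client_detach(void)
    {
        auto& cb = stored_callbacks();
        if(cb.detach) cb.detach(cb.tool_data);
    }

The null checks mean a tool that never registered callbacks is simply skipped rather than causing a crash.
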
|
|
|
|
Reattachment Environment Variables
----------------------------------
|
|
|
|
When using reattachment, set this additional environment variable:
|
|
|
|
.. code-block:: cpp
|
|
|
|
// Indicates that the tool was loaded via attachment (not LD_PRELOAD)
|
|
setenv("ROCPROFILER_REGISTER_TOOL_ATTACHED", "1", 1);
|
|
|
|
This lets the registration system tell a tool loaded through attachment apart from one loaded through ``LD_PRELOAD``, so that attachment-specific logic can be applied.
|
|
|
|
Environment Variable Configuration
==================================
|
|
|
|
Before calling the attachment functions, set up environment variables that will be injected into the target process:
|
|
|
|
Required Variables
------------------
|
|
|
|
.. code-block:: cpp
|
|
|
|
// Essential for attachment functionality
|
|
setenv("ROCP_TOOL_ATTACH", "1", 1);
|
|
|
|
Tool Library Configuration
|
|
--------------------------
|
|
|
|
When no tool library path is passed to ``rocprofiler_register_attach``, the attachment system falls back to a default tool library:

.. code-block:: cpp

    // With a NULL tool_lib_path, "librocprofiler-sdk-tool.so" is loaded
    // The tool library is not selected through an environment variable
|
|
|
|
Tracing Options
---------------
|
|
|
|
.. code-block:: cpp
|
|
|
|
// Enable different types of tracing
|
|
setenv("ROCPROF_HIP_API_TRACE", "1", 1); // HIP API calls
|
|
setenv("ROCPROF_HSA_API_TRACE", "1", 1); // HSA API calls
|
|
setenv("ROCPROF_KERNEL_TRACE", "1", 1); // Kernel dispatches
|
|
setenv("ROCPROF_MEMORY_COPY_TRACE", "1", 1); // Memory operations
|
|
setenv("ROCPROF_MEMORY_ALLOCATION_TRACE", "1", 1); // Memory allocations
|
|
setenv("ROCPROF_SCRATCH_MEMORY_TRACE", "1", 1); // Scratch memory
|
|
setenv("ROCPROF_MARKER_TRACE", "1", 1); // ROCTx markers
|
|
|
|
Output Configuration
--------------------
|
|
|
|
.. code-block:: cpp
|
|
|
|
// Control output location and format
|
|
setenv("ROCPROF_OUTPUT_PATH", "/path/to/output", 1);
|
|
setenv("ROCPROF_OUTPUT_FILE_NAME", "profile_name", 1);
|
|
setenv("ROCPROF_OUTPUT_FORMAT", "csv", 1); // or "json", "pftrace", etc.
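
Rather than scattering ``setenv`` calls, a tool can collect the tracing and output settings in one table and apply them in a single step, which also makes it easy to log or validate the configuration. The sketch below assumes a POSIX ``setenv``; ``apply_environment`` is a hypothetical helper, and the variable names in the usage note are the ones listed in this section.

.. code-block:: cpp

    #include <cstddef>
    #include <cstdlib>
    #include <map>
    #include <string>

    // Hypothetical helper: applies NAME -> value pairs with setenv() and returns
    // how many variables were set successfully.
    inline std::size_t apply_environment(const std::map<std::string, std::string>& vars)
    {
        std::size_t applied = 0;
        for(const auto& [name, value] : vars)
        {
            // overwrite = 1 replaces any value that is already present
            if(setenv(name.c_str(), value.c_str(), 1) == 0) ++applied;
        }
        return applied;
    }

For example, ``apply_environment({{"ROCPROF_KERNEL_TRACE", "1"}, {"ROCPROF_OUTPUT_FORMAT", "json"}})`` sets both variables and returns ``2``. Comparing the return value with ``vars.size()`` shows whether any name was rejected.
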
|
|
|
|
Build Configuration
===================
|
|
|
|
To build a tool using the attachment functions:
|
|
|
|
CMakeLists.txt
--------------
|
|
|
|
.. code-block:: cmake
|
|
|
|
cmake_minimum_required(VERSION 3.16)
|
|
project(my_rocprofiler_attach_tool)
|
|
|
|
set(CMAKE_CXX_STANDARD 17)
|
|
|
|
# Find ROCprofiler SDK (for headers and linking)
|
|
find_package(rocprofiler-sdk REQUIRED)
|
|
|
|
add_executable(my_attach_tool
|
|
main.cpp
|
|
attachment_tool.cpp
|
|
)
|
|
|
|
# Link with required libraries
|
|
target_link_libraries(my_attach_tool
|
|
rocprofiler-sdk::rocprofiler-sdk
|
|
dl # for dlopen/dlsym operations
|
|
)
|
|
|
|
# Set capabilities for ptrace operations
|
|
add_custom_command(TARGET my_attach_tool POST_BUILD
|
|
COMMAND sudo setcap cap_sys_ptrace+ep $<TARGET_FILE:my_attach_tool>
|
|
COMMENT "Setting ptrace capability"
|
|
)
|
|
|
|
Error Handling
==============
|
|
|
|
When using the attachment functions, handle these common error conditions:
|
|
|
|
.. code-block:: cpp
|
|
|
|
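#include <cstdlib>    // getenv
#include <fstream>    // std::ifstream
#include <iostream>
#include <string>
#include <signal.h>   // kill
#include <unistd.h>   // geteuid

class AttachmentErrorHandler {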
|
|
public:
|
|
static bool validate_target_process(pid_t pid) {
|
|
// Check if process exists
|
|
if (kill(pid, 0) != 0) {
|
|
std::cerr << "Process " << pid << " not found or not accessible" << std::endl;
|
|
return false;
|
|
}
|
|
|
|
// Check if it's a GPU application
|
|
std::string maps_path = "/proc/" + std::to_string(pid) + "/maps";
|
|
std::ifstream maps(maps_path);
|
|
std::string line;
|
|
|
|
bool has_gpu_libs = false;
|
|
while (std::getline(maps, line)) {
|
|
if (line.find("libamdhip64.so") != std::string::npos ||
|
|
line.find("libhsa-runtime64.so") != std::string::npos) {
|
|
has_gpu_libs = true;
|
|
break;
|
|
}
|
|
}
|
|
|
|
if (!has_gpu_libs) {
|
|
std::cerr << "Process " << pid << " does not appear to use GPU APIs" << std::endl;
|
|
return false;
|
|
}
|
|
|
|
return true;
|
|
}
|
|
|
|
static void handle_attachment_errors() {
|
|
// Check for common permission issues
|
|
if (geteuid() != 0) {
|
|
std::cerr << "Warning: Not running as root. Ensure CAP_SYS_PTRACE capability is set." << std::endl;
|
|
}
|
|
|
|
// Check if rocprofiler libraries are available
|
|
if (getenv("LD_LIBRARY_PATH") == nullptr ||
|
|
std::string(getenv("LD_LIBRARY_PATH")).find("/opt/rocm/lib") == std::string::npos) {
|
|
std::cerr << "Warning: /opt/rocm/lib may not be in LD_LIBRARY_PATH" << std::endl;
|
|
}
|
|
}
|
|
};
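
The ``kill(pid, 0)`` check above reports "not found" and "no permission" with the same message, although ``errno`` distinguishes them. The sketch below separates the two cases. ``ProcessProbe`` and ``probe_process`` are hypothetical names, and the caller is assumed to pass a positive PID, since zero and negative values address process groups.

.. code-block:: cpp

    #include <cerrno>
    #include <signal.h>
    #include <sys/types.h>

    enum class ProcessProbe
    {
        Accessible,    // process exists and may be signalled
        NoPermission,  // process exists but belongs to another user
        NotFound,      // no such process
        OtherError
    };

    // Hypothetical helper: classifies the result of kill(pid, 0).
    inline ProcessProbe probe_process(pid_t pid)
    {
        // Signal 0 performs the existence and permission checks without
        // delivering a signal
        if(kill(pid, 0) == 0) return ProcessProbe::Accessible;
        switch(errno)
        {
            case EPERM: return ProcessProbe::NoPermission;
            case ESRCH: return ProcessProbe::NotFound;
            default: return ProcessProbe::OtherError;
        }
    }

A result of ``Accessible`` only means that signalling is allowed. Permission to ptrace the process is checked separately by the kernel, so a later ``PTRACE_ATTACH`` can still fail.
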
|
|
|
|
Architecture Overview
|
|
=====================
|
|
|
|
Process attachment consists of several cooperating components:
|
|
|
|
.. code-block:: text
|
|
|
|
Attachment Tool (your implementation)
|
|
|
|
|
v
|
|
1. Process Discovery & Validation
|
|
|
|
|
v
|
|
2. Ptrace Attachment & Control
|
|
|
|
|
v
|
|
3. Environment Variable Injection
|
|
|
|
|
v
|
|
4. Library Loading (rocprofiler-register)
|
|
|
|
|
v
|
|
5. Profiling Service Activation
|
|
|
|
|
v
|
|
6. Data Collection & Management
|
|
|
|
|
v
|
|
7. Detachment & Cleanup
|
|
|
|
Theoretical Implementation Details
==================================
|
|
|
|
Core Implementation Components
==============================
|
|
|
|
1. Process Discovery and Validation
|
|
-----------------------------------
|
|
|
|
**Target Process Requirements:**
|
|
|
|
.. code-block:: cpp
|
|
|
|
#include <sys/types.h>
|
|
#include <signal.h>
|
|
#include <unistd.h>
#include <fstream>  // std::ifstream
#include <string>   // std::string, std::getline
|
|
|
|
bool validate_target_process(pid_t pid) {
|
|
// Check if process exists and is accessible
|
|
if (kill(pid, 0) != 0) {
|
|
return false; // Process doesn't exist or no permission
|
|
}
|
|
|
|
// Verify it's a GPU application by checking loaded libraries
|
|
std::string maps_path = "/proc/" + std::to_string(pid) + "/maps";
|
|
std::ifstream maps(maps_path);
|
|
std::string line;
|
|
|
|
bool has_hip = false, has_hsa = false;
|
|
while (std::getline(maps, line)) {
|
|
if (line.find("libamdhip64.so") != std::string::npos) has_hip = true;
|
|
if (line.find("libhsa-runtime64.so") != std::string::npos) has_hsa = true;
|
|
}
|
|
|
|
return has_hip || has_hsa; // Must use HIP or HSA
|
|
}
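
The library scan is easier to unit test when it is separated from the file access. The sketch below takes any ``std::istream`` holding text in ``/proc/<pid>/maps`` format, so it can be exercised with an in-memory string. ``maps_mention_gpu_runtime`` is a hypothetical helper; the two library names are the ones used in the function above.

.. code-block:: cpp

    #include <istream>
    #include <string>

    // Hypothetical helper: scans text in /proc/<pid>/maps format for the HIP or
    // HSA runtime libraries named in this guide.
    inline bool maps_mention_gpu_runtime(std::istream& maps)
    {
        std::string line;
        while(std::getline(maps, line))
        {
            if(line.find("libamdhip64.so") != std::string::npos ||
               line.find("libhsa-runtime64.so") != std::string::npos)
            {
                return true;  // stop at the first match
            }
        }
        return false;
    }

``validate_target_process`` could then open the maps file with ``std::ifstream`` and pass the stream to this function.
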
|
|
|
|
2. Ptrace-Based Process Control
-------------------------------
|
|
|
|
**Core Ptrace Operations:**
|
|
|
|
.. code-block:: cpp
|
|
|
|
#include <sys/ptrace.h>
|
|
#include <sys/wait.h>
|
|
#include <sys/user.h>
#include <cstdio>  // perror, fprintf
|
|
|
|
class ProcessAttachment {
|
|
private:
|
|
pid_t target_pid;
|
|
bool attached = false;
|
|
|
|
public:
|
|
bool attach(pid_t pid) {
|
|
target_pid = pid;
|
|
|
|
// Attach to the target process
|
|
if (ptrace(PTRACE_ATTACH, target_pid, nullptr, nullptr) == -1) {
|
|
perror("ptrace PTRACE_ATTACH failed");
|
|
return false;
|
|
}
|
|
|
|
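// Mark as attached now so that the detach() calls on the error paths
// below actually issue PTRACE_DETACH instead of returning early
attached = true;

// Wait for the process to stop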
|
|
int status;
|
|
if (waitpid(target_pid, &status, 0) == -1) {
|
|
perror("waitpid failed");
|
|
detach();
|
|
return false;
|
|
}
|
|
|
|
if (!WIFSTOPPED(status)) {
|
|
fprintf(stderr, "Process did not stop after attach\n");
|
|
detach();
|
|
return false;
|
|
}
|
|
|
|
attached = true;
|
|
return true;
|
|
}
|
|
|
|
bool detach() {
|
|
if (!attached) return true;
|
|
|
|
// Detach and allow process to continue
|
|
if (ptrace(PTRACE_DETACH, target_pid, nullptr, nullptr) == -1) {
|
|
perror("ptrace PTRACE_DETACH failed");
|
|
return false;
|
|
}
|
|
|
|
attached = false;
|
|
return true;
|
|
}
|
|
};
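
When ``waitpid`` returns after ``PTRACE_ATTACH``, the status word may report a stop, but it may also show that the target exited or was killed in the meantime. Decoding it into a readable message helps when diagnosing a failed attach. ``describe_wait_status`` is a hypothetical helper that uses only the standard ``<sys/wait.h>`` macros.

.. code-block:: cpp

    #include <string>
    #include <sys/wait.h>

    // Hypothetical helper: turns a waitpid() status word into a short description.
    inline std::string describe_wait_status(int status)
    {
        if(WIFSTOPPED(status))
            return "stopped by signal " + std::to_string(WSTOPSIG(status));
        if(WIFEXITED(status))
            return "exited with code " + std::to_string(WEXITSTATUS(status));
        if(WIFSIGNALED(status))
            return "killed by signal " + std::to_string(WTERMSIG(status));
        return "unknown status";
    }

In the ``attach`` method above, the ``"Process did not stop after attach"`` message could include this description to show what happened instead.
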
|
|
|
|
3. Environment Variable Injection
|
|
---------------------------------
|
|
|
|
**Environment Variable Management:**
|
|
|
|
.. code-block:: cpp
|
|
|
|
#include <fstream>
|
|
#include <vector>
#include <string>
#include <cstdint>  // uint8_t, uint32_t
|
|
|
|
class EnvironmentInjector {
|
|
public:
|
|
struct EnvironmentVar {
|
|
std::string name;
|
|
std::string value;
|
|
};
|
|
|
|
// Prepare environment variables for profiling
|
|
std::vector<EnvironmentVar> prepare_profiling_env(
|
|
const std::vector<std::string>& trace_options,
|
|
const std::string& output_path,
|
|
const std::string& output_file) {
|
|
|
|
std::vector<EnvironmentVar> env_vars;
|
|
|
|
// Essential attachment variable
|
|
env_vars.push_back({"ROCP_TOOL_ATTACH", "1"});
|
|
|
|
// Configure tracing based on options
|
|
for (const auto& option : trace_options) {
|
|
if (option == "hip-trace") {
|
|
env_vars.push_back({"ROCPROF_HIP_API_TRACE", "1"});
|
|
}
|
|
if (option == "kernel-trace") {
|
|
env_vars.push_back({"ROCPROF_KERNEL_TRACE", "1"});
|
|
}
|
|
if (option == "hsa-trace") {
|
|
env_vars.push_back({"ROCPROF_HSA_API_TRACE", "1"});
|
|
}
|
|
if (option == "memory-copy-trace") {
|
|
env_vars.push_back({"ROCPROF_MEMORY_COPY_TRACE", "1"});
|
|
}
|
|
}
|
|
|
|
// Output configuration
|
|
env_vars.push_back({"ROCPROF_OUTPUT_PATH", output_path});
|
|
env_vars.push_back({"ROCPROF_OUTPUT_FILE_NAME", output_file});
|
|
|
|
return env_vars;
|
|
}
|
|
|
|
// Serialize environment for injection
|
|
std::vector<uint8_t> serialize_environment(const std::vector<EnvironmentVar>& vars) {
|
|
std::vector<uint8_t> buffer(4); // Start with count
|
|
uint32_t count = vars.size();
|
|
|
|
// Store count in first 4 bytes
|
|
buffer[0] = count & 0xFF;
|
|
buffer[1] = (count >> 8) & 0xFF;
|
|
buffer[2] = (count >> 16) & 0xFF;
|
|
buffer[3] = (count >> 24) & 0xFF;
|
|
|
|
// Add each variable as null-terminated name and value
|
|
for (const auto& var : vars) {
|
|
// Add variable name
|
|
for (char c : var.name) {
|
|
buffer.push_back(c);
|
|
}
|
|
buffer.push_back(0); // Null terminate name
|
|
|
|
// Add variable value
|
|
for (char c : var.value) {
|
|
buffer.push_back(c);
|
|
}
|
|
buffer.push_back(0); // Null terminate value
|
|
}
|
|
|
|
return buffer;
|
|
}
|
|
};
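
The buffer layout produced by ``serialize_environment`` (a 4-byte little-endian count followed by NUL-terminated name and value strings) is easiest to check by decoding it again. The sketch below is a hypothetical inverse written only for this guide; the name ``deserialize_environment`` is an assumption, and the decoder actually used on the receiving side may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical inverse of serialize_environment(): decode the buffer back
// into (name, value) pairs. Returns an empty vector if the buffer is malformed.
std::vector<std::pair<std::string, std::string>>
deserialize_environment(const std::vector<uint8_t>& buffer) {
    std::vector<std::pair<std::string, std::string>> vars;
    if (buffer.size() < 4) return vars;

    // Little-endian 32-bit count in the first four bytes
    uint32_t count = static_cast<uint32_t>(buffer[0]) |
                     (static_cast<uint32_t>(buffer[1]) << 8) |
                     (static_cast<uint32_t>(buffer[2]) << 16) |
                     (static_cast<uint32_t>(buffer[3]) << 24);

    std::size_t pos = 4;
    // Read one NUL-terminated string starting at pos; false if no terminator
    auto read_string = [&](std::string& out) {
        out.clear();
        while (pos < buffer.size() && buffer[pos] != 0) {
            out.push_back(static_cast<char>(buffer[pos++]));
        }
        if (pos >= buffer.size()) return false;  // ran off the end
        ++pos;                                   // skip the terminator
        return true;
    };

    for (uint32_t i = 0; i < count; ++i) {
        std::string name, value;
        if (!read_string(name) || !read_string(value)) return {};
        vars.emplace_back(std::move(name), std::move(value));
    }
    return vars;
}
```

A round trip through both functions is a quick way to validate the environment buffer format before injecting it.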

4. Memory Manipulation and Library Loading
------------------------------------------

**Remote Memory Operations:**

.. code-block:: cpp

    #include <sys/mman.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    #include <sys/wait.h>

    #include <algorithm>
    #include <cerrno>
    #include <cstdint>
    #include <cstring>

    class RemoteMemoryManager {
    private:
        pid_t target_pid;

    public:
        RemoteMemoryManager(pid_t pid) : target_pid(pid) {}

        // Allocate memory in remote process (x86-64 register layout)
        void* remote_mmap(size_t length, int prot, int flags) {
            // Read the current registers. This sketch assumes regs.rip already
            // points at a `syscall` instruction in the target; a complete tool
            // must locate or write one before this step.
            struct user_regs_struct regs;
            if (ptrace(PTRACE_GETREGS, target_pid, nullptr, &regs) == -1) {
                return nullptr;
            }

            // Save original registers
            struct user_regs_struct orig_regs = regs;

            // Set up mmap syscall
            regs.rax = 9;       // __NR_mmap
            regs.rdi = 0;       // addr (let kernel choose)
            regs.rsi = length;
            regs.rdx = prot;
            regs.r10 = flags;
            regs.r8 = -1;       // fd
            regs.r9 = 0;        // offset

            if (ptrace(PTRACE_SETREGS, target_pid, nullptr, &regs) == -1) {
                return nullptr;
            }

            // Execute the syscall instruction. PTRACE_SINGLESTEP stops after
            // the syscall has returned; PTRACE_SYSCALL would stop first at
            // syscall entry, before the result is available.
            if (ptrace(PTRACE_SINGLESTEP, target_pid, nullptr, nullptr) == -1) {
                ptrace(PTRACE_SETREGS, target_pid, nullptr, &orig_regs);
                return nullptr;
            }

            // Wait for the single-step stop
            int status;
            if (waitpid(target_pid, &status, 0) == -1 || !WIFSTOPPED(status)) {
                return nullptr;
            }

            // Get result
            if (ptrace(PTRACE_GETREGS, target_pid, nullptr, &regs) == -1) {
                ptrace(PTRACE_SETREGS, target_pid, nullptr, &orig_regs);
                return nullptr;
            }

            void* result = (void*)regs.rax;

            // Restore original registers
            ptrace(PTRACE_SETREGS, target_pid, nullptr, &orig_regs);

            // A raw syscall returns -errno (a value in [-4095, -1]) on failure
            return ((uintptr_t)result >= (uintptr_t)-4095) ? nullptr : result;
        }

        // Write data to remote process memory
        bool write_memory(void* addr, const void* data, size_t size) {
            const uint8_t* bytes = static_cast<const uint8_t*>(data);
            size_t written = 0;

            while (written < size) {
                long word = 0;
                size_t to_copy = std::min(sizeof(long), size - written);

                // For partial words, read existing content first
                if (to_copy < sizeof(long)) {
                    errno = 0;
                    word = ptrace(PTRACE_PEEKDATA, target_pid,
                                  (uint8_t*)addr + written, nullptr);
                    if (errno != 0) return false;
                }

                // Copy new data into word
                memcpy(&word, bytes + written, to_copy);

                // Write word to remote process
                if (ptrace(PTRACE_POKEDATA, target_pid,
                           (uint8_t*)addr + written, word) == -1) {
                    return false;
                }

                written += to_copy;
            }

            return true;
        }
    };
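
``remote_mmap`` reads the syscall result directly from ``rax``. Unlike the libc ``mmap`` wrapper, which returns ``MAP_FAILED``, a raw Linux syscall reports failure as a negated errno value. The following sketch shows one way to classify such a return value; the helper names are assumptions introduced here for illustration.

```cpp
#include <cstdint>

// Linux raw syscalls signal failure by returning -errno, that is, a value
// in the range [-4095, -1] when interpreted as signed.
constexpr uint64_t kMaxErrno = 4095;

// True if a raw syscall return value (for example regs.rax) encodes an error.
bool is_syscall_error(uint64_t ret) {
    return ret >= static_cast<uint64_t>(-static_cast<int64_t>(kMaxErrno));
}

// Extract the errno value from a failing return, or 0 on success.
int syscall_errno(uint64_t ret) {
    return is_syscall_error(ret) ? static_cast<int>(-static_cast<int64_t>(ret)) : 0;
}
```

Logging ``syscall_errno(regs.rax)`` when the remote ``mmap`` fails distinguishes, for example, ``ENOMEM`` from ``EINVAL`` caused by bad flags.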

5. Library Injection and Symbol Resolution
------------------------------------------

**Dynamic Library Loading:**

.. code-block:: cpp

    #include <dlfcn.h>
    #include <link.h>

    #include <cstdint>
    #include <cstdio>
    #include <cstring>
    #include <fstream>
    #include <string>

    class LibraryInjector {
    private:
        pid_t target_pid;
        RemoteMemoryManager memory_manager;

    public:
        LibraryInjector(pid_t pid) : target_pid(pid), memory_manager(pid) {}

        // Inject rocprofiler-register library
        bool inject_register_library() {
            const char* lib_path = "/opt/rocm/lib/librocprofiler-register.so";

            // Find dlopen in target process
            void* dlopen_addr = find_function_address("dlopen");
            if (!dlopen_addr) {
                fprintf(stderr, "Could not find dlopen in target process\n");
                return false;
            }

            // Allocate memory for library path
            void* path_addr = memory_manager.remote_mmap(
                strlen(lib_path) + 1,
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS);

            if (!path_addr) return false;

            // Write library path to remote memory
            if (!memory_manager.write_memory(path_addr, lib_path, strlen(lib_path) + 1)) {
                return false;
            }

            // Call dlopen in target process.
            // call_remote_function() is not defined in this guide; it must
            // place the arguments in the target's registers and run the call.
            return call_remote_function(dlopen_addr,
                {(uint64_t)path_addr, RTLD_NOW | RTLD_GLOBAL});
        }

        // Note: on glibc older than 2.34, dlopen lives in libdl rather than libc.
        void* find_function_address(const char* function_name) {
            // Parse /proc/PID/maps to find loaded libraries
            std::string maps_path = "/proc/" + std::to_string(target_pid) + "/maps";
            std::ifstream maps(maps_path);
            std::string line;

            while (std::getline(maps, line)) {
                if (line.find("libc.so") != std::string::npos) {
                    // Extract base address of libc
                    size_t dash = line.find('-');
                    std::string base_addr_str = line.substr(0, dash);
                    void* base_addr = (void*)std::stoull(base_addr_str, nullptr, 16);

                    // Open the same libc locally and compute the function's
                    // offset from the local load base (assumes both processes
                    // map the same libc build)
                    void* handle = dlopen("libc.so.6", RTLD_LAZY);
                    if (handle) {
                        void* func_addr = dlsym(handle, function_name);
                        Dl_info info;
                        if (func_addr && dladdr(func_addr, &info) && info.dli_fbase) {
                            // Target address = target base + (local address - local base)
                            auto offset = (uint8_t*)func_addr - (uint8_t*)info.dli_fbase;
                            dlclose(handle);
                            return (uint8_t*)base_addr + offset;
                        }
                        dlclose(handle);
                    }
                }
            }
            return nullptr;
        }
    };
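
``find_function_address`` relies on the layout of ``/proc/PID/maps``, where each line holds an address range, permissions, file offset, device, inode, and an optional path. As a rough sketch, a line can be split into these fields as shown below. ``MapEntry`` and ``parse_maps_line`` are hypothetical names used only for this example.

```cpp
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>

// One parsed entry of /proc/<pid>/maps.
struct MapEntry {
    uint64_t start = 0;
    uint64_t end = 0;
    std::string perms;
    uint64_t offset = 0;
    std::string path;  // empty for anonymous mappings
};

// Parse a line such as:
//   7f1c2a400000-7f1c2a428000 r--p 00000000 08:01 1234 /usr/lib/libc.so.6
// Returns false if the line does not have the expected fields.
bool parse_maps_line(const std::string& line, MapEntry& out) {
    std::istringstream in(line);
    MapEntry e;
    std::string range, dev;
    uint64_t inode = 0;
    if (!(in >> range >> e.perms >> std::hex >> e.offset >> dev >> std::dec >> inode)) {
        return false;
    }
    const auto dash = range.find('-');
    if (dash == std::string::npos) return false;
    try {
        e.start = std::stoull(range.substr(0, dash), nullptr, 16);
        e.end = std::stoull(range.substr(dash + 1), nullptr, 16);
    } catch (const std::exception&) {
        return false;
    }
    std::getline(in >> std::ws, e.path);  // rest of the line, may be empty
    out = e;
    return true;
}
```

With the fields separated, the load base of a library can be taken from the entry whose ``offset`` is 0 rather than from the first line that happens to mention the library name.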

6. ROCprofiler-Register Communication Protocol
----------------------------------------------

**Attachment Protocol Implementation:**

.. code-block:: cpp

    #include <dlfcn.h>
    #include <sys/types.h>

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    extern "C" {
        // Function signatures from rocprofiler-register
        typedef void (*attach_func_t)(uint32_t pid);
        typedef void (*detach_func_t)();
    }

    class ROCprofilerAttachment {
    private:
        pid_t target_pid;
        void* register_handle = nullptr;
        attach_func_t attach_func = nullptr;
        detach_func_t detach_func = nullptr;

    public:
        bool initialize() {
            // Load rocprofiler-register library
            register_handle = dlopen("/opt/rocm/lib/librocprofiler-register.so", RTLD_NOW);
            if (!register_handle) {
                fprintf(stderr, "Failed to load rocprofiler-register: %s\n", dlerror());
                return false;
            }

            // Get attachment functions
            attach_func = (attach_func_t)dlsym(register_handle, "attach");
            detach_func = (detach_func_t)dlsym(register_handle, "detach");

            if (!attach_func || !detach_func) {
                fprintf(stderr, "Failed to find attachment functions\n");
                return false;
            }

            return true;
        }

        bool attach_to_process(pid_t pid, const std::vector<uint8_t>& env_buffer) {
            target_pid = pid;

            // Set up environment for rocprofiler-register
            // This involves injecting the environment buffer into the target process

            // Call the attach function
            attach_func(pid);

            return true;
        }

        void detach_from_process() {
            if (detach_func) {
                detach_func();
            }
        }
    };
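
Once ``attach_to_process`` has succeeded, ``detach_from_process`` should run on every exit path, including early returns. One common C++ way to arrange that is a small scope guard. The class below is a generic sketch under that assumption; ``ScopeGuard`` is a hypothetical helper and is not provided by rocprofiler-register.

```cpp
#include <functional>
#include <utility>

// Hypothetical scope guard: runs a cleanup callback on destruction unless
// dismiss() was called first.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> cleanup) : cleanup_(std::move(cleanup)) {}
    ~ScopeGuard() {
        if (cleanup_) cleanup_();
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

    // Cancel the cleanup, for example after a successful hand-off.
    void dismiss() { cleanup_ = nullptr; }

private:
    std::function<void()> cleanup_;
};
```

A caller could write ``ScopeGuard guard([&] { attachment.detach_from_process(); });`` directly after a successful attach, where ``attachment`` is a ``ROCprofilerAttachment`` instance.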

Complete Attachment Tool Implementation
=======================================

**Main Attachment Tool Structure:**

.. code-block:: cpp

    #include <signal.h>
    #include <sys/types.h>

    #include <chrono>
    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <thread>
    #include <vector>

    class ROCprofilerAttachTool {
    private:
        ProcessAttachment process_control;
        EnvironmentInjector env_injector;
        // LibraryInjector needs the target PID, so it is constructed
        // inside attach_and_profile() rather than stored as a member.
        ROCprofilerAttachment rocprof_attachment;

    public:
        struct AttachmentConfig {
            pid_t target_pid;
            std::vector<std::string> trace_options;
            std::string output_path = "./rocprof-attachment-output";
            std::string output_filename = "attached_profile";
            uint32_t duration_msec = 0;  // 0 = until process ends
        };

        bool attach_and_profile(const AttachmentConfig& config) {
            // 1. Validate target process
            if (!validate_target_process(config.target_pid)) {
                std::cerr << "Invalid or inaccessible target process: " << config.target_pid << std::endl;
                return false;
            }

            // 2. Initialize rocprofiler attachment system
            if (!rocprof_attachment.initialize()) {
                std::cerr << "Failed to initialize rocprofiler attachment system" << std::endl;
                return false;
            }

            // 3. Attach to target process
            if (!process_control.attach(config.target_pid)) {
                std::cerr << "Failed to attach to process " << config.target_pid << std::endl;
                return false;
            }

            // 4. Prepare environment variables
            auto env_vars = env_injector.prepare_profiling_env(
                config.trace_options,
                config.output_path,
                config.output_filename);
            auto env_buffer = env_injector.serialize_environment(env_vars);

            // 5. Inject rocprofiler-register library
            LibraryInjector injector(config.target_pid);
            if (!injector.inject_register_library()) {
                std::cerr << "Failed to inject rocprofiler-register library" << std::endl;
                process_control.detach();
                return false;
            }

            // 6. Activate profiling
            if (!rocprof_attachment.attach_to_process(config.target_pid, env_buffer)) {
                std::cerr << "Failed to activate profiling" << std::endl;
                process_control.detach();
                return false;
            }

            // 7. Allow process to continue with profiling active
            if (!process_control.detach()) {
                std::cerr << "Warning: Failed to detach cleanly" << std::endl;
            }

            // 8. Wait for specified duration or until process ends
            if (config.duration_msec > 0) {
                std::cout << "Profiling for " << config.duration_msec << " milliseconds..." << std::endl;
                std::this_thread::sleep_for(std::chrono::milliseconds(config.duration_msec));

                // Stop profiling
                rocprof_attachment.detach_from_process();
            } else {
                std::cout << "Profiling until process ends..." << std::endl;
                // Monitor process and wait for it to end
                while (kill(config.target_pid, 0) == 0) {
                    std::this_thread::sleep_for(std::chrono::seconds(1));
                }
            }

            std::cout << "Profiling completed. Output saved to: "
                      << config.output_path << "/" << config.output_filename << std::endl;
            return true;
        }
    };

    // Example usage
    int main(int argc, char* argv[]) {
        if (argc < 2) {
            std::cerr << "Usage: " << argv[0] << " <PID> [options]" << std::endl;
            return 1;
        }

        ROCprofilerAttachTool::AttachmentConfig config;
        config.target_pid = std::stoi(argv[1]);
        config.trace_options = {"hip-trace", "kernel-trace", "memory-copy-trace"};
        config.duration_msec = 5000;  // 5 seconds

        ROCprofilerAttachTool tool;
        if (!tool.attach_and_profile(config)) {
            std::cerr << "Attachment and profiling failed" << std::endl;
            return 1;
        }

        return 0;
    }
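
The tool calls ``validate_target_process`` in step 1, but this guide does not define it. A minimal sketch, assuming that existence and signal permission are the only checks needed, could use ``kill`` with signal 0, which performs those checks without delivering a signal. The body below is an assumption, not the rocprofv3 implementation.

```cpp
#include <signal.h>
#include <sys/types.h>

// Hypothetical implementation of the validate_target_process() call used in
// attach_and_profile(). Signal 0 performs the existence and permission
// checks without delivering a signal.
bool validate_target_process(pid_t pid) {
    // 0 and negative values address process groups, not a single process
    if (pid <= 0) return false;

    if (kill(pid, 0) == 0) return true;

    // ESRCH: no such process. EPERM: the process exists but this user may
    // not signal it, so a ptrace attach would most likely fail as well.
    return false;
}
```

A stricter version might also confirm that the target has a ROCm runtime library mapped before attaching.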

Required System Permissions and Setup
=====================================

**Permission Requirements:**

.. code-block:: bash

    # Your attachment tool will need:

    # 1. Ptrace permissions (may require root or capabilities)
    sudo setcap cap_sys_ptrace+ep your_attachment_tool

    # 2. Access to /proc filesystem
    # Usually available by default

    # 3. Ability to load shared libraries
    # Ensure ROCm libraries are in LD_LIBRARY_PATH
    # (the dynamic loader ignores LD_LIBRARY_PATH for binaries that carry
    # file capabilities, so such a tool needs an rpath or absolute paths)
    export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
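
Whether ``ptrace`` attachment requires extra privileges often depends on the Yama ``ptrace_scope`` setting on Linux. As a small sketch, the tool could read that setting at startup and print a hint before attempting to attach. The function names are assumptions for illustration.

```cpp
#include <fstream>
#include <string>

// Returns the Yama ptrace_scope value (0-3), or -1 if the file is missing
// or unreadable (for example, on kernels built without Yama).
int read_ptrace_scope(const std::string& path = "/proc/sys/kernel/yama/ptrace_scope") {
    std::ifstream file(path);
    int scope = -1;
    if (!(file >> scope)) return -1;
    return scope;
}

// Short explanation of each documented Yama mode.
const char* describe_ptrace_scope(int scope) {
    switch (scope) {
        case 0: return "classic: any process with the same uid may attach";
        case 1: return "restricted: only descendants may be attached without CAP_SYS_PTRACE";
        case 2: return "admin-only: CAP_SYS_PTRACE is required";
        case 3: return "disabled: no process may attach";
        default: return "unknown or Yama not enabled";
    }
}
```

With a scope of 1, which is a common default, attaching to an unrelated running process needs ``CAP_SYS_PTRACE`` or root, which is what the ``setcap`` command grants.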

**Build Requirements:**

.. code-block:: cmake

    # CMakeLists.txt for your attachment tool
    cmake_minimum_required(VERSION 3.16)
    project(rocprofiler_attach_tool)

    set(CMAKE_CXX_STANDARD 17)

    find_package(rocprofiler-sdk REQUIRED)

    add_executable(rocprofiler_attach_tool
        main.cpp
        process_attachment.cpp
        environment_injection.cpp
        library_injection.cpp
    )

    target_link_libraries(rocprofiler_attach_tool
        rocprofiler-sdk::rocprofiler-sdk
        ${CMAKE_DL_LIBS}  # for dlopen/dlsym
    )

Error Handling and Debugging
============================

**Common Issues and Solutions:**

1. **Ptrace Permissions**: Use ``strace`` to debug ptrace failures
2. **Library Loading**: Check ``/proc/PID/maps`` to verify library injection
3. **Environment Variables**: Validate environment buffer format
4. **Process State**: Monitor target process status during attachment

**Debugging Techniques:**

.. code-block:: cpp

    // Enable debug logging (call from inside a function, before attaching)
    setenv("ROCPROF_LOGGING_LEVEL", "trace", 1);

    // Monitor attachment progress
    bool debug_attachment(pid_t pid) {
        std::cout << "Target process memory maps:" << std::endl;
        std::string cmd = "cat /proc/" + std::to_string(pid) + "/maps";
        system(cmd.c_str());

        std::cout << "Target process environment:" << std::endl;
        cmd = "cat /proc/" + std::to_string(pid) + "/environ | tr '\\0' '\\n'";
        system(cmd.c_str());

        return true;
    }
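
The same environment dump can be produced without spawning a shell. The following sketch reads ``/proc/PID/environ`` directly and splits it on NUL bytes; both function names are assumptions introduced for this example.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <sys/types.h>
#include <vector>

// Split a NUL-separated block (the format of /proc/<pid>/environ) into entries.
std::vector<std::string> split_environ_block(const std::string& block) {
    std::vector<std::string> entries;
    std::string::size_type start = 0;
    while (start < block.size()) {
        std::string::size_type end = block.find('\0', start);
        if (end == std::string::npos) end = block.size();
        if (end > start) entries.push_back(block.substr(start, end - start));
        start = end + 1;
    }
    return entries;
}

// Read the environment of `pid` without spawning a shell.
// Returns an empty vector if the file cannot be opened.
std::vector<std::string> read_process_environ(pid_t pid) {
    std::ifstream file("/proc/" + std::to_string(pid) + "/environ", std::ios::binary);
    std::string block((std::istreambuf_iterator<char>(file)),
                      std::istreambuf_iterator<char>());
    return split_environ_block(block);
}
```

Note that ``/proc/PID/environ`` shows the environment as it was when the process started, so variables set later inside the target generally do not appear there.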

This implementation guide provides the foundation needed to build a complete process attachment tool for ROCprofiler-SDK. The actual rocprofv3 implementation uses similar techniques with additional optimizations and error handling.