Граф коммитов

144 Коммитов

Автор SHA1 Сообщение Дата
Mark Meserve bf49039005 [rocprofiler-sdk][rocprofiler-register] Initial Attachment Support (#316)
* attach: milestone: API tracing

- This pairs with another commit in rocprofiler-sdk to fully
  function
- Add ptrace entry points for tool attachment
- API tracing works at this commit
- Queue tracing not supported yet

* attach: cleanup

- Remove hardcode for loading of tool library
- Make invoke registration functions public again

* attach: proxy queue first draft

- Adds ability to trace with queues during attachment
- Must be paired with updated rocprofiler-sdk

* attach: prestore overhaul

- Must be paired with commit in rocprofiler-sdk

* attach: add dispatch table rework

- Register will load the prestore library and provide entrypoints to sdk

* attach: formatting and cleanup

* attach: revise dispatch table scheme

* attach: formatting

* attach: milestone: API tracing

- This change must be paired with a change in rocprofiler-register to
  fully function.
- API tracing works at this commit
- Queue tracing not supported yet

* attach: cleanup and comments

* attach: Formatting and crash fixes

* attach: add attach duration

- Add option attach-duration-msec for attachment

* Formatting + sglang hang fix via signal handling

* Changed FATAL_IF to DFATAL_IF for scratch_memory due to persistent crash when iterating queues

* attach: proxy queue first draft

- Adds ability to trace with queues during attachment
- Must be paired with updated rocprofiler-register

* Allow null agents for scratch output

* attach: improve queue library interface

- Significant changes to force exported interfaces back to C
- Fixes bug with unknown agents at attachment
- Code objects' names may still be incorrect

* attach: add code_object support

- Kernel traces will now have names and all other information for launches
- Add capture of hsa_executable to the queue library
- Various logging improvements

* attach: rename queue library to prestore

* attach: prestore overhaul

- Must be paired with commit from rocprofiler-register
- Massive overhaul of code organization in prestore library
  - Separates registrations for different object types
  - Sets up future changes for initialization

* attach: add prestore dispatch table

- Removes linkage to prestore library from sdk

* attach: cleanup

* attach: formatting

* attach: fix input prompt not appearing

* attach: fix component name in cmake

* attach: revert change to export level

* Make prestore API public

* attach: update sdk attachment library WIP

- This commit is NONFUNCTIONAL

- Changes around structure to remove classes
- Seperate C linkage where needed
- Still needs updates to register for correct usage

* attach: update register with dispatch table WIP
- This commit is NONFUNCTIONAL

- Changes rocprofiler_register to handle dispatch table from attach
  library.
- Still needs changes in SDK with dispatch table usage

* attach: dispatch table wip
- This commit is NONFUNCTIONAL

* attach: move attach component into core

* attach: rename to rocprofv3-attach

* attach: add callbacks for new queues and code objects

* attach: finish dispatch table implementation

- Fixes kernel tracing

* attach: add cmake variable for attachment support

* feat: Add --attach alias for rocprofv3 with comprehensive attachment tests

- Add `--attach` as an alias to existing `-p/--pid` functionality in rocprofv3.py
- Create comprehensive attachment test suite with CSV and JSON output validation:
- New attachment-test application for testing dynamic profiling scenarios
- Unified test script supporting both CSV and JSON output formats
- Pytest-based validation for kernel traces, memory copies, HSA API calls, and agent info
- Add CMake integration for automated attachment testing
- Support parameterized output directory and filename specification
- Implement proper environment setup for attachment queue registration

Tests verify successful attachment to running processes and capture of:
- Kernel dispatch traces with workgroup/grid dimensions
- Memory copy operations (H2D/D2H) with size validation
- HSA API call traces across multiple domains
- GPU/CPU agent information and capabilities

* Documentation Update

* attach: make attach script callable

* Added ROCPROFILER_REGISTER_ATTACHMENT_TOOL_LIB to remove hardcoded name

* attach: revert metrics library path changes

* Generic Attachment in Register (#942)

Remove tool references in register

* Add second param to attach call in rocprof register

* Add experimental reattachment support for ROCprofiler-SDK

This commit introduces experimental reattachment functionality allowing tools
to dynamically reattach to running processes with comprehensive design changes
to support multiple attach/detach cycles:

**Core Reattachment API:**
- Add rocprofiler_tool_configure_result_experimental_t with tool_reattach/tool_detach callbacks
- Add rocprofiler_call_client_reattach and rocprofiler_call_client_detach C exports
- Implement reattachment tracking in rocprofiler_register_attach to differentiate
initial attachment from reattachment cycles
- Add rocprofiler_register_invoke_reattach for handling reattachment requests

**Design Changes - Registration System Flow:**
The registration system now supports a dual-path initialization:

1. Initial Attachment Flow:
    - rocprofiler_register_attach() -> rocprofiler_register_invoke_all_registrations()
    - Full tool initialization with complete context setup
    - Sets prev_attached atomic flag to track state

2. Reattachment Flow:
    - rocprofiler_register_attach() detects prev_attached=true -> rocprofiler_register_invoke_reattach()
    - Bypasses full re-initialization, calls client reattach callbacks instead
    - Preserves existing contexts and buffers, only reactivates profiling services

**Design Changes - Tool Library Loading:**
Enhanced rocprofiler-register library loading with function pointer resolution:
- Extended rocp_set_api_table_data_t tuple to include reattach/detach function pointers
- Automatic symbol resolution for rocprofiler_call_client_reattach/detach functions
- Support for both LD_PRELOAD and dlopen scenarios with consistent callback availability

**Design Changes - Context Management:**
Introduced dual context systems for attachment scenarios:
- get_contexts() - Original contexts for standard tool initialization
- get_attach_contexts() - Separate context map for attachment-specific lifecycle
- attach_init() - Creates contexts for ALL buffer tracing services using existing buffers
- attach_start() - Selectively starts contexts based on configuration options
- attach_detach() - Cleanly stops and destroys attachment contexts

**Design Changes - Buffer Management:**
Added reset_tmp_file_buffer() template for clean reattachment state:
- Properly closes and removes old temporary files
- Deletes existing file_buffer instances to prevent stale file position tracking
- Creates fresh file_buffer instances for clean reattachment cycles
- Addresses core issue where file position metadata becomes stale between cycles

**Design Changes - Environment Variable Injection:**
Added ROCP_REGISTERED_TOOL_ATTACH environment variable:
- Distinguishes attachment-loaded tools from LD_PRELOAD scenarios
- Enables registration system to apply attachment-specific logic
- Helps tools adapt behavior for attachment vs standard initialization

**Attachment Context Management:**
- Add attach_init/attach_start/attach_detach functions for dynamic context lifecycle
- Add reset_tmp_file_buffer template for clean reattachment state management
- Implement get_attach_contexts() for tracking active attachment contexts

**Test Infrastructure:**
- Add projects/rocprofiler-sdk/tests/rocprofv3/reattach/ comprehensive test suite
- Include reattachment test scripts with unified attachment/detachment cycles
- Add validate.py with trace data validation for kernel, memory copy, HSA API, and agent info
- Add conftest.py for JSON and CSV data loading utilities

**Configuration Updates:**
- Update CMakeLists.txt to include reattachment tests in build system
- Add environment variable ROCP_REGISTERED_TOOL_ATTACH for attachment state tracking
- Enhance rocprofiler-register library loading with reattach/detach function resolution

**Flow Impact Analysis:**
This design enables robust multi-cycle attachment by:
1. Preventing duplicate initialization on reattachment
2. Maintaining separate context lifecycles for attachment vs standard operation
3. Ensuring clean temporary file state between attachment cycles
4. Providing tools with explicit reattach/detach callback hooks
5. Supporting both programmatic and environment-based tool configuration

The experimental nature allows for iteration on the API while establishing
the foundation for production-ready dynamic profiling capabilities.

* Fix misc clang-tidy warnings/errors

* CMake Option and Environment Variable Updates

- CMake: ROCPROFILER_REGISTER_ALWAYS_SUPPORT_ATTACH -> ROCPROFILER_REGISTER_BUILD_DEFAULT_ATTACHMENT
- Env: ROCPROFILER_REGISTER_ATTACHMENT_ENABLED ->

* Source reorganization

* Formatting + new lines at EOF

* Fix flake8 F841: local variable is assigned to but never used

* Update attachment test

- get rid of 5 second start delay
- add roctx

* Rework implementation

- Remove rocprofiler_tool_configure_result_experimental_t in lieu of rocprofiler_configure_attach
- Add <rocprofiler-sdk/experimental/registration.h>
- TODO: Update process_attachment.rst

* Handle re-attachment options

- inherit options from previous attachment
- check previous options do not modify data collection services

* Fix support for tools w/o rocprofiler_configure_attach

- fix segfault when rocprofiler_configure_attach does not exist
- fix naming convention for functions accepting attach dispatch table
- cleanup rocprofiler_configure_attach implementation in rocprofv3 tool

* attach: remove unknown agent handling

- Change was from earlier commit, no longer needed

* attach: add error for attaching without library loaded

* attach: revise version numbering

* attach: register header revisions

* attach: clang format register

* attach: formatting

* attach: fix build failure

- Remove cross dependency into rocprofiler-sdk, fixes build on some systems

* attach: revise register library detection

* Update rocprofiler-register and attach library

- formatting
- proper signature of register_functor for rocprofiler-sdk-attach library callback
- remove get_dispatch_registration_table()

* Bump rocprofiler-register version to 0.6.0 + AnyNewerVersion

* Fix output support for rocprofiler-sdk-tool

* Fix formatting

* Fix clang tidy errors

* Misc rocprofiler-sdk-attach fixes

* attach: add sigint handling to attach python

* tool README.md formatting

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

* Fix buffered output issue

* attach: add errors for tool attach

* CI Fixes

* Rework tests

* attach: improve library loading in rocprofv3 attach

* formatting

* Update tests to use pytest framework

* Fix test_attachment_hsa_api_trace

* attach: catch ctypes exceptions

* attach: fix leak in registration

* attach: fix sanitizer tests

* attach: fix sanitizer tests further

* attach: disable attach asan tests

* attach: disable ubsan test

* attach: fix permissions in installed test package

* attach: formatting

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
Co-authored-by: Tim Gu <Tim.Gu@amd.com>
Co-authored-by: Claude Code <claude@anthropic.com>
Co-authored-by: Benjamin Welton <bwelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-09-18 18:10:45 -05:00
Swati Rawat e655bb37a7 Update installation.rst (#1034) 2025-09-17 11:10:55 -04:00
Matt Williams af2f2c1345 Update index.rst (#1014) 2025-09-16 09:59:04 -04:00
Julian Jose 2d3803da89 Update using-rocprofv3 documentation (#331)
* Update using-rocprofv3 documentation

* Update using-rocprofv3.rst

* Update using-rocprofv3.rst

* Update projects/rocprofiler-sdk/source/docs/how-to/using-rocprofv3.rst

Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com>

* Update projects/rocprofiler-sdk/source/docs/how-to/using-rocprofv3.rst

Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com>

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com>
2025-09-11 12:11:04 +05:30
Ammar ELWazir 069d5ecce2 [ROCProfiler SDK] Updating README Building & Installing Instructions (#931)
* Updating ROCProfiler SDK README

* Fixing ROCProfiler SDK License

* Fixing ROCProfiler SDK Installation Steps

---------

Co-authored-by: Joseph Macaranas <145489236+jayhawk-commits@users.noreply.github.com>
2025-09-11 12:08:49 +05:30
usrihari123 2449bfd483 Update the scratch memory docs with the new allocation_size field (#685)
* Update the scratch memory docs with the new allocation_size field

* Address review comment

---------

Co-authored-by: Srihari <srihariu1@gmail.com>
2025-08-28 17:37:06 +05:30
systems-assistant[bot] b645010655 Using semaphore to sync with all peer processes in finalization stage (#169)
* Using semaphore to sync with all peer processes in finalization stage

[rocprofv3] Implement synchronization using POSIX semaphore in finalization

* clang format code

* clang 11 format code

* Add process sync option for rocprofv3

* Default value of process sync is false

* Update source/lib/rocprofiler-sdk-tool/tool.cpp

Apply suggestion by Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* update according to comments

* add new line to helper.hpp

---------

Co-authored-by: Huanran Wang <huanrwan@amd.com>
Co-authored-by: Huanran Wang <huanran.wang@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-25 08:57:41 -05:00
systems-assistant[bot] c7b9533836 [Docs] Update using-pc-sampling (#157) 2025-08-21 11:14:16 -04:00
systems-assistant[bot] 351d598869 [Docs] Adding AQLprofile info (#150)
* Adding AQLprofile info

* update aqlprofile text

---------

Co-authored-by: Matt Williams <Matt.Williams+amdeng@amd.com>
2025-08-19 08:44:25 -07:00
Baraldi, Giovanni 6a6b16be93 Adding GPU index as a parameter for ATT (#547)
* Adding GPU index as a parameter for ATT

* Tidy fix

* Using tokenize

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

* Adding error logging. Using idx instead of id.

---------

Co-authored-by: Giovanni <gbaraldi@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

[ROCm/rocprofiler-sdk commit: fd6f96ffb5]
2025-08-04 23:15:50 +02:00
Trowbridge, Ian 6b2a4fcfc2 Revert memory allocation CSV output file header and update tests (#532)
* Reverted header and field location for csv memory allocation and updated tests

* Updated example csv file and made small update

[ROCm/rocprofiler-sdk commit: 533a8329d8]
2025-08-04 13:22:27 -05:00
Bhardwaj, Gopesh f625253208 SWDEV-544115 Adding documentation for rocprofv3 advanced options (#516)
* SWDEV-544115 Adding documentaiton for rocprofv3 advanced options

* minor changes

* updating rocpd documentation

* updated changelog

* adressed Feedback

[ROCm/rocprofiler-sdk commit: 4120c12ed5]
2025-07-30 22:25:40 +05:30
Baraldi, Giovanni 4ca156e572 Thread trace and Trace Decoder API tests and samples (#416)
* Adding test and samples to decoder

* Fix sample

* Formatting

* Fix multi test

* Disable sample

* Fix tests

* Format

* Version fix

* Locking the decoder

* Add atomic

* Review comments

* Format

* Adding readme

* merge conflict and adding PCS+ATT test

* Review comments

* Properly disable PCS test

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

* Adding back env var test

* Name fix

* Preload sample

* Addressing review comments

* Update docs

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: e898079a13]
2025-07-22 20:08:12 -05:00
Gill, Harkirat b88018d24d Update output file fields docs to correctly define Grid_Size (#526)
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: e948034c83]
2025-07-22 23:16:01 +05:30
Nagaraj, Sriraksha 28d2a8f5bb [rocprofv3-avail] - Add sample data (#514)
* Add sample data for avail and remove color code for non terminal output

* review comments

* review comments

* add documentation

* test fix

[ROCm/rocprofiler-sdk commit: 2447a85215]
2025-07-22 08:39:59 -07:00
Indic, Vladimir d5aba741f3 [Host-Trap PC Sampling] Host-Trap PC sampling an introduce an arbitrary sampling skid of [0, 2] instructions (#515)
* Arbitrary host-trap sampling skid (doc)

The host-trap PC sampling might introduce a skid of [0, 2]
instructions. We documented this information and provides
some advice to application developers how to find
hot-spots in the profiles generated by host-trap sampling.

[ROCm/rocprofiler-sdk commit: 650d35bdaa]
2025-07-17 17:59:46 +02:00
Nagaraj, Sriraksha c8912d2bb6 [rocprofv3-avail] Documentation update and column formatting (#447)
* addressing issues

* doc fix

* test fix

* fix

* fix formatting issue and doc update

* fix column size

* fix

* fix formatting in output

* tests fix

* test fix

* add new line

* add new line

* fix new line

* fixing typo in using-rocprofv3-avail.rst

[ROCm/rocprofiler-sdk commit: 3aaffc42da]
2025-07-10 11:41:12 -05:00
U, Srihari 7243889d6a Add perfetto support for scratch memory (#303)
* Add perfetto support for scratch memory

* Updated tests and docs.

* Update docs data

* Added underflow check

* Record all free events to 0 bytes

* Add format

* Address review comment

* updated tests for scratch memory

* update scratch-memory tests.

[ROCm/rocprofiler-sdk commit: 6f2a5a9646]
2025-07-09 21:05:45 +05:30
Bhardwaj, Gopesh d5ca98baed Adding OpenMP usage with rocprofv3 (#472)
* Adding openmp usage with rocprofv3

* minor changes

* Fixing missing line

[ROCm/rocprofiler-sdk commit: e7616c3aad]
2025-07-02 12:25:24 +05:30
Baraldi, Giovanni cd5d5f8142 [rocprofv3] Fix ATT library path (#476)
* Fix library path

* Update docs

* Review comments

* Update source/bin/rocprofv3.py

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: c0c08b2f08]
2025-07-01 22:08:29 +02:00
Verma, Saurabh 442da1f287 PC-Sampling doc updates - FW version (#455)
* Initial doc update

* addressed review comments

* addressed review comments - 2

* accept reviewer suggestions

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept reviewer suggestions-2

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept reviewer suggestions-3

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept reviewer suggestions-4

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update README.md

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update CHANGELOG.md as per viewer suggestions

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept review suggestion

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

* accept reviewer suggestion

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept reviewer suggestions

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

---------

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: f70f369d46]
2025-06-25 13:11:18 +05:30
Baraldi, Giovanni 0ea9dbf7a8 Adding doc links for trace decoder, aqlprofile and viewer (#464)
Adding interlinks for trace decoder, aqlprofile and viewer

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 9dadbbace5]
2025-06-18 14:10:18 +02:00
Bhardwaj, Gopesh 1f4084c7b5 Adding rocpd documenation (#449)
* Adding rocpd docuemenation

* rocpd format

* CHANGELOG update and indexing

* Fixing links

* format fixes

* fixing table

* major edits

* fixed logical error

* fixing rocprofv3 avail

[ROCm/rocprofiler-sdk commit: 3e43b1f019]
2025-06-17 15:41:53 +05:30
Kandula, Venkateshwar reddy 1562e4573c [DOCS] SWDEV-534589 Update docs with new info in kernel_trace csv output (#438)
* Update docs with new info in kernel_trace csv output and add flag for csv in docs.

* Misc.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Vaddireddy, Sushma <Sushma.Vaddireddy@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 1c91774c6a]
2025-06-10 08:20:07 +05:30
Nagaraj, Sriraksha 3a62fee4ac [rocprofv3-avail] Rework rocprofv3-avail tool (#312)
---------

Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>

[ROCm/rocprofiler-sdk commit: 80d60d8535]
2025-06-06 11:51:37 -07:00
Kumar, Amit a85fb8f456 add binary link (#427)
* add binary link

* Update source/docs/how-to/using-thread-trace.rst

---------

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

[ROCm/rocprofiler-sdk commit: 7411640761]
2025-05-30 11:52:31 -05:00
Baraldi, Giovanni af98f5163b Adding Thread Trace API reference (#417)
* Adding Thread Trace API reference

* Doc fixes

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/_toc.yml.in

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/index.rst

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/api-reference/thread_trace.rst

* Apply suggestions from code review

Co-authored-by: Paoletti, Leo <Leo.Paoletti@amd.com>

* Update source/docs/_toc.yml.in

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Paoletti, Leo <Leo.Paoletti@amd.com>
Co-authored-by: Xu, Alex <Alex.Xu@amd.com>

[ROCm/rocprofiler-sdk commit: eedfecd905]
2025-05-30 11:51:46 -05:00
Baraldi, Giovanni c585122767 Adding using-thread-trace.rst (#408)
* Adding using-thread-trace.rst

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

* Add to index/toc

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Paoletti, Leo <Leo.Paoletti@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Paoletti, Leo <Leo.Paoletti@amd.com>

[ROCm/rocprofiler-sdk commit: b590612966]
2025-05-29 15:41:42 -05:00
Bhardwaj, Gopesh e55c31db27 SWDEV-533894 Documentation for python bindings (#404)
* SWDEV-533894 Documenation for python bindings

* Fixing missing-new line check

* Addressed Feedback

[ROCm/rocprofiler-sdk commit: 7f7827fb30]
2025-05-27 22:39:21 -05:00
Rawat, Swati 3f47d1fffa Doc review (#386)
* doc review

* more updates

* install title

* Update rocprofiler.h

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: c255ec5b5c]
2025-05-27 11:28:38 -05:00
Bhardwaj, Gopesh abf3f869e9 Making ROCTx API doxygen generated document more readable (#385)
* Making ROCTx API doxygen generated document more readable

* fixing build

* Fix linking errors

* Fixing header

* Fixing Topics and Types

* doxygen configuration fixes

* Fixing build

* Fix unnecessory doc parsing warnings

* formatting and linting fixes

* rebasing SDK modular PR

* Fixing missing line

* Fixing ROCtx documentation after merge

* Removing flake changes

* changed back WARN_IF_DOC_ERROR to Yes

[ROCm/rocprofiler-sdk commit: b48fa532bc]
2025-05-22 18:08:55 -05:00
Welton, Benjamin cfce653d86 [SDK] Standardize rocprofiler-sdk counter definition YAML schema (#370)
* Convert YAML Format

Convert YAML format and reader to properly read the YAML.

Comparison between output's from the YAML show only changes in ordering
of architectures (and ids).

* Test fixes

* Add script for converting the YAML schema to source/scripts

* Update documentation

* Change the extra counter code block to YAML

* Add missing new line at EOF

* remove name issues

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 33e43e66d3]
2025-05-14 13:31:51 -05:00
Kandula, Venkateshwar reddy 89fbdeb196 [docs] Improve readability of ROCprofiler-SDK API library documentation (#359)
* Use custom .rst to make api doc more readable.

* Update index.rst

* Misc docs updates

- doxygen source code fixes
- updated doxygen files
- fixed conf.py (does not generate code in source tree)

* Update source/docs/api-reference/rocprofiler-sdk_api_reference.rst

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/api-reference/rocprofiler-sdk_api_reference.rst

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/api-reference/rocprofiler-sdk_api/modules.rst

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/api-reference/rocprofiler-sdk_api/global_data_structures_topics_files.rst

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Duplicate

* test warnings

* Update CMakeLists.txt

* Update rocprofiler-sdk.dox.in

* Update update-docs.sh

* fix docs build failures by -q -T flags.

* set warn_as_error to NO.

* test -W to suppress warnings.

* remove -q flag from make.

* reduce dot graph depth to 100

* Update custom docs target

- docs target is now no longer part of the dependency list for the all target
- installation of docs requires explicitly building the docs target (i.e. OPTIONAL install of _build/html/ folder)

* add quit and trace mode back.

* increase DOT_GRAPH_MAX_NODES to 500 back.

* Format.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: 6ec9526475]
2025-05-14 11:17:51 +05:30
Elwazir, Ammar c47e5838f1 [rocprofv3-benchmark] SDK and rocprofv3 Benchmarking Suite (#157)
* Adding Benchmarking Stg1

* config fix

* reset

* add jpeg and decode traces in iteration

* address comments benchmark config files.

* address comments.

* address comments.

* address comments: revert cntrl ctx.

* address comments: revert csv output.

* resolve merge conflits.

* format.

* build fix.

* fix hip runtime api traces.

* loop cb services.

* format.

* bug fix.

* Fix operator>

- public C++ comparison operator

* Update configuration options

- support selected regions (--selected-regions)
- support writing output config json (--output-config)
- update serialization data

* rocprofv3 tool library misc updates

- lambda for starting context
- support for writing config json

* Tool library updates

- Finished support for all benchmarking modes
- Added build spec support to config json

* Fix ROCPROFILER_SOVERSION

- this value should not be multiplied by 10,000

* Minor tweak to rocprofv3

* Benchmarking scripts

* formatting

* Fix duplicate include

* Add reproducible-dispatch-count test app

- used in benchmarking

* registration logging

- report number of registered contexts and active contexts after client initialization

* Serialize environment in rocprofv3 output config

* ROCPROFILER_BUILD_BENCHMARK CMake option

* Update benchmark SQL schema

- hash_id is text
- add md5sum to benchmarked_app
- remove app_id from benchmarked_sdk
- add sdk_id to benchmark_config
- separate hip_trace into hip_runtime_trace and hip_compiler_trace
- use INT instead of INTEGER for MySQL compatibility
- add count column in benchmark_statistics
- allow std_dev to be NULL in benchmark_statistics

* Update rocprofv3-benchmark.py

- use md5 instead of python hash (which includes random seed)
- use args.mysql_database
- compute md5sum of executable
- fix insert_benchmark_config
  - marker trace fixes
  - memory allocation fixes
  - split hip_trace into hip_{runtime,compiler}_trace
- remove app_id from benchmarked_sdk
- support warmup runs
- count field in benchmark_statistics

* Support launcher and environment in YAML

* Update reproducible-dispatch-count.cpp

- support mode which doesn't use hip event timing

* Misc rocprofv3-benchmark.py updates

- fix some MySQL support
- remove some unnecessary logging

* support mysql db.

* Format.

* Updated SQL input files

- moved benchmark_schema.sql to benchmark_table.sql
- added benchmark_views.sql
  - uses {{metric}} syntax for variable substitution

* cmake formatting

* update rocprofv3-benchmark.py

- benchmark config labels
- overhead views

* Encode rocprofv3-benchmark PID in rocprofv3 and timem output files

* Minor tweak to benchmark_views.sql

- include count
- reorder fields for readability

* split statements and use IS if values is NONE.

* use backtick instead of double quotes and add IS before NOT NULL.:

* Adding Mandelbrot Benchmark App

* Adding Dockerfile example

* Update dockerfile

* Update dockerfile

* [SDK] rocprofiler_query_external_correlation_id_request_kind_name

* Execution-profile benchmark mode

* Execution profile SQL support

* Rename mandlebrot folder + misc clang-tidy

* [rocprofv3-benchmark] Execution profile support

* Update installation

* add work dir when setting git revision, useful when building outside src.

* Set FULL_VERSION_STRING and ROCPROFILER_SDK_GIT_REVISION

- when benchmark folder is top-level

* Remove unused python packages from requirements.txt

* Use ldd/pyelftools to include linked libs for md5sum

- also add --filter-benchmark and --filter-rocprofv3 options
- support labeling the rocprofv3 options
- use more argparse groups
- more generic application of filters
- support variable substitution in environment, e.g. PATH=/some/path:$PATH

* Environment improvements

- improve reproducibility when env set via input file vs. shell
- support "environment-ignore" to remove environment variables

* Misc formatting

* Misc. fix

* use backticks for defining new columns name

* Support shuffling the order of benchmark modes/rocprofv3 args

* Address review comments

* Update Dockerfile

- rename to Dockerfile
- reduce to one layer

* Support docker build arg BRANCH

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 6f17da7ade]
2025-05-13 16:18:23 -05:00
Bhardwaj, Gopesh 0f877ea176 Replacing ROCm 6.5 mentions with ROCm 7.0 (#391)
[ROCm/rocprofiler-sdk commit: fbab96e552]
2025-05-12 17:05:45 +05:30
Trowbridge, Ian 24f054f509 Fix HIP Streams Duplication Error (#313)
* Fix stream duplication and fixed tests

* Added comments to explain stream.cpp code, change stream nullptr check to occur in update table to prevent readding null stream, simplified hip-streams bin file code, add destroyStreams to hip-streams bin file code

* Removed roctx from CMakeLists.txt

* Updated documentation

* Fix documentation

* Removed update_table for HIP compiler table and updated stream.cpp to remove support for HIP compiler table

* Added runtime initialization check for HIP

* Changed tool name, working on fixing memory management

* Added context for counter collection kernel rename combination

* Changed name from map to set and changed description

* Fix documentation description for group-by-queue

* Merged memory copy and kernel operations onto a single track when on the same stream

* Updated perfetto output to remove hardware information from track name to merge all memory copy and kernel operations on the same stream to the same track:

* Most pr comments addressed

* Added filter for counter collection and removed kernel buffer tracing hack

* Added PR comment fixes

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: e626df43eb]
2025-05-01 00:56:15 -05:00
Madsen, Jonathan 871686ad40 [rocprofv3] Use -P for collection period shorthand option (#356)
* [rocprofv3] Use -P for collection period option

- Reserve -p for profiler attachment

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: d2bde3ce27]
2025-04-27 20:18:26 -05:00
Bhardwaj, Gopesh 20bc0deaf1 doc improvements for 1.0.0 (#367)
* correcting rocprofiler_configure

* Fix indexing order

* doc feedback

[ROCm/rocprofiler-sdk commit: bbe9eab53a]
2025-04-24 17:05:22 +05:30
Bhardwaj, Gopesh 1646c6fdd0 Using miniconda docker (#366)
* Using miniconda docker

* remove sudo

* Remove double install of rocprofiler-docs conda environment

* Fix building docs

* Fix build docs

- Additional system packages

* Using miniforge

* Fixing warning as errors build issue

* cmake formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 024cf0e5e3]
2025-04-23 23:52:03 -05:00
Bhardwaj, Gopesh 93abda4cfd Copilot suggestions (#360)
* Copilot suggestions

* Fixing perfetto links

* correcting default value of agent-index

[ROCm/rocprofiler-sdk commit: 1f1c192a5e]
2025-04-22 20:52:37 +05:30
Bhardwaj, Gopesh ef8b185a04 Remove SDK as beta from docs (#351)
[ROCm/rocprofiler-sdk commit: 780b96ad3a]
2025-04-21 21:31:14 +05:30
Nagaraj, Sriraksha 2e7d0b3aec [rocprofv3] signal handler fix (#332)
* rocprofv3: LD_PRELOAD for signal and sigaction

- wrappers around `signal` and `sigaction` to prevent applications which install signal handlers to replace the rocprofv3 signal handlers
- minor tweaks to buffer sizes (use page_size instead of
KiB)

* [DO NOT COMMIT] extra logging

* Switch git submodule url for perfetto

- use GitHub URL as this is more accessible

* Update ring_buffer<Tp>

- account for alignment padding

* Update buffered_output

- track number of bytes stored
- add nullptr checks

* Update tmp_file_buffer

- track number of bytes
- read_tmp_file does not create tmp file if it does not already exist

* Update tmp_file

- add exists member function for checking whether temporary file already exists
- tweak remove() implementation

* Update config.hpp

- add option to enable/disable signal handlers
- add option for minimum_output_bytes

* Make signal, sigaction functions visible

* rocprofv3 tool updates

- chained signals
- override the signal handler(s) installed by the application
- improve cleanup of temporary files
- support minimum output bytes

* Add commandline support

* fixing test

* minor fix

* minor fix

* fix clang issue

* fix

* Adding docs

* review comments

* review changes

* review

* YUV pulldown additions to rocdecode

* More rocdecode changes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>

[ROCm/rocprofiler-sdk commit: 87badfbd15]
2025-04-17 21:10:52 -07:00
Indic, Vladimir 3dc4148c46 MI300 Stochastic PC Sampling Documentation and Changelog (#336)
* MI300 Stochastic PC Sampling Documentation

* Stochastic PC sampling title renaming

---------

Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>

[ROCm/rocprofiler-sdk commit: 96a0ef244f]
2025-04-15 14:04:19 -07:00
Bhardwaj, Gopesh 8778237298 doc improvements for 1.0.0 part 2 (#330)
* update installation steps

* Github Issue #50 Adding README's for samples

* Making name change to ROCprofiler-SDK for consistency

* Fix HIP trace documentation

* Fix HSA trace in docs

* Fix kernel trace in docs

* Fixing memory copy and memory allocation traces

* runtime trace and sys trace doc update

* Fix scratch memory doc

* kernel naming and filtering options

* Adding collection period in docs

* Perfetto configs update

* summary output file

* kernel trace format fix

* update CHANGELOG

* Agent index doc update

* rocm-smi output

* group by queue option

* Updated --group-by-queue description

* perfetto visualization

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>

[ROCm/rocprofiler-sdk commit: ca7cce9e81]
2025-04-15 13:30:07 -07:00
Trowbridge, Ian 5467c83188 rocDecode Buffer Tracing Support (#315)
* Added buffer tracing support for rocdecode and updated tests to work with buffer tracing

* Updated perfetto to output args individually rather than as a string list

* Updated docstrings and operation type, changed OTF2 code to remove warning due to change in operation type

* Updated tests for review comments

* Test args exist and return value

* Updated to use string entry

* Change function name

* Updated PR to reflect review comments

* Updated for PR review comments

* Change function name

[ROCm/rocprofiler-sdk commit: 077723337a]
2025-04-11 21:56:36 +00:00
Rawat, Swati 8685fe685f Fixing broken link (#326)
* fixing broken link

* added metadata information

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 379d760fc1]
2025-04-03 18:12:34 +05:30
Meserve, Mark d80c047fd2 Additional 1.0.0 changes (#317)
* Additional 1.0.0 changes

- Update VERSION
- Add beta compatibility for rocprofiler_agent_set_profile_callback_t

* Fix location of deprecated typedef rocprofiler_agent_set_profile_callback_t

* rocprofiler_record_counter_t -> rocprofiler_counter_record_t

* Experimental + deprecated annotations

* rocprofiler_record_dimension_info_t -> rocprofiler_counter_record_dimension_info_t

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: a1fcdf7f83]
2025-03-26 02:12:03 -05:00
Bhardwaj, Gopesh 146169577b doc improvements and fixes SWDEV-523395,SWDEV-516979 (#314)
* doc improvements and fixes SWDEV-523395,SWDEV-516979

* Adding changes from PR 231

[ROCm/rocprofiler-sdk commit: 6d6eec230c]
2025-03-26 10:09:08 +05:30
Madsen, Jonathan 43af686b72 Updated source/docs/sphinx/requirements.txt (#310)
- Re-ran pip-compile on source/docs/sphinx/requirements.in

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 2061c52817]
2025-03-24 14:00:49 +05:30
Welton, Benjamin 692d041316 [SDK] Release 1.0 Public API Modifications (#277)
* Make sure all structs/enums can be forward declared

* Updates to counter collection

- consistency updates and cleanup

* Conversion of dimension information to info struct

* Added deprecated folder

* Testing changes

* merge changes

* Fix shadowed variable

* Source code formatting

* Fix shadowed variable

* Update rocprofiler_counter_info_v1_t member names

* Split version.h into version.h and ext_version.h

- ext_version.h contains external version info, e.g. ROCPROFILER_HSA_API_TABLE_MAJOR_VERSION, ROCPROFILER_HSA_RUNTIME_VERSION
- this reduces amount of recompilation after a commit since version.h gets updated with the git revision

* profile_config -> counter_config

* EOF new line

* [Samples] Reduce header includes + reorg counter collection samples

* Misc compilation fixes

- shadowed variables
- use of [[deprecated("...")]] in C code
- unused variables

* Minor misc modifications

- use common:: instead of rocprofiler::common:: when inside rocprofiler namespace
- counters.cpp
  - move local anon namespace functions into rocprofiler::counters:: anon namespace
  - use std::string_view for get_static_string
  - const ref for get_static_ptr
  - misc namespace shortening

* [Public API] rocprofiler_get_version_triplet + rocprofiler_version_triplet_t

- struct rocprofiler_version_triplet_t containing fields for the major, minor, and patch version
- public API function: rocprofiler_get_version_triplet
- define C++ operators for rocprofiler_version_triplet_t
- C++ function compute_version_triplet

* [Tests] Improve async-copy-testing test

- relax constraints
- improve logging

* Update counter_config.h doxygen docs

* ROCPROFILER_SDK_BETA_COMPAT

- ppdef which helps with renaming when set to 1

* Remove spurious include

* Fix includes for cxx/version.hpp

* Doxygen fixes for rocprofiler_get_version and rocprofiler_get_version_triplet

* Public API Experimental Designation

- ROCPROFILER_SDK_EXPERIMENTAL added to experimental function
- "(experimental)" added to doxygen @brief entries

* Fix use of assert instead of static_assert in hip/stream.cpp

* Use typedef instead of define for rocprofiler_profile_config_id_t

* Use inline rocprofiler_{create,destroy}_profile_config instead of ppdef

- added <rocprofiler-sdk/deprecated/profile_config.h>

* Doxygen for rocprofiler_{create,destroy}_profile_config

* ROCPROFILER_SDK_DEPRECATED_WARNINGS

* Temporarily comment out ROCPROFILER_SDK_DEPRECATED_WARNINGS=1

* cmake formatting

* Misc variable renaming in samples and tests

* Fix declarations of types

* Fix hip stream tracing service struct name

- rocprofiler_callback_tracing_stream_handle_data_t renamed to rocprofiler_callback_tracing_hip_stream_api_data_t

* Rename "HIP_STREAM_API" to "HIP_STREAM"

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>

[ROCm/rocprofiler-sdk commit: 4cd121e27b]
2025-03-24 12:07:33 +05:30