273 Revīzijas

Autors SHA1 Ziņojums Datums
Benjamin Welton 1517a398bf [rocprofiler-sdk] Buffer finalization fixes and HSA ABI 0x09 support (#2318)
* [rocprofiler-sdk] Fix buffer flush ordering and sanitizer CI improvements

Buffer Pool Design
------------------
Replace the fixed array-based double buffer with a dynamic pool design to
fix race conditions that caused "internal correlation id was retired
prematurely" errors.

The original design had a race where flush callbacks could be delivered
out-of-order: when buffer 0 fills and begins flushing, writes go to
buffer 1. If buffer 1 fills before buffer 0's flush completes, the
buffer index wraps back to 0 (which may still be flushing). Independent
flush tasks submitted to the thread pool can complete out of order.

The new pool design:
- Uses a std::deque of buffer instances that grows as needed
- Allocates buffers from the pool when the current buffer needs to flush
- Serializes flushes with a mutex to ensure FIFO callback ordering
- Returns buffers to the pool after flush completion
- Eliminates the race between buffer selection and write operations

New Unit Tests
--------------
- buffer_correlation_ordering.cpp: Tests that API records are always
  delivered before their corresponding retirement records
- buffer_ordering_stress.cpp: Stress tests buffer flush ordering under
  high contention with multiple threads rapidly filling buffers

HSA Tool Hooks
--------------
Added hsa_tool_hooks.cpp/hpp to register an HSA OnUnload callback that
waits for pending flush tasks before tool finalization, preventing
"retired prematurely" errors during HSA shutdown.

Sanitizer Improvements
----------------------
- LSAN: Set fast_unwind_on_malloc=1 to prevent deadlock in libgcc unwinder
- LSAN: Added suppressions for external tools (liblzma, liblsan, seq, strdup)
- TSAN: Added suppression for false positive on C++11 thread-safe static
  initialization in create_write_functor
- ASAN/UBSAN: Added patterns for known issues in HSA runtime, HIP, perfetto
- Disabled attachment tests for sanitizers due to library preloading issues

Other Fixes
-----------
- Thread-trace agent test: Use heap-allocated callback state
- Correlation ID: Refactored reference counting and finalization ordering

* [rocprofiler-sdk] Revert buffer pool design changes

Revert buffer.cpp and buffer.hpp to the original double-buffer
design from develop branch. The pool-based redesign introduced
concerns about:
- Signal safety (mutex vs atomic_flag)
- API changes (flush() return type)
- Complexity of the new design

This revert removes:
- Dynamic buffer pool with std::deque
- std::mutex/condition_variable synchronization
- buffer_correlation_ordering.cpp test
- buffer_ordering_stress.cpp test

The underlying buffer flush ordering issue will need to be
addressed with a different approach that preserves the original
API and synchronization characteristics.

* [rocprofiler-sdk] Consistent fini_status checks to prevent correlation ID creation during finalization

- Revert TOCTOU CAS loop change in sub_ref_count() - not needed with consistent checks
- Add fini_status check in correlation_tracing_service::construct() with ROCP_CI_LOG warning
- Add nullptr checks at all construct() call sites (queue.cpp, async_copy.cpp, memory_allocation.cpp)
- Change all 'get_fini_status() > 0' to '!= 0' for consistent behavior:
  - hsa/queue.cpp (lines 105, 210)
  - hsa/async_copy.cpp (line 344)
  - hsa/hsa_barrier.cpp (line 43)
  - buffer.cpp (lines 107, 138, 185)

This ensures no correlation IDs are created once finalization starts (fini_status != 0),
preventing races between finalization and ongoing tracing operations.

* [rocprofiler-sdk] Replace arrival-order checks with timestamp-based temporal validation

Buffer records are not guaranteed to arrive in any specific order. Tests and
samples should use timestamps for temporal ordering validation instead.

Changes:
- samples/external_correlation_id_request: Replace 'retired prematurely' arrival
  order check with timestamp-based validation that retirement timestamp >=
  max(end_timestamps) for records with the same correlation ID
- tests/external_correlation.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check
- tests/registration.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check
- tests/roctx.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check

Correlation IDs are not guaranteed to be monotonically increasing when records
are sorted by timestamp. Temporal ordering should be validated using the
timestamp fields in each record.

* [rocprofiler-sdk] Revert external/CMakeLists.txt SYSTEM keyword removal

Restore the SYSTEM keyword to target_include_directories for
rocprofiler-sdk-fmt to match develop branch.

* [rccl] Remove orphaned rocSHMEM gitlink

Remove orphaned submodule reference that was introduced during a merge
but never had a corresponding .gitmodules entry, causing CI failures
with "fatal: no submodule mapping found in .gitmodules".

* [rocprofiler-sdk] Add HSA ABI version 0x09 support

Add ABI checks for HSA_AMD_EXT_API_TABLE_STEP_VERSION 0x09 which
introduces hsa_amd_counted_queue_acquire and hsa_amd_counted_queue_release
functions (added in rocr-runtime SWDEV-561708).

* [rocprofiler-sdk] Handle finalized status gracefully in buffer flush operations

This commit consolidates fixes for handling the finalization status during
buffer flush operations across the SDK.

Changes:
- Tool and samples: Handle ROCPROFILER_STATUS_ERROR_FINALIZED gracefully
  when flushing buffers, as this indicates buffers were already flushed
  during finalization (not an error condition)
- HSA handlers (queue.cpp, async_copy.cpp, hsa_barrier.cpp): Use > 0 check
  for fini_status to allow operations during finalization process
- buffer.cpp: Revert fini_status checks to use > 0 for consistency
- correlation_id.cpp: Add fini_status > 0 check with ROCP_TRACE logging
  to prevent correlation ID creation after finalization starts

Files modified:
- source/lib/rocprofiler-sdk-tool/tool.cpp
- tests/tools/json-tool.cpp
- source/lib/rocprofiler-sdk/tests/registration.cpp
- source/lib/rocprofiler-sdk/tests/roctx.cpp
- samples/api_buffered_tracing/client.cpp
- samples/counter_collection/buffered_client.cpp
- samples/counter_collection/device_counting_async_client.cpp
- samples/external_correlation_id_request/client.cpp
- samples/pc_sampling/client.cpp
- source/lib/rocprofiler-sdk/buffer.cpp
- source/lib/rocprofiler-sdk/context/correlation_id.cpp
- source/lib/rocprofiler-sdk/hsa/queue.cpp
- source/lib/rocprofiler-sdk/hsa/async_copy.cpp
- source/lib/rocprofiler-sdk/hsa/hsa_barrier.cpp

* [rocprofiler-sdk] Remove hsa_tool_hooks and simplify buffer flush handling

Remove the hsa_tool_hooks infrastructure and simplify buffer flush calls
in samples and tools. The ERROR_FINALIZED handling was overly complex
and the hsa_tool_hooks OnUnload synchronization is no longer needed.

Changes:
- Remove hsa_tool_hooks.cpp/hpp and related registration.cpp code
- Simplify buffer flush calls in samples to use direct ROCPROFILER_CALL
- Simplify buffer flush in tool.cpp and json-tool.cpp
- Remove ERROR_FINALIZED special handling from test files

Co-Authored-By: Claude <noreply@anthropic.com>

* [rocprofiler-sdk] Fix output_stream move semantics to null source pointers

The default move constructor and move assignment operator for
output_stream did not null out the source's pointers after the move.
This caused double-close when the moved-from temporary was destroyed,
leading to use-after-free crashes (SIGSEGV in std::ostream::sentry).

Co-Authored-By: Claude <noreply@anthropic.com>

* [rocprofiler-sdk] Improve Perfetto trace writer and sanitizer configuration

- generatePerfetto.cpp: Move output_stream into shared_state to prevent
  use-after-free race conditions during Perfetto callback execution
- run-ci.py: Simplify and consolidate sanitizer environment variable
  configuration for better maintainability

Co-Authored-By: Claude <noreply@anthropic.com>

* [rocprofiler-sdk] Revert run-ci.py changes that broke sanitizer suppressions

The previous changes removed MEMCHECK_SANITIZER_OPTIONS which is required
for CTest to properly pass suppression files to the sanitizers during
memcheck runs.

Co-Authored-By: Claude <noreply@anthropic.com>

* Revert "[rccl] Remove orphaned rocSHMEM gitlink"

This reverts commit 1ad21003941355658fff8114fa27768f11a948f7.

* [rocprofiler-sdk] Revert registration.cpp changes

Revert changes to registration.cpp to match develop branch.

Co-Authored-By: Claude <noreply@anthropic.com>

* [rocprofiler-sdk] Remove suppression file content printing from run-ci.py

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix output_stream move ctor/assignment operator

* Fix erroneous revert of registration.cpp

* Fix handling of fini status in correlation ID construction

* [rocprofiler-sdk] Fix OMPT segfault during finalization

Add nullptr checks in OMPT tracing code to handle the case where
correlation_tracing_service::construct() returns nullptr during
finalization. This fixes segfaults in openmp-target-sample and
tests.integration.execute.openmp-tools.

The correlation ID construction now returns nullptr when fini_status > 0,
but the OMPT callbacks were not checking for this, causing crashes when
dereferencing the null pointer during OpenMP runtime shutdown.

Changes:
- event_common(): Return nullptr early if correlation ID is null
- event(): Check for nullptr before calling sub_ref_count()
- ompt_task_create_callback(): Return early if correlation ID is null
- ompt_task_schedule_callback(): Return early if correlation ID is null

* [rocprofiler-sdk] Fix HSA API tracing segfault during finalization

Add nullptr check in hsa_api_impl::functor after correlation ID
construction. During finalization, correlation_service::construct()
returns nullptr, and without this check the code would dereference
the null pointer when accessing corr_id->internal.

This fixes the SEGV at address 0x000000000008 (null + 8 byte offset)
that occurs when HSA async event threads call hsa_signal_destroy
during runtime shutdown after finalization has started.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2026-01-27 13:27:54 -05:00
Mark Meserve 8760fb4976 attach: Formalize ROCAttach API (#1653)
* attach: Formalize ROCAttach API

- Make ROCAttach public with public headers
- Change detach to take a PID
  - attach and detach are now reentrant
- Cleanup of states and signal handling in ptrace session
- Fixes mixed up definition of ROCPROF_ATTACH_TOOL_LIBRARY
  - ROCPROF_ATTACH_TOOL_LIBRARY now always means the tool library loaded by the attachment target
  - ROCPROF_ATTACH_LIBRARY refers to the library used to perform attachment
- Add direct call of rocprof-attach
- Fix python library call of rocprof-attach
  - Function now named attach(), changed from main()

* attach: rocprof-compute ROCAttach updates

- Update to new library names
- Correct usage of C lib detach

* attach: add test for rocattach

- Disable ASan, TSan, and UBSan for the new parallel-attach test
- Lower log level for LSan tests, existing behavior from other tests

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2026-01-15 14:32:14 -06:00
Jonathan R. Madsen 7fcea905f3 [rocprofiler-sdk] Fix double-buffering emplace and flush synchronization (#2334)
* Fix buffer tracing synchronization lock

- PR #529 (in rocprofiler-sdk-internal) introduced waiting on the syncer flag when emplacing in a buffer to prevent the overwriting buffer records currently being processed in a buffer flush callback
- The above fix introduced a block on the both buffers when a buffer flush callback was being executed instead of a block on the buffer being flushed.

* Add rocpd tests for duplicate records

* Address code review comments
2026-01-06 06:06:18 -06:00
Giovanni Lenzi Baraldi 0e04fdd571 Workaround for SWDEV-559598. Enabling more thread trace tests. (#1336)
* Workaround for SWDEV-559598

* gfx11 fix
2025-11-27 20:03:38 +01:00
Ammar ELWazir ed42157c31 Fixing Code Object Data Race and Thread Safety & Adding validation test (#2014) 2025-11-26 20:28:52 -06:00
Jonathan R. Madsen a2288eb50b [rocprofiler-sdk] Install unit tests and helper functions for integration tests (#921)
* [rocprofiler-sdk] Install unit tests and helper functions for integration tests

* Fix rocprofiler-sdk-tests-target export

* Fix handling of cmake policy CMP0174

* Remove -vv from new pytest.ini files

* add unit tests and integration tests.

* add path to ci workflow.

* misc. fixes.

* pc sampling tests.

* bug fixes.

* pc sampling tests fix.

* misc.

* Update CMakeLists.txt

* Update rocprofiler_config_install_tests.cmake, correct license name

* fix units tests install issues.

* fix counters_def file path.

* fix bug, arg shifting.

* vendor pytest-cmake.

* cmake config fix. missing endfunction()

* disable tests, 1.rocprofv3-trace-hip-libs. 2.kernel-tracing. 3.external_correlation 4.rocpd.

* disable buffered-tracing test and remove pytest-cmake from requirements.txt.

* disable hip-graph-tracing test.

* fix building standalone tests to load rocprofiler-sdk cmake package first and then find rocprofiler_sdk_pytest module.

* addressed comments: 1.add local bin path to code cov workflow. 2.add to cmake prefix path local bin. 3.use ROCPROFILER_MEMCHECK_PRELOAD_ENV_VALUE 4.misc. fix

* enabled back tests api_buffered, external_correlation_id, hip-graph, kernel-tracing, rocpd, tracing-hip-in-libraries. and misc fixes(formating, extra fixtures for agent-index tests.)

* cpack to use llvm bin for .hsaco debug symbols.

* psdb tests fixes.

* EOL.

* misc. fixes and Disable api_buffered_tracing, external_correlation_id, hip-graph-tracing, kernel-tracing, rocpd, summary, tracing-hip-libraries, tracing-plus-counter-collection.

* fix incorrect cmakelists file.

* strip smallkernel.bin

* format.

* revert disabled tests commit.

* misc. fix in counter tests.

* misc.

* search codeobj unit test assets in curr bin and install bin.

* refactor newly added rocpd tests.

* modify tests for newly added hip-host-tracing.

* add LD LIB path to units, psdb is failing due to libs not being found.

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
Co-authored-by: Venkateshwar Reddy Kandula <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-21 08:06:56 -06:00
Jonathan R. Madsen 18d956eb9c [rocprofiler-sdk] Fix hip compiler table initialization after finalization (#1174)
* [rocprofiler-sdk] Fix hip compiler table initialization after finalization

- Resolves tickets
  - https://ontrack-internal.amd.com/browse/SWDEV-557219
  - https://ontrack-internal.amd.com/browse/SWDEV-505503

* Tweak log message

* Remove unsupported hip limit enums

- hipLimitDevRuntimeSyncDepth
- hipLimitDevRuntimePendingLaunchCount

* Update conftest.py

Co-authored-by: Mark Meserve <mark.meserve@amd.com>

* Update README.md

Co-authored-by: Mark Meserve <mark.meserve@amd.com>

* Update hip_host.cpp

---------

Co-authored-by: Mark Meserve <mark.meserve@amd.com>
2025-11-18 08:28:42 -08:00
systems-assistant[bot] 061948a5ec [rocpd] Adding merge and package submodules for rocpd (#164)
* adding ROCpd database merge

* adding ROCpd database merge concatenating all tables

* update merge script

  - copy all tables from files

* fix merge format

* Add package submodule, initial POC.  Need to refine

* Minor fixes and clean up duplicated code in package.py

* Revamp metadata layout, add wildcard and .rpdb parsing

* Add auto merge & package when > 5 DBs, add examples, don't use auto_merge when using sub-commands merge & package

* - Extend package/yaml inputs to all rocpd modules
- Improve handling more corner cases for bad input files when parsing input parameters (bad yaml files, bad .rpdb folder, folders as input)
- Changed to use UUID in merged filename instead of the time, in auto-merge algorithm

* Minor text fixes for consistancy between modules

* Add more wildcard support and add package, merge tests

* Make changes based on review suggestions

* Move parsing packages into importer.py, simplified adding required params to a function

* fix package test by flattening input list before processing

* Integrate merge.py changes from Jonathan to add name-collision checks, recreating indexes, foreign key check (disabled for now, due to processing time)

* Rework rocpd.<submodule>.{add_args,process_args}

- add_args function returns a functor which accepts input and args
- time_window functor returned from add_args automatically applies time windowing of input

* change merge&package limit to 1, merge should create data views

* Move files by default instead of making copies

- copying can be enabled by passing "copy=True" or --copy cmdline argument

* refactor package to make the logic cleaner, set merge limit back to 5

* Allow automerge-limit param to override limit, change default back to 1.  Tests updated to use query, much quicker

* Update --help instructions for package

---------

Co-authored-by: acanadas <acanadas@amd.com>
Co-authored-by: a-canadasruiz <Araceli.CanadasRuiz@amd.com>
Co-authored-by: Young Hui <young.hui@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-11-12 17:07:12 -05:00
Mark Meserve 11d12a82fb rocprofiler-sdk: attach: fix test permissions (#1528)
* attach: fix test permissions

- Test is now skipped if insufficient permissions detected
- Should fix test (for now) in Azure CI pipeline
- Add more extensive permission checking for the tests
- Add default parameters to prevent running rm -rf on a root directory
- Add use for unused LOG_LEVEL parameter
2025-11-10 09:15:50 -06:00
Gopesh Bhardwaj 06bf110c84 Adding counters support for strix halo (#1358)
* Adding counters support for strix halo

* Updated coutners list

* Added missing counter info

* Updated arch support
2025-11-07 00:45:03 -06:00
Benjamin Welton d496bcef18 Fix dimension mismatch for multi-GPU systems with identical architect… (#1440)
* Fix dimension mismatch for multi-GPU systems with identical architectures

This change addresses an issue where counter dimensions were incorrectly
shared across all GPU agents with the same architecture name, even when
those agents had different hardware configurations (e.g., different CU counts).

Changes:
- Updated getBlockDimensions() to accept agent ID instead of architecture name
- Made dimension cache agent-specific instead of architecture-specific
- Updated set_dimensions() in AST evaluation to use specific agent ID
- Modified all API functions to handle agent-specific dimension lookups
- Updated tests to work with agent-specific dimensions

This fix ensures that dimensions accurately reflect the actual hardware
configuration of each individual GPU agent, preventing dimension mismatches
in multi-GPU systems where GPUs share the same architecture but have
different physical configurations.

Counter ID Representation Changes:
- Modified counter_id encoding to include agent information in bits 37-32
- Agent logical_node_id is encoded as (value + 1) to ensure agent 0 is detectable
- Counter records internally store only 16-bit base metric IDs (bits 15-0)
- Tool reconstructs agent-encoded counter IDs from base metric ID & agent info
- Instance record counter_id field uses bitwise AND mask to extract base metric ID
  (counter_id.handle & 0xFFFF) to fit in 16-bit storage
- Output generators (CSV, JSON, Perfetto) use agent-encoded IDs for consistency
- Updated counter_config.cpp and metrics.cpp to extract base metric ID when needed
- All counter lookups now properly handle agent-encoded vs base metric IDs

This ensures counter IDs are consistent between metadata and output records while
maintaining compact storage in instance records.
2025-10-27 07:58:20 -07:00
Vladimir Indic 920b33c0b9 PCS: Temporarily Masking Trap Handler Latency (#1109)
- Temporarily, masking out the trap handler latency, by detecting
untagged error samples.
- Disabling checks for the number of invalid samples.
2025-10-22 14:18:08 +02:00
Jonathan R. Madsen 4cca398b56 [rocprofiler-sdk] Update rocprofiler-sdk CONTRIBUTING.md (#1371) 2025-10-20 21:46:24 -05:00
itrowbri e7a26594b7 [rocprofiler-sdk] Fix Stream ID Error for Attachment (#1142)
* Changed stream error warning, remove regex search from attach execute test

* Formatting

* Revert accidental change

* Fix stream hang error due to grabbing same lock twice

* Updated add stream code, need to update tests

* Update attachment tests to use streams, threads, and multiple devices

* Update tests and fix stream issues

* Updated error messages to be more explicit, updated json to csv code in conftest to include streams and threads

* Formatting

* Add attachment label to attachment tests and update validation to fix errors

* Fix attach twice conftest

* Disabled thread san tests for attachment since they no longer work with bin file changes

* Updated for comment

* Added null check for getting attach status
2025-10-17 16:34:05 -05:00
systems-assistant[bot] 6b109c11c4 [rocprofv3] Reorganize rocprofv3.avail python package (#175)
* Reorganize rocprofv3 python package

adding Python version candidates

review fix

fix test

fix

remove extra line

fix the exception handle

fix Lint fail

fix installation

adding checks to check version format

disable test for address sanitizer

* review comments

* Removing extra lines

* fix format

* Add lib/python3/site-packages to PYTHONPATH in setup-env.sh

* rocprof-compute update rocprofv3 avail lib path

* Make rocprofv3 python binding build commands consistent with other python bindings

* fix cmake

* fix rocprof-compute

* revert cmake changes

* fix rocprofv3 avail python library

* fix cmake

* fix cmake

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Sriraksha Nagaraj <Sriraksha.Nagaraj@amd.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com>
Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-10-17 08:27:17 -07:00
Venkateshwar Reddy Kandula 952d1dabe2 [ROCProfiler-SDK][ROCR] HSA New API changes for HSA_AMD_EXT_API_TABLE_STEP_VERSION 8 (#1182)
* add new hsa ext api for version 8.

* use fmt instead of ostream.

* override rccl from therock

* Update rocprofiler-sdk-continuous_integration.yml

* Update rocprofiler-sdk-continuous_integration.yml

* Update rocprofiler-sdk-continuous_integration.yml

* enable rocr-build

* format

* disable att consecutive-kernels tests.

* Enable ROCR build in code coverage workflow

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
2025-10-06 13:09:39 -05:00
itrowbri abd6029603 [rocprofv3] MultiKernelDispatch ATT support (#774)
* Initial consecutive kernel WIP

* Updated logic after discussion, create context only when needed, change set of captured ids to dispatch_id_t type

* Updated to fix concurrency issues and revert kernel_iterations

* Add captured id in first lock capture

* Updated code to use wlock, added comments, removed some unecessary atomic

* Cleaned up, need to add test

* Add test to check that generated stats csv file is not empty

* Updated test to check if vector-ops kernels are being used

* Fix phase bug

* Updated for comments

* Flattened ATT logic a bit

* Fix incorrect if-statement

* Fix merge conflict
2025-09-23 20:19:27 -05:00
systems-assistant[bot] 1e9d8abbf6 [rocpd] Convert to perfetto does not display scratch_memory correctly - SWDEV-542550 (#168)
Add scratch memory to pftrace generated with rocpd

----

Co-authored-by: Marko Crnobrnja <Marko.Crnobrnja@amd.com>
Co-authored-by: Aleksei Tumakaev <atumakae@amd.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
2025-09-23 09:55:30 +02:00
itrowbri 9910705685 Disable validation tests when execution test is disabled (#1060) 2025-09-22 09:18:32 -05:00
systems-assistant[bot] 63a723a287 GFX12 PC Sampling support (#186)
The GFX12 host-trap PC sampling support in SDK and V3.
Introducing parser tests specific to GFX12.

Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>
2025-09-22 13:17:00 +02:00
systems-assistant[bot] 3f001b0305 [rocpd] Refactor to use python to convert rocpd to CSV + add CSV tests + remove old cpp implementation (#159)
* Write agent info to CSV

* Write kernel to CSV

* Write memory copy to CSV

* Write memory allocation to CSV

* Write hip api to CSV

* Write hsa api to CSV

* Write marker api to CSV

* Write counters to CSV

* Write scratch memory to CSV

* Write rccl api to CSV

* Write rocdecode api to CSV

* Write rocjpeg api to CSV

* Remove info_process joins

* Format agent id

* Compose full file name is sql writer function

* Add missing fields to kernel traces csv

* Rename vgpr_count to arch_vgpr_count

* Fix kernel name

* Skip empty query results

* Format csv.py

* Delete c++ CSV writer

* Add CSV header comparison test

* Fix comment spacing in csv.py

* Change ALLOC to ALLOCATE in memory allocation writer

* Do not append trace to agent info file name

* Revert changes for VGPR_Count

* Fix csv validation test

* Add sorting by guid

* Use EXISTS to check query results are not empty

* Merge API-specific queries

* Optimize regions query

* Column name mapping for agent info

* Pass config to sql writer

* Move agent id string building to a separate function

* add titled_headers argument

* Remove titled-columns argument

* Improvements for regions csv

* fix CSV validation test

* improve CSV validation test

* remove roctxMarkA from csv validation test

* fix capability field titles in agent info

* remove filter.py from query as that is still experimental

* Remove some aliases, now that query will auto-title the column headers

---------

Co-authored-by: Aleksei Tumakaev <atumakae@amd.com>
Co-authored-by: Young Hui <young.hui@amd.com>
Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
2025-09-19 10:15:57 -04:00
Mark Meserve bf49039005 [rocprofiler-sdk][rocprofiler-register] Initial Attachment Support (#316)
* attach: milestone: API tracing

- This pairs with another commit in rocprofiler-sdk to fully
  function
- Add ptrace entry points for tool attachment
- API tracing works at this commit
- Queue tracing not supported yet

* attach: cleanup

- Remove hardcode for loading of tool library
- Make invoke registration functions public again

* attach: proxy queue first draft

- Adds ability to trace with queues during attachment
- Must be paired with updated rocprofiler-sdk

* attach: prestore overhaul

- Must be paired with commit in rocprofiler-sdk

* attach: add dispatch table rework

- Register will load the prestore library and provide entrypoints to sdk

* attach: formatting and cleanup

* attach: revise dispatch table scheme

* attach: formatting

* attach: milestone: API tracing

- This change must be paired with a change in rocprofiler-register to
  fully function.
- API tracing works at this commit
- Queue tracing not supported yet

* attach: cleanup and comments

* attach: Formatting and crash fixes

* attach: add attach duration

- Add option attach-duration-msec for attachment

* Formatting + sglang hang fix via signal handling

* Changed FATAL_IF to DFATAL_IF for scratch_memory due to persistent crash when iterating queues

* attach: proxy queue first draft

- Adds ability to trace with queues during attachment
- Must be paired with updated rocprofiler-register

* Allow null agents for scratch output

* attach: improve queue library interface

- Significant changes to force exported interfaces back to C
- Fixes bug with unknown agents at attachment
- Code objects' names may still be incorrect

* attach: add code_object support

- Kernel traces will now have names and all other information for launches
- Add capture of hsa_executable to the queue library
- Various logging improvements

* attach: rename queue library to prestore

* attach: prestore overhaul

- Must be paired with commit from rocprofiler-register
- Massive overhaul of code organization in prestore library
  - Separates registrations for different object types
  - Sets up future changes for initialization

* attach: add prestore dispatch table

- Removes linkage to prestore library from sdk

* attach: cleanup

* attach: formatting

* attach: fix input prompt not appearing

* attach: fix component name in cmake

* attach: revert change to export level

* Make prestore API public

* attach: update sdk attachment library WIP

- This commit is NONFUNCTIONAL

- Changes around structure to remove classes
- Seperate C linkage where needed
- Still needs updates to register for correct usage

* attach: update register with dispatch table WIP
- This commit is NONFUNCTIONAL

- Changes rocprofiler_register to handle dispatch table from attach
  library.
- Still needs changes in SDK with dispatch table usage

* attach: dispatch table wip
- This commit is NONFUNCTIONAL

* attach: move attach component into core

* attach: rename to rocprofv3-attach

* attach: add callbacks for new queues and code objects

* attach: finish dispatch table implementation

- Fixes kernel tracing

* attach: add cmake variable for attachment support

* feat: Add --attach alias for rocprofv3 with comprehensive attachment tests

- Add `--attach` as an alias to existing `-p/--pid` functionality in rocprofv3.py
- Create comprehensive attachment test suite with CSV and JSON output validation:
- New attachment-test application for testing dynamic profiling scenarios
- Unified test script supporting both CSV and JSON output formats
- Pytest-based validation for kernel traces, memory copies, HSA API calls, and agent info
- Add CMake integration for automated attachment testing
- Support parameterized output directory and filename specification
- Implement proper environment setup for attachment queue registration

Tests verify successful attachment to running processes and capture of:
- Kernel dispatch traces with workgroup/grid dimensions
- Memory copy operations (H2D/D2H) with size validation
- HSA API call traces across multiple domains
- GPU/CPU agent information and capabilities

* Documentation Update

* attach: make attach script callable

* Added ROCPROFILER_REGISTER_ATTACHMENT_TOOL_LIB to remove hardcoded name

* attach: revert metrics library path changes

* Generic Attachment in Register (#942)

Remove tool references in register

* Add second param to attach call in rocprof register

* Add experimental reattachment support for ROCprofiler-SDK

This commit introduces experimental reattachment functionality allowing tools
to dynamically reattach to running processes with comprehensive design changes
to support multiple attach/detach cycles:

**Core Reattachment API:**
- Add rocprofiler_tool_configure_result_experimental_t with tool_reattach/tool_detach callbacks
- Add rocprofiler_call_client_reattach and rocprofiler_call_client_detach C exports
- Implement reattachment tracking in rocprofiler_register_attach to differentiate
initial attachment from reattachment cycles
- Add rocprofiler_register_invoke_reattach for handling reattachment requests

**Design Changes - Registration System Flow:**
The registration system now supports a dual-path initialization:

1. Initial Attachment Flow:
    - rocprofiler_register_attach() -> rocprofiler_register_invoke_all_registrations()
    - Full tool initialization with complete context setup
    - Sets prev_attached atomic flag to track state

2. Reattachment Flow:
    - rocprofiler_register_attach() detects prev_attached=true -> rocprofiler_register_invoke_reattach()
    - Bypasses full re-initialization, calls client reattach callbacks instead
    - Preserves existing contexts and buffers, only reactivates profiling services

**Design Changes - Tool Library Loading:**
Enhanced rocprofiler-register library loading with function pointer resolution:
- Extended rocp_set_api_table_data_t tuple to include reattach/detach function pointers
- Automatic symbol resolution for rocprofiler_call_client_reattach/detach functions
- Support for both LD_PRELOAD and dlopen scenarios with consistent callback availability

**Design Changes - Context Management:**
Introduced dual context systems for attachment scenarios:
- get_contexts() - Original contexts for standard tool initialization
- get_attach_contexts() - Separate context map for attachment-specific lifecycle
- attach_init() - Creates contexts for ALL buffer tracing services using existing buffers
- attach_start() - Selectively starts contexts based on configuration options
- attach_detach() - Cleanly stops and destroys attachment contexts

**Design Changes - Buffer Management:**
Added reset_tmp_file_buffer() template for clean reattachment state:
- Properly closes and removes old temporary files
- Deletes existing file_buffer instances to prevent stale file position tracking
- Creates fresh file_buffer instances for clean reattachment cycles
- Addresses core issue where file position metadata becomes stale between cycles

**Design Changes - Environment Variable Injection:**
Added ROCP_REGISTERED_TOOL_ATTACH environment variable:
- Distinguishes attachment-loaded tools from LD_PRELOAD scenarios
- Enables registration system to apply attachment-specific logic
- Helps tools adapt behavior for attachment vs standard initialization

**Attachment Context Management:**
- Add attach_init/attach_start/attach_detach functions for dynamic context lifecycle
- Add reset_tmp_file_buffer template for clean reattachment state management
- Implement get_attach_contexts() for tracking active attachment contexts

**Test Infrastructure:**
- Add projects/rocprofiler-sdk/tests/rocprofv3/reattach/ comprehensive test suite
- Include reattachment test scripts with unified attachment/detachment cycles
- Add validate.py with trace data validation for kernel, memory copy, HSA API, and agent info
- Add conftest.py for JSON and CSV data loading utilities

**Configuration Updates:**
- Update CMakeLists.txt to include reattachment tests in build system
- Add environment variable ROCP_REGISTERED_TOOL_ATTACH for attachment state tracking
- Enhance rocprofiler-register library loading with reattach/detach function resolution

**Flow Impact Analysis:**
This design enables robust multi-cycle attachment by:
1. Preventing duplicate initialization on reattachment
2. Maintaining separate context lifecycles for attachment vs standard operation
3. Ensuring clean temporary file state between attachment cycles
4. Providing tools with explicit reattach/detach callback hooks
5. Supporting both programmatic and environment-based tool configuration

The experimental nature allows for iteration on the API while establishing
the foundation for production-ready dynamic profiling capabilities.

* Fix misc clang-tidy warnings/errors

* CMake Option and Environment Variable Updates

- CMake: ROCPROFILER_REGISTER_ALWAYS_SUPPORT_ATTACH -> ROCPROFILER_REGISTER_BUILD_DEFAULT_ATTACHMENT
- Env: ROCPROFILER_REGISTER_ATTACHMENT_ENABLED ->

* Source reorganization

* Formatting + new lines at EOF

* Fix flake8 F841: local variable is assigned to but never used

* Update attachment test

- get rid of 5 second start delay
- add roctx

* Rework implementation

- Remove rocprofiler_tool_configure_result_experimental_t in lieu of rocprofiler_configure_attach
- Add <rocprofiler-sdk/experimental/registration.h>
- TODO: Update process_attachment.rst

* Handle re-attachment options

- inherit options from previous attachment
- check previous options do not modify data collection services

* Fix support for tools w/o rocprofiler_configure_attach

- fix segfault when rocprofiler_configure_attach does not exist
- fix naming convention for functions accepting attach dispatch table
- cleanup rocprofiler_configure_attach implementation in rocprofv3 tool

* attach: remove unknown agent handling

- Change was from earlier commit, no longer needed

* attach: add error for attaching without library loaded

* attach: revise version numbering

* attach: register header revisions

* attach: clang format register

* attach: formatting

* attach: fix build failure

- Remove cross dependency into rocprofiler-sdk, fixes build on some systems

* attach: revise register library detection

* Update rocprofiler-register and attach library

- formatting
- proper signature of register_functor for rocprofiler-sdk-attach library callback
- remove get_dispatch_registration_table()

* Bump rocprofiler-register version to 0.6.0 + AnyNewerVersion

* Fix output support for rocprofiler-sdk-tool

* Fix formatting

* Fix clang tidy errors

* Misc rocprofiler-sdk-attach fixes

* attach: add sigint handling to attach python

* tool README.md formatting

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

* Fix buffered output issue

* attach: add errors for tool attach

* CI Fixes

* Rework tests

* attach: improve library loading in rocprofv3 attach

* formatting

* Update tests to use pytest framework

* Fix test_attachment_hsa_api_trace

* attach: catch ctypes exceptions

* attach: fix leak in registration

* attach: fix sanitizer tests

* attach: fix sanitizer tests further

* attach: disable attach asan tests

* attach: disable ubsan test

* attach: fix permissions in installed test package

* attach: formatting

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
Co-authored-by: Tim Gu <Tim.Gu@amd.com>
Co-authored-by: Claude Code <claude@anthropic.com>
Co-authored-by: Benjamin Welton <bwelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-09-18 18:10:45 -05:00
Gopesh Bhardwaj dd44ae3295 [Palamida scan] SWDEV-553053 Adding missing copyrights information (#836)
* SWDEV-553053 Adding missing copyrights information
2025-09-10 23:44:27 -07:00
systems-assistant[bot] b60c0ceddd [rocprofv3] Unconditionally collect stream and kernel rename data in rocprofv3 for rocpd (#171)
* Remove config checks for stream and kernel rename data collection

* Updated csv generation to check if kernel rename is on before calling get_kernel_name

* Update metadata to use kernel_rename bool argument

* Formatting + unconditionally store kernel name in rocpd

* Readded kernel rename parameter after rebase

* Fixed rebase conflicts

* Updated comment in line with github comments

* Added check in rocpd csv.cpp to output kernel name if region name is empty

* Add test for kernel rename

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
2025-09-10 16:03:15 -05:00
Giovanni Lenzi Baraldi 9849073836 SWDEV-540648: Adding realtime clock to v3 tool. Update decoder header. (#666)
* SWDEV-540648: Adding realtime clock to v3 tool. Update header for decoder.

* Adding tests

* Review comments

* Review comment
2025-09-10 12:39:27 +02:00
Benjamin Welton 53dcae49c6 [rocprofiler-sdk] Disable multiplex tests (#876) 2025-09-09 13:18:10 -05:00
systems-assistant[bot] 2cfedef6b6 [CI] Increase rocDecode and rocJPEG Code Coverage (#183)
* Increase rocDecode code coverage and add version check

* Update rocJPEG tests

* Fix rocJPEG tests

* Enable building tests/samples in rocm release compat workflow

* Readded rocJPEG test skips

* formatting

* Adding ROCm libraries for the code-coverage job

* Added return value check for error message and updated compatability to enable tests

* Disable rocm_release_compatibility samples and tests until openmp issue is resolved

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
2025-09-03 19:20:11 +05:30
itrowbri 4d98a0169f Handle special cases when stream value is hipStreamLegacy (0x01) or hipStreamPerThread (0x02) (#343)
* Updated stream code to handle special cases when stream value is 0x01 or 0x02

* Removed extra definitions and updated tests to account for special case

* Modified stream.cpp so that each thread assigned a unique stream ID when hipStreamPerThread is used as stream value. Modified tests to check that threads are assigned unique, repeated values when hipStreamPerThread is called

* Updated idx_offset, stream_map, and thread counter to be in one struct.

* Update stream.cpp to only use add_stream() and update tests for seperate unit test for hipStreamPerThread

* Remove unecessary comment

* Removed unecessary line

* Updated tests and stream.cpp to update stream ID correctly

* Updated test structure
2025-08-27 20:04:13 -05:00
Giovanni Lenzi Baraldi cb77f5af5c Adding new trace decoder record types and new ATT parameters (#195)
* Adding new trace decoder record types and new ATT parameters

* Add compatiblity with decoder 0.1.2

* Added RT

* Format

* Add logging to sdata values

* Review comment

* Review comments

* Update projects/rocprofiler-sdk/source/include/rocprofiler-sdk/experimental/thread-trace/trace_decoder_types.h
2025-08-12 14:31:12 +02:00
Welton, Benjamin ea4e6dc572 Fix hsa_code_object_app test deadlock with profiler serialization (#577)
Problem with original test:
- Created circular dependencies between queues:
  * Queue1: Kernel A → Barrier(waits for signal_2) → Kernel C
  * Queue2: Barrier(waits for signal_1) → Kernel B → sets signal_2
- With strict "one kernel at a time" serialization, this created deadlock:
  * Queue1 executed Kernel A, then blocked on barrier waiting for signal_2
  * Serializer switched to Queue2, but Queue2 was blocked waiting for signal_1
  * Neither queue could proceed: Queue1 needed Queue2's Kernel B to complete,
    but Queue2 couldn't start until Queue1 finished completely
- Test would hang indefinitely at hsa_signal_wait_relaxed() for signal_2

Solution implemented:
- Reordered packet submission to eliminate circular dependencies
- Ensured signal producers execute before consumers need them:
  * Kernel A produces signal_1 before Queue2's barrier needs it
  * Kernel B produces signal_2 before Queue1's continuation needs it
- Dependencies now flow forward without cycles, allowing serializer progress

Refactoring changes:
- Extract common functionality into helper functions:
  * create_completion_signal() for signal creation
  * create_queue() for queue creation
  * submit_kernel_packet() for kernel dispatch packets
  * submit_barrier_packet() for barrier packets
- Add comprehensive documentation explaining expected execution pattern
- Simplify main() function making the dependency flow more readable

Co-authored-by: Benjamin Welton <bewelton@amd.com>

[ROCm/rocprofiler-sdk commit: b5e1645a14]
2025-08-05 17:29:07 -07:00
Bhardwaj, Gopesh abc105f289 Fix missing include file in rocJPEG test app (#525)
- Fixes compilation error for rocJPEG test app
- SWDEV-544094

[ROCm/rocprofiler-sdk commit: 5f422c1993]
2025-08-05 11:50:01 -05:00
Baraldi, Giovanni 6a6b16be93 Adding GPU index as a parameter for ATT (#547)
* Adding GPU index as a parameter for ATT

* Tidy fix

* Using tokenize

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

* Adding error logging. Using idx instead of id.

---------

Co-authored-by: Giovanni <gbaraldi@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

[ROCm/rocprofiler-sdk commit: fd6f96ffb5]
2025-08-04 23:15:50 +02:00
Trowbridge, Ian 6b2a4fcfc2 Revert memory allocation CSV output file header and update tests (#532)
* Reverted header and field location for csv memory allocation and updated tests

* Updated example csv file and made small update

[ROCm/rocprofiler-sdk commit: 533a8329d8]
2025-08-04 13:22:27 -05:00
Baraldi, Giovanni b30a084da2 Removing ATT buffer size limitation (#534)
* Removing SQTT buffer size limitation

* Update source/lib/rocprofiler-sdk/thread_trace/core.cpp

* Added testing for buffer size. Formatting.

* Add test as unstable

* Increase default buffer size

* Apply suggestions from code review

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

* Fix typo from code review

* Update tests/thread-trace/agent.cpp

---------

Co-authored-by: Giovanni <gbaraldi@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

[ROCm/rocprofiler-sdk commit: 1ba08cd4df]
2025-07-29 22:47:40 +02:00
Indic, Vladimir fe6cfc9cc8 PCS test: cast agent name to str (#546)
* PCS test: cast agent name to str

[ROCm/rocprofiler-sdk commit: 2d8936362e]
2025-07-29 12:11:15 -07:00
Hui, Young 68ae6cf65f [rocpd] Adding summary module to generate summaries from rocpd database + query submodule + rocpd command-line tools (#488)
* adding summary.py to generate tmp <category_region>_summary views

* migrating CSV summary to SDK method of writing CSVs

  - Add domain_view to summary.py
  - omit the C++ code of writing CSV because it gets revered later anyway

* Add summary subparser and write_sql_view_to_csv function

* adding all <>_summary views generation to summary.py

* add summary_per_rank feature

* add --summary-per-rank

* reconstruct generate_summary_view and create_domain_view

-introduce by_rank

* remove sqr and variance in summary views

* use RocpdImportData instead of connection

* two fixes on summary.py

--modify the generate_summary_view function to return a tuple with view name and sql code

add if_not_exits parameter to generete_summary_view

* Refactor summary.py to allow output path and filename args, and apply time_window
- clean up summary table column headers
- only generate by-rank views if that param is specified

* Add ProcessID to Hostname output and csv, so users can identify the system in the by-rank summaries

* Summary.py, just add hostname to by-rank summaries, instead of creating mapping table

* Summary - migrate csv writer to pandas, for more future flexibility

* Adding a few simple tests for summary.py

* Linting fixes

* add region_categories to summary options

  -  Automatically retrieve region categories from the database if argument is None

* add backticks for view_names

* fix tests after rebase

* Made code review changes
- fixed whitespace in CMakelists.txt
- adding query.py module & subparser in __main__.py
- refactor summary function to return query
- used query.py to output csv
- used query.py to also output summary to console
- provided new command line options to select summary output to csv or console

* Made fix to jinja template in query.py, as suggested by copilot

* Consolidated output calls to query in export_view function based on feedback
- refactored: helpers, query functions, create view functions
- extended formats to include what query supports (md, html, pdf, json)
- added json format to query, and changed orient=records
- adding jinja2 and reportlab to requirements.txt

* Add version_info for rocpd and roctx

* Add rocpd commandline tool

* Add executable permissions to source/bin/rocpd.py

* Removed rocpd2query, and cleaned up --help examples

---------

Co-authored-by: acanadas <acanadas@amd.com>
Co-authored-by: Jin Tao <jintao12@amd.com>
Co-authored-by: a-canadasruiz <Araceli.CanadasRuiz@amd.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: 3954cedd25]
2025-07-24 16:12:06 -05:00
Baraldi, Giovanni 4ca156e572 Thread trace and Trace Decoder API tests and samples (#416)
* Adding test and samples to decoder

* Fix sample

* Formatting

* Fix multi test

* Disable sample

* Fix tests

* Format

* Version fix

* Locking the decoder

* Add atomic

* Review comments

* Format

* Adding readme

* merge conflict and adding PCS+ATT test

* Review comments

* Properly disable PCS test

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

* Adding back env var test

* Name fix

* Preload sample

* Addressing review comments

* Update docs

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: e898079a13]
2025-07-22 20:08:12 -05:00
Madsen, Jonathan 990946e956 [SDK] Fix null handles (#474)
* Fix null handle

- use .handle=0, not .handle=numeric_limits<>::max()

* Update lib.common.hasher

* Fix ROCPROFILER_CONTEXT_NONE

* Use context operator==

* Update CHANGELOG

* Updated null handle for scratch memory and changed allocation test so that free ops account for null agent

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>

[ROCm/rocprofiler-sdk commit: 4d6a61f5e5]
2025-07-18 12:05:52 -05:00
Nagaraj, Sriraksha c8912d2bb6 [rocprofv3-avail] Documentation update and column formatting (#447)
* addressing issues

* doc fix

* test fix

* fix

* fix formatting issue and doc update

* fix column size

* fix

* fix formatting in output

* tests fix

* test fix

* add new line

* add new line

* fix new line

* fixing typo in using-rocprofv3-avail.rst

[ROCm/rocprofiler-sdk commit: 3aaffc42da]
2025-07-10 11:41:12 -05:00
U, Srihari 7243889d6a Add perfetto support for scratch memory (#303)
* Add perfetto support for scratch memory

* Updated tests and docs.

* Update docs data

* Added underflow check

* Record all free events to 0 bytes

* Add format

* Address review comment

* updated tests for scratch memory

* update scratch-memory tests.

[ROCm/rocprofiler-sdk commit: 6f2a5a9646]
2025-07-09 21:05:45 +05:30
Welton, Benjamin fbf17a42d4 [SWDEV-516561][1/2] Add MARKER_RANGE_EXTENT to capture ROCTX ranges (#363)
* [SWDEV-516561][1/2] Add MARKER_RANGE_EXTENT to capture ROCTX ranges

Range extent to capture all work between roctxpush/pop operations. Entry callback takes place during roxtxpush and exit callback takes place in roctxpop. This is primarily to allow us to keep an ancestor id on the ancestor stack such that all operations that take place within the push/pop context can be annotated as being apart of this range. With the current setup (where push and pop are two separate operations that need to be combined externally), we cannot keep an ancestor id on the stack and thus cannot tie tracing events to particular ranges.

Correlation id information is inherited from the push operation. Ancestor id needs to be added in a future commit that also outputs this ancestor to CSV.

Output:

```
[ctest] {'size': 64, 'kind': 7, 'operation': 1, 'correlation_id': {'internal': 1525, 'external': 0, 'ancestor': 1524}, 'start_timestamp': 2932551479402642, 'end_timestamp': 2932551491178449, 'thread_id': 3254861}
[ctest] {'size': 64, 'kind': 8, 'operation': 2, 'correlation_id': {'internal': 1525, 'external': 0, 'ancestor': 1524}, 'start_timestamp': 2932551479405878, 'end_timestamp': 2932551491181214, 'thread_id': 3254861}
```

Note: Kind 8 = range extent op.

* Merge fix

Revert several changes

source/lib/rocprofiler-sdk/marker/range_marker.*

- separate out range marker implementation for standard marker implementation

Update public API with marker core range

Support marker core range in sdk (source/lib/rocprofiler-sdk)

Transition rocprofiler-sdk-tool and output lib to use marker core range

Misc fixes for tests

Fix logic in lib/output/generate{CSV,Stats}.cpp

Update tests/rocprofv3/tracing-hip-in-libraries (marker validation)

Fix test_otf2_data

* Test fixes

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>

[ROCm/rocprofiler-sdk commit: 2c4e20b951]
2025-07-08 23:41:22 -07:00
Madsen, Jonathan 145944dc30 [CI] Disable conversion script validation (#492)
- pytest fails when you selectively disable all the tests in the suite

[ROCm/rocprofiler-sdk commit: 99e262ca4a]
2025-06-30 17:02:53 -05:00
Madsen, Jonathan fb51f0e5d4 [CI] Disable other unstable tests (#491)
Disable other unstable tests

- validation test_validate_counter_collection_pmc1 for conversion script
- increase timeout for tests/rocprofv3/pc-sampling execution phase

[ROCm/rocprofiler-sdk commit: f0fe04b95e]
2025-06-30 15:53:30 -05:00
Madsen, Jonathan fff7146ab8 [CI] Disable other unstable tests (#490)
* Disable other unstable tests

* Disable validating test_total_runtime in kernel-tracing

* The disabled tests will be stabilized and re-enabled by ROCm 7.0.1 or ROCm 7.1 

[ROCm/rocprofiler-sdk commit: 69f71b8097]
2025-06-30 15:42:38 -05:00
Madsen, Jonathan 839c07c4aa [CI] Testing stability (#486)
* [CI] Testing Stability

- CMake option ROCPROFILER_DISABLE_UNSTABLE_CTESTS
  - used for tests which periodically fail around 1 out of every 10 runs
  - set to ON while instability remains, this needs to set to OFF in ROCm 7.1 or, ideally, ROCm 7.0.1
- Use FIXTURES_SETUP and FIXTURES_REQUIRED for some tests
- replace "threw an exception" with "${ROCPROFILER_DEFAULT_FAIL_REGEX}" for misc FAIL_REGULAR_EXPRESSIONS

* Remove contents of all EXCLUDE_{TESTS,LABEL}_REGEX from CI workflow

* Disable patch git step in code-coverage run

* Tweak spin time of reproducible runtime

* Removed patch git step in code-coverage run

* Update ROCPROFILER_DEFAULT_FAIL_REGEX

* Mark test-counter-collection tests as unstable

- add fixtures setup/required

* Remove ATTACHED_FILES_ON_FAIL

- CDash doesn't store enable downloading these properly anyway

* Relax collection-period fuzzing window

* Disable unstable collection-period test

- too unstable

* formatting

* Disable unstable device_counting_service_test.async_counters

* Suppress perfetto internal data race errors

* Switch code-coverage CI jobs to mi300 runner

* Timeout increases

* rocprofv3-test-rocpd updates

- add fixtures
- switch executable
- redefine input/output paths

* Revert code-coverage job to mi300a runner

* Update rocprofv3-test-rocpd-execute-multiproc

- reduce problem size

* disable multiproc rocpd

* Split code-coverage into separate workflow

- network issues cause this job to fail frequently
- when in a separate workflow, it can be restarted easily

* Fixtures for rocprofv3-test-trace-hip-in-libraries

* Disable unstable device_counting_service_test.sync_counters

* Potential fix for code scanning alert no. 171: Workflow does not contain permissions

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Switch code-coverage to run on rocprof-azure

- mi300a EMU runner set is unstable (network issues)

* tests/rocprofv3/pc-sampling SKIP_REGULAR_EXPRESSION

* Update rocprofv3-test-list-avail-trace-execute

- reduce log level and increase timeout

* rocprofv3: Prevent recursive call to rocprofv3_error_signal_handler + log chaining

* rocprofv3: Use ROCP_ERROR + std::exit instead of ROCP_FATAL

- should help with SKIP_REGULAR_EXPRESSION

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

[ROCm/rocprofiler-sdk commit: 640ca55ac0]
2025-06-30 15:07:37 -05:00
Kandula, Venkateshwar reddy 8f5d00ca5d [SDK] Add gfx950 targets for tests and samples (#399)
* add gfx950 targets.

* add gfx950 targets to ci workflows.

* Format.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Vaddireddy, Sushma <Sushma.Vaddireddy@amd.com>

[ROCm/rocprofiler-sdk commit: 375866383b]
2025-06-26 14:25:08 -05:00
Kuricheti, Mythreya bde07e7baa [SDK] KFD new events API (#321)
* Remove page-migration

* Add KFD events API

* Address review comments

* Move assert checks

* Update enum-string utils

* Update codeowners

* Update KFD header

* Add perfetto category

[ROCm/rocprofiler-sdk commit: 8a461afe20]
2025-06-26 13:28:45 -05:00
Trowbridge, Ian 0a9849a5cf Add copyright disclaimer for scan (#453)
[ROCm/rocprofiler-sdk commit: 0ab43420ba]
2025-06-24 16:25:26 -05:00
Elwazir, Ammar 4d79e1df30 [SDK] Support CMake option for using internal RCCL tracing + (temporary) enable in CI (#457)
* Temp: disable RCCL tracing

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Adding option to disable rccl tracing from CMake

* Update codeql.yml

* Misc updates

- ROCPROFILER_BUILD_RCCL -> ROCPROFILER_INTERNAL_RCCL_API_TRACE
- env.EXTRA_TEMP_CMAKE_OPTIONS -> env.GLOBAL_CMAKE_OPTIONS
- add (advanced) option ROCPROFILER_INTERNAL_RCCL_API_TRACE

* Fix rocprofiler::sdk::get_enum_label

- missing enum labels for HIP_RUNTIME_API_TABLE_STEP_VERSION > 8

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

- improve various aspect of cmake -- particularly echoing where attdecoder_LIBRARY was found

* Use CMAKE_MESSAGE_INDENT

- add prefix to cmake messages to help indicate where messages are coming from
- make find_package(Python3 ...) QUIET for bindings

* Fix rocprofiler::sdk::get_enum_label

- handle HSA_AMD_EXT_API_TABLE_MAJOR_VERSION

* Fix rocprofv3 message for att library path

* Fix tests/rocprofv3/advanced-thread-trace/att_input.yml config

* Fix rocprofv3 check_att_capability + soversion/version library resolution

- Account for ROCPROF_ATT_LIBRARY_PATH in env in check_att_capability
- Add resolve_library_path
  - supports resolution of library names to SOVERSION and VERSION paths

* Fix python linting error (unused import)

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: aeb1621c2b]
2025-06-17 08:32:54 -05:00
Trowbridge, Ian 4f7d6d7f1c Removed roc_video_dec files due to reoccurring errors (#439)
* Removed roc_video_dec files due to reoccurring errors

* Added back rocDecCreateVideoParser

* Fix remaining errors with rocdecode.cpp

[ROCm/rocprofiler-sdk commit: 72b97a8b7e]
2025-06-12 10:49:08 -05:00