develop
58 İşleme
| Yazar | SHA1 | Mesaj | Tarih | |
|---|---|---|---|---|
|
|
1517a398bf |
[rocprofiler-sdk] Buffer finalization fixes and HSA ABI 0x09 support (#2318)
* [rocprofiler-sdk] Fix buffer flush ordering and sanitizer CI improvements Buffer Pool Design ------------------ Replace the fixed array-based double buffer with a dynamic pool design to fix race conditions that caused "internal correlation id was retired prematurely" errors. The original design had a race where flush callbacks could be delivered out-of-order: when buffer 0 fills and begins flushing, writes go to buffer 1. If buffer 1 fills before buffer 0's flush completes, the buffer index wraps back to 0 (which may still be flushing). Independent flush tasks submitted to the thread pool can complete out of order. The new pool design: - Uses a std::deque of buffer instances that grows as needed - Allocates buffers from the pool when the current buffer needs to flush - Serializes flushes with a mutex to ensure FIFO callback ordering - Returns buffers to the pool after flush completion - Eliminates the race between buffer selection and write operations New Unit Tests -------------- - buffer_correlation_ordering.cpp: Tests that API records are always delivered before their corresponding retirement records - buffer_ordering_stress.cpp: Stress tests buffer flush ordering under high contention with multiple threads rapidly filling buffers HSA Tool Hooks -------------- Added hsa_tool_hooks.cpp/hpp to register an HSA OnUnload callback that waits for pending flush tasks before tool finalization, preventing "retired prematurely" errors during HSA shutdown. Sanitizer Improvements ---------------------- - LSAN: Set fast_unwind_on_malloc=1 to prevent deadlock in libgcc unwinder - LSAN: Added suppressions for external tools (liblzma, liblsan, seq, strdup) - TSAN: Added suppression for false positive on C++11 thread-safe static initialization in create_write_functor - ASAN/UBSAN: Added patterns for known issues in HSA runtime, HIP, perfetto - Disabled attachment tests for sanitizers due to library preloading issues Other Fixes ----------- - Thread-trace agent test: Use heap-allocated callback state - Correlation ID: Refactored reference counting and finalization ordering * [rocprofiler-sdk] Revert buffer pool design changes Revert buffer.cpp and buffer.hpp to the original double-buffer design from develop branch. The pool-based redesign introduced concerns about: - Signal safety (mutex vs atomic_flag) - API changes (flush() return type) - Complexity of the new design This revert removes: - Dynamic buffer pool with std::deque - std::mutex/condition_variable synchronization - buffer_correlation_ordering.cpp test - buffer_ordering_stress.cpp test The underlying buffer flush ordering issue will need to be addressed with a different approach that preserves the original API and synchronization characteristics. * [rocprofiler-sdk] Consistent fini_status checks to prevent correlation ID creation during finalization - Revert TOCTOU CAS loop change in sub_ref_count() - not needed with consistent checks - Add fini_status check in correlation_tracing_service::construct() with ROCP_CI_LOG warning - Add nullptr checks at all construct() call sites (queue.cpp, async_copy.cpp, memory_allocation.cpp) - Change all 'get_fini_status() > 0' to '!= 0' for consistent behavior: - hsa/queue.cpp (lines 105, 210) - hsa/async_copy.cpp (line 344) - hsa/hsa_barrier.cpp (line 43) - buffer.cpp (lines 107, 138, 185) This ensures no correlation IDs are created once finalization starts (fini_status != 0), preventing races between finalization and ongoing tracing operations. * [rocprofiler-sdk] Replace arrival-order checks with timestamp-based temporal validation Buffer records are not guaranteed to arrive in any specific order. Tests and samples should use timestamps for temporal ordering validation instead. Changes: - samples/external_correlation_id_request: Replace 'retired prematurely' arrival order check with timestamp-based validation that retirement timestamp >= max(end_timestamps) for records with the same correlation ID - tests/external_correlation.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check - tests/registration.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check - tests/roctx.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check Correlation IDs are not guaranteed to be monotonically increasing when records are sorted by timestamp. Temporal ordering should be validated using the timestamp fields in each record. * [rocprofiler-sdk] Revert external/CMakeLists.txt SYSTEM keyword removal Restore the SYSTEM keyword to target_include_directories for rocprofiler-sdk-fmt to match develop branch. * [rccl] Remove orphaned rocSHMEM gitlink Remove orphaned submodule reference that was introduced during a merge but never had a corresponding .gitmodules entry, causing CI failures with "fatal: no submodule mapping found in .gitmodules". * [rocprofiler-sdk] Add HSA ABI version 0x09 support Add ABI checks for HSA_AMD_EXT_API_TABLE_STEP_VERSION 0x09 which introduces hsa_amd_counted_queue_acquire and hsa_amd_counted_queue_release functions (added in rocr-runtime SWDEV-561708). * [rocprofiler-sdk] Handle finalized status gracefully in buffer flush operations This commit consolidates fixes for handling the finalization status during buffer flush operations across the SDK. Changes: - Tool and samples: Handle ROCPROFILER_STATUS_ERROR_FINALIZED gracefully when flushing buffers, as this indicates buffers were already flushed during finalization (not an error condition) - HSA handlers (queue.cpp, async_copy.cpp, hsa_barrier.cpp): Use > 0 check for fini_status to allow operations during finalization process - buffer.cpp: Revert fini_status checks to use > 0 for consistency - correlation_id.cpp: Add fini_status > 0 check with ROCP_TRACE logging to prevent correlation ID creation after finalization starts Files modified: - source/lib/rocprofiler-sdk-tool/tool.cpp - tests/tools/json-tool.cpp - source/lib/rocprofiler-sdk/tests/registration.cpp - source/lib/rocprofiler-sdk/tests/roctx.cpp - samples/api_buffered_tracing/client.cpp - samples/counter_collection/buffered_client.cpp - samples/counter_collection/device_counting_async_client.cpp - samples/external_correlation_id_request/client.cpp - samples/pc_sampling/client.cpp - source/lib/rocprofiler-sdk/buffer.cpp - source/lib/rocprofiler-sdk/context/correlation_id.cpp - source/lib/rocprofiler-sdk/hsa/queue.cpp - source/lib/rocprofiler-sdk/hsa/async_copy.cpp - source/lib/rocprofiler-sdk/hsa/hsa_barrier.cpp * [rocprofiler-sdk] Remove hsa_tool_hooks and simplify buffer flush handling Remove the hsa_tool_hooks infrastructure and simplify buffer flush calls in samples and tools. The ERROR_FINALIZED handling was overly complex and the hsa_tool_hooks OnUnload synchronization is no longer needed. Changes: - Remove hsa_tool_hooks.cpp/hpp and related registration.cpp code - Simplify buffer flush calls in samples to use direct ROCPROFILER_CALL - Simplify buffer flush in tool.cpp and json-tool.cpp - Remove ERROR_FINALIZED special handling from test files Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Fix output_stream move semantics to null source pointers The default move constructor and move assignment operator for output_stream did not null out the source's pointers after the move. This caused double-close when the moved-from temporary was destroyed, leading to use-after-free crashes (SIGSEGV in std::ostream::sentry). Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Improve Perfetto trace writer and sanitizer configuration - generatePerfetto.cpp: Move output_stream into shared_state to prevent use-after-free race conditions during Perfetto callback execution - run-ci.py: Simplify and consolidate sanitizer environment variable configuration for better maintainability Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Revert run-ci.py changes that broke sanitizer suppressions The previous changes removed MEMCHECK_SANITIZER_OPTIONS which is required for CTest to properly pass suppression files to the sanitizers during memcheck runs. Co-Authored-By: Claude <noreply@anthropic.com> * Revert "[rccl] Remove orphaned rocSHMEM gitlink" This reverts commit 1ad21003941355658fff8114fa27768f11a948f7. * [rocprofiler-sdk] Revert registration.cpp changes Revert changes to registration.cpp to match develop branch. Co-Authored-By: Claude <noreply@anthropic.com> * [rocprofiler-sdk] Remove suppression file content printing from run-ci.py Co-Authored-By: Claude <noreply@anthropic.com> * Fix output_stream move ctor/assignment operator * Fix erroneous revert of registration.cpp * Fix handling of fini status in correlation ID construction * [rocprofiler-sdk] Fix OMPT segfault during finalization Add nullptr checks in OMPT tracing code to handle the case where correlation_tracing_service::construct() returns nullptr during finalization. This fixes segfaults in openmp-target-sample and tests.integration.execute.openmp-tools. The correlation ID construction now returns nullptr when fini_status > 0, but the OMPT callbacks were not checking for this, causing crashes when dereferencing the null pointer during OpenMP runtime shutdown. Changes: - event_common(): Return nullptr early if correlation ID is null - event(): Check for nullptr before calling sub_ref_count() - ompt_task_create_callback(): Return early if correlation ID is null - ompt_task_schedule_callback(): Return early if correlation ID is null * [rocprofiler-sdk] Fix HSA API tracing segfault during finalization Add nullptr check in hsa_api_impl::functor after correlation ID construction. During finalization, correlation_service::construct() returns nullptr, and without this check the code would dereference the null pointer when accessing corr_id->internal. This fixes the SEGV at address 0x000000000008 (null + 8 byte offset) that occurs when HSA async event threads call hsa_signal_destroy during runtime shutdown after finalization has started. --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
a2288eb50b |
[rocprofiler-sdk] Install unit tests and helper functions for integration tests (#921)
* [rocprofiler-sdk] Install unit tests and helper functions for integration tests * Fix rocprofiler-sdk-tests-target export * Fix handling of cmake policy CMP0174 * Remove -vv from new pytest.ini files * add unit tests and integration tests. * add path to ci workflow. * misc. fixes. * pc sampling tests. * bug fixes. * pc sampling tests fix. * misc. * Update CMakeLists.txt * Update rocprofiler_config_install_tests.cmake, correct license name * fix units tests install issues. * fix counters_def file path. * fix bug, arg shifting. * vendor pytest-cmake. * cmake config fix. missing endfunction() * disable tests, 1.rocprofv3-trace-hip-libs. 2.kernel-tracing. 3.external_correlation 4.rocpd. * disable buffered-tracing test and remove pytest-cmake from requirements.txt. * disable hip-graph-tracing test. * fix building standalone tests to load rocprofiler-sdk cmake package first and then find rocprofiler_sdk_pytest module. * addressed comments: 1.add local bin path to code cov workflow. 2.add to cmake prefix path local bin. 3.use ROCPROFILER_MEMCHECK_PRELOAD_ENV_VALUE 4.misc. fix * enabled back tests api_buffered, external_correlation_id, hip-graph, kernel-tracing, rocpd, tracing-hip-in-libraries. and misc fixes(formating, extra fixtures for agent-index tests.) * cpack to use llvm bin for .hsaco debug symbols. * psdb tests fixes. * EOL. * misc. fixes and Disable api_buffered_tracing, external_correlation_id, hip-graph-tracing, kernel-tracing, rocpd, summary, tracing-hip-libraries, tracing-plus-counter-collection. * fix incorrect cmakelists file. * strip smallkernel.bin * format. * revert disabled tests commit. * misc. fix in counter tests. * misc. * search codeobj unit test assets in curr bin and install bin. * refactor newly added rocpd tests. * modify tests for newly added hip-host-tracing. * add LD LIB path to units, psdb is failing due to libs not being found. --------- Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> Co-authored-by: Venkateshwar Reddy Kandula <Venkateshwarreddy.Kandula@amd.com> Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com> |
||
|
|
fbf17a42d4 |
[SWDEV-516561][1/2] Add MARKER_RANGE_EXTENT to capture ROCTX ranges (#363)
* [SWDEV-516561][1/2] Add MARKER_RANGE_EXTENT to capture ROCTX ranges
Range extent to capture all work between roctxpush/pop operations. Entry callback takes place during roxtxpush and exit callback takes place in roctxpop. This is primarily to allow us to keep an ancestor id on the ancestor stack such that all operations that take place within the push/pop context can be annotated as being apart of this range. With the current setup (where push and pop are two separate operations that need to be combined externally), we cannot keep an ancestor id on the stack and thus cannot tie tracing events to particular ranges.
Correlation id information is inherited from the push operation. Ancestor id needs to be added in a future commit that also outputs this ancestor to CSV.
Output:
```
[ctest] {'size': 64, 'kind': 7, 'operation': 1, 'correlation_id': {'internal': 1525, 'external': 0, 'ancestor': 1524}, 'start_timestamp': 2932551479402642, 'end_timestamp': 2932551491178449, 'thread_id': 3254861}
[ctest] {'size': 64, 'kind': 8, 'operation': 2, 'correlation_id': {'internal': 1525, 'external': 0, 'ancestor': 1524}, 'start_timestamp': 2932551479405878, 'end_timestamp': 2932551491181214, 'thread_id': 3254861}
```
Note: Kind 8 = range extent op.
* Merge fix
Revert several changes
source/lib/rocprofiler-sdk/marker/range_marker.*
- separate out range marker implementation for standard marker implementation
Update public API with marker core range
Support marker core range in sdk (source/lib/rocprofiler-sdk)
Transition rocprofiler-sdk-tool and output lib to use marker core range
Misc fixes for tests
Fix logic in lib/output/generate{CSV,Stats}.cpp
Update tests/rocprofv3/tracing-hip-in-libraries (marker validation)
Fix test_otf2_data
* Test fixes
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
bde07e7baa |
[SDK] KFD new events API (#321)
* Remove page-migration
* Add KFD events API
* Address review comments
* Move assert checks
* Update enum-string utils
* Update codeowners
* Update KFD header
* Add perfetto category
[ROCm/rocprofiler-sdk commit:
|
||
|
|
b097e276a9 |
[rocprofv3] Add rocpd output support (part 1: prelude) (#401)
* [rocprofv3] Add rocpd output support (part 1: prelude)
- git submodules for sqlite3, GOTCHA, and pybind11
- HIP stream data
- rocprofiler_query_intercept_table_name(...)
- serialization load
- rocprofiler::sdk::get_perfetto_category(KindT)
- rocprofiler::sdk::parse::strip
- common library updates
- md5sum
- hasher
- simple_timer
- static_tl_object
- get_process_start_time_ns(pid_t)
- output library updates
- node_info
- file_generator (generator is now virtual base class)
- stream info updates
* Added submodules
* Code review updates
* Minor unused-but-set-X warning fixes
* Update CI
- install libsqlite3-dev package
* Update CI
- install libsqlite3-dev package
* Fix static thread-local object memory leak
- also fix signal handler chaining
* Remove URL from comment
* Remove page migration exception
* Enable ROCPROFILER_BUILD_SQLITE3 by default
- try find_package(SQLite3) first and then build when ROCPROFILER_BUILD_SQLITE3=ON
* Fix gotcha installation
- make install of target optional
* Validate tracing + counter collection dispatch data
- i.e. correlation ids, thread ids, timestamps
* Make find_package(SQLite3) optional
- ROCm CI does not have SQLite3 dev package installed and cannot build from source (missing tclsh)
* Fixes to tracing + counter collection test
* get_process_start_time_ns update
- original implementation did not work
* Fix pytest-packages test_perfetto_data for counter collection
- erroneous failure when used with same PMC + multiple agents
* cmake policy: option() honors normal variables
- for GOTCHA submodule
* Improve samples/api_buffered_tracing stability
- reduce likelihood of sporadic exception throw
* Update gotcha submodule
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
cd106cda3c |
[CI]: test-async-copy-tracing-validate : test_ancestor_ids (#364)
* Add ancestor ID to pftrace
* Simplify hipMemcpyAsync ancestor ID checks
* Review comments
[ROCm/rocprofiler-sdk commit:
|
||
|
|
5467c83188 |
rocDecode Buffer Tracing Support (#315)
* Added buffer tracing support for rocdecode and updated tests to work with buffer tracing
* Updated perfetto to output args individually rather than as a string list
* Updated docstrings and operation type, changed OTF2 code to remove warning due to change in operation type
* Updated tests for review comments
* Test args exist and return value
* Updated to use string entry
* Change function name
* Updated PR to reflect review comments
* Updated for PR review comments
* Change function name
[ROCm/rocprofiler-sdk commit:
|
||
|
|
692d041316 |
[SDK] Release 1.0 Public API Modifications (#277)
* Make sure all structs/enums can be forward declared
* Updates to counter collection
- consistency updates and cleanup
* Conversion of dimension information to info struct
* Added deprecated folder
* Testing changes
* merge changes
* Fix shadowed variable
* Source code formatting
* Fix shadowed variable
* Update rocprofiler_counter_info_v1_t member names
* Split version.h into version.h and ext_version.h
- ext_version.h contains external version info, e.g. ROCPROFILER_HSA_API_TABLE_MAJOR_VERSION, ROCPROFILER_HSA_RUNTIME_VERSION
- this reduces amount of recompilation after a commit since version.h gets updated with the git revision
* profile_config -> counter_config
* EOF new line
* [Samples] Reduce header includes + reorg counter collection samples
* Misc compilation fixes
- shadowed variables
- use of [[deprecated("...")]] in C code
- unused variables
* Minor misc modifications
- use common:: instead of rocprofiler::common:: when inside rocprofiler namespace
- counters.cpp
- move local anon namespace functions into rocprofiler::counters:: anon namespace
- use std::string_view for get_static_string
- const ref for get_static_ptr
- misc namespace shortening
* [Public API] rocprofiler_get_version_triplet + rocprofiler_version_triplet_t
- struct rocprofiler_version_triplet_t containing fields for the major, minor, and patch version
- public API function: rocprofiler_get_version_triplet
- define C++ operators for rocprofiler_version_triplet_t
- C++ function compute_version_triplet
* [Tests] Improve async-copy-testing test
- relax constraints
- improve logging
* Update counter_config.h doxygen docs
* ROCPROFILER_SDK_BETA_COMPAT
- ppdef which helps with renaming when set to 1
* Remove spurious include
* Fix includes for cxx/version.hpp
* Doxygen fixes for rocprofiler_get_version and rocprofiler_get_version_triplet
* Public API Experimental Designation
- ROCPROFILER_SDK_EXPERIMENTAL added to experimental function
- "(experimental)" added to doxygen @brief entries
* Fix use of assert instead of static_assert in hip/stream.cpp
* Use typedef instead of define for rocprofiler_profile_config_id_t
* Use inline rocprofiler_{create,destroy}_profile_config instead of ppdef
- added <rocprofiler-sdk/deprecated/profile_config.h>
* Doxygen for rocprofiler_{create,destroy}_profile_config
* ROCPROFILER_SDK_DEPRECATED_WARNINGS
* Temporarily comment out ROCPROFILER_SDK_DEPRECATED_WARNINGS=1
* cmake formatting
* Misc variable renaming in samples and tests
* Fix declarations of types
* Fix hip stream tracing service struct name
- rocprofiler_callback_tracing_stream_handle_data_t renamed to rocprofiler_callback_tracing_hip_stream_api_data_t
* Rename "HIP_STREAM_API" to "HIP_STREAM"
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
36f4788ad5 |
[CI] Miscellaneous Testing Updates (#305)
* Add rocprofiler-sdk-utilities.cmake
- contains cmake function rocprofiler_sdk_get_gfx_architectures
* Update perfetto_reader.py
- fix hash collision
* Update project names in tests folders
- rocprofiler-tests -> rocprofiler-sdk-tests
* Fix incorrect allocation-error handling
* [CI] Disable openmp tests for navi2, navi3, and navi4
* Suppress leaks by omptarget and llvm
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
d1aeee3599 |
Add an option to disable perfetto debug annotations in json tool (#258)
* Add opt-in to disable perfetto annotations
Add an env option `ROCPROFILER_DISABLE_PERFETTO_ANNOTATIONS`
to disable perfetto function-arg annotations.
If there are a large number of records, the tests that use this tool timeout
on some machines
* Update iteration kind
* Remove test_retired_correlation_ids for page-migration
[ROCm/rocprofiler-sdk commit:
|
||
|
|
74efdad1aa |
rocJPEG API Tracing (#73)
* rocDecode API Tracing support
* Test bin file added to rocdecode. Need to add validate python methods
* Added option to not make rocDecode tests
* Added rocdecode and rocprofv3 tests
* Added csv test
* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI
* Add option to avoid building rocdecode tests
* Added option to avoid building rocdecode bin file
* Support for rocJPEG API Trace
* Added newline to rocjpeg_version.h
* json-tool code added, initial test/bin commit
* Formatting
* Resolved rocjpeg bin test compilation errors
* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed
* Formatting and compilation errors
* Minor fixes
* Copyright year update and minor fixes
* Doc update fix
* Added rocjpeg csv file in data
* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes
* Added rocdecode and rocjpeg to CI
* Removed rocdecode and rocjpeg from CI and added back build tests option
* Updated Cmake Files
* Added rocDecode and rocJPEG to CI
* Remove cmake line added in error
* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes
* Added find_package for test
* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path
* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests
* Resolve merge conflicts and formatting
* Added regex find and replace instead of include for CI
* VAAPI package causing errors on Vega20
* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved
* Removed workflows regex
* Formatting and minor test modification
* Modified test for vega20
* Update rocDecode and rocJPEG cmake and tests
* Changelog
* Fix merge conflict
* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing
* Removed if found statements, replaced with TARGET:EXISTS
* Skip json file for rocjpeg and rocdecode tests if not supported
* Add os import
---------
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
b610b5ff56 |
SDK: No bg thread if no clients use SDK (#123)
* SDK: No bg thread if no clients use SDK
* Update CHANGELOG
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
edb51fc861 |
update copyright date to 2025 (#102)
* Update LICENSE
* Update conf.py
* Update copyright year
* [fix] Update copyright year
* Update copyright year "ROCm Developer Tools"
* Add license headers to c++ files
* Add license to *.py
* Update licenses in rocdecode sources
---------
Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
3e1b8ba4ec |
rocDecode API Tracing Support (#49)
* rocDecode API Tracing support
* Test bin file added to rocdecode. Need to add validate python methods
* Added option to not make rocDecode tests
* Added rocdecode and rocprofv3 tests
* Added csv test
* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI
* Add option to avoid building rocdecode tests
* Added option to avoid building rocdecode bin file
* Merge conflict error
* CMake files changed in response to review comments. Attempting to implement callbacks.
* Turned off test building for rocdecode
* Minor fixes for review comments
* Review comments
* Updated formatting
* Document changes and format.hpp reversion. Need to remove iterate args support for now for later update.
* Remove iterate args support
* Remove iterate-args
* enforce abi versioning in macro if
* Fix doc error
* removed spaces to fix indentation error
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
82261be227 |
SWDEV-492623: Hip Host Function to Device Symbols Mapping (#18)
* Adding changes to register and read symbols from the hip fat binary
* adding json output for host_functions
* added error handling
* adding json tool support
* Adding tests
* formatting changes
* Adding documentation
* refactoring as per amd-staging
* Adding intializers and changing macros
* Fix page-migration background thread on fork (#31)
* Fix page-migration background thread on fork
After falling off main in the forked child, all the children
try to join on on the parent's monitoring thread. This results
in a deadlock. Parent is waiting for the child to exit, but
the child is trying to join the parent's thread which is
signaled from the parent's static destructors.
Even with just one parent and child, due to copy-on-write
semantics, a child signalling the background thread to join
will still block (thread's updated state is not visible
in the child).
This fix creates background treads on fork per-child with a
pthread_atfork handler, ensuring that each child has its own
monitoring thread.
* Formatting fixes
* Detach page-migration background thread and update test timeout
* Attach files with ctest
* Update corr-id assert
* Tweak on-fork, simplify background thread
* Revert thread detach
* Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)
* Adding Trace Period feature to rocprofv3
* Adding feature documentation
* Update source/bin/rocprofv3.py
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Fixing format
* Moving to Collection Period and changing the input params
* Format Fixes
* Fixing rebasing issues
* Removing atomic include from the tool
* Adding more options for units, optimizing the code
* Fixing rocprofv3.py
* Fixing time conv & adding time controlled app
* Fixing format
* Changing to shared memory testing methodology
* use of shmem use
* Fix include headers for transpose-time-controlled.cpp
* Format upload-image-to-github.py
* Removing shmem and using only env var to dump timestamps from the tool
* Tool Fixes + Test Config
* Adding Tests
* Fixing Review comments
* Update trace period implementation
* Update trace period tests
* check between start and stop timestamps
* Merge Fix
* Update validate.py
* Improve safety of rocprofiler_stop_context after finalization
* Pass context id to collection_period_cntrl by value
* Adding 20 us error margin
* Ensure log level for collection-period test is not more than warning
---------
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
- move error code check macros to implementation
- fix macros which check error code
- use constexpr values instead of #define
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
- debugging for error that cannot be locally reproduced
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
- improve error handling and logging
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
- tweak to non-fatal logging messages
* Update lib/rocprofiler-sdk/code_object/hip/code_object.*
- cleanup of logging messages
* Update host kernel symbol register data fields
* Update source/lib/rocprofiler-sdk/code_object/hip/code_object.hpp
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
792329fefd |
SWDEV-492625 memory free functions (#11)
* SWDEV-492625: Track free memory HSA functions to help determine total amount of memory allocated on the system at any one time
* Minor fixes to address comments
* Update allocation size description
* Moved get function back to specialization, minor typo fixes
* Removed memory_operation_type field, removed memory_pool allocation enum, converted starting address to hex string for json format.
* Made conversion to hex_string a function, changed address to use union rocprofiler_address_t type, changed VMEM descriptors
* Removed as_hex from the global namespace
* Formatting
* Removed TRACK_EVENT for memory allocation, now TRACK_COUNTER for memory allocation is being performed
* Check if address was recorded before retrieving allocation size in generate Perfetto
* Formatting
* Update source/lib/output/generatePerfetto.cpp
* Explicitly disable app-abort tests
* Remove excluding app-abort test from workflow CI
- redundant bc these tests are explicitly marked as disabled now
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
a79f8a0198 |
SDK: OMPT Support (#22)
* Ability to select alternative compiler per file
Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.
Misc updates
Update OpenMP target sample
- samples/ompt -> samples/openmp_target
- fix sample test of openmp-target
- reorganize files
Rework OpenMP implementation
Minor OpenMP implementation cleanup
Rename samples/openmp_target CMake targets
Add tests/bin/openmp
- OpenMP target test app in tests/bin/openmp/target
Format samples/openmp_target CMakeLists.txt
Misc lib/rocprofiler-sdk/openmp cleanup
- fix includes
- convert_arg
Update openmp.def.cpp
- tweak includes
- remove lots of temporary variables
Update samples
- common::get_callback_id_names() -> common::get_callback_tracing_names()
- add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample
Fix code object operation names
- add "CODE_OBJECT_" prefix
Update include/rocprofiler-sdk/openmp/api_id.h
- remove spurious comment
Miscellaneous openmp updates
- similar API for openmp_begin and openmp_end
- move implementations of ompt callbacks to openmp.cpp
- ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events
[SWDEV-484495] Fix int truncation in CSV output (#1098)
CSV output truncates doubles to ints when it shouldn't. Derived metrics
are (mostly) doubles and lose precision (or become worthless) if treated
as an int. Converted these to double to match the format we return from
rocprof-sdk.
Co-authored-by: Benjamin Welton <ben@amd.com>
Update limit for max counter records in rocprof-tool (#1073)
A fixed sized std::array is used to store counter records in rocprofiler SDK. This limit was breached in SWDEV-484742. Upping the limit to 512 to be less likely to reach this limit again.
adding proxy ompt_data_t * arguments
fixes for proxy pointers
- Implement proxy ompt_data_t* pointers for clients
- Add ompt_data_t* arguments back to callback API
- Modify openmp sample to illustrate use of proxy pointers
formatting
SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083)
Fixing some accumulate metrics (#1089)
* Fixing some accumulate metrics
* Fixing some more accumulate metrics
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
updating rocprofv3 help options (#1113)
* updating rocprofv3 help options
* updating CHANGELOG
Fixing installed pacakge tests in CI (#1119)
* Fixing installed pacakge tests in CI
* Formatted rocprofv3.py with black formatter
SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112)
* SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests.
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Adding backlog for codeobj changes
* Formatting
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
---------
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
SWDEV-487621: Fixes for metric definitions (#1118)
* Fixes for metric definitions
* Removing gfx8
* Update changelog
* Fixing unit tests
* Small fixes
* Fix for write size
Fix PSDB change (#1120)
Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit
|
||
|
|
1ea688c447 |
Runtime Initialization Tracing (#1105)
* Runtime initialization tracing
- calbacks and buffer entries notifying when a runtime has been initialized
* Minor cleanup to registration.cpp
* JSON tool implementation
* Increase perfetto_reader timeout
* Handle perfetto_reader timeout when attr doesn't exist
* clang-tidy fixes to memory_allocation.cpp
[ROCm/rocprofiler-sdk commit:
|
||
|
|
94f4f56c40 |
Memory Allocation Tracking (#1142)
* Initial commit: Need to implement wrapper function to collect data and test that wrapper function is correctly replacing core HSA functions
* Attempted to implement wrapper implementation for hsa memory allocation functions. Need to modify generate record files and test if implementation is working as expected
* Debugging and implementing generateCSV function
* Memory allocation size and starting address outputted to csv and json file formats
* Formatting
* Initial setup for OTF2 and Perfetto generation
* Collecting agent id for memory_allocation and formatting
* Modified memory_allocation.cpp to set up code for AMD_EXT commands
* Support for memory_pool_allocate added
* Removed accidently added file
* Made flag optional and added more OTF2 and Perfetto code. Needs testing to ensure perfetto and OTF2 works
* Formatting
* Fixed perfetto and otf2 output
* Fixed flag issue due to incorrect buffer use
* Updated documentation
* Small cleaning and comments
* Added test for HSA memory allocation tracing
* Fixed summary test validation errors due to allocation tracing. Added type to location_base to create unique event ids for allocation due to OTF2 trace error
* Decreased lower limit of hip calls for test
* Modified summary tests to vary number of allocate requests
* Minor fixes to address comments. Still need to address OTF2 comments
* Fix docs and changed OTF2 to use enum for type specified in location_base construction
* Fixed schema error
* Added vmem command tracking. Need to add test
* Updated test to work with vmem command and updated generateCSV to output int instead of hex string.
* OTF2 enum update and mispelling fix
* CI does not support Virtual Memory API. Removed vmem test. Will add back if CI is modifed to suport vmem API
* Update CMakeLists.txt for memory allocation test
* Updated summary test
* Minor fixes to address comments
* Moved domain_type.hpp enum to before LAST
* Fixed compile errors and formatting
* Fixed stats summary domain name error
* Added rocprofv3 test
* Page migration test fix
* Undo page migration test changes. Failures do not appear to have to do with memory allocation
[ROCm/rocprofiler-sdk commit:
|
||
|
|
36d357337d |
Report page migration events as start/end (#793)
* Squashed commit of the following:
commit b76f2635f4b65599f03812a73d0cf410f5ada213
Author: Mythreya <mythreya.kuricheti@amd.com>
Date: Fri Apr 26 00:29:09 2024 +0000
Changed for PR feedback
commit bedb8ad566ff42fbf117b19202c26c507abcf8ac
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Apr 25 19:20:06 2024 -0500
Fix installation
commit a98f8a69459a1450a1be9c98e20b3c1e7f2568c2
Author: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Date: Thu Apr 25 19:16:35 2024 -0500
Restructure the headers
commit 46489a020ffafdd5f4ce3f580469ff233ef67fe1
Author: Mythreya <mythreya.kuricheti@amd.com>
Date: Tue Apr 23 23:31:10 2024 +0000
Update hsa include
commit 8e795282cce348fc6aa736b7857b21aeb32aa20a
Author: Mythreya <mythreya.kuricheti@amd.com>
Date: Tue Apr 23 23:02:32 2024 +0000
Report page migration events as start/end
* Updated tests accordingly
* Page migration events are reported independently
commit 8784e5ad4895a626a2a8e4ac12f8021b34172bd4
Author: Mythreya <mythreya.kuricheti@amd.com>
Date: Tue Apr 16 17:01:57 2024 +0000
Update handling of dropped page migration events
Previously, we dropped all locally buffered events when we detect that
KFD has dropped some events. This may drop too many pending events too eagerly.
When we receive an end event and cannot find the corresponding start,
we can be sure that KFD has dropped some events in the immediate past.
When this happens, we look through all locally buffered events and report
the start events that are older than 10s as partial events --- they have
no "end" information (we expect that the end events have been dropped).
We also set the polling timeout to 10s to prevent the local buffer from
getting too large with events waiting to be paired up.
Updated tests
commit 2e8e0b07eeda9b5990e1ae8d28dcd3a035ce38e1
Author: Mythreya <mythreya.kuricheti@amd.com>
Date: Tue Apr 16 17:01:31 2024 +0000
Docs for triggers
* Fix page migration sample
* Fix hasher, kfd install
* Add hsa include
* Install KFD include dir
* Updates from code review
- single timestamp field
- node_id -> agent_id
- from_node -> from_agent
- to_node -> to_agent
* Misc revisions
* Remove page-migration install target
* Update page-migration pytest
* Tweak to serialization
* Address PR comments
* Update page-migration test
* Add cli args, update iterations
* Address PR comments
* Add abi.cpp for static_asserts
* Update page_migration gtest with only runtime tests
* Moved helpers into utils.hpp
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
07f698a532 |
CMake: Consistently name CMake Targets (#1082)
* Change all rocprofiler-X target names to rocprofiler-sdk-X
* Update rocprofiler-sdk-config.cmake
- fix install tree target names
- simplify logic for using find w/ components and find w/o components
* Update rocprofiler-sdk-roctx-config.cmake
- simplify logic for using find w/ components and find w/o components
* Update samples/intercept_table/CMakeLists.txt
- demonstrate/test use of `find_package(rocprofiler-sdk ... COMPONENTS ...)`
[ROCm/rocprofiler-sdk commit:
|
||
|
|
3819a846da |
Check to force tools to initialize the ctx id to zero. (#1135)
* Check to force tool to initialize the ctx id to zero.
* initialize rocprofiler_context_id_t with 0 in units tests
* changelog
---------
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
345fde3ebc |
Renamed agent profiling service to device counting service (#1132)
* Renamed agent profiling service to device counting service
Name more aptly represents what agent profiling did (device wide
counter collection). Conversion of existing user code can be
performed by the following find/sed command:
find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} +
* Converted dispatch profile to dispatch counting service
* Debug for functioal counters test
* Minor changes for CI
* Minor fix
* More fixes for CI
* Update evaluate_ast.cpp
---------
Co-authored-by: Benjamin Welton <ben@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
fcd6cc45bd |
Package RCCL headers to support adding RCCL support w/o installed headers (#1075)
- in ROCm CI, rocprofiler-sdk gets built before RCCL is installed, this is a workaround for this issue
[ROCm/rocprofiler-sdk commit:
|
||
|
|
c47a128941 |
Add support for RCCL tracing (#1047)
* [Draft]: Add support for RCCL tracing
Address comments
* [Draft]: Add support for RCCL tracing
Address PR comments, changes from RCCL upstream
* Add RCCL library table registration
Working on adding support to rocprofiler-register
* Support compilation w/o <rccl/amd_detail/api_trace.h>
- dummy api_trace.h header
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED when RCCL does not have api_trace.h header
* RCCL API tracing tool support
- add to rocprofv3
- add to json-tool
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
2a4591dcae |
Fix rocprofv3 output filename containing sub-directory (#1062)
* Fix -d option broken by hostname
* Fix rocprofv3 output filename containing directory
* Fix TID handling in Perfetto and OTF2 output
* Revert changes which removed hostname
* Revise tests/rocprofv3/tracing output filenames
- specify an output filename for tests which include a subdirectory
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
264c48fa69 |
Misc API cleanup and consistency fixes (#1023)
- ROCPROFILER_API after function
- use rocprofiler_tracing_operation_t in lieu of uint32_t where appropriate
- rocprofiler_tracing_operation_t is not int32_t typedef (formerly uint32_t)
- use const T* instead of T* where appropriate
[ROCm/rocprofiler-sdk commit:
|
||
|
|
2393b2ee45 |
look for symbols in dynsym table (#990)
* look for symbols in dynsym table
* checking both symtab and dynsym
* Avoid symbol duplication in non stripped binaries
* clang-format
* Minor elf_utils.cpp updates
- use 'else if' instead of 'if'
- logging tweaks
* Update registration
- tweak logging
* Update testing
- strip the rocprofiler-sdk-c-tool library
- add test-c-tool-rocp-tool-lib-execute test which does NOT LD_PRELOAD the library (uses only ROCP_TOOL_LIBRARIES instead)
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
12ae4e6ce4 |
Migrate to rocprofiler-sdk:: namespace in CMake everywhere (#892)
- remove all usage/support for rocprofiler:: namespace
[ROCm/rocprofiler-sdk commit:
|
||
|
|
ada9d97b78 |
Miscellanous AFAR 5 Updates (#891)
* Dispatch table copy/update uses ROCP_TRACE instead of ROCP_INFO
* Update rocprofiler-sdk CMake config
- rocprofiler::rocprofiler is alias to rocprofiler-sdk::rocprofiler-sdk instead of other way around
* Prefer rocprofiler-sdk::rocprofiler-sdk over rocprofiler::rocprofiler
* Fix WITH_UNWIND for glog
- requires a value of "none" instead of boolean now
* Update include/rocprofiler-sdk/registration.h
- explicit struct names to permit forward decl
* Update include/rocprofiler-sdk/cxx/serialization.hpp
- ROCPROFILER_SDK_CEREAL_NAMESPACE_BEGIN and ROCPROFILER_SDK_CEREAL_NAMESPACE_END to enable customized namespace
[ROCm/rocprofiler-sdk commit:
|
||
|
|
9fde046078 |
JSON schema + JSON node vs. Perfetto category consistency (#869)
* Fix schema differences b/t JSON and perfetto
- "kernel_dispatches" (JSON) vs. "kernel_dispatch" (Perfetto) -> "kernel_dispatch"
- "scratch_api" (JSON) -> "scratch_memory"
* Update include/rocprofiler-sdk/cxx/serialization.hpp
- remove unnecessary includes causing circular dependency
* rocprofv3 docs
- Output formats
- JSON schema
* Spelling fix
* Update schema descriptions
* Improve assert log in test_perfetto_data
[ROCm/rocprofiler-sdk commit:
|
||
|
|
46a9637496 |
Adding Perfetto support (#867)
* Perfetto submodule
* include/rocprofiler-sdk/cxx/perfetto.hpp
- adapted from tests/common/perfetto.hpp
- updated json-tool to use <rocprofiler-sdk/cxx/perfetto.hpp>
* Update include/rocprofiler-sdk/cxx
- add details/delimit.hpp
- add details/join.hpp
- extend details/mpl.hpp
- extend details/operators.hpp
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- update MEMORY_COPY direction names
* Preliminary perfetto support
* Update lib/rocprofiler-sdk-tool/generatePerfetto.cpp
- fix getting roctx msg vs. buffer operation name
* Temporary variable restructuring
* Perfetto patches after rebasing onto main
* Revert lib/rocprofiler-sdk/hsa/async_copy.cpp
- revert name
* Update lib/rocprofiler-sdk-tool/generatePerfetto.cpp
- fix ReadTrace
* Update tests/bin/hip-in-libraries
- sleep_for
* Support PFTRACE output format option in rocprofv3
* Change perfetto logging
* Update rocprofv3 tests to generate pftrace output
* Minor tweak to json-tool.cpp
* Update requirements.txt for perfetto testing
* Fix data race on amount_read in generatePerfetto.cpp
* Add testing for pftrace output
- relatively simple testing which verifies that the pftrace file has the same number of entries as JSON data for HIP/HSA/marker/kernel/memory_copy
* Fix import in perfetto_reader.py
* Fix data race in generatePerfetto.cpp
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0ba9f26a6a |
Adding JSON support (#860)
* Adding json support
minor bugs
Fixing tests
Fixing formatting issues
Fixing test
test fix
Misc testing fixes
Use rocprofiler/cxx/name_info in rocprofiler-sdk-tool
fixes to reduce the Json file size
Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/tool.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/tool.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/helper.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/helper.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/generateJSON.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/tool.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/tool.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/helper.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/tool.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Update source/lib/rocprofiler-sdk-tool/generateJSON.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
misc fixes
Removing int cast for JSON tests
formatting
removing a condition test on Navi3
adding debug info
Misc fix
* CSV updates
- fix stats
- numerical formatter support for customizing write_csv_entry
- misc formatting
- get_marker_stats_file
* Misc tests/rocprofv3/counter-collection/input2 fixes
- rocprofiler_configure_pytest_files in rocprofv3/counter-collection/input2
- removed state code from merge in rocprofv3/counter-collection/input2
* Tool: "Agent-id" -> "Agent_Id"
- consistency
* Tool update
- remove rocprofiler_tool_marker_record_t
- add marker_tracing_kind_conversion
- fix memory leak in write_json
- minor update to get_output_stream
- rework handling of marker records
* Update tests/pytest-packages/pytest_utils/__init__.py
- add collapse_dict_list function for converting a dictionary value that is a list of length one into a directly mapped value
* Update tests/rocprofv3/**/conftest.py
- use collapse_dict_list when reading in JSONs
* Update tests/rocprofv3/counter-collection/input1/validate.py
- relax testing requirements gfx1102 (AQLProfile bugs)
- in addition to relaxed testing requirements for gfx1101
* Update tests/rocprofv3/tracing/validate.py
- fix removal of PID in every marker record
* Update tests/rocprofv3/tracing-plus-cc
- remove test design that relies on iterating subdirectories
* Wrapper around __libc_start_main
- Ensures finalization happens before main returns
- Update tests/rocprofv3/tracing/validate.py
- wrapper around __libc_start_main changed roctx calls
* Combine include/rocprofiler-sdk/cxx/serialization.hpp and include/rocprofiler-sdk/external/serialization.hpp
- tests/common/serialization.hpp simply includes include/rocprofiler-sdk/cxx/serialization.hpp now
* Update lib/rocprofiler-sdk/hip/hip.cpp
- tracing function immediately returns when fini_status is non-zero
* Update lib/rocprofiler-sdk/hsa/hsa.cpp
- remove logging of tracing function when fini_status is non-zero
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- remove rocprofv3_trigger_list_metrics.cpp from TOOL_SOURCES
* Update tests/rocprofv3/tracing-plus-cc/CMakeLists.txt
- fix depends
* Domain statistics
* Update tests/rocprofv3/tracing-plus-cc/CMakeLists.txt
- do not set ROCP_LOG_LEVEL in env
* Remove erroneous <bits/utility.h> include
* Restructure tool source + reduce tool table + support multiple formats
- buffered_output struct for handling output
- support multiple output formats, e.g. --output-format csv,json
- rename buffer_type_t -> domain_type
- simplified generation of CSV output files
- removed rocprofiler_tool_marker_record_t
* Update lib/common/container/ring_buffer.hpp
- value_type alias in ring_buffer<Tp>
* Remove all but one json-execute tests
- generate CSV and JSON in same run
* Fix include for domain_type.cpp
* Update tests/rocprofv3/tracing-plus-cc/input.txt
- only specify counters which can be found on gfx8, gfx9, gfx10, gfx11, etc.
- use :device= syntax
* Update lib/rocprofiler-sdk-tool/config.cpp
- support :device=N syntax for counters file
- improve stripping comments in PMC files
- only read after pmc:
* Rework tool library counter collection
- fatal error if all requested counters for device are not found
- support :device= syntax
* Update tests/rocprofv3/tracing-plus-cc/input.txt
- removed L2CacheHit (not supported on mi300)
* Disable JSON tests in tests/rocprofv3
* Update include/rocprofiler-sdk/cxx/serialization.hpp
- support rocprofiler_record_dimension_info_t
* Update tool JSON schema
- remove domain_type::CODE_OBJECT
- rocprofiler_tool_agent_v0_t
- rocprofiler_agent_v0_t + counters
- rocprofiler_tool_counter_info_t
- get_code_object_data()
* Update JSON schema for tool
* Update lib/rocprofiler-sdk-tool/tool.cpp
- fix ROCP_WARNING_IF
* rocprofv3 -> rocprofv3.sh
- install rocprofv3.sh into sbin
- configure_file <source-tree>/rocprofv3.sh -> <binary-tree>/bin/rocprofv3
* Update tool counter collection
- rocprofiler_tool_record_counter_t
- rocprofiler_tool_counter_collection_record_t
* Update tests/rocprofv3/counter-collection/input1/CMakeLists.txt
- use rocprofiler_configure_pytest_files for validate.py, conftest.py, and input.txt
* Update tests/rocprofv3/counter-collection/input1/validate.py
- re-enable test_validate_counter_collection_pmc1_json
* Update tests/rocprofv3/counter-collection/input2/validate.py
- remove unused code
* Update tests/rocprofv3/counter-collection/input2/validate.py
- remove unused code
* Update tests/rocprofv3/hsa-queue-dependency/validate.py
- re-enable JSON tests
* Misc tests/rocprofv3 CMake updates
* Update tests/rocprofv3/tracing/validate.py
- re-enable JSON tests
* Update tests/rocprofv3/tracing-hip-in-libraries/validate.py
- re-enable JSON tests
* Update tests/rocprofv3/tracing/validate.py
- remove unused node_exists function
* Update tests/rocprofv3/tracing/validate.py
- fix test_marker_api_trace_json
---------
Co-authored-by: Sriraksha Nagaraj <Sriraksha.Nagaraj@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
f167317524 |
Public C++ header files and samples updates (#819)
* Public C++ header files (source/include/rocprofiler-sdk/cxx)
* Update samples/api_buffered_tracing
- scratch memory and page migration
- README
* Update samples/api_buffered_tracing
- page migration component in sample
* Update tests/page-migration/validate.py
- fix checks for page migration operation names
* Update tests/page-migration/validate.py
- fix get_allocated_pages
* Update scratch memory and page migration validations
* Fix include/rocprofiler-sdk/cxx installation
* Rework include/rocprofiler-sdk/cxx
- Improve name_info to support const char*, string_view, string
* Update samples/api_{buffered,callback}_tracing
* External correlation ID request sample
- includes correlation ID retirement demo
* Update samples/api_buffered_tracing/README.md
* Update lib/rocprofiler-sdk/hsa/queue.cpp
- generate correlation ID for kernel launch if one doesn't exist
* Remove priority check from tool libraries (samples/tests)
- if(priority > 0) return nullptr check in rocprofiler_configure has proliferated beyond its intended use
* Apply suggestions from code review
[ROCm/rocprofiler-sdk commit:
|
||
|
|
1514f05887 |
Async memory copy callback tracing + memory copy size (#791)
* Async memory copy tracing update
- rocprofiler_buffer_tracing_memory_copy_record_t: thread_id and bytes
- support ROCPROFILER_CALLBACK_TRACING_MEMORY_COPY
- init_public_api_struct can fully construct
* Testing for callback async copy tracing
[ROCm/rocprofiler-sdk commit:
|
||
|
|
627a9f54f1 |
Simplify json-tool JSON schema for callback records (#790)
- formerly, the rocprofiler_callback_tracing_record_t data was stored in itr["record"], e.g. itr["record"]["correlation_id"]
- dropped "record" key, e.g. itr["correlation_id"]
[ROCm/rocprofiler-sdk commit:
|
||
|
|
2aef3c3d15 |
rocprofiler_kernel_dispatch_info_t + header record for buffered counter collection (#758)
* Update include/rocprofiler-sdk
- defines.h
- ROCPROFILER_VERSION_10_0 -> ROCPROFILER_SDK_VERSION_0_0
- fwd.h
- rocprofiler_counter_record_kind_t
- rocprofiler_kernel_dispatch_info_t
- rocprofiler_record_counter_t
- has dispatch id instead of correlation id
- rocprofiler_counter_info_v0_t
- added rocprofiler_counter_id_t field
- added is_constant field
- reordered better packing
- dispatch_profile.h
- added rocprofiler_profile_counting_dispatch_record_t for use as a header record for rocprofiler_profile_counting_dispatch_data_t
- callback_tracing.h
- rocprofiler_callback_tracing_kernel_dispatch_data_t uses rocprofiler_kernel_dispatch_info_t
- buffer_tracing.h
- rocprofiler_buffer_tracing_kernel_dispatch_record_t uses rocprofiler_kernel_dispatch_info_t
* Update lib/rocprofiler-sdk/*
- transition to rocprofiler_kernel_dispatch_info_t
- set id and is_constant values for rocprofiler_counter_info_v0_t in rocprofiler_query_counter_info
* Update lib/rocprofiler-sdk-tool
- transition to rocprofiler_kernel_dispatch_info_t
* Update lib/rocprofiler-sdk/counters/tests/core.cpp
- transition to rocprofiler_kernel_dispatch_info_t
* Update samples
- transition to rocprofiler_kernel_dispatch_info_t
- transition to rocprofiler_counter_record_kind_t
* Update tests
- transition to rocprofiler_kernel_dispatch_info_t
- transition to rocprofiler_counter_record_kind_t
- improve integration test validation for counter-collection
- update serialization for new/additional types
* Fix tests/counter-collection/validate.py
- loosen restrictions on the length of counter description
* Update include/rocprofiler-sdk/buffer_tracing.h
- remove accidental packed attribute
* Update lib/rocprofiler-sdk/counters/xml/derived_counters.xml
- Add description for TCC_TAG_STALL_sum (reference: https://rocm.docs.amd.com/en/develop/conceptual/gpu-arch/mi300-mi200-performance-counters.html)
* Update tests/page-migration/validate.py
[ROCm/rocprofiler-sdk commit:
|
||
|
|
4f99edbad5 |
Page migration reporting (#651)
* Page migration reporting support
* Page migration: Update parser and reporting
Container does not lave latest KFD header, so CI might fail
* Add kfd_ioctl.h
* Formatting
* Update get_key
- get key was not used (and shouldn't be), so delete it
* clang-tidy fixes
* Tests for page migration
* Apply suggestions from code review
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update tests/bin/page-migration/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update page-migration test app
- add hipHostRegister to register mmap'ed allocation with HIP
- misc cleanup and reorg
- remove HSA_XNACK=1 from test env
* Update lib/rocprofiler-sdk/tests/page_migration.cpp
- fix compilation error
* Minor updates (reorg, rename)
* Page migration reporting support
* Page migration: Update parser and reporting
Container does not lave latest KFD header, so CI might fail
* Update page migration tests, fix trigger types
* Page Migration Tracing Support Refactoring (#753)
* Reorganization
* Update page migration init/fini
* Formatting
* Update page_migration.cpp
- change logging severity
* Skip test if KFD does not support page migration reporting
* Rework skipping test if KFD does not support page migration
* Fix event trigger enum values
* Fix clang-diagnostic-unused-const-variable
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
03fb9ace21 |
CTest Environment Update (#756)
* Update test/tools/json-tool.cpp
- push/pop ppid as external correlation id instead of pid
* Update environment variables for tests and samples
* Revert to old CDash dashboard in run-ci.py
* Revert to new CDash dashboard in run-ci.py
[ROCm/rocprofiler-sdk commit:
|
||
|
|
5e8a3b4f16 |
Callback tracing for kernel dispatches + External correlation ID request service (#682)
* Support ROCPROFILER_CALLBACK_TRACING_KERNEL_DISPATCH
* Fix doxygen
* Update callback tracing
- temporary hacks for kind operation name and iterate kind operations
* Update source/include/rocprofiler-sdk
- introduce sequence id for kernel dispatches
* Update lib/rocprofiler-sdk (seq id)
- support sequence id passing
* Update tests (seq id)
- testing for sequence ids
* Cleanup include/rocprofiler-sdk/fwd.h
* Misc cleanup
* External Correlation ID Request Service (#699)
* External correlation ID request service
- callback requesting an external correlation ID instead of fetching from top of pushed external correlation ID stack
* Update external correlation id request support
- pass internal correlation ID in callback
- async copy generates a correlation ID if none already exists
- added external correlation ID request support for scratch memory tracing
- updated scratch memory tracing to use tracing:: functions
* Update hsa/queue.hpp
- new line at EOF
* Misc tweaks
- remove unnecessary logging in agent.cpp
- correlation_id::add_ref_count check for retirement
- finalization check in HSA queue AsyncSignalHandler
* Improve assertion failure logging in misc tests
* Update include/rocprofiler-sdk/fwd.h
- remove rocprofiler_record_counter_header_t
* Move lib/rocprofiler-sdk/tracing.hpp into lib/rocprofiler-sdk/tracing/ folder
* Update lib/rocprofiler-sdk/hsa/*
- hsa::get_hsa_status_string
- queue_info_session.hpp header
- rocprofiler_packet.hpp
* Update lib/rocprofiler-sdk/{counters,hip,marker}
- execute_phase_exit_callbacks tweaks
- queue_info_session tweaks
* Move rocprofiler_kernel_dispatch_operation_t to include/rocprofiler-sdk/fwd.h
* Update rocprofiler_buffer_tracing_kernel_dispatch_record_t
- add operation field and thread_id field
* Add lib/rocprofiler-sdk/kernel_dispatch
- enum <-> string mapping for kernel dispatch
- tracing implementations
* Update lib/rocprofiler-sdk/CMakeLists.txt
- tracing and kernel dispatch sub-directories
* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp
- invoke rocprofiler::kernel_tracing functions
* Update tests/common/serialization.hpp
- support operation and thread_id fields for rocprofiler_buffer_tracing_kernel_dispatch_record_t
* Update tests/tools/json-tool.cpp
- use external correlation id request service
* Rename sequence_id to dispatch_id
[ROCm/rocprofiler-sdk commit:
|
||
|
|
95acc01042 |
Fix code_object_operation_t and memory_copy_operation_t enums (#751)
- enums for operations should not contain callback/buffer tracing categorization
- e.g. ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_LOAD should be ROCPROIFLER_CODE_OBJECT_LOAD
[ROCm/rocprofiler-sdk commit:
|
||
|
|
6076c751a3 |
hsa multiqueue application (#618)
* hsa multiqueuw application
* cmake formatting (cmake-format) (#619)
Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com>
* source formatting (clang-format v11) (#620)
Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com>
* comppialtion fix
* Update tests/bin/CMakeLists.txt
Reorder `add_subdirectory` to fix (recurrent) issues with ROCTx in `CMAKE_BUILD_RPATH`
* addressing early feedback
* cmake updates
* more cmake updates
* adding queue dependency test
* updating test
* test updates
* removed hsa_api_trace header
* reformating headers to prevent clang from reordering
* Fixing packaging
* Fixes for hangs
* source formatting (clang-format v11) (#676)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* cmake formatting (cmake-format) (#673)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* structure change
* cmake formatting (cmake-format) (#680)
Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com>
* Adding kernel trace to test hang fix
* rebased and fixed kernel-trace test
* rhel clang fixes
* source formatting (clang-format v11) (#685)
Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com>
* Update lib/rocprofiler-sdk-tool/helper.hpp
- remove suppression of -Wshadow
* Update tests/bin/hsa-queue-dependency/CMakeLists.txt
- cleanup unnecessary code
- GPU_LIST -> GPU_TARGETS
* GPU_LIST -> GPU_TARGETS
* Remove installation of test executable and libraries
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
fb1b61d79a |
Add support for scratch reporting (#523)
* Add ToolsApiTable
Add ToolsApiTable wrapping for
scratch memory tracking
* Add initial support for scratch memory tracking
Buffering is implemented
* cmake formatting (cmake-format) (#525)
Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>
* source formatting (clang-format v11) (#524)
Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>
* Add callback tracing for scratch
Fixed the error where scratch tracking init was called irrespective of whether any client requested for it
* Apply suggestions from code review
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
* Fix tools api copy/update
Table were saved/updated incorrectly in previous
commit. Also adds passing user data through the callback
* Fix OpKind sequence for scratch tracking
Previously scratch was using OpKind from rocprofiler-sdk, but
templates were instantiated using API ID. These differ by 1
* Integration tests for scratch reporting
Added buffer and callback integration tests for scratch reporting
* source formatting (clang-format v11) (#550)
Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>
* cmake formatting (cmake-format) (#551)
Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>
* python formatting (black) (#549)
Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>
* CI fixes
* source formatting (clang-format v11) (#554)
Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>
* Update api
Rebase on main and updates based on PR feedback
* Update scratch reporting and address PR comments
- Added agent id to buffer records
- Updated `test_internal_correlation_ids` - Is almost identical to
one in async-copy
- Updated scratch test to check for agent id
- Updated queue id serialization in callback records (prints
handle as nested key)
- Remove `marker_api_traces` from scratch `test_internal_correlation_ids`
validation test
- Rename `amd_tools_api` to `scratch_memory`
- Added doxygen comments
- Remove scratch callback from `tool.cpp`
- Replace assert with `LOF_IF` in `scratch_memory.cpp`
* Update tools table
Changed to match up with changes to hsa tables in main branch
* Rework scratch memory structure
* Update tests
- Added suggestions from PR review, and updated tests accordingly
* Misc cleanup
* Update scratch test
As of Apr 4th, `hsa_amd_agent_set_async_scratch_limit` is disabled.
Note,
> This API: `hsa_amd_agent_set_async_scratch_limit` is currently
> disabled. We need some changes in CP firmware to be able to do this
> and these changes are not ready yet.
> With the current code, you will also not get notifications for
> alternate-scratch allocations because this feature has been disabled
> while CP firmware is making additional changes
> We are hoping to have that feature enabled by ROCm-6.3
* Minor update to lib/rocprofiler-sdk/internal_threading.*
- delay destruction of shared_ptrs of the tasks to prevent rare (but possible) data race on the destruction of the shared_ptr
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
043db3b3a4 |
Use small_vector for API iterate_args (#597)
* Use small_vector for API iterate_args
- replace dim3 value arguments with rocprofiler_dim3_t
- dim3 has a non-trivial destructor
- common::mpl::unqualified_type
- common::stringified_argument_array_t<N> alias
- assert_public_data_type_properties()
- common::container::small_vector<T>::at function
- stringize returns small_vector<stringified_argument>
- stack allocated vector
- remove has_pc_sampling condition (HSA, HIP)
- this will be handled in queue interception
* Misc tweaks
[ROCm/rocprofiler-sdk commit:
|
||
|
|
407fc57ede |
Shared Library Constructor (rocprofv3 deadlock fix) (#599)
* Moved tests/apps to tests/bin
* Renamed cmake project in tests/bin
* Update samples
- Use ROCPROFILER_DEFAULT_FAIL_REGEX
- tweaks to stdout messages
* Update tests
- Use ROCPROFILER_DEFAULT_FAIL_REGEX
* Add tests/lib
- libraries with HIP code
* Update PTL submodule
- remove atexit delete of thread_id_map
* Update cmake/rocprofiler_options.cmake
- Set ROCPROFILER_DEFAULT_FAIL_REGEX
* Update common lib: env + logging
- improved customization of logging settings
- default to disabling logging to files
- install failure handler for rocprofv3
- set_env support in environment.*
* Add lib/rocprofiler-sdk/shared_library.cpp
- shared library constructor
* Update lib/rocprofiler-sdk-tool/tool.cpp
- destructor thread safety
- convert callback_name_info and buffered_name_info to pointers
- install failure handler for logging
* Add tests/bin/hip-in-libraries
- hip-in-libraries is an exe which uses two shared libraries where each shared library contains HIP kernels
- used for testing deadlocking within __hipRegisterFatBinary
* Update bin/rocprofv3
- reorganized the env variables
- use exec to launch command
- set ROCPROFILER_LIBRARY_CTOR=1
* Add tests/rocprofv3/tracing-hip-in-libraries
- uses hip-in-libraries exe for exe which uses shared libraries to launch HIP kernels
* Update bin/rocprofv3
- fix counter collection (no exec)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- replace "Kernel-Name" with "Kernel_Name"
* Update lib/rocprofiler-sdk/registration.cpp
Use RTLD_LOCAL instead of RTLD_GLOBAL for env libraries
* Update tests/rocprofv3
- replace "Kernel-Name" with "Kernel_Name"
* Update tests
- vector-ops (bin) stream syncs + runs with 4 queues per device
- improve counter-collection/input1 validation
- rocprofv3/tracing-hip-in-libraries does not do sys-trace
- improved validation script for tracing-hip-in-libraries
- updated dispatch_callback in json-tool.cpp following reworking of prototypes for counter collection
* Update samples/counter_collection
- updated dispatch_callback(s) and record_callback(s) following reworking of prototypes
* Update bin/rocprofv3
- reorganized help menu
- added options for sub-HSA tables
- added --hip-runtime-trace
- changed --hip-trace to include --hip-compiler-trace
* Update lib/rocprofiler-sdk-tool
- improved kernel filtering
- removed arch_vgpr, accum_vgpr, sgpr code (in rocprofiler-sdk)
- fixed issue with counter-collection w/o tracing
- added support for fine grained HSA API tracing
- removed directly linking to HSA-runtime
* Update lib/rocprofiler-sdk/agent.cpp
- rocp_agents != hsa_agents is non-fatal when ROCPROFILER_BUILD_CI=OFF (CMake option)
* GPR (vector and scalar) info in kernel symbol data
- rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t contains general purpose register info
* Header include order fix
- Include repo headers first
- Third party library headers next
- standard library headers last
* Update dispatch profiling public API
- introduce rocprofiler_profile_counting_dispatch_data_t
- change signature of rocprofiler_profile_counting_dispatch_callback_t and rocprofiler_profile_counting_record_callback_t
- provide rocprofiler_user_data_t pointer in dispatch callback
- provide rocprofiler_user_data_t value (from dispatch cb) in record callback
* Update tests/bin/CMakeLists.txt
- fix add_subdirectory(hip-in-libraries) order
* Update VERSION
- bump to 0.2.0 in prep for AFAR
[ROCm/rocprofiler-sdk commit:
|
||
|
|
d820cf3018 |
Update rocprofiler_query_available_agents(...) (#596)
* Agent info version
* Complete implementation
- revert "rocprofiler_iterate_agents" to "rocprofiler_query_available_agents"
* Misc tweaks
- update rocprofiler_query_available_agents impl
* Update include/rocprofiler-sdk/agent.h
- Fix undocumented param for rocprofiler_query_available_agents
[ROCm/rocprofiler-sdk commit:
|
||
|
|
cb6b79c323 |
Fix rocprofiler_iterate_callback_tracing_kind_operation_args for HIP compiler callbacks (#532)
* Fix HIP compiler iterate args
- `include/rocprofiler-sdk/hip/api_args.h`
- replace struct fields named "f" with "func"
- replace hip stream fields named "hStream" with "stream"
- `lib/rocprofiler-sdk/callback_tracing.cpp`
- iterate_args for HIP compiler table
- `lib/rocprofiler-sdk/registration.cpp`
- fix warning about roctx num_tables
- `lib/rocprofiler-sdk/hip/hip.def.cpp`
- replace struct fields named "f" with "func"
- replace hip stream fields named "hStream" with "stream"
- `lib/rocprofiler-sdk/{hip,hsa,marker}/utils.hpp`
- improve `stringize_impl`
- `lib/rocprofiler-sdk/hsa/code_object.cpp`
- remove stale commented out code
- `lib/rocprofiler-sdk/hsa/queue_controller.*`
- destory_queue -> destroy_queue
- `tests/tools/json-tool.cpp`
- improve parallelism in tool_tracing_callback
- serialize the marker api args
- only invoke rocprofiler_iterate_callback_tracing_kind_operation_args in exit phase
- `samples/counter_collection/CMakeLists.txt`
- reduce timeout on tests to 120 seconds
* Update lib/rocprofiler-sdk/hsa/utils.hpp
- disable dereference of double pointer in stringize_impl
* Update lib/common
- indirection_level in mpl.hpp
- stringize_arg.hpp
* Rework rocprofiler_iterate_callback_tracing_kind_operation_args
- provide more information in rocprofiler_callback_tracing_operation_args_cb_t
- support specifying the dereference level to account for output paramters
[ROCm/rocprofiler-sdk commit:
|
||
|
|
15302ff11d |
C compatibility for public headers (#566)
* C compatibility for public headers
- add tests/tools/c-tool.c
- builds a tool (which does nothing) with C language
- ensures that tool can be compiled in C
- add tests/c-tool/CMakeLists.txt
- ensures that tool library build from C is a valid tool
- rocprofiler_counter_info_v0_t is_derived is int instead of bool
- C does not have bool unless <stdbool.h> is included
- add `include/rocprofiler-sdk/hsa/api_trace_version.h
- handles providing HSA_*_TABLE_(MAJOR|STEP)_VERSION values if compiled from C
- cmake define in version.h.in for ROCPROFILER_HSA_*_TABLE_(MAJOR|STEP)_VERSION
- HSA table versions compiled with
- use rocprofiler_(hsa|hip|marker)_api_no_args struct to handle incompatibility b/t empty structs in C vs. C++ (size of 0 vs. size of 1)
- extern "C" in include/rocprofiler-sdk/{hsa,hip,marker}/api_args.h
- fixed spelling error: derrived -> derived
- scope YY_NO_INPUT compile definition to lib/rocprofiler-sdk/counters/parser/*
* Revert CDash dashboard
[ROCm/rocprofiler-sdk commit:
|
||
|
|
a360de4550 |
Correlation ID Retirement + misc (#527)
* Correlation ID Retirement
- include/rocprofiler-sdk/buffer_tracing.h
- add rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- include/rocprofiler-sdk/fwd.h
- ROCPROFILER_BUFFER_TRACING_CORRELATION_ID_RETIREMENT
- lib/rocprofiler-sdk/buffer_tracing.cpp
- kind string for correlation id retirement
- lib/rocprofiler-sdk/buffer.hpp
- emplace returns bool
- lib/rocprofiler-sdk/registration.cpp
- pass lib_instance to copy_table functions
- lib/rocprofiler-sdk/context/context.*
- update correlation_id struct
- make ref_count private
- {get,add,sub}_ref_count() functions
- sub_ref_count() performs correlation id retirement
- use stack for "latest" thread-local correlation id
- lib/rocprofiler-sdk/hip/hip.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/hsa.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/marker/marker.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/async_copy.cpp
- migrate to new {get,add,sub}_ref_count() for correlation ids
- handle table instance in async_copy_init / async_copy_save
- lib/rocprofiler-sdk/hsa/queue.cpp
- migrate to new {get,add,sub}_ref_count() for correlation ids
- tweak to external correlation id mapping in WriteInterceptor
- tests/async-copy-tracing/validate.py
- check retired_correlation_ids
- tests/common/serialization.hpp
- support rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- tests/kernel-tracing/validate.py
- check retired_correlation_ids
- tests/common/CMakeLists.txt
- perfetto external project
- tests/common/perfetto.hpp
- perfetto categories + aliases
- add_perfetto_annotation
- metaprogramming helpers
- tests/tools/CMakeLists.txt
- link to tests-perfetto
- tests/tools/json-tool.cpp
- demangling functions
- serialization of marker API callback args
- reduce parallel bottleneck in tool_tracing_callback
- support correlation id retirement
- Multiple threads for buffers
- Support ROCPROFILER_TOOL_CONTEXTS_EXCLUDE env variable
- write_perfetto() function
* Update tests/rocprofv3/tracing/validate.py
- tweak test_hsa_api_trace
* Update PTL submodule
- fixes for data race during destruction of task
* Update lib/rocprofiler-sdk/buffer.*
- unique_buffer_vec_t uses std::unique_ptr instead of allocator::unique_static_ptr_t
* Reduce timeouts in counter collection samples [skip ci]
* Update tests/tools/json-tool.cpp
- tweak demangle(string_view, int*) -> demangle(string_view, int&)
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- move sub_ref_count() to later in async_copy_handler to delay retirement slightly more
[ROCm/rocprofiler-sdk commit:
|
||
|
|
f760a3ceaa |
Updates/fixes for CI, docs, tests, samples, and common library (#528)
- .github/workflows/continuous_integration.yml
- apt-get update before apt-get install
- remove libgtest-dev
- actions-comment-pull-request: v2.4.3 -> v2.5.0
- .github/workflows/formatting.yml
- create-pull-request: v5 -> v6
- cmake/rocprofiler_options.cmake
- remove unused ROCPROFILER_DEBUG_TRACE and ROCPROFILER_LD_AQLPROFILE options
- samples/counter_collection/callback_client.cpp
- corr_id field renamed to correlation_id
- samples/counter_collection/client.cpp
- corr_id field renamed to correlation_id
- include/rocprofiler-sdk/fwd.h
- In rocprofiler_record_counter_t: rename corr_id field to correlation_id
- doxygen fixes
- lib/common/utility.*
- remove get_accurate_clock_id_impl
- timestamp_ns() defaults to CLOCK_BOOTTIME
- lib/rocprofiler-sdk/counters/core.cpp
- fix spelling mistake: extrenal -> external
- corr_id field renamed to correlation_id
- lib/rocprofiler-sdk-tool/tool.cpp
- fix destruction of static tool::output_file before finalization
- scripts/update-docs.sh
- define PROJECT_NAME
- tests/async-copy-tracing/validate.py
- init_time and fini_time checks
- hip_api_traces, marker_api_tracing
- tests/common/serialization.hpp
- fix save function for rocprofiler_record_counter_t following rename of corr_id to correlation_id
- tests/kernel-tracing/validate.py
- init_time and fini_time checks
- relax test_total_runtime range
- tests/rocprofv3/tracing/CMakeLists.txt
- remove -M from rocprofv3-test-systrace-execute
- exclude test_hsa_api_trace in rocprofv3-test-systrace-validate due to HIP API tracing
- tests/rocprofv3/tracing/validate.py
- update test_kernel_trace to accept mangled or demangled
- tests/tools/json-tool.cpp
- remove use of GLOG
- include init_time and fini_time
- write_json(...) function
[ROCm/rocprofiler-sdk commit:
|