develop
69 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
1517a398bf |
[rocprofiler-sdk] Buffer finalization fixes and HSA ABI 0x09 support (#2318)
* [rocprofiler-sdk] Fix buffer flush ordering and sanitizer CI improvements
Buffer Pool Design
------------------
Replace the fixed array-based double buffer with a dynamic pool design to fix race conditions that caused "internal correlation id was retired prematurely" errors.
The original design had a race where flush callbacks could be delivered out-of-order: when buffer 0 fills and begins flushing, writes go to buffer 1. If buffer 1 fills before buffer 0's flush completes, the buffer index wraps back to 0 (which may still be flushing). Independent flush tasks submitted to the thread pool can complete out of order.
The new pool design:
- Uses a std::deque of buffer instances that grows as needed
- Allocates buffers from the pool when the current buffer needs to flush
- Serializes flushes with a mutex to ensure FIFO callback ordering
- Returns buffers to the pool after flush completion
- Eliminates the race between buffer selection and write operations
New Unit Tests
--------------
- buffer_correlation_ordering.cpp: Tests that API records are always delivered before their corresponding retirement records
- buffer_ordering_stress.cpp: Stress tests buffer flush ordering under high contention with multiple threads rapidly filling buffers
HSA Tool Hooks
--------------
Added hsa_tool_hooks.cpp/hpp to register an HSA OnUnload callback that waits for pending flush tasks before tool finalization, preventing "retired prematurely" errors during HSA shutdown.
Sanitizer Improvements
----------------------
- LSAN: Set fast_unwind_on_malloc=1 to prevent deadlock in libgcc unwinder
- LSAN: Added suppressions for external tools (liblzma, liblsan, seq, strdup)
- TSAN: Added suppression for false positive on C++11 thread-safe static initialization in create_write_functor
- ASAN/UBSAN: Added patterns for known issues in HSA runtime, HIP, perfetto
- Disabled attachment tests for sanitizers due to library preloading issues
Other Fixes
-----------
- Thread-trace agent test: Use heap-allocated callback state
- Correlation ID: Refactored reference counting and finalization ordering
* [rocprofiler-sdk] Revert buffer pool design changes
Revert buffer.cpp and buffer.hpp to the original double-buffer design from develop branch. The pool-based redesign introduced concerns about:
- Signal safety (mutex vs atomic_flag)
- API changes (flush() return type)
- Complexity of the new design
This revert removes:
- Dynamic buffer pool with std::deque
- std::mutex/condition_variable synchronization
- buffer_correlation_ordering.cpp test
- buffer_ordering_stress.cpp test
The underlying buffer flush ordering issue will need to be addressed with a different approach that preserves the original API and synchronization characteristics.
* [rocprofiler-sdk] Consistent fini_status checks to prevent correlation ID creation during finalization
- Revert TOCTOU CAS loop change in sub_ref_count() - not needed with consistent checks
- Add fini_status check in correlation_tracing_service::construct() with ROCP_CI_LOG warning
- Add nullptr checks at all construct() call sites (queue.cpp, async_copy.cpp, memory_allocation.cpp)
- Change all 'get_fini_status() > 0' to '!= 0' for consistent behavior:
- hsa/queue.cpp (lines 105, 210)
- hsa/async_copy.cpp (line 344)
- hsa/hsa_barrier.cpp (line 43)
- buffer.cpp (lines 107, 138, 185)
This ensures no correlation IDs are created once finalization starts (fini_status != 0), preventing races between finalization and ongoing tracing operations.
* [rocprofiler-sdk] Replace arrival-order checks with timestamp-based temporal validation
Buffer records are not guaranteed to arrive in any specific order. Tests and samples should use timestamps for temporal ordering validation instead.
Changes:
- samples/external_correlation_id_request: Replace 'retired prematurely' arrival order check with timestamp-based validation that retirement timestamp >= max(end_timestamps) for records with the same correlation ID
- tests/external_correlation.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check
- tests/registration.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check
- tests/roctx.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check
Correlation IDs are not guaranteed to be monotonically increasing when records are sorted by timestamp. Temporal ordering should be validated using the timestamp fields in each record.
* [rocprofiler-sdk] Revert external/CMakeLists.txt SYSTEM keyword removal
Restore the SYSTEM keyword to target_include_directories for rocprofiler-sdk-fmt to match develop branch.
* [rccl] Remove orphaned rocSHMEM gitlink
Remove orphaned submodule reference that was introduced during a merge but never had a corresponding .gitmodules entry, causing CI failures with "fatal: no submodule mapping found in .gitmodules".
* [rocprofiler-sdk] Add HSA ABI version 0x09 support
Add ABI checks for HSA_AMD_EXT_API_TABLE_STEP_VERSION 0x09 which introduces hsa_amd_counted_queue_acquire and hsa_amd_counted_queue_release functions (added in rocr-runtime SWDEV-561708).
* [rocprofiler-sdk] Handle finalized status gracefully in buffer flush operations
This commit consolidates fixes for handling the finalization status during buffer flush operations across the SDK.
Changes:
- Tool and samples: Handle ROCPROFILER_STATUS_ERROR_FINALIZED gracefully when flushing buffers, as this indicates buffers were already flushed during finalization (not an error condition)
- HSA handlers (queue.cpp, async_copy.cpp, hsa_barrier.cpp): Use > 0 check for fini_status to allow operations during finalization process
- buffer.cpp: Revert fini_status checks to use > 0 for consistency
- correlation_id.cpp: Add fini_status > 0 check with ROCP_TRACE logging to prevent correlation ID creation after finalization starts
Files modified:
- source/lib/rocprofiler-sdk-tool/tool.cpp
- tests/tools/json-tool.cpp
- source/lib/rocprofiler-sdk/tests/registration.cpp
- source/lib/rocprofiler-sdk/tests/roctx.cpp
- samples/api_buffered_tracing/client.cpp
- samples/counter_collection/buffered_client.cpp
- samples/counter_collection/device_counting_async_client.cpp
- samples/external_correlation_id_request/client.cpp
- samples/pc_sampling/client.cpp
- source/lib/rocprofiler-sdk/buffer.cpp
- source/lib/rocprofiler-sdk/context/correlation_id.cpp
- source/lib/rocprofiler-sdk/hsa/queue.cpp
- source/lib/rocprofiler-sdk/hsa/async_copy.cpp
- source/lib/rocprofiler-sdk/hsa/hsa_barrier.cpp
* [rocprofiler-sdk] Remove hsa_tool_hooks and simplify buffer flush handling
Remove the hsa_tool_hooks infrastructure and simplify buffer flush calls in samples and tools. The ERROR_FINALIZED handling was overly complex and the hsa_tool_hooks OnUnload synchronization is no longer needed.
Changes:
- Remove hsa_tool_hooks.cpp/hpp and related registration.cpp code
- Simplify buffer flush calls in samples to use direct ROCPROFILER_CALL
- Simplify buffer flush in tool.cpp and json-tool.cpp
- Remove ERROR_FINALIZED special handling from test files
Co-Authored-By: Claude <noreply@anthropic.com>
* [rocprofiler-sdk] Fix output_stream move semantics to null source pointers
The default move constructor and move assignment operator for output_stream did not null out the source's pointers after the move. This caused double-close when the moved-from temporary was destroyed, leading to use-after-free crashes (SIGSEGV in std::ostream::sentry).
Co-Authored-By: Claude <noreply@anthropic.com>
* [rocprofiler-sdk] Improve Perfetto trace writer and sanitizer configuration
- generatePerfetto.cpp: Move output_stream into shared_state to prevent use-after-free race conditions during Perfetto callback execution
- run-ci.py: Simplify and consolidate sanitizer environment variable configuration for better maintainability
Co-Authored-By: Claude <noreply@anthropic.com>
* [rocprofiler-sdk] Revert run-ci.py changes that broke sanitizer suppressions
The previous changes removed MEMCHECK_SANITIZER_OPTIONS which is required for CTest to properly pass suppression files to the sanitizers during memcheck runs.
Co-Authored-By: Claude <noreply@anthropic.com>
* Revert "[rccl] Remove orphaned rocSHMEM gitlink"
This reverts commit 1ad21003941355658fff8114fa27768f11a948f7.
* [rocprofiler-sdk] Revert registration.cpp changes
Revert changes to registration.cpp to match develop branch.
Co-Authored-By: Claude <noreply@anthropic.com>
* [rocprofiler-sdk] Remove suppression file content printing from run-ci.py
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix output_stream move ctor/assignment operator
* Fix erroneous revert of registration.cpp
* Fix handling of fini status in correlation ID construction
* [rocprofiler-sdk] Fix OMPT segfault during finalization
Add nullptr checks in OMPT tracing code to handle the case where correlation_tracing_service::construct() returns nullptr during finalization. This fixes segfaults in openmp-target-sample and tests.integration.execute.openmp-tools.
The correlation ID construction now returns nullptr when fini_status > 0, but the OMPT callbacks were not checking for this, causing crashes when dereferencing the null pointer during OpenMP runtime shutdown.
Changes:
- event_common(): Return nullptr early if correlation ID is null
- event(): Check for nullptr before calling sub_ref_count()
- ompt_task_create_callback(): Return early if correlation ID is null
- ompt_task_schedule_callback(): Return early if correlation ID is null
* [rocprofiler-sdk] Fix HSA API tracing segfault during finalization
Add nullptr check in hsa_api_impl::functor after correlation ID construction. During finalization, correlation_service::construct() returns nullptr, and without this check the code would dereference the null pointer when accessing corr_id->internal.
This fixes the SEGV at address 0x000000000008 (null + 8 byte offset) that occurs when HSA async event threads call hsa_signal_destroy during runtime shutdown after finalization has started.
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
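Note on the output_stream move fix in commit 1517a398bf: the message describes a defaulted move that left the source's pointers intact, so both objects later closed the same stream. The sketch below illustrates that pattern only. `owned_stream` and its members are hypothetical names and assume a wrapper that owns a heap-allocated `std::ostream`; this is not the SDK's actual `output_stream` definition.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <utility>

// Hypothetical stand-in for a wrapper that owns a heap-allocated stream.
struct owned_stream
{
    std::ostream* stream = nullptr;

    owned_stream() = default;
    explicit owned_stream(std::ostream* s) : stream{s} {}

    owned_stream(const owned_stream&) = delete;
    owned_stream& operator=(const owned_stream&) = delete;

    // Take ownership and null the source so its destructor has nothing to close.
    owned_stream(owned_stream&& rhs) noexcept
    : stream{std::exchange(rhs.stream, nullptr)}
    {}

    owned_stream& operator=(owned_stream&& rhs) noexcept
    {
        if(this != &rhs)
        {
            close();  // release whatever this object currently owns
            stream = std::exchange(rhs.stream, nullptr);
        }
        return *this;
    }

    ~owned_stream() { close(); }

    // Safe to call repeatedly: the pointer is cleared after the delete.
    void close()
    {
        delete stream;
        stream = nullptr;
    }
};

// Usage: after the move, only the destination owns the stream.
inline void owned_stream_example()
{
    owned_stream src{new std::ostringstream{}};
    owned_stream dst{std::move(src)};
    assert(src.stream == nullptr);
    assert(dst.stream != nullptr);
}
```

With a defaulted move constructor, `src.stream` would still point at the stream after the move, and both destructors would delete it.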
||
|
|
dd44ae3295 |
[Palamida scan] SWDEV-553053 Adding missing copyrights information (#836)
* SWDEV-553053 Adding missing copyrights information |
||
|
|
a697941150 |
[ROCProfiler SDK CI] Runners Update & Workflow Cache Improvement (#722)
Overriding checks/reviewers as CODEOWNER changes are pending * Runners Update Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update aqlprofile-continuous_integration.yml Testing ROCProfiler-SDK Testing ROCProfiler-SDK Changing CDash Fixing ROCProfiler-SDK Moving AQLProfile Navi3 and Navi4 to DIND Moving AQLProfile Navi3 and Navi4 to DIND Moving AQLProfile Navi3 and Navi4 to DIND Moving AQLProfile Navi3 and Navi4 to DIND Moving AQLProfile Navi3 and Navi4 to DIND Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating images Updating images Updating images Updating images Updating RHEL and SLES for AQLProfile Fixing RPM OSes AQLprofile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for AQLProfile Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK 
Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK Updating RHEL and SLES for ROCProfiler-SDK * Fixing ENV for ROCProfiler-SDK Fixing ENV for ROCProfiler-SDK Temp workaround for OpenMP targets Fixing ROCProfiler-SDK for Ubuntu * Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Update rocprofiler-sdk-continuous_integration.yml Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Fixing Ubuntu Workflow Adding RPM Package Adding RPM Package Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Fixing OPenMP Compiler Issues Update rocprofiler-sdk-continuous_integration.yml Update rocprofiler-sdk-continuous_integration.yml Update aqlprofile-continuous_integration.yml Update rocprofiler-sdk-continuous_integration.yml Fixing AQLProfile * [rocprofiler-sdk][CI] add latest aqlprofile to rocprofiler-sdk workflow (#352) * add aqlprofile * misc. 
* format * add sudo to install * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml --------- Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com> Update aqlprofile-continuous_integration.yml Removing extra packages Removing extra packages Fixing ROCM Path Issues Fixing ROCM Path Issues Fixing ROCM Path Issues Fixing RHEL Fixing RHEL Fixing RHEL Fixing RHEL Fixing RHEL Fixing Sanitizers * General Fixes * Fixing ROCProfiler-SDK CI * Fixing ROCProfiler-SDK CI * Update projects/aqlprofile/dashboard.cmake Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * General Fixes * Update Readme.txt * Fix ROCProfiler SDK CI * Fix ROCProfiler SDK CI * Fix ROCProfiler SDK CI * Fix ROCProfiler SDK CI * Update rocprofiler-sdk-continuous_integration.yml * Fix ROCProfiler SDK CI * Fix ROCProfiler SDK CI * Fix for RHEL and Sanitizers for ROCProfiler-SDK * Fix for RHEL and Sanitizers for ROCProfiler-SDK * Fix for RHEL and Sanitizers for ROCProfiler-SDK * Fix for RHEL and Sanitizers for ROCProfiler-SDK * Upgrade ROCm Release & Fix for RHEL & SLES - ROCProfiler SDK CI * Fix for RHEL & SLES - ROCProfiler SDK CI * Fix for RHEL & SLES & Sanitizers - ROCProfiler SDK CI * Fix for RHEL & SLES & Sanitizers - ROCProfiler SDK CI * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Adding ROCR Installation * Adding ROCR Installation * Adding ROCR Installation * Adding ROCR Installation * Adding ROCR Installation * Adding ROCR Installation * Update run-ci.py * Fix for Sanitizers & Fix for RHEL 8.8 * Updating Code Coverage Workflow * Updating Code Coverage Workflow * 
Formatting Fix * Formatting Fix * Fix for Code Coverage & Sanitizers * Fix for Code Coverage & Sanitizers * Fix for Code Coverage & Sanitizers * Caching Docker * Caching Docker * Caching Docker * Changing Runner for CI Builder * Adding CCache * Fixing Core * Fixing Core * Fixing Core * Fixing Core * Fixing Core * Update rocprofiler-sdk-continuous_integration.yml * Update ROCm and amdgpu repository configurations * Refactor repository configuration commands in CI * Fix installation commands in CI workflow * Remove unnecessary packages from installation commands * Update ROCm and amdgpu repository paths in CI config * Update pip installation commands to handle errors * Install AWS CLI in CI workflow * Update rocprofiler-sdk-continuous_integration.yml * Remove awscli installation from CI workflow * Modify PATH and pipx install commands in CI config * Refactor ROCm SDK CI workflow to eliminate redundancy * Add safe.directory configuration for git * Update rocprofiler-sdk-continuous_integration.yml * Fix CMake install prefix in CI workflow * Add variant option to ccache configuration * Change compiler launcher from ccache to sccache * Set up Python virtual environment in CI workflow * Remove ccache launcher from CMake build * Add environment setup for building projects * Add Curl installation step for RHEL 8.8 * Update rocprofiler-sdk-continuous_integration.yml * Update rocprofiler-sdk-continuous_integration.yml * Fixing RPM * Fixing RPM & Code Coverage * Fixing RPM * Fixing CI * Lowering the size of the docker image * Update aqlprofile-continuous_integration.yml * Updating paths in AQLProfile * Splitting the Build CI Docker Images from Main CI * Create Dockerfile.ci, update ci docker workflow to reference it * Splitting the Build CI Docker Images from Main CI * Add new line to Dockerfile.ci * Remove on schedule logic from ci docker workflow, change cdash project name in run-ci.py * Update file path in build_ci_docker_images.yml * Remove context from docker step * 
Update file path in build_ci_docker_images * more path changes * remove context again * Update rocprofiler-sdk-build_ci_docker_images.yml * Update rocprofiler-sdk-code_coverage.yml * Update rocprofiler-sdk-continuous_integration.yml * Remove env variables from rocprofiler-sdk-build_ci_docker_images.yml * Rename docker images file * Rename KEY to FILE_NAME for Docker tarball * [rocprofiler-sdk][CI] lint fixes (#830) * lint fixes. * Updating Code Coverage Workflow * Update rocprofiler-sdk-code_coverage.yml * Update format.hpp * Update format.hpp --------- Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com> * TEMP: Removing ROCR build from develop * [rocprofiler-sdk][SDK] Add new HIP API changes for ROCm 7.1 (#856) * Add new HIP 7.1 changes. * bug fix. * bug fix. * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fix typo in hipDriverEntryPoint case statement --------- Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Venkateshwar Reddy Kandula <Venkateshwarreddy.Kandula@amd.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: jbonnell-amd <jason.bonnell@amd.com> Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com> |
||
|
|
9df2c1ec68 |
[rocprofiler-sdk] Fix formatting, linting, and CI workflows (#345)
* [rocprofiler-sdk] Fix formatting and lint workflows
- several formatting workflows were silently failing when listing files
* format metrics_test.h
* Improve formatting job robustness
* Source formatting workflow does not use container
* Use PyPi clang-format
* Format rocpd/source/csv.cpp source
* Fix rocprofiler-sdk CI workflow
- fix invalid context access
* Update run-ci.py
- fix ctest_update
* Update run-ci.py
- handle old checkout in ROCm/rocprofiler-sdk |
||
|
|
906030caf4 |
Changing CDash Project (#188)
* Changing CDash Project * Fixing CI * Fixing AQLProfile CDash * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI * Fixing CI |
||
|
|
f051f37cdc |
GPU-less runners update (#503)
* Update codeql.yml
* Update codeql.yml
* Update codeql.yml
* Update codeql.yml
* Update codeql.yml
* Update codeql.yml
* Update codeql.yml
* Update codeql.yml
* Update codeql.yml
* Update codeql.yml
* clean up
* clean up
* clean up
* Update codeql.yml
* Update codeql.yml
---------
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
839c07c4aa |
[CI] Testing stability (#486)
* [CI] Testing Stability
- CMake option ROCPROFILER_DISABLE_UNSTABLE_CTESTS
- used for tests which periodically fail around 1 out of every 10 runs
- set to ON while instability remains; this needs to be set to OFF in ROCm 7.1 or, ideally, ROCm 7.0.1
- Use FIXTURES_SETUP and FIXTURES_REQUIRED for some tests
- replace "threw an exception" with "${ROCPROFILER_DEFAULT_FAIL_REGEX}" for misc FAIL_REGULAR_EXPRESSIONS
* Remove contents of all EXCLUDE_{TESTS,LABEL}_REGEX from CI workflow
* Disable patch git step in code-coverage run
* Tweak spin time of reproducible runtime
* Removed patch git step in code-coverage run
* Update ROCPROFILER_DEFAULT_FAIL_REGEX
* Mark test-counter-collection tests as unstable
- add fixtures setup/required
* Remove ATTACHED_FILES_ON_FAIL
- CDash doesn't store or enable downloading these properly anyway
* Relax collection-period fuzzing window
* Disable unstable collection-period test
- too unstable
* formatting
* Disable unstable device_counting_service_test.async_counters
* Suppress perfetto internal data race errors
* Switch code-coverage CI jobs to mi300 runner
* Timeout increases
* rocprofv3-test-rocpd updates
- add fixtures
- switch executable
- redefine input/output paths
* Revert code-coverage job to mi300a runner
* Update rocprofv3-test-rocpd-execute-multiproc
- reduce problem size
* disable multiproc rocpd
* Split code-coverage into separate workflow
- network issues cause this job to fail frequently
- when in a separate workflow, it can be restarted easily
* Fixtures for rocprofv3-test-trace-hip-in-libraries
* Disable unstable device_counting_service_test.sync_counters
* Potential fix for code scanning alert no. 171: Workflow does not contain permissions
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Switch code-coverage to run on rocprof-azure
- mi300a EMU runner set is unstable (network issues)
* tests/rocprofv3/pc-sampling SKIP_REGULAR_EXPRESSION
* Update rocprofv3-test-list-avail-trace-execute
- reduce log level and increase timeout
* rocprofv3: Prevent recursive call to rocprofv3_error_signal_handler + log chaining
* rocprofv3: Use ROCP_ERROR + std::exit instead of ROCP_FATAL
- should help with SKIP_REGULAR_EXPRESSION
---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
[ROCm/rocprofiler-sdk commit:
|
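Note on "Prevent recursive call to rocprofv3_error_signal_handler" in commit 839c07c4aa: one common way to stop a signal handler from re-entering itself is an atomic guard flag. The sketch below shows that general technique. Every name in it is hypothetical, and the commit message does not say which mechanism rocprofv3 actually uses.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// All names below are hypothetical; this is not rocprofv3's handler.
static std::atomic<bool> handler_active{false};
static std::atomic<int>  handled_count{0};
static void (*cleanup_hook)(int) = nullptr;  // stands in for logging/cleanup work

// Handler body with a re-entrancy guard: if a second signal arrives (or the
// cleanup step itself faults) while the handler is running, the nested call
// returns immediately instead of recursing.
static void guarded_error_handler(int signo)
{
    if(handler_active.exchange(true)) return;  // already handling a signal

    ++handled_count;
    if(cleanup_hook) cleanup_hook(signo);

    handler_active.store(false);
}

// Usage: simulate a nested signal by having the cleanup step re-enter.
static void guarded_handler_example()
{
    handled_count = 0;
    cleanup_hook  = [](int signo) { guarded_error_handler(signo); };
    guarded_error_handler(SIGSEGV);
    assert(handled_count == 1);  // the nested call was rejected
    cleanup_hook = nullptr;
}
```

A handler like this could be installed with `std::signal(SIGSEGV, guarded_error_handler)`. A real fatal-error handler would normally end the process rather than clear the flag and return.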
||
|
|
8f5d00ca5d |
[SDK] Add gfx950 targets for tests and samples (#399)
* add gfx950 targets.
* add gfx950 targets to ci workflows.
* Format.
---------
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Vaddireddy, Sushma <Sushma.Vaddireddy@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0a9849a5cf |
Add copyright disclaimer for scan (#453)
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0e93099fd7 |
[rocprofv3] SQLite3 database output (rocpd) support + rocprofiler-sdk-rocpd (#403)
* [rocprofv3] rocpd SQLite3 database output support
* Move counters xml and yaml to source/share/rocprofiler-sdk
- more representative of install hierarchy
* Add share/rocprofiler-sdk/rocpd SQL files
* Experimental rocprofiler-sdk SQL API
* rocprofv3 default output format is rocpd
* Fix rocpd event ids for counter collection w/o kernel dispatch
* Remove fktable entries from rocpd_tables.sql
* Fix rocpd schema path
* Fix install component for roctx python bindings
* rocprofiler-sdk-rocpd
- create include/rocprofiler-sdk-rocpd
- create rocprofiler-sdk-rocpd library, package, etc.
- default all "guid" fields to "{{guid}}" in tables
- remove "{{view_uuid}}" support (always unused)
* Migrate rocprofv3 to use rocprofiler-sdk-rocpd
* Fix missing foreign key reference
* Revert change
* Fix cmake comment
* Fix maybe-uninitialized compiler warning
* Fix maybe-uninitialized compiler warning
* Add logging to rocpd_sql_load_schema
* Improve string sanitization when inserting json strings
* Initialize rocpd logging on rocprofiler-sdk-rocpd library load
* Revert lib/output/generatePerfetto.cpp changes
* [temporary] Tweak rocprofv3-test-list-avail-trace-execute test log level
* Update get_install_path for lib/rocprofiler-sdk-rocpd/sql.cpp
- try to resolve issues on RHEL/SLES for dladdr
* Update lib/common/logging.cpp
- enable environ overrides
* dlsym for rocpd_sql_load_schema
* Make dl_info.dli_fname lexically normal
* Implement node_info alternatives if /etc/machine-id does not exist
* Misc include fixes
* SHA256 and UUIDv7 support
* Implement UUIDv7 in generateRocpd.cpp
* Support push/pop environment variables
* Minor tweak
* Fix glog segfaults when unsetting glog env
* Updated CHANGELOG
* Updates tests/pytest-packages
- rocpd_reader.py: RocpdReader
* Update tests / marker_views.sql
- add test_rocpd_data
* Update rocpd_tables.sql
- Use AUTOINCREMENT
- insert "uuid" and "guid" into rocpd_metadata
* Minor updates to generateRocpd.cpp
- don't quote GUID
- use sqlite3_open_v2
- use sqlite3_close_v2
* Update execute_raw_sql_statements_impl
- uses sqlite3_last_insert_rowid for autoincrement
* Update SQL deferred_transaction
- CI check for nullptr to connection
* Apply suggestions from code review
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
* Code review updates
- formatting
- replace if with switch
- remove loop for {{uuid}}
* Fix pmc_groups handling in rocprofv3
* Address code review feedback
- Include rocm_version in rocprofv3 version info
- Note `--version` option for `rocprofv3` in CHANGELOG.md
- remove commented out code
* Fix packaging dependencies
* Fix install package step of CI workflow
* Fix install package step of CI workflow
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
[ROCm/rocprofiler-sdk commit:
|
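Note on "SHA256 and UUIDv7 support" in commit 0e93099fd7: a UUIDv7 puts a 48-bit Unix millisecond timestamp in front of the random bits, so identifiers sort roughly by creation time. The sketch below follows the RFC 9562 layout. `make_uuid_v7` is a hypothetical helper, not the code in generateRocpd.cpp, and `std::mt19937_64` is used only for illustration (it is not a cryptographic source).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <random>
#include <string>

// Build a UUIDv7 string from a Unix timestamp in milliseconds plus random bits.
// Layout (RFC 9562): 48-bit timestamp, 4-bit version (7), 12 random bits,
// 2-bit variant (binary 10), 62 random bits.
inline std::string make_uuid_v7(std::uint64_t unix_ms, std::mt19937_64& rng)
{
    std::array<std::uint8_t, 16> bytes{};

    // Bytes 0..5: timestamp, most significant byte first.
    for(int i = 0; i < 6; ++i)
        bytes[i] = static_cast<std::uint8_t>((unix_ms >> (8 * (5 - i))) & 0xff);

    // Bytes 6..15: random data, then overwrite the version and variant bits.
    const std::uint64_t r1 = rng();
    const std::uint64_t r2 = rng();
    for(int i = 0; i < 8; ++i)
        bytes[6 + i] = static_cast<std::uint8_t>((r1 >> (8 * i)) & 0xff);
    bytes[14] = static_cast<std::uint8_t>(r2 & 0xff);
    bytes[15] = static_cast<std::uint8_t>((r2 >> 8) & 0xff);
    bytes[6]  = static_cast<std::uint8_t>((bytes[6] & 0x0f) | 0x70);  // version 7
    bytes[8]  = static_cast<std::uint8_t>((bytes[8] & 0x3f) | 0x80);  // variant 10

    // Format as 8-4-4-4-12 lowercase hex.
    std::string out;
    out.reserve(36);
    char hex[3];
    for(std::size_t i = 0; i < bytes.size(); ++i)
    {
        if(i == 4 || i == 6 || i == 8 || i == 10) out.push_back('-');
        std::snprintf(hex, sizeof(hex), "%02x", static_cast<unsigned>(bytes[i]));
        out.append(hex);
    }
    return out;
}

// Usage: the timestamp occupies the leading hex digits of the result.
inline void uuid_v7_example()
{
    std::mt19937_64 rng{42};
    const auto      id = make_uuid_v7(0x0123456789abULL, rng);
    assert(id.size() == 36);
    assert(id.compare(0, 13, "01234567-89ab") == 0);
    assert(id[14] == '7');
}
```

In real use the timestamp would come from `std::chrono::system_clock` and the generator would be seeded from `std::random_device`.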
||
|
|
b097e276a9 |
[rocprofv3] Add rocpd output support (part 1: prelude) (#401)
* [rocprofv3] Add rocpd output support (part 1: prelude)
- git submodules for sqlite3, GOTCHA, and pybind11
- HIP stream data
- rocprofiler_query_intercept_table_name(...)
- serialization load
- rocprofiler::sdk::get_perfetto_category(KindT)
- rocprofiler::sdk::parse::strip
- common library updates
- md5sum
- hasher
- simple_timer
- static_tl_object
- get_process_start_time_ns(pid_t)
- output library updates
- node_info
- file_generator (generator is now virtual base class)
- stream info updates
* Added submodules
* Code review updates
* Minor unused-but-set-X warning fixes
* Update CI
- install libsqlite3-dev package
* Update CI
- install libsqlite3-dev package
* Fix static thread-local object memory leak
- also fix signal handler chaining
* Remove URL from comment
* Remove page migration exception
* Enable ROCPROFILER_BUILD_SQLITE3 by default
- try find_package(SQLite3) first and then build when ROCPROFILER_BUILD_SQLITE3=ON
* Fix gotcha installation
- make install of target optional
* Validate tracing + counter collection dispatch data
- i.e. correlation ids, thread ids, timestamps
* Make find_package(SQLite3) optional
- ROCm CI does not have SQLite3 dev package installed and cannot build from source (missing tclsh)
* Fixes to tracing + counter collection test
* get_process_start_time_ns update
- original implementation did not work
* Fix pytest-packages test_perfetto_data for counter collection
- erroneous failure when used with same PMC + multiple agents
* cmake policy: option() honors normal variables
- for GOTCHA submodule
* Improve samples/api_buffered_tracing stability
- reduce likelihood of sporadic exception throw
* Update gotcha submodule
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
cfce653d86 |
[SDK] Standardize rocprofiler-sdk counter definition YAML schema (#370)
* Convert YAML Format
Convert YAML format and reader to properly read the YAML.
Comparison between outputs from the YAML shows only changes in ordering
of architectures (and ids).
* Test fixes
* Add script for converting the YAML schema to source/scripts
* Update documentation
* Change the extra counter code block to YAML
* Add missing new line at EOF
* remove name issues
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
89fbdeb196 |
[docs] Improve readability of ROCprofiler-SDK API library documentation (#359)
* Use custom .rst to make api doc more readable.
* Update index.rst
* Misc docs updates
- doxygen source code fixes
- updated doxygen files
- fixed conf.py (does not generate code in source tree)
* Update source/docs/api-reference/rocprofiler-sdk_api_reference.rst
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
* Update source/docs/api-reference/rocprofiler-sdk_api_reference.rst
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
* Update source/docs/api-reference/rocprofiler-sdk_api/modules.rst
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
* Update source/docs/api-reference/rocprofiler-sdk_api/global_data_structures_topics_files.rst
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
* Duplicate
* test warnings
* Update CMakeLists.txt
* Update rocprofiler-sdk.dox.in
* Update update-docs.sh
* fix docs build failures by -q -T flags.
* set warn_as_error to NO.
* test -W to suppress warnings.
* remove -q flag from make.
* reduce dot graph depth to 100
* Update custom docs target
- docs target is now no longer part of the dependency list for the all target
- installation of docs requires explicitly building the docs target (i.e. OPTIONAL install of _build/html/ folder)
* add quiet and trace mode back.
* increase DOT_GRAPH_MAX_NODES back to 500.
* Format.
---------
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
9f7703f918 |
Build system (libdw), correlation ID, and shebang fixes (#354)
* Fix compilation for output library
- link to targets for ATT (amd-comgr, dw, elf)
* Relax correlation ID retirement log failures
- only fail for correlation ID retirement underflow when building in CI mode
* Fix shebang for several files
- license was inserted before shebang in several places
* Update code coverage exclude folders for samples
* Tweak to agent tests
- test to make sure hsa agent is not the old value instead of testing that it is the new value
* Fix libdw include/link
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
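Note on "license was inserted before shebang" in commit 9f7703f918: a `#!` line only takes effect when it is the very first line of a script, so a license header placed above it stops the interpreter line from working. The sketch below shows one way to repair such a file's text. `hoist_shebang` is a hypothetical helper, not a script from the repository.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Return `text` with its first "#!" line moved to the very top. Text that
// already starts with a shebang, or contains none, is returned unchanged.
// Assumption: any line starting with "#!" is a shebang, which is good enough
// for a sketch but may misfire on unusual comments.
inline std::string hoist_shebang(const std::string& text)
{
    if(text.rfind("#!", 0) == 0) return text;  // already the first line

    std::istringstream input{text};
    std::string        line;
    std::string        shebang;
    std::string        rest;
    while(std::getline(input, line))
    {
        if(shebang.empty() && line.rfind("#!", 0) == 0)
            shebang = line;
        else
            rest += line + "\n";  // note: always ends lines with '\n'
    }
    return shebang.empty() ? text : shebang + "\n" + rest;
}

// Usage: a license comment that was inserted above the interpreter line.
inline void hoist_shebang_example()
{
    const std::string broken = "# MIT License\n#!/usr/bin/env python3\nprint('hi')\n";
    const std::string fixed  = "#!/usr/bin/env python3\n# MIT License\nprint('hi')\n";
    assert(hoist_shebang(broken) == fixed);
}
```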
||
|
|
a83ee16d51 |
[Misc] fix the agent_id field (#297)
* fix the agent_id field
* Fix shebang
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
8778237298 |
doc improvements for 1.0.0 part 2 (#330)
* update installation steps
* GitHub Issue #50: Adding READMEs for samples
* Making name change to ROCprofiler-SDK for consistency
* Fix HIP trace documentation
* Fix HSA trace in docs
* Fix kernel trace in docs
* Fixing memory copy and memory allocation traces
* runtime trace and sys trace doc update
* Fix scratch memory doc
* kernel naming and filtering options
* Adding collection period in docs
* Perfetto configs update
* summary output file
* kernel trace format fix
* update CHANGELOG
* Agent index doc update
* rocm-smi output
* group by queue option
* Updated --group-by-queue description
* perfetto visualization
---------
Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
5a5fe7f4bf |
Copyright Compliance (#333)
* Added copyright information to requested files
* Formatting
* Fix bad function name error
[ROCm/rocprofiler-sdk commit:
|
||
|
|
36f4788ad5 |
[CI] Miscellaneous Testing Updates (#305)
* Add rocprofiler-sdk-utilities.cmake
- contains cmake function rocprofiler_sdk_get_gfx_architectures
* Update perfetto_reader.py
- fix hash collision
* Update project names in tests folders
- rocprofiler-tests -> rocprofiler-sdk-tests
* Fix incorrect allocation-error handling
* [CI] Disable openmp tests for navi2, navi3, and navi4
* Suppress leaks by omptarget and llvm
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
9764f96427 |
removing gfx940 and gfx941 targets (#286)
* removing gfx940 and gfx941 targets
* updated changelog
[ROCm/rocprofiler-sdk commit:
|
||
|
|
95ac740f25 |
Fix install for conversion-script (#211)
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0608bbb4db |
SWDEV-499989: Conversion Script to change counter collection output format from v3 to v1 (#107)
* SWDEV-499989: Add script to convert rocprofv3 counter collection output format to that of v1
* Add logging and argparsing
* Dropping duplicated counters in pmc multiple lines
* Adding test for conversion
* moving conversion script to test files
* copy conversion script from scripts folder
[ROCm/rocprofiler-sdk commit:
|
||
|
|
e677801859 |
Undefined behavior warnings caught by ROCPROFILER_DEFAULT_FAIL_REGEX (#23)
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX
- add UBSAN_OPTIONS to setup-sanitizer-env.sh
* Improve ROCPROFILER_DEFAULT_FAIL_REGEX
* Use -fno-sanitize-recover=undefined flag
- this compiler flag causes all undefined behavior errors to exit
* Revert ROCPROFILER_DEFAULT_FAIL_REGEX
* fix for shift overflow
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
edb51fc861 |
update copyright date to 2025 (#102)
* Update LICENSE
* Update conf.py
* Update copyright year
* [fix] Update copyright year
* Update copyright year "ROCm Developer Tools"
* Add license headers to c++ files
* Add license to *.py
* Update licenses in rocdecode sources
---------
Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
04ace57589 |
ROCTx Documentation (#29)
* Add roctx doc
* Add roctx doxyfile input
* Update links and toc
* Build doxysphinx for both doxygen files
* Update scripts
* Generate roctx doxygen files
* Change doxygen path
to allow for 2 doxyfiles
* Make doxygen dir for script
* Call make _doxygen dir with p flag
* Create _doxygen dir in workflow
* Create doc dirs for doxygen
* Run update docs as sudo
* Fix typo in mkdir command
* Include graphviz for dot
* Install dot for docs CI
* Install dot as sudo due to permission denied
* Install doxygen via sudo
* Install doxysphinx
* Add postcheckout step to RTD to config and gen doxygen docs
* On RTD, update doxygen after creating env
* update docs.yml
* update docs.yml
* fixing build-docs-from-source
* Fixing build docs from source
* update docs.yml
* trying to fix readthedocs
* trying to fix readthedocs
* update docs.yml
* improve mainpage documentation
* update docs
* clang-format fix
---------
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
247ba0afa1 |
Download perfetto trace_processor_shell (#105)
* Download perfetto trace_processor_shell
* Upgrade to perfetto-trace-processor-shell v0.0.4
* Fix run-ci.py warning
- warning message:
CMake Warning (dev) at /.../build/CTestCustom.cmake:16:
Syntax Warning in cmake code at column 77
Argument not separated from preceding token by whitespace.
* Update tests/pytest-packages/pytest_utils/perfetto_reader.py
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
a79f8a0198 |
SDK: OMPT Support (#22)
* Ability to select alternative compiler per file
Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.
Misc updates
Update OpenMP target sample
- samples/ompt -> samples/openmp_target
- fix sample test of openmp-target
- reorganize files
Rework OpenMP implementation
Minor OpenMP implementation cleanup
Rename samples/openmp_target CMake targets
Add tests/bin/openmp
- OpenMP target test app in tests/bin/openmp/target
Format samples/openmp_target CMakeLists.txt
Misc lib/rocprofiler-sdk/openmp cleanup
- fix includes
- convert_arg
Update openmp.def.cpp
- tweak includes
- remove lots of temporary variables
Update samples
- common::get_callback_id_names() -> common::get_callback_tracing_names()
- add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample
Fix code object operation names
- add "CODE_OBJECT_" prefix
Update include/rocprofiler-sdk/openmp/api_id.h
- remove spurious comment
Miscellaneous openmp updates
- similar API for openmp_begin and openmp_end
- move implementations of ompt callbacks to openmp.cpp
- ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events
[SWDEV-484495] Fix int truncation in CSV output (#1098)
CSV output truncates doubles to ints when it shouldn't. Derived metrics
are (mostly) doubles and lose precision (or become worthless) if treated
as an int. Converted these to double to match the format we return from
rocprof-sdk.
Co-authored-by: Benjamin Welton <ben@amd.com>
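The CSV truncation described above can be illustrated with a minimal sketch. This is not the rocprofiler-sdk implementation: the column names, the counter names, and the `write_counter_rows` helper are all hypothetical, and the only point is how casting a derived metric to an int discards its fractional part.

```python
import csv
import io


def write_counter_rows(rows, as_double=True):
    """Write (name, value) rows as CSV text.

    With as_double=False every value is cast to int, which mimics the
    truncation described in the commit message. Hypothetical helper.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical header, not the actual rocprofv3 CSV schema.
    writer.writerow(["Counter_Name", "Counter_Value"])
    for name, value in rows:
        writer.writerow([name, float(value) if as_double else int(value)])
    return out.getvalue()


# Hypothetical derived metrics: a ratio and a whole-number count.
rows = [("HYPOTHETICAL_RATIO", 0.75), ("HYPOTHETICAL_COUNT", 128.0)]
truncated = write_counter_rows(rows, as_double=False)
preserved = write_counter_rows(rows, as_double=True)
```

In this sketch the ratio collapses to `0` in `truncated`, while `preserved` keeps `0.75`, which is why a ratio-like derived metric can become worthless when treated as an int.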
Update limit for max counter records in rocprof-tool (#1073)
A fixed sized std::array is used to store counter records in rocprofiler SDK. This limit was breached in SWDEV-484742. Upping the limit to 512 to be less likely to reach this limit again.
adding proxy ompt_data_t * arguments
fixes for proxy pointers
- Implement proxy ompt_data_t* pointers for clients
- Add ompt_data_t* arguments back to callback API
- Modify openmp sample to illustrate use of proxy pointers
formatting
SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083)
Fixing some accumulate metrics (#1089)
* Fixing some accumulate metrics
* Fixing some more accumulate metrics
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
updating rocprofv3 help options (#1113)
* updating rocprofv3 help options
* updating CHANGELOG
Fixing installed package tests in CI (#1119)
* Fixing installed package tests in CI
* Formatted rocprofv3.py with black formatter
SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112)
* SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests.
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Adding backlog for codeobj changes
* Formatting
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
---------
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
SWDEV-487621: Fixes for metric definitions (#1118)
* Fixes for metric definitions
* Removing gfx8
* Update changelog
* Fixing unit tests
* Small fixes
* Fix for write size
Fix PSDB change (#1120)
Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit
|
||
|
|
7608eb49d6 |
Updating CI
Update continuous_integration.yml
Update continuous_integration.yml
Adding EMU Runners
Update continuous_integration.yml
Update continuous_integration.yml
Bump thollander/actions-comment-pull-request from 2.5.0 to 3.0.1
Bumps [thollander/actions-comment-pull-request](https://github.com/thollander/actions-comment-pull-request) from 2.5.0 to 3.0.1.
- [Release notes](https://github.com/thollander/actions-comment-pull-request/releases)
- [Commits](https://github.com/thollander/actions-comment-pull-request/compare/v2.5.0...v3.0.1)
---
updated-dependencies:
- dependency-name: thollander/actions-comment-pull-request
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Update continuous_integration.yml
Update continuous_integration.yml
Update run-ci.py
Update upload-image-to-github.py
Update continuous_integration.yml
Update continuous_integration.yml
Update continuous_integration.yml
Update continuous_integration.yml
Update continuous_integration.yml
using github output
Update continuous_integration.yml
Revert temp change
Update continuous_integration.yml
Update continuous_integration.yml
[ROCm/rocprofiler-sdk commit:
|
||
|
|
c17952fd23 |
rocprofv3: refactor and reorganize rocprofiler-sdk-tool library (#1138)
* Add rocprofv3-multi-node.md to source/lib/rocprofiler-sdk-tool
* Initial source re-organization
- create "output" static library
* Update include/rocprofiler-sdk/cxx/serialization.hpp
- add GPR count fields to kernel symbol serialization
* Add source/scripts/generate-rocpd.py
- reads one or more JSON output files from rocprofv3 and writes rocpd SQLite3 database
- Note: preliminary implementation
* More reorganization b/t lib/rocprofiler-sdk-tool and lib/output
* Updates to generate-rocpd.py
- add SQL views
- option: --absolute-timestamps -> --normalize-timestamps
- option: --generic-markers
- misc fixes with regards to getting the views working
- support marker names
* Update generate-rocpd.py
- Add --marker-mode option
* Update generate-rocpd.py
- Improve debugging of bad bulk SQLite statements
* Update rocprofv3-multi-node.md
- cleanup of proposed SQL schema
* lib/output/format_path.{hpp,cpp}
- rename format to format_path (in config.hpp and config.cpp)
- move format_path functionality to format_path.{hpp,cpp}
* Rework lib/output/tmp_file_buffer.{hpp,cpp}
* Update output_key.cpp
- support %cwd%, %launch_date%
* Rework lib/output/buffered_output.hpp
* Support csv_output_file constructed via domain_type
* Update lib/output/domain_type.{hpp,cpp}
- get_domain_trace_file_name
- get_domain_stats_file_name
* Update lib/rocprofiler-sdk-tool/tool.cpp
- tweak headers
* Update lib/output/generate*.cpp
- remove include of helpers.hpp
- CSV uses domain_type for filenames
* Update samples/counter_collection/per_dev_serialization.cpp
- make wait_on volatile
* Remove tool_table from lib/output and lib/rocprofiler-sdk-tool
- Also split various structs into their own files
- lib/output/agent_info
- lib/output/metadata
- lib/output/kernel_symbol_info
- lib/output/counter_info
- Implemented rocprofiler::tool::metadata
* Optimize rocprofiler_tool_counter_collection_record_t
- reduce the size of the struct from 24784 bytes to 8376 bytes
* Introduced output_config
- split subset of config (from tools library) into output_config to be able to configure the output generating functions separately from the tool library
- this is a significant step towards the output generating functions not relying on static global memory
* Stream chunks of data into output instead of loading all into memory
* Remove duplicate group_segment_size in rocprofiler_kernel_dispatch_info_t serialization
* Adding Q&A to rocprofv3-multi-node.md
* Remove all remaining include lib/rocprofiler-sdk-tool from lib/output
- migrated a fair amount of code from lib/rocprofiler-sdk-tool/helper.hpp to lib/output
* Update Q&A of rocprofv3-multi-node.md
* Fix minor compilation errors + minor cleanup
* Update hsa/async_copy.cpp
- when ROCPROFILER_CI_STRICT_TIMESTAMPS > 0, reduce the active_signal sync wait time
* Update profiling_time.hpp
- fix log messages for when start/end time is less/greater than enqueue/current CPU time
* Fix generate_stats for tool_counter_record_t
* Dictionary optimization for generate-rocpd.py
---------
Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
6752f1ea9b |
rocprofv3: stabilize rocprofv3 summary tests (#1161)
* Update tests/bin/transpose/transpose.cpp
- add hipMemGetInfo call to display the available vs. total memory on the GPU
* Update tests/rocprofv3/summary/validate.py
- Updated test_summary_display_data after addition of hipMemGetInfo to transpose test exe
* Tweak code coverage comment uploading
- create unique orphan branch per PR
- reduce quality of PNG files (85 -> 70)
* Revert some of code coverage comment uploading
- remove creation of unique orphan branch per PR
* Tweak code coverage comment uploading
- create unique orphan branch per PR
[ROCm/rocprofiler-sdk commit:
|
||
|
|
34c35c26ba |
Fix misaligned stores in buffer (#1063)
* Fix misaligned read/write to buffer
- causes undefined behavior
* Update run-ci.py
- fix spurious CDash submission failure warning
* Improve run-ci.py support for UBSan
* Relax rocprofv3 summary stats count expectation
* Update CHANGELOG
[ROCm/rocprofiler-sdk commit:
|
||
|
|
34bcfb0b9d |
Update run-ci.py with new cdash portal (#1048)
* Update run-ci.py
* Update run-ci.py
* Update run-ci.py
* Update run-ci.py
* Update run-ci.py
* Update run-ci.py
[ROCm/rocprofiler-sdk commit:
|
||
|
|
2dc1c0d9f5 |
rocprofv3 doc updates (#982)
* updating rocprofv3
* using rocprofv3
* review updates
* naming standardization
* Update source/docs/how-to/using-rocprofv3.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* review comments
* adding API references
* kernel filtering
* Remove Sphinx warn as error
To bypass false warning for linking between rst and md
* remove unused (duplicate) refs in _toc.yml.in
---------
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Peter Jun Park <peter.park@amd.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
03ff04bbe3 |
Adding changes for handling abort signals (#979)
* Adding changes for handling abort signals
* Fix the test failure
* Fixing CmakeLists error
* Addressing review comments
* fixing warnings
* fixing execute test
* Fixing abort app test
* Address review comments
* Apply suggestions from code review
* Apply suggestions from code review
* Fixes for testing issues
* Adding kernel filtering test
* Removing text input file
* fix formatting issues
* misc fix
* Suppress signal-unsafe error in ThreadSanitizer
- rename signal handler to rocprofv3_error_signal_handler to ensure specific filtering
* Fix rocprofv3 aborted-app validation
---------
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
4cd076b9cc |
Test using HIP Graphs (#835)
* Test using hip graphs
* Remove assert for api_end < async_end
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Remove submit retry
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout
* Update lib/common/container/record_header_buffer.hpp
- minor tweaks
* Update lib/rocprofiler-sdk/buffer.hpp
- tweak ROCPROFILER_BUFFER_POLICY_LOSSLESS flush behavior
* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Revert rocprofv3-test-trace-hip-in-libraries-validate timeout
* Update run-ci.py
- RETRY_COUNT set to zero
[ROCm/rocprofiler-sdk commit:
|
||
|
|
0dc01661c1 |
Relax default CDash submission requirements in run-ci.py (#836)
* Update run-ci.py to not require successful CDash submission by default
* Minor tweak to run-ci.py
[ROCm/rocprofiler-sdk commit:
|
||
|
|
261e4da484 |
Add default values for kernel struct (#798)
* Add default values for kernel struct
* Update hsa-queue-dependency app
- default initializers
- check HSA_AMD_MEMORY_POOL_INFO_RUNTIME_ALLOC_ALLOWED for memory pools
- clang-tidy fixes (member -> static, etc.)
* Update run-ci.py
- add --progress --output-on-failure -V if no other options regarding verbosity are passed
- improve the ability to control the stages
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
098bd37968 |
Adding Keyword search pattern (#768)
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Adding the scan as a script
* clean up
* Update continuous_integration.yml
[ROCm/rocprofiler-sdk commit:
|
||
|
|
4f99edbad5 |
Page migration reporting (#651)
* Page migration reporting support
* Page migration: Update parser and reporting
Container does not have the latest KFD header, so CI might fail
* Add kfd_ioctl.h
* Formatting
* Update get_key
- get key was not used (and shouldn't be), so delete it
* clang-tidy fixes
* Tests for page migration
* Apply suggestions from code review
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update tests/bin/page-migration/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update page-migration test app
- add hipHostRegister to register mmap'ed allocation with HIP
- misc cleanup and reorg
- remove HSA_XNACK=1 from test env
* Update lib/rocprofiler-sdk/tests/page_migration.cpp
- fix compilation error
* Minor updates (reorg, rename)
* Page migration reporting support
* Page migration: Update parser and reporting
Container does not have the latest KFD header, so CI might fail
* Update page migration tests, fix trigger types
* Page Migration Tracing Support Refactoring (#753)
* Reorganization
* Update page migration init/fini
* Formatting
* Update page_migration.cpp
- change logging severity
* Skip test if KFD does not support page migration reporting
* Rework skipping test if KFD does not support page migration
* Fix event trigger enum values
* Fix clang-diagnostic-unused-const-variable
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
03fb9ace21 |
CTest Environment Update (#756)
* Update test/tools/json-tool.cpp
- push/pop ppid as external correlation id instead of pid
* Update environment variables for tests and samples
* Revert to old CDash dashboard in run-ci.py
* Revert to new CDash dashboard in run-ci.py
[ROCm/rocprofiler-sdk commit:
|
||
|
|
73ff4f2502 |
Update HSA async copy active signals handling (#732)
* Enable INFO logging on retried CI jobs
* Update lib/rocprofiler-sdk/async_copy.cpp
- rework active_signals
- make hsa_signal_t member variable
- remove sync from destructor
- replace _is_set with atomic counter
- timeout of 30 seconds hsa_signal_wait
- switch from relaxed to scacquire/screlease memory ordering
- improve logging and error handling
- destroy hsa signal in active_signals in async_fini
* Update lib/rocprofiler-sdk/async_copy.cpp
- active_signals::create
- change initial value of signal to 1 instead of value of completion signal
- change condition trigger of signal callback
* Update tests/counter-collection/validate.py
* Update lib/rocprofiler-sdk/async_copy.cpp
- improved logging
- fix hsa_signal_wait_scacquire_fn check
* Cleanup tests/lib/transpose/transpose.cpp
- remove huge comment block
* Appears to be working on MI200
Dependency Versions: clr:
|
||
|
|
e2c30bd438 |
adding pandas and pytest to requirements.txt (#748)
* adding pandas and pytest to requirements.txt
* setting up requirements.txt
* Update requirements
- formatting packages
- remove packages not directly used by rocprofiler-sdk
* Update cmake formatting, linting, and options
- if BUILD_CI -> force BUILD_DEVELOPER and BUILD_WERROR
- support python installed clang-format and python installed clang-tidy
* Update build.sh
- split into install-deps.sh and install-apt-deps.sh
* Improve code coverage
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
90b8328902 |
Update to Clang-tidy-15 (#742)
* Update continuous_integration.yml
* Update build.sh
* Update continuous_integration.yml
* Update build.sh
* Update continuous_integration.yml
[ROCm/rocprofiler-sdk commit:
|
||
|
|
b5d4745e4e |
Adding useful scripts for formatting and building (#737)
* Adding useful scripts for formatting and building
* Update build.sh
* Update build.sh
* Update continuous_integration.yml
[ROCm/rocprofiler-sdk commit:
|
||
|
|
521e2794e6 |
Update run-ci.py (#641)
* Temp: Fixing node id
* source formatting (clang-format v11) (#709)
Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>
* Using logical node id
* Update agent.cpp
* Update agent.cpp
* Python formatting
* Update run-ci.py
* Update run-ci.py
* Update continuous_integration.yml
* Update continuous_integration.yml
running directly using the prepared runner container
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update run-ci.py
* Clean up
* Fixing install paths
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Update continuous_integration.yml
* Fixing GPU Agents Test Validation
* python formatting (black) (#712)
Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>
* Fixing the issue with rocclr detected kernels __amd_rocclr_.*
* python formatting (black) (#713)
Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>
* Fixing the issue with rocclr detected kernels __amd_rocclr_.*
* Fixing static number of async copies and using hsa_api instead for validation
* python formatting (black) (#714)
Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>
* Increasing the time limit for waiting on active signals
* Update continuous_integration.yml
* Update async_copy.cpp
* Update CMakeLists.txt
* changing node id to logical node id in rocprofv3
* Update tool.cpp
* testing async mem copy signal decrement
* Update logging.cpp
* Update validate.py
---------
Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler1.amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler2.amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
5716eae6e1 |
Handle hsa_queue_destroy after finalization (#679)
* Handle hsa_queue_destroy after finalization
- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue
* Update HIP/HSA/marker update_table logging
* Update rocprofv3 tests
- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them
* Disable thread sanitizer deadlock detection
* Update CI workflow
- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers
* Update run-ci.py
- set gcovr html medium and high threshold
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- remove this capture from enable/disable serialization
* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*
- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map
* Logging for HIP/HSA/marker/profile_serializer
* Logging for HIP/HSA/marker/queue_controller
* Improve test_retired_correlation_ids asserts
* Fix tests/counter-collection/validate.py
- scale expected SQ_WAVES counter value based on warp size of GPU
* Tweak github comment for code coverage
* Remove gcovr html high/medium threshold args
* Fix tests/counter-collection/validate.py
- round before casting to int in test_counter_values
* operator bool for profile_serializer
- only wait on CV if profile_serializer is used
* Logging updates (profile_serializer + code_object)
* Update counter-collection validate.py
* QueueController does not wait on CV if finalizing/finalized
* Update CI workflow
- remove navi32 from core job
* Improve HIP/HSA/marker tracing get_functor/functor
- remove lambda wrapper around functor
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- do not acquire cvmutex lock during finalization
* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*
- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized
* Update CI workflow
- remove navi32 runners
* bwelton fixes for hangs
* CMake improvements + simplified demangle
- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
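One item in the commit above, "round before casting to int in test_counter_values", reflects a general floating-point pitfall. The sketch below is not taken from tests/counter-collection/validate.py; the helper names and the sample value are hypothetical and only show why truncating a computed float can land one below the intended integer.

```python
def to_int_truncating(value: float) -> int:
    """Cast directly to int, discarding the fractional part."""
    return int(value)


def to_int_rounded(value: float) -> int:
    """Round to the nearest integer first, then cast."""
    return int(round(value))


# Hypothetical scaled counter value: 0.29 * 100 is stored as a double
# slightly below 29, so a direct cast drops it to 28.
measured = 0.29 * 100
```

Here `to_int_truncating(measured)` yields 28 while `to_int_rounded(measured)` yields 29, so a validation check that compares against an exact expected count is more robust when it rounds first.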
[ROCm/rocprofiler-sdk commit:
|
||
|
|
e995df5be5 |
Deadlock Fix for HSA and Serialization Disable/Enabling support (#582)
* Initial barrier
* Working on profiler serializer extraction
* Current progress
* Serialization Support
* source formatting (clang-format v11) (#583)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* cmake formatting (cmake-format) (#584)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* Minor fix
* Current Progress
* Current progress
* More fixes
* Serialization Fixes
* Bug fix
* source formatting (clang-format v11) (#600)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* More fixes
* More minor fixes
* source formatting (clang-format v11) (#603)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* source formatting (clang-format v11) (#604)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* Lock order inversion false positive
* order fix
* More changes
* source formatting (clang-format v11) (#607)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* minor test fix
* Minor test changes
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
[ROCm/rocprofiler-sdk commit:
|
||
|
|
15302ff11d |
C compatibility for public headers (#566)
* C compatibility for public headers
- add tests/tools/c-tool.c
- builds a tool (which does nothing) with C language
- ensures that tool can be compiled in C
- add tests/c-tool/CMakeLists.txt
- ensures that tool library build from C is a valid tool
- rocprofiler_counter_info_v0_t is_derived is int instead of bool
- C does not have bool unless <stdbool.h> is included
- add `include/rocprofiler-sdk/hsa/api_trace_version.h`
- handles providing HSA_*_TABLE_(MAJOR|STEP)_VERSION values if compiled from C
- cmake define in version.h.in for ROCPROFILER_HSA_*_TABLE_(MAJOR|STEP)_VERSION
- HSA table versions compiled with
- use rocprofiler_(hsa|hip|marker)_api_no_args struct to handle incompatibility b/t empty structs in C vs. C++ (size of 0 vs. size of 1)
- extern "C" in include/rocprofiler-sdk/{hsa,hip,marker}/api_args.h
- fixed spelling error: derrived -> derived
- scope YY_NO_INPUT compile definition to lib/rocprofiler-sdk/counters/parser/*
* Revert CDash dashboard
[ROCm/rocprofiler-sdk commit:
|
||
|
|
a7e1a05b34 |
Update run-ci.py (#534)
[ROCm/rocprofiler-sdk commit:
|
||
|
|
f760a3ceaa |
Updates/fixes for CI, docs, tests, samples, and common library (#528)
- .github/workflows/continuous_integration.yml
- apt-get update before apt-get install
- remove libgtest-dev
- actions-comment-pull-request: v2.4.3 -> v2.5.0
- .github/workflows/formatting.yml
- create-pull-request: v5 -> v6
- cmake/rocprofiler_options.cmake
- remove unused ROCPROFILER_DEBUG_TRACE and ROCPROFILER_LD_AQLPROFILE options
- samples/counter_collection/callback_client.cpp
- corr_id field renamed to correlation_id
- samples/counter_collection/client.cpp
- corr_id field renamed to correlation_id
- include/rocprofiler-sdk/fwd.h
- In rocprofiler_record_counter_t: rename corr_id field to correlation_id
- doxygen fixes
- lib/common/utility.*
- remove get_accurate_clock_id_impl
- timestamp_ns() defaults to CLOCK_BOOTTIME
- lib/rocprofiler-sdk/counters/core.cpp
- fix spelling mistake: extrenal -> external
- corr_id field renamed to correlation_id
- lib/rocprofiler-sdk-tool/tool.cpp
- fix destruction of static tool::output_file before finalization
- scripts/update-docs.sh
- define PROJECT_NAME
- tests/async-copy-tracing/validate.py
- init_time and fini_time checks
- hip_api_traces, marker_api_tracing
- tests/common/serialization.hpp
- fix save function for rocprofiler_record_counter_t following rename of corr_id to correlation_id
- tests/kernel-tracing/validate.py
- init_time and fini_time checks
- relax test_total_runtime range
- tests/rocprofv3/tracing/CMakeLists.txt
- remove -M from rocprofv3-test-systrace-execute
- exclude test_hsa_api_trace in rocprofv3-test-systrace-validate due to HIP API tracing
- tests/rocprofv3/tracing/validate.py
- update test_kernel_trace to accept mangled or demangled
- tests/tools/json-tool.cpp
- remove use of GLOG
- include init_time and fini_time
- write_json(...) function
[ROCm/rocprofiler-sdk commit:
|
||
|
|
cb6e12ab3a |
Callback based handler for counter collection (#506)
* Callback based handler for counter collection
* source formatting (clang-format v11) (#507)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* cmake formatting (cmake-format) (#508)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Doc fix
* Minor doc fix
* More doc fixes
* More doc fixes
* More doc fixes
* Update CI
* Changes to the API per comments
* Mutex exception for HSA
* source formatting (clang-format v11) (#511)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Doc fix
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
[ROCm/rocprofiler-sdk commit:
|