c255ec5b5ce7d297043f8831bf7bb5359a2a74e2
59 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
7166b1ab58 |
[rocprofv3] Add rocpd output support (part 1: prelude) (#401)
* [rocprofv3] Add rocpd output support (part 1: prelude) - git submodules for sqlite3, GOTCHA, and pybind11 - HIP stream data - rocprofiler_query_intercept_table_name(...) - serialization load - rocprofiler::sdk::get_perfetto_category(KindT) - rocprofiler::sdk::parse::strip - common library updates - md5sum - hasher - simple_timer - static_tl_object - get_process_start_time_ns(pid_t) - output library updates - node_info - file_generator (generator is now virtual base class) - stream info updates * Added submodules * Code review updates * Minor unused-but-set-X warning fixes * Update CI - install libsqlite3-dev package * Update CI - install libsqlite3-dev package * Fix static thread-local object memory leak - also fix signal handler chaining * Remove URL from comment * Remove page migration exception * Enable ROCPROFILER_BUILD_SQLITE3 by default - try find_package(SQLite3) first and then build when ROCPROFILER_BUILD_SQLITE3=ON * Fix gotcha installation - make install of target optional * Validate tracing + counter collection dispatch data - i.e. correlation ids, thread ids, timestamps * Make find_package(SQLite3) optional - ROCm CI does not have SQLite3 dev package installed and cannot build from source (missing tclsh) * Fixes to tracing + counter collection test * get_process_start_time_ns update - original implementation did not work * Fix pytest-packages test_perfetto_data for counter collection - erroneous failure when used with same PMC + multiple agents * cmake policy: option() honors normal variables - for GOTCHA submodule * Improve samples/api_buffered_tracing stability - reduce likelihood of sporadic exception throw * Update gotcha submodule --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
33e43e66d3 |
[SDK] Standardize rocprofiler-sdk counter definition YAML schema (#370)
* Convert YAML Format Convert YAML format and reader to properly read the YAML. Comparison between outputs from the YAML shows only changes in ordering of architectures (and ids). * Test fixes * Add script for converting the YAML schema to source/scripts * Update documentation * Change the extra counter code block to YAML * Add missing new line at EOF * remove name issues --------- Co-authored-by: Benjamin Welton <bewelton@amd.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
6ec9526475 |
[docs] Improve readability of ROCprofiler-SDK API library documentation (#359)
* Use custom .rst to make api doc more readable. * Update index.rst * Misc docs updates - doxygen source code fixes - updated doxygen files - fixed conf.py (does not generate code in source tree) * Update source/docs/api-reference/rocprofiler-sdk_api_reference.rst Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com> * Update source/docs/api-reference/rocprofiler-sdk_api_reference.rst Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com> * Update source/docs/api-reference/rocprofiler-sdk_api/modules.rst Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com> * Update source/docs/api-reference/rocprofiler-sdk_api/global_data_structures_topics_files.rst Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com> * Duplicate * test warnings * Update CMakeLists.txt * Update rocprofiler-sdk.dox.in * Update update-docs.sh * fix docs build failures by -q -T flags. * set warn_as_error to NO. * test -W to suppress warnings. * remove -q flag from make. * reduce dot graph depth to 100 * Update custom docs target - docs target is now no longer part of the dependency list for the all target - installation of docs requires explicitly building the docs target (i.e. OPTIONAL install of _build/html/ folder) * add quit and trace mode back. * increase DOT_GRAPH_MAX_NODES to 500 back. * Format. --------- Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com> Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com> Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com> |
||
|
|
3580478426 |
Build system (libdw), correlation ID, and shebang fixes (#354)
* Fix compilation for output library - link to targets for ATT (amd-comgr, dw, elf) * Relax correlation ID retirement log failures - only fail for correlation ID retirement underflow when building in CI mode * Fix shebang for several files - license was inserted before shebang in several places * Update code coverage exclude folders for samples * Tweak to agent tests - test to make sure hsa agent is not the old value instead of testing that it is the new value * Fix libdw include/link --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
35069a9d06 |
[Misc] fix the agent_id field (#297)
* fix the agent_id field * Fix shebang --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
ca7cce9e81 |
doc improvements for 1.0.0 part 2 (#330)
* update installation steps * Github Issue #50 Adding README's for samples * Making name change to ROCprofiler-SDK for consistency * Fix HIP trace documentation * Fix HSA trace in docs * Fix kernel trace in docs * Fixing memory copy and memory allocation traces * runtime trace and sys trace doc update * Fix scratch memory doc * kernel naming and filtering options * Adding collection period in docs * Perfetto configs update * summary output file * kernel trace format fix * update CHANGELOG * Agent index doc update * rocm-smi output * group by queue option * Updated --group-by-queue description * perfetto visualization --------- Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com> |
||
|
|
4fbcfd142c |
Copyright Compliance (#333)
* Added copyright information to requested files * Formatting * Fix bad function name error |
||
|
|
2d072f9217 |
[CI] Miscellaneous Testing Updates (#305)
* Add rocprofiler-sdk-utilities.cmake - contains cmake function rocprofiler_sdk_get_gfx_architectures * Update perfetto_reader.py - fix hash collision * Update project names in tests folders - rocprofiler-tests -> rocprofiler-sdk-tests * Fix incorrect allocation-error handling * [CI] Disable openmp tests for navi2, navi3, and navi4 * Suppress leaks by omptarget and llvm --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
f5c9663c51 |
removing gfx940 and gfx941 targets (#286)
* removing gfx940 and gfx941 targets * updated changelog |
||
|
|
fd99654433 | Fix install for conversion-script (#211) | ||
|
|
c77596b703 |
SWDEV-499989: Conversion Script to change counter collection output format from v3 to v1 (#107)
* SWDEV-499989: Add script to convert rocprofv3 counter collection output format to that of v1 * Add logging and argparsing * Dropping duplicated counters in pmc multiple lines * Adding test for conversion * moving conversion script to test files * copy conversion script from scripts folder |
||
|
|
e743bf5a93 |
Undefined behavior warnings caught by ROCPROFILER_DEFAULT_FAIL_REGEX (#23)
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX - add UBSAN_OPTIONS to setup-sanitizer-env.sh * Improve ROCPROFILER_DEFAULT_FAIL_REGEX * Use -fno-sanitize-recover=undefined flag - this compiler flag causes all undefined behavior errors to exit * Revert ROCPROFILER_DEFAULT_FAIL_REGEX * fix for shift overflow --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com> |
||
|
|
97b7a6315d |
update copyright date to 2025 (#102)
* Update LICENSE * Update conf.py * Update copyright year * [fix] Update copyright year * Update copyright year "ROCm Developer Tools" * Add license headers to c++ files * Add license to *.py * Update licenses in rocdecode sources --------- Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com> Co-authored-by: Mythreya <mythreya.kuricheti@amd.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
73e7f8cfb1 |
ROCTx Documentation (#29)
* Add roctx doc * Add roctx doxyfile input * Update links and toc * Build doxysphinx for both doxygen files * Update scripts * Generate roctx doxygen files * Change doxygen path to allow for 2 doxyfiles * Make doxygen dir for script * Call make _doxygen dir with p flag * Create _doxygen dir in workflow * Create doc dirs for doxygen * Run update docs as sudo * Fix typo in mkdir command * Include graphviz for dot * Install dot for docs CI * Install dot as sudo due to permission denied * Install doxygen via sudo * Install doxysphinx * Add postcheckout step to RTD to config and gen doxygen docs * On RTD, update doxygen after creating env * update docs.yml * update docs.yml * fixing build-docs-from-source * Fixing build docs from source * update docs.yml * trying to fix readthedocs * trying to fix readthedocs * update docs.yml * improve mainpage documentation * update docs * clang-format fix --------- Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com> Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com> |
||
|
|
2c3bdeaed9 |
Download perfetto trace_processor_shell (#105)
* Download perfetto trace_processor_shell * Upgrade to perfetto-trace-processor-shell v0.0.4 * Fix run-ci.py warning - warning message: CMake Warning (dev) at /.../build/CTestCustom.cmake:16: Syntax Warning in cmake code at column 77 Argument not separated from preceding token by whitespace. * Update tests/pytest-packages/pytest_utils/perfetto_reader.py --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
00c46fd5e5 |
SDK: OMPT Support (#22)
* Ability to select alternative compiler per file
Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.
Misc updates
Update OpenMP target sample
- samples/ompt -> samples/openmp_target
- fix sample test of openmp-target
- reorganize files
Rework OpenMP implementation
Minor OpenMP implementation cleanup
Rename samples/openmp_target CMake targets
Add tests/bin/openmp
- OpenMP target test app in tests/bin/openmp/target
Format samples/openmp_target CMakeLists.txt
Misc lib/rocprofiler-sdk/openmp cleanup
- fix includes
- convert_arg
Update openmp.def.cpp
- tweak includes
- remove lots of temporary variables
Update samples
- common::get_callback_id_names() -> common::get_callback_tracing_names()
- add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample
Fix code object operation names
- add "CODE_OBJECT_" prefix
Update include/rocprofiler-sdk/openmp/api_id.h
- remove spurious comment
Miscellaneous openmp updates
- similar API for openmp_begin and openmp_end
- move implementations of ompt callbacks to openmp.cpp
- ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events
[SWDEV-484495] Fix int truncation in CSV output (#1098)
CSV output truncates doubles to ints when it shouldn't. Derived metrics
are (mostly) doubles and lose precision (or become worthless) if treated
as an int. Converted these to double to match the format we return from
rocprof-sdk.
Co-authored-by: Benjamin Welton <ben@amd.com>
Update limit for max counter records in rocprof-tool (#1073)
A fixed sized std::array is used to store counter records in rocprofiler SDK. This limit was breached in SWDEV-484742. Upping the limit to 512 to be less likely to reach this limit again.
adding proxy ompt_data_t * arguments
fixes for proxy pointers
- Implement proxy ompt_data_t* pointers for clients
- Add ompt_data_t* arguments back to callback API
- Modify openmp sample to illustrate use of proxy pointers
formatting
SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083)
Fixing some accumulate metrics (#1089)
* Fixing some accumulate metrics
* Fixing some more accumulate metrics
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
updating rocprofv3 help options (#1113)
* updating rocprofv3 help options
* updating CHANGELOG
Fixing installed package tests in CI (#1119)
* Fixing installed package tests in CI
* Formatted rocprofv3.py with black formatter
SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112)
* SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests.
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Adding backlog for codeobj changes
* Formatting
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
---------
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
SWDEV-487621: Fixes for metric definitions (#1118)
* Fixes for metric definitions
* Removing gfx8
* Update changelog
* Fixing unit tests
* Small fixes
* Fix for write size
Fix PSDB change (#1120)
Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit
|
||
|
|
d564f759a5 |
Updating CI
Update continuous_integration.yml Update continuous_integration.yml Adding EMU Runners Update continuous_integration.yml Update continuous_integration.yml Bump thollander/actions-comment-pull-request from 2.5.0 to 3.0.1 Bumps [thollander/actions-comment-pull-request](https://github.com/thollander/actions-comment-pull-request) from 2.5.0 to 3.0.1. - [Release notes](https://github.com/thollander/actions-comment-pull-request/releases) - [Commits](https://github.com/thollander/actions-comment-pull-request/compare/v2.5.0...v3.0.1) --- updated-dependencies: - dependency-name: thollander/actions-comment-pull-request dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Update continuous_integration.yml Update continuous_integration.yml Update run-ci.py Update upload-image-to-github.py Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml using github output Update continuous_integration.yml Revert temp change Update continuous_integration.yml Update continuous_integration.yml |
||
|
|
5eb8c2658c |
rocprofv3: refactor and reorganize rocprofiler-sdk-tool library (#1138)
* Add rocprofv3-multi-node.md to source/lib/rocprofiler-sdk-tool
* Initial source re-organization
- create "output" static library
* Update include/rocprofiler-sdk/cxx/serialization.hpp
- add GPR count fields to kernel symbol serialization
* Add source/scripts/generate-rocpd.py
- reads one or more JSON output files from rocprofv3 and writes rocpd SQLite3 database
- Note: preliminary implementation
* More reorganization b/t lib/rocprofiler-sdk-tool and lib/output
* Updates to generate-rocpd.py
- add SQL views
- option: --absolute-timestamps -> --normalize-timestamps
- option: --generic-markers
- misc fixes with regards to getting the views working
- support marker names
* Update generate-rocpd.py
- Add --marker-mode option
* Update generate-rocpd.py
- Improve debugging of bad bulk SQLite statements
* Update rocprofv3-multi-node.md
- cleanup of proposed SQL schema
* lib/output/format_path.{hpp,cpp}
- rename format to format_path (in config.hpp and config.cpp)
- move format_path functionality to format_path.{hpp,cpp}
* Rework lib/output/tmp_file_buffer.{hpp,cpp}
* Update output_key.cpp
- support %cwd%, %launch_date%
* Rework lib/output/buffered_output.hpp
* Support csv_output_file constructed via domain_type
* Update lib/output/domain_type.{hpp,cpp}
- get_domain_trace_file_name
- get_domain_stats_file_name
* Update lib/rocprofiler-sdk-tool/tool.cpp
- tweak headers
* Update lib/output/generate*.cpp
- remove include of helpers.hpp
- CSV uses domain_type for filenames
* Update samples/counter_collection/per_dev_serialization.cpp
- make wait_on volatile
* Remove tool_table from lib/output and lib/rocprofiler-sdk-tool
- Also split various structs into their own files
- lib/output/agent_info
- lib/output/metadata
- lib/output/kernel_symbol_info
- lib/output/counter_info
- Implemented rocprofiler::tool::metadata
* Optimize rocprofiler_tool_counter_collection_record_t
- reduce the size of the struct from 24784 bytes to 8376 bytes
* Introduced output_config
- split subset of config (from tools library) into output_config to be able to configure the output generating functions separately from the tool library
- this is a significant step towards the output generating functions not relying on static global memory
* Stream chunks of data into output instead of loading all into memory
* Remove duplicate group_segment_size in rocprofiler_kernel_dispatch_info_t serialization
* Adding Q&A to rocprofv3-multi-node.md
* Remove all remaining include lib/rocprofiler-sdk-tool from lib/output
- migrated a fair amount of code from lib/rocprofiler-sdk-tool/helper.hpp to lib/output
* Update Q&A of rocprofv3-multi-node.md
* Fix minor compilation errors + minor cleanup
* Update hsa/async_copy.cpp
- when ROCPROFILER_CI_STRICT_TIMESTAMPS > 0, reduce the active_signal sync wait time
* Update profiling_time.hpp
- fix log messages for when start/end time is less/greater than enqueue/current CPU time
* Fix generate_stats for tool_counter_record_t
* Dictionary optimization for generate-rocpd.py
---------
Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com>
|
||
|
|
5e1643cf81 |
rocprofv3: stabilize rocprofv3 summary tests (#1161)
* Update tests/bin/transpose/transpose.cpp - add hipMemGetInfo call to display the available vs. total memory on the GPU * Update tests/rocprofv3/summary/validate.py - Updated test_summary_display_data after addition of hipMemGetInfo to transpose test exe * Tweak code coverage comment uploading - create unique orphan branch per PR - reduce quality of PNG files (85 -> 70) * Revert some of code coverage comment uploading - remove creation of unique orphan branch per PR * Tweak code coverage comment uploading - create unique orphan branch per PR |
||
|
|
37e0d7efce |
Fix misaligned stores in buffer (#1063)
* Fix misaligned read/write to buffer - causes undefined behavior * Update run-ci.py - fix spurious CDash submission failure warning * Improve run-ci.py support for UBSan * Relax rocprofv3 summary stats count expectation * Update CHANGELOG |
||
|
|
8b986afbdb |
Update run-ci.py with new cdash portal (#1048)
* Update run-ci.py * Update run-ci.py * Update run-ci.py * Update run-ci.py * Update run-ci.py * Update run-ci.py |
||
|
|
69caa62b60 |
rocprofv3 doc updates (#982)
* updating rocprofv3 * using rocprofv3 * review updates * naming standardization * Update source/docs/how-to/using-rocprofv3.rst Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> * review comments * adding API references * kernel filtering * Remove Sphinx warn as error To bypass false warning for linking between rst and md * remove unused (duplicate) refs in _toc.yml.in --------- Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com> Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com> Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com> Co-authored-by: Peter Jun Park <peter.park@amd.com> |
||
|
|
94b5d9be3f |
Adding changes for handling abort signals (#979)
* Adding changes for handling abort signals * Fix the test failure * Fixing CmakeLists error * Addressing review comments * fixing warnings * fixing execute test * Fixing abort app test * Address review comments * Apply suggestions from code review * Apply suggestions from code review * Fixes for testing issues * Adding kernel filtering test * Removing text input file * fix formatting issues * misc fix * Suppress signal-unsafe error in ThreadSanitizer - rename signal handler to rocprofv3_error_signal_handler to ensure specific filtering * Fix rocprofv3 aborted-app validation --------- Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
1f96593b4f |
Test using HIP Graphs (#835)
* Test using hip graphs * Remove assert for api_end < async_end * Update rocprofv3/tracing-hip-in-libraries::test_api_trace * Update rocprofv3/tracing-hip-in-libraries::test_api_trace * Increase rocprofv3-test-trace-hip-in-libraries-validate timeout * Update rocprofv3/tracing-hip-in-libraries::test_api_trace * Remove submit retry * Update rocprofv3/tracing-hip-in-libraries::test_api_trace * Increase rocprofv3-test-trace-hip-in-libraries-validate timeout * Update lib/common/container/record_header_buffer.hpp - minor tweaks * Update lib/rocprofiler-sdk/buffer.hpp - tweak ROCPROFILER_BUFFER_POLICY_LOSSLESS flush behavior * Increase rocprofv3-test-trace-hip-in-libraries-validate timeout * Update rocprofv3/tracing-hip-in-libraries::test_api_trace * Revert rocprofv3-test-trace-hip-in-libraries-validate timeout * Update run-ci.py - RETRY_COUNT set to zero |
||
|
|
d15cf17635 |
Relax default CDash submission requirements in run-ci.py (#836)
* Update run-ci.py to not require successful CDash submission by default * Minor tweak to run-ci.py |
||
|
|
29bc84ec0c |
Add default values for kernel struct (#798)
* Add default values for kernel struct * Update hsa-queue-dependency app - default initializers - check HSA_AMD_MEMORY_POOL_INFO_RUNTIME_ALLOC_ALLOWED for memory pools - clang-tidy fixes (member -> static, etc.) * Update run-ci.py - add --progress --output-on-failure -V if no other options regarding verbosity are passed - improve the ability to control the stages --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
87d549c8a9 |
Adding Keyword search pattern (#768)
* Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Adding the scan as a script * clean up * Update continuous_integration.yml |
||
|
|
fd3d97287c |
Page migration reporting (#651)
* Page migration reporting support * Page migration: Update parser and reporting Container does not lave latest KFD header, so CI might fail * Add kfd_ioctl.h * Formatting * Update get_key - get key was not used (and shouldn't be), so delete it * clang-tidy fixes * Tests for page migration * Apply suggestions from code review Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update tests/bin/page-migration/CMakeLists.txt Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update page-migration test app - add hipHostRegister to register mmap'ed allocation with HIP - misc cleanup and reorg - remove HSA_XNACK=1 from test env * Update lib/rocprofiler-sdk/tests/page_migration.cpp - fix compilation error * Minor updates (reorg, rename) * Page migration reporting support * Page migration: Update parser and reporting Container does not lave latest KFD header, so CI might fail * Update page migration tests, fix trigger types * Page Migration Tracing Support Refactoring (#753) * Reorganization * Update page migration init/fini * Formatting * Update page_migration.cpp - change logging severity * Skip test if KFD does not support page migration reporting * Rework skipping test if KFD does not support page migration * Fix event trigger enum values * Fix clang-diagnostic-unused-const-variable --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> |
||
|
|
3eaa678054 |
CTest Environment Update (#756)
* Update test/tools/json-tool.cpp - push/pop ppid as external correlation id instead of pid * Update environment variables for tests and samples * Revert to old CDash dashboard in run-ci.py * Revert to new CDash dashboard in run-ci.py |
||
|
|
8c5399a68a |
Update HSA async copy active signals handling (#732)
* Enable INFO logging on retried CI jobs
* Update lib/rocprofiler-sdk/async_copy.cpp
- rework active_signals
- make hsa_signal_t member variable
- remove sync from destructor
- replace _is_set with atomic counter
- timeout of 30 seconds hsa_signal_wait
- switch from relaxed to scacquire/screlease memory ordering
- improve logging and error handling
- destroy hsa signal in active_signals in async_fini
* Update lib/rocprofiler-sdk/async_copy.cpp
- active_signals::create
- change initial value of signal to 1 instead of value of completion signal
- change condition trigger of signal callback
* Update tests/counter-collection/validate.py
* Update lib/rocprofiler-sdk/async_copy.cpp
- improved logging
- fix hsa_signal_wait_scacquire_fn check
* Cleanup tests/lib/transpose/transpose.cpp
- remove huge comment block
* Appears to be working on MI200
Dependency Versions:
clr:
|
||
|
|
e2d8ccad4b |
adding pandas and pytest to requirements.txt (#748)
* adding pandas and pytest to requirements.txt * setting up requirements.txt * Update requirements - formatting packages - remove packages not directly used by rocprofiler-sdk * Update cmake formatting, linting, and options - if BUILD_CI -> force BUILD_DEVELOPER and BUILD_WERROR - support python installed clang-format and python installed clang-tidy * Update build.sh - split into install-deps.sh and install-apt-deps.sh * Improve code coverage --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
176d1552cf |
Update to Clang-tidy-15 (#742)
* Update continuous_integration.yml * Update build.sh * Update continuous_integration.yml * Update build.sh * Update continuous_integration.yml |
||
|
|
5bb087f072 |
Adding useful scripts for formatting and building (#737)
* Adding useful scripts for formatting and building * Update build.sh * Update build.sh * Update continuous_integration.yml |
||
|
|
2905fb5e95 |
Update run-ci.py (#641)
* Temp: Fixing node id * source formatting (clang-format v11) (#709) Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com> * Using logical node id * Update agent.cpp * Update agent.cpp * Python formatting * Update run-ci.py * Update run-ci.py * Update continuous_integration.yml * Update continuous_integration.yml running directly using the prepared runner container * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update run-ci.py * Clean up * Fixing install paths * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Fixing GPU Agents Test Validation * python formatting (black) (#712) Co-authored-by: ammarwa 
<3832908+ammarwa@users.noreply.github.com> * Fixing the issue with rocclr detected kernels __amd_rocclr_.* * python formatting (black) (#713) Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com> * Fixing the issue with rocclr detected kernels __amd_rocclr_.* * Fixing static number of async copies and using hsa_api instead for validation * python formatting (black) (#714) Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com> * Increasing the time limit for waiting on active signals * Update continuous_integration.yml * Update async_copy.cpp * Update CMakeLists.txt * changing node id to logical node id in rocprofv3 * Update tool.cpp * testing async mem copy signal decrement * Update logging.cpp * Update validate.py --------- Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler1.amd.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com> Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler2.amd.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
2f9b1767e9 |
Handle hsa_queue_destroy after finalization (#679)
* Handle hsa_queue_destroy after finalization
- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue
* Update HIP/HSA/marker update_table logging
* Update rocprofv3 tests
- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them
* Disable thread sanitizer deadlock detection
* Update CI workflow
- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers
* Update run-ci.py
- set gcovr html medium and high threshold
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- remove this capture from enable/disable serialization
* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*
- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map
* Logging for HIP/HSA/marker/profile_serializer
* Logging for HIP/HSA/marker/queue_controller
* Improve test_retired_correlation_ids asserts
* Fix tests/counter-collection/validate.py
- scale expected SQ_WAVES counter value based on warp size of GPU
* Tweak github comment for code coverage
* Remove gcovr html high/medium threshold args
* Fix tests/counter-collection/validate.py
- round before casting to int in test_counter_values
* operator bool for profile_serializer
- only wait on CV if profile_serializer is used
* Logging updates (profile_serializer + code_object)
* Update counter-collection validate.py
* QueueController does not wait on CV if finalizing/finalized
* Update CI workflow
- remove navi32 from core job
* Improve HIP/HSA/marker tracing get_functor/functor
- remove lambda wrapper around functor
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- do not acquire cvmutex lock during finalization
* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*
- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized
* Update CI workflow
- remove navi32 runners
* bwelton fixes for hangs
* CMake improvements + simplified demangle
- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
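Note on "round before casting to int in test_counter_values": the fix itself lives in the Python validation script, but the same pitfall is easy to show in C++. The sketch below is illustrative only; the helper names `to_int_truncating` and `to_int_rounded` are hypothetical and do not exist in the repository.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers (not from the repository) contrasting the two conversions.

// Converts by truncation toward zero, so a value that lands just below an
// integer because of floating-point error loses a whole unit.
inline int to_int_truncating(double value)
{
    assert(std::isfinite(value));
    return static_cast<int>(value);
}

// Rounds to the nearest integer first, then converts; small floating-point
// error no longer changes the result.
inline int to_int_rounded(double value)
{
    assert(std::isfinite(value));
    return static_cast<int>(std::lround(value));
}
```

A scaled counter value such as 2.9999999 becomes 2 with the truncating helper and 3 with the rounding one, which is presumably the kind of off-by-one the test fix avoids.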
|
||
|
|
1de44447f4 |
Deadlock Fix for HSA and Serialization Disable/Enabling support (#582)
* Initial barrier
* Working on profiler serializer extraction
* Current progress
* Serialization Support
* source formatting (clang-format v11) (#583) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* cmake formatting (cmake-format) (#584) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* Minor fix
* Current Progress
* Current progress
* More fixes
* Serialization Fixes
* Bug fix
* source formatting (clang-format v11) (#600) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* More fixes
* More minor fixes
* source formatting (clang-format v11) (#603) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* source formatting (clang-format v11) (#604) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* Lock order inversion false positive
* order fix
* More changes
* source formatting (clang-format v11) (#607) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* minor test fix
* Minor test changes
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
|
||
|
|
a1267e1fd2 |
C compatibility for public headers (#566)
* C compatibility for public headers
- add tests/tools/c-tool.c
- builds a tool (which does nothing) with C language
- ensures that tool can be compiled in C
- add tests/c-tool/CMakeLists.txt
- ensures that tool library built from C is a valid tool
- rocprofiler_counter_info_v0_t is_derived is int instead of bool
- C does not have bool unless <stdbool.h> is included
- add `include/rocprofiler-sdk/hsa/api_trace_version.h`
- handles providing HSA_*_TABLE_(MAJOR|STEP)_VERSION values if compiled from C
- cmake define in version.h.in for ROCPROFILER_HSA_*_TABLE_(MAJOR|STEP)_VERSION
- HSA table versions compiled with
- use rocprofiler_(hsa|hip|marker)_api_no_args struct to handle incompatibility between empty structs in C vs. C++ (size of 0 vs. size of 1)
- extern "C" in include/rocprofiler-sdk/{hsa,hip,marker}/api_args.h
- fixed spelling error: derrived -> derived
- scope YY_NO_INPUT compile definition to lib/rocprofiler-sdk/counters/parser/*
* Revert CDash dashboard
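Note on the empty-struct incompatibility mentioned above: an empty struct has a nonzero size in C++, whereas GNU C accepts empty structs as an extension and gives them size 0, so a header shared by both languages cannot rely on one. A minimal sketch of the workaround, assuming the "no args" type simply carries one placeholder member; `example_no_args` is a hypothetical stand-in and is not the definition used in the SDK headers.

```cpp
#include <cassert>
#include <cstddef>

// An empty struct. In C++ every complete object type has a size of at least 1;
// in GNU C the same declaration would have size 0.
struct empty_args
{};

// Hypothetical stand-in for a "no args" type: a single placeholder member
// gives the struct the same size (1 byte) in both C and C++.
struct example_no_args
{
    char empty;
};

static_assert(sizeof(empty_args) >= 1, "C++ never reports size 0 for a struct");

// Returns the size of the placeholder struct after checking the expected layout.
inline std::size_t no_args_size()
{
    assert(sizeof(example_no_args) == sizeof(char));
    return sizeof(example_no_args);
}
```

Using a named placeholder type instead of `struct {}` keeps `sizeof` and any containing union layout identical regardless of which language compiles the header.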
|
||
|
|
4bb95f885b | Update run-ci.py (#534) | ||
|
|
0d939edbba |
Updates/fixes for CI, docs, tests, samples, and common library (#528)
- .github/workflows/continuous_integration.yml - apt-get update before apt-get install - remove libgtest-dev - actions-comment-pull-request: v2.4.3 -> v2.5.0 - .github/workflows/formatting.yml - create-pull-request: v5 -> v6 - cmake/rocprofiler_options.cmake - remove unused ROCPROFILER_DEBUG_TRACE and ROCPROFILER_LD_AQLPROFILE options - samples/counter_collection/callback_client.cpp - corr_id field renamed to correlation_id - samples/counter_collection/client.cpp - corr_id field renamed to correlation_id - include/rocprofiler-sdk/fwd.h - In rocprofiler_record_counter_t: rename corr_id field to correlation_id - doxygen fixes - lib/common/utility.* - remove get_accurate_clock_id_impl - timestamp_ns() defaults to CLOCK_BOOTTIME - lib/rocprofiler-sdk/counters/core.cpp - fix spelling mistake: extrenal -> external - corr_id field renamed to correlation_id - lib/rocprofiler-sdk-tool/tool.cpp - fix destruction of static tool::output_file before finalization - scripts/update-docs.sh - define PROJECT_NAME - tests/async-copy-tracing/validate.py - init_time and fini_time checks - hip_api_traces, marker_api_tracing - tests/common/serialization.hpp - fix save function for rocprofiler_record_counter_t following rename of corr_id to correlation_id - tests/kernel-tracing/validate.py - init_time and fini_time checks - relax test_total_runtime range - tests/rocprofv3/tracing/CMakeLists.txt - remove -M from rocprofv3-test-systrace-execute - exclude test_hsa_api_trace in rocprofv3-test-systrace-validate due to HIP API tracing - tests/rocprofv3/tracing/validate.py - update test_kernel_trace to accept mangled or demangled - tests/tools/json-tool.cpp - remove use of GLOG - include init_time and fini_time - write_json(...) function |
||
|
|
3638351b4c |
Callback based handler for counter collection (#506)
* Callback based handler for counter collection * source formatting (clang-format v11) (#507) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * cmake formatting (cmake-format) (#508) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Doc fix * Minor doc fix * More doc fixes * More doc fixes * More doc fixes * Update CI * Changes to the API per comments * Mutex exception for HSA * source formatting (clang-format v11) (#511) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Doc fix --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: bwelton <bwelton@users.noreply.github.com> |
||
|
|
c443663032 |
Swap container to compute-rocm-dkms-component-staging-profiler (#412)
* Swap container to compute-rocm-dkms-component-staging-profiler compute-rocm-dkms-component-staging-profiler contains the staging branches for aqlprofile and others that are needed by the CI to function. * python formatting (black) (#414) Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> |
||
|
|
9a8b6f6b7b |
Counter API and Samples Updates (#410)
* Update include/rocprofiler-sdk/{counters,profile_config}.h
- use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update samples
- use rocprofiler-sdk::rocprofiler-sdk instead of rocprofiler::rocprofiler in cmake
- api_callback_tracing sample roctxProfiler{Pause,Resume}
- api_callback_tracing sample uses ROCTx
- updates to use rocprofiler_agent_id_t
* Update run-ci.py
- exclude rocprofiler-sdk-tool from samples (no sample uses that code)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- Update rocprofiler_iterate_agent_supported_counters to use agent ID
* Update lib/rocprofiler-sdk/counters/core.*
- profile_config has pointer to agent instead of copy
* Update lib/rocprofiler-sdk/agent.*
- provide get_agent(...) func via rocp agent id
* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED for enums missing implementation
* Update lib/rocprofiler-sdk/counters.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update lib/rocprofiler-sdk/profile_config.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update source/docs
- requirements.txt + install reqs in cmake
* Bump version to 0.1.0
* Update samples/api_callback_tracing/CMakeLists.txt
- LD_LIBRARY_PATH for test
* Update test/rocprofv3/tracing/CMakeLists.txt
- reorder validation files so memory copy comes first
* Update lib/rocprofiler-sdk-tool/tool.cpp
- logging for flushing buffers
- variables for buffer_size and buffer_watermark
- increase the watermark to a full buffer
- use dedicated threads for each buffer
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- test sets ROCPROF_LOG_LEVEL and ROCPROFILER_LOG_LEVEL to info
* Remove lib/rocprofiler-sdk-tool/trace_buffer.hpp
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- drop log level to warning when leak sanitizer is enabled (produces small memory leak)
|
||
|
|
c641749fe6 |
HIP API Tracing (#357)
* Update include/rocprofiler-sdk/hip*
- updates for intercept table
* Update lib/common/units.hpp
- clang-tidy fixes
* Add lib/rocprofiler-sdk/hip
- tracing implementation for the HIP intercept table
* Update source/lib/rocprofiler-sdk/CMakeLists.txt
- add_subdirectory(hip)
* Update source/lib/rocprofiler-sdk/hsa
- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION
* Update lib/rocprofiler-sdk/hip
- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/hsa/utils.hpp
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/tests/intercept_table.cpp
- remove failures for intercepting HIP API tables
* Update include/rocprofiler-sdk/fwd.h
- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args
* Update lib/rocprofiler-sdk/intercept_table.cpp
- support HipDispatchTable and HipCompilerDispatchTable
* Update lib/rocprofiler-sdk/internal_threading.cpp
- Support ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/registration.cpp
- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging
* Update samples/api_{buffered,callback}_tracing
- Modifications to demonstrate HIP API tracing
* Update tests/kernel-tracing
- Modifications to handle/test HIP API tracing
* Separate HIP tracing from HIP compiler tracing
* Fix installation of include/rocprofiler-sdk/hip/*
- add compiler and table headers to install
* Fixes to HIP interception
- hip_api_trace.hpp was updated a bit
- removed hipGetDeviceProperties (generic)
- added hipGetDevicePropertiesR0600
- added hipGetDevicePropertiesR0000
- removed hipRegisterTracerCallback
- reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
- added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers
* Update lib/rocprofiler-sdk/hip/hip.*
- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update lib/rocprofiler-sdk/hsa/hsa.*
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update test/kernel-tracing/validate.py
- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register
* Update tests/tools/json-tool.cpp
- fix context associated with "HIP_API_CALLBACK"
* Update external/CMakeLists.txt
- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
- BUILD_TESTING (OFF)
- BUILD_SHARED_LIBS (OFF)
- BUILD_OBJECT_LIBS (OFF)
- BUILD_STATIC_LIBS (ON)
- CMAKE_POSITION_INDEPENDENT_CODE (ON)
- CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
- CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog
* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt
- remove explicit setting of SKIP_BUILD_RPATH
* Update CMakeLists.txt
- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH
* Update tests/CMakeLists.txt
- include(GNUInstallDirs)
* Update samples/CMakeLists.txt
- include(GNUInstallDirs)
* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h
- remove extern "C" due to incompatibility between empty struct in C (size 0) vs. empty struct in C++ (size 1)
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- clang-tidy fixes
* Update cmake/rocprofiler_linting.cmake
- add a feature for clang tidy exe
* Update lib/rocprofiler-sdk/hip/hip.cpp
- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- fix merge
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- fix merge
* Update bin/rocprofv3
- args for marker, HIP runtime, and HIP compiler tracing
* Update tests/apps/simple-transpose
- use roctx
* Update tests/rocprofv3/tracing
- validate marker API data
* Update lib/rocprofiler-sdk-tool
- support for HIP runtime, HIP compiler, marker API
* Update queue/queue_controller/registration/utility
- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
- implements a thread yield + sleep
- add a sync function to Queue class
- add an iterate_queues member function to QueueController
- this is used to sync each queue during queue_controller_fini()
* Fix data races: queue/context/stable_vector
- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array
* Update lib/rocprofiler-sdk/hsa/hsa.*
- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables
* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp
- use HSA subtable accessors
* Update rocprofiler_memcheck and CI workflow
- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
- GCC 13 uses libtsan.so.2
* Update CI workflow
* Update lib/rocprofiler-sdk/counters/{metrics,counters}
- fix possibly dangling reference to a temporary from gcc-13
* Update thread-sanitizer-suppr.txt
- Ignore data races originating in hsa-runtime library
* Update cmake/rocprofiler_memcheck.cmake
- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library
* Update tests/rocprofv3/tracing/CMakeLists.txt
- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test
* Update lib/common/container/record_header_buffer.hpp
- fix data race identified by gcc v13 and libtsan.so.2
* Update hip API id, args, and def
- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0)
* Update lib/common/container/record_header_buffer.hpp
- fix deadlock in save/read/reset
* Update source/docs/CMakeLists.txt
- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- remove overloads for HIP_MEMSET_NODE_PARAMS
* Update docs/CMakeLists.txt
- use find_program for shell instead of hardcoded /bin/bash
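Note on "use recursion instead of fold expression" in the list above: the two forms visit the same indices, but a fold expands into one large nested expression while the recursive form emits one small call per index. The following is a minimal C++17 sketch of both forms, not the code from lib/rocprofiler-sdk/hip/hip.cpp; `visit_fold`, `visit_recursive`, and `sum_indices` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Fold-expression form: expands into a single comma expression over all indices.
template <typename FuncT, std::size_t... Idx>
void visit_fold(FuncT&& func, std::index_sequence<Idx...>)
{
    (func(Idx), ...);
}

// Recursive form: each instantiation handles one index and then calls the
// next instantiation, so no single expression grows with the index count.
template <std::size_t Idx, std::size_t N, typename FuncT>
void visit_recursive(FuncT&& func)
{
    if constexpr(Idx < N)
    {
        func(Idx);
        visit_recursive<Idx + 1, N>(func);
    }
}

// Sums the indices [0, N) with both forms and checks that they agree.
template <std::size_t N>
std::size_t sum_indices()
{
    std::size_t fold_sum = 0;
    std::size_t rec_sum  = 0;
    visit_fold([&](std::size_t i) { fold_sum += i; }, std::make_index_sequence<N>{});
    visit_recursive<0, N>([&](std::size_t i) { rec_sum += i; });
    assert(fold_sum == rec_sum);
    return rec_sum;
}
```

Calling `sum_indices<5>()` visits indices 0 through 4 with either form. Which form a given tool accepts for very large index counts depends on its nesting and instantiation limits.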
|
||
|
|
c5e45803e9 |
Code Coverage Reporting (#334)
* Update lib/rocprofiler-sdk/counters/{tests,parser/tests}/CMakeLists.txt
- use rocprofiler-static-library instead of rocprofiler-object-library
* Update scripts/run-ci.py
- support gcovr and pycobertura
* Update CI workflow for code coverage
- load/save cache for XML code coverage (via gcovr)
- generate and write code coverage comment
- archive code coverage HTML report
- fix name for sanitizer jobs
* Update CI workflow
- tweaks to env for PATH and LD_LIBRARY_PATH
* Add scripts/upload-image-to-github.py
- script for saving images to orphan branches to be used in markdown links
* Update CI workflow
- fix upload artifact conflict
- use upload-image-to-github.py
* Update CI workflow
- install extra packages for wkhtmltopdf/wkhtmltoimage
* Update CI workflow (code coverage)
- install more recent git
- tweak package installs for wkhtmltopdf/wkhtmltoimage
* Update CI workflow (code coverage)
- remove duplicate --cap-add=SYS_PTRACE
* Update CI and upload-image-to-github.py
- print versions
* Update upload-image-to-github.py
- check exit code of some subprocesses
* Update CI workflow
- fix GITHUB_PATH ordering
- fix LD_LIBRARY_PATH
* Update CI workflow
- fix code coverage cache keys (use SHAs)
- copy .codecov to .codecov.ref if a cached .codecov exists
* Update upload-image-to-github.py
- Update git pull/push commands
* Update upload-image-to-github.py
- git fetch before pulling
- git pull before committing
* Update upload-image-to-github.py
- git fetch after committing
- git pull after committing
* Update CI workflow
- list files before cat
* Update upload-image-to-github.py
- output messages
* Update CI workflow and upload-image-to-github.py
- fix output directory path for script to work with CI workflow
* Update CI workflow
- finishing touches/fixes on the code coverage comment generation
* Reproducible filenames
* Update CI workflow
- fix archive of code coverage data
* Fix relative path of reproducible file loc
* Update upload-image-to-github.py
- change update method
* rocprofiler-v2-internal -> rocprofiler-sdk-internal
|
||
|
|
9a0c84efa6 |
Use -sdk suffix and reset VERSION to 0.0.0 (#263)
* Fix find_package(rocprofiler) in build tree * Move include/rocprofiler to include/rocprofiler-sdk * Update include/CMakeLists.txt - add_subdirectory(rocprofiler-sdk) * Move lib/rocprofiler to lib/rocprofiler-sdk * Move lib/rocprofiler-tool to lib/rocprofiler-sdk-tool * Update lib/CMakeLists.txt - add_subdirectory(rocprofiler-sdk) - add_subdirectory(rocprofiler-sdk-tool) * Update lib/rocprofiler-sdk/CMakeLists.txt * Rename rocprofiler-tool to rocprofiler-sdk-tool * Replace include rocprofiler/ with include rocprofiler-sdk/ * Replace include lib/rocprofiler/ with include lib/rocprofiler-sdk/ * Set VERSION to 0.0.0 and finish install to rocprofiler-sdk * More fixes for rocprofiler -> rocprofiler-sdk - fix issue with rocprofiler-sdk-config.cmake.in - fix counters xml install path * Fix documentation generation * Create rocprofiler_LIB_ROCPROFILER_SDK_DIR for build tree * cmake formatting (cmake-format) (#264) Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> |
||
|
|
cf5e4b4b1b |
Integration Testing (#211)
* Add external/cereal submodule - used for integration testing * Update lib/common/container/small_vector.hpp - documentation notes * Update tests/apps - update transpose app (fix build) - add reproducible-runtime app * Update include/rocprofiler/fwd.h - rocprofiler_service_callback_phase_t -> rocprofiler_callback_phase_t * Update PTL submodule - fix for task group: submitting tasks from different thread * Update lib/rocprofiler/hsa/queue.cpp - CHECK_NOTNULL(_buffer) * Update lib/rocprofiler/hsa/hsa.cpp - use buffer::get_buffer instead of manually looking for buffer * Update lib/rocprofiler/internal_threading.cpp - use buffer::get_buffer instead of manually looking for buffer * Update lib/rocprofiler/buffer.cpp - offset the buffer id - properly handle rocprofiler_create_buffer reusing rocprofiler_buffer_id_t on a different context * Update tests - kernel tracing library for integration testing * Add cereal submodule * Update lib/rocprofiler/registration.* - OnUnload - Support ROCP_TOOL_LIBRARIES for python usage - improve finalize function - remove calling hsa_shut_down in finalize function * Update lib/rocprofiler/buffer.* - allocate_buffer sets the buffer id value - expose (internally) is_valid_buffer_id - update test * Update tests/kernel-tracing - installation - better organization of JSON groups - improved messaging * Update lib/rocprofiler/registration.cpp - add workaround for hsa-runtime supporting rocprofiler-register * Update tests/kernel-tracing/kernel-tracing.cpp - fix memory leaks * cereal support for minimal JSON - update cereal submodule to rocprofiler branch - change REPO_BRANCH in rocprofiler_checkout_git_submodule for cereal - update tests/kernel-tracing/kernel-tracing.cpp - use minimal json - slight tweak putting giving contexts name in storing name + context pointer pair in map * Update tests/kernel-tracing/kernel-tracing.cpp - support runtime selection of contexts via KERNEL_TRACING_CONTEXTS environment variable * Update tests - 
tests/CMakeLists.txt - find_package(Python3 REQUIRED) - tests/kernel-tracing - pytest validation * Update CI workflow - install pytest - add checks for test labels * Update scripts/run-ci.py - change --coverage options - replace 'unittests' with 'tests' - replace test label regex '-L unittests' with '-L tests' * Update requirements.txt - this is now an empty file since none of the packages are required for this repo |
||
|
|
086218c2eb |
Fixes licensing in files (#206)
* Update LICENSE - fix inconsistencies * Revert lib/rocprofiler/counters/parser/scanner.cpp * Update lib/rocprofiler/counters/tests/dimension.cpp - revert ending curly brace * Revert missing curly braces - missing curly braces when file did not end with a new line |
||
|
|
3082288a25 |
Code object, kernel dispatch, and memory copy tracing (#177)
* Update samples/api_buffered_tracing
- external correlation id
- support ROCPROFILER_BUFFER_TRACING_KERNEL_DISPATCH
* Update lib/rocprofiler/context.cpp
- update alternative get_active_contexts paradigm
* Update lib/rocprofiler/external_correlation.cpp
- inherit correlation id from main thread
* Update lib/rocprofiler/hsa/queue.*
- typedef changes
- rocprofiler_packet union
- modify Queue::queue_info_session_t
- use rocprofiler_packet
- add thread id
- add kernel id
- add correlation id
- out of line definitions
- AsyncSignalHandler function update
- handle kernel dispatch tracing
- Move CreateBarrierPacket and AddVendorSpecificPacket to lambdas
- handle contexts
* Update lib/rocprofiler/hsa/hsa.cpp
- remove unnecessary log function
- use new get_active_contexts paradigm
- use new correlation id updates
* Update AgentCache and kernel dispatch record
- include const rocprofiler_agent_t* in rocprofiler_buffer_tracing_kernel_dispatch_record_t
- AgentCache::get_rocp_agent returns const pointer
* Replace ROCPROFILER_SERVICE_ with ROCPROFILER_
* source formatting
* Code Object Tracing
- include/rocprofiler/callback_tracing.h
- remove rocprofiler_callback_tracing_code_object_unload_data_t
- remove rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t
- include/rocprofiler/fwd.h
- remove ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_UNLOAD
- remove ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_UNREGISTER
- lib/common/utility.hpp
- assert_public_api_struct_properties()
- init_public_api_struct(...)
- lib/rocprofiler/registration.cpp
- invoke hsa::code_object_init
- lib/rocprofiler/hsa/CMakeLists.txt
- compile code_object code
- lib/rocprofiler/hsa/code_object.{hpp,cpp}
- tracing code object load/unload
- lib/rocprofiler/hsa/queue.cpp
- get_kernel_id
* Update lib/rocprofiler/hsa/hsa.cpp
- fix should_wrap_functor logic (which was not handling callback_tracer + buffered_tracer properly)
* Update lib/rocprofiler/hsa/queue.cpp
- fix rocprofiler_buffer_tracing_kernel_dispatch_record_t construction
* Update samples/api_buffered_tracing/client.cpp
- print kernel names
* Move samples/apps to tests/apps
* Update lib/rocprofiler/hsa/code_object.cpp
- ensure unload callbacks when application is exiting
- support user data in between load/unload callbacks
* Update lib/rocprofiler/hsa/queue.{hpp,cpp}
- store contexts and external correlation ids in queue_info_session
- reduce signal_limiter to 96 to fix hangs
- fix support for kernel tracing and async memory copies
* Add lib/common/scope_destructor.hpp
- similar to static_cleanup_wrapper but different
* Update include/rocprofiler/buffer_tracing.h
- update rocprofiler_buffer_tracing_memory_copy_record_t
- remove operation: user can figure that out from correlation id
- add kernel id
- add rocprofiler agent id
* Update include/rocprofiler/callback_tracing.h
- fix data type of load_delta field in code object
- remove rocp_agent from kernel_symbol_register_data_t (known via code_object_id)
* Add samples/code_object_tracing
- sample demonstrating code object tracing
* Update samples
- minor tweak to print_call_stack
* Update lib/rocprofiler/hsa/code_object.cpp
- flip ordering of unload callbacks for code object unloading and kernel symbol deregistering
* clang-tidy fixes
* Update lib/rocprofiler/hsa/code_object.cpp
- fix heap-use-after-free issue with code object
* Update include/rocprofiler/external_correlation.h
- update documentation to include info about default value of external correlation value
* Use common::container::small_vector for contexts
- small_vector<const context*> is an ideal data structure for array of active contexts
* Update context handling for code object unload
- code object unload is only called for contexts which received the load callback
* Update samples
- improve ROCPROFILER_CALL macro to include status string
- api_buffered_tracing handles ROCPROFILER_STATUS_ERROR_BUFFER_BUSY
* Code object shutdown
- ensure code object callbacks are invoked prior to finalizing
* Update lib/common (memory allocators)
- added lib/common/memory folder with allocators
* Add lib/rocprofiler/allocator.*
- rocprofiler::allocator::static_data_allocator
- special allocator for static data which finalizes before any data gets destroyed
- rocprofiler::allocator::unique_static_ptr_t
- unique_ptr that uses static data deleter (ensure finalize is called)
* Update lib/rocprofiler/buffer.cpp
- flush checks fini status
- use unique_static_ptr_t
* Update lib/rocprofiler/internal_threading.*
- change meaning of thread_pool_t and task_group_t
- improve finalization to prevent data races and heap-use-after-free
* Update lib/rocprofiler/registration.cpp
- use static_data_allocator for client_library vector
* Update lib/rocprofiler/context/context.*
- use allocator::unique_static_ptr_t
* Update lib/rocprofiler/allocator.cpp
- avoid deadlock in deleter<static_data>::operator()
* Update lib/rocprofiler/registration.cpp
- avoid deadlock in rocprofiler::registration::finalize()
* Update lib/rocprofiler/hsa/code_object.cpp
- suppress duplicate reporting of code-object/kernel-symbol load/unload
* Update leak sanitizer suppressions
- __new_exitfn (via stdlib/cxa_atexit.c) leaks
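Note on "unique_ptr that uses static data deleter (ensure finalize is called)" in the list above: the idea can be sketched as a `std::unique_ptr` alias whose deleter triggers a one-time finalization step before freeing the object. This is an assumption about the general pattern, not the contents of lib/rocprofiler/allocator.*; the `sketch` namespace and every name in it are hypothetical.

```cpp
#include <cassert>
#include <memory>

namespace sketch
{
// Counts how many times finalization actually ran (for illustration only).
inline int& finalize_count()
{
    static int count = 0;
    return count;
}

// Runs the finalization step at most once, no matter how many objects are destroyed.
inline void finalize_once()
{
    static bool done = false;
    if(!done)
    {
        done = true;
        ++finalize_count();
    }
}

// Deleter that finalizes before the pointed-to static data is destroyed.
template <typename T>
struct static_data_deleter
{
    void operator()(T* ptr) const
    {
        assert(ptr != nullptr);  // unique_ptr never invokes the deleter on null
        finalize_once();
        delete ptr;
    }
};

// unique_ptr alias that routes destruction through the finalizing deleter.
template <typename T>
using unique_static_ptr = std::unique_ptr<T, static_data_deleter<T>>;
}  // namespace sketch
```

With this shape, destroying the first `sketch::unique_static_ptr<T>` triggers finalization and every later destruction only frees memory, so finalization always precedes the teardown of the remaining static data.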
|
||
|
|
be42677f7a |
Update scripts/leak-sanitizer-suppr.txt (#132)
- ignore leaks from hsa-amd-aqlprofile library |
||
|
|
010693b795 |
Agent, Counters, and AQL (#55)
* Migrate XML counter defs and reader from v1/v2 * Current Working Set * Modified parser * Evaluate AST Start * Update lib/common/xml - move definitions out of class declaration * Update lib/rocprofiler/counters/parser - update build of bison and flex build - reproducible generation - add ROCPROFILER_REGENERATE_COUNTERS_PARSER option - fix namespacing * Update lib/rocprofiler/counters/xml - change location of XML files and install them * Update lib/rocprofiler/counter/tests - normalize the test names - improve test failures (more clear about where failure is) * Update lib/rocprofiler/counters - fix namespace - update to new XML metrics directory * Update lib/rocprofiler/CMakeLists.txt - link to object library * Update lib/rocprofiler/hsa/types.hpp - reorganize includes * Add metric loading class/printers * Agent Implementation * Queue Implementation (#79) * Queue Implementation * API Implementation For Counters (part 1) (#80) * API Implementation For Counters * Bewelton/counter collection 3 (#84) * Added counter sample * More changes * More changes * Update samples/counter_collection - mostly formatting * Update include/rocprofiler/counters.h - formatting * Add lib.common/synchronized.hpp - Synchronized struct * Update lib/rocprofiler/counters/xml/basic_counters.xml - whitespace * Update scripts/patch-parser.cmake - tweaks for consistency * Update lib/rocprofiler/counters/parser/tests/parser_tests.cpp - formatting * Update lib/rocprofiler/counters/parser - improve consistency in rocprofiler-expr-parser-patch - update parser.{h,cpp} and scanner.cpp - formatting + regenerated * Update lib/rocprofiler/aql - formatting - clang-tidy fixes - guard against memory pool access errors * Update lib/rocprofiler/aql/tests - formatting - update use of get_val - normalize test names * Update lib/rocprofiler/counters/tests - formatting - patch basic_counters and derived_counters - normalize test names * Update lib/rocprofiler/aql/tests - set_tests_properties * Update test labels - 
fix minor issue with gtest labels * Update lib/rocprofiler/counters - formatting - clang-tidy fixes * Update lib/rocprofiler/hsa - fix includes - formatting - clang-tidy fixes - tweak to queue_controller_init interface * Update lib/rocprofiler - include fixes - namespace fixes - clang-tidy fixes - formatting * Update scripts/run-ci.py - exclude counters/parser from code coverage (generated files) * Update include/rocprofiler/counters.h - fix doxygen comment * Update lib/rocprofiler/aql/packet_construct.cpp - guard against HSA_AMD_MEMORY_POOL_ACCESS_DISALLOWED_BY_DEFAULT and HSA_AMD_MEMORY_POOL_ACCESS_NEVER_ALLOWED * Update lib/rocprofiler/counters/parser/raw_ast.hpp - clang-tidy fixes * Update lib/rocprofiler/counters/evaluate_ast.hpp - clang-tidy fixes * Update lib/rocprofiler/aql/tests - disable packet_generation_single and packet_generation_multi tests - the entire implementation rocprofiler::get_ext_table() is incorrect * Minor fixes before cleanup * More changes * More fixes * More fixes * source formatting (clang-format v11) (#99) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Revert PTL submodule * Update scripts/run-ci.py - exclude counters/parser from code coverage (generated files) * Migrating counters state to context * Linting * source formatting (clang-format v11) (#101) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * revert run-ci * Testing fixes * More test changes * Fix minor typo * Small queue change * Small queue change * source formatting (clang-format v11) (#102) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * source formatting (clang-format v11) (#105) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Documentation Change * More documentation fixes * source formatting (clang-format v11) (#106) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Threading fixes * Threading fixes * source formatting (clang-format v11) (#107) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * 
Threading fixes * More test fixes * More agent fixes * More build fixes * source formatting (clang-format v11) (#109) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * changed test timeouts * Build fix * Build fix * Updates to agent * source formatting (clang-format v11) (#114) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * cmake formatting (cmake-format) (#113) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * remove git worktree folder * Doc update * testing fix * Another test fix * More test changes * Rebase * source formatting (clang-format v11) (#116) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Documentation * source formatting (clang-format v11) (#119) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * PTL Changes * Minor agent fix for empty labels * source formatting (clang-format v11) (#120) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Minor agent fix for empty labels * Refactor read_map * source formatting (clang-format v11) (#121) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Refactor read_map * Cache fixes * source formatting (clang-format v11) (#122) Co-authored-by: bwelton <bwelton@users.noreply.github.com> --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: bwelton <bwelton@users.noreply.github.com> |