cc0c401615ffb2ec73d669aba7073fcd1cf8899c
48 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
e743bf5a93 |
Undefined behavior warnings caught by ROCPROFILER_DEFAULT_FAIL_REGEX (#23)
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX
- add UBSAN_OPTIONS to setup-sanitizer-env.sh
* Improve ROCPROFILER_DEFAULT_FAIL_REGEX
* Use -fno-sanitize-recover=undefined flag
- this compiler flag causes all undefined behavior errors to exit
* Revert ROCPROFILER_DEFAULT_FAIL_REGEX
* fix for shift overflow
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com> |
||
|
|
97b7a6315d |
update copyright date to 2025 (#102)
* Update LICENSE
* Update conf.py
* Update copyright year
* [fix] Update copyright year
* Update copyright year "ROCm Developer Tools"
* Add license headers to c++ files
* Add license to *.py
* Update licenses in rocdecode sources
---------
Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
73e7f8cfb1 |
ROCTx Documentation (#29)
* Add roctx doc
* Add roctx doxyfile input
* Update links and toc
* Build doxysphinx for both doxygen files
* Update scripts
* Generate roctx doxygen files
* Change doxygen path to allow for 2 doxyfiles
* Make doxygen dir for script
* Call make _doxygen dir with p flag
* Create _doxygen dir in workflow
* Create doc dirs for doxygen
* Run update docs as sudo
* Fix typo in mkdir command
* Include graphviz for dot
* Install dot for docs CI
* Install dot as sudo due to permission denied
* Install doxygen via sudo
* Install doxysphinx
* Add postcheckout step to RTD to config and gen doxygen docs
* On RTD, update doxygen after creating env
* update docs.yml
* update docs.yml
* fixing build-docs-from-source
* Fixing build docs from source
* update docs.yml
* trying to fix readthedocs
* trying to fix readthedocs
* update docs.yml
* improve mainpage documentation
* update docs
* clang-format fix
---------
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com> |
||
|
|
2c3bdeaed9 |
Download perfetto trace_processor_shell (#105)
* Download perfetto trace_processor_shell
* Upgrade to perfetto-trace-processor-shell v0.0.4
* Fix run-ci.py warning
- warning message:
CMake Warning (dev) at /.../build/CTestCustom.cmake:16:
Syntax Warning in cmake code at column 77
Argument not separated from preceding token by whitespace.
* Update tests/pytest-packages/pytest_utils/perfetto_reader.py
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
00c46fd5e5 |
SDK: OMPT Support (#22)
* Ability to select alternative compiler per file
Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.
Misc updates
Update OpenMP target sample
- samples/ompt -> samples/openmp_target
- fix sample test of openmp-target
- reorganize files
Rework OpenMP implementation
Minor OpenMP implementation cleanup
Rename samples/openmp_target CMake targets
Add tests/bin/openmp
- OpenMP target test app in tests/bin/openmp/target
Format samples/openmp_target CMakeLists.txt
Misc lib/rocprofiler-sdk/openmp cleanup
- fix includes
- convert_arg
Update openmp.def.cpp
- tweak includes
- remove lots of temporary variables
Update samples
- common::get_callback_id_names() -> common::get_callback_tracing_names()
- add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample
Fix code object operation names
- add "CODE_OBJECT_" prefix
Update include/rocprofiler-sdk/openmp/api_id.h
- remove spurious comment
Miscellaneous openmp updates
- similar API for openmp_begin and openmp_end
- move implementations of ompt callbacks to openmp.cpp
- ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events
[SWDEV-484495] Fix int truncation in CSV output (#1098)
CSV output truncates doubles to ints when it shouldn't. Derived metrics
are (mostly) doubles and lose precision (or become worthless) if treated
as an int. Converted these to double to match the format we return from
rocprof-sdk.
Co-authored-by: Benjamin Welton <ben@amd.com>
Update limit for max counter records in rocprof-tool (#1073)
A fixed sized std::array is used to store counter records in rocprofiler SDK. This limit was breached in SWDEV-484742. Upping the limit to 512 to be less likely to reach this limit again.
adding proxy ompt_data_t * arguments
fixes for proxy pointers
- Implement proxy ompt_data_t* pointers for clients
- Add ompt_data_t* arguments back to callback API
- Modify openmp sample to illustrate use of proxy pointers
formatting
SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083)
Fixing some accumulate metrics (#1089)
* Fixing some accumulate metrics
* Fixing some more accumulate metrics
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
updating rocprofv3 help options (#1113)
* updating rocprofv3 help options
* updating CHANGELOG
Fixing installed package tests in CI (#1119)
* Fixing installed package tests in CI
* Formatted rocprofv3.py with black formatter
SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112)
* SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests.
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Adding backlog for codeobj changes
* Formatting
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
---------
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
SWDEV-487621: Fixes for metric definitions (#1118)
* Fixes for metric definitions
* Removing gfx8
* Update changelog
* Fixing unit tests
* Small fixes
* Fix for write size
Fix PSDB change (#1120)
Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit
|
||
|
|
d564f759a5 |
Updating CI
Update continuous_integration.yml Update continuous_integration.yml Adding EMU Runners Update continuous_integration.yml Update continuous_integration.yml Bump thollander/actions-comment-pull-request from 2.5.0 to 3.0.1 Bumps [thollander/actions-comment-pull-request](https://github.com/thollander/actions-comment-pull-request) from 2.5.0 to 3.0.1. - [Release notes](https://github.com/thollander/actions-comment-pull-request/releases) - [Commits](https://github.com/thollander/actions-comment-pull-request/compare/v2.5.0...v3.0.1) --- updated-dependencies: - dependency-name: thollander/actions-comment-pull-request dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Update continuous_integration.yml Update continuous_integration.yml Update run-ci.py Update upload-image-to-github.py Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml Update continuous_integration.yml using github output Update continuous_integration.yml Revert temp change Update continuous_integration.yml Update continuous_integration.yml |
||
|
|
5eb8c2658c |
rocprofv3: refactor and reorganize rocprofiler-sdk-tool library (#1138)
* Add rocprofv3-multi-node.md to source/lib/rocprofiler-sdk-tool
* Initial source re-organization
- create "output" static library
* Update include/rocprofiler-sdk/cxx/serialization.hpp
- add GPR count fields to kernel symbol serialization
* Add source/scripts/generate-rocpd.py
- reads one or more JSON output files from rocprofv3 and writes rocpd SQLite3 database
- Note: preliminary implementation
* More reorganization b/t lib/rocprofiler-sdk-tool and lib/output
* Updates to generate-rocpd.py
- add SQL views
- option: --absolute-timestamps -> --normalize-timestamps
- option: --generic-markers
- misc fixes with regards to getting the views working
- support marker names
* Update generate-rocpd.py
- Add --marker-mode option
* Update generate-rocpd.py
- Improve debugging of bad bulk SQLite statements
* Update rocprofv3-multi-node.md
- cleanup of proposed SQL schema
* lib/output/format_path.{hpp,cpp}
- rename format to format_path (in config.hpp and config.cpp)
- move format_path functionality to format_path.{hpp,cpp}
* Rework lib/output/tmp_file_buffer.{hpp,cpp}
* Update output_key.cpp
- support %cwd%, %launch_date%
* Rework lib/output/buffered_output.hpp
* Support csv_output_file constructed via domain_type
* Update lib/output/domain_type.{hpp,cpp}
- get_domain_trace_file_name
- get_domain_stats_file_name
* Update lib/rocprofiler-sdk-tool/tool.cpp
- tweak headers
* Update lib/output/generate*.cpp
- remove include of helpers.hpp
- CSV uses domain_type for filenames
* Update samples/counter_collection/per_dev_serialization.cpp
- make wait_on volatile
* Remove tool_table from lib/output and lib/rocprofiler-sdk-tool
- Also split various structs into their own files
- lib/output/agent_info
- lib/output/metadata
- lib/output/kernel_symbol_info
- lib/output/counter_info
- Implemented rocprofiler::tool::metadata
* Optimize rocprofiler_tool_counter_collection_record_t
- reduce the size of the struct from 24784 bytes to 8376 bytes
* Introduced output_config
- split subset of config (from tools library) into output_config to be able to configure the output generating functions separately from the tool library
- this is a significant step towards the output generating functions not relying on static global memory
* Stream chunks of data into output instead of loading all info into memory
* Remove duplicate group_segment_size in rocprofiler_kernel_dispatch_info_t serialization
* Adding Q&A to rocprofv3-multi-node.md
* Remove all remaining include lib/rocprofiler-sdk-tool from lib/output
- migrated a fair amount of code from lib/rocprofiler-sdk-tool/helper.hpp to lib/output
* Update Q&A of rocprofv3-multi-node.md
* Fix minor compilation errors + minor cleanup
* Update hsa/async_copy.cpp
- when ROCPROFILER_CI_STRICT_TIMESTAMPS > 0, reduce the active_signal sync wait time
* Update profiling_time.hpp
- fix log messages for when start/end time is less/greater than enqueue/current CPU time
* Fix generate_stats for tool_counter_record_t
* Dictionary optimization for generate-rocpd.py
---------
Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com>
|
||
|
|
5e1643cf81 |
rocprofv3: stabilize rocprofv3 summary tests (#1161)
* Update tests/bin/transpose/transpose.cpp
- add hipMemGetInfo call to display the available vs. total memory on the GPU
* Update tests/rocprofv3/summary/validate.py
- Updated test_summary_display_data after addition of hipMemGetInfo to transpose test exe
* Tweak code coverage comment uploading
- create unique orphan branch per PR
- reduce quality of PNG files (85 -> 70)
* Revert some of code coverage comment uploading
- remove creation of unique orphan branch per PR
* Tweak code coverage comment uploading
- create unique orphan branch per PR |
||
|
|
37e0d7efce |
Fix misaligned stores in buffer (#1063)
* Fix misaligned read/write to buffer
- causes undefined behavior
* Update run-ci.py
- fix spurious CDash submission failure warning
* Improve run-ci.py support for UBSan
* Relax rocprofv3 summary stats count expectation
* Update CHANGELOG |
||
|
|
8b986afbdb |
Update run-ci.py with new cdash portal (#1048)
* Update run-ci.py * Update run-ci.py * Update run-ci.py * Update run-ci.py * Update run-ci.py * Update run-ci.py |
||
|
|
69caa62b60 |
rocprofv3 doc updates (#982)
* updating rocprofv3
* using rocprofv3
* review updates
* naming standardization
* Update source/docs/how-to/using-rocprofv3.rst
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
* review comments
* adding API references
* kernel filtering
* Remove Sphinx warn as error
To bypass false warning for linking between rst and md
* remove unused (duplicate) refs in _toc.yml.in
---------
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
Co-authored-by: Leo Paoletti <164940351+lpaoletti@users.noreply.github.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Peter Jun Park <peter.park@amd.com> |
||
|
|
94b5d9be3f |
Adding changes for handling abort signals (#979)
* Adding changes for handling abort signals
* Fix the test failure
* Fixing CmakeLists error
* Addressing review comments
* fixing warnings
* fixing execute test
* Fixing abort app test
* Address review comments
* Apply suggestions from code review
* Apply suggestions from code review
* Fixes for testing issues
* Adding kernel filtering test
* Removing text input file
* fix formatting issues
* misc fix
* Suppress signal-unsafe error in ThreadSanitizer
- rename signal handler to rocprofv3_error_signal_handler to ensure specific filtering
* Fix rocprofv3 aborted-app validation
---------
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
1f96593b4f |
Test using HIP Graphs (#835)
* Test using hip graphs
* Remove assert for api_end < async_end
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Remove submit retry
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout
* Update lib/common/container/record_header_buffer.hpp
- minor tweaks
* Update lib/rocprofiler-sdk/buffer.hpp
- tweak ROCPROFILER_BUFFER_POLICY_LOSSLESS flush behavior
* Increase rocprofv3-test-trace-hip-in-libraries-validate timeout
* Update rocprofv3/tracing-hip-in-libraries::test_api_trace
* Revert rocprofv3-test-trace-hip-in-libraries-validate timeout
* Update run-ci.py
- RETRY_COUNT set to zero |
||
|
|
d15cf17635 |
Relax default CDash submission requirements in run-ci.py (#836)
* Update run-ci.py to not require successful CDash submission by default
* Minor tweak to run-ci.py |
||
|
|
29bc84ec0c |
Add default values for kernel struct (#798)
* Add default values for kernel struct
* Update hsa-queue-dependency app
- default initializers
- check HSA_AMD_MEMORY_POOL_INFO_RUNTIME_ALLOC_ALLOWED for memory pools
- clang-tidy fixes (member -> static, etc.)
* Update run-ci.py
- add --progress --output-on-failure -V if no other options regarding verbosity are passed
- improve the ability to control the stages
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
87d549c8a9 |
Adding Keyword search pattern (#768)
* Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Adding the scan as a script * clean up * Update continuous_integration.yml |
||
|
|
fd3d97287c |
Page migration reporting (#651)
* Page migration reporting support
* Page migration: Update parser and reporting
Container does not have latest KFD header, so CI might fail
* Add kfd_ioctl.h
* Formatting
* Update get_key
- get key was not used (and shouldn't be), so delete it
* clang-tidy fixes
* Tests for page migration
* Apply suggestions from code review
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update tests/bin/page-migration/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update page-migration test app
- add hipHostRegister to register mmap'ed allocation with HIP
- misc cleanup and reorg
- remove HSA_XNACK=1 from test env
* Update lib/rocprofiler-sdk/tests/page_migration.cpp
- fix compilation error
* Minor updates (reorg, rename)
* Page migration reporting support
* Page migration: Update parser and reporting
Container does not have latest KFD header, so CI might fail
* Update page migration tests, fix trigger types
* Page Migration Tracing Support Refactoring (#753)
* Reorganization
* Update page migration init/fini
* Formatting
* Update page_migration.cpp
- change logging severity
* Skip test if KFD does not support page migration reporting
* Rework skipping test if KFD does not support page migration
* Fix event trigger enum values
* Fix clang-diagnostic-unused-const-variable
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> |
||
|
|
3eaa678054 |
CTest Environment Update (#756)
* Update test/tools/json-tool.cpp
- push/pop ppid as external correlation id instead of pid
* Update environment variables for tests and samples
* Revert to old CDash dashboard in run-ci.py
* Revert to new CDash dashboard in run-ci.py |
||
|
|
8c5399a68a |
Update HSA async copy active signals handling (#732)
* Enable INFO logging on retried CI jobs
* Update lib/rocprofiler-sdk/async_copy.cpp
- rework active_signals
- make hsa_signal_t member variable
- remove sync from destructor
- replace _is_set with atomic counter
- timeout of 30 seconds hsa_signal_wait
- switch from relaxed to scacquire/screlease memory ordering
- improve logging and error handling
- destroy hsa signal in active_signals in async_fini
* Update lib/rocprofiler-sdk/async_copy.cpp
- active_signals::create
- change initial value of signal to 1 instead of value of completion signal
- change condition trigger of signal callback
* Update tests/counter-collection/validate.py
* Update lib/rocprofiler-sdk/async_copy.cpp
- improved logging
- fix hsa_signal_wait_scacquire_fn check
* Cleanup tests/lib/transpose/transpose.cpp
- remove huge comment block
* Appears to be working on MI200
Dependency Versions:
clr:
|
||
|
|
e2d8ccad4b |
adding pandas and pytest to requirements.txt (#748)
* adding pandas and pytest to requirements.txt
* setting up requirements.txt
* Update requirements
- formatting packages
- remove packages not directly used by rocprofiler-sdk
* Update cmake formatting, linting, and options
- if BUILD_CI -> force BUILD_DEVELOPER and BUILD_WERROR
- support python installed clang-format and python installed clang-tidy
* Update build.sh
- split into install-deps.sh and install-apt-deps.sh
* Improve code coverage
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
176d1552cf |
Update to Clang-tidy-15 (#742)
* Update continuous_integration.yml * Update build.sh * Update continuous_integration.yml * Update build.sh * Update continuous_integration.yml |
||
|
|
5bb087f072 |
Adding useful scripts for formatting and building (#737)
* Adding useful scripts for formatting and building
* Update build.sh
* Update build.sh
* Update continuous_integration.yml |
||
|
|
2905fb5e95 |
Update run-ci.py (#641)
* Temp: Fixing node id * source formatting (clang-format v11) (#709) Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com> * Using logical node id * Update agent.cpp * Update agent.cpp * Python formatting * Update run-ci.py * Update run-ci.py * Update continuous_integration.yml * Update continuous_integration.yml running directly using the prepared runner container * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update run-ci.py * Clean up * Fixing install paths * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Update continuous_integration.yml * Fixing GPU Agents Test Validation * python formatting (black) (#712) Co-authored-by: ammarwa 
<3832908+ammarwa@users.noreply.github.com> * Fixing the issue with rocclr detected kernels __amd_rocclr_.* * python formatting (black) (#713) Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com> * Fixing the issue with rocclr detected kernels __amd_rocclr_.* * Fixing static number of async copies and using hsa_api instead for validation * python formatting (black) (#714) Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com> * Increasing the time limit for waiting on active signals * Update continuous_integration.yml * Update async_copy.cpp * Update CMakeLists.txt * changing node id to logical node id in rocprofv3 * Update tool.cpp * testing async mem copy signal decrement * Update logging.cpp * Update validate.py --------- Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler1.amd.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com> Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler2.amd.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
2f9b1767e9 |
Handle hsa_queue_destroy after finalization (#679)
* Handle hsa_queue_destroy after finalization
- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue
* Update HIP/HSA/marker update_table logging
* Update rocprofv3 tests
- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them
* Disable thread sanitizer deadlock detection
* Update CI workflow
- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers
* Update run-ci.py
- set gcovr html medium and high threshold
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- remove this capture from enable/disable serialization
* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*
- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map
* Logging for HIP/HSA/marker/profile_serializer
* Logging for HIP/HSA/marker/queue_controller
* Improve test_retired_correlation_ids asserts
* Fix tests/counter-collection/validate.py
- scale expected SQ_WAVES counter value based on warp size of GPU
* Tweak github comment for code coverage
* Remove gcovr html high/medium threshold args
* Fix tests/counter-collection/validate.py
- round before casting to int in test_counter_values
* operator bool for profile_serializer
- only wait on CV if profile_serializer is used
* Logging updates (profile_serializer + code_object)
* Update counter-collection validate.py
* QueueController does not wait on CV if finalizing/finalized
* Update CI workflow
- remove navi32 from core job
* Improve HIP/HSA/marker tracing get_functor/functor
- remove lambda wrapper around functor
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- do not acquire cvmutex lock during finalization
* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*
- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized
* Update CI workflow
- remove navi32 runners
* bwelton fixes for hangs
* CMake improvements + simplified demangle
- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
|
||
|
|
1de44447f4 |
Deadlock Fix for HSA and Serialization Disable/Enabling support (#582)
* Initial barrier
* Working on profiler serializer extraction
* Current progress
* Serialization Support
* source formatting (clang-format v11) (#583)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* cmake formatting (cmake-format) (#584)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* Minor fix
* Current Progress
* Current progress
* More fixes
* Serialization Fixes
* Bug fix
* source formatting (clang-format v11) (#600)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* More fixes
* More minor fixes
* source formatting (clang-format v11) (#603)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* source formatting (clang-format v11) (#604)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* Lock order inversion false positive
* order fix
* More changes
* source formatting (clang-format v11) (#607)
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
* minor test fix
* Minor test changes
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com> |
||
|
|
a1267e1fd2 |
C compatibility for public headers (#566)
* C compatibility for public headers
- add tests/tools/c-tool.c
- builds a tool (which does nothing) with C language
- ensures that tool can be compiled in C
- add tests/c-tool/CMakeLists.txt
- ensures that tool library build from C is a valid tool
- rocprofiler_counter_info_v0_t is_derived is int instead of bool
- C does not have bool unless <stdbool.h> is included
- add `include/rocprofiler-sdk/hsa/api_trace_version.h`
- handles providing HSA_*_TABLE_(MAJOR|STEP)_VERSION values if compiled from C
- cmake define in version.h.in for ROCPROFILER_HSA_*_TABLE_(MAJOR|STEP)_VERSION
- HSA table versions compiled with
- use rocprofiler_(hsa|hip|marker)_api_no_args struct to handle incompatibility b/t empty structs in C vs. C++ (size of 0 vs. size of 1)
- extern "C" in include/rocprofiler-sdk/{hsa,hip,marker}/api_args.h
- fixed spelling error: derrived -> derived
- scope YY_NO_INPUT compile definition to lib/rocprofiler-sdk/counters/parser/*
* Revert CDash dashboard
|
||
|
|
4bb95f885b | Update run-ci.py (#534) | ||
|
|
0d939edbba |
Updates/fixes for CI, docs, tests, samples, and common library (#528)
- .github/workflows/continuous_integration.yml - apt-get update before apt-get install - remove libgtest-dev - actions-comment-pull-request: v2.4.3 -> v2.5.0 - .github/workflows/formatting.yml - create-pull-request: v5 -> v6 - cmake/rocprofiler_options.cmake - remove unused ROCPROFILER_DEBUG_TRACE and ROCPROFILER_LD_AQLPROFILE options - samples/counter_collection/callback_client.cpp - corr_id field renamed to correlation_id - samples/counter_collection/client.cpp - corr_id field renamed to correlation_id - include/rocprofiler-sdk/fwd.h - In rocprofiler_record_counter_t: rename corr_id field to correlation_id - doxygen fixes - lib/common/utility.* - remove get_accurate_clock_id_impl - timestamp_ns() defaults to CLOCK_BOOTTIME - lib/rocprofiler-sdk/counters/core.cpp - fix spelling mistake: extrenal -> external - corr_id field renamed to correlation_id - lib/rocprofiler-sdk-tool/tool.cpp - fix destruction of static tool::output_file before finalization - scripts/update-docs.sh - define PROJECT_NAME - tests/async-copy-tracing/validate.py - init_time and fini_time checks - hip_api_traces, marker_api_tracing - tests/common/serialization.hpp - fix save function for rocprofiler_record_counter_t following rename of corr_id to correlation_id - tests/kernel-tracing/validate.py - init_time and fini_time checks - relax test_total_runtime range - tests/rocprofv3/tracing/CMakeLists.txt - remove -M from rocprofv3-test-systrace-execute - exclude test_hsa_api_trace in rocprofv3-test-systrace-validate due to HIP API tracing - tests/rocprofv3/tracing/validate.py - update test_kernel_trace to accept mangled or demangled - tests/tools/json-tool.cpp - remove use of GLOG - include init_time and fini_time - write_json(...) function |
||
|
|
3638351b4c |
Callback based handler for counter collection (#506)
* Callback based handler for counter collection
* source formatting (clang-format v11) (#507)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* cmake formatting (cmake-format) (#508)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Doc fix
* Minor doc fix
* More doc fixes
* More doc fixes
* More doc fixes
* Update CI
* Changes to the API per comments
* Mutex exception for HSA
* source formatting (clang-format v11) (#511)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Doc fix
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <bwelton@users.noreply.github.com> |
||
|
|
c443663032 |
Swap container to compute-rocm-dkms-component-staging-profiler (#412)
* Swap container to compute-rocm-dkms-component-staging-profiler
compute-rocm-dkms-component-staging-profiler contains the staging branches for aqlprofile and others that are needed by the CI to function.
* python formatting (black) (#414)
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> |
||
|
|
9a8b6f6b7b |
Counter API and Samples Updates (#410)
* Update include/rocprofiler-sdk/{counters,profile_config}.h
- use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update samples
- use rocprofiler-sdk::rocprofiler-sdk instead of rocprofiler::rocprofiler in cmake
- api_callback_tracing sample roctxProfiler{Pause,Resume}
- api_callback_tracing sample uses ROCTx
- updates to use rocprofiler_agent_id_t
* Update run-ci.py
- exclude rocprofiler-sdk-tool from samples (no sample uses that code)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- Update rocprofiler_iterate_agent_supported_counters to use agent ID
* Update lib/rocprofiler-sdk/counters/core.*
- profile_config has pointer to agent instead of copy
* Update lib/rocprofiler-sdk/agent.*
- provide get_agent(...) func via rocp agent id
* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED for enums missing implementation
* Update lib/rocprofiler-sdk/counters.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update lib/rocprofiler-sdk/profile_config.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update source/docs
- requirements.txt + install reqs in cmake
* Bump version to 0.1.0
* Update samples/api_callback_tracing/CMakeLists.txt
- LD_LIBRARY_PATH for test
* Update test/rocprofv3/tracing/CMakeLists.txt
- reorder validation files so memory copy comes first
* Update lib/rocprofiler-sdk-tool/tool.cpp
- logging for flushing buffers
- variables for buffer_size and buffer_watermark
- increase the watermark to a full buffer
- use dedicated threads for each buffer
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- test sets ROCPROF_LOG_LEVEL and ROCPROFILER_LOG_LEVEL to info
* Remove lib/rocprofiler-sdk-tool/trace_buffer.hpp
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- drop log level to warning when leak sanitizer is enabled (produces small memory leak)
|
||
|
|
c641749fe6 |
HIP API Tracing (#357)
* Update include/rocprofiler-sdk/hip*
- updates for intercept table
* Update lib/common/units.hpp
- clang-tidy fixes
* Add lib/rocprofiler-sdk/hip
- tracing implementation for the HIP intercept table
* Update source/lib/rocprofiler-sdk/CMakeLists.txt
- add_subdirectory(hip)
* Update source/lib/rocprofiler-sdk/hsa
- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION
* Update lib/rocprofiler-sdk/hip
- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/hsa/utils.hpp
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/tests/intercept_table.cpp
- remove failures for intercepting HIP API tables
* Update include/rocprofiler-sdk/fwd.h
- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args
* Update lib/rocprofiler-sdk/intercept_table.cpp
- support HipDispatchTable and HipCompilerDispatchTable
* Update lib/rocprofiler-sdk/internal_threading.cpp
- Support ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/registration.cpp
- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging
* Update samples/api_{buffered,callback}_tracing
- Modifications to demonstrate HIP API tracing
* Update tests/kernel-tracing
- Modifications to handle/test HIP API tracing
* Separate HIP tracing from HIP compiler tracing
* Fix installation of include/rocprofiler-sdk/hip/*
- add compiler and table headers to install
* Fixes to HIP interception
- hip_api_trace.hpp was updated a bit
- removed hipGetDeviceProperties (generic)
- added hipGetDevicePropertiesR0600
- added hipGetDevicePropertiesR0000
- removed hipRegisterTracerCallback
- reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
- added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers
* Update lib/rocprofiler-sdk/hip/hip.*
- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update lib/rocprofiler-sdk/hsa/hsa.*
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update test/kernel-tracing/validate.py
- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register
* Update tests/tools/json-tool.cpp
- fix context associated with "HIP_API_CALLBACK"
* Update external/CMakeLists.txt
- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
- BUILD_TESTING (OFF)
- BUILD_SHARED_LIBS (OFF)
- BUILD_OBJECT_LIBS (OFF)
- BUILD_STATIC_LIBS (ON)
- CMAKE_POSITION_INDEPENDENT_CODE (ON)
- CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
- CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog
* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt
- remove explicit setting of SKIP_BUILD_RPATH
* Update CMakeLists.txt
- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH
* Update tests/CMakeLists.txt
- include(GNUInstallDirs)
* Update samples/CMakeLists.txt
- include(GNUInstallDirs)
* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h
- remove extern "C" due to incompatibility b/t empty struct in C (size 0) vs. empty struct in C++ (size 1)
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- clang-tidy fixes
* Update cmake/rocprofiler_linting.cmake
- add a feature for clang tidy exe
* Update lib/rocprofiler-sdk/hip/hip.cpp
- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- fix merge
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- fix merge
* Update bin/rocprofv3
- args for marker, HIP runtime, and HIP compiler tracing
* Update tests/apps/simple-transpose
- use roctx
* Update tests/rocprofv3/tracing
- validate marker API data
* Update lib/rocprofiler-sdk-tool
- support for HIP runtime, HIP compiler, marker API
* Update queue/queue_controller/registration/utility
- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
- implements a thread yield + sleep
- add a sync function to Queue class
- add a iterate_queues member function to QueueController
- this is used to sync each queue during queue_controller_fini()
* Fix data races: queue/context/stable_vector
- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array
* Update lib/rocprofiler-sdk/hsa/hsa.*
- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables
* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp
- use HSA subtable accessors
* Update rocprofiler_memcheck and CI workflow
- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
- GCC 13 uses libtsan.so.2
* Update CI workflow
* Update lib/rocprofiler-sdk/counters/{metrics,counters}
- fix possibly dangling reference to a temporary from gcc-13
* Update thread-sanitizer-suppr.txt
- Ignore data races originating in hsa-runtime library
* Update cmake/rocprofiler_memcheck.cmake
- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library
* Update tests/rocprofv3/tracing/CMakeLists.txt
- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test
* Update lib/common/container/record_header_buffer.hpp
- fix data race identified by gcc v13 and libtsan.so.2
* Update hip API id, args, and def
- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0)
* Update lib/common/container/record_header_buffer.hpp
- fix deadlock in save/read/reset
* Update source/docs/CMakeLists.txt
- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- remove overloads for HIP_MEMSET_NODE_PARAMS
* Update docs/CMakeLists.txt
- use find_program for shell instead of hardcoded /bin/bash
|
||
|
|
c5e45803e9 |
Code Coverage Reporting (#334)
* Update lib/rocprofiler-sdk/counters/{tests,parser/tests}/CMakeLists.txt
- use rocprofiler-static-library instead of rocprofiler-object-library
* Update scripts/run-ci.py
- support gcovr and pycobertura
* Update CI workflow for code coverage
- load/save cache for XML code coverage (via gcovr)
- generate and write code coverage comment
- archive code coverage HTML report
- fix name for sanitizer jobs
* Update CI workflow
- tweaks to env for PATH and LD_LIBRARY_PATH
* Add scripts/upload-image-to-github.py
- script for saving images to orphan branches to be used in markdown links
* Update CI workflow
- fix upload artifact conflict
- use upload-image-to-github.py
* Update CI workflow
- install extra packages for wkhtmltopdf/wkhtmltoimage
* Update CI workflow (code coverage)
- install more recent git
- tweak package installs for wkhtmltopdf/wkhtmltoimage
* Update CI workflow (code coverage)
- remove duplicate --cap-add=SYS_PTRACE
* Update CI and upload-image-to-github.py
- print versions
* Update upload-image-to-github.py
- check exit code of some subprocesses
* Update CI workflow
- fix GITHUB_PATH ordering
- fix LD_LIBRARY_PATH
* Update CI workflow
- fix code coverage cache keys (use SHAs)
- copy .codecov to .codecov.ref if a cached .codecov exists
* Update upload-image-to-github.py
- Update git pull/push commands
* Update upload-image-to-github.py
- git fetch before pulling
- git pull before committing
* Update upload-image-to-github.py
- git fetch after committing
- git pull after committing
* Update CI workflow
- list files before cat
* Update upload-image-to-github.py
- output messages
* Update CI workflow and upload-image-to-github.py
- fix output directory path for script to work with CI workflow
* Update CI workflow
- finishing touches/fixes on the code coverage comment generation
* Reproducible filenames
* Update CI workflow
- fix archive of code coverage data
* Fix relative path of reproducible file loc
* Update upload-image-to-github.py
- change update method
* rocprofiler-v2-internal -> rocprofiler-sdk-internal
|
||
|
|
9a0c84efa6 |
Use -sdk suffix and reset VERSION to 0.0.0 (#263)
* Fix find_package(rocprofiler) in build tree
* Move include/rocprofiler to include/rocprofiler-sdk
* Update include/CMakeLists.txt
- add_subdirectory(rocprofiler-sdk)
* Move lib/rocprofiler to lib/rocprofiler-sdk
* Move lib/rocprofiler-tool to lib/rocprofiler-sdk-tool
* Update lib/CMakeLists.txt
- add_subdirectory(rocprofiler-sdk)
- add_subdirectory(rocprofiler-sdk-tool)
* Update lib/rocprofiler-sdk/CMakeLists.txt
* Rename rocprofiler-tool to rocprofiler-sdk-tool
* Replace include rocprofiler/ with include rocprofiler-sdk/
* Replace include lib/rocprofiler/ with include lib/rocprofiler-sdk/
* Set VERSION to 0.0.0 and finish install to rocprofiler-sdk
* More fixes for rocprofiler -> rocprofiler-sdk
- fix issue with rocprofiler-sdk-config.cmake.in
- fix counters xml install path
* Fix documentation generation
* Create rocprofiler_LIB_ROCPROFILER_SDK_DIR for build tree
* cmake formatting (cmake-format) (#264)
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
|
||
|
|
cf5e4b4b1b |
Integration Testing (#211)
* Add external/cereal submodule
- used for integration testing
* Update lib/common/container/small_vector.hpp
- documentation notes
* Update tests/apps
- update transpose app (fix build)
- add reproducible-runtime app
* Update include/rocprofiler/fwd.h
- rocprofiler_service_callback_phase_t -> rocprofiler_callback_phase_t
* Update PTL submodule
- fix for task group: submitting tasks from different thread
* Update lib/rocprofiler/hsa/queue.cpp
- CHECK_NOTNULL(_buffer)
* Update lib/rocprofiler/hsa/hsa.cpp
- use buffer::get_buffer instead of manually looking for buffer
* Update lib/rocprofiler/internal_threading.cpp
- use buffer::get_buffer instead of manually looking for buffer
* Update lib/rocprofiler/buffer.cpp
- offset the buffer id
- properly handle rocprofiler_create_buffer reusing rocprofiler_buffer_id_t on a different context
* Update tests
- kernel tracing library for integration testing
* Add cereal submodule
* Update lib/rocprofiler/registration.*
- OnUnload
- Support ROCP_TOOL_LIBRARIES for python usage
- improve finalize function
- remove calling hsa_shut_down in finalize function
* Update lib/rocprofiler/buffer.*
- allocate_buffer sets the buffer id value
- expose (internally) is_valid_buffer_id
- update test
* Update tests/kernel-tracing
- installation
- better organization of JSON groups
- improved messaging
* Update lib/rocprofiler/registration.cpp
- add workaround for hsa-runtime supporting rocprofiler-register
* Update tests/kernel-tracing/kernel-tracing.cpp
- fix memory leaks
* cereal support for minimal JSON
- update cereal submodule to rocprofiler branch
- change REPO_BRANCH in rocprofiler_checkout_git_submodule for cereal
- update tests/kernel-tracing/kernel-tracing.cpp
- use minimal json
- slight tweak giving contexts a name by storing a name + context pointer pair in a map
* Update tests/kernel-tracing/kernel-tracing.cpp
- support runtime selection of contexts via KERNEL_TRACING_CONTEXTS environment variable
* Update tests
- tests/CMakeLists.txt
- find_package(Python3 REQUIRED)
- tests/kernel-tracing
- pytest validation
* Update CI workflow
- install pytest
- add checks for test labels
* Update scripts/run-ci.py
- change --coverage options
- replace 'unittests' with 'tests'
- replace test label regex '-L unittests' with '-L tests'
* Update requirements.txt
- this is now an empty file since none of the packages are required for this repo
|
||
|
|
086218c2eb |
Fixes licensing in files (#206)
* Update LICENSE
- fix inconsistencies
* Revert lib/rocprofiler/counters/parser/scanner.cpp
* Update lib/rocprofiler/counters/tests/dimension.cpp
- revert ending curly brace
* Revert missing curly braces
- missing curly braces when file did not end with a new line
|
||
|
|
3082288a25 |
Code object, kernel dispatch, and memory copy tracing (#177)
* Update samples/api_buffered_tracing
- external correlation id
- support ROCPROFILER_BUFFER_TRACING_KERNEL_DISPATCH
* Update lib/rocprofiler/context.cpp
- update alternative get_active_contexts paradigm
* Update lib/rocprofiler/external_correlation.cpp
- inherit correlation id from main thread
* Update lib/rocprofiler/hsa/queue.*
- typedef changes
- rocprofiler_packet union
- modify Queue::queue_info_session_t
- use rocprofiler_packet
- add thread id
- add kernel id
- add correlation id
- out of line definitions
- AsyncSignalHandler function update
- handle kernel dispatch tracing
- Move CreateBarrierPacket and AddVendorSpecificPacket to lambdas
- handle contexts
* Update lib/rocprofiler/hsa/hsa.cpp
- remove unnecessary log function
- use new get_active_contexts paradigm
- use new correlation id updates
* Update AgentCache and kernel dispatch record
- include const rocprofiler_agent_t* in rocprofiler_buffer_tracing_kernel_dispatch_record_t
- AgentCache::get_rocp_agent returns const pointer
* Replace ROCPROFILER_SERVICE_ with ROCPROFILER_
* source formatting
* Code Object Tracing
- include/rocprofiler/callback_tracing.h
- remove rocprofiler_callback_tracing_code_object_unload_data_t
- remove rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t
- include/rocprofiler/fwd.h
- remove ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_UNLOAD
- remove ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_DEVICE_KERNEL_SYMBOL_UNREGISTER
- lib/common/utility.hpp
- assert_public_api_struct_properties()
- init_public_api_struct(...)
- lib/rocprofiler/registration.cpp
- invoke hsa::code_object_init
- lib/rocprofiler/hsa/CMakeLists.txt
- compile code_object code
- lib/rocprofiler/hsa/code_object.{hpp,cpp}
- tracing code object load/unload
- lib/rocprofiler/hsa/queue.cpp
- get_kernel_id
* Update lib/rocprofiler/hsa/hsa.cpp
- fix should_wrap_functor logic (which was not handling callback_tracer + buffered_tracer properly)
* Update lib/rocprofiler/hsa/queue.cpp
- fix rocprofiler_buffer_tracing_kernel_dispatch_record_t construction
* Update samples/api_buffered_tracing/client.cpp
- print kernel names
* Move samples/apps to tests/apps
* Update lib/rocprofiler/hsa/code_object.cpp
- ensure unload callbacks when application is exiting
- support user data in between load/unload callbacks
* Update lib/rocprofiler/hsa/queue.{hpp,cpp}
- store contexts and external correlation ids in queue_info_session
- reduce signal_limiter to 96 to fix hangs
- fix support for kernel tracing and async memory copies
* Add lib/common/scope_destructor.hpp
- similar to static_cleanup_wrapper but different
* Update include/rocprofiler/buffer_tracing.h
- update rocprofiler_buffer_tracing_memory_copy_record_t
- remove operation: user can figure that out from correlation id
- add kernel id
- add rocprofiler agent id
* Update include/rocprofiler/callback_tracing.h
- fix data type of load_delta field in code object
- remove rocp_agent from kernel_symbol_register_data_t (known via code_object_id)
* Add samples/code_object_tracing
- sample demonstrating code object tracing
* Update samples
- minor tweak to print_call_stack
* Update lib/rocprofiler/hsa/code_object.cpp
- flip ordering of unload callbacks for code object unloading and kernel symbol deregistering
* clang-tidy fixes
* Update lib/rocprofiler/hsa/code_object.cpp
- fix heap-use-after-free issue with code object
* Update include/rocprofiler/external_correlation.h
- update documentation to include info about default value of external correlation value
* Use common::container::small_vector for contexts
- small_vector<const context*> is an ideal data structure for array of active contexts
* Update context handling for code object unload
- code object unload is only called for contexts which received the load callback
* Update samples
- improve ROCPROFILER_CALL macro to include status string
- api_buffered_tracing handles ROCPROFILER_STATUS_ERROR_BUFFER_BUSY
* Code object shutdown
- ensure code object callbacks are invoked prior to finalizing
* Update lib/common (memory allocators)
- added lib/common/memory folder with allocators
* Add lib/rocprofiler/allocator.*
- rocprofiler::allocator::static_data_allocator
- special allocator for static data which finalizes before any data gets destroyed
- rocprofiler::allocator::unique_static_ptr_t
- unique_ptr that uses static data deleter (ensure finalize is called)
* Update lib/rocprofiler/buffer.cpp
- flush checks fini status
- use unique_static_ptr_t
* Update lib/rocprofiler/internal_threading.*
- change meaning of thread_pool_t and task_group_t
- improve finalization to prevent data races and heap-use-after-free
* Update lib/rocprofiler/registration.cpp
- use static_data_allocator for client_library vector
* Update lib/rocprofiler/context/context.*
- use allocator::unique_static_ptr_t
* Update lib/rocprofiler/allocator.cpp
- avoid deadlock in deleter<static_data>::operator()
* Update lib/rocprofiler/registration.cpp
- avoid deadlock in rocprofiler::registration::finalize()
* Update lib/rocprofiler/hsa/code_object.cpp
- suppress duplicate reporting of code-object/kernel-symbol load/unload
* Update leak sanitizer suppressions
- __new_exitfn (via stdlib/cxa_atexit.c) leaks
|
||
|
|
be42677f7a |
Update scripts/leak-sanitizer-suppr.txt (#132)
- ignore leaks from hsa-amd-aqlprofile library |
||
|
|
010693b795 |
Agent, Counters, and AQL (#55)
* Migrate XML counter defs and reader from v1/v2
* Current Working Set
* Modified parser
* Evaluate AST Start
* Update lib/common/xml
- move definitions out of class declaration
* Update lib/rocprofiler/counters/parser
- update build of bison and flex build
- reproducible generation
- add ROCPROFILER_REGENERATE_COUNTERS_PARSER option
- fix namespacing
* Update lib/rocprofiler/counters/xml
- change location of XML files and install them
* Update lib/rocprofiler/counter/tests
- normalize the test names
- improve test failures (more clear about where failure is)
* Update lib/rocprofiler/counters
- fix namespace
- update to new XML metrics directory
* Update lib/rocprofiler/CMakeLists.txt
- link to object library
* Update lib/rocprofiler/hsa/types.hpp
- reorganize includes
* Add metric loading class/printers
* Agent Implementation
* Queue Implementation (#79)
* Queue Implementation
* API Implementation For Counters (part 1) (#80)
* API Implementation For Counters
* Bewelton/counter collection 3 (#84)
* Added counter sample
* More changes
* More changes
* Update samples/counter_collection
- mostly formatting
* Update include/rocprofiler/counters.h
- formatting
* Add lib.common/synchronized.hpp
- Synchronized struct
* Update lib/rocprofiler/counters/xml/basic_counters.xml
- whitespace
* Update scripts/patch-parser.cmake
- tweaks for consistency
* Update lib/rocprofiler/counters/parser/tests/parser_tests.cpp
- formatting
* Update lib/rocprofiler/counters/parser
- improve consistency in rocprofiler-expr-parser-patch
- update parser.{h,cpp} and scanner.cpp
- formatting + regenerated
* Update lib/rocprofiler/aql
- formatting
- clang-tidy fixes
- guard against memory pool access errors
* Update lib/rocprofiler/aql/tests
- formatting
- update use of get_val
- normalize test names
* Update lib/rocprofiler/counters/tests
- formatting
- patch basic_counters and derived_counters
- normalize test names
* Update lib/rocprofiler/aql/tests
- set_tests_properties
* Update test labels
- fix minor issue with gtest labels
* Update lib/rocprofiler/counters
- formatting
- clang-tidy fixes
* Update lib/rocprofiler/hsa
- fix includes
- formatting
- clang-tidy fixes
- tweak to queue_controller_init interface
* Update lib/rocprofiler
- include fixes
- namespace fixes
- clang-tidy fixes
- formatting
* Update scripts/run-ci.py
- exclude counters/parser from code coverage (generated files)
* Update include/rocprofiler/counters.h
- fix doxygen comment
* Update lib/rocprofiler/aql/packet_construct.cpp
- guard against HSA_AMD_MEMORY_POOL_ACCESS_DISALLOWED_BY_DEFAULT and HSA_AMD_MEMORY_POOL_ACCESS_NEVER_ALLOWED
* Update lib/rocprofiler/counters/parser/raw_ast.hpp
- clang-tidy fixes
* Update lib/rocprofiler/counters/evaluate_ast.hpp
- clang-tidy fixes
* Update lib/rocprofiler/aql/tests
- disable packet_generation_single and packet_generation_multi tests
- the entire implementation rocprofiler::get_ext_table() is incorrect
* Minor fixes before cleanup
* More changes
* More fixes
* More fixes
* source formatting (clang-format v11) (#99)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Revert PTL submodule
* Update scripts/run-ci.py
- exclude counters/parser from code coverage (generated files)
* Migrating counters state to context
* Linting
* source formatting (clang-format v11) (#101)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* revert run-ci
* Testing fixes
* More test changes
* Fix minor typo
* Small queue change
* Small queue change
* source formatting (clang-format v11) (#102)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* source formatting (clang-format v11) (#105)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Documentation Change
* More documentation fixes
* source formatting (clang-format v11) (#106)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Threading fixes
* Threading fixes
* source formatting (clang-format v11) (#107)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Threading fixes
* More test fixes
* More agent fixes
* More build fixes
* source formatting (clang-format v11) (#109)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* changed test timeouts
* Build fix
* Build fix
* Updates to agent
* source formatting (clang-format v11) (#114)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* cmake formatting (cmake-format) (#113)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* remove git worktree folder
* Doc update
* testing fix
* Another test fix
* More test changes
* Rebase
* source formatting (clang-format v11) (#116)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Documentation
* source formatting (clang-format v11) (#119)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* PTL Changes
* Minor agent fix for empty labels
* source formatting (clang-format v11) (#120)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Minor agent fix for empty labels
* Refactor read_map
* source formatting (clang-format v11) (#121)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
* Refactor read_map
* Cache fixes
* source formatting (clang-format v11) (#122)
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
|
||
|
|
2d533ad91e |
Fix set_tests_properties on some unittests (#90)
* Fix set_tests_properties on some unittests
- misspelled variable in two places
* Update samples/api_buffered_tracing/client.cpp
- output to file by default
* Update samples/api_callback_tracing/client.cpp
- output to file by default
* Update lib/rocprofiler/registration.cpp
- improve guards around initialize and finalize
* Update lib/rocprofiler/tests/registration.cpp
- test rocprofiler_iterate_callback_tracing_kind_names
- validate number of kind names and number of HSA operation names
* Update CI workflow and run-ci.py
- change --coverage flag to support all/unittests/samples
- samples mode excludes lib/common
- samples mode appends -L samples
- unittests mode appends -L unittests
* Update samples/api_buffered_tracing/client.cpp
- header include location fix
|
||
|
|
a646c1546c |
rocprofiler library unit tests (#81)
* Update CI and linting workflows
- delete linting workflow
- compile default CI job with clang-tidy
- split out code coverage matrix entry to separate job
- code coverage job runs code coverage 3x
- once for total code coverage
- once for unittests code coverage
- once for samples code coverage
* Update PTL submodule
- improves handling of when thread pool is destroyed in atexit handler
* Update lib/rocprofiler/buffer
- buffer::instance::get_internal_buffer()
- allocate_buffer invokes internal_threading::initialize() on first entry
- update flush routine
- if wait is false, does not wait for task group to finish syncing
- checks for callback pointer
* Update lib/rocprofiler/internal_threading
- modifications to handle destruction of statics before atexit handler is invoked
* Update lib/rocprofiler/registration.cpp
- reorder atexit call in initialize()
- protect finalize from executing more than once
* Add unittests for rocprofiler buffer
* Update CI workflow
- disable fail-fast for sanitizers
- move AddressSanitizer job to top of the list
* Update lib/rocprofiler/tests/buffer/CMakeLists.txt
- do not set memcheck LD_PRELOAD for rocprofiler-lib-buffer-tests
* Update lib/rocprofiler/registration.{hpp,cpp}
- only invoke client finalizers if initialized
- remove invoke_client_initializer
- move invoke_client functions to anonymous namespace (no declaration in header)
- set fini status in finalize
* Update scripts/thread-sanitizer-suppr.txt
- suppress false positive for double mutex lock in external/ptl/source/PTL/TaskGroup.hh
* Restructure lib/rocprofiler/tests
* Update lib/common
- add utility.cpp
- move read_command_line to utility.{hpp,cpp}
- was formerly in config.cpp
* Update lib/rocprofiler
- checks for init status return configuration locked if status is not greater than -1
- in other words, this prevents calling these functions directly (which was possible when check was for greater than 0)
* Update lib/rocprofiler/context/context.{hpp,cpp}
- provide deactivate_client_contexts and deregister_client_contexts
- these functions are used when the tool fails to configure
* Update lib/rocprofiler/registration.{hpp,cpp}
- internal "public" get_client_offset()
- client ids are offset by a random value to avoid default values behaving correctly
* Update lib/rocprofiler/tests
- fix rocprofiler_lib.registration_lambda_no_result
* Update lib/rocprofiler/tests
- fix rocprofiler_lib.registration_lambda_with_result
* Update lib/rocprofiler/tests
- remove deep bind from rocprofiler_lib.registration_lambda_with_result
* Update lib/rocprofiler/tests
- use RTLD_NOW when dlopen'ing in rocprofiler_lib.registration_lambda_with_result
* Update rocprofiler registration tests
- split registration tests into separate exe that links to shared library
* Formatting
* Update CI workflow
- always checkout submodules via actions/checkout
* Update lib/rocprofiler/buffer.{hpp,cpp}
- fix issue with buffer flushing not working when only called once
* Update rocprofiler lib registration test
- test for buffered callback
* Update include/rocprofiler/rocprofiler.h
- include internal_threading.h header
* Update rocprofiler lib registration test
- add in internal threading for buffered test
|
||
|
|
78425069e8 |
Bump actions/configure-pages from 2 to 3 (#68)
* Bump actions/configure-pages from 2 to 3
Bumps [actions/configure-pages](https://github.com/actions/configure-pages) from 2 to 3.
- [Release notes](https://github.com/actions/configure-pages/releases)
- [Commits](https://github.com/actions/configure-pages/compare/v2...v3)
---
updated-dependencies:
- dependency-name: actions/configure-pages
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
* Update scripts/thread-sanitizer-suppr.txt
- replace race_top with race since it appears that race_top isn't suppressing the thread sanitizer error from libamdhip64.so
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
|
||
|
|
5e4e7b41f1 |
Documentation, sanitizers, and PTL submodule (#71)
* Update scripts/thread-sanitizer-suppr.txt
- ignore data race occasionally triggered by libamdhip64.so
* Update external/CMakeLists.txt
- configure PTL to use locks in task queues
* Update PTL submodule
- tweak to task queues to prevent data race from std::list next pointer
* Add scripts/setup-sanitizer-env.sh
- bash script that exports the {ASAN,LSAN,TSAN}_OPTIONS used by run-ci.py
* Update include/rocprofiler (doxygen)
- fix doxygen grouping
* Update docs workflow
- change concurrency group to be specific to workflow + ref
- this prevents separate PRs triggering this workflow from cancelling each other
|
||
|
|
d3eaacd610 |
Contexts, tracing, include reorg, registration, thread-pool (#65)
* Update scripts/update-doxygen.sh
- ensure build-docs folder exists
* Update scripts/run-ci.py
- exclude files in details subdirectory from code coverage
* Update scripts/thread-sanitizer-suppr.txt
- exclude races in glog
* Update docs/rocprofiler.dox.in
- exclude defines in include/rocprofiler/defines.h from doxygen
- Tweak EXCLUDE_PATTERNS and EXAMPLE_PATTERNS
* Update docs workflow
- trigger workflow whenever there is a change to the public headers (which may be doxygen comments)
* Update include/rocprofiler (reorg and overhaul)
- rocprofiler_status_t additions
- CONTEXT_NOT_FOUND
- CONTEXT_ERROR
- INVALID_CONTEXT_ID
- INVALID_CONTEXT
- BUFFER_BUSY
- rocprofiler_context_is_active func
- rocprofiler_context_is_valid func
- rocprofiler_service_callback_tracing_kind_t update
- remove ROCPROFILER_SERVICE_CALLBACK_TRACING_HELPER_THREAD
- Remove rocprofiler_tracing_helper_thread_operation_t
- Remove rocprofiler_helper_thread_callback_tracer_data_t
- Added rocprofiler_internal_thread_library_t
- Added rocprofiler_at_internal_thread_create
- split rocprofiler.h into several smaller headers
- reworked rocprofiler_status_t values
- added doxygen comments for enums
- replaced rocprofiler_trace_record_operation_kind_t with rocprofiler_trace_operation_t
- use @ instead of / in doxygen comment in rocprofiler_plugin.h
- fix ref to ROCPROFILER_SERVICE_CALLBACK_TRACING_MARKER_API
- end group in fwd.h
- remove PROFILE_COUNTING group in dispatch_profile.h
- remove premature group close in callback_tracing.h
- hsa.h: remove rocprofiler_hsa_trace_data_t
- fwd.h: remove rocprofiler_tracer_callback_data_t
- rename rocprofiler_correlation_id_t.handle to rocprofiler_correlation_id_t.id (consistency)
- fwd.h: add rocprofiler_callback_tracing_record_t
- callback_tracing.h: update rocprofiler_hsa_api_callback_tracer_data_t
- callback_tracing.h: add size fields
- simplify rocprofiler_tracer_callback_t
- removed ROCPROFILER_NONNULL from rocprofiler_get_version
- added rocprofiler_get_timestamp
- ROCPROFILER_STATUS_ERROR_CONFIGURATION_LOCKED in rocprofiler_status_t
- add ROCPROFILER_STATUS_ERROR_THREAD_NOT_FOUND to rocprofiler_status_t
- add rocprofiler_buffer_category_t
- rocprofiler_trace_operation_t -> rocprofiler_tracing_operation_t
- rocprofiler_user_data_t union
- tweak rocprofiler_callback_tracing_record_t
- make external_correlation_id non-pointer
- add rocprofiler_user_data_t data field
- tweak rocprofiler_record_header_t
- instead of single uint64_t kind field, have union for category + kind (two u32) with u64 hash
- API extensions for kind id <-> kind string
- API extensions for operation id <-> operation string
- rocprofiler_callback_trace_kind_name_cb_t
- rocprofiler_callback_trace_operation_name_cb_t
- rocprofiler_iterate_callback_trace_kind_names
- rocprofiler_iterate_callback_trace_kind_operation_names
- modify rocprofiler_hsa_api_callback_tracer_data_t data members (remove pointers)
- add rocprofiler_callback_trace_operation_args_cb_t function pointer typedef
- add rocprofiler_iterate_callback_trace_operation_args function
- fixed inconsistent use of *_trace_* vs. *_tracing_* (opting for tracing)
- removed rocprofiler_query_callback_trace_kind_name
- removed rocprofiler_query_callback_kind_operation_name
- Add include/rocprofiler/registration.h
- header dedicated to registering a tool/client with rocprofiler
- this header is not intended to be included by rocprofiler.h
- rocprofiler_client_id_t
- identifier for client tool
- rocprofiler_client_finalize_t
- function pointer prototype for tool-initiated finalization
- rocprofiler_tool_initialize_t
- function pointer prototype for tool initialization (i.e. configuration)
- rocprofiler_tool_finalize_t
- function pointer prototype for tool finalization
- rocprofiler_tool_configure_result_t
- struct returned by tool/client to rocprofiler
- rocprofiler_is_initialized
- function for querying whether tool-induced initialization is possible
- rocprofiler_is_finalized
- function for querying whether rocprofiler has been finalized
- rocprofiler_configure prototype
- this is the function tools implement
- prototype is always marked as having default visibility
- no implementation in rocprofiler
- added typedef for rocprofiler_configure function pointer
- added rocprofiler_force_configure to explicitly invoke rocprofiler_configure instead of relying on lazy init
- made callback typedef names more consistent (_cb_t suffix)
- typedef for rocprofiler_internal_thread_library_cb_t function pointer
- added rocprofiler_at_internal_thread_create function
- added rocprofiler_callback_thread_t struct
- added rocprofiler_create_callback_thread function
- added rocprofiler_assign_callback_thread function
- removed rocprofiler_buffer_tracing_record_header_t in favor of kind and correlation id in each record type
- added rocprofiler_buffer_tracing_kind_name_cb_t typedef
- added rocprofiler_buffer_tracing_operation_name_cb_t typedef
- added rocprofiler_iterate_buffer_tracing_kind_names function
- added rocprofiler_iterate_buffer_tracing_kind_operation_names function
- removed rocprofiler_query_buffer_trace_kind_name function
- removed rocprofiler_query_buffer_kind_operation_name function
* Update lib/common/container/stable_vector.hpp
- include limits header
- reserve_size struct
- overload stable_vector constructor to support reserving as part of construction
* Update lib/common/container/record_header_buffer.{hpp,cpp}
- add emplace member function accepting category and kind (two u32 variables) instead of one u64 kind
- use std::shared_mutex to prevent data-race when reading m_headers
- record_header_buffer is now multiple writer, single reader
- add read_lock member function (shared)
- add read_unlock member function (shared)
- lock member function gets exclusive lock
- unlock member function releases exclusive lock
* Rename "config" to "context" + restructure + implement
- Restructure config files + license
- move config files into lib/rocprofiler/config subfolder
- rename some files
- add license to some files which were missing it
- Rename config/helpers.hpp
- rename to allocator.hpp
- remove get_domain_max_ops
- Create config/domain.{hpp,cpp}
- structures for handling tracing domains and ops
- Update config/config.{hpp,cpp}
- buffer_instance struct
- callback_tracing_service struct
- buffer_tracing_service struct
- config struct
- allocate_{config,buffer} func
- {validate,start,stop}_config funcs
- get_registered_configs func
- get_active_configs func
- get_buffers func
- Update rocprofiler.cpp
- Implement rocprofiler_create_context
- Implement rocprofiler_start_context
- Implement rocprofiler_stop_context
- Implement rocprofiler_context_is_active
- Implement rocprofiler_context_is_valid
- Implement rocprofiler_flush_buffer
- Implement rocprofiler_destroy_buffer
- Implement rocprofiler_create_buffer
- Update lib/rocprofiler/hsa
- use rocprofiler_tracer_activity_domain_t instead of rocprofiler_tracer_activity_domain_t
- remove ROCPROFILER_TRACER_ACTIVITY_DOMAIN_HSA_API from HSA_API_INFO_DEFINITION_* macros
- Update lib/rocprofiler/context/domain.*
- fixes for domain_info (i.e. use correct enums)
- update rocprofiler_status_t codes
- fix template instantiations
- Update lib/rocprofiler/context/context.*
- use rocprofiler_service_callback_tracing_kind_t instead of rocprofiler_tracer_activity_domain_t
- rename correlation_context to correlation_tracing_service
- fix domains in callback_tracing_service and buffer_tracing_service
- unique_ptr for callback_tracer and buffered_tracer in context
- Update lib/rocprofiler/rocprofiler.cpp
- implement rocprofiler_configure_callback_tracing_service
- Update lib/rocprofiler/hsa/ostream.hpp
- include rocprofiler.h instead of tracer.hpp
- Update lib/rocprofiler/hsa
- migration to use rocprofiler_hsa_api_callback_tracer_data_t instead of rocprofiler_hsa_trace_data_t
- restructure hsa_api_impl<Idx>
- remove phase_enter and phase_exit
- add set_data_args (partial replacement for phase_enter)
- functor handles the contexts
- Update lib/rocprofiler/rocprofiler.cpp
- implement rocprofiler_get_version
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
- remove hsa_api_ prefix for functions already in hsa namespace
- Update lib/rocprofiler/context/context.{hpp,cpp}
- add client_idx to context struct (tool identifier)
- add push_client function to set client_idx before context is allocated
- add pop_client function to remove client identifier from future context creations
- implemented {registered,active}_contexts and buffers to use new container::reserve_size overload to stable_vector
- fix implementation of start_context
- fix implementation of stop_context
- Update lib/rocprofiler/rocprofiler.cpp
- prevent context creation, buffer creation, pc sampling config, etc. after initialization
- add nullptr checks to rocprofiler_context_is_valid
- fix rocprofiler_configure_callback_tracing_service
- was checking size of buffers, not registered context
- implement rocprofiler_iterate_callback_trace_kind_names
- implement rocprofiler_iterate_callback_trace_kind_operation_names
- Update lib/rocprofiler/CMakeLists.txt
- add registration.{hpp,cpp} to rocprofiler-library target sources
- Update lib/rocprofiler/hsa/utils.hpp
- fix using fmt::format with const char* strings
- remove join functions (no longer used)
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
- remove args_string function
- remove named_args_string function
- update iterate_args function
- change callback type
- accept user data
- rework the hsa_api_impl<Idx>::functor function
- save the rocprofiler_callback_tracing_record_t between callbacks
- update update_table function
- check buffered_tracer domains
- remove comments
- Update lib/rocprofiler/hsa/defines.hpp
- remove MEMBER_<N> macros
- add ADDR_MEMBER_<N> macros
- remove doxygen comments for GET_MEMBER_FIELDS
- add GET_ADDR_MEMBER_FIELDS
- update HSA_API_INFO_DEFINITION_{0,V}
- rename domain_idx to callback_domain_idx
- add buffered_domain_idx
- add as_arg_addr function
- Update lib/rocprofiler/rocprofiler.cpp
- implement rocprofiler_iterate_callback_trace_operation_args
- Remove lib/rocprofiler/tracing.{hpp,cpp} and lib/rocprofiler/CMakeLists.txt
- unused
- Update lib/rocprofiler/hsa/hsa.{hpp,cpp}
- support buffered tracing in hsa_api_impl<Idx>::functor
- rocprofiler_callback_trace_operation_args_cb_t -> rocprofiler_callback_tracing_operation_args_cb_t
- i.e. trace -> tracing
- Update lib/rocprofiler/context/context.{hpp,cpp}
- removed buffer_instance struct
- removed allocate_buffer function
- removed get_buffers function
- changed buffer_tracing_service::buffer_array_t
- Update lib/rocprofiler/hsa: hsa.cpp, ostream.hpp, details folder
- move ostream.hpp into details folder to prevent it from contributing to code coverage
- update cmake build system for new directory
* Add lib/rocprofiler/registration.{hpp,cpp}
- implements rocprofiler_set_api_table (called by rocprofiler-register)
- miscellaneous functions for client configure/initialize/finalize
- functions for querying the init/fini status
- relocated OnLoad HSA workaround to this file
- at present, this is used to work around ROCr not having rocprofiler-register integration yet
- implement rocprofiler_force_configure function
- implement rocprofiler_is_initialized function
- implement rocprofiler_is_finalized function
- ensure configure functions only invoked once
- ensure internal thread creation notification functions are invoked
- get_status is a pair of atomics
- fix heap-use-after-free in init_logging
- update finalize
- invoke hsa_shut_down
- set all active contexts to null pointers
* Add lib/rocprofiler/buffer_tracing.cpp
- contains implementations of buffer_tracing (i.e. rocprofiler/buffer_tracing.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp
* Add lib/rocprofiler/buffer.{hpp,cpp}
- contains implementations of buffer (i.e. rocprofiler/buffer.h) and misc internal access functions
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp and lib/rocprofiler/context/context.{hpp,cpp}
* Add lib/rocprofiler/callback_tracing.cpp
- contains implementations of callback_tracing (i.e. rocprofiler/callback_tracing.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp
* Add lib/rocprofiler/context.cpp
- contains implementations of context public API functions (i.e. rocprofiler/context.h)
- previous implementation may have been moved out of lib/rocprofiler/rocprofiler.cpp
* Add lib/rocprofiler/internal_threading.{hpp,cpp}
- contains implementations of internal_threading (i.e. rocprofiler/internal_threading.h)
- also contains implementations of internal access functions
- update finalize function
- join all task groups and destroy all thread pools first, then reset unique_ptr
* Update lib/rocprofiler/rocprofiler.cpp
- rocprofiler_get_version returns status
- implement rocprofiler_get_timestamp
- remove misc implementations that were split into other files
* Update lib/rocprofiler/CMakeLists.txt
- compile new implementation files
- buffer.cpp
- buffer_tracing.cpp
- callback_tracing.cpp
- context.cpp
- internal_threading.cpp
* Update lib/tests/buffering/buffering-*.cpp
- update to reflect changes to rocprofiler_record_header_t
* Update CMakeLists.txt
- increase minimum cmake version to 3.21 which added HIP support as a language
* Add samples/apps/transpose
- simple HIP application for testing
* Add samples/api_callback_tracing
- HIP application and tool library
- This effectively demos how to set up HSA API tracing
- For each function called in the tool, it stores the func/file/line and prints them during finalization
- client.hpp and client.cpp are the tool library
- Implement use of rocprofiler_iterate_callback_trace_operation_args
- add demo of using rocprofiler_get_version
- add_test
- remove PASS_REGULAR_EXPRESSION
- causing false passes during memcheck
- add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment
- check if rocprofiler is initialized before stopping context
* Add samples/api_buffered_tracing
- Sample demonstrating tracing the HSA API via buffering
- demo rocprofiler_record_header_compute_hash
- throw exceptions for unexpected buffer data
- add_test
- remove PASS_REGULAR_EXPRESSION
- causing false passes during memcheck
- add ROCPROFILER_MEMCHECK_PRELOAD_ENV to environment
* Update samples/CMakeLists.txt
- add subdirectory for api_callback_tracing
- add subdirectory api_buffered_tracing
* Update samples/pc_sampling/common.h
- fix processing of headers
* Update lib/rocprofiler/hsa/details/ostream.hpp
- fix data race on HSA_depth_max_cnt and recursion
- HSA_depth_max_cnt and recursion are now thread-local static instead of global static
- replace std::string usage with std::string_view
* Actions update
- add dependabot.yml
- use actions/checkout@v4
- install latest libasan and libtsan in sanitizer containers
* Add PTL (Parallel Tasking Library) submodule
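The rocprofiler_record_header_t change above (a category + kind pair of u32 values that can also be viewed as one u64 hash) can be sketched as follows. This is a minimal illustration, not the rocprofiler definition: `record_id`, `compute_hash`, and `split_hash` are hypothetical names, the bit layout (category in the upper 32 bits) is an assumption, and explicit shifts are used in place of the union so the sketch does not depend on union type punning.

```cpp
#include <cstdint>

// Hypothetical stand-in for the category + kind pair; not the rocprofiler struct.
struct record_id
{
    std::uint32_t category;
    std::uint32_t kind;
};

// Pack the two 32-bit fields into one 64-bit value.
// Assumed layout: category in the upper 32 bits, kind in the lower 32 bits.
constexpr std::uint64_t
compute_hash(std::uint32_t category, std::uint32_t kind)
{
    return (static_cast<std::uint64_t>(category) << 32) | kind;
}

// Recover the two 32-bit fields from the packed 64-bit value.
constexpr record_id
split_hash(std::uint64_t hash)
{
    return record_id{static_cast<std::uint32_t>(hash >> 32),
                     static_cast<std::uint32_t>(hash & 0xffffffffULL)};
}

// Compile-time checks of the round trip under the assumed layout.
static_assert(compute_hash(1, 2) == 0x0000000100000002ULL, "category is the high word");
static_assert(split_hash(compute_hash(7, 9)).category == 7, "category round-trips");
static_assert(split_hash(compute_hash(7, 9)).kind == 9, "kind round-trips");
```

A consumer can then compare a single 64-bit value when dispatching on record type, while still reading category and kind separately when needed.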
|
||
|
|
3769bb7dbf |
Minor documentation workflow updates (#53)
* Document rocprofiler version defines
- write doxygen for preprocessor defines
- make ROCPROFILER_SOVERSION number similar to ROCPROFILER_VERSION
- remove ROCPROFILER_COMPILER_STRING
* Update rocprofiler.dox.in
- reformatted
- include rocprofiler/version.h in doxygen
- tweaked dot settings, e.g. made dot SVGs non-interactive
* Update scripts/update-docs.sh
- configure with cmake ROCPROFILER_INTERNAL_BUILD_DOCS=ON which just generates version.h and exits
* Update CMakeLists.txt
- support ROCPROFILER_INTERNAL_BUILD_DOCS=ON option for generating version.h and exiting
|
||
|
|
729c34fb60 |
Docs skeleton (#51)
* Add doxygen-awesome-css submodule
* Basic documentation files
- conf.py: run by sphinx
- about.md: info about rocprofiler
- features.md: overview of features
- installation.md: build/test/install instructions
- index.md: sets up main page
- generate-doxyfile.cmake: generates rocprofiler.dox with rocprofiler-specific info
- environment.yml: conda environment
- Makefile: sphinx makefile
- README.md: build instructions
- rocprofiler.dox.in: doxygen template
- .gitignore: ignores generated files
- .nojekyll: prevents GitHub Pages from using Jekyll for deployment of pages
* Documentation scripts
- scripts for doing common sequences of commands for building docs
- update-docs.sh: builds the docs and installs the docs if /docs directory is present
- update-doxygen.sh: quick script for generating doxygen
* Workflow for docs
- step for building docs
- step for deploying docs
* Update doxygen comments in include/rocprofiler
- rocprofiler.h / rocprofiler_plugins.h
- fixed non-existent global references in doxygen comments
- fixed parameter names that were incorrect or not updated
* Update docs workflow
- only deploy docs when on main branch
|
||
|
|
b12ef4a75e |
Buffering: initial implementation and tests (#20)
* Update source/lib/common
- CMakeLists.txt
- less verbose
- rocprofiler-common-library uses rocprofiler-headers target
- mpl.hpp
- metaprogramming header with type_list, size_of, index_of, and is_one_of
- record_header_buffer.{hpp,cpp}
- wrapper class around atomic_ring_buffer and vector of rocprofiler_record_header_t
- atomic_ring_buffer.{hpp,cpp}
- request function accepts wrap param when overwriting is not desirable
- can_clear member function
- clear member function for rewinding write pointer to start of buffer
- containers/CMakeLists.txt
- include record_header_buffer.{hpp,cpp} in build target
* Update source/lib/tests: Buffering tests
- Added buffering tests. See comments in code for description
* atomic_ring_buffer -> ring_buffer
- remove ring_buffer implementation
- rename atomic_ring_buffer to ring_buffer
* Update record_header_buffer
- lock, unlock, is_locked, clear, save, and load member functions
* Buffering tests
- add buffer test for save/load capability
* Update rocprofiler_memcheck.cmake
- fix erroneous spaces causing incorrect string evaluation
* Update ring_buffer
- fix exception message
* undef HIP_PROF_API
- make sure HIP_PROF_API is undefined before including hip_runtime.h
- avoid directly including hip/hip_runtime.h
* Update rocprofiler_config_interfaces
- remove stale preprocessor defines that are from old rocprofiler/roctracer
- HIP_PROF_HIP_API_STRING=1
- PROF_API_IMPL=1
* Update run-ci.py
- fix paths to suppression files
- improve printing logs to console in github actions
* Update buffering implementation
- remove support for using malloc instead of mmap in ring_buffer
- provide some info functions in record_header_buffer
- improve the testing of the save-load buffer test
* Update run-ci.py
- fix CTEST_CUSTOM_COVERAGE_EXCLUDE
* Update hip/api_args.h
- remove undef HIP_PROF_API
* Update buffering-save-load.cpp
- updated comments
* Update record_header_buffer
- default ctor
- allocate member function
- is_allocated member function
* Update buffering-save-load.cpp
- tweaked usage of record_header_buffer to delay allocation
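The mpl.hpp entry above names four metaprogramming utilities: type_list, size_of, index_of, and is_one_of. The sketch below shows one common way such helpers are written. Only the names come from the commit message; the template signatures and implementations here are assumptions and may differ from the actual header.

```cpp
#include <cstddef>
#include <type_traits>

// Assumed shape: an empty tag type carrying a pack of types.
template <typename... Tp>
struct type_list
{};

// size_of: number of types in a type_list.
template <typename List>
struct size_of;

template <typename... Tp>
struct size_of<type_list<Tp...>> : std::integral_constant<std::size_t, sizeof...(Tp)>
{};

// index_of: zero-based position of the first occurrence of Tp in the list.
// A type that is not in the list is a compile error (no matching specialization).
template <typename Tp, typename List>
struct index_of;

template <typename Tp, typename... Tail>
struct index_of<Tp, type_list<Tp, Tail...>> : std::integral_constant<std::size_t, 0>
{};

template <typename Tp, typename Head, typename... Tail>
struct index_of<Tp, type_list<Head, Tail...>>
: std::integral_constant<std::size_t, 1 + index_of<Tp, type_list<Tail...>>::value>
{};

// is_one_of: true if Tp appears anywhere in the list.
template <typename Tp, typename List>
struct is_one_of;

template <typename Tp>
struct is_one_of<Tp, type_list<>> : std::false_type
{};

template <typename Tp, typename... Tail>
struct is_one_of<Tp, type_list<Tp, Tail...>> : std::true_type
{};

template <typename Tp, typename Head, typename... Tail>
struct is_one_of<Tp, type_list<Head, Tail...>> : is_one_of<Tp, type_list<Tail...>>
{};

// Compile-time usage with an illustrative list of types.
using sample_types = type_list<int, float, char>;
static_assert(size_of<sample_types>::value == 3, "three types in the list");
static_assert(index_of<float, sample_types>::value == 1, "float is second");
static_assert(is_one_of<char, sample_types>::value, "char is in the list");
static_assert(!is_one_of<double, sample_types>::value, "double is not in the list");
```

Helpers like these are typically used to map a record type to a stable integer index at compile time, which fits a buffer that stores headers alongside typed payloads.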
|
||
|
|
d4df53cdc9 |
Adding Workflow for building and testing (#21)
* Adding Workflow for building and testing
* Adding run-ci script
* Fixing Project name
* Fixing Github Action
* Fixing Git Version
* Adding CMake installation
* Adding Gtest installation
* Fixing CDash Project name
* Correcting the AmdExtTable
* Fixing issues caused by submodules
* Enable Coverage
* Update tests/CMakeLists.txt
- add placeholder test printing cmake version
* Update CI workflow
- remove CMAKE_PREFIX_PATH and LD_RUNPATH_FLAG env vars
- rename Mi200-Ubuntu22-Doc-Packages job to mi200-ubuntu
- reorder jobs
- remove CMAKE_MODULE_PATH, CMAKE_SHARED_LINKER_FLAGS, CMAKE_INSTALL_RPATH, CMAKE_INSTALL_RPATH_USE_LINK_PATH, CPACK_PACKAGING_INSTALL_PREFIX, CPACK_{OBJCOPY,READELF,STRIP,OBJDUMP}_EXECUTABLE
- Remove build docs step
* Update cmake
- fix code coverage build
* Update submodules
- use rocprofiler_checkout_git_submodule for googletest
---------
Co-authored-by: Jonathan Madsen <jrmadsen@users.noreply.github.com>
|