* rocDecode API Tracing support
* Test bin file added to rocdecode. Need to add validate python methods
* Added option to not make rocDecode tests
* Added rocdecode and rocprofv3 tests
* Added csv test
* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependency. Changed cmake option to disable tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI
* Add option to avoid building rocdecode tests
* Added option to avoid building rocdecode bin file
* Support for rocJPEG API Trace
* Added newline to rocjpeg_version.h
* json-tool code added, initial test/bin commit
* Formatting
* Resolved rocjpeg bin test compilation errors
* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed
* Formatting and compilation errors
* Minor fixes
* Copyright year update and minor fixes
* Doc update fix
* Added rocjpeg csv file in data
* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes
* Added rocdecode and rocjpeg to CI
* Removed rocdecode and rocjpeg from CI and added back build tests option
* Updated Cmake Files
* Added rocDecode and rocJPEG to CI
* Remove cmake line added in error
* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes
* Added find_package for test
* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path
* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests
* Resolve merge conflicts and formatting
* Added regex find and replace instead of include for CI
* VAAPI package causing errors on Vega20
* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved
* Removed workflows regex
* Formatting and minor test modification
* Modified test for vega20
* Update rocDecode and rocJPEG cmake and tests
* Changelog
* Fix merge conflict
* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing
* Removed if found statements, replaced with TARGET:EXISTS
* Skip json file for rocjpeg and rocdecode tests if not supported
* Add os import
---------
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* rocprofv3: suppress agent info when no data collected
* Update output config serialization
- full serialization of output configuration
* Update rocprofiler-sdk-att/tests
- add version and soversion
- change output directory
- generate libatt_decoder_summary
- disable tests instead of removing them
* Update rocprofv3 command-line
- make --att-library-path hidden by default
- simplify check_att_capability
- reorder pc sampling options
- add hidden --echo option
- remove ROCPROF_LIST_AVAIL_TOOL_LIBRARY from preload
* Add new rocprofv3 tests for specifying the ATT library path
* Tweak to rocprofv3-test-hsa-multiqueue-att tests
* Update rocprofv3 tool to enable output with att
* Fix standalone test installation
* Revert to fetchcontent_makeavailable to fetchcontent_populate
* Revert tests/common/CMakeLists.txt
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
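The command-line changes above hide `--att-library-path` by default and add a hidden `--echo` option. The log does not show how rocprofv3 does this; one common way in Python's `argparse` is `help=argparse.SUPPRESS`, which keeps an option parseable while leaving it out of the help output. A minimal sketch under that assumption (the `build_parser` helper, the `show_hidden` switch, and the help strings are hypothetical):

```python
import argparse


def build_parser(show_hidden=False):
    """Build a toy parser; hidden options still parse but are left out of --help."""

    def hidden(text):
        # argparse.SUPPRESS removes the option from the usage and help text only
        return text if show_hidden else argparse.SUPPRESS

    parser = argparse.ArgumentParser(prog="rocprofv3-sketch")
    # Help strings below are placeholders, not the real rocprofv3 descriptions
    parser.add_argument("--att-library-path", help=hidden("placeholder help text"))
    parser.add_argument("--echo", action="store_true", help=hidden("placeholder help text"))
    return parser


parser = build_parser()
args = parser.parse_args(["--att-library-path", "/opt/att", "--echo"])
help_text = parser.format_help()
```

Both options are accepted on the command line, but neither appears in `help_text` unless `show_hidden=True` is passed.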
* rocDecode API Tracing support
* Test bin file added to rocdecode. Need to add validate python methods
* Added option to not make rocDecode tests
* Added rocdecode and rocprofv3 tests
* Added csv test
* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependency. Changed cmake option to disable tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI
* Add option to avoid building rocdecode tests
* Added option to avoid building rocdecode bin file
* Merge conflict error
* CMake files changed in response to review comments. Attempting to implement callbacks.
* Turned off test building for rocdecode
* Minor fixes for review comments
* Review comments
* Updated formatting
* Document changes and format.hpp reversion. Need to remove iterate args support for now for later update.
* Remove iterate args support
* Remove iterate-args
* enforce abi versioning in macro if
* Fix doc error
* removed spaces to fix indentation error
---------
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* Adding Trace Period feature to rocprofv3
* Adding feature documentation
* Update source/bin/rocprofv3.py
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Fixing format
* Moving to Collection Period and changing the input params
* Format Fixes
* Fixing rebasing issues
* Removing atomic include from the tool
* Adding more options for units, optimizing the code
* Fixing rocprofv3.py
* Fixing time conv & adding time controlled app
* Fixing format
* Changing to shared memory testing methodology
* Use of shmem
* Fix include headers for transpose-time-controlled.cpp
* Format upload-image-to-github.py
* Removing shmem and using only env var to dump timestamps from the tool
* Tool Fixes + Test Config
* Adding Tests
* Fixing Review comments
* Update trace period implementation
* Update trace period tests
* check between start and stop timestamps
* Merge Fix
* Update validate.py
* Improve safety of rocprofiler_stop_context after finalization
* Pass context id to collection_period_cntrl by value
* Adding 20 us error margin
* Ensure log level for collection-period test is not more than warning
---------
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
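The collection-period commits above mention time unit options, a check that timestamps fall between the start and stop timestamps, and a 20 us error margin. The check could look roughly like the sketch below. It is not the code in validate.py: the unit names, the function names, and the choice of nanosecond timestamps are assumptions made for illustration.

```python
# Conversion factors to nanoseconds; the set of unit names is an assumption.
NS_PER_UNIT = {"nsec": 1, "usec": 1_000, "msec": 1_000_000, "sec": 1_000_000_000}

MARGIN_NS = 20 * NS_PER_UNIT["usec"]  # the 20 us error margin from the log


def to_ns(value, unit):
    """Convert a duration in the given unit to integer nanoseconds."""
    return int(value * NS_PER_UNIT[unit])


def outside_period(timestamps_ns, start_ns, stop_ns, margin_ns=MARGIN_NS):
    """Return the timestamps that fall outside [start - margin, stop + margin]."""
    lo, hi = start_ns - margin_ns, stop_ns + margin_ns
    return [t for t in timestamps_ns if not lo <= t <= hi]


start = to_ns(1, "msec")
stop = start + to_ns(500, "usec")
# Two samples sit just inside the margin, the last one is 25 us past the stop
samples = [start - 10_000, start + 5, stop + 19_999, stop + 25_000]
violations = outside_period(samples, start, stop)
```

A test built this way would fail only when `violations` is non-empty, so small scheduling jitter around the period boundaries is tolerated.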
* Adding tool pc sampling support
Fixing merge issue
tool support on SDK updates
link amd-comgr
Sanitizer failure fix
fix format
Addressing review comments
misc fix
Adding dispatch id to the CSV output
Adding CHANGELOG
[ROCProfV3][PC Sampling] Initial ROCProfV3 PC sampling tests for JSON and CSV formats (#17)
ROCProfV3 initial tests for JSON and CSV output.
Simple kernels that simplify the verification of samples to instruction decoding
have been introduced.
removing option to enable pc sampling explicitly
Adding documentation
no pc-sampling option in tests anymore
Addressing review comments
Updating docs
an option for choosing whether all units must be sampled
try ignoring PC sampling tests (#36)
* run pc-sampling tests on MI2xx runners
* use v_fmac_f32 instead of s_nop 0 in tests
* fixing docs
Adds rocprofiler_load_counter_definition. This function allows a counter definition file to be supplied to rocprofiler-sdk directly. Takes in a string containing the counter definition YAML, its size (in bytes), and a flag value to state whether this is an append operation or not.
---------
Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: usrihari123 <srihari.u@amd.com>
* Initial commit: Need to implement wrapper function to collect data and test that wrapper function is correctly replacing core HSA functions
* Attempted to implement wrapper implementation for hsa memory allocation functions. Need to modify generate record files and test if implementation is working as expected
* Debugging and implementing generateCSV function
* Memory allocation size and starting address outputted to csv and json file formats
* Formatting
* Initial setup for OTF2 and Perfetto generation
* Collecting agent id for memory_allocation and formatting
* Modified memory_allocation.cpp to set up code for AMD_EXT commands
* Support for memory_pool_allocate added
* Removed accidentally added file
* Made flag optional and added more OTF2 and Perfetto code. Needs testing to ensure perfetto and OTF2 works
* Formatting
* Fixed perfetto and otf2 output
* Fixed flag issue due to incorrect buffer use
* Updated documentation
* Small cleaning and comments
* Added test for HSA memory allocation tracing
* Fixed summary test validation errors due to allocation tracing. Added type to location_base to create unique event ids for allocation due to OTF2 trace error
* Decreased lower limit of hip calls for test
* Modified summary tests to vary number of allocate requests
* Minor fixes to address comments. Still need to address OTF2 comments
* Fix docs and changed OTF2 to use enum for type specified in location_base construction
* Fixed schema error
* Added vmem command tracking. Need to add test
* Updated test to work with vmem command and updated generateCSV to output int instead of hex string.
* OTF2 enum update and misspelling fix
* CI does not support Virtual Memory API. Removed vmem test. Will add back if CI is modified to support vmem API
* Update CMakeLists.txt for memory allocation test
* Updated summary test
* Minor fixes to address comments
* Moved domain_type.hpp enum to before LAST
* Fixed compile errors and formatting
* Fixed stats summary domain name error
* Added rocprofv3 test
* Page migration test fix
* Undo page migration test changes. Failures do not appear to have to do with memory allocation
* rocprofv3: support specifying PMC counters via command line
- E.g. `rocprofv3 --pmc SQ_WAVES -- <app>`
* Update CHANGELOG
* updated rocprofv3 help and documentation
---------
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
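The `rocprofv3 --pmc SQ_WAVES -- <app>` form above puts profiler options before `--` and the application command after it. A minimal sketch of parsing that shape with `argparse` follows; it is not taken from rocprofv3.py, and the `parse_cli` helper, the `nargs="+"` choice, and the example application arguments are assumptions.

```python
import argparse


def parse_cli(argv):
    """Split argv at the first "--" and parse the profiler options before it."""
    if "--" in argv:
        idx = argv.index("--")
        options, command = argv[:idx], argv[idx + 1:]
    else:
        options, command = argv, []

    parser = argparse.ArgumentParser(prog="rocprofv3-sketch")
    # Accept one or more counter names after --pmc
    parser.add_argument("--pmc", nargs="+", default=[], metavar="COUNTER")
    args = parser.parse_args(options)
    return args.pmc, command


counters, command = parse_cli(["--pmc", "SQ_WAVES", "--", "./app", "--size", "4"])
```

Splitting on `--` by hand keeps the application's own flags (here the made-up `--size 4`) away from the profiler's parser.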
* [Draft]: Add support for RCCL tracing
Address comments
* [Draft]: Add support for RCCL tracing
Address PR comments, changes from RCCL upstream
* Add RCCL library table registration
Working on adding support to rocprofiler-register
* Support compilation w/o <rccl/amd_detail/api_trace.h>
- dummy api_trace.h header
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED when RCCL does not have api_trace.h header
* RCCL API tracing tool support
- add to rocprofv3
- add to json-tool
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* LD_PRELOAD librocprofiler-sdk-roctx.so when marker-trace enabled
- this enables apps to link against old ROCTx (libroctx64.so) but get marker tracing in rocprofv3
* Update CHANGELOG
* Validation test for app linked to old (roctracer) ROCTx library
* Tweak scope of tool_counter_info
- causing "signal-unsafe call inside of a signal" error for ThreadSanitizer on mi200
* Fix handling of missing transpose-roctracer-roctx
* Disable rocprofv3 aborted-app test (ThreadSanitizer)
- ThreadSanitizer + mi200/mi300 + aborted-app results in a signal-unsafe call inside a signal that cannot be specifically suppressed as usual via rocprofv3_error_signal_handler for some unknown reason
* Add UndefinedBehaviorSanitizer job
* Move include/rocprofiler-sdk/cxx/details/delimit.hpp to tokenize.hpp
* Update docs/how-to/using-rocprofv3.rst
- fix code block indents
- reorder rocprofv3 options, limit them to important options
- add docs for `--runtime-trace`
* Update rocprofv3.py
- parser argument groups
- new `--runtime-trace` option
- new `--summary` option
- new `--summary-per-domain` option
- new `--summary-groups` option
- new `--summary-output-file` option
- new `--summary-units` option
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- fix async copy operation names: add "MEMORY_COPY_" prefix
* lib/rocprofiler-sdk-tool: update statistics.{hpp,cpp}
- statistics<>::get_percent function
- stats_entry_t struct
- stats_formatter struct
- percentage struct
- std::to_string(::rocprofiler::tool::percentage)
* lib/rocprofiler-sdk-tool: update domain_type.{hpp,cpp}
- reorder domain_type enum values
* lib/rocprofiler-sdk-tool: update generateCSV.{hpp,cpp}
- separate writing CSV from accumulating statistics
- a lot of functionality was moved to statistics.{hpp,cpp}
* lib/rocprofiler-sdk-tool: update output_file.{hpp,cpp}
- output_stream_t struct
- get_output_stream(...) returns output_stream_t instance
* lib/rocprofiler-sdk-tool: update generateJSON.cpp
- update get_output_stream usage to output_stream_t
* lib/rocprofiler-sdk-tool: update generateOTF2.cpp
- header include order tweak
* lib/rocprofiler-sdk-tool: update buffered_output.hpp
- stats_data_t was renamed to stats_entry_t
* lib/rocprofiler-sdk-tool: update generatePerfetto.cpp
- header include tweak
* lib/rocprofiler-sdk-tool: update tmp_file_buffer.hpp
- emit warning message if write_ring_buffer fails after offloading instead of aborting
- prefer placement new instead of assignment in write_ring_buffer
* lib/rocprofiler-sdk-tool: add generateStats.{hpp,cpp}
- functions for accumulating statistics
* Update tests/rocprofv3/tracing-hip-in-libraries/CMakeLists.txt
- accommodate tweak to CSV output file name for HIP and HSA traces
* lib/rocprofiler-sdk-tool: update config.{hpp,cpp}
- new config variables
- stats_summary
- stats_summary_per_domain
- summary_output
- stats_summary_unit_value
- stats_summary_unit
- stats_summary_file
- stats_summary_groups
- support output keys for hostname: %hostname% / %h
* lib/rocprofiler-sdk-tool: update tool.cpp
- support summary output
* Documentation fixes
* Test for summary output
* Update tests/bin/transpose to use more ROCTx
- also support building with the roctracer ROCTx
* Remove roctxMark from OTF2 + fix kernel-rename tests
- following more ROCTx calls in transpose, kernel-rename validation had to be updated
* JSON metadata + JSON summary
- add serialization support for config
- add serialization support for statistics
- additions to json spec
- rocprofiler-sdk-tool/metadata/config
- rocprofiler-sdk-tool/metadata/command
- rocprofiler-sdk-tool/summary
- config output_keys support for NVIDIA %q{<ENV-VAR>} syntax
- config output_keys support keys within keys
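The output keys named in this log (`%hostname%`, `%h`, and the `%q{<ENV-VAR>}` syntax) can be pictured as string substitution on an output path. The sketch below is one possible reading, not the tool's implementation: `expand_output_keys` is a hypothetical helper, and "keys within keys" is interpreted here as re-expanding the result until it stops changing.

```python
import re


def expand_output_keys(text, hostname, env, max_passes=8):
    """Expand %hostname%, %h and %q{VAR} until the text stops changing."""

    def one_pass(s):
        # %q{VAR} takes its value from the environment mapping (empty if unset)
        s = re.sub(r"%q\{([^}]+)\}", lambda m: env.get(m.group(1), ""), s)
        # Replace the long form first so "%h" does not eat part of "%hostname%"
        s = s.replace("%hostname%", hostname)
        return s.replace("%h", hostname)

    for _ in range(max_passes):  # bounded, so a self-referencing value cannot loop forever
        expanded = one_pass(text)
        if expanded == text:
            break
        text = expanded
    return text


# RUN_DIR itself contains a key, which is expanded after substitution
env = {"RUN_DIR": "out/%h", "RANK": "3"}
path = expand_output_keys("%q{RUN_DIR}/rank-%q{RANK}", "node01", env)
```

Passing `hostname` and `env` as arguments (rather than reading `os.environ` directly) keeps the sketch easy to test.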
* rocprofv3 --summary-groups warning if no domain matches
- emit warning if a regex for summary groups did not match any domain names
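The warning described above can be sketched as a check of each summary-group regex against the known domain names. This is an illustration only: the `unmatched_groups` helper, the use of `re.search`, the warning text, and the domain names are assumptions rather than the tool's actual behavior.

```python
import re
import warnings


def unmatched_groups(patterns, domain_names):
    """Return the regexes that match no domain name, warning about each one."""
    missing = []
    for pattern in patterns:
        if not any(re.search(pattern, name) for name in domain_names):
            warnings.warn(f"summary group '{pattern}' did not match any domain")
            missing.append(pattern)
    return missing


# Illustrative domain names only
domains = ["HIP_API", "HSA_API", "KERNEL_DISPATCH", "MEMORY_COPY"]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    missing = unmatched_groups(["H.*_API", "MARKER"], domains)
```

Here `H.*_API` matches two domains and passes silently, while `MARKER` matches nothing and produces one warning.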
* Compile fix for lib/rocprofiler-sdk-tool/tool.cpp
- get_config().scratch_memory_trace
- pass contributions to write_json
* Update rocprofv3.py to preload rocprofiler-sdk-roctx
- appended to LD_PRELOAD when args.marker_trace is enabled
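Appending a library to `LD_PRELOAD` without losing what is already there might look like the sketch below. The `append_preload` helper and the `libfoo.so` entry are hypothetical, a colon-separated list is assumed, and `marker_trace` stands in for `args.marker_trace`.

```python
def append_preload(env, library):
    """Return a copy of env with library appended to LD_PRELOAD (no duplicates)."""
    updated = dict(env)
    # Drop empty entries so an unset or empty LD_PRELOAD yields a clean list
    current = [p for p in updated.get("LD_PRELOAD", "").split(":") if p]
    if library not in current:
        current.append(library)
    updated["LD_PRELOAD"] = ":".join(current)
    return updated


marker_trace = True  # stands in for args.marker_trace
env = {"LD_PRELOAD": "libfoo.so"}
if marker_trace:
    env = append_preload(env, "librocprofiler-sdk-roctx.so")
```

Working on a copy of the environment mapping leaves the caller's environment untouched until the child process is launched with the updated one.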
* Fix ReST link errors about subtitle underline being too short
* Patch tokenization of config::stats_summary_groups
- guard against array values of empty strings
* Tweak rocprofv3 summary test
- input-summary.yaml (used by rocprofv3-test-summary-inp-yaml-execute) only provides one summary group regex
* Disable LD_PRELOAD of librocprofiler-sdk-roctx.so
- this causes problems in the sanitizers, will be addressed in another PR