* Update VERSION
Update version to 0.7.0
* Fixing test install build step issue
* Updates from editor
---------
Co-authored-by: Ammar ELWazir <Ammar.ELWazir@amd.com>
* SWDEV-499989: Add script to convert rocprofv3 counter collection output format to that of v1
* Add logging and argparsing
* Dropping duplicated counters across multiple pmc lines
* Adding test for conversion
* moving conversion script to test files
* copy conversion script from scripts folder
* Counter track for memory allocation is now a running sum showing total allocation
* Address review comments
* Update source/lib/output/generatePerfetto.cpp
Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>
* Updated to reflect review comments
* Fix compilation errors on CI
* remove braces on scalar
* Fix struct compilation issues
* Removed name_to_id for sanitizer
---------
Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>
* rocprofv3: suppress agent info when no data collected
* Update output config serialization
- full serialization of output configuration
* Update rocprofiler-sdk-att/tests
- add version and soversion
- change output directory
- generate libatt_decoder_summary
- disable tests instead of removing them
* Update rocprofv3 command-line
- make --att-library-path hidden by default
- simplify check_att_capability
- reorder pc sampling options
- add hidden --echo option
- remove ROCPROF_LIST_AVAIL_TOOL_LIBRARY from preload
* Add new rocprofv3 tests for specifying the ATT library path
* Tweak to rocprofv3-test-hsa-multiqueue-att tests
* Update rocprofv3 tool to enable output with att
* Fix standalone test installation
* Revert fetchcontent_makeavailable to fetchcontent_populate
* Revert tests/common/CMakeLists.txt
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* [DO NOT MERGE] Misc UUID updates
- this is WIP
* Agent visibility
- Support for ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL
* Update CHANGELOG
* tweak to rocprofiler_agent_runtime_visiblity_t
* Code object kernel address
- new fields in code_object_kernel_symbol_register_data_t
- kernel_code_entry_byte_offset
- kernel_address
* Support ROCR_VISIBLE_DEVICES reordering devices for HIP
* Addressed code review changes
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* rocprofv3: do not abort if counter does not have dimensions
* Relax error handling further in rocprofv3 metadata
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* Adding pc sampling how to guide
* doc update
* Fixing indentation
* updating index
* updating doc
* updating doc
* Added field information
* Fixing Formatting
* fix formatting error
* Added json format for pc sampling
* feedback resolved
* formatting for text
* PC Sampling API doc
* Reformatted
* Note for shared systems
* update docs
* correcting relative path for cross-referencing
---------
Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>
* Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG value to be used with HSA calls
Fix for CI
* More tweaks
* Increase reproducible-runtime kernel sleep granularity
* Fix data race in synchronous device counter collection sample
* Update device counting service
- add get_active_context function
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX
- add UBSAN_OPTIONS to setup-sanitizer-env.sh
* Improve ROCPROFILER_DEFAULT_FAIL_REGEX
* Use -fno-sanitize-recover=undefined flag
- this compiler flag makes every undefined behavior error fatal, so the program exits
* Revert ROCPROFILER_DEFAULT_FAIL_REGEX
* fix for shift overflow
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
* Add example for synchronous reading of device counters
We already have test cases for this use case, but this is a sample
so that our collaborators have a place to quickly pull
code from for use on their end (and to serve as a working example).
* Formatting fix
* Formatting fix
* Minor change for testing
---------
Co-authored-by: Benjamin Welton <ben@amd.com>
* Adding New HIP APIs
* Format Fix
* Format Fix
* Removing changes from ostream and moving it to format
* Addressing Code Review Comments
* Versioning the new hip calls formatting
---------
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
* [SWDEV-509876] Remove buffer requirement from device counting service
No longer require a buffer to be given when setting up device counting
service. This is to reduce performance overhead in cases where immediate
return of counting samples is being used (synchronous mode).
* Missed file
* Update source/include/rocprofiler-sdk/device_counting_service.h
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* Update source/lib/rocprofiler-sdk/counters/controller.cpp
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* Update source/lib/rocprofiler-sdk/counters/device_counting.cpp
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* Fixes for build
---------
Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* Fix async copy validation test
- make the async copy tracing test work regardless of how many HSA memory copies the HIP memory copy decomposes into
* Fix rocprofv3 memory copy tests
* Fix compilation support for hipGraphBatchMemOpNodeGetParams
* Fix rocprofv3-test-summary-*-validate
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Fix HIP data type stringify
- when ROCPROFILER_CI is not defined, provide default for case statements
- Add support for hipGraphNodeTypeBatchMemOp when HIP version is >= 6.4.0
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>