* Add rocprofv3-multi-node.md to source/lib/rocprofiler-sdk-tool
* Initial source re-organization
- create "output" static library
* Update include/rocprofiler-sdk/cxx/serialization.hpp
- add GPR count fields to kernel symbol serialization
* Add source/scripts/generate-rocpd.py
- reads one or more JSON output files from rocprofv3 and writes rocpd SQLite3 database
- Note: preliminary implementation
* More reorganization b/t lib/rocprofiler-sdk-tool and lib/output
* Updates to generate-rocpd.py
- add SQL views
- option: --absolute-timestamps -> --normalize-timestamps
- option: --generic-markers
- misc fixes with regards to getting the views working
- support marker names
* Update generate-rocpd.py
- Add --marker-mode option
* Update generate-rocpd.py
- Improve debugging of bad bulk SQLite statements
* Update rocprofv3-multi-node.md
- cleanup of proposed SQL schema
* lib/output/format_path.{hpp,cpp}
- rename format to format_path (in config.hpp and config.cpp)
- move format_path functionality to format_path.{hpp,cpp}
* Rework lib/output/tmp_file_buffer.{hpp,cpp}
* Update output_key.cpp
- support %cwd%, %launch_date%
* Rework lib/output/buffered_output.hpp
* Support csv_output_file constructed via domain_type
* Update lib/output/domain_type.{hpp,cpp}
- get_domain_trace_file_name
- get_domain_stats_file_name
* Update lib/rocprofiler-sdk-tool/tool.cpp
- tweak headers
* Update lib/output/generate*.cpp
- remove include of helpers.hpp
- CSV uses domain_type for filenames
* Update samples/counter_collection/per_dev_serialization.cpp
- make wait_on volatile
* Remove tool_table from lib/output and lib/rocprofiler-sdk-tool
- Also split various structs into their own files
- lib/output/agent_info
- lib/output/metadata
- lib/output/kernel_symbol_info
- lib/output/counter_info
- Implemented rocprofiler::tool::metadata
* Optimize rocprofiler_tool_counter_collection_record_t
- reduce the size of the struct from 24784 bytes to 8376 bytes
* Introduced output_config
- split subset of config (from tools library) into output_config to be able to configure the output generating functions separately from the tool library
- this is a significant step towards the output generating functions not relying on static global memory
* Stream chunks of data into output instead of loading all info memory
* Remove duplicate group_segment_size in rocprofiler_kernel_dispatch_info_t serialization
* Adding Q&A to rocprofv3-multi-node.md
* Remove all remaining include lib/rocprofiler-sdk-tool from lib/output
- migrated a fair amount of code from lib/rocprofiler-sdk-tool/helper.hpp to lib/output
* Update Q&A of rocprofv3-multi-node.md
* Fix minor compilation errors + minor cleanup
* Update hsa/async_copy.cpp
- when ROCPROFILER_CI_STRICT_TIMESTAMPS > 0, reduce the active_signal sync wait time
* Update profiling_time.hpp
- fix log messages for when start/end time is less/greater than enqueue/current CPU time
* Fix generate_stats for tool_counter_record_t
* Dictionary optimization for generate-rocpd.py
---------
Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com>
* Check to force tool to initialize the ctx id to zero.
* initialize rocprofiler_context_id_t with 0 in units tests
* changelog
---------
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
* SWDEV-465322: Adding support for r Perfcounter SIMD Mask in ATT
* Apply suggestions from code review
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* Adding unit tests
* Adding counters check for gfx9 and SQ block only
* Addressing review comments
* changing the struct size
* fixing header includes
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* Dispatch table copy/update uses ROCP_TRACE instead of ROCP_INFO
* Update rocprofiler-sdk CMake config
- rocprofiler::rocprofiler is alias to rocprofiler-sdk::rocprofiler-sdk instead of other way around
* Prefer rocprofiler-sdk::rocprofiler-sdk over rocprofiler::rocprofiler
* Fix WITH_UNWIND for glog
- requires a value of "none" instead of boolean now
* Update include/rocprofiler-sdk/registration.h
- explicit struct names to permit forward decl
* Update include/rocprofiler-sdk/cxx/serialization.hpp
- ROCPROFILER_SDK_CEREAL_NAMESPACE_BEGIN and ROCPROFILER_SDK_CEREAL_NAMESPACE_END to enable customized namespace
* Update test/tools/json-tool.cpp
- push/pop ppid as external correlation id instead of pid
* Update environment variables for tests and samples
* Revert to old CDash dashboard in run-ci.py
* Revert to new CDash dashboard in run-ci.py
- enums for operations should not contain callback/buffer tracing categorization
- e.g. ROCPROFILER_CALLBACK_TRACING_CODE_OBJECT_LOAD should be ROCPROIFLER_CODE_OBJECT_LOAD
* Added first ATT API
* Finalizing thread trace API
* Fixing more rebase conflicts
* Added codeobj disassembly sample
* Fixing merge issues with rebase [2]
* Adding ATT packets
* Implemented thread trace intercept
* Moved codeobj parser to same repo as rocprofiler
* Moved thread trace to new API
* Fixing merge conflicts
* Fixing more merge conflicts
* Adding thread trace packet reuse
* Merged aql_profile_v2 headers
* Linked ATT sample to aqlprofile
* Updated decoder to include non-loaded codeobjs
* Implemented ISA decoder into ATT sample
* Added marker_id to vaddr
* Updating aql_profile_v2 API to memcpy
* Updating thread trace API to include 64bit markers. Using the result of ISA matching.
* Added instruction type and cycles summary
* Updated sample with selection of kernel by kernel_object
* Added option to copy from memory kernels
* Moved tool_data in thread_trace to dynamic alloc
* Restoring hsa.cpp
* Fixed ATT sample crash. General improvements.
* Moved codeobj library to outside src/
* Updated license header
* Moved codeobj_capture to camelcase
* Solving some more merge conflicts
* Update samples/advanced_thread_trace/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update samples/advanced_thread_trace/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update samples/code_object_isa_decode/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/thread_trace/CMakeLists.txt
* Removing unused parameter check
* Adding const to isEmpty
* Removing unused warning
* Adding libdw-dev to requirements
* Running clang-format
* Commenting out new aql calls
* Clang format
* Unused variable fix
* Adding codeobj-decoder coverage
* Commenting out threadtrace
* Update samples/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* P
* WOverloaded
* Addressing clang-tidy
* Virtual destructor on ttracer class
* Corr id
* Fixing code source format
* Update CMakeLists.txt
* Build fixes
* Update source/lib/rocprofiler-sdk-codeobj/code_object_track.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Fix shadowing
* Update CMakeLists.txt
* Update samples/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>