* Handle hsa_queue_destroy after finalization
- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue
* Update HIP/HSA/marker update_table logging
* Update rocprofv3 tests
- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them
* Disable thread sanitizer deadlock detection
* Update CI workflow
- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers
* Update run-ci.py
- set gcovr html medium and high threshold
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- remove this capture from enable/disable serialization
* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*
- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map
* Logging for HIP/HSA/marker/profile_serializer
* Logging for HIP/HSA/marker/queue_controller
* Improve test_retired_correlation_ids asserts
* Fix tests/counter-collection/validate.py
- scale expected SQ_WAVES counter value based on warp size of GPU
* Tweak github comment for code coverage
* Remove gcovr html high/medium threshold args
* Fix tests/counter-collection/validate.py
- round before casting to int in test_counter_values
* operator bool for profile_serializer
- only wait on CV if profile_serializer is used
* Logging updates (profile_serializer + code_object)
* Update counter-collection validate.py
* QueueController does not wait on CV if finalizing/finalized
* Update CI workflow
- remove navi32 from core job
* Improve HIP/HSA/marker tracing get_functor/functor
- remove lambda wrapper around functor
* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp
- do not acquire cvmutex lock during finalization
* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*
- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized
* Update CI workflow
- remove navi32 runners
* bwelton fixes for hangs
* CMake improvements + simplified demangle
- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)
---------
Co-authored-by: Benjamin Welton <bewelton@amd.com>
[ROCm/rocprofiler-sdk commit: 2f9b1767e9]
* Moved tests/apps to tests/bin
* Renamed cmake project in tests/bin
* Update samples
- Use ROCPROFILER_DEFAULT_FAIL_REGEX
- tweaks to stdout messages
* Update tests
- Use ROCPROFILER_DEFAULT_FAIL_REGEX
* Add tests/lib
- libraries with HIP code
* Update PTL submodule
- remove atexit delete of thread_id_map
* Update cmake/rocprofiler_options.cmake
- Set ROCPROFILER_DEFAULT_FAIL_REGEX
* Update common lib: env + logging
- improved customization of logging settings
- default to disabling logging to files
- install failure handler for rocprofv3
- set_env support in environment.*
* Add lib/rocprofiler-sdk/shared_library.cpp
- shared library constructor
* Update lib/rocprofiler-sdk-tool/tool.cpp
- destructor thread safety
- convert callback_name_info and buffered_name_info to pointers
- install failure handler for logging
* Add tests/bin/hip-in-libraries
- hip-in-libraries is an exe which uses two shared libraries where each shared library contains HIP kernels
- used for testing deadlocking within __hipRegisterFatBinary
* Update bin/rocprofv3
- reorganized the env variables
- use exec to launch command
- set ROCPROFILER_LIBRARY_CTOR=1
* Add tests/rocprofv3/tracing-hip-in-libraries
- uses hip-in-libraries exe for exe which uses shared libraries to launch HIP kernels
* Update bin/rocprofv3
- fix counter collection (no exec)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- replace "Kernel-Name" with "Kernel_Name"
* Update lib/rocprofiler-sdk/registration.cpp
Use RTLD_LOCAL instead of RTLD_GLOBAL for env libraries
* Update tests/rocprofv3
- replace "Kernel-Name" with "Kernel_Name"
* Update tests
- vector-ops (bin) stream syncs + runs with 4 queues per device
- improve counter-collection/input1 validation
- rocprofv3/tracing-hip-in-libraries does not do sys-trace
- improved validation script for tracing-hip-in-libraries
- updated dispatch_callback in json-tool.cpp following reworking of prototypes for counter collection
* Update samples/counter_collection
- updated dispatch_callback(s) and record_callback(s) following reworking of prototypes
* Update bin/rocprofv3
- reorganized help menu
- added options for sub-HSA tables
- added --hip-runtime-trace
- changed --hip-trace to include --hip-compiler-trace
* Update lib/rocprofiler-sdk-tool
- improved kernel filtering
- removed arch_vgpr, accum_vgpr, sgpr code (in rocprofiler-sdk)
- fixed issue with counter-collection w/o tracing
- added support for fine grained HSA API tracing
- removed directly linking to HSA-runtime
* Update lib/rocprofiler-sdk/agent.cpp
- rocp_agents != hsa_agents is non-fatal when ROCPROFILER_BUILD_CI=OFF (CMake option)
* GPR (vector and scalar) info in kernel symbol data
- rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t contains general purpose register info
* Header include order fix
- Include repo headers first
- Third party library headers next
- standard library headers last
* Update dispatch profiling public API
- introduce rocprofiler_profile_counting_dispatch_data_t
- change signature of rocprofiler_profile_counting_dispatch_callback_t and rocprofiler_profile_counting_record_callback_t
- provide rocprofiler_user_data_t pointer in dispatch callback
- provide rocprofiler_user_data_t value (from dispatch cb) in record callback
* Update tests/bin/CMakeLists.txt
- fix add_subdirectory(hip-in-libraries) order
* Update VERSION
- bump to 0.2.0 in prep for AFAR
[ROCm/rocprofiler-sdk commit: 7b6d3c70bd]