a061b7947f
- hosttrace library automatically collects and merges timestamps for HIP API calls and kernels with the host-side instrumentation
- mostly eliminates the need for using external rocprof
- added thread_instruction_count in perfetto output
- increased hosttrace min_loop_address_range to 512
- disabled instrumenting functions with dynamic callsites by default
- miscellaneous cmake updates

* roctracer support
  - fully integrated perfetto + roctracer outputs
  - thread_instruction_count in perfetto
  - increased min_loop_address_range to 512
  - disabled instrumenting functions with dynamic callsites by default
  - updated timemory submodule

* hosttrace_launch_compiler
  - support for using an alternative compiler as needed via launch compiler
  - elfio added as submodule (not currently used)
  - miscellaneous cmake updates

* README update + host/device categories + misc
  - timemory fix for TIMEMORY_ROCTRACER_ENABLED
  - transpose fix

* papi_tuple_t -> papi_tot_ins
  - minor fix to Findroctracer.cmake