Thread Trace and ROCprof Trace Decoder
Services
- Thread trace in device profiling mode
- ROCprof Trace Decoder decodes the received thread trace data
- Thread trace start/stop using roctx
Properties
agent.cpp:
- Configures thread trace on every GPU agent found, using `rocprofiler_configure_device_thread_trace_service`
- Waits until `roctxProfilerResume` is called to start thread trace
- Stops tracing at `roctxProfilerPause`
- Receives the trace data in `shader_data_callback` and calls `rocprofiler_trace_decode` to decode the data
  - `rocprofiler_trace_decode` calls `parse` (a lambda)
  - `parse` receives the decoded data and increments the hit count and latency totals for each PC address
- At application end, `tool_fini` calls `gen_output_stream` to write the top hotspots into `thread_trace.log`
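
The per-address bookkeeping described above can be sketched in plain C++. This is a minimal illustration, not the code in agent.cpp: `DecodedInst`, `PcStats`, `HotspotTable`, `accumulate_hits`, and `write_top_hotspots` are hypothetical names, the record layout is an assumption, and ranking hotspots by total latency is a choice made for this sketch. No rocprofiler-sdk calls are used, so it builds without ROCm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <ostream>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical stand-in for one decoded instruction record; the decoder's
// real record layout is not reproduced here.
struct DecodedInst
{
    uint64_t pc;       // program counter address of the instruction
    uint64_t latency;  // cycles attributed to this execution (assumed unit)
};

// Totals accumulated for a single PC address.
struct PcStats
{
    uint64_t hitcount = 0;
    uint64_t latency  = 0;
};

using HotspotTable = std::map<uint64_t, PcStats>;

// Plays the role of the `parse` step: fold a batch of decoded records
// into the table, adding one hit and its latency per record.
inline void
accumulate_hits(HotspotTable& table, const std::vector<DecodedInst>& batch)
{
    for(const auto& inst : batch)
    {
        auto& stats = table[inst.pc];
        stats.hitcount += 1;
        stats.latency += inst.latency;
    }
}

// Plays the role of the output step: write the `top_n` addresses with the
// highest total latency, one per line. Ties keep ascending address order.
inline void
write_top_hotspots(std::ostream& out, const HotspotTable& table, std::size_t top_n)
{
    std::vector<std::pair<uint64_t, PcStats>> rows(table.begin(), table.end());
    std::stable_sort(rows.begin(), rows.end(), [](const auto& a, const auto& b) {
        return a.second.latency > b.second.latency;
    });
    if(rows.size() > top_n) rows.resize(top_n);

    for(const auto& row : rows)
    {
        // Entries are only created by accumulate_hits, so each has a hit.
        assert(row.second.hitcount > 0);
        out << "0x" << std::hex << row.first << std::dec
            << " hitcount=" << row.second.hitcount
            << " latency=" << row.second.latency << '\n';
    }
}
```

In a tool like the one described, something like `accumulate_hits` would run each time decoded data arrives, and something like `write_top_hotspots` would run once at finalization with a file stream as `out`.
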
main.cpp:
- Defines a few different kernels and runs them
- The first loop iteration warms up the kernels
- The second iteration calls `roctxProfilerResume` to start thread trace
- After the loop ends, `roctxProfilerPause` is called to stop tracing
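
The warm-up-then-trace pattern in the list above can be sketched on the host side as follows. This is an illustration under stated assumptions, not the code in main.cpp: `TraceControl` and `run_workload` are hypothetical names, the callbacks stand in for `roctxProfilerResume` and `roctxProfilerPause`, the kernels are ordinary callables rather than GPU launches, and the default of two iterations is assumed from the description.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the trace controls. In the sample these would
// be the roctx resume/pause calls; callbacks are used here so the sketch
// builds without ROCm.
struct TraceControl
{
    std::function<void()> resume;
    std::function<void()> pause;
};

// Runs every kernel once per iteration. Iteration 0 is an untraced warm-up,
// tracing is resumed at the start of iteration 1, and it is paused only
// after the loop has finished.
inline void
run_workload(const std::vector<std::function<void()>>& kernels,
             const TraceControl&                       trace,
             int                                       iterations = 2)
{
    assert(iterations >= 2);  // need one warm-up and at least one traced pass
    for(int iter = 0; iter < iterations; ++iter)
    {
        if(iter == 1) trace.resume();
        for(const auto& launch : kernels)
            launch();
    }
    trace.pause();
}
```

Keeping the first pass outside the traced region means one-time costs such as code loading do not show up in the collected hotspots, which is the usual reason for a warm-up iteration.
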