
Thread Trace and ROCprof Trace Decoder

Services

  • Thread trace in device profiling mode
  • ROCprof Trace Decoder decodes the received thread trace data
  • Thread trace start/stop using roctx

Properties

agent.cpp:

  • Configures thread trace on every GPU agent it finds, using rocprofiler_configure_device_thread_trace_service
  • Waits until roctxProfilerResume is called to start thread trace
  • Stops tracing at roctxProfilerPause
  • Receives the trace data in shader_data_callback and calls rocprofiler_trace_decode to decode the data
  • rocprofiler_trace_decode calls parse (a lambda)
  • parse receives the decoded data and increments the hit count and accumulated latency per PC address
  • At application end, tool_fini calls gen_output_stream to write the top hotspots into thread_trace.log
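
The accumulate-then-report pattern described for parse and gen_output_stream can be sketched in plain C++ as follows. This is a minimal illustration, not the code in agent.cpp: DecodedInst, PcStats, HotspotTable, accumulate and top_hotspots_report are hypothetical names, the record layout is a simplified stand-in for what the decoder delivers, and ranking hotspots by accumulated latency is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one decoded instruction record.
struct DecodedInst
{
    uint64_t pc;       // program counter address of the instruction
    uint64_t latency;  // cycles attributed to this execution
};

// Per-address totals, as kept by the tool between callbacks.
struct PcStats
{
    uint64_t hitcount = 0;
    uint64_t latency  = 0;
};

using HotspotTable = std::map<uint64_t, PcStats>;

// Plays the role of the parse lambda: fold one batch of records into the table.
inline void accumulate(HotspotTable& table, const std::vector<DecodedInst>& batch)
{
    for(const auto& inst : batch)
    {
        auto& stats = table[inst.pc];  // creates a zeroed entry on first sight
        stats.hitcount += 1;
        stats.latency += inst.latency;
    }
}

// Plays the role of gen_output_stream: format the top_n addresses,
// ordered by accumulated latency (assumed ranking), one per line.
inline std::string top_hotspots_report(const HotspotTable& table, std::size_t top_n)
{
    std::vector<std::pair<uint64_t, PcStats>> rows(table.begin(), table.end());
    std::stable_sort(rows.begin(), rows.end(), [](const auto& a, const auto& b) {
        return a.second.latency > b.second.latency;
    });
    rows.resize(std::min(top_n, rows.size()));

    std::ostringstream out;
    for(const auto& row : rows)
    {
        out << "pc=0x" << std::hex << row.first << std::dec
            << " hitcount=" << row.second.hitcount
            << " latency=" << row.second.latency << '\n';
    }
    return out.str();
}
```

In a tool following this shape, accumulate would be called once per decoder callback and top_hotspots_report once at finalization, with the returned string written to the log file.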

main.cpp:

  • Defines a few different kernels and runs them
  • The first loop iteration warms up the kernels
  • The second iteration calls roctxProfilerResume to start thread trace
  • After the loop ends, roctxProfilerPause is called to stop tracing
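
The warm-up and trace window described above can be sketched without any GPU code. This is an illustrative outline under stated assumptions, not the contents of main.cpp: TraceHooks and run_iterations are hypothetical names, the kernels are represented by plain callables, and the hooks stand in for the places where the sample calls roctxProfilerResume and roctxProfilerPause. The default of two iterations is also an assumption.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical hooks. In the sample, these roles are played by
// roctxProfilerResume (resume) and roctxProfilerPause (pause).
struct TraceHooks
{
    std::function<void()> resume;
    std::function<void()> pause;
};

// Runs every kernel once per iteration. Iteration 0 is an untraced warm-up;
// tracing is resumed at the start of iteration 1 and paused after the loop.
inline void run_iterations(const std::vector<std::function<void()>>& kernels,
                           const TraceHooks&                         hooks,
                           int                                       iterations = 2)
{
    for(int iter = 0; iter < iterations; ++iter)
    {
        if(iter == 1) hooks.resume();  // second iteration: start thread trace
        for(const auto& kernel : kernels)
            kernel();
    }
    hooks.pause();  // after the loop: stop thread trace
}
```

Keeping the first iteration outside the trace window means one-time costs such as code loading and cache warm-up do not appear in the collected trace.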