Add roctracer_hcc.h for backward compatibility with the multiple components that use it, such as TensorFlow
Change-Id: Idfcdda9207277866e629e7bb9bfc0da835481217
The range message stack is mirrored in case ranges are pushed or popped
while tracing is stopped (by the tracer tool?). When a stop event is
reported, the tracer tool emits RangePop events by unwinding the stack,
then, when the start event is reported, it emits RangePush events by
unwinding the stack again. The issue is that the RangePush events should
be emitted in the reverse order (bottom of the stack first).
For example:
RangePush(M1); RangePush(M2); \
TracerStop; RangePop; RangePop; \
...; \
TracerStart; RangePush(M2); RangePush(M1); \ <- In the wrong order
RangePop; RangePop;
It could be fixed by reversing the stack in RangeStackIterate but is it
worth it? The roctx range markers are supposed to be unintrusive so that
they can be left in the application even when it isn't being traced.
Simplifying the roctx API and reducing its added latency by removing
the range message stack mirroring seems like the better choice.
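
To illustrate the ordering problem described above, here is a minimal
sketch. RangeStack, ReplayPushesTopDown and ReplayPushesBottomUp are
hypothetical names, not the tracer tool's code; the sketch only assumes
that the mirrored stack holds the pushed messages with the most recent
one on top.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the mirrored range message stack.
// The last element is the top of the stack (most recent RangePush).
using RangeStack = std::vector<std::string>;

// Unwinding order (top first): what the message above describes as wrong
// for replaying RangePush events on tracer start.
std::vector<std::string> ReplayPushesTopDown(const RangeStack& stack) {
  return std::vector<std::string>(stack.rbegin(), stack.rend());
}

// Original push order (bottom first): the order in which the ranges were
// opened by the application.
std::vector<std::string> ReplayPushesBottomUp(const RangeStack& stack) {
  return std::vector<std::string>(stack.begin(), stack.end());
}
```

With a stack built by RangePush(M1); RangePush(M2), the top-down replay
yields M2, M1 while the bottom-up replay yields M1, M2.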
TODO: A future change should make roctx events immune to tracer start
and tracer stop requests. Or simply remove roctracer_start/stop.
Change-Id: Ie4d76afb5ce8d263848dcf1b599af394db56ddab
Remove thread_data_init. The C++ standard guarantees that the thread
local variable is initialized before its first odr-use and destructed
when the thread exits. Use a global initializer to set the reference
from the message stack instance in the map.
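
A minimal sketch of the initialization guarantee mentioned above, using
a function-local thread_local. MessageStack, CurrentStack and
g_constructed are hypothetical names for illustration, not the library's
code.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Counts constructions so the lazy, once-per-thread initialization is
// observable.
std::atomic<int> g_constructed{0};

// Hypothetical per-thread message stack.
struct MessageStack {
  MessageStack() { ++g_constructed; }
  std::vector<std::string> messages;
};

// The object is constructed the first time a given thread calls this
// function and destroyed when that thread exits, so no explicit
// per-thread init call is needed.
MessageStack& CurrentStack() {
  thread_local MessageStack stack;
  return stack;
}
```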
Remove roctracer_error_string. This does not belong to this library.
ROCTX does not expose errors to the application. The only functions
returning errors are returning -1 (Push/Pop).
Remove memory leaks due to strdup on the range messages. The memory
for the messages is guaranteed to be valid for the duration of the
callback, and it is the application's responsibility to strdup the
strings if it needs to extend the message's lifetime.
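
As a sketch of what this means for a client, the callback below copies
the message before returning. RangeCallbackData and OnRangePush are
hypothetical names, not the ROCTX callback types, and std::string is
used here as the C++ equivalent of strdup.

```cpp
#include <string>
#include <vector>

// Hypothetical callback payload. `message` is assumed to be valid only
// for the duration of the callback.
struct RangeCallbackData {
  const char* message;
};

// Messages the client wants to keep after the callback returns.
std::vector<std::string> g_saved_messages;

void OnRangePush(const RangeCallbackData* data) {
  // std::string copies the characters into its own buffer, so the saved
  // message stays valid after the caller's memory is reused or freed.
  g_saved_messages.emplace_back(data->message ? data->message : "");
}
```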
Add a lock to the RegisterApiCallback implementation. Iterating the
message stack map must be synchronized as a new thread could be adding
a new value to the map.
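
The synchronization described above can be sketched as follows. The map
layout and the names RegisterThreadStack and VisitAllStacks are
assumptions for illustration; the point is only that insertion and
iteration take the same lock.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

using MessageStack = std::vector<std::string>;

std::mutex g_stack_map_mutex;
// Hypothetical map from a thread index to that thread's message stack.
std::map<int, MessageStack> g_stack_map;

// Called when a new thread adds its stack to the map.
void RegisterThreadStack(int thread_index, MessageStack stack) {
  std::lock_guard<std::mutex> lock(g_stack_map_mutex);
  g_stack_map[thread_index] = std::move(stack);
}

// Called from the callback registration path. Holding the same mutex
// keeps the iteration from racing with a concurrent insertion.
std::size_t VisitAllStacks(
    const std::function<void(int, const MessageStack&)>& visit) {
  std::lock_guard<std::mutex> lock(g_stack_map_mutex);
  std::size_t visited = 0;
  for (const auto& entry : g_stack_map) {
    visit(entry.first, entry.second);
    ++visited;
  }
  return visited;
}
```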
Change-Id: Iaf5b07ebc9efe4061cb01327d4c7034888727816
System clock timestamps should only come from a single source:
util::timestamp_ns(). Externally, this function is exposed as
roctracer_get_timestamp() (used by the tracer tool).
Removed the now unused HSA Runtime Utilities which were never part
of the ROCtracer API.
Change-Id: I044b7f4da60fd8fdb771b0c877622a3143f0e815
A trace buffer is used to efficiently store synchronous event records
so that they can be processed later, possibly in a different thread,
when the buffer is flushed. This helps reduce the latency added by
tracing API calls.
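
A minimal sketch of the idea, not the contents of trace_buffer.h: records
are appended cheaply and handed to a handler in a batch. The TraceBuffer
class below and its capacity-triggered flush are assumptions, and for
brevity the handler runs inline rather than in a separate thread.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

template <typename Record>
class TraceBuffer {
 public:
  using FlushFn = std::function<void(const std::vector<Record>&)>;

  TraceBuffer(std::size_t capacity, FlushFn flush_fn)
      : capacity_(capacity), flush_fn_(std::move(flush_fn)) {
    records_.reserve(capacity_);
  }

  // Fast path: store the record; processing is deferred to the flush.
  void Emplace(Record record) {
    std::lock_guard<std::mutex> lock(mutex_);
    records_.push_back(std::move(record));
    if (records_.size() >= capacity_) FlushLocked();
  }

  // Hands any buffered records to the handler.
  void Flush() {
    std::lock_guard<std::mutex> lock(mutex_);
    FlushLocked();
  }

 private:
  void FlushLocked() {
    if (records_.empty()) return;
    flush_fn_(records_);
    records_.clear();
  }

  std::size_t capacity_;
  FlushFn flush_fn_;
  std::mutex mutex_;
  std::vector<Record> records_;
};
```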
The API does not need to use trace buffers as synchronous events are
directly reported to the client with callbacks, and asynchronous events
(activities) are saved in memory pools.
The implementation of HSA asynchronous memory copy activities was using
a trace buffer shared with the tracer tool to write the records to a
file (async_copy_trace.txt), instead of using a memory pool and
reporting the activity to the client.
Removed the asynchronous memory copies trace buffer, and updated
hsa_async_copy_handler to use the pool specified when the activity
was enabled.
Updated the tracer tool to read HSA_OP_ID_COPY records out of the
default memory pool and write them to async_copy_trace.txt.
Move trace_buffer.h to test/tool as tracer_tool.cpp is now the only
file using it.
Change-Id: Ida95aba2eaf3c3f2a979ed6c2b060374017b7424
The HCC runtime is no longer used, so move all the remaining
activities into the HipApi loader and remove the HccLoader.
Change-Id: I845c04ca275a474526840315bae0ad1a4ce02257
Move roctracer_cb_table.h to the src/core directory, as it should not
be exposed as a public header, and rename it to callback_table.h.
Change-Id: Ib448cbd32a275df0268d53bd8d1da0bdc9201470
Include the upgrade operation check in the package's prerm script.
Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
Change-Id: I1504ce96a27d21d9c3d9bafc0dea8055398adc99
roctracer_status_t roctracer_get_timestamp(uint64_t* timestamp);
Get the system timestamp for roctracer clients.
This API helps roctracer clients understand the reference frame of the
timestamps they receive in activity callbacks, because the nanosecond
values reported there are not in the same reference frame as the CPU
wall-clock time.
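
One way a client might use this is to sample both clocks once and keep
the difference. The sketch below is an illustration under assumptions:
GetTracerTimestamp is a stand-in with the same shape as
roctracer_get_timestamp (it returns 0 on success and reads
std::chrono::steady_clock, which may not be the clock the library uses),
and WallClockOffsetNs and ToWallClockNs are hypothetical helpers.

```cpp
#include <chrono>
#include <cstdint>

// Stand-in for roctracer_get_timestamp(); a real client would call the
// library function instead. The clock source here is an assumption.
int GetTracerTimestamp(uint64_t* timestamp) {
  using namespace std::chrono;
  *timestamp = static_cast<uint64_t>(
      duration_cast<nanoseconds>(steady_clock::now().time_since_epoch())
          .count());
  return 0;
}

// Samples the tracer clock and the wall clock back to back and returns
// wall_ns - tracer_ns, or 0 if the tracer timestamp is unavailable.
int64_t WallClockOffsetNs() {
  using namespace std::chrono;
  uint64_t tracer_ns = 0;
  if (GetTracerTimestamp(&tracer_ns) != 0) return 0;
  const int64_t wall_ns =
      duration_cast<nanoseconds>(system_clock::now().time_since_epoch())
          .count();
  return wall_ns - static_cast<int64_t>(tracer_ns);
}

// Converts a timestamp from an activity record into wall-clock
// nanoseconds using a previously sampled offset.
uint64_t ToWallClockNs(uint64_t activity_ns, int64_t offset_ns) {
  return static_cast<uint64_t>(static_cast<int64_t>(activity_ns) + offset_ns);
}
```

The offset is only as accurate as the gap between the two samples, so a
client that needs tighter alignment could sample several times and keep
the smallest gap.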