Fix the following error:
roctx.cpp:91:25: error: reinterpret_cast from 'const void *' to 'decltype(report_activity.load())' (aka 'int (*)(activity_domain_t, unsigned int, void *)') casts away qualifiers
report_activity.store(reinterpret_cast<decltype(report_activity.load())>(function),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
by replacing the 'const void *function' argument with the correct type.
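The shape of the fix can be sketched as below. This is an illustration only: activity_domain_t is aliased to int here, and RegisterReportActivity, Report and SampleReport are hypothetical names, not the actual roctx.cpp code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the real roctracer enum type.
using activity_domain_t = int;
// Function-pointer type taken from the compiler diagnostic.
using report_activity_fn = int (*)(activity_domain_t, unsigned int, void*);

std::atomic<report_activity_fn> report_activity{nullptr};

// With a 'const void* function' parameter, the store needs a
// reinterpret_cast that drops 'const', which the compiler rejects.
// Declaring the parameter with the function-pointer type removes the cast.
void RegisterReportActivity(report_activity_fn function) {
  report_activity.store(function);
}

// Calls the registered callback; assumes one has been registered.
int Report(activity_domain_t domain, unsigned int op, void* record) {
  report_activity_fn fn = report_activity.load();
  assert(fn != nullptr && "no callback registered");
  return fn(domain, op, record);
}

// Example callback used only to exercise the sketch.
int SampleReport(activity_domain_t domain, unsigned int op, void*) {
  return domain + static_cast<int>(op);
}
```

Calling RegisterReportActivity(&SampleReport) then compiles without any cast, and Report(1, 2, nullptr) forwards to the stored pointer.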
Change-Id: I912239daf6f4a3f00fc753306b84833e5c75f74b
Improve the performance of the roctx markers when the tracer is not
engaged (i.e. the application is not running under rocprof).
The cost of one roctx push/pop pair was measured with:
-----------------------------------------------------------------------
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < 10000000; ++i) {
    roctxRangePush("A");
    roctxRangePop();
  }
  auto end = std::chrono::steady_clock::now();
  std::cout << "ns = " << std::chrono::nanoseconds(end - start).count()
      / 10000000 << std::endl;
-----------------------------------------------------------------------
w/o rocprof | with rocprof | commit
       92ns |        770ns | 0d6e132: Cleanup CallbackTable::Get
       28ns |        712ns | 6421bd5: Cleanup ROCTX's implementation
       20ns |        664ns | 7f0e5e5: Remove the roctx range message...
        6ns |        665ns | this commit
Change-Id: Id679dcbd0fb190a3179be98a9b2c1db151efee3d
The range message stack is mirrored in case ranges are pushed or popped
while tracing is stopped (by the tracer tool?). When a stop event is
reported, the tracer tool emits RangePop events by unwinding the stack.
When the start event is then reported, it emits RangePush events by
unwinding the stack again, from top to bottom. The issue is that the
RangePush events should be emitted in the reverse order, from bottom to
top.
For example:
RangePush(M1); RangePush(M2); \
TracerStop; RangePop; RangePop; \
...; \
TracerStart; RangePush(M2); RangePush(M1); \ <- In the wrong order
RangePop; RangePop;
It could be fixed by reversing the stack in RangeStackIterate but is it
worth it? The roctx range markers are supposed to be unintrusive so that
they can be left in the application even when it isn't being traced.
Simplifying the roctx API and reducing its added latency by removing
the range message stack mirroring seems like the better choice.
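The ordering problem described above can be modelled with a small sketch. RangeStack, UnwindOrder and ReplayOrder are hypothetical names used for illustration; they are not the removed RangeStackIterate code.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a per-thread range stack:
// front() is the outermost range, back() is the innermost one.
using RangeStack = std::vector<std::string>;

// Top-to-bottom traversal: the right order for the RangePop events
// emitted on tracer stop, but the wrong order for re-emitting pushes.
std::vector<std::string> UnwindOrder(const RangeStack& stack) {
  return std::vector<std::string>(stack.rbegin(), stack.rend());
}

// Bottom-to-top traversal: the order in which RangePush events would
// have to be re-emitted on tracer start to rebuild the same nesting.
std::vector<std::string> ReplayOrder(const RangeStack& stack) {
  return stack;
}
```

For a stack holding M1 then M2, UnwindOrder yields M2, M1 and ReplayOrder yields M1, M2, which matches the example in this message.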
TODO: A future change should make roctx events immune to tracer start
and tracer stop requests. Or simply remove roctracer_start/stop.
Change-Id: Ie4d76afb5ce8d263848dcf1b599af394db56ddab
Remove thread_data_init. The C++ standard guarantees that the thread
local variable is initialized before its first odr-use and destructed
when the thread exits. Use a global initializer to set the reference
from the message stack instance in the map.
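A minimal sketch of the pattern, under assumptions: MessageStack, ThreadData, registry and RegisteredThreads are hypothetical names, and the real code may wire the map entry differently. The thread_local object registers its stack when it is constructed and unregisters it when the thread exits.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-thread message stack.
struct MessageStack {
  std::vector<std::string> messages;
};

// Hypothetical global map from thread id to that thread's stack.
std::mutex registry_mutex;
std::map<std::thread::id, MessageStack*> registry;

// The constructor and destructor replace an explicit thread_data_init.
struct ThreadData {
  MessageStack stack;
  ThreadData() {
    std::lock_guard<std::mutex> lock(registry_mutex);
    registry[std::this_thread::get_id()] = &stack;
  }
  ~ThreadData() {
    std::lock_guard<std::mutex> lock(registry_mutex);
    registry.erase(std::this_thread::get_id());
  }
};

// Initialized before its first odr-use in a thread, destroyed at thread exit.
thread_local ThreadData thread_data;

MessageStack& CurrentStack() { return thread_data.stack; }

std::size_t RegisteredThreads() {
  std::lock_guard<std::mutex> lock(registry_mutex);
  return registry.size();
}
```

A thread that calls CurrentStack() appears in the map for as long as it runs and is removed again once it has exited.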
Remove roctracer_error_string. It does not belong in this library:
ROCTX does not expose errors to the application. The only functions
that report an error (Push/Pop) do so by returning -1.
Remove memory leaks due to strdup on the range messages. The memory
for a message is guaranteed to be valid for the duration of the
callback, and it is the application's responsibility to strdup the
string if it needs to extend the message's lifetime.
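The lifetime contract can be illustrated with a sketch. The callback signature and the names OnRangePush and EmitRangePush are hypothetical, and a std::string copy is used here in place of strdup.

```cpp
#include <string>
#include <vector>

// Messages the consumer chose to keep beyond the callback.
std::vector<std::string> saved_messages;

// Hypothetical consumer callback. 'message' is only guaranteed to be
// valid during this call, so it is copied instead of storing the pointer.
void OnRangePush(const char* message) {
  saved_messages.emplace_back(message);
}

// Hypothetical producer side: the buffer handed to the callback is
// released as soon as the callback returns, and nothing is duplicated
// on the producer's behalf.
void EmitRangePush(void (*callback)(const char*), const std::string& text) {
  std::string buffer = text;
  callback(buffer.c_str());
}  // 'buffer' is destroyed here; only the consumer's copy survives.
```

After EmitRangePush(&OnRangePush, "A") returns, saved_messages still holds its own copy of "A".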
Add a lock to the RegisterApiCallback implementation. Iterating the
message stack map must be synchronized as a new thread could be adding
a new value to the map.
Change-Id: Iaf5b07ebc9efe4061cb01327d4c7034888727816
Make CallbackTable::Get return the callback_function/user_arg pair
as an actual return value instead of returning it through pointer
arguments.
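A sketch of what such an interface could look like. The callback signature, the template parameter, the Set method and the locking are assumptions made for illustration, not the actual CallbackTable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>

// Hypothetical callback signature.
using callback_fn = void (*)(std::uint32_t id, const void* data, void* user_arg);

template <std::size_t N>
class CallbackTable {
 public:
  using Entry = std::pair<callback_fn, void*>;

  void Set(std::uint32_t id, callback_fn callback, void* user_arg) {
    assert(id < N && "callback id out of range");
    std::lock_guard<std::mutex> lock(mutex_);
    table_[id] = Entry(callback, user_arg);
  }

  // Returns both values at once instead of writing through out-pointers.
  Entry Get(std::uint32_t id) const {
    assert(id < N && "callback id out of range");
    std::lock_guard<std::mutex> lock(mutex_);
    return table_[id];
  }

 private:
  mutable std::mutex mutex_;
  std::array<Entry, N> table_{};  // every entry starts as {nullptr, nullptr}
};

// Callback used only to exercise the sketch.
void NoopCallback(std::uint32_t, const void*, void*) {}
```

With C++17, a caller can then write auto [callback, user_arg] = table.Get(id); and test callback against nullptr before invoking it.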
Change-Id: Ia2dfcdad8c237a09620518ad67af94add47220da
Replace EXC_ABORT() checks with assertions.
Rewrite the exception class to use std::runtime_error (as it
already handles the std::string/char* message argument).
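As an illustration of the second point, an exception class can simply inherit the std::runtime_error constructors. ApiError and Describe are hypothetical names; the actual class in this change may differ.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type. std::runtime_error already stores the
// message and accepts both std::string and const char* arguments, so
// no extra members or constructors are needed.
class ApiError : public std::runtime_error {
 public:
  using std::runtime_error::runtime_error;
};

// Throws with the given message and returns what() as seen by a handler.
std::string Describe(const std::string& message) {
  try {
    throw ApiError(message);
  } catch (const std::exception& e) {
    return e.what();
  }
}
```

Both ApiError("text") and ApiError(std::string("text")) are accepted, and handlers can catch the error as std::exception.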
Change-Id: I48e31924f3aea1328e6562ab6bb06ec373fd5d5e
Move roctracer_cb_table.h to the src/core directory, as it should not
be exposed as a public header, and rename it to callback_table.h.
Change-Id: Ib448cbd32a275df0268d53bd8d1da0bdc9201470