Make sure duplicates are not counted for load_unload_reload_trace and
fix the ignore-count option in check_trace.py.
Change-Id: I9e674aa624ec3b473bb7c6cc95260e240204627f
Split the public and private HSA profiler/tracer interfaces. Only the
public interface should be exposed in include/roctracer.
Change-Id: I7e4424cd90023693350c31e6b02caca8c984ba84
The roctracer-tests package contains all the roctracer test binaries
and scripts needed to run the testsuite outside of the build directory.
Change-Id: Id11f862fb4bdb2425d68f455074172c38814ec92
Improve the performance of the roctx markers when the tracer is not
engaged (the application is not running with rocprof).
The performance of roctx push/pop was measured with:
-----------------------------------------------------------------------
auto start = std::chrono::steady_clock::now();
for (int i = 0; i < 10000000; ++i) {
  roctxRangePush ("A");
  roctxRangePop ();
}
auto end = std::chrono::steady_clock::now();
std::cout << "ns = " << std::chrono::nanoseconds(end - start).count()
    / 10000000 << std::endl;
-----------------------------------------------------------------------
w/o rocprof | with rocprof | commit
       92ns |        770ns | 0d6e132: Cleanup CallbackTable::Get
       28ns |        712ns | 6421bd5: Cleanup ROCTX's implementation
       20ns |        664ns | 7f0e5e5: Remove the roctx range message...
        6ns |        665ns | this commit
Change-Id: Id679dcbd0fb190a3179be98a9b2c1db151efee3d
Make CallbackTable::Get return the callback_function/user_arg pair
as an actual return value instead of returning it through pointer
arguments.
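As an illustration only, the change in calling convention can be sketched
as follows. The class body, the callback_fn_t alias, and the GetOld,
CountCall, and Invoke names are hypothetical stand-ins and not the actual
roctracer code; only the idea of returning the pair by value comes from
this commit.

```cpp
#include <cassert>
#include <utility>

// Hypothetical callback signature; the real roctracer type differs.
using callback_fn_t = void (*)(int id, void* user_arg);

// Hypothetical, simplified table holding a single callback/user_arg pair.
class CallbackTable {
 public:
  void Set(callback_fn_t fn, void* arg) {
    fn_ = fn;
    arg_ = arg;
  }

  // Old style: the results are written through pointer arguments.
  void GetOld(callback_fn_t* fn, void** arg) const {
    *fn = fn_;
    *arg = arg_;
  }

  // New style: the pair is the actual return value.
  std::pair<callback_fn_t, void*> Get() const { return {fn_, arg_}; }

 private:
  callback_fn_t fn_ = nullptr;
  void* arg_ = nullptr;
};

// Example callback: treats user_arg as an int counter and increments it.
inline void CountCall(int /*id*/, void* user_arg) {
  ++*static_cast<int*>(user_arg);
}

// Caller side: C++17 structured bindings unpack the returned pair.
inline void Invoke(const CallbackTable& table, int id) {
  auto [fn, arg] = table.Get();
  if (fn != nullptr) fn(id, arg);
}
```

Returning the pair by value removes the need for the caller to declare
two output variables first, and makes it harder to pass a wrong pointer.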
Change-Id: Ia2dfcdad8c237a09620518ad67af94add47220da
Move roctracer_cb_table.h to the src/core directory, as it should not
be exposed as a public header, and rename it to callback_table.h.
Change-Id: Ib448cbd32a275df0268d53bd8d1da0bdc9201470
CppHeaderParser has limited support for unnamed structs. It leaves the
name empty, which results in classes (a.k.a. structs) having trailing '::'
characters and gives no way to distinguish two unnamed structs at the
same level of nesting. One example is the inner structs of
hipExternalSemaphoreSignalParams. The workaround is to skip these
structs so that they are not generated in the output header file,
which lists all the ostream operator<< overloads. Only the inner unnamed
structs are skipped; the rest is processed as usual.
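A minimal sketch of the situation, assuming a made-up struct: SignalParams
and ToString below are hypothetical and only mimic the shape described
above, and the operator<< shown is a guess at what a generated overload
for the named outer struct might look like, not the generator's real
output.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical struct with two unnamed structs at the same nesting level.
// A parser that leaves their names empty would see both inner types as
// something ending in '::', so they cannot be told apart or referred to.
struct SignalParams {
  struct {
    struct { unsigned long long value; } fence;
    struct { unsigned long long key; } keyedMutex;
  } params;
  unsigned int flags;
};

// The unnamed inner types have no name to put in an operator<< signature,
// so no overload is emitted for them. The named outer struct still gets
// one; in this sketch it prints only the field whose type has a name.
inline std::ostream& operator<<(std::ostream& out, const SignalParams& v) {
  out << "{flags=" << v.flags << "}";
  return out;
}

// Small helper to capture the streamed form as a string.
inline std::string ToString(const SignalParams& v) {
  std::ostringstream oss;
  oss << v;
  return oss.str();
}
```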
Change-Id: I17439c46095469b7adb7aee0b0f0b3d234aabc11