[SWDEV-418917] reported that timing skew was being introduced by
roctracer. Most of the cause of this problem seems to stem from outrunning
the double buffering scheme that we use in memory_pool (part of the
reason for this outrun is due to File writing being slow). A semi-quick
fix that may be able to last until RocProf v2 is complete is to allow
adjustment of the buffer size. ROCTRACER_BUFFER_SIZE env variable was
introduced here which allows setting the buffer size of tracer tool.
By increasing the buffer size, an ~8% reduction in execution time when timing
on the program side. This should also reduce the frequency of large delays
when we outrun the buffer. Note: increasing this size dramatically can cause
slow startups (i.e. above 50MB).
Change-Id: I98c4316cfe93a043623ae2669cfe1a5abb55c990
Using a thread_local object is problematic as the thread local
destructors are called first before any global destructor, making
the object invalid while tearing down the process.
rocblas uses a global destructor to clean up the loaded HIP modules
and ends up calling hip_executable_destroy after the timestamp stack
is destructed. As a result the begin timestamp for that API function
is 0.
The solution is to store the phase_enter timestamp in the phase_data.
Change-Id: If143f4d123dfb111c72fb20365431d07e73fc570
The tracer tool needs to remember the begin timestamps for API
callbacks, and uses a thread_local std::stack for that purpose.
The issue with thread_local objects is that they are destructed
before anything else when the main thread exits. To work around
that issue, we use a "safe" stack in the roctracer API.
Use the same "safe" stack in the tracer tool.
Change-Id: I0d69d4eb44f0205f4102d0d5ef9803a1ec1800a5
The post-processing script cannot handle HIP ops without a correlation
ID. The correlation ID is needed to connect the record to a HIP stream
and originating thread.
This issue was exposed by a change to the tracer API to report
asynchronous activities even if their originating synchronous API
activity (callback) is not enabled. This was a flow in the API.
Also fix an issue with the API filtering. Undefined API names should
not cause an exception, they should be ignored.
Change-Id: Iab2221af6180ade2b9c2eb10c256c3a73d872e9f
Instead of dlopen'ing RTLD_NOLOAD a library (for example libamdhip64.so)
and rely on the dynamic linker search path, search through the already
loaded shared objects for a library with a matching name.
Change-Id: I3e74d432bd7ca68df8927ca435b290e86aaaf9e9
This function has been deprecated since ROCm-2.9, use ROCTX's
roctxMark(const char* message) as a replacement for roctracer_mark.
Change-Id: Ie4aeae1db238453fc4451746cc9a338032ba817f
- Multithreaded Applications and plugin destruction
- Fixing Async-copy trace in file plugin
- Adding the assert checkups for every trace buffer flush function
Change-Id: I96e096fd7ee2604931200a0b446edb5ce49959dd
- Added File plugin as the default plugin
- Moved the flush functions to the plugins
- Improved the flush to file implementation
Change-Id: I80dd448eb8147a8ea4aa63b39bd1d0a4baf7252b
Move the HSA intercept to the OnLoad function, so that it is available
as soon as the ROCR is loaded.
Layer the HSA API wrappers on top of the basic HSA activity intercept.
Change-Id: Ie636d59755543cda181e76ec29f0b55081136b63
This commit is for code cleanup and for optimizing kernel name search
in the API callback, making sure to get the kernel name accurately
for the hip functions that have any kernel names
Change-Id: Ie9ab917c895748bfb8eee9ddfcbcad81a0b9a9fa
When ROCP_TRUNCATE_NAMES is not set, getenv returns NULL and std::atoi
crashes. Check that getenv returns a non-NULL string before calling
std::atoi.
Change-Id: Ie479a481f8d23f034b425d14e3cfefb3d62c84e8
The ROCR now detects already loaded tool libraries and calls OnLoad/
OnUnload in the order specified with HSA_AMD_TOOL_ORDER.
It is no longer necessary to set the HSA_TOOLS_LIB environment variable
to load the roctracer API. The roctracer tool library should be
pre-loaded with LD_PRELOAD.
Change-Id: I6de1b1bd4f93caa08d3554aad2376d242c74fb7e
The same information can be generated from the hcc_ops_trace.txt file,
so in a later commit, will add a stage to the tblextr.py script to
generate the .csv files when ROCP_STATS_OPT=1.
Change-Id: I3d1575e096bedf98c66068d9a4ca141421e5bb9d
The roctracer_load, roctracer_unload, and roctrace_flush_buf functions
are not part of the ROCtracer API, and should not be exposed in the API
header file, but keep the functions in the library for backward
compatibility.
Add src/roctracer/backward_compat.cpp to implement retired functions.
Add test/app/backward_compat_test.cpp to test that the retired functions
are still accessible in the latest roctracer library.
Change-Id: I4c94310a7bfccfeae9384dac5db18fc79b4c5b17
Make error codes more informative and have negative values. This is an
ABI break but it does not appear known tools are relying on the exact
error codes.
Use logging for all errors so that roctracer_error_string will be able
to return last error message.
Make internal errors fatal and abort.
Do not use the tracer API exceptions in the tracer tool.
Change-Id: Ie8ed3d50e5ad26625ac9d1263f7e048edb5584c0