The file reorganization feature was implemented with backward compatibility.
Backward compatibility support will be deprecated in a future release.
Change the #pragma message to #warning for a smooth transition.
Change-Id: I85e14470cce0f3d7c14ecb40e0e9e8b29c977c9f
In the generated header file hsa_prof_str.h, the header file hsa_ostream_ops.h was included using angle brackets.
This requires compiling with the include path /opt/rocm-ver/include. Correct the usage by using double quotes.
Change-Id: Ie9f1fff78d16a6953a2c99056b2acef42e577204
When multiple ranks are used, each rank's first logical device always
has GPU ID 0, regardless of which physical device is selected with
CUDA_VISIBLE_DEVICES. Because of this, when merging trace files from
multiple ranks, GPU IDs from different processes may overlap.
The long-term solution is to use the KFD's gpu_id, which is stable
across APIs and processes. Unfortunately the gpu_id is not yet exposed
by the ROCr, so for now use the driver's node id.
Change-Id: I2f5af8d2a7e8a89efeb5e0a1b86bdfa547b25fc8
Fix the following error:
roctx.cpp:91:25: error: reinterpret_cast from 'const void *' to 'decltype(report_activity.load())' (aka 'int (*)(activity_domain_t, unsigned int, void *)') casts away qualifiers
report_activity.store(reinterpret_cast<decltype(report_activity.load())>(function),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
by replacing the 'const void *function' argument with the correct type.
Change-Id: I912239daf6f4a3f00fc753306b84833e5c75f74b
Strings ([const] char *, [const] char[]) passed as arguments to API
functions may not always contain printable characters. All string
arguments should be quoted and escaped in the trace logs.
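
A minimal sketch of the quoting and escaping described above. The helper name quote_and_escape and the exact escape set (\", \\, \n, \t, and \xHH for other non-printable bytes) are assumptions for illustration, not the tracer's actual implementation.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper: wraps the input in double quotes and escapes
// quotes, backslashes, and non-printable bytes.
inline std::string quote_and_escape(std::string_view input) {
  static constexpr char kHex[] = "0123456789abcdef";
  std::string out = "\"";
  for (unsigned char c : input) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n"; break;
      case '\t': out += "\\t"; break;
      default:
        if (std::isprint(c)) {
          out += static_cast<char>(c);
        } else {
          // Non-printable byte: emit it as \xHH.
          out += "\\x";
          out += kHex[c >> 4];
          out += kHex[c & 0xF];
        }
    }
  }
  out += '"';
  return out;
}
```

Escaping the quote character itself keeps the end of the argument unambiguous for any script that later parses the trace log.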
Change-Id: Ie39058f2190048b1a0090df16d9ac6bc6507e28a
Using a thread_local object is problematic as the thread local
destructors are called first before any global destructor, making
the object invalid while tearing down the process.
rocblas uses a global destructor to clean up the loaded HIP modules
and ends up calling hip_executable_destroy after the timestamp stack
is destructed. As a result the begin timestamp for that API function
is 0.
The solution is to store the phase_enter timestamp in the phase_data.
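
The idea can be sketched as follows, assuming a per-call record that the caller owns for the duration of the API call. The names Phase, PhaseData, ApiRecord and OnApiPhase are hypothetical and only illustrate the approach; they are not the tracer's types.

```cpp
#include <cassert>
#include <cstdint>

enum class Phase { kEnter, kExit };

// Hypothetical per-call data owned by the caller, so its lifetime does not
// depend on thread_local or global destructor ordering.
struct PhaseData {
  std::uint64_t begin_ns = 0;
};

struct ApiRecord {
  std::uint64_t begin_ns = 0;
  std::uint64_t end_ns = 0;
};

// Stores the begin timestamp on enter; fills the record on exit.
// Returns true when a complete record has been produced.
inline bool OnApiPhase(Phase phase, PhaseData& data, std::uint64_t now_ns,
                       ApiRecord& record) {
  if (phase == Phase::kEnter) {
    data.begin_ns = now_ns;
    return false;
  }
  record.begin_ns = data.begin_ns;
  record.end_ns = now_ns;
  return true;
}
```

Because the begin timestamp travels with the call's own data, a call made from a global destructor still sees a valid value.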
Change-Id: If143f4d123dfb111c72fb20365431d07e73fc570
Using rocprof with ROCP_MCOPY_DATA=1 while tracing HSA produces the
following error:
tblextr.py: Memcpy args "(0x7feb16a00000, 123handle=28593376125, 0x7feb12a00010, 123handle=27558560125, 4194304, 0, 0, 123handle=140661639440000125) = 1" cannot be identified
Profiling data corrupted: ' ./out/rpl_data_220930_143009_1826700/input_results_220930_143009/results.txt'
There are two issues:
1) The hsa_agent_t handle argument is misprinted: "123handle=...125"
Instead of printing '{' and '}', it prints '123' and '125'. The wrong
operator<<(unsigned char) is used and an integer value is printed
instead of a char.
Use std::operator<< instead of hsa_support::detail::operator<< to
print '{' and '}'
2) The result value is uninitialized and in some cases printed as a
negative integer value. The leading '-' is not matched by the
mem_manager regular expression for HSA API calls.
Correctly capture the HSA function's return value.
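
The first issue can be reproduced with a small sketch. The namespace detail and its operator<< overload below are stand-ins (assumptions) for an overload that prints character types as integers; they are not copied from the tracer sources.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace detail {

// Stand-in overload that prints a char as its integer value.
inline std::ostream& operator<<(std::ostream& out, const char& v) {
  out << static_cast<int>(v);
  return out;
}

// Inside this namespace, `out << '{'` selects the overload above, because a
// non-template function is preferred over the std:: function template.
inline std::string PrintWrong(unsigned long handle) {
  std::ostringstream out;
  out << '{' << "handle=" << handle << '}';
  return out.str();
}

// Calling std::operator<< explicitly prints the characters themselves.
inline std::string PrintFixed(unsigned long handle) {
  std::ostringstream out;
  std::operator<<(out, '{');
  out << "handle=" << handle;
  std::operator<<(out, '}');
  return out.str();
}

}  // namespace detail
```

In ASCII, '{' is 123 and '}' is 125, which matches the "123handle=...125" text in the error message.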
Change-Id: If13a1e62eeb4e598447c4b90d53d1b2e3b408696
The timestamps coming from the HIP runtime for asynchronous memory
copies are corrupted (begin > end) because the HSA setting to record
timestamps is turned off by the tracer's HSA intercept.
The solution is to intercept hsa_amd_profiling_async_copy_enable and
remember the application/runtime's request so that it can be ORed with
IsEnabled(ACTIVITY_DOMAIN_HSA_OPS, HSA_OP_ID_COPY).
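
A minimal sketch of the ORed setting, assuming the intercept records the application's request in a small state object. The class and method names are hypothetical; only the OR of the two requests is taken from the description above.

```cpp
#include <atomic>
#include <cassert>

class AsyncCopyProfilingState {
 public:
  // Would be called from the intercepted hsa_amd_profiling_async_copy_enable.
  void SetApplicationRequest(bool enable) { app_requested_.store(enable); }

  // Would mirror IsEnabled(ACTIVITY_DOMAIN_HSA_OPS, HSA_OP_ID_COPY).
  void SetTracerEnabled(bool enable) { tracer_enabled_.store(enable); }

  // Timestamps must stay on if either the application/runtime or the
  // tracer asked for them.
  bool ShouldRecordTimestamps() const {
    return app_requested_.load() || tracer_enabled_.load();
  }

 private:
  std::atomic<bool> app_requested_{false};
  std::atomic<bool> tracer_enabled_{false};
};
```

With this, turning off the tracer's copy activity no longer disables timestamps that the HIP runtime requested for itself.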
Change-Id: Ib687cbf36711563e86c2bb8bc934c7c51572bfde
The tracer tool needs to remember the begin timestamps for API
callbacks, and uses a thread_local std::stack for that purpose.
The issue with thread_local objects is that they are destructed
before anything else when the main thread exits. To work around
that issue, we use a "safe" stack in the roctracer API.
Use the same "safe" stack in the tracer tool.
Change-Id: I0d69d4eb44f0205f4102d0d5ef9803a1ec1800a5
rocprof errors out with the following message:
symbol lookup 'KernelNameRef' failed: libamdhip64.so.5: undefined \
symbol: KernelNameRef
The HipLoader is incorrectly looking for a KernelNameRef symbol
instead of hipKernelNameRef.
Fixed the typo: KernelNameRef -> hipKernelNameRef.
Change-Id: Ia4860e1669707b0c83d67e71b78d362b07a6aaa7
Starting with gcc-11 (verified with gcc-12 as well), an array
out-of-bounds subscript error is reported for accessing the registration
table element at the operation ID index. Validating the index in the
function calling Register/Unregister does not quiet the warning/error
in release builds, so, for gcc-11 and gcc-12, we disable that warning
just for the RegistrationTable class.
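
A sketch of scoping the suppression to one class, assuming the diagnostic involved is -Warray-bounds. The RegistrationTable below is a simplified stand-in written for illustration, not the class from the sources.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Suppress the diagnostic only for gcc-11 and gcc-12, and only around
// this class.
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ == 11 || __GNUC__ == 12)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
#endif

template <typename T, std::size_t N>
class RegistrationTable {
 public:
  bool Register(std::size_t id, const T& value) {
    if (id >= N) return false;  // index is still validated at run time
    table_[id] = value;
    return true;
  }
  bool Unregister(std::size_t id) {
    if (id >= N) return false;
    table_[id].reset();
    return true;
  }
  std::optional<T> Get(std::size_t id) const {
    if (id >= N) return std::nullopt;
    return table_[id];
  }

 private:
  std::array<std::optional<T>, N> table_{};
};

#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ == 11 || __GNUC__ == 12)
#pragma GCC diagnostic pop
#endif
```

The push/pop pair keeps the warning active for the rest of the translation unit.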
Change-Id: I6bc4a02aa072cfa8905ecde5e3960aebf32fc912
Use #include "header" instead of #include <header> so that the header
files are found when the application #includes <roctracer/roctracer.h>
with -I /opt/rocm/include.
Change-Id: I24feac9a5030d3600aee98084340e246c3990db5
The post-processing script cannot handle HIP ops without a correlation
ID. The correlation ID is needed to connect the record to a HIP stream
and originating thread.
This issue was exposed by a change to the tracer API to report
asynchronous activities even if their originating synchronous API
activity (callback) is not enabled. This was a flaw in the API.
Also fix an issue with the API filtering. Undefined API names should
not cause an exception; they should be ignored.
Change-Id: Iab2221af6180ade2b9c2eb10c256c3a73d872e9f
Default to the HSA runtime's hsa_system_get_info if the saved HSA
functions table is not yet initialized.
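
The fallback can be sketched as below. All names and the simplified signature are assumptions: RuntimeSystemGetInfo stands in for the runtime's hsa_system_get_info, and SavedApiTable for the saved HSA functions table.

```cpp
#include <cassert>

// Simplified stand-in for the real function signature.
using SystemGetInfoFn = int (*)(int attribute, void* value);

// Stand-in for the runtime's directly linked entry point.
inline int RuntimeSystemGetInfo(int /*attribute*/, void* /*value*/) {
  return 0;
}

// Stand-in for the saved functions table; null until initialized.
struct SavedApiTable {
  SystemGetInfoFn system_get_info = nullptr;
};

// Use the saved entry when available, otherwise default to the runtime.
inline SystemGetInfoFn ResolveSystemGetInfo(const SavedApiTable& table) {
  return table.system_get_info != nullptr ? table.system_get_info
                                          : &RuntimeSystemGetInfo;
}
```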
Change-Id: I3659095a5ad662f7ca8b0d92bd035901c6d66bb0
Instead of dlopen'ing a library with RTLD_NOLOAD (for example
libamdhip64.so) and relying on the dynamic linker search path, search
through the already loaded shared objects for a library with a matching name.
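
One way to search the already loaded shared objects on Linux is dl_iterate_phdr. The sketch below assumes a glibc-style environment, and the prefix match on the file name is an illustrative choice; the helper names are hypothetical.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info (Linux)

#include <cassert>
#include <cstddef>
#include <string>

struct SearchContext {
  std::string wanted;      // for example "libamdhip64.so"
  std::string found_path;  // set when a loaded object matches
};

// Callback invoked once per loaded object; returning non-zero stops the walk.
inline int MatchLoadedObject(dl_phdr_info* info, std::size_t /*size*/,
                             void* data) {
  auto* ctx = static_cast<SearchContext*>(data);
  const std::string path = info->dlpi_name ? info->dlpi_name : "";
  const std::size_t slash = path.rfind('/');
  const std::string base =
      slash == std::string::npos ? path : path.substr(slash + 1);
  // Prefix match, so "libfoo.so" also matches "libfoo.so.5".
  if (!ctx->wanted.empty() &&
      base.compare(0, ctx->wanted.size(), ctx->wanted) == 0) {
    ctx->found_path = path;
    return 1;
  }
  return 0;
}

// Returns the path of the first loaded object whose file name starts with
// `name`, or an empty string if none is loaded.
inline std::string FindLoadedLibrary(const std::string& name) {
  SearchContext ctx{name, {}};
  dl_iterate_phdr(MatchLoadedObject, &ctx);
  return ctx.found_path;
}
```

The returned path is the one the process actually loaded, so it does not depend on the current dynamic linker search path.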
Change-Id: I3e74d432bd7ca68df8927ca435b290e86aaaf9e9
Remove the hipInitActivityCallback and use the new hipRegister/
RemoveActivityCallback which allows distinct memory pools to be used
for HIP_OPS activities.
Enable the multi_pool_activities test.
Change-Id: I6f6feaedecc9c36285bea975caf24dbf8f5f624b
The code is easier to read when HIPActivityCallbackTracker
enable/disable_check are called directly. Both enable/disable_check
return the new mask, and the check whether a callback is already
installed is clearer.
Change-Id: Ic90d34489b5b4d9929dc08b4d9e93cc974b136b1
The HIP runtime is now allocating the hip_api_data and record on its
stack so we don't need the thread local record_data_pair stack anymore.
Refactor the API callback function to handle both the case where
synchronous user callbacks are requested and the case where asynchronous
records are requested (enable_callback & enable_activity respectively).
If the callback argument (memory pool) is not null, then activity
records are requested.
Remove CorrelationIdRegister and CorrelationIdLookup. These were used
by the HIP runtime to associate a HIP record id to a ROCtracer
correlation id. Instead, the HIP runtime is now using the correlation
ID returned in the hip_api_data_t.
Added a test to check enabling/disabling concurrent callbacks and
activities.
Change-Id: I5850cfead9861eb3602a3e8fcb7b22580d5fc979
These functions have little value as it is very unlikely an application
would want to enable all the domains.
Change-Id: I4743e8ddf6743e60c95c7ba5240950d2ef734301
This test checks that asynchronous activities can be enabled in distinct
memory pools. It enables activity reporting for HIP kernel dispatches in
one memory pool, and memory copy reporting in another memory pool.
The output of this test to stdout should be a series of kernel dispatch
records (10) followed by a series of memory copy records (10). The
records should not be interleaved.
Change-Id: Idb5cca7e650b2312a1955909932364f914737856
The destructors of the plugin's file-scope global variables could run
before roctracer_plugin_finalize is called, leaving the global variables
invalid by the time roctracer_plugin_finalize is called.
To avoid this issue, remove all non-POD global variables from the file
plugin.
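
A sketch of the pattern, assuming the plugin keeps its state behind a raw pointer that is created and destroyed explicitly. The names below are hypothetical and are not the file plugin's actual code.

```cpp
#include <cassert>
#include <string>

// Hypothetical plugin state; it may hold non-POD members because it is
// heap-allocated and never destroyed by a global destructor.
struct PluginState {
  std::string output_prefix;
};

// A raw pointer is trivially destructible: nothing runs for it at exit.
static PluginState* g_state = nullptr;

int PluginInitializeSketch(const char* prefix) {
  if (g_state != nullptr) return -1;  // already initialized
  g_state = new PluginState{prefix ? prefix : ""};
  return 0;
}

void PluginFinalizeSketch() {
  delete g_state;  // state is destroyed here, at a well-defined point
  g_state = nullptr;
}

bool PluginIsInitializedSketch() { return g_state != nullptr; }
```

The state's lifetime is then bounded by the initialize and finalize calls rather than by static destruction order.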
Change-Id: I4b620d67d460d9c99adfd81cbf46b0e64540c503
This function has been deprecated since ROCm-2.9. Use ROCTX's
roctxMark(const char* message) as a replacement for roctracer_mark.
Change-Id: Ie4aeae1db238453fc4451746cc9a338032ba817f
- Multithreaded Applications and plugin destruction
- Fixing Async-copy trace in file plugin
- Adding the assert checkups for every trace buffer flush function
Change-Id: I96e096fd7ee2604931200a0b446edb5ce49959dd
Don't set the color variables if tput is not available, not working, or
if ncolors < 8.
Move the color variables outside of eval to avoid calling tput over and
over again.
Change-Id: Id51a742b77ad0f7c99c1c7c5d05bed0f423b75de
- Added File plugin as the default plugin
- Moved the flush functions to the plugins
- Improved the flush to file implementation
Change-Id: I80dd448eb8147a8ea4aa63b39bd1d0a4baf7252b
This test verifies that the callback argument matches the callback
function, since a race condition while setting and reading the pair
could result in mismatched arguments.
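
The property under test can be illustrated with a sketch in which the function and its argument are always written and read together under one lock. The CallbackSlot class is an assumption for illustration and does not describe how the library stores the pair.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

using Callback = void (*)(void* arg);

class CallbackSlot {
 public:
  // Updates the function and its argument as one unit.
  void Set(Callback callback, void* arg) {
    std::lock_guard<std::mutex> lock(mutex_);
    callback_ = callback;
    arg_ = arg;
  }

  // Returns a consistent (function, argument) pair.
  std::pair<Callback, void*> Get() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return {callback_, arg_};
  }

 private:
  mutable std::mutex mutex_;
  Callback callback_ = nullptr;
  void* arg_ = nullptr;
};
```

If the two fields were updated separately, a reader could observe the new function with the old argument, which is the mismatch the test looks for.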
Change-Id: I2fe49d98d19bb780b6956ea6718762cfa0de93f8
Intercept the first call to hsa_iterate_agents in order to number them.
The index assigned to agents will be used by a future commit.
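
A sketch of the numbering, assuming agents are identified by their 64-bit handle and indexed in the order in which they are first seen. The AgentIndexer class is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

class AgentIndexer {
 public:
  // Returns the index for the handle, assigning the next free index the
  // first time the handle is seen.
  std::uint32_t IndexOf(std::uint64_t agent_handle) {
    auto it = index_.find(agent_handle);
    if (it != index_.end()) return it->second;
    const auto next = static_cast<std::uint32_t>(index_.size());
    index_.emplace(agent_handle, next);
    return next;
  }

 private:
  std::unordered_map<std::uint64_t, std::uint32_t> index_;
};
```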
Change-Id: I8db365f8fe913b6cde16a4dccb9bf09600846521
Move the HSA intercept to the OnLoad function, so that it is available
as soon as the ROCR is loaded.
Layer the HSA API wrappers on top of the basic HSA activity intercept.
Change-Id: Ie636d59755543cda181e76ec29f0b55081136b63
Clean up the code and optimize the kernel name search in the API
callback, making sure the kernel name is retrieved accurately for the
HIP functions that have kernel names.
Change-Id: Ie9ab917c895748bfb8eee9ddfcbcad81a0b9a9fa
Make sure duplicates are not counted for load_unload_reload_trace, and
fix the ignore-count option in check_trace.py.
Change-Id: I9e674aa624ec3b473bb7c6cc95260e240204627f
When separate debug info is requested, the test package
generation fails because /usr/bin/objcopy does not understand
the HSA code object format. We need a workaround to get
past this issue.
Change-Id: I9a307fcf532ce8219a9301850aae972303d19990