Add @ammarwa and @bgopesh as CODEOWNERS.
This is for GitHub upstream.
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Change-Id: I788f5ad550f91e8d3ce45bbeb527283bf11c4fd7
[ROCm/roctracer commit: 5d066e5286]
[SWDEV-418917] reported that roctracer was introducing timing skew.
Most of the problem seems to stem from outrunning the double-buffering
scheme that we use in memory_pool (partly because file writing is slow).
A semi-quick fix that may last until RocProf v2 is complete is to allow
the buffer size to be adjusted. This change introduces the
ROCTRACER_BUFFER_SIZE environment variable, which sets the buffer size
of the tracer tool.
Increasing the buffer size gave an ~8% reduction in execution time when
timing on the program side. It should also reduce the frequency of large
delays when we outrun the buffer. Note: increasing the size dramatically
(i.e. above 50MB) can cause slow startups.
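As an illustration only, reading such a setting could look like the sketch below. The helper name, the fallback behaviour, and the assumption that the value is a plain positive decimal number are all hypothetical; the commit does not state how the tool parses the variable or which unit it uses.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <exception>
#include <string>

// Hypothetical helper: returns the value of the environment variable
// `var_name` as a size, or `default_size` when the variable is unset,
// empty, zero, or not a plain decimal number.
std::size_t BufferSizeFromEnv(const char* var_name, std::size_t default_size) {
  const char* raw = std::getenv(var_name);
  if (raw == nullptr) return default_size;

  const std::string text(raw);
  // Reject empty values and anything that does not start with a digit
  // (this also rejects negative numbers and leading whitespace).
  if (text.empty() || !std::isdigit(static_cast<unsigned char>(text[0]))) {
    return default_size;
  }

  try {
    std::size_t consumed = 0;
    const unsigned long long value = std::stoull(text, &consumed, 10);
    // Require the whole string to be numeric and the size to be non-zero.
    if (consumed != text.size() || value == 0) return default_size;
    return static_cast<std::size_t>(value);
  } catch (const std::exception&) {
    // Out-of-range or otherwise unparsable input falls back to the default.
    return default_size;
  }
}
```

A caller might use it as BufferSizeFromEnv("ROCTRACER_BUFFER_SIZE", some_default), where some_default is whatever size the tool used before this change.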
Change-Id: I98c4316cfe93a043623ae2669cfe1a5abb55c990
[ROCm/roctracer commit: 38ba63030d]
ROCm packaging uses ROCM_PATCH_VERSION as the standard option for adding the patch version, which identifies the ROCm release for the different library versions.
Change-Id: I1edce84d2963d495c55c83cc0697761d7f696c92
[ROCm/roctracer commit: 421febd4bf]
The RPATH of libraries installed in /opt/rocm-ver/lib/roctracer should be: $ORIGIN:$ORIGIN/..
The CMake shared linker flags already provide the rpath $ORIGIN.
This patch appends the rpath $ORIGIN/.. for the component-specific libraries.
Change-Id: Ied2bcb57bf0dd38ee3d1a946a5afc1bb182ff619
[ROCm/roctracer commit: 6fbf7673aa]
Using the wrapper header files results in a #warning message by default.
Change-Id: Ib8a05d11f2391dfcdac8601da26e1096821cd555
[ROCm/roctracer commit: 245eafea4c]
Using the backward compatibility paths produces an #error message. A compile-time option was added to enable or disable the #error message.
When it is disabled, a #warning message is produced instead.
Change-Id: I6abc236e810ccc38d3636074e0e8f5a9657c2e9a
[ROCm/roctracer commit: ea061be2d1]
SWDEV-356024 - Development package name will have suffix dev or devel based on OS
Devel package contents - Header files, name link of public library files, html files and roctracer manual file
Runtime package contents - Versioned public library files, private library files and license file
Change-Id: I8ced3eab5d8824a66be39b9e777368506516b155
[ROCm/roctracer commit: 9acba8b4a1]
Older GNU C++ runtimes cannot demangle symbol names generated by recent
versions of LLVM. To work around this issue, use the LLVM demangler to
process kernel names.
Change-Id: I595f900d06360bb5acce542955cf1f5aed81f00e
[ROCm/roctracer commit: 91b449d0d5]
The file reorganization feature was implemented with backward compatibility.
The backward compatibility support will be deprecated in a future release.
Changed the #pragma message to #warning for a smooth transition.
Change-Id: I85e14470cce0f3d7c14ecb40e0e9e8b29c977c9f
[ROCm/roctracer commit: ca1726f80d]
In the generated header file hsa_prof_str.h, the header file hsa_ostream_ops.h was included using angle brackets.
As a result, compilation depended on the include path /opt/rocm-ver/include. Corrected the include to use double quotes.
Change-Id: Ie9f1fff78d16a6953a2c99056b2acef42e577204
[ROCm/roctracer commit: b1585c983d]
When multiple ranks are used, each rank's first logical device always
has GPU ID 0, regardless of which physical device is selected with
CUDA_VISIBLE_DEVICES. Because of this, when merging trace files from
multiple ranks, GPU IDs from different processes may overlap.
The long term solution is to use the KFD's gpu_id which is stable
across APIs and processes. Unfortunately the gpu_id is not yet exposed
by the ROCr, so for now use the driver's node id.
Change-Id: I2f5af8d2a7e8a89efeb5e0a1b86bdfa547b25fc8
[ROCm/roctracer commit: 799f0323cd]
Strings ([const] char *, [const] char[]) passed as arguments to API
functions may not always contain printable characters. All string
arguments should be quoted and escaped in the trace logs.
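A minimal sketch of this kind of quoting and escaping is shown below. The function name and the exact escape format (C-style escapes, with \xNN for other non-printable bytes) are assumptions chosen for illustration, not the format used by the tracer.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: wraps `in` in double quotes and escapes quotes,
// backslashes, common control characters, and any other non-printable byte.
std::string QuoteAndEscape(const std::string& in) {
  std::string out = "\"";
  for (unsigned char c : in) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c >= 0x20 && c < 0x7f) {
          // Printable ASCII is copied through unchanged.
          out += static_cast<char>(c);
        } else {
          // Everything else becomes a two-digit hexadecimal escape.
          char buf[5];
          std::snprintf(buf, sizeof(buf), "\\x%02x", static_cast<unsigned>(c));
          out += buf;
        }
    }
  }
  out += '"';
  return out;
}
```

Quoting every string argument this way keeps each trace record on a single line and lets a parser find the argument boundaries even when the string contains commas, quotes, or binary data.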
Change-Id: Ie39058f2190048b1a0090df16d9ac6bc6507e28a
[ROCm/roctracer commit: b556f8681e]
Using a thread_local object is problematic as the thread local
destructors are called first before any global destructor, making
the object invalid while tearing down the process.
rocblas uses a global destructor to clean up the loaded HIP modules
and ends up calling hip_executable_destroy after the timestamp stack
is destructed. As a result the begin timestamp for that API function
is 0.
The solution is to store the phase_enter timestamp in the phase_data.
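The idea can be sketched as follows. All type and function names here are hypothetical stand-ins, not the roctracer definitions; the point is only that the begin timestamp travels with the per-call data instead of living in a thread_local container that may already be destroyed during process teardown.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical per-call data that is handed to both the enter and the exit
// phase of the same API call.
struct DemoPhaseData {
  std::uint64_t enter_ns = 0;
};

enum class DemoPhase { kEnter, kExit };

struct DemoRecord {
  std::uint64_t begin_ns = 0;
  std::uint64_t end_ns = 0;
};

inline std::uint64_t DemoNowNs() {
  using namespace std::chrono;
  return static_cast<std::uint64_t>(
      duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// On enter, stash the timestamp in the per-call data. On exit, read it back
// and fill in the record. No thread_local state is involved, so the callback
// keeps working while global destructors run.
// Returns true when `record` has been completed.
bool DemoApiCallback(DemoPhase phase, DemoPhaseData* data, DemoRecord* record) {
  if (phase == DemoPhase::kEnter) {
    data->enter_ns = DemoNowNs();
    return false;
  }
  record->begin_ns = data->enter_ns;
  record->end_ns = DemoNowNs();
  return true;
}
```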
Change-Id: If143f4d123dfb111c72fb20365431d07e73fc570
[ROCm/roctracer commit: 8a575d8d6e]
Using rocprof with ROCP_MCOPY_DATA=1 while tracing HSA produces the
following error:
tblextr.py: Memcpy args "(0x7feb16a00000, 123handle=28593376125, 0x7feb12a00010, 123handle=27558560125, 4194304, 0, 0, 123handle=140661639440000125) = 1" cannot be identified
Profiling data corrupted: ' ./out/rpl_data_220930_143009_1826700/input_results_220930_143009/results.txt'
There are two issues:
1) The hsa_agent_t handle argument is misprinted: "123handle=...125"
Instead of printing '{' and '}', it prints '123' and '125'. The wrong
operator<<(unsigned char) is used and an integer value is printed
instead of a char.
Use std::operator<< instead of hsa_support::detail::operator<< to
print '{' and '}'
2) The result value is uninitialized and in some cases printed as a
negative integer value. The leading '-' is not matched by the
mem_manager regular expression for HSA API calls.
Correctly capture the HSA function's return value.
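The first issue can be reproduced in isolation with the sketch below. The namespace and its overload are hypothetical stand-ins for the hsa_support::detail operator described above; they only demonstrate how a namespace-local operator<< can win overload resolution for a character and print its integer value.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

namespace demo_detail {

// Hypothetical overload that prints character-typed values as integers.
// Being a non-template exact match, it is preferred over the std:: character
// inserters for unqualified `out << '{'` inside this namespace.
inline std::ostream& operator<<(std::ostream& out, char value) {
  return out << static_cast<int>(value);
}

// Buggy: '{' and '}' go through demo_detail::operator<< and print 123 / 125.
inline std::string FormatBuggy(std::uint64_t handle) {
  std::ostringstream buffer;
  std::ostream& out = buffer;
  out << '{' << "handle=" << handle << '}';
  return buffer.str();
}

// Fixed: call std::operator<< explicitly so the braces print as characters.
inline std::string FormatFixed(std::uint64_t handle) {
  std::ostringstream buffer;
  std::ostream& out = buffer;
  std::operator<<(out, '{');
  out << "handle=" << handle;
  std::operator<<(out, '}');
  return buffer.str();
}

}  // namespace demo_detail
```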
Change-Id: If13a1e62eeb4e598447c4b90d53d1b2e3b408696
[ROCm/roctracer commit: 6416434d3b]
The timestamps coming from the HIP runtime for asynchronous memory
copies are corrupted (begin > end) because the HSA setting to record
timestamps is turned off by the tracer's HSA intercept.
The solution is to intercept hsa_amd_profiling_async_copy_enable and
remember the application/runtime's request so that it can be ORed with
IsEnabled(ACTIVITY_DOMAIN_HSA_OPS, HSA_OP_ID_COPY).
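A rough sketch of that bookkeeping is given below. The class and method names are hypothetical, and the tracer's own IsEnabled(...) check is represented by a plain boolean parameter; the real intercept would forward the combined value to the HSA runtime.

```cpp
#include <atomic>

// Hypothetical state kept by the intercept layer.
class DemoAsyncCopyProfiling {
 public:
  // Called from the intercepted enable function. Remembers what the
  // application/runtime asked for and returns the value that should
  // actually be forwarded to the underlying runtime.
  bool OnApplicationRequest(bool enable, bool tracer_wants_copy_ops) {
    app_requested_.store(enable, std::memory_order_relaxed);
    return Effective(tracer_wants_copy_ops);
  }

  // Timestamp recording stays on if either the application or the tracer
  // needs it, so the tracer can no longer turn it off under the runtime.
  bool Effective(bool tracer_wants_copy_ops) const {
    return app_requested_.load(std::memory_order_relaxed) || tracer_wants_copy_ops;
  }

 private:
  std::atomic<bool> app_requested_{false};
};
```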
Change-Id: Ib687cbf36711563e86c2bb8bc934c7c51572bfde
[ROCm/roctracer commit: 329c0467cb]
The tracer tool needs to remember the begin timestamps for API
callbacks, and uses a thread_local std::stack for that purpose.
The issue with thread_local objects is that they are destructed
before anything else when the main thread exits. To work around
that issue, we use a "safe" stack in the roctracer API.
Use the same "safe" stack in the tracer tool.
Change-Id: I0d69d4eb44f0205f4102d0d5ef9803a1ec1800a5
[ROCm/roctracer commit: b664937ebd]
rocprof errors out with the following message:
symbol lookup 'KernelNameRef' failed: libamdhip64.so.5: undefined \
symbol: KernelNameRef
The HipLoader is incorrectly looking for a KernelNameRef symbol
instead of hipKernelNameRef.
Fixed the typo: KernelNameRef -> hipKernelNameRef.
Change-Id: Ia4860e1669707b0c83d67e71b78d362b07a6aaa7
[ROCm/roctracer commit: a287f20961]
Starting with gcc-11 (verified with gcc-12 as well), an array
out-of-bounds subscript error is reported for accessing the registration
table element at the operation ID index. Validating the index in the
function calling Register/Unregister does not quiet the warning/error
in release builds, so, for gcc-11 and gcc-12, we disable that warning
just for the RegistrationTable class.
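The scoping of such a suppression might look like the sketch below. The class is a simplified, hypothetical stand-in for RegistrationTable, and the guard macro name is made up; only the GCC diagnostic pragmas themselves are standard compiler features.

```cpp
#include <array>
#include <cstddef>

// Disable -Warray-bounds only for gcc-11 and gcc-12, and only around the
// class below. Other compilers and versions keep the warning enabled.
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ == 11 || __GNUC__ == 12)
#define DEMO_ARRAY_BOUNDS_SUPPRESSED 1
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
#endif

// Hypothetical, simplified table indexed by operation ID.
template <std::size_t N>
class DemoRegistrationTable {
 public:
  bool Register(std::size_t op) {
    if (op >= N) return false;  // the index is still validated at run time
    table_[op] = true;
    return true;
  }

  bool Unregister(std::size_t op) {
    if (op >= N) return false;
    table_[op] = false;
    return true;
  }

  bool IsRegistered(std::size_t op) const { return op < N && table_[op]; }

 private:
  std::array<bool, N> table_{};
};

#ifdef DEMO_ARRAY_BOUNDS_SUPPRESSED
#pragma GCC diagnostic pop
#undef DEMO_ARRAY_BOUNDS_SUPPRESSED
#endif
```

Using push/pop keeps the suppression local, so out-of-bounds diagnostics in the rest of the translation unit are unaffected.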
Change-Id: I6bc4a02aa072cfa8905ecde5e3960aebf32fc912
[ROCm/roctracer commit: 67ce5fae13]
Use #include "header" instead of #include <header> so that the header
files are found when the application #includes <roctracer/roctracer.h>
with -I /opt/rocm/include.
Change-Id: I24feac9a5030d3600aee98084340e246c3990db5
[ROCm/roctracer commit: 05ee3ff973]
The post-processing script cannot handle HIP ops without a correlation
ID. The correlation ID is needed to connect the record to a HIP stream
and originating thread.
This issue was exposed by a change to the tracer API to report
asynchronous activities even if their originating synchronous API
activity (callback) is not enabled. This was a flaw in the API.
Also fix an issue with the API filtering. Undefined API names should
not cause an exception; they should be ignored.
Change-Id: Iab2221af6180ade2b9c2eb10c256c3a73d872e9f
[ROCm/roctracer commit: 4856d33959]
Default to the HSA runtime's hsa_system_get_info if the saved HSA
functions table is not yet initialized.
Change-Id: I3659095a5ad662f7ca8b0d92bd035901c6d66bb0
[ROCm/roctracer commit: 87ffbd27f4]
Instead of dlopen'ing a library (for example libamdhip64.so) with
RTLD_NOLOAD and relying on the dynamic linker search path, search through
the already loaded shared objects for a library with a matching name.
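One way to walk the loaded shared objects on Linux is dl_iterate_phdr, as in the sketch below. Whether the commit uses this call is an assumption; the helper names and the prefix-match rule on the file name are hypothetical.

```cpp
#include <link.h>

#include <cstddef>
#include <cstring>
#include <string>

namespace {

struct DemoSearchState {
  const char* name;  // library name prefix to look for, e.g. "libamdhip64.so"
  std::string path;  // filled in with the full path of the first match
};

// Callback invoked once per loaded object. Returning non-zero stops the walk.
int DemoVisitObject(struct dl_phdr_info* info, std::size_t /*size*/, void* data) {
  auto* state = static_cast<DemoSearchState*>(data);
  const char* path = info->dlpi_name;
  if (path == nullptr || *path == '\0') return 0;  // main program has no name

  // Compare against the file name only, ignoring the directory part.
  const char* slash = std::strrchr(path, '/');
  const char* base = (slash != nullptr) ? slash + 1 : path;
  if (std::strncmp(base, state->name, std::strlen(state->name)) == 0) {
    state->path = path;
    return 1;
  }
  return 0;
}

}  // namespace

// Returns the path of an already loaded shared object whose file name starts
// with `name`, or an empty string if no such object is loaded.
std::string DemoFindLoadedLibrary(const char* name) {
  DemoSearchState state{name, {}};
  dl_iterate_phdr(DemoVisitObject, &state);
  return state.path;
}
```

The returned path could then be passed to dlopen to get a handle to exactly the object that is already mapped, independent of the linker search path.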
Change-Id: I3e74d432bd7ca68df8927ca435b290e86aaaf9e9
[ROCm/roctracer commit: db69cc1c9f]