Fixed exception thrown when ROCP_HSA_INTERCEPT is not set or is set to 0.
Fixed ROCm hsa_init() failing with error 4096 when trying to read hardware performance counters.
Fixed LD_LIBRARY_PATH to include the necessary library.
Change-Id: Idcb7ff807a79f4267374c34041d3bca33d85f532
[ROCm/rocprofiler commit: a8b5d6cf33]
Changed var_pattern in tblextr.py to also match patterns like "name[0]".
Change-Id: Ibe1c512595cfbdcaca8fa5bddceb3f6a570caf43
[ROCm/rocprofiler commit: ff43ca1542]
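As a rough illustration of the var_pattern change above, the sketch below shows a regular expression that accepts both plain names and indexed names such as "name[0]". The pattern, the "name (value)" line format, and the parse_var helper are assumptions made for this example; they are not the actual contents of tblextr.py.

```python
import re

# Hypothetical pattern, not the actual var_pattern from tblextr.py:
# a name, an optional "[index]" suffix, then an integer value in parentheses.
var_pattern = re.compile(r"^\s*(\w+(?:\[\d+\])?)\s*\((-?\d+)\)\s*$")


def parse_var(line):
    """Return (name, value) for a 'name (value)' line, or None if it does not match."""
    m = var_pattern.match(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2))
```

The optional non-capturing group `(?:\[\d+\])?` is what lets an indexed name through while still rejecting a malformed suffix such as "name[".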
Changed derived metrics from int64 to double.
Fixed standalone test broken by the int64 to float change.
Fixed intercept test broken by the int64 to float change.
Change-Id: I49631c187406ae9dd94a869b3bb13772012e8cdf
[ROCm/rocprofiler commit: f9017cbdc5]
Instead of detecting files (header/library), use cmake's find_package to
locate the required dependencies (hsa-runtime64 and hsakmt).
Adding hsa-runtime64::hsa-runtime64 and hsakmt::hsakmt to
target_link_libraries also takes care of adding their interface include
directories to the search path.
Change-Id: I64eb77c97dac7982ac96d3158ad57df776cc0b53
[ROCm/rocprofiler commit: acb246f788]
An L2 flush is triggered by the explicit cache flush PM4 packet in the
aqlprofile packets sent to the GPU. This cache flush synchronizes the CPU
and GPU so that performance counters copied to the profile output buffer
are visible to the CPU. To get rid of this cache flush, the following is
done:
1. The explicit cache flush packet is removed from the aqlprofile code
(a separate commit to aqlprofile).
2. This commit changes the profile output buffer to use kernarg
memory, since it is uncached for the GPU.
After these changes, profile counter values copied by the GPU to the
output buffer are guaranteed to be visible to the CPU.
Change-Id: Ie953949c85fbee2f4369f1de966bcfb33daec084
[ROCm/rocprofiler commit: 2b79931631]
Removed the old code that tried to locate libhsakmt.so.1, since it is replaced by the libhsakmt.a static library.
Change-Id: Icc5a0f6ead285e2406e6e83614e536184e3a2663
[ROCm/rocprofiler commit: 14b62557d0]
Add roctx_trace to the list of files that need to be merged when
aggregating results from multiple runs.
Change-Id: I5810be9e9220765ed8e8a84eca854131e97e61b1
[ROCm/rocprofiler commit: 4ab94c410a]
Added support for launch kernel functions to fill_api_db.
Added support for hipMemcpyToSymbol in add_memcpy.
Added support for counting hsa_amd_memory_pool_allocate as a source of allocations.
Change-Id: I68806106324b19ca6f09d413df37c27582be2f51
[ROCm/rocprofiler commit: 804e063eda]
When building the JSON data flow, each from_us_list entry is (timestamp, stream_id, thread_id).
stream_id used to be interpreted as from_tid and thread_id as to_tid, which is not correct:
stream_id is always the destination and thread_id is the initiator (source).
Change-Id: I2f5bb86a387b4003b17271c90bdf9de4b59a79bf
[ROCm/rocprofiler commit: 244dadcb85]
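The direction fix described in the data flow commit above can be sketched as follows. This is a minimal illustration only: build_flow_events is a hypothetical helper, and the event dictionaries are loosely modeled on Chrome trace flow events ("s" for start, "f" for finish), which is an assumption rather than the exact output written by the script.

```python
def build_flow_events(from_us_list):
    """Build start/finish flow-event pairs from (timestamp, stream_id, thread_id) tuples."""
    events = []
    for flow_id, (ts, stream_id, thread_id) in enumerate(from_us_list):
        # The thread initiates the operation, so it is the source of the arrow.
        events.append({"ph": "s", "id": flow_id, "ts": ts, "tid": thread_id})
        # The stream receives the operation, so it is the destination.
        events.append({"ph": "f", "id": flow_id, "ts": ts, "tid": stream_id})
    return events
```

For an entry such as (100, 5, 1234), the start event carries tid 1234 and the finish event carries tid 5, so the arrow runs from the thread to the stream.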
Marker events inside hcc_ops_trace.txt are from barriers so they are not meant to be stored in ops_patch_data map.
Added support for hipMemset events which are a kind of memory copy.
Change-Id: I213fe959bcd35ff0371613ba5bffd95bc53e06b5
[ROCm/rocprofiler commit: caa5f32300]
recordid cannot be just a counter. The code removed here was doing
just that, i.e. incrementing a counter. recordid has to come
from the recvals data structure. That code had been left in place since
Evgeny and Rachida were prototyping this feature.
I am not sure why it was not spotted before.
Change-Id: Ia867066dcfca083fcd4111f2aefc2fec88c26314
[ROCm/rocprofiler commit: 4ba91a972c]
1st issue was that one of the ostream ops failed to print the
content of the struct.
2nd issue: get_ptr_type was called with the src/dest pointers as
arguments, while it should be given the agent pointers for src/dest.
3rd issue: the memcopies map used (recordid, procid, is_async)
as a key, but this is not enough because some copies share the same key,
so I added begin/end timestamps as a way to distinguish between them.
Change-Id: I7c6e80e74e30ea572f21612aaf0cf7efec6e91e6
[ROCm/rocprofiler commit: 761bd6a86b]
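The key collision behind the third issue in the commit above can be illustrated with a small sketch. The register_copy helper, the argument names, and the sample values are hypothetical; only the idea of extending the (recordid, procid, is_async) key with begin/end timestamps comes from the commit message.

```python
def register_copy(memcopies, record_id, proc_id, is_async, begin_ts, end_ts, info):
    """Store one copy under a key that includes its begin/end timestamps."""
    key = (record_id, proc_id, is_async, begin_ts, end_ts)
    memcopies[key] = info
    return key


memcopies = {}
# Two copies that share (record_id, proc_id, is_async) but differ in time.
register_copy(memcopies, 7, 4321, False, 1000, 1500, "first copy")
register_copy(memcopies, 7, 4321, False, 2000, 2600, "second copy")
```

With only the three-field key, the second call would overwrite the first entry; with the timestamps included, both copies are kept.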
All packages should have a valid email for contact.
SWDEV-257322
Change-Id: I49107ff44b9aaf13ec6a20319a420146f6443907
[ROCm/rocprofiler commit: c684d61de8]
All packages should have a valid email for contact.
SWDEV-257328
Change-Id: I03ceefc46cf8da4486e19b1001abd4cd8cbcb3c5
[ROCm/rocprofiler commit: 71ce3fa617]
Add numa lib, as this will be required with a static thunk.
Look for static thunk if shared thunk cannot be found.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: Idcaa0c785a0502c9f5fe42e2dfb9e0c1780f9d66
[ROCm/rocprofiler commit: 97c9efce38]
On Ubuntu 20.04, in Release mode, gcc fails with this error:
In file included from /usr/include/string.h:495,
from /opt/rocm/include/hsa/hsa_api_trace.h:57,
from ../rocprofiler/src/util/hsa_rsrc_factory.h:29,
from ../rocprofiler/src/util/hsa_rsrc_factory.cpp:25:
In function ‘char* strncpy(char*, const char*, size_t)’,
inlined from ‘const util::AgentInfo* util::HsaRsrcFactory::AddAgentInfo(hsa_agent_t)’ at ../rocprofiler/src/util/hsa_rsrc_factory.cpp:323:12:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:106:34: error: ‘char* __builtin___strncpy_chk(char*, const char*, long unsigned int, long unsigned int)’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
106 | return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../rocprofiler/src/util/hsa_rsrc_factory.cpp: In member function ‘const util::AgentInfo* util::HsaRsrcFactory::AddAgentInfo(hsa_agent_t)’:
../rocprofiler/src/util/hsa_rsrc_factory.cpp:322:39: note: length computed here
322 | const int gfxip_label_len = strlen(agent_info->name) - 2;
| ~~~~~~^~~~~~~~~~~~~~~~~~
The error is caused by the following 2 lines:
const int gfxip_label_len = strlen(agent_info->name) - 2;
strncpy(agent_info->gfxip, agent_info->name, gfxip_label_len);
The size argument to strncpy should not depend on the input string.
Since the terminating character is not considered (the copy is at
most len - 2 bytes), using memcpy is preferable. Also, make sure
the destination does not overflow by clamping the size.
Change-Id: I0c5cf7e0daf4cd6fcf7092efb1d9fd4c02a6c639
[ROCm/rocprofiler commit: 304d3366af]