Commit graph

577 Commits

Author SHA1 Message Date
Ammar ELWazir 3df8bc7c77 Removing backward compatibility
Removing the backward compatibility file and making sure to use the right paths

Change-Id: I518d52c82e0c5878bd334713e7b1758bba79762d


[ROCm/roctracer commit: 6b16d37d65]
2022-05-11 14:43:35 -04:00
Ammar ELWazir 392e15598d Changing Installation docs
using build.sh rather than cmake in the readme

Change-Id: If3b80641497c0c967ec3340cb9ef546bf44824c3


[ROCm/roctracer commit: ed0e1f5cb8]
2022-05-11 01:31:52 -04:00
Ammar ELWazir dfe33f2c15 Changing how CMAKE_CXX_FLAGS is set for -fPIC to the recommended CMake way
Change-Id: I898de3d05feffee2d7d37cf62ac33afe2ecde85a


[ROCm/roctracer commit: 7060b76927]
2022-05-10 22:38:13 -05:00
Laurent Morichetti 12623a5f24 Fix the roctracer tests
14/15 tests pass, 1/15 intermittent failure (tool flushing test).

Change-Id: I36ed2900a1c51e584718993badeaefd48ad450a2


[ROCm/roctracer commit: a98476fe11]
2022-05-10 14:58:08 -07:00
Laurent Morichetti 196af97ad6 Disallow copying or moving trace buffers
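
A minimal sketch of what disallowing copies and moves typically looks like in C++. The class below is a hypothetical, simplified stand-in and not the actual roctracer trace buffer; its name, members, and constructor are assumptions made for illustration.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical, simplified trace buffer that owns its storage.
class TraceBuffer {
 public:
  explicit TraceBuffer(std::size_t size) : storage_(size) {}

  // Deleting all four operations makes accidental copies or moves a
  // compile-time error instead of a runtime aliasing bug.
  TraceBuffer(const TraceBuffer&) = delete;
  TraceBuffer& operator=(const TraceBuffer&) = delete;
  TraceBuffer(TraceBuffer&&) = delete;
  TraceBuffer& operator=(TraceBuffer&&) = delete;

  std::size_t size() const { return storage_.size(); }

 private:
  std::vector<char> storage_;
};

static_assert(!std::is_copy_constructible<TraceBuffer>::value,
              "TraceBuffer must not be copyable");
static_assert(!std::is_move_constructible<TraceBuffer>::value,
              "TraceBuffer must not be movable");
```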
Change-Id: I104b8240a76c6d96ae176b0b26bdc2e4e5e3c180


[ROCm/roctracer commit: 3f402eb6e9]
2022-05-10 12:08:06 -07:00
Laurent Morichetti 9294225192 Fix memory leaks in roctracer
Each thread has a thread-local record_pair_stack. The stack is
dynamically allocated on first use, but is not destroyed when the
thread exits.

Replaced record_pair_stack pointers with record_pair_stack instances.
The instances are constructed on first odr-use and destructed when the
thread exits.

Also, converted the cb_journal and act_journal to instances.
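
The leak and its fix can be illustrated with a small sketch. This is not the roctracer code: RecordStack, live_stacks, leaky_stack, owned_stack, and leaked_after_thread are hypothetical names, and the instance counter exists only to make the difference observable.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Counts RecordStack instances that are currently alive (illustration only).
std::atomic<int> live_stacks{0};

// Hypothetical stand-in for record_pair_stack.
struct RecordStack {
  RecordStack() { ++live_stacks; }
  ~RecordStack() { --live_stacks; }
  std::vector<int> records;
};

// Leaky pattern: the thread-local pointer goes away at thread exit, but the
// object it points to is never deleted.
RecordStack& leaky_stack() {
  thread_local RecordStack* stack = nullptr;
  if (stack == nullptr) stack = new RecordStack();
  return *stack;
}

// Fixed pattern: a thread-local instance is constructed on first use and
// destructed when the thread exits.
RecordStack& owned_stack() {
  thread_local RecordStack stack;
  return stack;
}

// Uses the stack on a fresh thread and returns how many additional
// instances are still alive after that thread has exited.
template <typename Fn>
int leaked_after_thread(Fn use) {
  const int before = live_stacks.load();
  std::thread worker([&] { use().records.push_back(1); });
  worker.join();
  return live_stacks.load() - before;
}
```

Under these assumptions, leaked_after_thread(leaky_stack) reports one surviving instance per thread, while leaked_after_thread(owned_stack) reports none.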

Change-Id: I186ac29da477f194880a1ab599f4be5715a23063


[ROCm/roctracer commit: 67481bd295]
2022-05-10 12:08:06 -07:00
Laurent Morichetti 4fddfcc5c5 Optimize roctx markers
Improve the roctx markers performance when the tracer is not engaged
(the application is not running with rocprof).

The performance of roctx push/pop, measured with:

-----------------------------------------------------------------------
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < 10000000; ++i) {
    roctxRangePush ("A");
    roctxRangePop ();
  }
  auto end = std::chrono::steady_clock::now();
  std::cout << "ns = " << std::chrono::nanoseconds(end - start).count()
      / 10000000 << std::endl;
-----------------------------------------------------------------------

w/o rocprof | with rocprof | commit
       92ns |       770ns  | 0d6e132: Cleanup CallbackTable::Get
       28ns |       712ns  | 6421bd5: Cleanup ROCTX's implementation
       20ns |       664ns  | 7f0e5e5: Remove the roctx range message...
        6ns |       665ns  | this commit

Change-Id: Id679dcbd0fb190a3179be98a9b2c1db151efee3d


[ROCm/roctracer commit: a794247c55]
2022-05-10 12:08:06 -07:00
Laurent Morichetti 9b78c65ce1 Remove the roctx range message stack
The range message stack is mirrored in case ranges are pushed or popped
while tracing is stopped (by the tracer tool?). When a stop event is
reported, the tracer tool emits RangePop events by unwinding the stack,
then when the start event is reported, it emits RangePush events again
by unwinding the stack. The issue is that the RangePush events should
be emitted in reverse order.

For example:

RangePush(M1); RangePush(M2); \
  TracerStop; RangePop; RangePop; \
...; \
  TracerStart; RangePush(M2); RangePush(M1); \ <- In the wrong order
RangePop; RangePop;
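
The ordering problem in the example above can be reproduced with a short sketch. RangeStack, replay_top_down, and replay_bottom_up are hypothetical names; the sketch only models the order in which RangePush events would be re-emitted, not the actual roctx implementation.

```cpp
#include <string>
#include <vector>

// Hypothetical mirror of a thread's range message stack; back() is the top.
using RangeStack = std::vector<std::string>;

// Re-emitting pushes while unwinding from the top yields the wrong order.
std::vector<std::string> replay_top_down(const RangeStack& stack) {
  return std::vector<std::string>(stack.rbegin(), stack.rend());
}

// Re-emitting pushes from the bottom restores the original push order.
std::vector<std::string> replay_bottom_up(const RangeStack& stack) {
  return std::vector<std::string>(stack.begin(), stack.end());
}
```

With M1 pushed before M2, the top-down replay emits M2 then M1, which is the wrong order shown in the example, while the bottom-up replay emits M1 then M2.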

It could be fixed by reversing the stack in RangeStackIterate but is it
worth it? The roctx range markers are supposed to be unintrusive so that
they can be left in the application even when it isn't being traced.

Simplifying the roctx API and reducing its added latency by removing
the range message stack mirroring seems like the better choice.

TODO: A future change should make roctx events immune to tracer start
and tracer stop requests. Or simply remove roctracer_start/stop.

Change-Id: Ie4d76afb5ce8d263848dcf1b599af394db56ddab


[ROCm/roctracer commit: 3d0198c395]
2022-05-10 12:08:06 -07:00
Laurent Morichetti 4a04400f85 Cleanup ROCTX's implementation
Remove thread_data_init. The C++ standard guarantees that the thread
local variable is initialized before its first odr-use and destructed
when the thread exits. Use a global initializer to set the reference
from the message stack instance in the map.

Remove roctracer_error_string. This does not belong to this library.
ROCTX does not expose errors to the application. The only functions
returning errors are returning -1 (Push/Pop).

Remove memory leaks due to strdup on the range messages. The memory
for the messages is guaranteed to be valid for the duration of the
callback, and it is the application's responsibility to strdup the
strings if it needs to extend the message's lifetime.

Add a lock to the RegisterApiCallback implementation. Iterating the
message stack map must be synchronized as a new thread could be adding
a new value to the map.

Change-Id: Iaf5b07ebc9efe4061cb01327d4c7034888727816


[ROCm/roctracer commit: 713db1fce5]
2022-05-10 12:08:06 -07:00
Laurent Morichetti bac7f1c162 Merge "Cleanup CallbackTable::Get" into amd-staging
[ROCm/roctracer commit: 6e4055503c]
2022-05-10 14:55:20 -04:00
Laurent Morichetti 08289f356a Merge "Remove unused open_output_file/close_output_file" into amd-staging
[ROCm/roctracer commit: e8909158b3]
2022-05-10 14:55:10 -04:00
Laurent Morichetti fb3cf218c9 Merge "Fix a hang in './test/hsa/ctrl ctrl_hsa_input_trace'" into amd-staging
[ROCm/roctracer commit: 9cecf30131]
2022-05-10 14:54:11 -04:00
Laurent Morichetti 8b98f245ad Merge "Remove now unused hsa_rsrc_factory" into amd-staging
[ROCm/roctracer commit: fe0adfd37b]
2022-05-10 14:54:01 -04:00
Laurent Morichetti 4e9c35c929 Merge "Consolidate all sources of timestamps" into amd-staging
[ROCm/roctracer commit: 7c4f7625b1]
2022-05-10 14:53:36 -04:00
Laurent Morichetti e309f4c5df Cleanup CallbackTable::Get
Make CallbackTable::Get return the callback_function/user_arg pair
as an actual return value instead of returning it through arguments
pointers.
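
A sketch of the interface change, assuming a fixed-size table guarded by a mutex. The Callback signature, the template parameter, and the member names are hypothetical and are not taken from the roctracer sources.

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <utility>

// Hypothetical callback signature; the real roctracer type may differ.
using Callback = void (*)(int op, const void* data, void* user_arg);

template <std::size_t N>
class CallbackTable {
 public:
  void Set(std::size_t op, Callback callback, void* user_arg) {
    std::lock_guard<std::mutex> lock(mutex_);
    entries_.at(op) = std::make_pair(callback, user_arg);
  }

  // Returns the callback/user_arg pair by value instead of writing the two
  // values through out-pointer arguments.
  std::pair<Callback, void*> Get(std::size_t op) const {
    std::lock_guard<std::mutex> lock(mutex_);
    return entries_.at(op);
  }

 private:
  mutable std::mutex mutex_;
  // Value-initialized, so unset entries hold {nullptr, nullptr}.
  std::array<std::pair<Callback, void*>, N> entries_{};
};
```

Returning a pair lets a caller take a consistent snapshot of both values in one call and unpack it, for example with C++17 structured bindings.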

Change-Id: Ia2dfcdad8c237a09620518ad67af94add47220da


[ROCm/roctracer commit: 4aeb76f7a8]
2022-05-10 08:13:18 -07:00
Laurent Morichetti df767311e3 Remove unused open_output_file/close_output_file
Change-Id: I0e5118b814617cb605949c99e5f0dc235f6edac0


[ROCm/roctracer commit: cb040b7def]
2022-05-10 08:13:18 -07:00
Laurent Morichetti 63ead69012 Fix a hang in './test/hsa/ctrl ctrl_hsa_input_trace'
At the end of the test, the tracer tool is unloaded and the active
memory pools are flushed. In the flush callback, to get the activity
operation string, the RocpLoader instance is needed, and if the
RocpLoader is not already loaded, it attempts to dlopen the rocprofiler
library.

Calling dlopen from a global destructor hangs because the dynamic
loader lock is already owned (e.g. by dlclose).

To temporarily work around the issue, instantiate the RocpLoader when
the activities needing it are enabled.

Change-Id: I712c66d88c43694fe53a95d6a61d7b22abb75262


[ROCm/roctracer commit: 11887f596a]
2022-05-10 08:13:18 -07:00
Laurent Morichetti ef47516a88 Remove now unused hsa_rsrc_factory
Change-Id: I66175eb9fae2e7e61400af77a0c89be9c39e770e


[ROCm/roctracer commit: 4ced94b9a2]
2022-05-10 08:13:18 -07:00
Laurent Morichetti 19fbb76f1b Consolidate all sources of timestamps
System clock timestamps should only come from a single source:
util::timestamp_ns(). Externally, this function is exposed as
roctracer_get_timestamp() (used by the tracer tool).

Removed the now unused HSA Runtime Utilities which were never part
of the ROCtracer API.

Change-Id: I044b7f4da60fd8fdb771b0c877622a3143f0e815


[ROCm/roctracer commit: f8462b8637]
2022-05-10 08:13:09 -07:00
Ammar ELWazir f15a0ec2f0 Solving issue with using clang as the compiler
Change-Id: I4fa7b24af7008a30b0300b57ccbf1bc82dbfd66e


[ROCm/roctracer commit: 502ea835b9]
2022-05-09 17:41:33 -05:00
Laurent Morichetti efe7000e7b Remove unused ROCTX_CLOCK_TIME
Change-Id: I9696bb2892fe6fe21089462d624643b7a782fb71


[ROCm/roctracer commit: f46d1717cc]
2022-05-04 19:30:37 -04:00
Laurent Morichetti 2c4f347c0a Remove the tracer tool's dependency on hsa_rsrc_factory
hsa_rsrc_factory was only used to enumerate the agent types and pools.
The pools don't seem to be used by bin/mem_manager.py, so I only
ported the agent enumeration using hsa_iterate_agents.

Change-Id: Idd586aa13db303cf92962a6392771b7bf38b758f


[ROCm/roctracer commit: 6d6017249a]
2022-05-04 19:28:53 -04:00
Ammar ELWazir 5f2a988464 SWDEV-335490: Unused variables
Compilers don't count assert as a use of a variable, so variables used only in assert are flagged as unused when assertions are compiled out. Added [[maybe_unused]] to those variables so that the compiler skips them in the unused-variable check. Note: [[maybe_unused]] was introduced in C++17.
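
A minimal sketch of the pattern, assuming a C++17 compiler. do_work and run are hypothetical helpers written for this illustration and do not come from the roctracer sources.

```cpp
#include <cassert>

// Hypothetical helper: writes a result and returns 0 on success.
inline int do_work(int* out) {
  *out = 42;
  return 0;
}

inline int run() {
  int value = 0;
  // When NDEBUG is defined, assert() expands to nothing, so `status` would
  // have no remaining use and could trigger an unused-variable warning.
  // [[maybe_unused]] tells the compiler that this is intentional.
  [[maybe_unused]] int status = do_work(&value);
  assert(status == 0);
  return value;
}
```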

Change-Id: I96bb53cb2ab55ee7120681c2d279271c0075095d


[ROCm/roctracer commit: 78869032ad]
2022-05-04 11:24:28 -04:00
Ammar ELWazir e0aaaf4636 Removing HIP_API_PROF_STRING from the tracer_tool
The else branch was unused; it only used hipApiString to format the data as a string

Change-Id: I376721c478cffba0890436ca8895dfe2a7641570


[ROCm/roctracer commit: 5e012541c5]
2022-05-04 09:46:56 -04:00
Laurent Morichetti f1bce685df Fix race conditions in TraceBuffer
1) The Entry's state was published after making the record available,
   so a thread flushing the records could see an uninitialized record.
2) data_ and write_pointer_ could become out of sync. write_pointer_
   could be indexing into another buffer than what data_ was pointing
   to.
3) GetEntry could get a nullptr free_buffer_ because multiple threads
   could acquire the work_mutex_ before the work_thread_ could wake up,
   or between allocate_worker's loop iterations.
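
One common way to avoid the first issue is to finish writing the record before publishing its state, using release/acquire ordering. The sketch below is an assumption about the general technique, not the actual TraceBuffer fix; Entry, publish, and try_read are hypothetical names.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical, simplified trace buffer entry.
struct Entry {
  std::uint64_t record = 0;
  std::atomic<bool> valid{false};
};

// Writer: fill in the record first, then publish it with a release store so
// the record contents become visible together with the state.
inline void publish(Entry& entry, std::uint64_t value) {
  entry.record = value;
  entry.valid.store(true, std::memory_order_release);
}

// Reader (for example a flushing thread): read the record only after an
// acquire load has observed the published state.
inline bool try_read(const Entry& entry, std::uint64_t* out) {
  if (!entry.valid.load(std::memory_order_acquire)) return false;
  *out = entry.record;
  return true;
}
```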

Change-Id: I6f0a015557888eeeaa75a8bce7fde8de276d11dd


[ROCm/roctracer commit: 046df32729]
2022-05-03 21:56:46 -04:00
Laurent Morichetti 8502571ab7 Move trace_buffer.h to the tool directory
A trace buffer is used to efficiently store synchronous event records
so that they can be processed later, possibly in a different thread,
when the buffer is flushed. This helps reduce the latency added by
tracing API calls.

The API does not need to use trace buffers as synchronous events are
directly reported to the client with callbacks, and asynchronous events
(activities) are saved in memory pools.

The implementation of HSA asynchronous memory copy activities was using
a trace buffer shared with the tracer tool to write the records to a
file (async_copy_trace.txt), instead of using a memory pool and
reporting the activity to the client.

Removed the asynchronous memory copies trace buffer, and updated
hsa_async_copy_handler to use the pool specified when the activity
was enabled.

Updated the tracer tool to read HSA_OP_ID_COPY records out of the
default memory pool and write them to async_copy_trace.txt.

Move trace_buffer.h to test/tool as tracer_tool.cpp is now the only
file using it.

Change-Id: Ida95aba2eaf3c3f2a979ed6c2b060374017b7424


[ROCm/roctracer commit: 61f35b0204]
2022-05-03 21:56:28 -04:00
Tony Tye 3417afa07f Merge "Add doxygen to roctracer.h" into amd-staging
[ROCm/roctracer commit: 48f4c82685]
2022-05-03 20:00:10 -04:00
Tony Tye dd82162466 Add doxygen to roctracer.h
Change-Id: Ie542399e990e02482ed740d99c6afe4b95b1f6f4


[ROCm/roctracer commit: 1f630a9291]
2022-04-30 00:33:05 +00:00
Laurent Morichetti 7746758ed7 Add a trace_buffer directed test
This test stresses the concurrent writing of trace buffer records while
frequently allocating new storage to hold the records.

Due to race conditions, this test fails with the current trace buffer
implementation.

Change-Id: I0b77c64005e776319bf21f1ee1e6d7c99ddccfff


[ROCm/roctracer commit: 200e27f12d]
2022-04-29 08:52:13 -07:00
Laurent Morichetti 6eb1d34cda Fix assertions
Replace EXC_ABORT() checks with assertions.

Rewrite the exception class to use std::runtime_error (as it
already handles the std::string/char* message argument).
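
A sketch of an exception class built on std::runtime_error, which already stores the message and accepts both std::string and const char*. ApiError, Status, and describe_failure are hypothetical names used only for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical status codes.
enum class Status { kOk = 0, kError = 1 };

class ApiError : public std::runtime_error {
 public:
  // Both overloads forward to std::runtime_error, which owns the message.
  ApiError(Status status, const std::string& what)
      : std::runtime_error(what), status_(status) {}
  ApiError(Status status, const char* what)
      : std::runtime_error(what), status_(status) {}

  Status status() const noexcept { return status_; }

 private:
  Status status_;
};

// Throws and catches an ApiError through its base class, returning the
// message that std::runtime_error stored.
inline std::string describe_failure() {
  try {
    throw ApiError(Status::kError, "invalid domain");
  } catch (const std::runtime_error& error) {
    return error.what();
  }
}
```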

Change-Id: I48e31924f3aea1328e6562ab6bb06ec373fd5d5e


[ROCm/roctracer commit: 5963363484]
2022-04-27 11:24:26 -07:00
Laurent Morichetti 576554dcea Fix a SEGV when running --roctx-trace
There's a typo in RegisterApiCallback, roctx::cb_table.Get should be
roctx::cb_table.Set.

Change-Id: I47ec8ac666f783ff4e03f35d13e375e645899900


[ROCm/roctracer commit: 0d7d56eea5]
2022-04-27 12:14:32 -04:00
Ranjith Ramakrishnan 140d2d4bc2 Merge "Populate roctracer.h wrapper file with original file contents as dead code" into amd-staging
[ROCm/roctracer commit: 7f05496a87]
2022-04-27 02:20:26 -04:00
Laurent Morichetti 6d8edf929f Fix typos/spelling errors
Change-Id: Idec1cb8fab91c30f99563bc7dd4db1faeb2db954


[ROCm/roctracer commit: 18f60efe05]
2022-04-26 12:39:38 -07:00
Laurent Morichetti 159a56ffff Remove unused proxy utilities
The proxy queue implements packet interception to enable timestamps
collection. As it is, the roctracer is not intercepting packets, and
instead relies on the rocprofiler tool to collect the timestamps for
kernel dispatches.

This is an issue as the roctracer API does not implement HSA_OPS
activities for kernel dispatches. This will be addressed in a future
commit.

Change-Id: Ib6a778a513410bec4579f223a9d9e9fd9b6054df


[ROCm/roctracer commit: 6b06322578]
2022-04-26 15:26:26 -04:00
Laurent Morichetti 4a50f3b88f Fix the static library build
Building with -DLIBRARY_TYPE=STATIC fails with 3 undefined symbols.
Add weak symbols to satisfy the linker (mirror what is done for the
other Loader symbols).

Change-Id: I8a2878def21d5f500b0764ceacb4e5255e1111c5


[ROCm/roctracer commit: b352eedac6]
2022-04-26 15:26:10 -04:00
Ranjith Ramakrishnan 75fadc28dd Populate roctracer.h wrapper file with original file contents as dead code
Backward compatibility for components that search for contents in roctracer.h
Improvements: Removed redundant code for setting and unsetting variables
Added header template file in source code instead of generating it on build time

Change-Id: I96aeb7f2a6d53d45eb5aeb5300024cd22dad1324


[ROCm/roctracer commit: 8ca752ce2c]
2022-04-26 03:09:35 -07:00
Ammar ELWazir 9dd5a58e3e SWDEV-295522: Fixing Performance Issue
Removing DEBUG_TRACES and the unnecessary use of roctracer_op_string lets the MS app report 78 to 81 stable samples per second, depending on the type of trace, compared with 100 to 106 for the main app without rocprof. More detailed numbers will be posted in the ticket.

Change-Id: Ifbc529278cea54dd23e6086aa9b9ea2df952d5dd


[ROCm/roctracer commit: e4569c41fe]
2022-04-22 18:51:49 -04:00
Laurent Morichetti 0fd8cd7895 Allow MemoryPool::Write while Flushing
Before this change, when a producer was blocked by a flush operation,
no other producer could write to the memory pool.  This change allows
other producer threads to continue to write by releasing the producer
lock before waiting on the consumer condition variable.

Change-Id: Idc1c07173d2edb18fbe1a61961f10c02e7ca8c20


[ROCm/roctracer commit: dc8717a6b5]
2022-04-22 11:22:23 -07:00
Laurent Morichetti 1aaed4c508 Remove HCC_EXC_RAISING and HIP_EXC_RAISING
HCC_EXC_RAISING and HIP_EXC_RAISING don't add much value, so to
simplify, only keep EXC_RAISING and EXC_ABORT.

Change-Id: Ifdc54981bb682fe68b418cdc95ecebe668e3dcf6


[ROCm/roctracer commit: 121a84b449]
2022-04-22 11:22:23 -07:00
Laurent Morichetti 14f1d48482 Move the HccLoader activities into the HipLoader
The HCC runtime is no longer used, so move all the remaining
activities in the HipApi loader and remove the HccLoader.

Change-Id: I845c04ca275a474526840315bae0ad1a4ce02257


[ROCm/roctracer commit: 85552ea3a0]
2022-04-22 11:22:07 -07:00
Laurent Morichetti 33d8437801 Use ACTIVITY_DOMAIN_HIP_OPS instead of ACTIVITY_DOMAIN_HCC_OPS
Change-Id: I43fbac3d02011f74bf7b597519148ed0bd68ff98


[ROCm/roctracer commit: abf1b90017]
2022-04-20 22:00:59 -07:00
Laurent Morichetti 83cf22a698 Remove roctracer_hcc.h
roctracer_hip.h now contains the definitions for the HCC_OPS domain.

Change-Id: I132c993110254050aaa68828f3ca80f368ad24bc


[ROCm/roctracer commit: d3b166cf01]
2022-04-20 22:00:59 -07:00
Laurent Morichetti 57304225d9 Remove hip_act_cb_tracker.h
It only defines one class (hip_act_cb_tracker_t) that is only used
by roctracer.cpp.

Change-Id: I375a25bd363770d70a7b3b713223484a498cc3d1


[ROCm/roctracer commit: c009df3327]
2022-04-20 19:48:24 -07:00
Laurent Morichetti f8cc0d58a8 Close the default pool on exit
Change-Id: I388ea4d4f06c1818312a72185ef55b615c730509


[ROCm/roctracer commit: a0fd1e7c4b]
2022-04-20 19:48:24 -07:00
Laurent Morichetti 3c18fb9f01 Simplify memory_pool.h
Use the standard concurrent support library (std::thread, std::mutex,
std::condition_variable) instead of pthread.

Fix a mismatched memory allocation/deallocation when a custom allocator
is provided. The MemoryPool destructor was always using the default
allocator (using malloc/realloc/free) even if the pool memory was
allocated with the custom allocator.
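
The allocation/deallocation mismatch can be avoided by storing the allocator and using it in the destructor as well. The sketch below assumes an allocator callback where a size of 0 means "free"; the Allocator signature, Pool class, and counting_allocator are hypothetical and only illustrate the idea.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator callback: a size of 0 frees *ptr, any other size
// (re)allocates *ptr. `arg` is opaque user data.
using Allocator = void (*)(char** ptr, std::size_t size, void* arg);

inline void default_allocator(char** ptr, std::size_t size, void* /*arg*/) {
  if (size == 0) {
    std::free(*ptr);
    *ptr = nullptr;
  } else {
    *ptr = static_cast<char*>(std::realloc(*ptr, size));
  }
}

// Custom allocator for illustration: tracks outstanding allocations in the
// int that `arg` points to. Assumes *ptr is null when allocating.
inline void counting_allocator(char** ptr, std::size_t size, void* arg) {
  int* outstanding = static_cast<int*>(arg);
  if (size == 0) {
    delete[] *ptr;
    *ptr = nullptr;
    --*outstanding;
  } else {
    *ptr = new char[size];
    ++*outstanding;
  }
}

class Pool {
 public:
  explicit Pool(std::size_t size, Allocator alloc = default_allocator,
                void* arg = nullptr)
      : alloc_(alloc), arg_(arg) {
    alloc_(&data_, size, arg_);
  }
  // Release the memory with the same allocator that provided it.
  ~Pool() { alloc_(&data_, 0, arg_); }

  Pool(const Pool&) = delete;
  Pool& operator=(const Pool&) = delete;

  char* data() const { return data_; }

 private:
  Allocator alloc_;
  void* arg_;
  char* data_ = nullptr;
};
```

If the destructor called free on memory obtained from counting_allocator, it would pair new[] with free, which is the kind of mismatch described above.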

Fix various thread safety issues and inefficiencies (spin loops).

Change-Id: I97592caa947f63463041bf43e00af9ebb5ff5886


[ROCm/roctracer commit: 9d728f74a1]
2022-04-20 19:48:24 -07:00
Laurent Morichetti 45deedf43a Make roctracer_cb_table.h a private header
Move roctracer_cb_table.h to the src/core directory, as it should not
be exposed as a public header, and rename it to callback_table.h.

Change-Id: Ib448cbd32a275df0268d53bd8d1da0bdc9201470


[ROCm/roctracer commit: cd62d841fa]
2022-04-20 19:47:43 -07:00
Laurent Morichetti 2a2852048f Address review comments from previous commit
Change-Id: I6629dd911de0d7fd08d7a863c172ec73f35fa3d1


[ROCm/roctracer commit: dc22139977]
2022-04-20 22:46:15 -04:00
Laurent Morichetti 23528f51e0 Run clang-format on all source files
Change-Id: Ifb52ca306286b6b2d473821bed9db28e9f616d50


[ROCm/roctracer commit: 15ab5d9cda]
2022-04-20 22:45:54 -04:00
Laurent Morichetti b9bbce0017 Simplify journal.h
Simplify implementation of journal.h.

Change-Id: I9e2e93fd3cd3391fdf182249f5c4c5ef3debae03


[ROCm/roctracer commit: 89f6880371]
2022-04-20 19:43:16 -07:00
Laurent Morichetti 80c01a27c0 Fix copyright headers
Change-Id: I380d867fa5fb04e68b5b332e9abf33fbeb1e9418


[ROCm/roctracer commit: 06a3da7c63]
2022-04-19 09:30:45 -07:00