파일
rocm-systems/source/docs/about.md
T
Jonathan R. Madsen 6b374b8e68 Improve static singleton memory safety (#316)
* Update GitHub links

* Update samples/api_buffered_tracing/client.cpp

- check if initialized before forcing initialization

* Add lib/common/static_object.*

- template class for creating a static allocation in the binary which has all the properties of a heap allocated singleton but does not trigger leak sanitizers

* Update include/rocprofiler-sdk/internal_threading.h

- document return values

* Update lib/rocprofiler-sdk/internal_threading.cpp

- return codes from rocprofiler_create_callback_thread and rocprofiler_assign_callback_thread
- use common::static_object for thread-pool object

* Update lib/rocprofiler-sdk/agent.cpp

- use common::static_object to store array of strings and their hashes

* Update lib/rocprofiler-sdk/hsa/code_object.cpp

- use common::static_object to store array of strings and their hashes to ensure strings exist until termination

* Update lib/rocprofiler-sdk/registration.cpp

- use common::static_object to store status and client libraries
- update return values for rocprofiler_set_api_table

* Update lib/rocprofiler-sdk/hsa/hsa.cpp

- check registration::get_fini_status() in hsa_api_impl::functor<Idx>(args...)

* Update lib/rocprofiler-sdk/context/context.cpp

- using common::static_object for correlation id map
2023-12-19 13:47:21 -06:00

2.5 KiB

About

.. toctree::
   :glob:
   :maxdepth: 4

Important Changes

Roctracer and rocprofiler (v1) have been combined into a single rocprofiler SDK and re-designed from scratch. The new rocprofiler API has been designed with some new restrictions to avoid problems that plagued the former implementations. These restrictions enable more efficient implementations and much better thread-safety. The most important restriction is the window for tools to inform rocprofiler about which services the tool wishes to use (where "services" refers to the capabilities for API tracing, kernel tracing, etc.).

In the former implementations, when one of the ROCm runtimes were initially loaded, a tool only had to inform roctracer/rocprofiler that it wished to use its services at some point (e.g. calling roctracer_init()) and were not required to specify which services it would eventually or potentially use. Thus, these libraries had to effectively prepare for any service to be enable at any point in time -- which introduced unnecessary overhead when tools had no desire to use certain features and made thread-safe data management difficult. For example, roctracer was required to always install wrappers around every runtime API function and always added extra overhead of indirection through the roctracer library and checks for the current service configuration (in a thread-safe manner).

In the re-designed implementation, rocprofiler introduces the concept of a "context". Contexts are effectively bundles of service configurations. Rocprofiler gives each tool one opportunity to create as many contexts as necessary -- for example, a tool can group all of the services into one context, create individual contexts for each service, or somewhere in between. Due to this design choice change, rocprofiler now knows exactly which services might be requested by the tool clients at any point in time. This has several important implications:

  • rocprofiler does not have to unnecessarily prepare for services that are never used -- if no registered contexts requested tracing the HSA API, no wrappers need to be generated
  • rocprofiler can perform more extensive checks during service specification and inform tools about potential issues very early on
  • rocprofiler can allow multiple tools to use certain services simulatenously
  • rocprofiler was able to improve thread-safety without introducing parallel bottlenecks
  • rocprofiler can manage internal data and allocations more efficiently