Remove the api_callbacks_table_t that held the API activities and
user callbacks. Instead, use a single ROCtracer callback
(TracerCallback) to report both API activities and callbacks.
Remove hipInitActivityCallback, which set the ROCtracer callback and
the memory pool for asynchronous activities, because it did not allow
distinct pools to be used for each activity. Instead, use
hipRegisterTracerCallback to set the single ROCtracer callback.
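
The shape of this change can be illustrated with a minimal, self-contained
sketch. It is not the actual HIP or ROCtracer code: apart from the name
TracerCallback, every type, function and signature below
(registerTracerCallback, OpState, Pool, fakeApiCall) is a hypothetical
stand-in, and the real hipRegisterTracerCallback signature may differ.
The runtime keeps one callback slot, and the tracer decides per operation
whether to run a user callback, write an activity record, or both.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// All names below are hypothetical stand-ins, not real HIP/ROCtracer symbols.
enum class Phase { Enter, Exit };

// Stand-in for a memory pool that receives activity records.
struct Pool { std::vector<std::string> records; };

// Tracer-side state kept per API operation.
struct OpState {
  void (*user_callback)(uint32_t op, Phase phase, void* arg) = nullptr;
  void* user_arg = nullptr;
  Pool* pool = nullptr;  // each activity may point at its own pool
};

constexpr uint32_t kOpCount = 4;
static OpState g_ops[kOpCount];

// Runtime side: a single registration slot instead of a per-API table.
using TracerCallbackFn = void (*)(uint32_t op, Phase phase, const char* name);
static TracerCallbackFn g_tracer_callback = nullptr;

void registerTracerCallback(TracerCallbackFn fn) { g_tracer_callback = fn; }

// Tracer side: the one callback that serves both user callbacks and activities.
void TracerCallback(uint32_t op, Phase phase, const char* name) {
  assert(op < kOpCount);
  const OpState& state = g_ops[op];
  if (state.user_callback) state.user_callback(op, phase, state.user_arg);
  // Emit one activity record per call, on exit, into that operation's pool.
  if (state.pool && phase == Phase::Exit) state.pool->records.push_back(name);
}

// Runtime side: an API entry point reports through the single slot.
void fakeApiCall(uint32_t op, const char* name) {
  if (g_tracer_callback) g_tracer_callback(op, Phase::Enter, name);
  // ... the real work of the API function would happen here ...
  if (g_tracer_callback) g_tracer_callback(op, Phase::Exit, name);
}
```

In this sketch the runtime no longer knows about pools or user callbacks,
so the tracer is free to associate a different pool with each activity.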

Change-Id: I4c10f04f29a6e4cce8caf15db3016c3f72c86b04

Since the hip_api_data and record are only needed within the HIP
function's scope, there is no need to allocate/free them in the
ROCtracer activity callback; they can reside on the HIP function's
stack frame.
This solves an issue where the thread-local stacks of records that the
tracer maintains are destroyed first (before any global destructor) on
process exit, which made it impossible to use HIP functions in global
destructors when the profiler is enabled.
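
A minimal sketch of the stack-frame approach follows. It is an
illustration under assumptions, not the actual implementation: ApiData,
Record, ApiScope, g_report, g_clock, fakeApiCall and GlobalUser are all
hypothetical names, and the real hip_api_data and record layouts are not
reproduced. The point is only that the per-call data lives in an object on
the API function's own stack, so no thread-local container has to be alive
when the call runs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the per-call API data and the activity record.
struct ApiData { const char* name; int status; };
struct Record { uint32_t op; uint64_t begin; uint64_t end; };

// Hypothetical reporting hook installed by the tracer.
using ReportFn = void (*)(const Record&, const ApiData&);
static ReportFn g_report = nullptr;
static uint64_t g_clock = 0;  // fake monotonic clock for the sketch

// Scope guard that owns the data and record for exactly one API call.
class ApiScope {
 public:
  ApiScope(uint32_t op, const char* name)
      : data_{name, 0}, record_{op, ++g_clock, 0} {}
  ~ApiScope() {
    record_.end = ++g_clock;
    if (g_report) g_report(record_, data_);  // report on scope exit
  }
  ApiScope(const ApiScope&) = delete;
  ApiScope& operator=(const ApiScope&) = delete;
  ApiData& data() { return data_; }

 private:
  ApiData data_;   // lives on the caller's stack frame
  Record record_;  // no heap or thread-local storage involved
};

int fakeApiCall() {
  ApiScope scope(7, "fakeApiCall");
  scope.data().status = 0;  // ... the real work would set the result ...
  return scope.data().status;
}

// An object whose destructor calls the API, as a global destructor might.
struct GlobalUser {
  ~GlobalUser() { fakeApiCall(); }
};
```

Because fakeApiCall touches only its own stack and the reporting hook, a
destructor such as ~GlobalUser can call it without depending on any
thread-local state in this sketch.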

Change-Id: Ib1d70124d009a44dc1f08d41edff95e5f9f84369