fe5d074375
* Adding tools support * cmake formatting (cmake-format) (#227) Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com> * Checking to do rebase * Adding rocprofv2 script * cmake formatting (cmake-format) (#229) Co-authored-by: bgopesh <bgopesh@users.noreply.github.com> * Fixing build for the tool * Removing the requirement for rocm_version * Update rocprofiler_utilities.cmake * C++ filesystem fixes - added source/lib/common/filesystem.hpp - support older compilers which have <experimental/filesystem> and do not have <filesystem> - added samples/common/filesystem.hpp - samples now depend on "common" library which provides the correct filesystem header - renamed rocprofiler-stdcxxfs interface target to rocprofiler-cxx-filesystem - support old LLVM in addition to GNU - fix bin/rocprof/rocprof.cpp - was using VLA * Fix rocprofiler-drm include directories - OpenSUSE only has include/libdrm/drm.h (no include/drm/drm.h) * Tools fixes * Fix for the tools * Fix rocprofv2 script * Fixing Filesystem Issues * source formatting (clang-format v11) (#234) Co-authored-by: ammarwa <ammarwa@users.noreply.github.com> * Vlaindic/pc sampling api update (#235) * pcs: updating PC sampling API * source formatting (clang-format v11) (#232) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> --------- Co-authored-by: vlaindic <vladimir.indic@amd.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * Vlaindic/pc sampling api update for ammar branch (#244) *Updating the documentation inside pc_sampling.h --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: use @p in front of params * pcs: documenting struct fields updated * Fixing PC Sampling Documentation issues * Fixing PC Sampling Documentation * Relocated tools directory to 
source/lib/rocprofiler-tool * Fixes/updates to rocprofiler-tool - updated CMake - Fixed miscellaneous issues in the code (VLAs, etc.) - Updated rocprofv2 to reflect some minor env variables changes in rocprofiler-tool - Fixed clang-tidy warnings * Update lib/rocprofiler-tool/CMakeLists.txt - link to atomic library * Add $ORIGIN/.. RUNPATH to rocprofiler-tool * Adding readme file for tools * Renaming the tools readme file * Update ReadMe.md * Update ReadMe.md * Documentation updates - overview and explanation of design and concepts * Fix lib/rocprofiler-tool/README.md - delete ReadMe.md * Hacks for build * Update Filesystem * cmake formatting (cmake-format) (#248) Co-authored-by: ammarwa <ammarwa@users.noreply.github.com> * source formatting (clang-format v11) (#249) Co-authored-by: ammarwa <ammarwa@users.noreply.github.com> * source formatting (clang-format v11) (#250) Co-authored-by: ammarwa <ammarwa@users.noreply.github.com> * Addressing review comments on the tool readme file * Revert "Hacks for build" This reverts commit d6688cb3d1226c46fc97e37ced889a5b0d180940. 
* Fixes for GCC 7.5 compiler in OpenSUSE 15.4 * Update lib/rocprofiler-tool/CMakeLists.txt - link to AQL profile library * Fix lib/rocprofiler-tool/README.md - fix markdown * Fix lib/rocprofiler-tool - fix usage of hsa_ven_amd_loader_query_host_address * Fix unused variable warnings - byproduct of variables only used in assert statements * Update docs - update about.md - more "Important Changes" section here - update tool_library_overview.md - extend "Tool Library Design" section - write "Tool Initialization" section - write "Tool Finalization" section * Add ghc::filesystem submodule * Implement usage of ghc::filesystem * Add ROCPROFILER_BUILD_GHC_FS option - option to use external/filesystem (ghc) * Update samples/counter-collection - compile flags - common library - fixes for warnings * Update tests/kernel-tracing/CMakeLists.txt - change install location of kernel-tracing-test-tool and install rpath * Update samples/common/CMakeLists.txt - compile features requiring C++17 * Update lib/rocprofiler-tool/tool.cpp - remove include <filesystem> - comment out unused variable - remove unused functions - move some functions into anonymous namespace --------- Co-authored-by: Sriraksha Nagaraj <Sriraksha.Nagaraj@amd.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com> Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com> Co-authored-by: bgopesh <bgopesh@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: ammarwa <ammarwa@users.noreply.github.com> Co-authored-by: vlaindic <vladimir.indic@amd.com> Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com> Co-authored-by: Benjamin Welton <bewelton@amd.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
233 lines
7.4 KiB
C++
// MIT License
//
// Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.

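#include "lib/rocprofiler/hsa/queue_controller.hpp"
#include "lib/rocprofiler/agent.hpp"
#include "lib/rocprofiler/context/context.hpp"
#include "lib/rocprofiler/hsa/agent_cache.hpp"

#include <rocprofiler/fwd.h>

#include <glog/logging.h>

// Standard headers for the facilities used directly in this file
#include <atomic>
#include <cstdint>
#include <limits>
#include <memory>
#include <optional>
#include <tuple>
#include <utility>
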
namespace rocprofiler
{
namespace hsa
{
namespace
{
// HSA intercept functions (create_queue/destroy_queue)

// Replacement for hsa_queue_create: wraps the new queue in a Queue object
// and registers it with the QueueController.
hsa_status_t
create_queue(hsa_agent_t        agent,
             uint32_t           size,
             hsa_queue_type32_t type,
             void (*callback)(hsa_status_t status, hsa_queue_t* source, void* data),
             void*         data,
             uint32_t      private_segment_size,
             uint32_t      group_segment_size,
             hsa_queue_t** queue)
{
    // Find the supported agent whose HSA handle matches the requested agent
    for(const auto& [_, agent_info] : get_queue_controller().get_supported_agents())
    {
        if(agent_info.get_hsa_agent().handle == agent.handle)
        {
            auto new_queue = std::make_unique<Queue>(agent_info,
                                                     size,
                                                     type,
                                                     callback,
                                                     data,
                                                     private_segment_size,
                                                     group_segment_size,
                                                     get_queue_controller().get_core_table(),
                                                     get_queue_controller().get_ext_table(),
                                                     queue);
            // *queue is used as the lookup key for the Queue wrapper
            get_queue_controller().add_queue(*queue, std::move(new_queue));
            return HSA_STATUS_SUCCESS;
        }
    }
    // LOG(FATAL) terminates the process; the return below only keeps the
    // function well-formed for the compiler.
    LOG(FATAL) << "Could not find agent - " << agent.handle;
    return HSA_STATUS_ERROR_FATAL;
}

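// Replacement for hsa_queue_destroy: drops the Queue wrapper held by the
// controller. (The "destory" spelling is the member name used by
// QueueController.)
hsa_status_t
destroy_queue(hsa_queue_t* hsa_queue)
{
    get_queue_controller().destory_queue(hsa_queue);
    return HSA_STATUS_SUCCESS;
}
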
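// Sentinel agent (id == max uint64_t) stored for callbacks that were
// registered without a specific agent, i.e. that apply to every agent.
constexpr rocprofiler_agent_t default_agent =
    rocprofiler_agent_t{sizeof(rocprofiler_agent_t),
                        rocprofiler_agent_id_t{std::numeric_limits<uint64_t>::max()}};
}  // namespace
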
void
QueueController::add_queue(hsa_queue_t* id, std::unique_ptr<Queue> queue)
{
    CHECK(queue);
    // Lock order: _callback_cache first, then _queues (the same order is used
    // in add_callback and remove_callback).
    _callback_cache.wlock([&](auto& callbacks) {
        _queues.wlock([&](auto& map) {
            // Read the agent id before ownership of the queue moves into the map
            const auto agent_id = queue->get_agent().get_rocp_agent()->id.handle;
            map[id]             = std::move(queue);
            // Attach every cached callback that targets all agents or this agent
            for(const auto& [cbid, cb_tuple] : callbacks)
            {
                auto& [agent, qcb, ccb] = cb_tuple;
                if(agent.id.handle == default_agent.id.handle || agent.id.handle == agent_id)
                {
                    map[id]->register_callback(cbid, qcb, ccb);
                }
            }
        });
    });
}

void
QueueController::destory_queue(hsa_queue_t* id)
{
    _queues.wlock([&](auto& map) { map.erase(id); });
}

ClientID
QueueController::add_callback(std::optional<rocprofiler_agent_t> agent,
                              Queue::queue_cb_t                  qcb,
                              Queue::completed_cb_t              ccb)
{
    static std::atomic<ClientID> client_id = 1;
    ClientID                     return_id;
    _callback_cache.wlock([&](auto& cb_cache) {
        // Reserve the next client id while holding the cache write lock
        return_id = client_id++;
        // Callbacks without an explicit agent are stored under default_agent
        cb_cache[return_id] = std::tuple(agent.value_or(default_agent), qcb, ccb);

        // Register the callback on every existing queue it applies to
        _queues.wlock([&](auto& map) {
            for(auto& [_, queue] : map)
            {
                if(!agent || queue->get_agent().get_rocp_agent()->id.handle == agent->id.handle)
                {
                    queue->register_callback(return_id, qcb, ccb);
                }
            }
        });
    });
    return return_id;
}

void
QueueController::remove_callback(ClientID id)
{
    _callback_cache.wlock([&](auto& cb_cache) {
        // Drop the cached entry so queues created later do not pick it up
        cb_cache.erase(id);
        // Remove the callback from every existing queue
        _queues.wlock([&](auto& map) {
            for(auto& [_, queue] : map)
            {
                queue->remove_callback(id);
            }
        });
    });
}

void
QueueController::init(CoreApiTable& core_table, AmdExtTable& ext_table)
{
    // Store the incoming tables before any entry is replaced below
    _core_table = core_table;
    _ext_table  = ext_table;

    auto agents = agent::get_agents();

    // Generate supported agents (GPU agents only)
    for(const auto* itr : agents)
    {
        auto cached_agent = agent::get_agent_cache(itr);
        if(cached_agent && cached_agent->get_rocp_agent()->type == ROCPROFILER_AGENT_TYPE_GPU)
        {
            get_supported_agents().emplace(cached_agent->index(), *cached_agent);
        }
    }

    // Queue interception is only enabled when a registered context needs it
    auto enable_intercepter = false;
    for(const auto& itr : context::get_registered_contexts())
    {
        constexpr auto expected_context_size = 160UL;
        static_assert(
            sizeof(context::context) == expected_context_size,
            "If you added a new field to context struct, make sure there is a check here if it "
            "requires queue interception. Once you have done so, increment expected_context_size");

        if(itr->counter_collection)
        {
            enable_intercepter = true;
            break;
        }
        else if(itr->buffered_tracer)
        {
            if(itr->buffered_tracer->domains(ROCPROFILER_BUFFER_TRACING_KERNEL_DISPATCH) ||
               itr->buffered_tracer->domains(ROCPROFILER_BUFFER_TRACING_MEMORY_COPY))
            {
                enable_intercepter = true;
                break;
            }
        }
    }

    if(enable_intercepter)
    {
        // Route queue creation and destruction through the intercept functions
        core_table.hsa_queue_create_fn  = create_queue;
        core_table.hsa_queue_destroy_fn = destroy_queue;
    }
}

// Returns the Queue wrapper whose HSA queue id matches, or nullptr if none is
// registered. The lookup is a linear scan under a read lock.
const Queue*
QueueController::get_queue(const hsa_queue_t& _hsa_queue) const
{
    return _queues.rlock(
        [](const queue_map_t& _data, const hsa_queue_t& _inp) -> const Queue* {
            for(const auto& itr : _data)
            {
                if(itr.first->id == _inp.id) return itr.second.get();
            }
            return nullptr;
        },
        _hsa_queue);
}

QueueController&
get_queue_controller()
{
    static QueueController controller;
    return controller;
}

void
queue_controller_init(HsaApiTable* table)
{
    get_queue_controller().init(*table->core_, *table->amd_ext_);
}
}  // namespace hsa
}  // namespace rocprofiler