Files
rocm-systems/source/lib/output/metadata.cpp
T
Jakaraddi, Manjunath 78d8f4b8ea SWDEV-492623: Hip Host Function to Device Symbols Mapping (#18)
* Adding changes to register and read symbols from the hip fat binary

* adding json output for host_functions

* added error handling

* adding json tool support

* Adding tests

* formatting changes

* Adding documentation

* refactoring as per amd-staging

* Adding intializers and changing macros

* Fix page-migration background thread on fork (#31)

* Fix page-migration background thread on fork

After falling off main in the forked child, all the children
try to join on on the parent's monitoring thread. This results
in a deadlock. Parent is waiting for the child to exit, but
the child is trying to join the parent's thread which is
signaled from the parent's static destructors.

Even with just one parent and child, due to copy-on-write
semantics, a child signalling the background thread to join
will still block (thread's updated state is not visible
in the child).

This fix creates background treads on fork per-child with a
pthread_atfork handler, ensuring that each child has its own
monitoring thread.

* Formatting fixes

* Detach page-migration background thread and update test timeout

* Attach files with ctest

* Update corr-id assert

* Tweak on-fork, simplify background thread

* Revert thread detach

* Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)

* Adding Trace Period feature to rocprofv3

* Adding feature documentation

* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format

* Moving to Collection Period and changing the input params

* Format Fixes

* Fixing rebasing issues

* Removing atomic include from the tool

* Adding more options for units, optimizing the code

* Fixing rocprofv3.py

* Fixing time conv & adding time controlled app

* Fixing format

* Changing to shared memory testing methodology

* use of shmem use

* Fix include headers for transpose-time-controlled.cpp

* Format upload-image-to-github.py

* Removing shmem and using only env var to dump timestamps from the tool

* Tool Fixes + Test Config

* Adding Tests

* Fixing Review comments

* Update trace period implementation

* Update trace period tests

* check between start and stop timestamps

* Merge Fix

* Update validate.py

* Improve safety of rocprofiler_stop_context after finalization

* Pass context id to collection_period_cntrl by value

* Adding 20 us error margin

* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- move error code check macros to implementation
- fix macros which check error code
- use constexpr values instead of #define

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- debugging for error that cannot be locally reproduced

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- improve error handling and logging

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- tweak to non-fatal logging messages

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- cleanup of logging messages

* Update host kernel symbol register data fields

* Update source/lib/rocprofiler-sdk/code_object/hip/code_object.hpp

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-12-06 11:42:37 +00:00

564 rindas
17 KiB
C++

// MIT License
//
// Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#include "metadata.hpp"
#include "lib/common/string_entry.hpp"
#include "lib/output/agent_info.hpp"
#include "lib/output/host_symbol_info.hpp"
#include "lib/output/kernel_symbol_info.hpp"
#include <rocprofiler-sdk/fwd.h>
#include <memory>
#include <vector>
namespace rocprofiler
{
namespace tool
{
namespace
{
rocprofiler_status_t
dimensions_info_callback(rocprofiler_counter_id_t /*id*/,
const rocprofiler_record_dimension_info_t* dim_info,
long unsigned int num_dims,
void* user_data)
{
auto* dimensions_info = static_cast<counter_dimension_info_vec_t*>(user_data);
dimensions_info->reserve(num_dims);
for(size_t j = 0; j < num_dims; j++)
dimensions_info->emplace_back(dim_info[j]);
return ROCPROFILER_STATUS_SUCCESS;
}
rocprofiler_status_t
query_pc_sampling_configuration(const rocprofiler_pc_sampling_configuration_t* configs,
long unsigned int num_config,
void* user_data)
{
auto* avail_configs =
static_cast<std::vector<rocprofiler_pc_sampling_configuration_t>*>(user_data);
for(size_t i = 0; i < num_config; i++)
{
avail_configs->emplace_back(configs[i]);
}
return ROCPROFILER_STATUS_SUCCESS;
}
} // namespace
kernel_symbol_info::kernel_symbol_info()
: base_type{0, 0, 0, "", 0, 0, 0, 0, 0, 0, 0, 0}
{}
constexpr auto null_address_v = rocprofiler_address_t{.value = 0};
constexpr auto null_dim3_v = rocprofiler_dim3_t{.x = 0, .y = 0, .z = 0};
host_function_info::host_function_info()
: base_type{0,
0,
0,
0,
null_address_v,
null_address_v,
"",
0,
null_dim3_v,
null_dim3_v,
null_dim3_v,
null_dim3_v,
0}
{}
metadata::metadata(inprocess)
: buffer_names{sdk::get_buffer_tracing_names()}
, callback_names{sdk::get_callback_tracing_names()}
{
ROCPROFILER_CHECK(rocprofiler_query_available_agents(
ROCPROFILER_AGENT_INFO_VERSION_0,
[](rocprofiler_agent_version_t, const void** _agents, size_t _num_agents, void* _data) {
auto* _agents_v = static_cast<agent_info_vec_t*>(_data);
_agents_v->reserve(_num_agents);
for(size_t i = 0; i < _num_agents; ++i)
{
auto* agent = static_cast<const rocprofiler_agent_v0_t*>(_agents[i]);
_agents_v->emplace_back(*agent);
}
return ROCPROFILER_STATUS_SUCCESS;
},
sizeof(rocprofiler_agent_v0_t),
&agents));
{
auto _gpu_agents = std::vector<agent_info*>{};
_gpu_agents.reserve(agents.size());
for(auto& itr : agents)
{
if(itr.type == ROCPROFILER_AGENT_TYPE_GPU)
{
_gpu_agents.emplace_back(&itr);
auto pc_configs = std::vector<rocprofiler_pc_sampling_configuration_t>{};
rocprofiler_query_pc_sampling_agent_configurations(
itr.id, query_pc_sampling_configuration, &pc_configs);
agent_pc_sample_config_info.emplace(itr.id, pc_configs);
}
}
// make sure they are sorted by node id
std::sort(_gpu_agents.begin(), _gpu_agents.end(), [](const auto& lhs, const auto& rhs) {
return CHECK_NOTNULL(lhs)->node_id < CHECK_NOTNULL(rhs)->node_id;
});
int64_t _dev_id = 0;
for(auto& itr : _gpu_agents)
itr->gpu_index = _dev_id++;
}
for(auto itr : agents)
agents_map.emplace(itr.id, itr);
}
void metadata::init(inprocess)
{
if(inprocess_init) return;
inprocess_init = true;
for(auto itr : agents)
{
if(itr.type == ROCPROFILER_AGENT_TYPE_CPU) continue;
auto status = rocprofiler_iterate_agent_supported_counters(
itr.id,
[](rocprofiler_agent_id_t id,
rocprofiler_counter_id_t* counters,
size_t num_counters,
void* user_data) {
auto* data_v = static_cast<agent_counter_info_map_t*>(user_data);
data_v->emplace(id, counter_info_vec_t{});
for(size_t i = 0; i < num_counters; ++i)
{
auto _info = rocprofiler_counter_info_v0_t{};
auto _dim_ids = std::vector<rocprofiler_counter_dimension_id_t>{};
auto _dim_info = std::vector<rocprofiler_record_dimension_info_t>{};
ROCPROFILER_CHECK(rocprofiler_query_counter_info(
counters[i],
ROCPROFILER_COUNTER_INFO_VERSION_0,
&static_cast<rocprofiler_counter_info_v0_t&>(_info)));
ROCPROFILER_CHECK(rocprofiler_iterate_counter_dimensions(
counters[i], dimensions_info_callback, &_dim_info));
_dim_ids.reserve(_dim_info.size());
for(auto ditr : _dim_info)
_dim_ids.emplace_back(ditr.id);
data_v->at(id).emplace_back(
id, _info, std::move(_dim_ids), std::move(_dim_info));
}
return ROCPROFILER_STATUS_SUCCESS;
},
&agent_counter_info);
if(status != ROCPROFILER_STATUS_ERROR_AGENT_ARCH_NOT_SUPPORTED)
{
ROCPROFILER_CHECK(status);
}
else
{
ROCP_WARNING << "Init failed for agent " << itr.node_id << " (" << itr.name << ")";
}
}
}
const agent_info*
metadata::get_agent(rocprofiler_agent_id_t _val) const
{
for(const auto& itr : agents)
{
if(itr.id == _val) return &itr;
}
return nullptr;
}
const code_object_info*
metadata::get_code_object(uint64_t code_obj_id) const
{
return code_objects.rlock([code_obj_id](const auto& _data) -> const code_object_info* {
return &_data.at(code_obj_id);
});
}
const kernel_symbol_info*
metadata::get_kernel_symbol(uint64_t kernel_id) const
{
return kernel_symbols.rlock([kernel_id](const auto& _data) -> const kernel_symbol_info* {
return &_data.at(kernel_id);
});
}
const host_function_info*
metadata::get_host_function(uint64_t host_function_id) const
{
return host_functions.rlock([host_function_id](const auto& _data) -> const host_function_info* {
return &_data.at(host_function_id);
});
}
const tool_counter_info*
metadata::get_counter_info(uint64_t instance_id) const
{
auto _counter_id = rocprofiler_counter_id_t{.handle = 0};
ROCPROFILER_CHECK(rocprofiler_query_record_counter_id(instance_id, &_counter_id));
return get_counter_info(_counter_id);
}
const tool_counter_info*
metadata::get_counter_info(rocprofiler_counter_id_t id) const
{
for(const auto& itr : agent_counter_info)
{
for(const auto& aitr : itr.second)
{
if(aitr.id == id) return &aitr;
}
}
return nullptr;
}
const counter_dimension_info_vec_t*
metadata::get_counter_dimension_info(uint64_t instance_id) const
{
return &CHECK_NOTNULL(get_counter_info(instance_id))->dimensions;
}
code_object_data_vec_t
metadata::get_code_objects() const
{
auto _data = code_objects.rlock([](const auto& _data_v) {
auto _info = std::vector<code_object_info>{};
_info.reserve(_data_v.size());
for(const auto& itr : _data_v)
_info.emplace_back(itr.second);
return _info;
});
uint64_t _sz = 0;
for(const auto& itr : _data)
_sz = std::max(_sz, itr.code_object_id);
auto _code_obj_data = std::vector<code_object_info>{};
_code_obj_data.resize(_sz + 1, code_object_info{});
// index by the code object id
for(auto& itr : _data)
_code_obj_data.at(itr.code_object_id) = itr;
return _code_obj_data;
}
kernel_symbol_data_vec_t
metadata::get_kernel_symbols() const
{
auto _data = kernel_symbols.rlock([](const auto& _data_v) {
auto _info = std::vector<kernel_symbol_info>{};
_info.reserve(_data_v.size());
for(const auto& itr : _data_v)
_info.emplace_back(itr.second);
return _info;
});
uint64_t kernel_data_size = 0;
for(const auto& itr : _data)
kernel_data_size = std::max(kernel_data_size, itr.kernel_id);
auto _symbol_data = std::vector<kernel_symbol_info>{};
_symbol_data.resize(kernel_data_size + 1, kernel_symbol_info{});
// index by the kernel id
for(auto& itr : _data)
_symbol_data.at(itr.kernel_id) = std::move(itr);
return _symbol_data;
}
host_function_data_vec_t
metadata::get_host_symbols() const
{
return host_functions.rlock([](const auto& _data_v) {
auto _info = std::vector<host_function_info>{};
_info.resize(_data_v.size() + 1, host_function_info{});
for(const auto& itr : _data_v)
_info.at(itr.first) = itr.second;
return _info;
});
}
metadata::agent_info_ptr_vec_t
metadata::get_gpu_agents() const
{
auto _data = metadata::agent_info_ptr_vec_t{};
for(const auto& itr : agents)
{
if(itr.type == ROCPROFILER_AGENT_TYPE_GPU) _data.emplace_back(&itr);
}
return _data;
}
pc_sample_config_vec_t
metadata::get_pc_sample_config_info(rocprofiler_agent_id_t _val) const
{
auto _ret = pc_sample_config_vec_t{};
auto pc_sample_config = agent_pc_sample_config_info.at(_val);
for(const auto& itr : pc_sample_config)
_ret.emplace_back(itr);
return _ret;
}
counter_info_vec_t
metadata::get_counter_info() const
{
auto _ret = std::vector<tool_counter_info>{};
for(const auto& itr : agent_counter_info)
{
for(const auto& iitr : itr.second)
_ret.emplace_back(iitr);
}
return _ret;
}
counter_dimension_vec_t
metadata::get_counter_dimension_info() const
{
auto _ret = counter_dimension_vec_t{};
for(const auto& itr : agent_counter_info)
{
for(const auto& iitr : itr.second)
for(const auto& ditr : iitr.dimensions)
_ret.emplace_back(ditr);
}
auto _sorter = [](const rocprofiler_record_dimension_info_t& lhs,
const rocprofiler_record_dimension_info_t& rhs) {
return std::tie(lhs.id, lhs.instance_size) < std::tie(rhs.id, rhs.instance_size);
};
auto _equiv = [](const rocprofiler_record_dimension_info_t& lhs,
const rocprofiler_record_dimension_info_t& rhs) {
return std::tie(lhs.id, lhs.instance_size) == std::tie(rhs.id, rhs.instance_size);
};
std::sort(_ret.begin(), _ret.end(), _sorter);
_ret.erase(std::unique(_ret.begin(), _ret.end(), _equiv), _ret.end());
return _ret;
}
bool
metadata::add_marker_message(uint64_t corr_id, std::string&& msg)
{
return marker_messages.wlock(
[](auto& _data, uint64_t _cid_v, std::string&& _msg) -> bool {
return _data.emplace(_cid_v, std::move(_msg)).second;
},
corr_id,
std::move(msg));
}
bool
metadata::add_code_object(code_object_info obj)
{
return code_objects.wlock(
[](code_object_data_map_t& _data_v, code_object_info _obj_v) -> bool {
return _data_v.emplace(_obj_v.code_object_id, _obj_v).second;
},
obj);
}
bool
metadata::add_kernel_symbol(kernel_symbol_info&& sym)
{
return kernel_symbols.wlock(
[](kernel_symbol_data_map_t& _data_v, kernel_symbol_info&& _sym_v) -> bool {
return _data_v.emplace(_sym_v.kernel_id, std::move(_sym_v)).second;
},
std::move(sym));
}
bool
metadata::add_host_function(host_function_info&& func)
{
return host_functions.wlock(
[](host_function_info_map_t& _data_v, host_function_info&& _func_v) -> bool {
return _data_v.emplace(_func_v.host_function_id, std::move(_func_v)).second;
},
std::move(func));
}
bool
metadata::add_string_entry(size_t key, std::string_view str)
{
return string_entries.ulock(
[](const auto& _data, size_t _key, std::string_view) { return (_data.count(_key) > 0); },
[](auto& _data, size_t _key, std::string_view _str) {
_data.emplace(_key, new std::string{_str});
return true;
},
key,
str);
}
bool
metadata::add_external_correlation_id(uint64_t val)
{
return external_corr_ids.wlock(
[](auto& _data, uint64_t _val) { return _data.emplace(_val).second; }, val);
}
std::string_view
metadata::get_marker_message(uint64_t corr_id) const
{
return marker_messages.rlock(
[](const auto& _data, uint64_t _corr_id_v) -> std::string_view {
return _data.at(_corr_id_v);
},
corr_id);
}
std::string_view
metadata::get_kernel_name(uint64_t kernel_id, uint64_t rename_id) const
{
if(rename_id > 0)
{
if(const auto* _name = common::get_string_entry(rename_id)) return std::string_view{*_name};
}
const auto* _kernel_data = get_kernel_symbol(kernel_id);
return CHECK_NOTNULL(_kernel_data)->formatted_kernel_name;
}
std::string_view
metadata::get_kind_name(rocprofiler_callback_tracing_kind_t kind) const
{
return callback_names.at(kind);
}
std::string_view
metadata::get_kind_name(rocprofiler_buffer_tracing_kind_t kind) const
{
return buffer_names.at(kind);
}
std::string_view
metadata::get_operation_name(rocprofiler_callback_tracing_kind_t kind,
rocprofiler_tracing_operation_t op) const
{
return callback_names.at(kind, op);
}
std::string_view
metadata::get_operation_name(rocprofiler_buffer_tracing_kind_t kind,
rocprofiler_tracing_operation_t op) const
{
return buffer_names.at(kind, op);
}
uint64_t
metadata::get_node_id(rocprofiler_agent_id_t _val) const
{
return CHECK_NOTNULL(get_agent(_val))->logical_node_id;
}
const std::string*
metadata::get_string_entry(size_t key) const
{
const auto* ret = string_entries.rlock(
[](const auto& _data, size_t _key) -> const std::string* {
if(_data.count(_key) > 0) return _data.at(_key).get();
return nullptr;
},
key);
if(!ret) ret = common::get_string_entry(key);
return ret;
}
int64_t
metadata::get_instruction_index(rocprofiler_pc_t record)
{
inst_t ins;
ins.code_object_id = record.code_object_id;
ins.code_object_offset = record.code_object_offset;
auto itr = indexes.find(ins);
if(itr != indexes.end()) return itr->second;
auto idx = instruction_decoder.size();
auto pc_instruction = decode_instruction(record);
instruction_decoder.emplace_back(pc_instruction->inst);
instruction_comment.emplace_back(pc_instruction->comment);
indexes.emplace(ins, idx);
return idx;
}
void
metadata::add_decoder(rocprofiler_code_object_info_t* obj_data)
{
if(obj_data->storage_type == ROCPROFILER_CODE_OBJECT_STORAGE_TYPE_FILE)
{
decoder.wlock(
[](auto& _decoder, rocprofiler_code_object_info_t* obj_data_v) {
_decoder.addDecoder(obj_data_v->uri,
obj_data_v->code_object_id,
obj_data_v->load_delta,
obj_data_v->load_size);
},
obj_data);
}
else
{
decoder.wlock(
[](auto& _decoder, rocprofiler_code_object_info_t* obj_data_v) {
_decoder.addDecoder(
// NOLINTBEGIN(performance-no-int-to-ptr)
reinterpret_cast<const void*>(obj_data_v->memory_base),
// NOLINTEND(performance-no-int-to-ptr)
obj_data_v->memory_size,
obj_data_v->code_object_id,
obj_data_v->load_delta,
obj_data_v->load_size);
},
obj_data);
}
}
std::unique_ptr<instruction_t>
metadata::decode_instruction(rocprofiler_pc_t pc)
{
return decoder.wlock(
[](auto& _decoder, uint64_t id, uint64_t addr) { return _decoder.get(id, addr); },
pc.code_object_id,
pc.code_object_offset);
}
} // namespace tool
} // namespace rocprofiler