[rocprofv3] signal handler fix (#332)
* rocprofv3: LD_PRELOAD for signal and sigaction
- wrappers around `signal` and `sigaction` to prevent applications that install signal handlers from replacing the rocprofv3 signal handlers
- minor tweaks to buffer sizes (use page_size instead of KiB)
* [DO NOT COMMIT] extra logging
* Switch git submodule url for perfetto
- use GitHub URL as this is more accessible
* Update ring_buffer<Tp>
- account for alignment padding
* Update buffered_output
- track number of bytes stored
- add nullptr checks
* Update tmp_file_buffer
- track number of bytes
- read_tmp_file does not create tmp file if it does not already exist
* Update tmp_file
- add exists member function for checking whether temporary file already exists
- tweak remove() implementation
* Update config.hpp
- add option to enable/disable signal handlers
- add option for minimum_output_bytes
* Make signal, sigaction functions visible
* rocprofv3 tool updates
- chained signals
- override the signal handler(s) installed by the application
- improve cleanup of temporary files
- support minimum output bytes
* Add commandline support
* fixing test
* minor fix
* minor fix
* fix clang issue
* fix
* Adding docs
* review comments
* review changes
* review
* YUV pulldown additions to rocdecode
* More rocdecode changes
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
[ROCm/rocprofiler-sdk commit: 87badfbd15]
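The chained-signal idea described in the commit message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ChainedSignals`, `install`, and `dispatch` are hypothetical names, the set of handled signals is a subset of the one in the diff, and running the tool handler before the application handler is an assumed ordering that the visible part of the diff does not confirm.

```python
import signal


class ChainedSignals:
    """Hypothetical registry that records application handlers instead of installing them."""

    def __init__(self, handled, tool_handler):
        self.handled = frozenset(handled)  # signals the tool wants to keep control of
        self.tool_handler = tool_handler   # stand-in for the tool's own handler
        self.chained = {}                  # signum -> handler requested by the application

    def install(self, signum, app_handler):
        """Stand-in for a wrapped signal(): intercept handled signals, pass others through."""
        if signum not in self.handled:
            return ("passthrough", app_handler)
        previous = self.chained.get(signum)
        self.chained[signum] = app_handler
        return ("chained", previous)

    def dispatch(self, signum):
        """Run the tool handler, then the application's chained handler (assumed order)."""
        self.tool_handler(signum)
        app_handler = self.chained.get(signum)
        if callable(app_handler):
            app_handler(signum)


calls = []
registry = ChainedSignals(
    {signal.SIGINT, signal.SIGABRT, signal.SIGTERM},
    lambda signum: calls.append(("tool", signum)),
)
registry.install(signal.SIGINT, lambda signum: calls.append(("app", signum)))
registry.dispatch(signal.SIGINT)
```

The point of the design is that the application's request is remembered rather than dropped, so both handlers still get a chance to run.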
Committed by GitHub
Parent: 3fc9374295
Commit: 2e7d0b3aec
@@ -21,7 +21,7 @@
     url = https://github.com/gulrak/filesystem.git
 [submodule "external/perfetto"]
     path = external/perfetto
-    url = https://android.googlesource.com/platform/external/perfetto
+    url = https://github.com/google/perfetto
 [submodule "external/elfio"]
     path = external/elfio
     url = https://github.com/serge1/ELFIO.git
@@ -172,7 +172,7 @@ rocprofiler_checkout_git_submodule(
     RELATIVE_PATH external/perfetto
     WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
     TEST_FILE meson.build
-    REPO_URL https://android.googlesource.com/platform/external/perfetto external/perfetto
+    REPO_URL https://github.com/google/perfetto
     REPO_BRANCH "v44.0")
 
 add_library(rocprofiler-sdk-perfetto-static-library STATIC)
@@ -620,6 +620,26 @@ For MPI applications (or other job launchers such as SLURM), place rocprofv3 ins
         help=argparse.SUPPRESS,
     )
 
+    add_parser_bool_argument(
+        advanced_options,
+        "--disable-signal-handlers",
+        help="""Disables the signal handlers in the rocprofv3 tool.
+        When --disable-signal-handlers is set to true
+        and the application has installed its own signal handler for SIGSEGV or a similar signal,
+        the application's signal handler is used instead of the rocprofv3 signal handler.
+        Note: glog still installs signal handlers which provide backtraces""",
+    )
+
+    advanced_options.add_argument(
+        "--minimum-output-data",
+        help="""Output files are generated only if output data size > minimum output data.
+        It can be used to control the generation of output files so that users don't receive empty files.
+        The input is in KB units.""",
+        default=None,
+        type=int,
+        metavar="KB",
+    )
+
     if args is None:
         args = sys.argv[1:]
 
@@ -1359,6 +1379,12 @@ def run(app_args, args, **kwargs):
         update_env("ROCPROF_PC_SAMPLING_METHOD", args.pc_sampling_method)
         update_env("ROCPROF_PC_SAMPLING_INTERVAL", args.pc_sampling_interval)
 
+    if args.disable_signal_handlers is not None:
+        update_env("ROCPROF_SIGNAL_HANDLERS", not args.disable_signal_handlers)
+
+    if args.minimum_output_data:
+        update_env("ROCPROF_MINIMUM_OUTPUT_BYTES", args.minimum_output_data * 1024)
+
     if args.advanced_thread_trace:
 
         def int_auto(num_str):
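To make the option-to-environment mapping in the launcher changes above concrete, here is a minimal standalone sketch. It assumes a plain `argparse` parser and returns a dictionary instead of calling the launcher's `update_env`; `str_to_bool` and `build_env` are hypothetical helpers, and the real `add_parser_bool_argument` helper may parse booleans differently.

```python
import argparse


def str_to_bool(value):
    """Hypothetical boolean parser; the launcher's own helper is not shown in the diff."""
    lowered = value.lower()
    if lowered in ("1", "true", "yes", "on"):
        return True
    if lowered in ("0", "false", "no", "off"):
        return False
    raise argparse.ArgumentTypeError(f"invalid boolean value: {value!r}")


def build_env(argv):
    """Translate the two new options into the environment variables named in the diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--disable-signal-handlers", nargs="?", const=True, default=None, type=str_to_bool
    )
    parser.add_argument("--minimum-output-data", type=int, default=None, metavar="KB")
    args = parser.parse_args(argv)

    env = {}
    if args.disable_signal_handlers is not None:
        # the option disables, the environment variable enables, hence the negation
        env["ROCPROF_SIGNAL_HANDLERS"] = not args.disable_signal_handlers
    if args.minimum_output_data:
        # the option is given in KB, the tool reads bytes
        env["ROCPROF_MINIMUM_OUTPUT_BYTES"] = args.minimum_output_data * 1024
    return env


env = build_env(["--disable-signal-handlers", "--minimum-output-data", "4"])
```

Note the two conversions: the boolean is inverted because the flag and the variable have opposite polarity, and the size is multiplied by 1024 because the flag is in KB while the variable is in bytes.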
@@ -168,6 +168,10 @@ The following table lists the commonly used ``rocprofv3`` command-line options c
    * - Other
      - ``--preload`` [PRELOAD ...]
      - Specifies libraries to prepend to ``LD_PRELOAD``. It is useful for sanitizer libraries.
+     - ``--minimum-output-data``
+     - Output files are generated only if the output data size is greater than the minimum output data size. It can be used to control the generation of output files so that users don't receive empty files. The input is in KB units.
+     - ``--disable-signal-handlers``
+     - Disables the signal handlers in the rocprofv3 tool, so that the rocprofv3 signal handler is no longer prioritized over signal handlers installed by the application. When ``--disable-signal-handlers`` is set to true and the application has installed its own signal handler for SIGSEGV or a similar signal, the application's signal handler is used instead of the rocprofv3 signal handler. Note: glog still installs signal handlers which provide backtraces.
 
 To see exhaustive list of ``rocprofv3`` options:
 
@@ -702,6 +706,8 @@ Here is the input schema (properties) of JSON or YAML input files:
 - **hsa_finalize_trace** *(boolean)*
 - **hsa_image_trace** *(boolean)*
 - **sys_trace** *(boolean)*
+- **minimum-output-data** *(integer)*
+- **disable-signal-handlers** *(boolean)*
 - **mangled_kernels** *(boolean)*
 - **truncate_kernels** *(boolean)*
 - **output_file** *(string)*
@@ -802,6 +808,8 @@ Here is the input schema (properties) of JSON or YAML input files:
 - **list_avail** *(boolean)*
 - **log_level** *(string)*
 - **preload** *(array)*
+- **minimum-output-data** *(integer)*
+- **disable-signal-handlers** *(boolean)*
 - **pc_sampling_unit** *(string)*
 - **pc_sampling_method** *(string)*
 - **pc_sampling_interval** *(integer)*
@@ -166,6 +166,14 @@
         "type": "array",
         "description": "Libraries to prepend to LD_PRELOAD (usually for sanitizers)"
     },
+    "minimum-output-data":{
+        "type": "integer",
+        "description": "Minimum output data. Output files are generated only if the output data size is greater than the minimum output data size. It can be used to control the generation of output files so that users don't receive empty files. The input is in KB units"
+    },
+    "disable-signal-handlers":{
+        "type": "boolean",
+        "description": "Disables the signal handlers in the rocprofv3 tool. When --disable-signal-handlers is set to true and the application has installed its own signal handler for SIGSEGV or a similar signal, the application's signal handler is used instead of the rocprofv3 signal handler. Note: glog still installs signal handlers which provide backtraces"
+    },
     "pc_sampling_unit": {
         "type": "string",
         "description": "pc sampling unit"
@@ -300,7 +300,7 @@ struct ring_buffer : private base::ring_buffer
     ~ring_buffer() = default;
 
     explicit ring_buffer(size_t _size)
-    : base_type{_size * sizeof(Tp)}
+    : base_type{_size * aligned_data_size()}
     {}
 
     ring_buffer(const ring_buffer&);
@@ -313,16 +313,19 @@ struct ring_buffer : private base::ring_buffer
     bool is_initialized() const { return base_type::is_initialized(); }
 
     /// Get the total number of Tp instances supported
-    size_t capacity() const { return (base_type::capacity()) / sizeof(Tp); }
+    size_t capacity() const { return (base_type::capacity()) / aligned_data_size(); }
 
     /// Creates new ring buffer.
-    void init(size_t _size) { base_type::init(_size * sizeof(Tp)); }
+    void init(size_t _size) { base_type::init(_size * aligned_data_size()); }
 
     /// Destroy ring buffer.
     void destroy() { base_type::destroy(); }
 
-    /// Write data to buffer.
-    size_t data_size() const { return sizeof(Tp); }
+    /// Size of the data type
+    static constexpr size_t data_size() { return sizeof(Tp); }
+
+    /// Size of the data type + padding
+    static constexpr size_t aligned_data_size();
 
     /// Write data to buffer. Return pointer to location of write
     Tp* write(Tp* in) { return base_type::write<Tp>(in).second; }
@@ -337,16 +340,16 @@ struct ring_buffer : private base::ring_buffer
     Tp* retrieve() { return base_type::retrieve<Tp>(); }
 
     /// Returns number of Tp instances currently held by the buffer.
-    size_t count() const { return (base_type::count()) / sizeof(Tp); }
+    size_t count() const { return (base_type::count()) / aligned_data_size(); }
 
     /// Returns how many Tp instances are available in the buffer.
-    size_t free() const { return (base_type::free()) / sizeof(Tp); }
+    size_t free() const { return (base_type::free()) / aligned_data_size(); }
 
     /// Returns if the buffer is empty.
     bool is_empty() const { return base_type::is_empty(); }
 
     /// Returns if the buffer is full.
-    bool is_full() const { return (base_type::free() < sizeof(Tp)); }
+    bool is_full() const { return (base_type::free() < aligned_data_size()); }
 
     bool clear() { return base_type::clear(); }
 
@@ -365,6 +368,7 @@ struct ring_buffer : private base::ring_buffer
     std::ostringstream ss{};
     size_t _w = std::log10(base_type::capacity()) + 1;
     ss << std::boolalpha << std::right << "data size: " << std::setw(_w) << data_size()
+       << "B, aligned data size: " << std::setw(_w) << aligned_data_size()
        << " B, is_initialized: " << std::setw(5) << is_initialized()
        << ", is_empty: " << std::setw(5) << is_empty() << ", is_full: " << std::setw(5)
        << is_full() << ", capacity: " << std::setw(_w) << capacity()
@@ -392,6 +396,22 @@ ring_buffer<Tp>::get_items_per_page()
 }
 //
 template <typename Tp>
+constexpr size_t
+ring_buffer<Tp>::aligned_data_size()
+{
+    constexpr auto _data_size = sizeof(Tp);
+    constexpr auto _data_align = alignof(Tp);
+    constexpr auto _align_modulo = _data_size % _data_align;
+    constexpr auto _result =
+        (_align_modulo == 0) ? _data_size : (_data_size + (_data_align - _align_modulo));
+
+    static_assert(_result >= _data_size && _result < (_data_size + _data_align),
+                  "should neither be < sizeof(Tp) nor > sizeof(Tp) + alignof(Tp)");
+
+    return _result;
+}
+//
+template <typename Tp>
 ring_buffer<Tp>::ring_buffer(const ring_buffer<Tp>& rhs)
 : base_type{rhs}
 {
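The padding arithmetic in `aligned_data_size()` above rounds a size up to the next multiple of the alignment. The Python sketch below mirrors that formula with plain integers so the rounding can be checked by hand; the function takes the size and alignment as arguments (an assumption made for illustration, since the C++ version derives them from `sizeof(Tp)` and `alignof(Tp)`).

```python
def aligned_data_size(data_size, data_align):
    """Round data_size up to the next multiple of data_align, as in the C++ formula."""
    align_modulo = data_size % data_align
    if align_modulo == 0:
        return data_size
    result = data_size + (data_align - align_modulo)
    # same bounds as the static_assert in the diff
    assert data_size <= result < data_size + data_align
    return result


examples = [aligned_data_size(12, 8), aligned_data_size(16, 8), aligned_data_size(1, 4)]
```

For the three inputs above the results are 16, 16, and 4: a 12-byte value with 8-byte alignment gains 4 bytes of padding, while a size that is already a multiple of the alignment is returned unchanged.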
@@ -63,6 +63,7 @@ struct buffered_output
     void clear();
     void destroy();
 
+    uint64_t get_num_bytes() const;
     generator<Tp> get_generator() const { return generator<Tp>{get_tmp_file_buffer<Tp>(DomainT)}; }
     std::deque<Tp> load_all();
 
@@ -130,11 +131,28 @@ buffered_output<Tp, DomainT>::destroy()
     if(!enabled) return;
 
     clear();
-    auto*& filebuf = get_tmp_file_buffer<type>(buffer_type_v);
-    file_buffer<type>* tmp = nullptr;
-    std::swap(filebuf, tmp);
-    tmp->buffer.destroy();
-    delete tmp;
+    auto*& filebuf = get_tmp_file_buffer<type>(buffer_type_v);
+    if(filebuf)
+    {
+        file_buffer<type>* tmp = nullptr;
+        std::swap(filebuf, tmp);
+        tmp->buffer.destroy();
+        if(tmp->file)
+        {
+            tmp->file.close();
+            tmp->file.remove();
+        }
+        delete tmp;
+    }
 }
 
+template <typename Tp, domain_type DomainT>
+uint64_t
+buffered_output<Tp, DomainT>::get_num_bytes() const
+{
+    if(auto*& filebuf = get_tmp_file_buffer<type>(buffer_type_v); filebuf) return filebuf->nbytes;
+
+    return 0;
+}
+
 using hip_buffered_output_t =
@@ -132,13 +132,28 @@ tmp_file::remove()
     if(fs::exists(filename))
     {
         ROCP_INFO << "removing temporary file: '" << filename << "'...";
-        auto _ret = ::remove(filename.c_str());
-        return (_ret == 0);
+        auto _ec = std::error_code{};
+        auto _ret = fs::remove(filename, _ec);
+
+        if(_ec)
+            ROCP_WARNING << fmt::format(
+                "Error removing temporary file '{}' :: {}", filename, _ec.message());
+        else if(!_ret)
+            ROCP_WARNING << fmt::format("Error removing temporary file '{}' :: Unknown error",
+                                        filename);
+
+        return _ret;
     }
 
     return true;
 }
 
+bool
+tmp_file::exists() const
+{
+    return fs::exists(filename);
+}
+
 tmp_file::operator bool() const
 {
     return (stream.is_open() && stream.good()) || (file != nullptr && fd > 0);
@@ -44,6 +44,7 @@ struct tmp_file
     bool flush();
     bool close();
     bool remove();
+    bool exists() const;
 
     explicit operator bool() const;
 
@@ -71,6 +71,7 @@ struct file_buffer
     file_buffer& operator=(file_buffer&&) noexcept = default;
 
     domain_type domain = {};
+    uint64_t nbytes = 0;
     ring_buffer_t<Tp> buffer = {};
     tmp_file file;
 };
@@ -110,7 +111,13 @@ offload_buffer(domain_type type)
     ROCP_CI_LOG_IF(WARNING, _fs.tellg() != _fs.tellp())  // this should always be true
         << "tellg=" << _fs.tellg() << ", tellp=" << _fs.tellp();
 
+    auto _nbytes = (filebuf->buffer.count() * filebuf->buffer.data_size());
+
+    ROCP_TRACE << fmt::format(
+        "offloading {} B from {} buffer to tmp file", _nbytes, get_domain_column_name(type));
+
     filebuf->file.file_pos.emplace(_fs.tellp());
+    filebuf->nbytes += _nbytes;
     filebuf->buffer.save(_fs);
     filebuf->buffer.clear();
 
@@ -206,10 +213,13 @@ read_tmp_file(domain_type type)
         return;
     }
 
-    auto _lk = std::lock_guard<std::mutex>{filebuf->file.file_mutex};
-    auto& _fs = filebuf->file.stream;
-    if(_fs.is_open()) _fs.close();
-    filebuf->file.open(std::ios::binary | std::ios::in);
+    auto _lk = std::lock_guard<std::mutex>{filebuf->file.file_mutex};
+    if(filebuf->file.exists())
+    {
+        auto& _fs = filebuf->file.stream;
+        if(_fs.is_open()) _fs.close();
+        filebuf->file.open(std::ios::binary | std::ios::in);
+    }
 }
 }  // namespace tool
 }  // namespace rocprofiler
@@ -117,8 +117,9 @@ struct config : output_config
     bool pc_sampling_host_trap = false;
     bool advanced_thread_trace = get_env("ROCPROF_ADVANCED_THREAD_TRACE", false);
     bool pc_sampling_stochastic = false;
-    size_t pc_sampling_interval = get_env("ROCPROF_PC_SAMPLING_INTERVAL", 1);
     bool att_serialize_all = get_env("ROCPROF_ATT_PARAM_SERIALIZE_ALL", false);
+    bool enable_signal_handlers = get_env("ROCPROF_SIGNAL_HANDLERS", true);
+    size_t pc_sampling_interval = get_env("ROCPROF_PC_SAMPLING_INTERVAL", 1);
     rocprofiler_pc_sampling_method_t pc_sampling_method_value = ROCPROFILER_PC_SAMPLING_METHOD_NONE;
     rocprofiler_pc_sampling_unit_t pc_sampling_unit_value = ROCPROFILER_PC_SAMPLING_UNIT_NONE;
 
@@ -145,6 +146,7 @@ struct config : output_config
     std::queue<CollectionPeriod> collection_periods = {};
     uint64_t counter_groups_random_seed = get_env("ROCPROF_COUNTER_GROUPS_RANDOM_SEED", 0);
     uint64_t counter_groups_interval = get_env("ROCPROF_COUNTER_GROUPS_INTERVAL", 1);
+    uint64_t minimum_output_bytes = get_env("ROCPROF_MINIMUM_OUTPUT_BYTES", 0);
 
     template <typename ArchiveT>
     void save(ArchiveT&) const;
@@ -203,6 +205,8 @@ config::save(ArchiveT& ar) const
     CFG_SERIALIZE_MEMBER(kernel_filter_range);
     CFG_SERIALIZE_MEMBER(demangle);
     CFG_SERIALIZE_MEMBER(truncate);
+    CFG_SERIALIZE_MEMBER(minimum_output_bytes);
+    CFG_SERIALIZE_MEMBER(enable_signal_handlers);
 
     CFG_SERIALIZE_MEMBER(pc_sampling_method);
     CFG_SERIALIZE_MEMBER(pc_sampling_unit);
@@ -26,6 +26,7 @@
 #define ROCPROFV3_INTERNAL_API __attribute__((visibility("internal")));
 
 #include <dlfcn.h>
+#include <signal.h>
 #include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -43,10 +44,6 @@ typedef int (*start_main_t)(int (*)(int, char**, char**),
                             void (*)(void),
                             void (*)(void),
                             void*);
-
-//
-// local function declarations
-//
 int
 rocprofv3_libc_start_main(int (*)(int, char**, char**),
                           int,
@@ -65,15 +62,28 @@ __libc_start_main(int (*)(int, char**, char**),
                   void (*)(void),
                   void*) ROCPROFV3_PUBLIC_API;
 
-//
-// external function declarations
-//
+sighandler_t
+signal(int signum, sighandler_t handler) ROCPROFV3_PUBLIC_API;
+
+int
+sigaction(int signum,
+          const struct sigaction* restrict act,
+          struct sigaction* restrict oldact) ROCPROFV3_PUBLIC_API;
+
 extern void
 rocprofv3_set_main(main_func_t main_func) ROCPROFV3_INTERNAL_API;
 
 extern int
 rocprofv3_main(int argc, char** argv, char** envp) ROCPROFV3_INTERNAL_API;
 
+extern sighandler_t
+rocprofv3_signal(int signum, sighandler_t handler) ROCPROFV3_INTERNAL_API;
+
+extern int
+rocprofv3_sigaction(int signum,
+                    const struct sigaction* restrict act,
+                    struct sigaction* restrict oldact) ROCPROFV3_INTERNAL_API;
+
 int
 rocprofv3_libc_start_main(int (*_main)(int, char**, char**),
                           int _argc,
@@ -130,6 +140,18 @@ rocprofv3_libc_start_main(int (*_main)(int, char**, char**),
     return -1;
 }
 
+sighandler_t
+signal(int signum, sighandler_t handler)
+{
+    return rocprofv3_signal(signum, handler);
+}
+
+int
+sigaction(int signum, const struct sigaction* restrict act, struct sigaction* restrict oldact)
+{
+    return rocprofv3_sigaction(signum, act, oldact);
+}
+
 int
 __libc_start_main(int (*_main)(int, char**, char**),
                   int _argc,
@@ -20,6 +20,9 @@
 // OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 // SOFTWARE.
 
+#define _GNU_SOURCE 1
+#define _DEFAULT_SOURCE 1
+
 #include "config.hpp"
 #include "helper.hpp"
 #include "stream_stack.hpp"
@@ -65,7 +68,6 @@
 
 #include <fmt/core.h>
 
-#include <sys/mman.h>
 #include <unistd.h>
 #include <algorithm>
 #include <cassert>
@@ -85,6 +87,11 @@
 #include <unordered_set>
 #include <vector>
 
+#include <dlfcn.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+
 #if defined(CODECOV) && CODECOV > 0
 extern "C" {
 extern void
@@ -95,8 +102,22 @@ __gcov_dump(void);
 namespace common = ::rocprofiler::common;
 namespace tool = ::rocprofiler::tool;
 
+extern "C" {
+void
+rocprofv3_error_signal_handler(int signo, siginfo_t*, void*);
+}
+
 namespace
 {
+using sigaction_t = struct sigaction;
+using signal_func_t = sighandler_t (*)(int signum, sighandler_t handler);
+using sigaction_func_t = int (*)(int signum,
+                                 const struct sigaction* __restrict__ act,
+                                 struct sigaction* __restrict__ oldact);
+
+constexpr auto rocprofv3_num_signals = NSIG;
+constexpr auto rocprofv3_handled_signals = std::array<int, 4>{SIGINT, SIGQUIT, SIGABRT, SIGTERM};
+
 auto destructors = new std::vector<std::function<void()>>{};
 
 template <typename Tp>
@@ -133,6 +154,29 @@ add_destructor(Tp*& ptr)
 
 #undef ADD_DESTRUCTOR
 
+struct chained_siginfo
+{
+    int signo = 0;
+    sighandler_t handler = nullptr;
+    std::optional<sigaction_t> action = {};
+};
+
+auto&
+get_chained_signals()
+{
+    using data_type = std::array<std::optional<chained_siginfo>, rocprofv3_num_signals>;
+    static auto*& _v = common::static_object<data_type>::construct();
+    return *CHECK_NOTNULL(_v);
+}
+
+bool
+is_handled_signal(int signum)
+{
+    for(auto itr : rocprofv3_handled_signals)
+        if(itr == signum) return true;
+    return false;
+}
+
 struct buffer_ids
 {
     rocprofiler_buffer_id_t hsa_api_trace = {};
@@ -1357,9 +1401,13 @@ rocprofiler_client_id_t* client_identifier = nullptr;
 void
 initialize_logging()
 {
-    auto logging_cfg = rocprofiler::common::logging_config{.install_failure_handler = true};
-    common::init_logging("ROCPROF", logging_cfg);
-    FLAGS_colorlogtostderr = true;
+    static auto _once = std::atomic<uint64_t>{0};
+    if(_once++ == 0)
+    {
+        auto logging_cfg = rocprofiler::common::logging_config{.install_failure_handler = true};
+        common::init_logging("ROCPROF", logging_cfg);
+        FLAGS_colorlogtostderr = true;
+    }
 }
 
 void
@@ -1379,6 +1427,34 @@ initialize_rocprofv3()
         << "nullptr to client finalizer!";  // exception for listing metrics
 }
 
+void
+initialize_signal_handler(sigaction_func_t sigaction_func)
+{
+    if(sigaction_func == nullptr) sigaction_func = &sigaction;
+
+    struct sigaction sig_act = {};
+    sigemptyset(&sig_act.sa_mask);
+    sig_act.sa_flags = (SA_SIGINFO | SA_RESETHAND | SA_NOCLDSTOP);
+    sig_act.sa_sigaction = &rocprofv3_error_signal_handler;
+    for(auto signal_v : rocprofv3_handled_signals)
+    {
+        if(get_chained_signals().at(signal_v))
+        {
+            ROCP_INFO << "Skipping install of signal handler for signal " << signal_v
+                      << " (already wrapped)";
+            continue;
+        }
+
+        ROCP_INFO << "Installing signal handler for signal " << signal_v;
+        if(sigaction_func(signal_v, &sig_act, nullptr) != 0)
+        {
+            auto _errno_v = errno;
+            ROCP_ERROR << "error setting signal handler for " << signal_v
+                       << " :: " << strerror(_errno_v);
+        }
+    }
+}
+
 void
 finalize_rocprofv3(std::string_view context)
 {
@@ -1467,8 +1543,8 @@ tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
 {
     client_finalizer = fini_func;
 
-    constexpr uint64_t buffer_size = 32 * common::units::KiB;
-    constexpr uint64_t buffer_watermark = 31 * common::units::KiB;
+    const uint64_t buffer_size = 16 * common::units::get_page_size();
+    const uint64_t buffer_watermark = 15 * common::units::get_page_size();
 
     tool_metadata->init(tool::metadata::inprocess{});
 
@@ -1881,13 +1957,23 @@ tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
 using stats_data_t = tool::stats_data_t;
 using stats_entry_t = tool::stats_entry_t;
 using domain_stats_vec_t = tool::domain_stats_vec_t;
+using cleanup_vec_t = std::vector<std::function<void()>>;
+
+struct output_data
+{
+    uint64_t num_output = 0;
+    uint64_t num_bytes = 0;
+};
 
 template <typename Tp, domain_type DomainT>
 void
 generate_output(tool::buffered_output<Tp, DomainT>& output_v,
-                uint64_t& num_output_v,
-                domain_stats_vec_t& contributions_v)
+                output_data& output_data_v,
+                domain_stats_vec_t& contributions_v,
+                cleanup_vec_t& cleanups_v)
 {
+    cleanups_v.emplace_back([&output_v]() { output_v.destroy(); });
+
     if(!output_v) return;
 
     // opens temporary file and sets read position to beginning
@@ -1896,7 +1982,9 @@ generate_output(tool::buffered_output<Tp, DomainT>& output_v,
     if(output_v.get_generator().empty()) return;
 
     // if it has reached this point, the generator is not empty
-    num_output_v += 1;
+    auto _num_bytes = output_v.get_num_bytes();
+    output_data_v.num_output += 1;
+    output_data_v.num_bytes += _num_bytes;
 
     if(tool::get_config().stats || tool::get_config().summary_output)
     {
@@ -1909,7 +1997,7 @@ generate_output(tool::buffered_output<Tp, DomainT>& output_v,
         contributions_v.emplace_back(output_v.buffer_type_v, output_v.stats);
     }
 
-    if(tool::get_config().csv_output)
+    if(tool::get_config().csv_output && _num_bytes >= tool::get_config().minimum_output_bytes)
     {
         tool::generate_csv(
             tool::get_config(), *tool_metadata, output_v.get_generator(), output_v.stats);
@@ -1929,16 +2017,15 @@ tool_fini(void* /*tool_data*/)
     rocprofiler_stop_context(get_client_ctx());
     flush();
 
-    auto kernel_dispatch_with_stream_output =
-        rocprofiler::tool::kernel_dispatch_buffered_output_with_stream_t{
-            tool::get_config().kernel_trace};
+    auto kernel_dispatch_output = rocprofiler::tool::kernel_dispatch_buffered_output_with_stream_t{
+        tool::get_config().kernel_trace};
     auto hsa_output = tool::hsa_buffered_output_t{tool::get_config().hsa_core_api_trace ||
                                                   tool::get_config().hsa_amd_ext_api_trace ||
                                                   tool::get_config().hsa_image_ext_api_trace ||
                                                   tool::get_config().hsa_finalizer_ext_api_trace};
     auto hip_output = tool::hip_buffered_output_t{tool::get_config().hip_runtime_api_trace ||
                                                   tool::get_config().hip_compiler_api_trace};
-    auto memory_copy_output_with_stream_output =
+    auto memory_copy_output_output =
         tool::memory_copy_buffered_output_with_stream_t{tool::get_config().memory_copy_trace};
     auto marker_output = tool::marker_buffered_output_t{tool::get_config().marker_api_trace};
     auto counters_output =
@@ -1962,42 +2049,58 @@ tool_fini(void* /*tool_data*/)
     auto agents_output = CHECK_NOTNULL(tool_metadata)->agents;
     std::sort(agents_output.begin(), agents_output.end(), node_id_sort);
 
-    uint64_t num_output = 0;
-    auto contributions = domain_stats_vec_t{};
+    auto outdata = output_data{};
+    auto contributions = domain_stats_vec_t{};
+    auto cleanups = cleanup_vec_t{};
 
-    generate_output(kernel_dispatch_with_stream_output, num_output, contributions);
-    generate_output(hsa_output, num_output, contributions);
-    generate_output(hip_output, num_output, contributions);
-    generate_output(memory_copy_output_with_stream_output, num_output, contributions);
-    generate_output(memory_allocation_output, num_output, contributions);
-    generate_output(marker_output, num_output, contributions);
-    generate_output(rccl_output, num_output, contributions);
-    generate_output(counters_output, num_output, contributions);
-    generate_output(scratch_memory_output, num_output, contributions);
-    generate_output(rocdecode_output, num_output, contributions);
-    generate_output(pc_sampling_host_trap_output, num_output, contributions);
-    generate_output(rocjpeg_output, num_output, contributions);
-    generate_output(pc_sampling_stochastic_output, num_output, contributions);
+    auto run_cleanup = [&cleanups]() {
+        for(const auto& itr : cleanups)
+        {
+            if(itr) itr();
+        }
+        cleanups.clear();
+    };
+
+    auto _dtor = common::scope_destructor{run_cleanup};
+
+    generate_output(kernel_dispatch_output, outdata, contributions, cleanups);
+    generate_output(hsa_output, outdata, contributions, cleanups);
+    generate_output(hip_output, outdata, contributions, cleanups);
+    generate_output(memory_copy_output_output, outdata, contributions, cleanups);
+    generate_output(memory_allocation_output, outdata, contributions, cleanups);
+    generate_output(marker_output, outdata, contributions, cleanups);
+    generate_output(rccl_output, outdata, contributions, cleanups);
+    generate_output(counters_output, outdata, contributions, cleanups);
+    generate_output(scratch_memory_output, outdata, contributions, cleanups);
+    generate_output(rocdecode_output, outdata, contributions, cleanups);
+    generate_output(pc_sampling_host_trap_output, outdata, contributions, cleanups);
+    generate_output(rocjpeg_output, outdata, contributions, cleanups);
+    generate_output(pc_sampling_stochastic_output, outdata, contributions, cleanups);
 
     if(tool::get_config().advanced_thread_trace && !tool::get_config().att_capability.empty() &&
        !tool_metadata->att_filenames.empty())
     {
-        num_output += 1;
+        outdata.num_output += 1;
     }
 
-    ROCP_INFO << "Number of services generating output: " << num_output;
+    ROCP_INFO << fmt::format("Number of services generating output: {} ({} kB)",
+                             outdata.num_output,
+                             (outdata.num_bytes / 1024));
 
-    if(tool::get_config().csv_output && num_output > 0)
+    if(tool::get_config().csv_output && outdata.num_output > 0 &&
+       outdata.num_bytes >= tool::get_config().minimum_output_bytes)
     {
         tool::generate_csv(tool::get_config(), *tool_metadata, agents_output);
     }
 
-    if(tool::get_config().stats && tool::get_config().csv_output && num_output > 0)
+    if(tool::get_config().stats && tool::get_config().csv_output && outdata.num_output > 0 &&
+       outdata.num_bytes >= tool::get_config().minimum_output_bytes)
     {
         tool::generate_csv(tool::get_config(), *tool_metadata, contributions);
     }
 
-    if(tool::get_config().json_output && num_output > 0)
+    if(tool::get_config().json_output && outdata.num_output > 0 &&
+       outdata.num_bytes >= tool::get_config().minimum_output_bytes)
     {
         auto json_ar = tool::open_json(tool::get_config());
 
@@ -2009,8 +2112,8 @@ tool_fini(void* /*tool_data*/)
                          contributions,
                          hip_output.get_generator(),
                          hsa_output.get_generator(),
-                         kernel_dispatch_with_stream_output.get_generator(),
-                         memory_copy_output_with_stream_output.get_generator(),
+                         kernel_dispatch_output.get_generator(),
+                         memory_copy_output_output.get_generator(),
                          counters_output.get_generator(),
                          marker_output.get_generator(),
                          scratch_memory_output.get_generator(),
@@ -2025,15 +2128,16 @@ tool_fini(void* /*tool_data*/)
         tool::close_json(json_ar);
     }
 
-    if(tool::get_config().pftrace_output && num_output > 0)
+    if(tool::get_config().pftrace_output && outdata.num_output > 0 &&
+       outdata.num_bytes >= tool::get_config().minimum_output_bytes)
     {
         tool::write_perfetto(tool::get_config(),
                              *tool_metadata,
                              agents_output,
                              hip_output.get_generator(),
                              hsa_output.get_generator(),
-                             kernel_dispatch_with_stream_output.get_generator(),
-                             memory_copy_output_with_stream_output.get_generator(),
+                             kernel_dispatch_output.get_generator(),
+                             memory_copy_output_output.get_generator(),
                              counters_output.get_generator(),
                              marker_output.get_generator(),
                              scratch_memory_output.get_generator(),
@@ -2043,12 +2147,13 @@ tool_fini(void* /*tool_data*/)
                              rocjpeg_output.get_generator());
     }
 
-    if(tool::get_config().otf2_output && num_output > 0)
+    if(tool::get_config().otf2_output && outdata.num_output > 0 &&
+       outdata.num_bytes >= tool::get_config().minimum_output_bytes)
     {
         auto hip_elem_data = hip_output.load_all();
         auto hsa_elem_data = hsa_output.load_all();
-        auto kernel_dispatch_elem_data = kernel_dispatch_with_stream_output.load_all();
-        auto memory_copy_elem_data = memory_copy_output_with_stream_output.load_all();
+        auto kernel_dispatch_elem_data = kernel_dispatch_output.load_all();
+        auto memory_copy_elem_data = memory_copy_output_output.load_all();
         auto marker_elem_data = marker_output.load_all();
         auto scratch_memory_elem_data = scratch_memory_output.load_all();
         auto rccl_elem_data = rccl_output.load_all();
@@ -2072,7 +2177,8 @@ tool_fini(void* /*tool_data*/)
|
||||
&rocjpeg_elem_data);
|
||||
}
|
||||
|
||||
if(tool::get_config().summary_output && num_output > 0)
|
||||
if(tool::get_config().summary_output && outdata.num_output > 0 &&
|
||||
outdata.num_bytes >= tool::get_config().minimum_output_bytes)
|
||||
{
|
||||
tool::generate_stats(tool::get_config(), *tool_metadata, contributions);
|
||||
}
|
||||
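Reviewer note: the pftrace, OTF2, and summary guards in the hunks above now share one condition. A minimal sketch of that gate follows. `should_write` and its parameter names are hypothetical and only mirror the `num_output`, `num_bytes`, and `minimum_output_bytes` fields visible in the diff.

```python
def should_write(enabled: bool, num_output: int, num_bytes: int,
                 minimum_output_bytes: int) -> bool:
    """Mirror of the guard in the diff: the format must be enabled, at least
    one output buffer must hold data, and the byte total must reach the minimum."""
    return enabled and num_output > 0 and num_bytes >= minimum_output_bytes
```

With a threshold of 4096 bytes, a run that collected 4095 bytes would write nothing, while one that collected exactly 4096 bytes would.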
@@ -2121,22 +2227,7 @@ tool_fini(void* /*tool_data*/)
|
||||
}
|
||||
}
|
||||
|
||||
auto destroy_output = [](auto& _buffered_output_v) { _buffered_output_v.destroy(); };
|
||||
|
||||
destroy_output(kernel_dispatch_with_stream_output);
|
||||
destroy_output(hsa_output);
|
||||
destroy_output(hip_output);
|
||||
destroy_output(memory_copy_output_with_stream_output);
|
||||
destroy_output(memory_allocation_output);
|
||||
destroy_output(marker_output);
|
||||
destroy_output(counters_output);
|
||||
destroy_output(scratch_memory_output);
|
||||
destroy_output(rccl_output);
|
||||
destroy_output(counters_records_output);
|
||||
destroy_output(pc_sampling_host_trap_output);
|
||||
destroy_output(rocdecode_output);
|
||||
destroy_output(rocjpeg_output);
|
||||
destroy_output(pc_sampling_stochastic_output);
|
||||
run_cleanup();
|
||||
|
||||
if(kernel_rename_and_stream_display_pair_dtors != nullptr)
|
||||
{
|
||||
@@ -2200,21 +2291,304 @@ get_main_function()
|
||||
return user_main;
|
||||
}
|
||||
|
||||
bool signal_handler_exit = tool::get_env("ROCPROF_INTERNAL_TEST_SIGNAL_HANDLER_VIA_EXIT", false);
|
||||
signal_func_t&
|
||||
get_signal_function()
|
||||
{
|
||||
static signal_func_t user_signal = nullptr;
|
||||
return user_signal;
|
||||
}
|
||||
|
||||
sigaction_func_t&
|
||||
get_sigaction_function()
|
||||
{
|
||||
static sigaction_func_t user_sigaction = (sigaction_func_t) dlsym(RTLD_NEXT, "sigaction");
|
||||
return user_sigaction;
|
||||
}
|
||||
|
||||
bool signal_handler_exit =
|
||||
rocprofiler::tool::get_env("ROCPROF_INTERNAL_TEST_SIGNAL_HANDLER_VIA_EXIT", false);
|
||||
} // namespace
|
||||
|
||||
#define ROCPROFV3_INTERNAL_API __attribute__((visibility("internal")))
|
||||
|
||||
std::optional<int>
|
||||
wait_pid(pid_t _pid, int _opts = 0)
|
||||
{
|
||||
auto this_pid = getpid();
|
||||
auto this_ppid = getppid();
|
||||
auto this_tid = common::get_tid();
|
||||
auto this_func = std::string_view{__FUNCTION__};
|
||||
|
||||
ROCP_INFO << fmt::format("[PPID={}][PID={}][TID={}][{}] rocprofv3 waiting for child {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_pid);
|
||||
|
||||
int _status = 0;
|
||||
pid_t _pid_v = -1;
|
||||
_opts |= WUNTRACED;
|
||||
do
|
||||
{
|
||||
if((_opts & WNOHANG) > 0)
|
||||
{
|
||||
std::this_thread::yield();
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds{100});
|
||||
}
|
||||
_pid_v = waitpid(_pid, &_status, _opts);
|
||||
} while(_pid_v == 0);
|
||||
|
||||
if(_pid_v < 0) return std::nullopt;
|
||||
return _status;
|
||||
}
|
||||
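Reviewer note: the polling loop in `wait_pid` can be sketched in Python with `os.waitpid`. This is an illustration of the same pattern, not a port of the tool. The `poll_interval` parameter is an assumption added for the sketch; the diff hard-codes a 100 ms sleep.

```python
import os
import time


def wait_pid(pid: int, opts: int = 0, poll_interval: float = 0.1):
    """Return the raw wait status for `pid`, or None if waitpid fails."""
    opts |= os.WUNTRACED  # also report stopped children, as the diff does
    while True:
        if opts & os.WNOHANG:
            # Non-blocking mode: pause between polls instead of spinning.
            time.sleep(poll_interval)
        try:
            reaped, status = os.waitpid(pid, opts)
        except ChildProcessError:
            return None  # analogous to waitpid() returning -1
        if reaped != 0:  # 0 means the child exists but has not changed state
            return status
```

With `os.WNOHANG` set, `os.waitpid` returns a pid of 0 while the child is still running, which is what keeps the loop going.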
|
||||
extern "C" {
|
||||
void
|
||||
rocprofv3_set_main(main_func_t main_func) ROCPROFV3_INTERNAL_API;
|
||||
|
||||
void
|
||||
rocprofv3_error_signal_handler(int signo)
|
||||
int
|
||||
diagnose_status(pid_t _pid, int _status)
|
||||
{
|
||||
ROCP_WARNING << __FUNCTION__ << " caught signal " << signo << "...";
|
||||
auto this_pid = getpid();
|
||||
auto this_ppid = getppid();
|
||||
auto this_tid = common::get_tid();
|
||||
auto this_func = std::string_view{__FUNCTION__};
|
||||
|
||||
bool _normal_exit = (WIFEXITED(_status) > 0);
|
||||
bool _unhandled_signal = (WIFSIGNALED(_status) > 0);
|
||||
bool _core_dump = (WCOREDUMP(_status) > 0);
|
||||
bool _stopped = (WIFSTOPPED(_status) > 0);
|
||||
int _exit_status = WEXITSTATUS(_status);
|
||||
int _stop_signal = (_stopped) ? WSTOPSIG(_status) : 0;
|
||||
int _ec = (_unhandled_signal) ? WTERMSIG(_status) : 0;
|
||||
|
||||
ROCP_TRACE << fmt::format("[PPID={}][PID={}][TID={}][{}] diagnosing status for process {} :: "
|
||||
"status: {}, normal exit: {}, unhandled signal: {}, core dump: {}, "
|
||||
"stopped: {}, exit status: {}, stop signal: {}, exit code: {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_pid,
|
||||
_status,
|
||||
std::to_string(static_cast<int>(_normal_exit)),
|
||||
std::to_string(static_cast<int>(_unhandled_signal)),
|
||||
std::to_string(static_cast<int>(_core_dump)),
|
||||
std::to_string(static_cast<int>(_stopped)),
|
||||
_exit_status,
|
||||
_stop_signal,
|
||||
_ec);
|
||||
|
||||
if(!_normal_exit)
|
||||
{
|
||||
if(_ec == 0) _ec = EXIT_FAILURE;
|
||||
ROCP_INFO << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] process {} terminated abnormally. exit code: {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_pid,
|
||||
_ec);
|
||||
}
|
||||
|
||||
if(_stopped)
|
||||
{
|
||||
ROCP_INFO << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] process {} stopped with signal {}. exit code: {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_pid,
|
||||
_stop_signal,
|
||||
_ec);
|
||||
}
|
||||
|
||||
if(_core_dump)
|
||||
{
|
||||
ROCP_INFO << fmt::format("[PPID={}][PID={}][TID={}][{}] process {} terminated and "
|
||||
"produced a core dump. exit code: {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_pid,
|
||||
_ec);
|
||||
}
|
||||
|
||||
if(_unhandled_signal)
|
||||
{
|
||||
ROCP_INFO << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] process {} terminated because it received a signal "
|
||||
"({}) that was not handled. exit code: {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_pid,
|
||||
_ec,
|
||||
_ec);
|
||||
}
|
||||
|
||||
if(!_normal_exit && _exit_status > 0)
|
||||
{
|
||||
if(_exit_status == 127)
|
||||
{
|
||||
ROCP_INFO << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] execv in process {} failed. exit code: {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_pid,
|
||||
_ec);
|
||||
}
|
||||
else
|
||||
{
|
||||
ROCP_INFO << fmt::format("[PPID={}][PID={}][TID={}][{}] process {} terminated with "
|
||||
"a non-zero status. exit code: {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_pid,
|
||||
_ec);
|
||||
}
|
||||
}
|
||||
|
||||
return _ec;
|
||||
}
|
||||
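Reviewer note: stripped of logging, `diagnose_status` reduces to a small mapping from a raw wait status to a return code. The sketch below mirrors that mapping with Python's `os.WIF*` helpers. `EXIT_FAILURE = 1` is an assumption that matches the C macro on Linux.

```python
import os

EXIT_FAILURE = 1  # value of the C macro on Linux; an assumption of this sketch


def diagnose_status(status: int) -> int:
    """Map a raw waitpid() status to the code returned by the diff's logic."""
    normal_exit = os.WIFEXITED(status)
    unhandled_signal = os.WIFSIGNALED(status)
    # Start from the terminating signal number, if any.
    code = os.WTERMSIG(status) if unhandled_signal else 0
    # Any other abnormal outcome (for example, a stopped child) becomes a failure.
    if not normal_exit and code == 0:
        code = EXIT_FAILURE
    return code
```

As in the diff, a child that exits normally with a non-zero status still maps to 0 here; only `WEXITSTATUS` would reveal that status.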
|
||||
void
|
||||
rocprofv3_error_signal_handler(int signo, siginfo_t* info, void* ucontext)
|
||||
{
|
||||
auto this_pid = getpid();
|
||||
auto this_ppid = getppid();
|
||||
auto this_tid = common::get_tid();
|
||||
auto this_func = std::string_view{__FUNCTION__};
|
||||
|
||||
ROCP_WARNING << fmt::format("[PPID={}][PID={}][TID={}][{}] rocprofv3 caught signal {}...",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
signo);
|
||||
|
||||
static auto _once = std::once_flag{};
|
||||
std::call_once(_once, [&]() {
|
||||
auto get_children = [&this_pid]() {
|
||||
auto fname = fmt::format("/proc/{}/task/{}/children", this_pid, this_pid);
|
||||
auto ifs = std::ifstream{fname};
|
||||
auto children = std::vector<pid_t>{};
|
||||
while(ifs)
|
||||
{
|
||||
pid_t val = 0;
|
||||
ifs >> val;
|
||||
if(ifs && !ifs.eof() && val > 0) children.emplace_back(val);
|
||||
}
|
||||
return children;
|
||||
};
|
||||
|
||||
auto _children = get_children();
|
||||
ROCP_WARNING << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] rocprofv3 will wait for {} children to exit",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
_children.size());
|
||||
|
||||
// wait for children
|
||||
for(auto itr : _children)
|
||||
{
|
||||
auto status = wait_pid(itr, WUNTRACED | WNOHANG);
|
||||
if(status) diagnose_status(itr, status.value());
|
||||
}
|
||||
|
||||
ROCP_WARNING << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] rocprofv3 finalizing after signal {}...",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
signo);
|
||||
|
||||
finalize_rocprofv3(this_func);
|
||||
|
||||
ROCP_INFO << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] rocprofv3 finalizing after signal {}... complete",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
signo);
|
||||
|
||||
if(get_chained_signals().at(signo))
|
||||
{
|
||||
ROCP_INFO << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] rocprofv3 found chained signal for {}",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
signo);
|
||||
auto& _chained = *get_chained_signals().at(signo);
|
||||
if(_chained.action)
|
||||
{
|
||||
ROCP_TRACE << fmt::format("[PPID={}][PID={}][TID={}][{}] rocprofv3 found chained "
|
||||
"signal for {}... executing chained sigaction",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
signo);
|
||||
if((_chained.action->sa_flags & SA_SIGINFO) == SA_SIGINFO &&
|
||||
_chained.action->sa_sigaction)
|
||||
{
|
||||
ROCP_TRACE << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] rocprofv3 found chained "
|
||||
"signal for {}... executing chained sigaction (SIGINFO)",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
signo);
|
||||
_chained.action->sa_sigaction(signo, info, ucontext);
|
||||
}
|
||||
else if((_chained.action->sa_flags & SA_SIGINFO) != SA_SIGINFO &&
|
||||
_chained.action->sa_handler)
|
||||
{
|
||||
ROCP_TRACE << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] rocprofv3 found chained "
|
||||
"signal for {}... executing chained sigaction (HANDLER)",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
signo);
|
||||
_chained.action->sa_handler(signo);
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
if(_chained.handler)
|
||||
{
|
||||
ROCP_TRACE << fmt::format(
|
||||
"[PPID={}][PID={}][TID={}][{}] rocprofv3 found chained "
|
||||
"signal for {}... executing chained handler",
|
||||
this_ppid,
|
||||
this_pid,
|
||||
this_tid,
|
||||
this_func,
|
||||
signo);
|
||||
_chained.handler(signo);
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
finalize_rocprofv3(__FUNCTION__);
|
||||
// below is for testing purposes. re-raising the signal causes CTest to ignore WILL_FAIL ON
|
||||
if(signal_handler_exit) ::quick_exit(signo);
|
||||
::raise(signo);
|
||||
@@ -2223,6 +2597,14 @@ rocprofv3_error_signal_handler(int signo)
|
||||
int
|
||||
rocprofv3_main(int argc, char** argv, char** envp) ROCPROFV3_INTERNAL_API;
|
||||
|
||||
sighandler_t
|
||||
rocprofv3_signal(int signum, sighandler_t handler) ROCPROFV3_INTERNAL_API;
|
||||
|
||||
int
|
||||
rocprofv3_sigaction(int signum,
|
||||
const struct sigaction* __restrict__ act,
|
||||
struct sigaction* __restrict__ oldact) ROCPROFV3_INTERNAL_API;
|
||||
|
||||
rocprofiler_tool_configure_result_t*
|
||||
rocprofiler_configure(uint32_t version,
|
||||
const char* runtime_version,
|
||||
@@ -2283,26 +2665,79 @@ rocprofv3_set_main(main_func_t main_func)
|
||||
get_main_function() = main_func;
|
||||
}
|
||||
|
||||
#define LOG_FUNCTION_ENTRY(MSG, ...) \
|
||||
{ \
|
||||
ROCP_INFO << fmt::format("[PPID={}][PID={}][TID={}][rocprofv3] {}" MSG, \
|
||||
getppid(), \
|
||||
getpid(), \
|
||||
gettid(), \
|
||||
__FUNCTION__, \
|
||||
__VA_ARGS__); \
|
||||
}
|
||||
|
||||
sighandler_t
|
||||
rocprofv3_signal(int signum, sighandler_t handler)
|
||||
{
|
||||
static auto _once = std::once_flag{};
|
||||
std::call_once(_once,
|
||||
[]() { get_signal_function() = (signal_func_t) dlsym(RTLD_NEXT, "signal"); });
|
||||
|
||||
if(!is_handled_signal(signum) || !tool::get_config().enable_signal_handlers)
|
||||
return CHECK_NOTNULL(get_signal_function())(signum, handler);
|
||||
|
||||
get_chained_signals().at(signum) = chained_siginfo{signum, handler, std::nullopt};
|
||||
|
||||
return get_signal_function()(
|
||||
signum, [](int signum_v) { rocprofv3_error_signal_handler(signum_v, nullptr, nullptr); });
|
||||
}
|
||||
|
||||
int
|
||||
rocprofv3_sigaction(int signum,
|
||||
const struct sigaction* __restrict__ act,
|
||||
struct sigaction* __restrict__ oldact)
|
||||
{
|
||||
static auto _once = std::once_flag{};
|
||||
std::call_once(_once, []() {
|
||||
get_sigaction_function() = (sigaction_func_t) dlsym(RTLD_NEXT, "sigaction");
|
||||
});
|
||||
|
||||
if(!is_handled_signal(signum) || !act || !tool::get_config().enable_signal_handlers)
|
||||
return CHECK_NOTNULL(get_sigaction_function())(signum, act, oldact);
|
||||
|
||||
get_chained_signals().at(signum) = chained_siginfo{signum, nullptr, *act};
|
||||
|
||||
struct sigaction _upd_act = *act;
|
||||
_upd_act.sa_flags |= (SA_SIGINFO | SA_RESETHAND | SA_NOCLDSTOP);
|
||||
_upd_act.sa_sigaction = &rocprofv3_error_signal_handler;
|
||||
|
||||
return get_sigaction_function()(signum, &_upd_act, oldact);
|
||||
}
|
||||
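Reviewer note: the two wrappers above record the application's handler and keep the tool's handler installed, so the tool can finalize before forwarding the signal. A minimal Python sketch of that chaining idea follows. All names in it (`chained_handlers`, `events`, `tool_handler`, `wrapped_signal`) are hypothetical stand-ins, and it omits the `sigaction` flags handling shown in the diff.

```python
import signal

# Hypothetical stand-ins for illustration; none of these names are rocprofv3 APIs.
chained_handlers = {}  # signum -> handler the application tried to install
events = []            # records what ran, in order


def tool_handler(signum, frame):
    """Tool-owned handler: finalize once, then forward to the application's handler."""
    if "finalized" not in events:
        events.append("finalized")  # stands in for the tool's finalization step
    app_handler = chained_handlers.get(signum)
    if callable(app_handler):  # skips SIG_DFL / SIG_IGN, which are not callable
        app_handler(signum, frame)


def wrapped_signal(signum, handler):
    """Interposed signal(): remember the application's handler, install the tool's."""
    chained_handlers[signum] = handler
    return signal.signal(signum, tool_handler)
```

An application calling `wrapped_signal(signal.SIGUSR1, handler)` would still see its handler run, but only after the tool's finalization step has been recorded.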
|
||||
int
|
||||
rocprofv3_main(int argc, char** argv, char** envp)
|
||||
{
|
||||
auto convert_to_vec = [](char** inp) {
|
||||
auto _data = std::vector<std::string_view>{};
|
||||
size_t n = 0;
|
||||
const char* p = nullptr;
|
||||
do
|
||||
{
|
||||
p = inp[n++];
|
||||
if(p != nullptr) _data.emplace_back(p);
|
||||
} while(p != nullptr);
|
||||
return _data;
|
||||
};
|
||||
|
||||
auto _argv = convert_to_vec(argv);
|
||||
// auto _envp = convert_to_vec(envp);
|
||||
|
||||
LOG_FUNCTION_ENTRY("({}, '{}', ...)", argc, fmt::join(_argv.begin(), _argv.end(), " "));
|
||||
|
||||
initialize_logging();
|
||||
|
||||
initialize_rocprofv3();
|
||||
|
||||
struct sigaction sig_act = {};
|
||||
sigemptyset(&sig_act.sa_mask);
|
||||
sig_act.sa_flags = SA_RESETHAND | SA_NODEFER;
|
||||
sig_act.sa_handler = &rocprofv3_error_signal_handler;
|
||||
for(auto signal_v : {SIGTERM, SIGSEGV, SIGINT, SIGILL, SIGABRT, SIGFPE})
|
||||
{
|
||||
if(sigaction(signal_v, &sig_act, nullptr) != 0)
|
||||
{
|
||||
auto _errno_v = errno;
|
||||
ROCP_ERROR << "error setting signal handler for " << signal_v
|
||||
<< " :: " << strerror(_errno_v);
|
||||
}
|
||||
}
|
||||
initialize_signal_handler(get_sigaction_function());
|
||||
|
||||
ROCP_INFO << "rocprofv3: main function wrapper will be invoked...";
|
||||
|
||||
|
||||
@@ -274,6 +274,14 @@ GetSurfaceStrideInternal(rocDecVideoSurfaceFormat surface_format,
|
||||
*pitch = align(width, 128) * 2;
|
||||
*vstride = align(height, 16);
|
||||
break;
|
||||
case rocDecVideoSurfaceFormat_YUV422:
|
||||
*pitch = align(width, 256);
|
||||
*vstride = align(height, 16);
|
||||
break;
|
||||
case rocDecVideoSurfaceFormat_YUV422_16Bit:
|
||||
*pitch = align(width, 128) * 2;
|
||||
*vstride = align(height, 16);
|
||||
break;
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
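Reviewer note: the two new cases compute pitch and vertical stride from aligned dimensions. The sketch below reproduces that arithmetic, assuming `align` rounds up to the next multiple of its second argument (the helper's definition is not part of this diff). `yuv422_stride` is a hypothetical name.

```python
def align(value: int, alignment: int) -> int:
    """Round `value` up to the next multiple of `alignment` (assumed semantics)."""
    return (value + alignment - 1) // alignment * alignment


def yuv422_stride(width: int, height: int, sixteen_bit: bool = False):
    """Return (pitch, vstride) using the constants from the two new cases."""
    pitch = align(width, 128) * 2 if sixteen_bit else align(width, 256)
    vstride = align(height, 16)
    return pitch, vstride
```

Under that assumption, a 1920x1080 surface gives a pitch of 2048 and a vstride of 1088 for the 8-bit case, and a pitch of 3840 for the 16-bit case.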
@@ -94,8 +94,11 @@ GetChromaPlaneCount(rocDecVideoSurfaceFormat surface_format)
|
||||
{
|
||||
case rocDecVideoSurfaceFormat_NV12:
|
||||
case rocDecVideoSurfaceFormat_P016: num_planes = 1; break;
|
||||
// The planar YUV formats below store chroma in 2 separate planes (U and V).
|
||||
case rocDecVideoSurfaceFormat_YUV444:
|
||||
case rocDecVideoSurfaceFormat_YUV444_16Bit: num_planes = 2; break;
|
||||
case rocDecVideoSurfaceFormat_YUV422:
|
||||
case rocDecVideoSurfaceFormat_YUV422_16Bit:
|
||||
case rocDecVideoSurfaceFormat_YUV420:
|
||||
case rocDecVideoSurfaceFormat_YUV420_16Bit: num_planes = 2; break;
|
||||
}
|
||||
@@ -109,10 +112,17 @@ GetChromaHeightFactor(rocDecVideoSurfaceFormat surface_format)
|
||||
float factor = 0.5;
|
||||
switch(surface_format)
|
||||
{
|
||||
// NV12 and P016 are 4:2:0 formats: chroma has 1/2 vertical, 1/2 horizontal resolution
|
||||
// (U and V samples are interleaved in a single chroma plane)
|
||||
case rocDecVideoSurfaceFormat_NV12:
|
||||
case rocDecVideoSurfaceFormat_P016:
|
||||
// 4:2:0 subsampling has 1/2 vertical resolution, 1/2 horizontal resolution
|
||||
case rocDecVideoSurfaceFormat_YUV420:
|
||||
case rocDecVideoSurfaceFormat_YUV420_16Bit: factor = 0.5; break;
|
||||
// 4:2:2 subsampling has full vertical resolution, 1/2 horizontal resolution
|
||||
case rocDecVideoSurfaceFormat_YUV422:
|
||||
case rocDecVideoSurfaceFormat_YUV422_16Bit:
|
||||
// 4:4:4 (no chroma subsampling) has full vertical and horizontal resolution
|
||||
case rocDecVideoSurfaceFormat_YUV444:
|
||||
case rocDecVideoSurfaceFormat_YUV444_16Bit: factor = 1.0; break;
|
||||
}
|
||||
|
||||
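Reviewer note: the plane counts and height factors from the two switch statements above can be tabulated as follows. The table values are copied from the diff. The dictionary, the shortened format names, and `chroma_plane_height` are hypothetical and exist only for this sketch.

```python
# format name -> (number of chroma planes, chroma height factor), as in the diff
CHROMA_LAYOUT = {
    "NV12": (1, 0.5), "P016": (1, 0.5),
    "YUV420": (2, 0.5), "YUV420_16Bit": (2, 0.5),
    "YUV422": (2, 1.0), "YUV422_16Bit": (2, 1.0),
    "YUV444": (2, 1.0), "YUV444_16Bit": (2, 1.0),
}


def chroma_plane_height(surface_format: str, luma_height: int) -> int:
    """Height in rows of one chroma plane for the given luma height."""
    _planes, factor = CHROMA_LAYOUT[surface_format]
    return int(luma_height * factor)
```

For a 1080-row frame, each chroma plane has 540 rows in the 4:2:0 formats and 1080 rows in the 4:2:2 and 4:4:4 formats.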
@@ -100,7 +100,7 @@ if(NOT TARGET rocprofiler-sdk::rocprofiler-sdk-perfetto)
|
||||
# perfetto
|
||||
fetchcontent_declare(
|
||||
perfetto
|
||||
GIT_REPOSITORY https://android.googlesource.com/platform/external/perfetto
|
||||
GIT_REPOSITORY https://github.com/google/perfetto
|
||||
GIT_TAG v44.0
|
||||
SOURCE_DIR ${PROJECT_BINARY_DIR}/external/perfetto BINARY_DIR
|
||||
${PROJECT_BINARY_DIR}/external/build/perfetto-build SUBBUILD_DIR
|
||||
|
||||
@@ -44,3 +44,4 @@ endif()
|
||||
add_subdirectory(hip-stream-display)
|
||||
add_subdirectory(agent-index)
|
||||
add_subdirectory(negate-aggregate-tracing-options)
|
||||
add_subdirectory(minimum-bytes)
|
||||
|
||||
@@ -0,0 +1,49 @@
|
||||
#
|
||||
# rocprofv3 tool test
|
||||
#
|
||||
cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)
|
||||
|
||||
project(
|
||||
rocprofiler-sdk-tests-rocprofv3-minimum-bytes
|
||||
LANGUAGES CXX
|
||||
VERSION 0.0.0)
|
||||
|
||||
find_package(rocprofiler-sdk REQUIRED)
|
||||
|
||||
rocprofiler_configure_pytest_files(CONFIG pytest.ini COPY validate.py conftest.py
|
||||
input.json)
|
||||
|
||||
# execute: input.json sets minimum_output_data high enough that no output files are expected
|
||||
add_test(
|
||||
NAME rocprofv3-test-minimum-bytes-execute
|
||||
COMMAND
|
||||
$<TARGET_FILE:rocprofiler-sdk::rocprofv3> -i
|
||||
${CMAKE_CURRENT_BINARY_DIR}/input.json --output-format csv json pftrace -d
|
||||
${CMAKE_CURRENT_BINARY_DIR}/%argt%-kernel-trace -o out --
|
||||
$<TARGET_FILE:simple-transpose>)
|
||||
|
||||
string(REPLACE "LD_PRELOAD=" "ROCPROF_PRELOAD=" PRELOAD_ENV
|
||||
"${ROCPROFILER_MEMCHECK_PRELOAD_ENV}")
|
||||
|
||||
set(cc-env-pmc2 "${PRELOAD_ENV}")
|
||||
|
||||
set_tests_properties(
|
||||
rocprofv3-test-minimum-bytes-execute
|
||||
PROPERTIES TIMEOUT 45 LABELS "integration-tests" ENVIRONMENT "${cc-env-pmc2}"
|
||||
FAIL_REGULAR_EXPRESSION "${ROCPROFILER_DEFAULT_FAIL_REGEX}")
|
||||
|
||||
add_test(
|
||||
NAME rocprofv3-test-minimum-bytes-validate
|
||||
COMMAND
|
||||
${Python3_EXECUTABLE} ${CMAKE_CURRENT_BINARY_DIR}/validate.py --trace-input-csv
|
||||
${CMAKE_CURRENT_BINARY_DIR}/simple-transpose-kernel-trace/out*.csv
|
||||
--trace-input-json
|
||||
${CMAKE_CURRENT_BINARY_DIR}/simple-transpose-kernel-trace/out_results.json
|
||||
--trace-input-pftrace
|
||||
${CMAKE_CURRENT_BINARY_DIR}/simple-transpose-kernel-trace/out_results.pftrace)
|
||||
|
||||
set_tests_properties(
|
||||
rocprofv3-test-minimum-bytes-validate
|
||||
PROPERTIES TIMEOUT 45 LABELS "integration-tests" DEPENDS
|
||||
"rocprofv3-test-minimum-bytes-execute" FAIL_REGULAR_EXPRESSION
|
||||
"${ROCPROFILER_DEFAULT_FAIL_REGEX}")
|
||||
@@ -0,0 +1,69 @@
|
||||
#!/usr/bin/env python3
|
||||
|
||||
# MIT License
|
||||
#
|
||||
# Copyright (c) 2024-2025 Advanced Micro Devices, Inc. All rights reserved.
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
# of this software and associated documentation files (the "Software"), to deal
|
||||
# in the Software without restriction, including without limitation the rights
|
||||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
# copies of the Software, and to permit persons to whom the Software is
|
||||
# furnished to do so, subject to the following conditions:
|
||||
#
|
||||
# The above copyright notice and this permission notice shall be included in
|
||||
# all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
||||
# THE SOFTWARE.
|
||||
|
||||
import json
|
||||
import pytest
|
||||
import csv
|
||||
|
||||
from rocprofiler_sdk.pytest_utils.dotdict import dotdict
|
||||
from rocprofiler_sdk.pytest_utils import collapse_dict_list
|
||||
|
||||
|
||||
def pytest_addoption(parser):
|
||||
|
||||
parser.addoption(
|
||||
"--trace-input-csv",
|
||||
action="store",
|
||||
help="Path to kernel trace CSV file.",
|
||||
)
|
||||
|
||||
parser.addoption(
|
||||
"--trace-input-json",
|
||||
action="store",
|
||||
help="Path to kernel trace JSON file.",
|
||||
)
|
||||
|
||||
parser.addoption(
|
||||
"--trace-input-pftrace",
|
||||
action="store",
|
||||
help="Path to kernel trace perfetto file.",
|
||||
)
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def trace_data_csv(request):
|
||||
filename = request.config.getoption("--trace-input-csv")
|
||||
return filename
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def trace_data_json(request):
|
||||
filename = request.config.getoption("--trace-input-json")
|
||||
return filename
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def trace_data_pftrace(request):
|
||||
filename = request.config.getoption("--trace-input-pftrace")
|
||||
return filename
|
||||
@@ -0,0 +1,9 @@
|
||||
{
|
||||
"jobs": [
|
||||
{
|
||||
"runtime_trace": true,
|
||||
"minimum_output_data":1000000
|
||||
}
|
||||
|
||||
]
|
||||
}
|
||||
@@ -0,0 +1,5 @@
|
||||
|
||||
[pytest]
|
||||
addopts = --durations=20 -rA -s -vv
|
||||
testpaths = validate.py
|
||||
pythonpath = @ROCPROFILER_SDK_TESTS_BINARY_DIR@/pytest-packages
|
||||
@@ -0,0 +1,41 @@
|
||||
#!/usr/bin/env python3
|
||||
|
||||
# MIT License
|
||||
#
|
||||
# Copyright (c) 2024-2025 Advanced Micro Devices, Inc. All rights reserved.
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
# of this software and associated documentation files (the "Software"), to deal
|
||||
# in the Software without restriction, including without limitation the rights
|
||||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
# copies of the Software, and to permit persons to whom the Software is
|
||||
# furnished to do so, subject to the following conditions:
|
||||
#
|
||||
# The above copyright notice and this permission notice shall be included in
|
||||
# all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
||||
# THE SOFTWARE.
|
||||
|
||||
import sys
|
||||
import pytest
|
||||
import os
|
||||
import glob
|
||||
|
||||
|
||||
def test_file_exists(trace_data_csv, trace_data_json, trace_data_pftrace):
|
||||
csv_files = glob.glob(trace_data_csv)
|
||||
# glob.glob will return empty list if there are no matching files
|
||||
assert len(csv_files) == 0, f"CSV glob: {trace_data_csv}\nCSV files: {csv_files}"
|
||||
assert os.path.exists(trace_data_json) is False
|
||||
assert os.path.exists(trace_data_pftrace) is False
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
exit_code = pytest.main(["-x", __file__] + sys.argv[1:])
|
||||
sys.exit(exit_code)
|
||||