c641749fe6
* Update include/rocprofiler-sdk/hip*
- updates for intercept table
* Update lib/common/units.hpp
- clang-tidy fixes
* Add lib/rocprofiler-sdk/hip
- tracing implementation for the HIP intercept table
* Update source/lib/rocprofiler-sdk/CMakeLists.txt
- add_subdirectory(hip)
* Update source/lib/rocprofiler-sdk/hsa
- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION
* Update lib/rocprofiler-sdk/hip
- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/hsa/utils.hpp
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/tests/intercept_table.cpp
- remove failures for intercepting HIP API tables
* Update include/rocprofiler-sdk/fwd.h
- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args
* Update lib/rocprofiler-sdk/intercept_table.cpp
- support HipDispatchTable and HipCompilerDispatchTable
* Update lib/rocprofiler-sdk/internal_threading.cpp
- Support ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/registration.cpp
- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging
* Update samples/api_{buffered,callback}_tracing
- Modifications to demonstrate HIP API tracing
* Update tests/kernel-tracing
- Modifications to handle/test HIP API tracing
* Separate HIP tracing from HIP compiler tracing
* Fix installation of include/rocprofiler-sdk/hip/*
- add compiler and table headers to install
* Fixes to HIP interception
- hip_api_trace.hpp was updated a bit
- removed hipGetDeviceProperties (generic)
- added hipGetDevicePropertiesR0600
- added hipGetDevicePropertiesR0000
- removed hipRegisterTracerCallback
- reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
- added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers
* Update lib/rocprofiler-sdk/hip/hip.*
- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update lib/rocprofiler-sdk/hsa/hsa.*
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update test/kernel-tracing/validate.py
- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register
* Update tests/tools/json-tool.cpp
- fix context associated with "HIP_API_CALLBACK"
* Update external/CMakeLists.txt
- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
- BUILD_TESTING (OFF)
- BUILD_SHARED_LIBS (OFF)
- BUILD_OBJECT_LIBS (OFF)
- BUILD_STATIC_LIBS (ON)
- CMAKE_POSITION_INDEPENDENT_CODE (ON)
- CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
- CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog
* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt
- remove explicit setting of SKIP_BUILD_RPATH
* Update CMakeLists.txt
- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH
* Update tests/CMakeLists.txt
- include(GNUInstallDirs)
* Update samples/CMakeLists.txt
- include(GNUInstallDirs)
* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h
- remove extern "C" due to incompatibility b/t empty struct in C (size 0) vs. empty struct in C++ (size 1)
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- clang-tidy fixes
* Update cmake/rocprofiler_linting.cmake
- add a feature for clang tidy exe
* Update lib/rocprofiler-sdk/hip/hip.cpp
- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- fix merge
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- fix merge
* Update bin/rocprofv3
- args for marker, HIP runtime, and HIP compiler tracing
* Update tests/apps/simple-transpose
- use roctx
* Update tests/rocprofv3/tracing
- validate marker API data
* Update lib/rocprofiler-sdk-tool
- support for HIP runtime, HIP compiler, marker API
* Update queue/queue_controller/registration/utility
- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
- implements a thread yield + sleep
- add a sync function to Queue class
- add a iterate_queues member function to QueueController
- this is used to sync each queue during queue_controller_fini()
* Fix data races: queue/context/stable_vector
- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array
* Update lib/rocprofiler-sdk/hsa/hsa.*
- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables
* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp
- use HSA subtable accessors
* Update rocprofiler_memcheck and CI workflow
- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
- GCC 13 uses libtsan.so.2
* Update CI workflow
* Update lib/rocprofiler-sdk/counters/{metrics,counters}
- fix possibly dangling reference to a temporary from gcc-13
* Update thread-sanitizer-suppr.txt
- Ignore data races originating in hsa-runtime library
* Update cmake/rocprofiler_memcheck.cmake
- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library
* Update tests/rocprofv3/tracing/CMakeLists.txt
- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test
* Update lib/common/container/record_header_buffer.hpp
- fix data race identified by gcc v13 and libtsan.so.2
* Update hip API id, args, and def
- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0
* Update lib/common/container/record_header_buffer.hpp
- fix deadlock in save/read/reset
* Update source/docs/CMakeLists.txt
- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- remove overloads for HIP_MEMSET_NODE_PARAMS
* Update docs/CMakeLists.txt
- use find_program for shell instead of hardcoded /bin/bash
183 строки
6.6 KiB
Python
183 строки
6.6 KiB
Python
#!/usr/bin/env python3
|
|
|
|
import sys
|
|
import pytest
|
|
|
|
|
|
# helper function
|
|
def node_exists(name, data, min_len=1):
|
|
assert name in data
|
|
assert data[name] is not None
|
|
assert len(data[name]) >= min_len
|
|
|
|
|
|
def test_data_structure(input_data):
|
|
"""verify minimum amount of expected data is present"""
|
|
data = input_data
|
|
|
|
node_exists("rocprofiler-sdk-json-tool", data)
|
|
|
|
sdk_data = data["rocprofiler-sdk-json-tool"]
|
|
|
|
node_exists("agents", sdk_data)
|
|
node_exists("call_stack", sdk_data)
|
|
node_exists("callback_records", sdk_data)
|
|
node_exists("buffer_records", sdk_data)
|
|
|
|
node_exists("names", sdk_data["callback_records"])
|
|
node_exists("code_objects", sdk_data["callback_records"])
|
|
node_exists("kernel_symbols", sdk_data["callback_records"])
|
|
node_exists("hsa_api_traces", sdk_data["callback_records"])
|
|
node_exists("hip_api_traces", sdk_data["callback_records"], 0)
|
|
node_exists("marker_api_traces", sdk_data["callback_records"])
|
|
|
|
node_exists("names", sdk_data["buffer_records"])
|
|
node_exists("kernel_dispatches", sdk_data["buffer_records"])
|
|
node_exists("memory_copies", sdk_data["buffer_records"], 0)
|
|
node_exists("hsa_api_traces", sdk_data["buffer_records"])
|
|
node_exists("hip_api_traces", sdk_data["buffer_records"], 0)
|
|
node_exists("marker_api_traces", sdk_data["buffer_records"])
|
|
|
|
|
|
def test_timestamps(input_data):
|
|
data = input_data
|
|
sdk_data = data["rocprofiler-sdk-json-tool"]
|
|
|
|
cb_start = {}
|
|
cb_end = {}
|
|
for titr in ["hsa_api_traces", "marker_api_traces", "hip_api_traces"]:
|
|
for itr in sdk_data["callback_records"][titr]:
|
|
cid = itr["record"]["correlation_id"]["internal"]
|
|
phase = itr["record"]["phase"]
|
|
if phase == 1:
|
|
cb_start[cid] = itr["timestamp"]
|
|
elif phase == 2:
|
|
cb_end[cid] = itr["timestamp"]
|
|
assert cb_start[cid] <= itr["timestamp"]
|
|
else:
|
|
assert phase == 1 or phase == 2
|
|
|
|
for itr in sdk_data["buffer_records"][titr]:
|
|
assert itr["start_timestamp"] <= itr["end_timestamp"]
|
|
|
|
for itr in sdk_data["buffer_records"]["memory_copies"]:
|
|
assert itr["start_timestamp"] <= itr["end_timestamp"]
|
|
|
|
for itr in sdk_data["buffer_records"]["kernel_dispatches"]:
|
|
assert itr["start_timestamp"] < itr["end_timestamp"]
|
|
assert itr["correlation_id"]["internal"] > 0
|
|
assert itr["correlation_id"]["external"] > 0
|
|
|
|
api_start = cb_start[itr["correlation_id"]["internal"]]
|
|
api_end = cb_end[itr["correlation_id"]["internal"]]
|
|
assert api_start < itr["start_timestamp"]
|
|
assert api_end <= itr["end_timestamp"]
|
|
|
|
|
|
def test_internal_correlation_ids(input_data):
|
|
data = input_data
|
|
sdk_data = data["rocprofiler-sdk-json-tool"]
|
|
|
|
api_corr_ids = []
|
|
for titr in ["hsa_api_traces", "marker_api_traces", "hip_api_traces"]:
|
|
for itr in sdk_data["callback_records"][titr]:
|
|
api_corr_ids.append(itr["record"]["correlation_id"]["internal"])
|
|
|
|
for itr in sdk_data["buffer_records"][titr]:
|
|
api_corr_ids.append(itr["correlation_id"]["internal"])
|
|
|
|
api_corr_ids_sorted = sorted(api_corr_ids)
|
|
api_corr_ids_unique = list(set(api_corr_ids))
|
|
|
|
for itr in sdk_data["buffer_records"]["kernel_dispatches"]:
|
|
assert itr["correlation_id"]["internal"] in api_corr_ids_unique
|
|
|
|
for itr in sdk_data["buffer_records"]["memory_copies"]:
|
|
assert itr["correlation_id"]["internal"] in api_corr_ids_unique
|
|
|
|
len_corr_id_unq = len(api_corr_ids_unique)
|
|
assert len(api_corr_ids) != len_corr_id_unq
|
|
assert max(api_corr_ids_sorted) == len_corr_id_unq
|
|
|
|
|
|
def test_external_correlation_ids(input_data):
|
|
data = input_data
|
|
sdk_data = data["rocprofiler-sdk-json-tool"]
|
|
|
|
extern_corr_ids = []
|
|
for titr in ["hsa_api_traces", "marker_api_traces", "hip_api_traces"]:
|
|
for itr in sdk_data["callback_records"][titr]:
|
|
assert itr["record"]["correlation_id"]["external"] > 0
|
|
assert (
|
|
itr["record"]["thread_id"] == itr["record"]["correlation_id"]["external"]
|
|
)
|
|
extern_corr_ids.append(itr["record"]["correlation_id"]["external"])
|
|
|
|
extern_corr_ids = list(set(sorted(extern_corr_ids)))
|
|
for titr in ["hsa_api_traces", "marker_api_traces", "hip_api_traces"]:
|
|
for itr in sdk_data["buffer_records"][titr]:
|
|
assert itr["correlation_id"]["external"] > 0
|
|
assert itr["thread_id"] == itr["correlation_id"]["external"]
|
|
assert itr["thread_id"] in extern_corr_ids
|
|
assert itr["correlation_id"]["external"] in extern_corr_ids
|
|
|
|
for itr in sdk_data["buffer_records"]["kernel_dispatches"]:
|
|
assert itr["correlation_id"]["external"] > 0
|
|
assert itr["correlation_id"]["external"] in extern_corr_ids
|
|
|
|
for itr in sdk_data["buffer_records"]["memory_copies"]:
|
|
assert itr["correlation_id"]["external"] > 0
|
|
assert itr["correlation_id"]["external"] in extern_corr_ids
|
|
|
|
|
|
def test_kernel_ids(input_data):
|
|
data = input_data
|
|
sdk_data = data["rocprofiler-sdk-json-tool"]
|
|
|
|
symbol_info = {}
|
|
for itr in sdk_data["callback_records"]["kernel_symbols"]:
|
|
phase = itr["record"]["phase"]
|
|
payload = itr["payload"]
|
|
kern_id = payload["kernel_id"]
|
|
|
|
assert phase == 1 or phase == 2
|
|
assert kern_id > 0
|
|
if phase == 1:
|
|
assert len(payload["kernel_name"]) > 0
|
|
symbol_info[kern_id] = payload
|
|
elif phase == 2:
|
|
assert payload["kernel_id"] in symbol_info.keys()
|
|
assert payload["kernel_name"] == symbol_info[kern_id]["kernel_name"]
|
|
|
|
for itr in sdk_data["buffer_records"]["kernel_dispatches"]:
|
|
assert itr["kernel_id"] in symbol_info.keys()
|
|
|
|
|
|
def test_async_copy_direction(input_data):
|
|
data = input_data
|
|
sdk_data = data["rocprofiler-sdk-json-tool"]
|
|
|
|
# Direction values:
|
|
# 0 == ??? (unknown)
|
|
# 1 == H2H (host to host)
|
|
# 2 == H2D (host to device)
|
|
# 3 == D2H (device to host)
|
|
# 4 == D2D (device to device)
|
|
async_dir_cnt = dict([(idx, 0) for idx in range(0, 5)])
|
|
for itr in sdk_data["buffer_records"]["memory_copies"]:
|
|
op_id = itr["operation"]
|
|
async_dir_cnt[op_id] += 1
|
|
|
|
# in the reproducible-runtime test which generates the input file,
|
|
# we don't expect any async memory copy operations
|
|
assert async_dir_cnt[0] == 0
|
|
assert async_dir_cnt[1] == 0
|
|
assert async_dir_cnt[2] == 0
|
|
assert async_dir_cnt[3] == 0
|
|
assert async_dir_cnt[4] == 0
|
|
|
|
|
|
if __name__ == "__main__":
|
|
exit_code = pytest.main(["-x", __file__] + sys.argv[1:])
|
|
sys.exit(exit_code)
|