rocJPEG API Tracing (#73)

* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependency. Changed cmake option to disable tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Support for rocJPEG API Trace

* Added newline to rocjpeg_version.h

* json-tool code added, initial test/bin commit

* Formatting

* Resolved rocjpeg bin test compilation errors

* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed

* Formatting and compilation errors

* Minor fixes

* Copyright year update and minor fixes

* Doc update fix

* Added rocjpeg csv file in data

* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes

* Added rocdecode and rocjpeg to CI

* Removed rocdecode and rocjpeg from CI and added back build tests option

* Updated Cmake Files

* Added rocDecode and rocJPEG to CI

* Remove cmake line added in error

* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes

* Added find_package for test

* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path

* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests

* Resolve merge conflicts and formatting

* Added regex find and replace instead of include for CI

* VAAPI package causing errors on Vega20

* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved

* Removed workflows regex

* Formatting and minor test modification

* Modified test for vega20

* Update rocDecode and rocJPEG cmake and tests

* Changelog

* Fix merge conflict

* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing

* Removed if found statements, replaced with TARGET:EXISTS

* Skip json file for rocjpeg and rocdecode tests if not supported

* Add os import

---------

Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
This commit is contained in:
Trowbridge, Ian
2025-02-21 15:43:49 -06:00
committed by GitHub
Parent 95e0341266
Commit 31fe8858d1
92 changed files with 4614 additions and 242 deletions
@@ -65,7 +65,7 @@ jobs:
run: |
git config --global --add safe.directory '*'
apt-get update
apt-get install -y build-essential cmake g++-11 g++-12 python3-pip libdw-dev rccl-dev rccl-unittests
apt-get install -y build-essential cmake g++-11 g++-12 python3-pip libdw-dev rccl-dev rccl-unittests rocjpeg-dev rocjpeg-test rocdecode-dev rocdecode-test
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 10 --slave /usr/bin/g++ g++ /usr/bin/g++-11 --slave /usr/bin/gcov gcov /usr/bin/gcov-11
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 20 --slave /usr/bin/g++ g++ /usr/bin/g++-12 --slave /usr/bin/gcov gcov /usr/bin/gcov-12
python3 -m pip install -U --user -r requirements.txt
+18
@@ -159,6 +159,12 @@ Full documentation for ROCprofiler-SDK is available at [rocm.docs.amd.com/projec
- SDK: `rocprofiler_agent_v0_t` support for agent visibility based on gpu isolation environment variables (`ROCR_VISIBLE_DEVICES`, etc.)
- Accumulation VGPR support for rocprofv3.
## ROCprofiler-SDK 0.7.0 for ROCm release 6.5
### Added
- Added support for rocJPEG API Tracing
### Changed
- SDK no longer creates a background thread when every tool returns a nullptr from `rocprofiler_configure`.
@@ -168,3 +174,15 @@ Full documentation for ROCprofiler-SDK is available at [rocm.docs.amd.com/projec
- Fixed missing callbacks around internal thread creation within counter collection service
### Removed
## ROCprofiler-SDK 0.7.0 for ROCm release 6.5
### Added
- Added support for rocJPEG API Tracing.
### Changed
### Resolved issues
### Removed
+44 -3
@@ -21,20 +21,61 @@
#
################################################################################
include_guard(DIRECTORY)
# find rocDecode - library and headers
find_path(
rocDecode_ROOT_DIR
NAMES include/rocdecode
HINTS ${ROCM_PATH}
PATHS ${ROCM_PATH})
mark_as_advanced(rocDecode_ROOT_DIR)
find_path(
rocDecode_INCLUDE_DIR
NAMES rocdecode.h
PATHS ${ROCM_PATH}/include/rocdecode)
NAMES rocdecode/rocdecode.h
HINTS ${rocDecode_ROOT_DIR}
PATHS ${rocDecode_ROOT_DIR}
PATH_SUFFIXES include)
find_library(
rocDecode_LIBRARY
NAMES rocdecode
HINTS ${ROCM_PATH}/lib)
HINTS ${rocDecode_ROOT_DIR}
PATHS ${rocDecode_ROOT_DIR}
PATH_SUFFIXES lib)
function(_rocdecode_read_version_header _VERSION_VAR)
if(rocDecode_INCLUDE_DIR AND EXISTS
"${rocDecode_INCLUDE_DIR}/rocdecode/rocdecode_version.h")
file(READ "${rocDecode_INCLUDE_DIR}/rocdecode/rocdecode_version.h"
_rocdecode_version)
macro(_rocdecode_get_version_num _VAR _NAME)
string(REGEX MATCH "define([ \t]+)${_NAME}([ \t]+)([0-9]+)" _tmp
"${_rocdecode_version}")
set(${_VAR} 0)
if(_tmp MATCHES "([0-9]+)")
string(REGEX REPLACE "(.*${_NAME}[ ]+)([0-9]+)" "\\2" ${_VAR} "${_tmp}")
endif()
endmacro()
_rocdecode_get_version_num(_major "ROCDECODE_MAJOR_VERSION")
_rocdecode_get_version_num(_minor "ROCDECODE_MINOR_VERSION")
_rocdecode_get_version_num(_patch "ROCDECODE_PATCH_VERSION")
set(${_VERSION_VAR}
${_major}.${_minor}.${_patch}
PARENT_SCOPE)
endif()
endfunction()
_rocdecode_read_version_header(rocDecode_VERSION)
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
rocDecode
FOUND_VAR rocDecode_FOUND
VERSION_VAR rocDecode_VERSION
REQUIRED_VARS rocDecode_INCLUDE_DIR rocDecode_LIBRARY)
if(rocDecode_FOUND)
+87
@@ -0,0 +1,87 @@
################################################################################
# Copyright (c) 2024 - 2025 Advanced Micro Devices, Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#
################################################################################
include_guard(DIRECTORY)
# find rocJPEG - library and headers
find_path(
rocJPEG_ROOT_DIR
NAMES include/rocjpeg
HINTS ${ROCM_PATH}
PATHS ${ROCM_PATH})
mark_as_advanced(rocJPEG_ROOT_DIR)
find_path(
rocJPEG_INCLUDE_DIR
NAMES rocjpeg/rocjpeg.h
HINTS ${rocJPEG_ROOT_DIR}
PATHS ${rocJPEG_ROOT_DIR}
PATH_SUFFIXES include)
find_library(
rocJPEG_LIBRARY
NAMES rocjpeg
HINTS ${rocJPEG_ROOT_DIR}
PATHS ${rocJPEG_ROOT_DIR}
PATH_SUFFIXES lib)
function(_rocjpeg_read_version_header _VERSION_VAR)
if(rocJPEG_INCLUDE_DIR AND EXISTS "${rocJPEG_INCLUDE_DIR}/rocjpeg/rocjpeg_version.h")
file(READ "${rocJPEG_INCLUDE_DIR}/rocjpeg/rocjpeg_version.h" _rocjpeg_version)
macro(_rocjpeg_get_version_num _VAR _NAME)
string(REGEX MATCH "define([ \t]+)${_NAME}([ \t]+)([0-9]+)" _tmp
"${_rocjpeg_version}")
set(${_VAR} 0)
if(_tmp MATCHES "([0-9]+)")
string(REGEX REPLACE "(.*${_NAME}[ ]+)([0-9]+)" "\\2" ${_VAR} "${_tmp}")
endif()
endmacro()
_rocjpeg_get_version_num(_major "ROCJPEG_MAJOR_VERSION")
_rocjpeg_get_version_num(_minor "ROCJPEG_MINOR_VERSION")
_rocjpeg_get_version_num(_patch "ROCJPEG_MICRO_VERSION")
set(${_VERSION_VAR}
${_major}.${_minor}.${_patch}
PARENT_SCOPE)
endif()
endfunction()
_rocjpeg_read_version_header(rocJPEG_VERSION)
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
rocJPEG
FOUND_VAR rocJPEG_FOUND
VERSION_VAR rocJPEG_VERSION
REQUIRED_VARS rocJPEG_INCLUDE_DIR rocJPEG_LIBRARY)
if(rocJPEG_FOUND)
if(NOT TARGET rocJPEG::rocJPEG)
add_library(rocJPEG::rocJPEG INTERFACE IMPORTED)
target_link_libraries(rocJPEG::rocJPEG INTERFACE ${rocJPEG_LIBRARY})
target_include_directories(rocJPEG::rocJPEG INTERFACE ${rocJPEG_INCLUDE_DIR})
endif()
endif()
mark_as_advanced(rocJPEG_INCLUDE_DIR rocJPEG_LIBRARY)
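The version lookup in the find module above can be sketched outside CMake as well. The following Python snippet is only an illustration of the same idea (search a header for `define <NAME> <number>` and fall back to 0 when a macro is missing); the function name `read_version` and the sample header contents are hypothetical and are not taken from the rocJPEG sources.

```python
import re


def read_version(header_text, prefix, fields=("MAJOR", "MINOR", "MICRO")):
    """Build a dotted version string from <prefix>_<field>_VERSION macros.

    Mirrors the CMake helper's behavior of defaulting a component to 0
    when the corresponding macro is not found in the header text.
    """
    parts = []
    for field in fields:
        pattern = r"define[ \t]+%s_%s_VERSION[ \t]+([0-9]+)" % (prefix, field)
        match = re.search(pattern, header_text)
        parts.append(match.group(1) if match else "0")
    return ".".join(parts)


# Hypothetical header contents, used only to exercise the parser.
SAMPLE_HEADER = """
#define ROCJPEG_MAJOR_VERSION 0
#define ROCJPEG_MINOR_VERSION 8
#define ROCJPEG_MICRO_VERSION 0
"""

version = read_version(SAMPLE_HEADER, "ROCJPEG")
missing = read_version("", "ROCJPEG")
```

Note that the rocJPEG module reads a `MICRO` macro for the third component while the rocDecode module reads `PATCH`; in this sketch that difference would be expressed through the `fields` argument.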
+20 -1
@@ -336,7 +336,7 @@ find_package(rocDecode)
if(rocDecode_FOUND
AND rocDecode_INCLUDE_DIR
AND EXISTS "${ROCDECODE_INCLUDE_DIR}/rocdecode/amd_detail/rocdecode_api_trace.h")
AND EXISTS "${rocDecode_INCLUDE_DIR}/rocdecode/amd_detail/rocdecode_api_trace.h")
rocprofiler_config_nolink_target(
rocprofiler-sdk-rocdecode-nolink rocdecode::rocdecode INTERFACE
ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE=1)
@@ -345,3 +345,22 @@ else()
INTERFACE ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE=0)
endif()
# ----------------------------------------------------------------------------------------#
#
# rocJPEG
#
# ----------------------------------------------------------------------------------------#
find_package(rocJPEG)
if(rocJPEG_FOUND
AND rocJPEG_INCLUDE_DIR
AND EXISTS "${rocJPEG_INCLUDE_DIR}/rocjpeg/amd_detail/rocjpeg_api_trace.h")
rocprofiler_config_nolink_target(rocprofiler-sdk-rocjpeg-nolink rocjpeg::rocjpeg
INTERFACE ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG=1)
else()
target_compile_definitions(rocprofiler-sdk-rocjpeg-nolink
INTERFACE ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG=0)
endif()
+4
@@ -94,3 +94,7 @@ rocprofiler_add_interface_library(rocprofiler-sdk-rccl-nolink
rocprofiler_add_interface_library(
rocprofiler-sdk-rocdecode-nolink
"ROCDECODE headers without linking to ROCDECODE library" IMPORTED)
rocprofiler_add_interface_library(
rocprofiler-sdk-rocjpeg-nolink "ROCJPEG headers without linking to ROCJPEG library"
IMPORTED)
-2
@@ -59,8 +59,6 @@ if(ROCPROFILER_BUILD_TESTS)
rocprofiler_add_option(
ROCPROFILER_BUILD_GTEST
"Enable building gtest (Google testing) library internally" ON ADVANCED)
rocprofiler_add_option(ROCPROFILER_BUILD_ROCDECODE_TESTS
"Enable building rocDecode tests" OFF ADVANCED)
endif()
rocprofiler_add_option(ROCPROFILER_ENABLE_CLANG_TIDY "Enable clang-tidy checks" OFF
+11 -3
@@ -251,13 +251,13 @@ For MPI applications (or other job launchers such as SLURM), place rocprofv3 ins
aggregate_tracing_options,
"-r",
"--runtime-trace",
help="Collect tracing data for HIP runtime API, Marker (ROCTx) API, RCCL API, ROCDecode API, Memory operations (copies, scratch, and allocation), and Kernel dispatches. Similar to --sys-trace but without tracing HIP compiler API and the underlying HSA API.",
help="Collect tracing data for HIP runtime API, Marker (ROCTx) API, RCCL API, rocDecode API, rocJPEG API, Memory operations (copies, scratch, and allocation), and Kernel dispatches. Similar to --sys-trace but without tracing HIP compiler API and the underlying HSA API.",
)
add_parser_bool_argument(
aggregate_tracing_options,
"-s",
"--sys-trace",
help="Collect tracing data for HIP API, HSA API, Marker (ROCTx) API, RCCL API, ROCDecode API, Memory operations (copies, scratch, and allocations), and Kernel dispatches.",
help="Collect tracing data for HIP API, HSA API, Marker (ROCTx) API, RCCL API, rocDecode API, rocJPEG API, Memory operations (copies, scratch, and allocations), and Kernel dispatches.",
)
basic_tracing_options = parser.add_argument_group("Basic tracing options")
@@ -311,7 +311,12 @@ For MPI applications (or other job launchers such as SLURM), place rocprofv3 ins
add_parser_bool_argument(
basic_tracing_options,
"--rocdecode-trace",
help="For collecting ROCDecode Traces",
help="For collecting rocDecode Traces",
)
add_parser_bool_argument(
basic_tracing_options,
"--rocjpeg-trace",
help="For collecting rocJPEG Traces",
)
extended_tracing_options = parser.add_argument_group("Granular tracing options")
@@ -980,6 +985,7 @@ def run(app_args, args, **kwargs):
"scratch_memory_trace",
"rccl_trace",
"rocdecode_trace",
"rocjpeg_trace",
):
setattr(args, itr, True)
@@ -993,6 +999,7 @@ def run(app_args, args, **kwargs):
"scratch_memory_trace",
"rccl_trace",
"rocdecode_trace",
"rocjpeg_trace",
):
setattr(args, itr, True)
@@ -1017,6 +1024,7 @@ def run(app_args, args, **kwargs):
["marker_trace", "MARKER_API_TRACE"],
["rccl_trace", "RCCL_API_TRACE"],
["rocdecode_trace", "ROCDECODE_API_TRACE"],
["rocjpeg_trace", "ROCJPEG_API_TRACE"],
["kernel_trace", "KERNEL_TRACE"],
["memory_copy_trace", "MEMORY_COPY_TRACE"],
["memory_allocation_trace", "MEMORY_ALLOCATION_TRACE"],
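The hunks above follow one pattern: an aggregate option switches on a tuple of per-domain attributes with `setattr`, and each attribute is later paired with an upper-case setting name. A minimal standalone sketch of that pattern is shown below. It uses plain `argparse` `store_true` flags instead of rocprofv3's own `add_parser_bool_argument` helper, covers only three of the pairs visible in the diff, and the function names `build_parser` and `resolve_trace_settings` are hypothetical. How rocprofv3 consumes the resulting names is not shown in this diff and is not modeled here.

```python
import argparse

# Subset of the (attribute, setting name) pairs shown in the diff.
TRACE_SETTINGS = [
    ("rccl_trace", "RCCL_API_TRACE"),
    ("rocdecode_trace", "ROCDECODE_API_TRACE"),
    ("rocjpeg_trace", "ROCJPEG_API_TRACE"),
]


def build_parser():
    """Create a parser with one aggregate flag and one flag per domain."""
    parser = argparse.ArgumentParser(prog="trace-flags-sketch")
    parser.add_argument("-s", "--sys-trace", action="store_true")
    for attr, _ in TRACE_SETTINGS:
        parser.add_argument("--" + attr.replace("_", "-"), action="store_true")
    return parser


def resolve_trace_settings(argv):
    """Expand the aggregate flag, then map each attribute to its setting name."""
    args = build_parser().parse_args(argv)
    if args.sys_trace:
        for attr, _ in TRACE_SETTINGS:
            setattr(args, attr, True)
    return {name: getattr(args, attr) for attr, name in TRACE_SETTINGS}


only_jpeg = resolve_trace_settings(["--rocjpeg-trace"])
everything = resolve_trace_settings(["--sys-trace"])
```

Keeping the pairs in a single table means a new domain such as rocJPEG needs one entry there rather than edits in several places, which is roughly what the three one-line additions in the diff accomplish.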
+5
@@ -0,0 +1,5 @@
"Domain","Function","Process_Id","Thread_Id","Correlation_Id","Start_Timestamp","End_Timestamp"
"ROCJPEG_API","rocJpegCreate",41884,41884,105,1286306029650499,1286306248201233
"ROCJPEG_API","rocJpegStreamCreate",41884,41884,502,1286306248250747,1286306248268715
"ROCJPEG_API","rocJpegStreamParse",41884,41884,503,1286306248421385,1286306248680757
"ROCJPEG_API","rocJpegGetImageInfo",41884,41884,504,1286306248684203,1286306248686556
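As a rough illustration of how the sample trace file above could be consumed, the sketch below parses the same four rows with Python's `csv` module and derives a per-call duration from the two timestamp columns. The helper name `api_durations` is hypothetical; the column names and values are copied from the sample data, and the timestamps are treated as nanoseconds, matching the struct comments elsewhere in this commit.

```python
import csv
import io

# Rows copied from the sample rocjpeg_api_trace.csv in this commit.
SAMPLE_TRACE = '''"Domain","Function","Process_Id","Thread_Id","Correlation_Id","Start_Timestamp","End_Timestamp"
"ROCJPEG_API","rocJpegCreate",41884,41884,105,1286306029650499,1286306248201233
"ROCJPEG_API","rocJpegStreamCreate",41884,41884,502,1286306248250747,1286306248268715
"ROCJPEG_API","rocJpegStreamParse",41884,41884,503,1286306248421385,1286306248680757
"ROCJPEG_API","rocJpegGetImageInfo",41884,41884,504,1286306248684203,1286306248686556
'''


def api_durations(csv_text):
    """Return {function name: end - start} for every row in the trace."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["Function"]: int(row["End_Timestamp"]) - int(row["Start_Timestamp"])
        for row in reader
    }


durations = api_durations(SAMPLE_TRACE)
slowest = max(durations, key=durations.get)
```

In this sample, `rocJpegCreate` dominates the other calls by several orders of magnitude.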
+22
@@ -526,6 +526,28 @@ Here are the contents of ``rocdecode_api_trace.csv`` file:
:widths: 10,10,10,10,10,20,20
:header-rows: 1
rocJPEG trace
+++++++++++++++
`rocJPEG <https://github.com/ROCm/rocJPEG>`_ is a high-performance SDK for decoding JPEG images. This option traces the rocJPEG API.
.. code-block:: shell
rocprofv3 --rocjpeg-trace -- <application_path>
The above command generates a ``rocjpeg_api_trace.csv`` file prefixed with the process ID.
.. code-block:: shell
$ cat 41688_rocjpeg_api_trace.csv
Here are the contents of ``rocjpeg_api_trace.csv`` file:
.. csv-table:: rocJPEG trace
:file: /data/rocjpeg_api_trace.csv
:widths: 10,10,10,10,10,20,20
:header-rows: 1
Post-processing tracing options
++++++++++++++++++++++++++++++++
+6 -2
@@ -65,10 +65,14 @@
"type": "boolean",
"description": "For Collecting Memory Allocation Traces"
},
"rocdecode_trace": {
"rocdecode_trace": {
"type": "boolean",
"description": "For Collecting rocDecode Traces"
},
"rocjpeg_trace": {
"type": "boolean",
"description": "For Collecting rocJPEG Traces"
},
"scratch_memory_trace": {
"type": "boolean",
"description": "For Collecting Scratch Memory operations Traces"
@@ -111,7 +115,7 @@
"sys_trace" : {
"type": "boolean",
"description": "For Collecting HIP, HSA, Marker (ROCTx), Memory copy, Memory allocation, Scratch memory, rocDecode, and Kernel dispatch traces"
"description": "For Collecting HIP, HSA, Marker (ROCTx), Memory copy, Memory allocation, Scratch memory, rocDecode, rocJPEG, and Kernel dispatch traces"
},
"mangled_kernels": {
@@ -32,6 +32,7 @@ set(ROCPROFILER_HEADER_FILES
registration.h
rccl.h
rocdecode.h
rocjpeg.h
spm.h
${CMAKE_CURRENT_BINARY_DIR}/version.h)
@@ -46,6 +47,7 @@ add_subdirectory(marker)
add_subdirectory(ompt)
add_subdirectory(rccl)
add_subdirectory(rocdecode)
add_subdirectory(rocjpeg)
add_subdirectory(cxx)
add_subdirectory(kfd)
add_subdirectory(amd_detail)
@@ -183,9 +183,9 @@ typedef struct
} rocprofiler_buffer_tracing_rccl_api_record_t;
/**
* @brief ROCProfiler Buffer ROCDecode API Record.
* @brief ROCProfiler Buffer rocDecode API Record.
*/
typedef struct
typedef struct rocprofiler_buffer_tracing_rocdecode_api_record_t
{
uint64_t size; ///< size of this struct
rocprofiler_buffer_tracing_kind_t kind;
@@ -201,6 +201,25 @@ typedef struct
/// @brief Specification of the API function, e.g., ::rocprofiler_rocdecode_api_id_t
} rocprofiler_buffer_tracing_rocdecode_api_record_t;
/**
* @brief ROCProfiler Buffer rocJPEG API Record.
*/
typedef struct rocprofiler_buffer_tracing_rocjpeg_api_record_t
{
uint64_t size; ///< size of this struct
rocprofiler_buffer_tracing_kind_t kind;
rocprofiler_tracing_operation_t operation;
rocprofiler_correlation_id_t correlation_id; ///< correlation ids for record
rocprofiler_timestamp_t start_timestamp; ///< start time in nanoseconds
rocprofiler_timestamp_t end_timestamp; ///< end time in nanoseconds
rocprofiler_thread_id_t thread_id; ///< id for thread generating this record
/// @var kind
/// @brief ::ROCPROFILER_BUFFER_TRACING_ROCJPEG_API
/// @var operation
/// @brief Specification of the API function, e.g., ::rocprofiler_rocjpeg_api_id_t
} rocprofiler_buffer_tracing_rocjpeg_api_record_t;
/**
* @brief ROCProfiler Buffer Memory Copy Tracer Record.
*/
@@ -30,6 +30,7 @@
#include <rocprofiler-sdk/ompt.h>
#include <rocprofiler-sdk/rccl.h>
#include <rocprofiler-sdk/rocdecode.h>
#include <rocprofiler-sdk/rocjpeg.h>
#include <hsa/hsa.h>
#include <hsa/hsa_amd_tool.h>
@@ -110,15 +111,25 @@ typedef struct
} rocprofiler_callback_tracing_rccl_api_data_t;
/**
* @brief ROCProfiler ROCDecode API Callback Data.
* @brief ROCProfiler rocDecode API Callback Data.
*/
typedef struct
typedef struct rocprofiler_callback_tracing_rocdecode_api_data_t
{
uint64_t size; ///< size of this struct
rocprofiler_rocdecode_api_args_t args;
rocprofiler_rocdecode_api_retval_t retval;
} rocprofiler_callback_tracing_rocdecode_api_data_t;
/**
* @brief ROCProfiler rocJPEG API Callback Data.
*/
typedef struct rocprofiler_callback_tracing_rocjpeg_api_data_t
{
uint64_t size; ///< size of this struct
rocprofiler_rocjpeg_api_args_t args;
rocprofiler_rocjpeg_api_retval_t retval;
} rocprofiler_callback_tracing_rocjpeg_api_data_t;
/**
* @brief ROCProfiler Code Object Load Tracer Callback Record.
*/
@@ -82,7 +82,8 @@ ROCPROFILER_DEFINE_CATEGORY(category, openmp, "OpenMP")
ROCPROFILER_DEFINE_CATEGORY(category, kernel_dispatch, "GPU kernel dispatch")
ROCPROFILER_DEFINE_CATEGORY(category, memory_copy, "Async memory copy")
ROCPROFILER_DEFINE_CATEGORY(category, memory_allocation, "Memory Allocation")
ROCPROFILER_DEFINE_CATEGORY(category, rocdecode_api, "ROCDecode API function")
ROCPROFILER_DEFINE_CATEGORY(category, rocdecode_api, "rocDecode API function")
ROCPROFILER_DEFINE_CATEGORY(category, rocjpeg_api, "rocJPEG API function")
#define ROCPROFILER_PERFETTO_CATEGORIES \
ROCPROFILER_PERFETTO_CATEGORY(category::hsa_api), \
@@ -93,7 +94,8 @@ ROCPROFILER_DEFINE_CATEGORY(category, rocdecode_api, "ROCDecode API function")
ROCPROFILER_PERFETTO_CATEGORY(category::kernel_dispatch), \
ROCPROFILER_PERFETTO_CATEGORY(category::memory_copy), \
ROCPROFILER_PERFETTO_CATEGORY(category::memory_allocation), \
ROCPROFILER_PERFETTO_CATEGORY(category::rocdecode_api)
ROCPROFILER_PERFETTO_CATEGORY(category::rocdecode_api), \
ROCPROFILER_PERFETTO_CATEGORY(category::rocjpeg_api)
#include <perfetto.h>
@@ -402,6 +402,21 @@ save(ArchiveT& ar, rocprofiler_callback_tracing_rocdecode_api_data_t data)
ROCP_SDK_SAVE_DATA_FIELD(retval);
}
template <typename ArchiveT>
void
save(ArchiveT& ar, rocprofiler_rocjpeg_api_retval_t data)
{
ROCP_SDK_SAVE_DATA_FIELD(rocJpegStatus_retval);
}
template <typename ArchiveT>
void
save(ArchiveT& ar, rocprofiler_callback_tracing_rocjpeg_api_data_t data)
{
ROCP_SDK_SAVE_DATA_FIELD(size);
ROCP_SDK_SAVE_DATA_FIELD(retval);
}
template <typename ArchiveT>
void
save(ArchiveT& ar, rocprofiler_callback_tracing_ompt_data_t data)
@@ -502,6 +517,13 @@ save(ArchiveT& ar, rocprofiler_buffer_tracing_rocdecode_api_record_t data)
save_buffer_tracing_api_record(ar, data);
}
template <typename ArchiveT>
void
save(ArchiveT& ar, rocprofiler_buffer_tracing_rocjpeg_api_record_t data)
{
save_buffer_tracing_api_record(ar, data);
}
template <typename ArchiveT>
void
save(ArchiveT& ar, rocprofiler_buffer_tracing_ompt_target_t data)
@@ -70,6 +70,7 @@ typedef enum // NOLINT(performance-enum-size)
ROCPROFILER_EXTERNAL_CORRELATION_REQUEST_OMPT, ///<
ROCPROFILER_EXTERNAL_CORRELATION_REQUEST_MEMORY_ALLOCATION, ///<
ROCPROFILER_EXTERNAL_CORRELATION_REQUEST_ROCDECODE_API, ///<
ROCPROFILER_EXTERNAL_CORRELATION_REQUEST_ROCJPEG_API, ///<
ROCPROFILER_EXTERNAL_CORRELATION_REQUEST_LAST,
} rocprofiler_external_correlation_id_request_kind_t;
+8 -3
@@ -177,6 +177,7 @@ typedef enum // NOLINT(performance-enum-size)
ROCPROFILER_CALLBACK_TRACING_RUNTIME_INITIALIZATION, ///< Callback notifying that a runtime
///< library has been initialized
ROCPROFILER_CALLBACK_TRACING_ROCDECODE_API, ///< rocDecode API Tracing
ROCPROFILER_CALLBACK_TRACING_ROCJPEG_API, ///< rocJPEG API Tracing
ROCPROFILER_CALLBACK_TRACING_LAST,
} rocprofiler_callback_tracing_kind_t;
@@ -209,6 +210,7 @@ typedef enum // NOLINT(performance-enum-size)
///< been initialized. @see
///< ::rocprofiler_runtime_initialization_operation_t
ROCPROFILER_BUFFER_TRACING_ROCDECODE_API, ///< rocDecode tracing
ROCPROFILER_BUFFER_TRACING_ROCJPEG_API, ///< rocJPEG tracing
ROCPROFILER_BUFFER_TRACING_LAST,
} rocprofiler_buffer_tracing_kind_t;
@@ -371,7 +373,8 @@ typedef enum
ROCPROFILER_MARKER_LIBRARY = (1 << 3),
ROCPROFILER_RCCL_LIBRARY = (1 << 4),
ROCPROFILER_ROCDECODE_LIBRARY = (1 << 5),
ROCPROFILER_LIBRARY_LAST = ROCPROFILER_ROCDECODE_LIBRARY,
ROCPROFILER_ROCJPEG_LIBRARY = (1 << 6),
ROCPROFILER_LIBRARY_LAST = ROCPROFILER_ROCJPEG_LIBRARY,
} rocprofiler_runtime_library_t;
/**
@@ -388,7 +391,8 @@ typedef enum
ROCPROFILER_MARKER_NAME_TABLE = (1 << 5),
ROCPROFILER_RCCL_TABLE = (1 << 6),
ROCPROFILER_ROCDECODE_TABLE = (1 << 7),
ROCPROFILER_TABLE_LAST = ROCPROFILER_ROCDECODE_TABLE,
ROCPROFILER_ROCJPEG_TABLE = (1 << 8),
ROCPROFILER_TABLE_LAST = ROCPROFILER_ROCJPEG_TABLE,
} rocprofiler_intercept_table_t;
/**
@@ -401,7 +405,8 @@ typedef enum // NOLINT(performance-enum-size)
ROCPROFILER_RUNTIME_INITIALIZATION_HIP, ///< Application loaded HIP runtime
ROCPROFILER_RUNTIME_INITIALIZATION_MARKER, ///< Application loaded Marker (ROCTx) runtime
ROCPROFILER_RUNTIME_INITIALIZATION_RCCL, ///< Application loaded RCCL runtime
ROCPROFILER_RUNTIME_INITIALIZATION_ROCDECODE, ///< Application loaded rocDecode runtime
ROCPROFILER_RUNTIME_INITIALIZATION_ROCDECODE, ///< Application loaded rocDecode runtime
ROCPROFILER_RUNTIME_INITIALIZATION_ROCJPEG, ///< Application loaded rocJPEG runtime
ROCPROFILER_RUNTIME_INITIALIZATION_LAST,
} rocprofiler_runtime_initialization_operation_t;
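The library and table enumerations above are bit flags, so the new rocJPEG entries simply take the next free bit. The Python sketch below mirrors only the four library values visible in this hunk using `enum.IntFlag`; the class name `RuntimeLibrary` and the helper `wants_library` are hypothetical and exist only to show how such a mask might be combined and tested.

```python
from enum import IntFlag


class RuntimeLibrary(IntFlag):
    """Illustrative mirror of the library bit values shown in the hunk."""

    MARKER = 1 << 3
    RCCL = 1 << 4
    ROCDECODE = 1 << 5
    ROCJPEG = 1 << 6


def wants_library(mask, library):
    """Return True when the given library bit is set in the mask."""
    return bool(mask & library)


# Combine the two media-related libraries into one mask.
media_mask = RuntimeLibrary.ROCDECODE | RuntimeLibrary.ROCJPEG
```

Because each value occupies a distinct bit, moving the `*_LAST` aliases to the rocJPEG entries, as the diff does, keeps them pointing at the highest defined flag.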
@@ -25,17 +25,7 @@
#include <rocprofiler-sdk/defines.h>
#include <rocprofiler-sdk/version.h>
#if !defined(ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE)
# if defined __has_include
# if __has_include(<rocdecode/rocparser.h>) && __has_include(<rocdecode/rocdecode.h>) && __has_include(<rocdecode/roc_bitstream_reader.h>)
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 1
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
#endif
#include <rocprofiler-sdk/rocdecode/details/rocdecode_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE > 0
# include <rocdecode/roc_bitstream_reader.h>
@@ -27,7 +27,7 @@
#include <rocprofiler-sdk/version.h>
/**
* @brief ROCProfiler enumeration of HSA Core API tracing operations
* @brief ROCProfiler enumeration of rocDecode API tracing operations
*/
typedef enum // NOLINT(performance-enum-size)
{
@@ -1,11 +1,11 @@
#
#
# Installation of public ROCDecode headers
# Installation of public rocDecode headers
#
#
set(ROCPROFILER_ROCDECODE_DETAILS_HEADER_FILES
rocdecode_api_trace.h rocdecode.h rocparser.h rocdecode_version.h
roc_bitstream_reader.h)
roc_bitstream_reader.h rocdecode_headers.h)
install(
FILES ${ROCPROFILER_ROCDECODE_DETAILS_HEADER_FILES}
@@ -22,17 +22,7 @@ THE SOFTWARE.
#pragma once
#if !defined(ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE)
# if defined __has_include
# if __has_include(<rocdecode/rocdecode.h>)
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 1
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
#endif
#include <rocprofiler-sdk/rocdecode/details/rocdecode_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE > 0
# include <rocdecode/rocdecode.h>
@@ -31,17 +31,8 @@ THE SOFTWARE.
#endif
#include "hip/hip_runtime.h"
#if !defined(ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE)
# if defined __has_include
# if __has_include(<rocdecode/rocdecode_version.h>)
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 1
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
#endif
#include <rocprofiler-sdk/rocdecode/details/rocdecode_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE > 0
# include <rocdecode/rocdecode_version.h>
@@ -21,17 +21,7 @@ THE SOFTWARE.
*/
#pragma once
#if !defined(ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE)
# if defined __has_include
# if __has_include(<rocdecode/rocparser.h>) && __has_include(<rocdecode/rocdecode.h>) && __has_include(<rocdecode/roc_bitstream_reader.h>)
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 1
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
#endif
#include <rocprofiler-sdk/rocdecode/details/rocdecode_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE > 0
# include <rocdecode/roc_bitstream_reader.h>
@@ -0,0 +1,39 @@
/*
Copyright (c) 2024 - 2025 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
/*
ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE is set to 0 due to issues with the
rocDecode cmake setup and header files. Once they are resolved, the first
if-condition should set the ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE variable
to 1
*/
#if !defined(ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE)
# if defined __has_include
# if __has_include(<rocdecode/rocdecode.h>)
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
#endif
@@ -22,17 +22,7 @@ THE SOFTWARE.
#pragma once
#if !defined(ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE)
# if defined __has_include
# if __has_include(<rocdecode/rocdecode.h>)
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 1
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
#endif
#include <rocprofiler-sdk/rocdecode/details/rocdecode_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE > 0
# include <rocdecode/rocdecode.h>
@@ -26,6 +26,6 @@
typedef enum
{
ROCPROFILER_ROCDECODE_TABLE_ID_NONE = -1,
ROCPROFILER_ROCDECODE_TABLE_ID = 0,
ROCPROFILER_ROCDECODE_TABLE_ID_CORE = 0,
ROCPROFILER_ROCDECODE_TABLE_ID_LAST,
} rocprofiler_rocdecode_table_id_t;
@@ -0,0 +1,27 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
#pragma once
#include <rocprofiler-sdk/rocjpeg/api_args.h>
#include <rocprofiler-sdk/rocjpeg/api_id.h>
#include <rocprofiler-sdk/rocjpeg/table_id.h>
@@ -0,0 +1,13 @@
#
#
# Installation of public rocJPEG headers
#
#
set(ROCPROFILER_ROCJPEG_HEADER_FILES api_args.h api_id.h table_id.h)
install(
FILES ${ROCPROFILER_ROCJPEG_HEADER_FILES}
DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/rocprofiler-sdk/rocjpeg
COMPONENT development)
add_subdirectory(details)
@@ -0,0 +1,118 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#pragma once
#include <rocprofiler-sdk/defines.h>
#include <rocprofiler-sdk/version.h>
#include <rocprofiler-sdk/rocjpeg/details/rocjpeg_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG > 0
# include <rocjpeg/rocjpeg.h>
#else
# include <rocprofiler-sdk/rocjpeg/details/rocjpeg.h>
#endif
#include <stdint.h>
ROCPROFILER_EXTERN_C_INIT
// Empty struct has a size of 0 in C but size of 1 in C++.
// This struct is added to the union members which represent
// functions with no arguments to ensure ABI compatibility
typedef struct rocprofiler_rocjpeg_api_no_args
{
char empty;
} rocprofiler_rocjpeg_api_no_args;
typedef union rocprofiler_rocjpeg_api_retval_t
{
int32_t rocJpegStatus_retval;
const char* const_charp_retval;
} rocprofiler_rocjpeg_api_retval_t;
typedef union rocprofiler_rocjpeg_api_args_t
{
struct
{
RocJpegStreamHandle* jpeg_stream_handle;
} rocJpegStreamCreate;
struct
{
const unsigned char* data;
size_t length;
RocJpegStreamHandle jpeg_stream_handle;
} rocJpegStreamParse;
struct
{
RocJpegStreamHandle jpeg_stream_handle;
} rocJpegStreamDestroy;
struct
{
RocJpegBackend backend;
int device_id;
RocJpegHandle* handle;
} rocJpegCreate;
struct
{
RocJpegHandle handle;
} rocJpegDestroy;
struct
{
RocJpegHandle handle;
RocJpegStreamHandle jpeg_stream_handle;
uint8_t* num_components;
RocJpegChromaSubsampling* subsampling;
uint32_t* widths;
uint32_t* heights;
} rocJpegGetImageInfo;
struct
{
RocJpegHandle handle;
RocJpegStreamHandle jpeg_stream_handle;
const RocJpegDecodeParams* decode_params;
RocJpegImage* destination;
} rocJpegDecode;
struct
{
RocJpegHandle handle;
RocJpegStreamHandle* jpeg_stream_handles;
int batch_size;
const RocJpegDecodeParams* decode_params;
RocJpegImage* destinations;
} rocJpegDecodeBatched;
struct
{
RocJpegStatus rocjpeg_status;
} rocJpegGetErrorName;
} rocprofiler_rocjpeg_api_args_t;
ROCPROFILER_EXTERN_C_FINI
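The comment on `rocprofiler_rocjpeg_api_no_args` explains why the placeholder carries a `char` member. The sketch below illustrates that point under stated assumptions: `example_no_args` and `example_args_t` are hypothetical stand-ins written for this note, not SDK types. In the union shown above, every rocJPEG function takes at least one argument, so no member currently uses the placeholder.

```cpp
#include <cstddef>

extern "C" {
// Hypothetical stand-in mirroring rocprofiler_rocjpeg_api_no_args.
// The single char member gives the struct the same size (1 byte)
// whether the header is compiled as C or as C++.
typedef struct example_no_args
{
    char empty;
} example_no_args;

// Hypothetical argument union: one member for a function without
// arguments, one for a function taking a single int.
typedef union example_args_t
{
    example_no_args no_arg_function;
    struct
    {
        int value;
    } one_arg_function;
} example_args_t;
}

// The placeholder occupies exactly one byte.
static_assert(sizeof(example_no_args) == 1, "placeholder must be one byte");
// The union is at least as large as its largest member.
static_assert(sizeof(example_args_t) >= sizeof(int), "union must hold the int member");
```

Because the placeholder has a defined, non-zero size in both languages, adding it to a union never changes the union layout depending on which compiler reads the header.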
@@ -0,0 +1,45 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#pragma once
#include <rocprofiler-sdk/version.h>
/**
* @brief ROCProfiler enumeration of rocJPEG API tracing operations
*/
typedef enum // NOLINT(performance-enum-size)
{
ROCPROFILER_ROCJPEG_API_ID_NONE = -1,
ROCPROFILER_ROCJPEG_API_ID_rocJpegStreamCreate = 0,
ROCPROFILER_ROCJPEG_API_ID_rocJpegStreamParse,
ROCPROFILER_ROCJPEG_API_ID_rocJpegStreamDestroy,
ROCPROFILER_ROCJPEG_API_ID_rocJpegCreate,
ROCPROFILER_ROCJPEG_API_ID_rocJpegDestroy,
ROCPROFILER_ROCJPEG_API_ID_rocJpegGetImageInfo,
ROCPROFILER_ROCJPEG_API_ID_rocJpegDecode,
ROCPROFILER_ROCJPEG_API_ID_rocJpegDecodeBatched,
ROCPROFILER_ROCJPEG_API_ID_rocJpegGetErrorName,
ROCPROFILER_ROCJPEG_API_ID_LAST,
} rocprofiler_rocjpeg_api_id_t;
@@ -0,0 +1,12 @@
#
#
# Installation of public rocJPEG headers
#
#
set(ROCPROFILER_ROCJPEG_DETAILS_HEADER_FILES rocjpeg_api_trace.h rocjpeg.h
rocjpeg_version.h rocjpeg_headers.h)
install(
FILES ${ROCPROFILER_ROCJPEG_DETAILS_HEADER_FILES}
DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/rocprofiler-sdk/rocjpeg/details
COMPONENT development)
@@ -0,0 +1,418 @@
/* Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#ifndef ROC_JPEG_H
#define ROC_JPEG_H
#define ROCJPEGAPI
#pragma once
#include <rocprofiler-sdk/rocjpeg/details/rocjpeg_headers.h>
#include "hip/hip_runtime.h"
#if ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG > 0
# include <rocjpeg/rocjpeg_version.h>
#else
# include <rocprofiler-sdk/rocjpeg/details/rocjpeg_version.h>
#endif
/**
* @file rocjpeg.h
* @brief The AMD rocJPEG Library.
* @defgroup group_amd_rocjpeg rocJPEG: AMD ROCm JPEG Decode API
* @brief The rocJPEG API is a toolkit for decoding JPEG images using a hardware-accelerated JPEG
* decoder on AMD GPUs.
*/
#if defined(__cplusplus)
extern "C" {
#endif // __cplusplus
/**
* @def ROCJPEG_MAX_COMPONENT
* @ingroup group_amd_rocjpeg
* Maximum number of channels rocJPEG supports
*/
#define ROCJPEG_MAX_COMPONENT 4
/**
* @enum RocJpegStatus
* @ingroup group_amd_rocjpeg
* @brief Enumeration representing the status codes for the rocJPEG library.
*/
typedef enum
{
ROCJPEG_STATUS_SUCCESS = 0, /**< The operation completed successfully. */
ROCJPEG_STATUS_NOT_INITIALIZED = -1, /**< The rocJPEG library is not initialized. */
ROCJPEG_STATUS_INVALID_PARAMETER = -2, /**< An invalid parameter was passed to a function. */
ROCJPEG_STATUS_BAD_JPEG = -3, /**< The input JPEG data is corrupted or invalid. */
ROCJPEG_STATUS_JPEG_NOT_SUPPORTED = -4, /**< The JPEG format is not supported. */
ROCJPEG_STATUS_OUTOF_MEMORY = -5, /**< Out of memory error. */
ROCJPEG_STATUS_EXECUTION_FAILED = -6, /**< The execution of a function failed. */
ROCJPEG_STATUS_ARCH_MISMATCH = -7, /**< The architecture is not supported. */
ROCJPEG_STATUS_INTERNAL_ERROR = -8, /**< Internal error occurred. */
ROCJPEG_STATUS_IMPLEMENTATION_NOT_SUPPORTED =
-9, /**< The requested implementation is not supported. */
ROCJPEG_STATUS_HW_JPEG_DECODER_NOT_SUPPORTED =
-10, /**< Hardware JPEG decoder is not supported. */
ROCJPEG_STATUS_RUNTIME_ERROR = -11, /**< Runtime error occurred. */
ROCJPEG_STATUS_NOT_IMPLEMENTED = -12, /**< The requested feature is not implemented. */
} RocJpegStatus;
/**
* @enum RocJpegChromaSubsampling
* @ingroup group_amd_rocjpeg
* @brief Enum representing the chroma subsampling options for JPEG encoding/decoding.
*
* The `RocJpegChromaSubsampling` enum defines the available chroma subsampling options for JPEG
* encoding/decoding. Chroma subsampling refers to the reduction of color information in an image to
* reduce file size.
*
* The possible values are:
* - `ROCJPEG_CSS_444`: Full chroma resolution (4:4:4).
* - `ROCJPEG_CSS_440`: Chroma resolution reduced by half vertically (4:4:0).
* - `ROCJPEG_CSS_422`: Chroma resolution reduced by half horizontally (4:2:2).
* - `ROCJPEG_CSS_420`: Chroma resolution reduced by half both horizontally and vertically (4:2:0).
* - `ROCJPEG_CSS_411`: Chroma resolution reduced by a quarter horizontally (4:1:1).
* - `ROCJPEG_CSS_400`: No chroma information (4:0:0).
* - `ROCJPEG_CSS_UNKNOWN`: Unknown chroma subsampling.
*/
typedef enum
{
ROCJPEG_CSS_444 = 0,
ROCJPEG_CSS_440 = 1,
ROCJPEG_CSS_422 = 2,
ROCJPEG_CSS_420 = 3,
ROCJPEG_CSS_411 = 4,
ROCJPEG_CSS_400 = 5,
ROCJPEG_CSS_UNKNOWN = -1
} RocJpegChromaSubsampling;
/**
* @struct RocJpegImage
* @ingroup group_amd_rocjpeg
* @brief Structure representing a JPEG image.
*
* This structure holds the information about a JPEG image, including the pointers to the image
* channels and the pitch (stride) of each channel.
*/
typedef struct
{
uint8_t* channel[ROCJPEG_MAX_COMPONENT]; /**< Pointers to the image channels. */
uint32_t pitch[ROCJPEG_MAX_COMPONENT]; /**< Pitch (stride) of each channel. */
} RocJpegImage;
/**
* @enum RocJpegOutputFormat
* @ingroup group_amd_rocjpeg
* @brief Enum representing the output format options for the RocJpegImage.
*
* The `RocJpegOutputFormat` enum specifies the different output formats that can be used when
* decoding a JPEG image using the VCN JPEG decoder.
*
* The available output formats are:
* - `ROCJPEG_OUTPUT_NATIVE`: Returns the native unchanged decoded YUV image from the VCN JPEG
* decoder. The channel arrangement depends on the chroma subsampling format.
* - `ROCJPEG_OUTPUT_YUV_PLANAR`: Extracts the Y, U, and V channels from the decoded YUV image and
* writes them into separate channels of the RocJpegImage.
* - `ROCJPEG_OUTPUT_Y`: Returns only the luma component (Y) and writes it to the first channel of
* the RocJpegImage.
* - `ROCJPEG_OUTPUT_RGB`: Converts the decoded image to interleaved RGB format using the VCN JPEG
* decoder or HIP kernels and writes it to the first channel of the RocJpegImage.
* - `ROCJPEG_OUTPUT_RGB_PLANAR`: Converts the decoded image to RGB PLANAR format using the VCN JPEG
* decoder or HIP kernels and writes the RGB channels to separate channels of the RocJpegImage.
* - `ROCJPEG_OUTPUT_FORMAT_MAX`: Maximum allowed value for the output format.
*/
typedef enum
{
/** Return the native, unchanged decoded YUV image from the VCN JPEG decoder.
For ROCJPEG_CSS_444 and ROCJPEG_CSS_440, write Y, U, and V to the first, second, and third
channels of RocJpegImage. For ROCJPEG_CSS_422, write YUYV (packed) to the first channel of
RocJpegImage. For ROCJPEG_CSS_420, write Y to the first channel and UV (interleaved) to the
second channel of RocJpegImage. For ROCJPEG_CSS_400, write Y to the first channel of
RocJpegImage. */
ROCJPEG_OUTPUT_NATIVE = 0,
/** Extract the Y, U, and V channels from the decoded YUV image from the VCN JPEG decoder and
write them into the first, second, and third channels of RocJpegImage. For ROCJPEG_CSS_400,
write Y to the first channel of RocJpegImage. */
ROCJPEG_OUTPUT_YUV_PLANAR = 1,
/** Return the luma component (Y) and write it to the first channel of RocJpegImage. */
ROCJPEG_OUTPUT_Y = 2,
/** Convert to interleaved RGB using the VCN JPEG decoder (on MI300+) or HIP kernels and
write it to the first channel of RocJpegImage. */
ROCJPEG_OUTPUT_RGB = 3,
/** Convert to planar RGB using the VCN JPEG decoder (on MI300+) or HIP kernels and write it to
the first, second, and third channels of RocJpegImage. */
ROCJPEG_OUTPUT_RGB_PLANAR = 4,
ROCJPEG_OUTPUT_FORMAT_MAX = 5 /**< maximum allowed value */
} RocJpegOutputFormat;
/**
* @struct RocJpegDecodeParams
* @ingroup group_amd_rocjpeg
* @brief Structure containing parameters for JPEG decoding.
*
* This structure defines the parameters for decoding a JPEG image using the rocJpeg library.
* It specifies the output format, crop rectangle, and target dimensions for the decoded image.
* Note that if both the crop rectangle and target dimensions are defined, cropping is done first,
* followed by resizing the resulting ROI to the target dimension.
*/
typedef struct
{
RocJpegOutputFormat
output_format; /**< Output data format. See RocJpegOutputFormat for description. */
struct
{
int16_t left; /**< Left coordinate of the crop rectangle. */
int16_t top; /**< Top coordinate of the crop rectangle. */
int16_t right; /**< Right coordinate of the crop rectangle. */
int16_t bottom; /**< Bottom coordinate of the crop rectangle. */
} crop_rectangle; /**< Defines the region of interest (ROI) to be copied into the RocJpegImage
output buffers. */
struct
{
uint32_t width; /**< Target width of the picture to be resized. */
uint32_t height; /**< Target height of the picture to be resized. */
} target_dimension; /**< (future use) Defines the target width and height of the picture to be
resized. Both should be even. If specified, allocate the RocJpegImage
buffers based on these dimensions. */
} RocJpegDecodeParams;
/**
* @enum RocJpegBackend
* @ingroup group_amd_rocjpeg
* @brief The backend options for the rocJpeg library.
*
* This enum defines the available backend options for the rocJpeg library.
* The backend can be either hardware or hybrid.
*/
typedef enum
{
ROCJPEG_BACKEND_HARDWARE = 0, /**< Hardware backend option. */
ROCJPEG_BACKEND_HYBRID = 1 /**< Hybrid backend option. */
} RocJpegBackend;
/**
* @brief A handle representing a RocJpegStream instance.
*
* The `RocJpegStreamHandle` is a pointer type used to represent a RocJpegStream instance.
* It is used as a handle to parse and store various parameters from a JPEG stream.
*/
typedef void* RocJpegStreamHandle;
/**
* @fn RocJpegStatus ROCJPEGAPI rocJpegStreamCreate(RocJpegStreamHandle *jpeg_stream_handle);
* @ingroup group_amd_rocjpeg
* @brief Creates a RocJpegStreamHandle for JPEG stream processing.
*
* This function creates a RocJpegStreamHandle, which is used for processing JPEG streams.
* The created handle is stored in the `jpeg_stream_handle` parameter.
*
* @param jpeg_stream_handle Pointer to a RocJpegStreamHandle variable that will hold the created
* handle.
* @return RocJpegStatus Returns the status of the operation. Possible values are:
* - ROCJPEG_STATUS_SUCCESS: The operation was successful.
* - ROCJPEG_STATUS_INVALID_PARAMETER: The `jpeg_stream_handle` parameter is
* NULL.
* - ROCJPEG_STATUS_OUTOF_MEMORY: Failed to allocate memory for the handle.
* - Another RocJpegStatus error code if the operation fails for a different reason.
*/
RocJpegStatus ROCJPEGAPI
rocJpegStreamCreate(RocJpegStreamHandle* jpeg_stream_handle);
/**
* @fn RocJpegStatus ROCJPEGAPI rocJpegStreamParse(const unsigned char *data, size_t length,
* RocJpegStreamHandle jpeg_stream_handle);
* @ingroup group_amd_rocjpeg
* @brief Parses a JPEG stream.
*
* This function parses a JPEG stream represented by the `data` parameter of length `length`.
* The parsed stream is associated with the `jpeg_stream_handle` provided.
*
* @param data The pointer to the JPEG stream data.
* @param length The length of the JPEG stream data.
* @param jpeg_stream_handle The handle to the JPEG stream.
* @return The status of the JPEG stream parsing operation.
*/
RocJpegStatus ROCJPEGAPI
rocJpegStreamParse(const unsigned char* data,
size_t length,
RocJpegStreamHandle jpeg_stream_handle);
/**
* @fn RocJpegStatus ROCJPEGAPI rocJpegStreamDestroy(RocJpegStreamHandle jpeg_stream_handle);
* @ingroup group_amd_rocjpeg
* @brief Destroys a RocJpegStreamHandle object and releases associated resources.
*
* This function destroys the RocJpegStreamHandle object specified by `jpeg_stream_handle` and
* releases any resources associated with it. After calling this function, the `jpeg_stream_handle`
* becomes invalid and should not be used anymore.
*
* @param jpeg_stream_handle The handle to the RocJpegStreamHandle object to be destroyed.
* @return The status of the operation. Returns ROCJPEG_STATUS_SUCCESS if the operation is
* successful, or an error code if an error occurs.
*/
RocJpegStatus ROCJPEGAPI
rocJpegStreamDestroy(RocJpegStreamHandle jpeg_stream_handle);
/**
* @brief A handle representing a RocJpeg instance.
*
* The `RocJpegHandle` is a pointer type used to represent a RocJpeg instance.
* It is used as a handle to perform various operations on the rocJpeg library.
*/
typedef void* RocJpegHandle;
/**
* @fn RocJpegStatus ROCJPEGAPI rocJpegCreate(RocJpegBackend backend, int device_id, RocJpegHandle
* *handle);
* @ingroup group_amd_rocjpeg
* @brief Creates a RocJpegHandle for JPEG decoding.
*
* This function creates a RocJpegHandle for JPEG decoding using the specified backend and device
* ID.
*
* @param backend The backend to be used for JPEG decoding.
* @param device_id The ID of the device to be used for JPEG decoding.
* @param handle Pointer to a RocJpegHandle variable to store the created handle.
* @return The status of the operation. Returns ROCJPEG_STATUS_INVALID_PARAMETER if handle is
* nullptr, ROCJPEG_STATUS_NOT_INITIALIZED if the rocJPEG handle initialization fails, or the status
* returned by the InitializeDecoder function of the rocjpeg_decoder.
*/
RocJpegStatus ROCJPEGAPI
rocJpegCreate(RocJpegBackend backend, int device_id, RocJpegHandle* handle);
/**
* @fn RocJpegStatus ROCJPEGAPI rocJpegDestroy(RocJpegHandle handle);
* @ingroup group_amd_rocjpeg
* @brief Destroys a RocJpegHandle object.
*
* This function destroys the RocJpegHandle object pointed to by the given handle.
* It releases any resources associated with the handle and frees the memory.
*
* @param handle The handle to the RocJpegHandle object to be destroyed.
* @return The status of the operation. Returns ROCJPEG_STATUS_SUCCESS if the handle was
* successfully destroyed, or ROCJPEG_STATUS_INVALID_PARAMETER if the handle is nullptr.
*/
RocJpegStatus ROCJPEGAPI
rocJpegDestroy(RocJpegHandle handle);
/**
* @fn RocJpegStatus ROCJPEGAPI rocJpegGetImageInfo(RocJpegHandle handle, RocJpegStreamHandle
* jpeg_stream_handle, uint8_t *num_components, RocJpegChromaSubsampling *subsampling, uint32_t
* *widths, uint32_t *heights);
* @ingroup group_amd_rocjpeg
* @brief Retrieves information about the JPEG image.
*
* This function retrieves the number of components, chroma subsampling, and dimensions (width and
* height) of the JPEG image specified by the `jpeg_stream_handle`. The information is stored in the
* provided output parameters `num_components`, `subsampling`, `widths`, and `heights`.
*
* @param handle The handle to the RocJpegDecoder instance.
* @param jpeg_stream_handle The handle to the RocJpegStream instance representing the JPEG image.
* @param num_components A pointer to an unsigned 8-bit integer that will store the number of
* components in the JPEG image.
* @param subsampling A pointer to a RocJpegChromaSubsampling enum that will store the chroma
* subsampling information.
* @param widths A pointer to an unsigned 32-bit integer array that will store the width of each
* component in the JPEG image.
* @param heights A pointer to an unsigned 32-bit integer array that will store the height of each
* component in the JPEG image.
*
* @return The RocJpegStatus indicating the success or failure of the operation.
* - ROCJPEG_STATUS_SUCCESS: The operation was successful.
* - ROCJPEG_STATUS_INVALID_PARAMETER: One or more input parameters are invalid.
* - ROCJPEG_STATUS_RUNTIME_ERROR: An exception occurred during the operation.
*/
RocJpegStatus ROCJPEGAPI
rocJpegGetImageInfo(RocJpegHandle handle,
RocJpegStreamHandle jpeg_stream_handle,
uint8_t* num_components,
RocJpegChromaSubsampling* subsampling,
uint32_t* widths,
uint32_t* heights);
/**
* @fn RocJpegStatus ROCJPEGAPI rocJpegDecode(RocJpegHandle handle, RocJpegStreamHandle
* jpeg_stream_handle, const RocJpegDecodeParams *decode_params, RocJpegImage *destination);
* @ingroup group_amd_rocjpeg
* @brief Decodes a JPEG image using the rocJPEG library.
*
* This function decodes a JPEG image using the rocJPEG library. It takes a RocJpegHandle, a
* RocJpegStreamHandle, a pointer to RocJpegDecodeParams, and a pointer to RocJpegImage as input
* parameters. The function returns a RocJpegStatus indicating the success or failure of the
* decoding operation.
*
* @param handle The RocJpegHandle representing the rocJPEG decoder instance.
* @param jpeg_stream_handle The RocJpegStreamHandle representing the input JPEG stream.
* @param decode_params A pointer to RocJpegDecodeParams containing the decoding parameters.
* @param destination A pointer to RocJpegImage where the decoded image will be stored.
* @return A RocJpegStatus indicating the success or failure of the decoding operation.
*/
RocJpegStatus ROCJPEGAPI
rocJpegDecode(RocJpegHandle handle,
RocJpegStreamHandle jpeg_stream_handle,
const RocJpegDecodeParams* decode_params,
RocJpegImage* destination);
/**
* @fn RocJpegStatus ROCJPEGAPI rocJpegDecodeBatched(RocJpegHandle handle, RocJpegStreamHandle
* *jpeg_stream_handles, int batch_size, const RocJpegDecodeParams *decode_params, RocJpegImage
* *destinations);
* @ingroup group_amd_rocjpeg
* @brief Decodes a batch of JPEG images using the rocJPEG library.
*
* Decodes `batch_size` JPEG streams from `jpeg_stream_handles` and writes the decoded images to
* `destinations`.
*
* @param handle The rocJPEG handle.
* @param jpeg_stream_handles An array of rocJPEG stream handles representing the input JPEG
* streams.
* @param batch_size The number of JPEG streams in the batch.
* @param decode_params The decode parameters for the JPEG decoding process.
* @param destinations An array of rocJPEG images representing the output decoded images.
* @return The status of the JPEG decoding operation.
*/
RocJpegStatus ROCJPEGAPI
rocJpegDecodeBatched(RocJpegHandle handle,
RocJpegStreamHandle* jpeg_stream_handles,
int batch_size,
const RocJpegDecodeParams* decode_params,
RocJpegImage* destinations);
/**
* @fn extern const char* ROCJPEGAPI rocJpegGetErrorName(RocJpegStatus rocjpeg_status);
* @ingroup group_amd_rocjpeg
* @brief Retrieves the name of the error associated with the given RocJpegStatus.
*
* This function returns a string representation of the error associated with the given
* RocJpegStatus.
*
* @param rocjpeg_status The RocJpegStatus for which to retrieve the error name.
* @return A pointer to a constant character string representing the error name.
*/
extern const char* ROCJPEGAPI
rocJpegGetErrorName(RocJpegStatus rocjpeg_status);
#if defined(__cplusplus)
}
#endif
#endif // ROC_JPEG_H
@@ -0,0 +1,120 @@
/*
Copyright (c) 2024 - 2025 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#pragma once
#include <rocprofiler-sdk/rocjpeg/details/rocjpeg_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG > 0
# include <rocjpeg/rocjpeg.h>
#else
# include <rocprofiler-sdk/rocjpeg/details/rocjpeg.h>
#endif
// Define version macros for the rocJPEG API dispatch table, specifying the MAJOR and STEP versions.
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! IMPORTANT !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
// 1. When adding new functions to the rocJPEG API dispatch table, always append the new function
// pointer to the end of the table and increment the dispatch table's version number. Never
// rearrange the order of the member variables in the dispatch table, as doing so will break
// the Application Binary Interface (ABI).
// 2. In critical situations where the type of an existing member variable in a dispatch table has
// been changed or removed due to a data type modification, it is important to increment the
// major version number of the rocJPEG API dispatch table. If the function pointer type can no
// longer be declared, do not remove it. Instead, change the function pointer type to `void*`
// and ensure it is always initialized to `nullptr`.
//
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
//
// The major version number should ideally remain unchanged. Increment the
// ROCJPEG_RUNTIME_API_TABLE_MAJOR_VERSION only for fundamental changes to the RocJpegDispatchTable
// struct, such as altering the type or name of an existing member variable. Please DO NOT REMOVE
// it.
#define ROCJPEG_RUNTIME_API_TABLE_MAJOR_VERSION 0
// Increment the ROCJPEG_RUNTIME_API_TABLE_STEP_VERSION when new runtime API functions are added.
// If the corresponding ROCJPEG_RUNTIME_API_TABLE_MAJOR_VERSION increases, reset the
// ROCJPEG_RUNTIME_API_TABLE_STEP_VERSION to zero.
#define ROCJPEG_RUNTIME_API_TABLE_STEP_VERSION 0
// rocJPEG API interface
typedef RocJpegStatus(ROCJPEGAPI* PfnRocJpegStreamCreate)(RocJpegStreamHandle* jpeg_stream_handle);
typedef RocJpegStatus(ROCJPEGAPI* PfnRocJpegStreamParse)(const unsigned char* data,
size_t length,
RocJpegStreamHandle jpeg_stream_handle);
typedef RocJpegStatus(ROCJPEGAPI* PfnRocJpegStreamDestroy)(RocJpegStreamHandle jpeg_stream_handle);
typedef RocJpegStatus(ROCJPEGAPI* PfnRocJpegCreate)(RocJpegBackend backend,
int device_id,
RocJpegHandle* handle);
typedef RocJpegStatus(ROCJPEGAPI* PfnRocJpegDestroy)(RocJpegHandle handle);
typedef RocJpegStatus(ROCJPEGAPI* PfnRocJpegGetImageInfo)(RocJpegHandle handle,
RocJpegStreamHandle jpeg_stream_handle,
uint8_t* num_components,
RocJpegChromaSubsampling* subsampling,
uint32_t* widths,
uint32_t* heights);
typedef RocJpegStatus(ROCJPEGAPI* PfnRocJpegDecode)(RocJpegHandle handle,
RocJpegStreamHandle jpeg_stream_handle,
const RocJpegDecodeParams* decode_params,
RocJpegImage* destination);
typedef RocJpegStatus(ROCJPEGAPI* PfnRocJpegDecodeBatched)(RocJpegHandle handle,
RocJpegStreamHandle* jpeg_stream_handles,
int batch_size,
const RocJpegDecodeParams* decode_params,
RocJpegImage* destinations);
typedef const char*(ROCJPEGAPI* PfnRocJpegGetErrorName)(RocJpegStatus rocjpeg_status);
// rocJPEG API dispatch table
struct RocJpegDispatchTable
{
// ROCJPEG_RUNTIME_API_TABLE_STEP_VERSION == 0
size_t size;
PfnRocJpegStreamCreate pfn_rocjpeg_stream_create;
PfnRocJpegStreamParse pfn_rocjpeg_stream_parse;
PfnRocJpegStreamDestroy pfn_rocjpeg_stream_destroy;
PfnRocJpegCreate pfn_rocjpeg_create;
PfnRocJpegDestroy pfn_rocjpeg_destroy;
PfnRocJpegGetImageInfo pfn_rocjpeg_get_image_info;
PfnRocJpegDecode pfn_rocjpeg_decode;
PfnRocJpegDecodeBatched pfn_rocjpeg_decode_batched;
PfnRocJpegGetErrorName pfn_rocjpeg_get_error_name;
// PLEASE DO NOT EDIT ABOVE!
// ROCJPEG_RUNTIME_API_TABLE_STEP_VERSION == 1
// *******************************************************************************************
// READ BELOW
// *******************************************************************************************
// Please keep this text at the end of the structure:
// 1. Do not reorder any existing members.
// 2. Increase the step version definition before adding new members.
// 3. Insert new members under the appropriate step version comment.
// 4. Generate a comment for the next step version.
// 5. Add a "PLEASE DO NOT EDIT ABOVE!" comment.
// *******************************************************************************************
};
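The append-only rule and the leading `size` member of the dispatch table suggest one common compatibility check, sketched below. This is an assumption about how a consumer might use the `size` field, not something the header states. `ExampleTableV0`, `ExampleTableV1`, and `has_member` are hypothetical names invented for this illustration.

```cpp
#include <cstddef>

// Hypothetical table as shipped at step version 0.
struct ExampleTableV0
{
    std::size_t size;
    void*       pfn_a;
    void*       pfn_b;
};

// Hypothetical table at step version 1: one pointer appended, nothing reordered.
struct ExampleTableV1
{
    std::size_t size;
    void*       pfn_a;
    void*       pfn_b;
    void*       pfn_c;
};

// Returns true when a table whose producer reported `reported_size` bytes
// is large enough to contain the member at `offset` with size `member_size`.
constexpr bool
has_member(std::size_t reported_size, std::size_t offset, std::size_t member_size)
{
    return reported_size >= offset + member_size;
}

// Appending keeps the offsets of all existing members unchanged.
static_assert(offsetof(ExampleTableV1, pfn_a) == offsetof(ExampleTableV0, pfn_a), "pfn_a moved");
static_assert(offsetof(ExampleTableV1, pfn_b) == offsetof(ExampleTableV0, pfn_b), "pfn_b moved");

// A consumer built against V1 can tell whether pfn_c exists in the table it received.
static_assert(has_member(sizeof(ExampleTableV1), offsetof(ExampleTableV1, pfn_c), sizeof(void*)),
              "V1 table contains pfn_c");
static_assert(!has_member(sizeof(ExampleTableV0), offsetof(ExampleTableV1, pfn_c), sizeof(void*)),
              "V0 table does not contain pfn_c");
```

Reordering members would invalidate the first two assertions, which is why the comments in the struct insist on appending only.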
@@ -0,0 +1,37 @@
/*
Copyright (c) 2024 - 2025 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#pragma once
/*
ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG is set to 0 due to issues with the
rocJPEG cmake setup and header files. Once they are resolved, the first
if-condition should set the ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG variable
to 1
*/
#if !defined(ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG)
# if defined __has_include
# if __has_include(<rocjpeg/rocjpeg.h>)
# define ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG 0
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG 0
# endif
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG 0
# endif
#endif
@@ -0,0 +1,59 @@
/*
Copyright (c) 2024 - 2025 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#ifndef ROCJPEG_VERSION_H
#define ROCJPEG_VERSION_H
/*!
* \file
* \brief rocJPEG version
* \defgroup group_rocjpeg_version rocJPEG Version
* \brief rocJPEG version
*/
#ifdef __cplusplus
extern "C" {
#endif
#define ROCJPEG_MAJOR_VERSION 0
#define ROCJPEG_MINOR_VERSION 6
#define ROCJPEG_MICRO_VERSION 0
/**
* ROCJPEG_CHECK_VERSION:
* @major: major version, like 1 in 1.2.3
* @minor: minor version, like 2 in 1.2.3
* @micro: micro version, like 3 in 1.2.3
*
* Evaluates to %TRUE if the version of rocJPEG is greater than or equal to
* @major, @minor and @micro
*/
#define ROCJPEG_CHECK_VERSION(major, minor, micro) \
(ROCJPEG_MAJOR_VERSION > (major) || \
(ROCJPEG_MAJOR_VERSION == (major) && ROCJPEG_MINOR_VERSION > (minor)) || \
(ROCJPEG_MAJOR_VERSION == (major) && ROCJPEG_MINOR_VERSION == (minor) && \
ROCJPEG_MICRO_VERSION >= (micro)))
#ifdef __cplusplus
}
#endif
#endif
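The behaviour of `ROCJPEG_CHECK_VERSION` at the boundary can be checked at compile time. The sketch below copies the macros from the header above so that it compiles without the rocJPEG headers. The assertions are written for this note and assume the version values shown (0.6.0).

```cpp
// Local copies of the macros from rocjpeg_version.h (version 0.6.0).
#define ROCJPEG_MAJOR_VERSION 0
#define ROCJPEG_MINOR_VERSION 6
#define ROCJPEG_MICRO_VERSION 0

#define ROCJPEG_CHECK_VERSION(major, minor, micro)                                      \
    (ROCJPEG_MAJOR_VERSION > (major) ||                                                 \
     (ROCJPEG_MAJOR_VERSION == (major) && ROCJPEG_MINOR_VERSION > (minor)) ||           \
     (ROCJPEG_MAJOR_VERSION == (major) && ROCJPEG_MINOR_VERSION == (minor) &&           \
      ROCJPEG_MICRO_VERSION >= (micro)))

// The exact current version satisfies the check because the micro comparison is >=.
static_assert(ROCJPEG_CHECK_VERSION(0, 6, 0), "equal version must pass");
// Any older version passes.
static_assert(ROCJPEG_CHECK_VERSION(0, 5, 9), "older minor must pass");
// Newer versions fail.
static_assert(!ROCJPEG_CHECK_VERSION(0, 6, 1), "newer micro must fail");
static_assert(!ROCJPEG_CHECK_VERSION(1, 0, 0), "newer major must fail");
```

A typical use is guarding code that depends on a minimum library version, for example `#if ROCJPEG_CHECK_VERSION(0, 6, 0)` around calls introduced in that release.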
@@ -0,0 +1,31 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#pragma once
// NOLINTNEXTLINE(performance-enum-size)
typedef enum
{
ROCPROFILER_ROCJPEG_TABLE_ID_NONE = -1,
ROCPROFILER_ROCJPEG_TABLE_ID_CORE = 0,
ROCPROFILER_ROCJPEG_TABLE_ID_LAST,
} rocprofiler_rocjpeg_table_id_t;
@@ -165,5 +165,7 @@ using pc_sampling_host_trap_buffered_output_t =
domain_type::PC_SAMPLING_HOST_TRAP>;
using rocdecode_buffered_output_t =
buffered_output<rocprofiler_buffer_tracing_rocdecode_api_record_t, domain_type::ROCDECODE>;
using rocjpeg_buffered_output_t =
buffered_output<rocprofiler_buffer_tracing_rocjpeg_api_record_t, domain_type::ROCJPEG>;
} // namespace tool
} // namespace rocprofiler
@@ -62,6 +62,7 @@ DEFINE_BUFFER_TYPE_NAME(PC_SAMPLING_HOST_TRAP,
"pc_sampling_host_trap",
"pc_sampling_host_trap_stats")
DEFINE_BUFFER_TYPE_NAME(ROCDECODE, "ROCDECODE_API", "rocdecode_api_trace", "rocdecode_api_stats")
DEFINE_BUFFER_TYPE_NAME(ROCJPEG, "ROCJPEG_API", "rocjpeg_api_trace", "rocjpeg_api_stats")
#undef DEFINE_BUFFER_TYPE_NAME
+1
@@ -38,6 +38,7 @@ enum class domain_type
COUNTER_VALUES,
PC_SAMPLING_HOST_TRAP,
ROCDECODE,
ROCJPEG,
LAST,
};
+42
@@ -766,6 +766,48 @@ generate_csv(const output_config&
}
}
void
generate_csv(const output_config& cfg,
const metadata& tool_metadata,
const generator<rocprofiler_buffer_tracing_rocjpeg_api_record_t>& data,
const stats_entry_t& stats)
{
if(data.empty()) return;
if(cfg.stats && stats)
write_stats(get_stats_output_file(cfg, domain_type::ROCJPEG), stats.entries);
auto ofs = tool::csv_output_file{cfg,
domain_type::ROCJPEG,
tool::csv::api_csv_encoder{},
{"Domain",
"Function",
"Process_Id",
"Thread_Id",
"Correlation_Id",
"Start_Timestamp",
"End_Timestamp"}};
for(auto ditr : data)
{
for(auto record : data.get(ditr))
{
auto row_ss = std::stringstream{};
auto api_name = tool_metadata.get_operation_name(record.kind, record.operation);
rocprofiler::tool::csv::api_csv_encoder::write_row(
row_ss,
tool_metadata.get_kind_name(record.kind),
api_name,
tool_metadata.process_id,
record.thread_id,
record.correlation_id.internal,
record.start_timestamp,
record.end_timestamp);
ofs << row_ss.str();
}
}
}
void
generate_csv(const output_config& cfg,
const metadata& tool_metadata,
+6
@@ -93,6 +93,12 @@ generate_csv(const output_config&
const generator<rocprofiler_buffer_tracing_rocdecode_api_record_t>& data,
const stats_entry_t& stats);
void
generate_csv(const output_config& cfg,
const metadata& tool_metadata,
const generator<rocprofiler_buffer_tracing_rocjpeg_api_record_t>& data,
const stats_entry_t& stats);
void
generate_csv(const output_config& cfg,
const metadata& tool_metadata,
+3 -1
@@ -196,7 +196,8 @@ write_json(json_output& json_ar,
generator<rocprofiler_buffer_tracing_rccl_api_record_t> rccl_api_gen,
generator<rocprofiler_buffer_tracing_memory_allocation_record_t> memory_allocation_gen,
generator<rocprofiler_tool_pc_sampling_host_trap_record_t> pc_sampling_gen,
generator<rocprofiler_buffer_tracing_rocdecode_api_record_t> rocdecode_api_gen)
generator<rocprofiler_buffer_tracing_rocdecode_api_record_t> rocdecode_api_gen,
generator<rocprofiler_buffer_tracing_rocjpeg_api_record_t> rocjpeg_api_gen)
{
// summary
@@ -239,6 +240,7 @@ write_json(json_output& json_ar,
json_ar(cereal::make_nvp("scratch_memory", scratch_memory_gen));
json_ar(cereal::make_nvp("pc_sample_host_trap", pc_sampling_gen));
json_ar(cereal::make_nvp("rocdecode_api", rocdecode_api_gen));
json_ar(cereal::make_nvp("rocjpeg_api", rocjpeg_api_gen));
json_ar.finishNode();
}
}
+2 -1
@@ -95,7 +95,8 @@ write_json(json_output& json
generator<rocprofiler_buffer_tracing_rccl_api_record_t> rccl_api_gen,
generator<rocprofiler_buffer_tracing_memory_allocation_record_t> memory_allocation_gen,
generator<rocprofiler_tool_pc_sampling_host_trap_record_t> pc_sampling_gen,
generator<rocprofiler_buffer_tracing_rocdecode_api_record_t> rocdecode_api_gen);
generator<rocprofiler_buffer_tracing_rocdecode_api_record_t> rocdecode_api_gen,
generator<rocprofiler_buffer_tracing_rocjpeg_api_record_t> rocjpeg_api_gen);
} // namespace tool
} // namespace rocprofiler
+5 -1
@@ -368,7 +368,8 @@ write_otf2(
std::deque<rocprofiler_buffer_tracing_scratch_memory_record_t>* /*scratch_memory_data*/,
std::deque<rocprofiler_buffer_tracing_rccl_api_record_t>* rccl_api_data,
std::deque<rocprofiler_buffer_tracing_memory_allocation_record_t>* memory_allocation_data,
std::deque<rocprofiler_buffer_tracing_rocdecode_api_record_t>* rocdecode_api_data)
std::deque<rocprofiler_buffer_tracing_rocdecode_api_record_t>* rocdecode_api_data,
std::deque<rocprofiler_buffer_tracing_rocjpeg_api_record_t>* rocjpeg_api_data)
{
namespace sdk = ::rocprofiler::sdk;
@@ -421,6 +422,8 @@ write_otf2(
tids.emplace(itr.thread_id);
for(auto itr : *rocdecode_api_data)
tids.emplace(itr.thread_id);
for(auto itr : *rocjpeg_api_data)
tids.emplace(itr.thread_id);
for(auto itr : *memory_copy_data)
{
@@ -618,6 +621,7 @@ write_otf2(
add_event_data(marker_api_data, sdk::category::marker_api{});
add_event_data(rccl_api_data, sdk::category::rccl_api{});
add_event_data(rocdecode_api_data, sdk::category::rocdecode_api{});
add_event_data(rocjpeg_api_data, sdk::category::rocjpeg_api{});
}
for(auto itr : *memory_copy_data)
+2 -1
@@ -47,6 +47,7 @@ write_otf2(
std::deque<rocprofiler_buffer_tracing_scratch_memory_record_t>* scratch_memory_data,
std::deque<rocprofiler_buffer_tracing_rccl_api_record_t>* rccl_api_data,
std::deque<rocprofiler_buffer_tracing_memory_allocation_record_t>* memory_allocation_data,
std::deque<rocprofiler_buffer_tracing_rocdecode_api_record_t>* rocdecode_api_data);
std::deque<rocprofiler_buffer_tracing_rocdecode_api_record_t>* rocdecode_api_data,
std::deque<rocprofiler_buffer_tracing_rocjpeg_api_record_t>* rocjpeg_api_data);
} // namespace tool
} // namespace rocprofiler
+36 -1
@@ -73,7 +73,8 @@ write_perfetto(
const generator<rocprofiler_buffer_tracing_scratch_memory_record_t>& /*scratch_memory_gen*/,
const generator<rocprofiler_buffer_tracing_rccl_api_record_t>& rccl_api_gen,
const generator<rocprofiler_buffer_tracing_memory_allocation_record_t>& memory_allocation_gen,
const generator<rocprofiler_buffer_tracing_rocdecode_api_record_t>& rocdecode_api_gen)
const generator<rocprofiler_buffer_tracing_rocdecode_api_record_t>& rocdecode_api_gen,
const generator<rocprofiler_buffer_tracing_rocjpeg_api_record_t>& rocjpeg_api_gen)
{
namespace sdk = ::rocprofiler::sdk;
@@ -172,6 +173,9 @@ write_perfetto(
for(auto ditr : rocdecode_api_gen)
for(auto itr : rocdecode_api_gen.get(ditr))
tids.emplace(itr.thread_id);
for(auto ditr : rocjpeg_api_gen)
for(auto itr : rocjpeg_api_gen.get(ditr))
tids.emplace(itr.thread_id);
for(auto ditr : memory_copy_gen)
for(auto itr : memory_copy_gen.get(ditr))
@@ -434,6 +438,37 @@ write_perfetto(
tracing_session->FlushBlocking();
}
for(auto ditr : rocjpeg_api_gen)
for(auto itr : rocjpeg_api_gen.get(ditr))
{
auto name = buffer_names.at(itr.kind, itr.operation);
auto& track = thread_tracks.at(itr.thread_id);
TRACE_EVENT_BEGIN(sdk::perfetto_category<sdk::category::rocjpeg_api>::name,
::perfetto::StaticString(name.data()),
track,
itr.start_timestamp,
::perfetto::Flow::ProcessScoped(itr.correlation_id.internal),
"begin_ns",
itr.start_timestamp,
"end_ns",
itr.end_timestamp,
"delta_ns",
(itr.end_timestamp - itr.start_timestamp),
"tid",
itr.thread_id,
"kind",
itr.kind,
"operation",
itr.operation,
"corr_id",
itr.correlation_id.internal);
TRACE_EVENT_END(sdk::perfetto_category<sdk::category::rocjpeg_api>::name,
track,
itr.end_timestamp);
tracing_session->FlushBlocking();
}
for(auto ditr : memory_copy_gen)
for(auto itr : memory_copy_gen.get(ditr))
{
+2 -1
@@ -47,6 +47,7 @@ write_perfetto(
const generator<rocprofiler_buffer_tracing_scratch_memory_record_t>& scratch_memory_gen,
const generator<rocprofiler_buffer_tracing_rccl_api_record_t>& rccl_api_gen,
const generator<rocprofiler_buffer_tracing_memory_allocation_record_t>& memory_allocation_gen,
const generator<rocprofiler_buffer_tracing_rocdecode_api_record_t>& rocdecode_api_gen);
const generator<rocprofiler_buffer_tracing_rocdecode_api_record_t>& rocdecode_api_gen,
const generator<rocprofiler_buffer_tracing_rocjpeg_api_record_t>& rocjpeg_api_gen);
} // namespace tool
} // namespace rocprofiler
+18
@@ -246,6 +246,24 @@ generate_stats(const output_config& /*cfg*/,
return get_stats(rocdecode_stats);
}
stats_entry_t
generate_stats(const output_config& /*cfg*/,
const metadata& tool_metadata,
const generator<rocprofiler_buffer_tracing_rocjpeg_api_record_t>& data)
{
auto rocjpeg_stats = stats_map_t{};
for(auto ditr : data)
{
for(auto record : data.get(ditr))
{
auto api_name = tool_metadata.get_operation_name(record.kind, record.operation);
rocjpeg_stats[api_name] += (record.end_timestamp - record.start_timestamp);
}
}
return get_stats(rocjpeg_stats);
}
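The `generate_stats` overload above sums `end_timestamp - start_timestamp` per API name. A minimal sketch of that accumulation pattern follows; `api_record`, `duration_stats`, and `accumulate_stats` are hypothetical stand-ins for illustration and are not the tool's `stats_map_t` or `get_stats` types.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a buffered API record (only the fields used here).
struct api_record
{
    std::string   name;             // API name, already resolved from kind/operation
    std::uint64_t start_timestamp;  // nanoseconds
    std::uint64_t end_timestamp;    // nanoseconds
};

// Hypothetical per-function summary; the real tool keeps its own statistics type.
struct duration_stats
{
    std::uint64_t count    = 0;
    std::uint64_t total_ns = 0;
    std::uint64_t min_ns   = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t max_ns   = 0;

    // Fold one call duration into the running summary.
    duration_stats& operator+=(std::uint64_t delta_ns)
    {
        ++count;
        total_ns += delta_ns;
        if(delta_ns < min_ns) min_ns = delta_ns;
        if(delta_ns > max_ns) max_ns = delta_ns;
        return *this;
    }
};

// Group call durations by API name, mirroring `stats[api_name] += (end - start)`.
std::map<std::string, duration_stats>
accumulate_stats(const std::vector<api_record>& records)
{
    auto stats = std::map<std::string, duration_stats>{};
    for(const auto& rec : records)
        stats[rec.name] += (rec.end_timestamp - rec.start_timestamp);
    return stats;
}
```

Keying the map by name rather than by operation id keeps the summary readable in the stats output, at the cost of one string lookup per record.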
namespace
{
void
+5
@@ -80,6 +80,11 @@ generate_stats(const output_config&
const metadata& tool_metadata,
const generator<rocprofiler_buffer_tracing_rocdecode_api_record_t>& data);
stats_entry_t
generate_stats(const output_config& cfg,
const metadata& tool_metadata,
const generator<rocprofiler_buffer_tracing_rocjpeg_api_record_t>& data);
stats_entry_t
generate_stats(const output_config& cfg,
const metadata& tool_metadata,
@@ -111,6 +111,7 @@ struct config : output_config
bool hip_compiler_api_trace = get_env("ROCPROF_HIP_COMPILER_API_TRACE", false);
bool rccl_api_trace = get_env("ROCPROF_RCCL_API_TRACE", false);
bool rocdecode_api_trace = get_env("ROCPROF_ROCDECODE_API_TRACE", false);
bool rocjpeg_api_trace = get_env("ROCPROF_ROCJPEG_API_TRACE", false);
bool list_metrics = get_env("ROCPROF_LIST_METRICS", false);
bool list_metrics_output_file = get_env("ROCPROF_OUTPUT_LIST_METRICS_FILE", false);
bool pc_sampling_host_trap = false;
+44 -8
@@ -143,10 +143,11 @@ struct buffer_ids
rocprofiler_buffer_id_t rccl_api_trace = {};
rocprofiler_buffer_id_t pc_sampling_host_trap = {};
rocprofiler_buffer_id_t rocdecode_api_trace = {};
rocprofiler_buffer_id_t rocjpeg_api_trace = {};
auto as_array() const
{
return std::array<rocprofiler_buffer_id_t, 10>{hsa_api_trace,
return std::array<rocprofiler_buffer_id_t, 11>{hsa_api_trace,
hip_api_trace,
kernel_trace,
memory_copy_trace,
@@ -155,7 +156,8 @@ struct buffer_ids
scratch_memory,
rccl_api_trace,
pc_sampling_host_trap,
rocdecode_api_trace};
rocdecode_api_trace,
rocjpeg_api_trace};
}
};
@@ -831,6 +833,13 @@ buffered_tracing_callback(rocprofiler_context_id_t /*context*/,
tool::write_ring_buffer(*record, domain_type::ROCDECODE);
}
else if(header->kind == ROCPROFILER_BUFFER_TRACING_ROCJPEG_API)
{
auto* record =
static_cast<rocprofiler_buffer_tracing_rocjpeg_api_record_t*>(header->payload);
tool::write_ring_buffer(*record, domain_type::ROCJPEG);
}
else
{
ROCP_FATAL << fmt::format(
@@ -1491,6 +1500,26 @@ tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
"buffer tracing service for ROCDecode api configure");
}
if(tool::get_config().rocjpeg_api_trace)
{
ROCPROFILER_CALL(rocprofiler_create_buffer(get_client_ctx(),
buffer_size,
buffer_watermark,
ROCPROFILER_BUFFER_POLICY_LOSSLESS,
buffered_tracing_callback,
tool_data,
&get_buffers().rocjpeg_api_trace),
"buffer creation");
ROCPROFILER_CALL(
rocprofiler_configure_buffer_tracing_service(get_client_ctx(),
ROCPROFILER_BUFFER_TRACING_ROCJPEG_API,
nullptr,
0,
get_buffers().rocjpeg_api_trace),
"buffer tracing service for rocJPEG api configure");
}
if(tool::get_config().kernel_rename)
{
auto rename_ctx = rocprofiler_context_id_t{0};
@@ -1671,10 +1700,11 @@ tool_fini(void* /*tool_data*/)
tool::memory_allocation_buffered_output_t{tool::get_config().memory_allocation_trace};
auto counters_records_output =
tool::counter_records_buffered_output_t{tool::get_config().counter_collection};
auto rocdecode_output =
tool::rocdecode_buffered_output_t{tool::get_config().rocdecode_api_trace};
auto pc_sampling_host_trap_output =
tool::pc_sampling_host_trap_buffered_output_t{tool::get_config().pc_sampling_host_trap};
auto rocdecode_output =
tool::rocdecode_buffered_output_t{tool::get_config().rocdecode_api_trace};
auto rocjpeg_output = tool::rocjpeg_buffered_output_t{tool::get_config().rocjpeg_api_trace};
auto node_id_sort = [](const auto& lhs, const auto& rhs) { return lhs.node_id < rhs.node_id; };
auto agents_output = CHECK_NOTNULL(tool_metadata)->agents;
@@ -1694,6 +1724,7 @@ tool_fini(void* /*tool_data*/)
generate_output(scratch_memory_output, num_output, contributions);
generate_output(rocdecode_output, num_output, contributions);
generate_output(pc_sampling_host_trap_output, num_output, contributions);
generate_output(rocjpeg_output, num_output, contributions);
if(tool::get_config().advanced_thread_trace && !tool::get_config().att_capability.empty() &&
!tool_metadata->att_filenames.empty())
@@ -1767,7 +1798,8 @@ tool_fini(void* /*tool_data*/)
rccl_output.get_generator(),
memory_allocation_output.get_generator(),
pc_sampling_host_trap_output.get_generator(),
rocdecode_output.get_generator());
rocdecode_output.get_generator(),
rocjpeg_output.get_generator());
json_ar.finish_process();
tool::close_json(json_ar);
@@ -1786,7 +1818,8 @@ tool_fini(void* /*tool_data*/)
scratch_memory_output.get_generator(),
rccl_output.get_generator(),
memory_allocation_output.get_generator(),
rocdecode_output.get_generator());
rocdecode_output.get_generator(),
rocjpeg_output.get_generator());
}
if(tool::get_config().otf2_output && num_output > 0)
@@ -1800,6 +1833,7 @@ tool_fini(void* /*tool_data*/)
auto rccl_elem_data = rccl_output.load_all();
auto memory_allocation_elem_data = memory_allocation_output.load_all();
auto rocdecode_elem_data = rocdecode_output.load_all();
auto rocjpeg_elem_data = rocjpeg_output.load_all();
tool::write_otf2(tool::get_config(),
*tool_metadata,
@@ -1813,7 +1847,8 @@ tool_fini(void* /*tool_data*/)
&scratch_memory_elem_data,
&rccl_elem_data,
&memory_allocation_elem_data,
&rocdecode_elem_data);
&rocdecode_elem_data,
&rocjpeg_elem_data);
}
if(tool::get_config().summary_output && num_output > 0)
@@ -1833,8 +1868,9 @@ tool_fini(void* /*tool_data*/)
destroy_output(scratch_memory_output);
destroy_output(rccl_output);
destroy_output(counters_records_output);
destroy_output(rocdecode_output);
destroy_output(pc_sampling_host_trap_output);
destroy_output(rocdecode_output);
destroy_output(rocjpeg_output);
if(destructors)
{
+2 -1
@@ -38,7 +38,6 @@ add_library(rocprofiler-sdk::rocprofiler-sdk-object-library ALIAS
target_sources(rocprofiler-sdk-object-library PRIVATE ${ROCPROFILER_LIB_SOURCES}
${ROCPROFILER_LIB_HEADERS})
add_subdirectory(hsa)
add_subdirectory(hip)
add_subdirectory(code_object)
@@ -53,6 +52,7 @@ add_subdirectory(kernel_dispatch)
add_subdirectory(page_migration)
add_subdirectory(rccl)
add_subdirectory(rocdecode)
add_subdirectory(rocjpeg)
add_subdirectory(details)
add_subdirectory(ompt)
@@ -63,6 +63,7 @@ target_link_libraries(
rocprofiler-sdk::rocprofiler-sdk-hsa-runtime-nolink
rocprofiler-sdk::rocprofiler-sdk-rccl-nolink
rocprofiler-sdk::rocprofiler-sdk-rocdecode-nolink
rocprofiler-sdk::rocprofiler-sdk-rocjpeg-nolink
PRIVATE rocprofiler-sdk::rocprofiler-sdk-build-flags
rocprofiler-sdk::rocprofiler-sdk-memcheck
rocprofiler-sdk::rocprofiler-sdk-common-library
+1 -1
@@ -483,7 +483,7 @@ update_agent_runtime_visibility(rocprofiler_agent_t& agent_info)
};
static_assert(
ROCPROFILER_LIBRARY_LAST == ROCPROFILER_ROCDECODE_LIBRARY,
ROCPROFILER_LIBRARY_LAST == ROCPROFILER_ROCJPEG_LIBRARY,
"Since a new library was added to rocprofiler_runtime_library_t, please make sure "
"rocprofiler_agent_runtime_visiblity_t has an entry for this library (if "
"necessary) and that the logic below has been updated");
@@ -35,6 +35,7 @@
#include "lib/rocprofiler-sdk/rccl/rccl.hpp"
#include "lib/rocprofiler-sdk/registration.hpp"
#include "lib/rocprofiler-sdk/rocdecode/rocdecode.hpp"
#include "lib/rocprofiler-sdk/rocjpeg/rocjpeg.hpp"
#include "lib/rocprofiler-sdk/runtime_initialization.hpp"
#include <rocprofiler-sdk/fwd.h>
@@ -43,6 +44,7 @@
#include <rocprofiler-sdk/marker/table_id.h>
#include <rocprofiler-sdk/rccl/table_id.h>
#include <rocprofiler-sdk/rocdecode/table_id.h>
#include <rocprofiler-sdk/rocjpeg/table_id.h>
#include <rocprofiler-sdk/rocprofiler.h>
#include <atomic>
@@ -94,6 +96,7 @@ ROCPROFILER_BUFFER_TRACING_KIND_STRING(RCCL_API)
ROCPROFILER_BUFFER_TRACING_KIND_STRING(OMPT)
ROCPROFILER_BUFFER_TRACING_KIND_STRING(RUNTIME_INITIALIZATION)
ROCPROFILER_BUFFER_TRACING_KIND_STRING(ROCDECODE_API)
ROCPROFILER_BUFFER_TRACING_KIND_STRING(ROCJPEG_API)
template <size_t Idx, size_t... Tail>
std::pair<const char*, size_t>
@@ -293,7 +296,13 @@ rocprofiler_query_buffer_tracing_kind_operation_name(rocprofiler_buffer_tracing_
}
case ROCPROFILER_BUFFER_TRACING_ROCDECODE_API:
{
val = rocprofiler::rocdecode::name_by_id<ROCPROFILER_ROCDECODE_TABLE_ID>(operation);
val =
rocprofiler::rocdecode::name_by_id<ROCPROFILER_ROCDECODE_TABLE_ID_CORE>(operation);
break;
}
case ROCPROFILER_BUFFER_TRACING_ROCJPEG_API:
{
val = rocprofiler::rocjpeg::name_by_id<ROCPROFILER_ROCJPEG_TABLE_ID_CORE>(operation);
break;
}
};
@@ -429,7 +438,12 @@ rocprofiler_iterate_buffer_tracing_kind_operations(
}
case ROCPROFILER_BUFFER_TRACING_ROCDECODE_API:
{
ops = rocprofiler::rocdecode::get_ids<ROCPROFILER_ROCDECODE_TABLE_ID>();
ops = rocprofiler::rocdecode::get_ids<ROCPROFILER_ROCDECODE_TABLE_ID_CORE>();
break;
}
case ROCPROFILER_BUFFER_TRACING_ROCJPEG_API:
{
ops = rocprofiler::rocjpeg::get_ids<ROCPROFILER_ROCJPEG_TABLE_ID_CORE>();
break;
}
}
@@ -34,6 +34,7 @@
#include "lib/rocprofiler-sdk/rccl/rccl.hpp"
#include "lib/rocprofiler-sdk/registration.hpp"
#include "lib/rocprofiler-sdk/rocdecode/rocdecode.hpp"
#include "lib/rocprofiler-sdk/rocjpeg/rocjpeg.hpp"
#include "lib/rocprofiler-sdk/runtime_initialization.hpp"
#include <rocprofiler-sdk/callback_tracing.h>
@@ -43,6 +44,7 @@
#include <rocprofiler-sdk/marker/table_id.h>
#include <rocprofiler-sdk/rccl/table_id.h>
#include <rocprofiler-sdk/rocdecode/table_id.h>
#include <rocprofiler-sdk/rocjpeg/table_id.h>
#include <rocprofiler-sdk/rocprofiler.h>
#include <atomic>
@@ -91,6 +93,7 @@ ROCPROFILER_CALLBACK_TRACING_KIND_STRING(RCCL_API)
ROCPROFILER_CALLBACK_TRACING_KIND_STRING(OMPT)
ROCPROFILER_CALLBACK_TRACING_KIND_STRING(RUNTIME_INITIALIZATION)
ROCPROFILER_CALLBACK_TRACING_KIND_STRING(ROCDECODE_API)
ROCPROFILER_CALLBACK_TRACING_KIND_STRING(ROCJPEG_API)
template <size_t Idx, size_t... Tail>
std::pair<const char*, size_t>
@@ -276,7 +279,13 @@ rocprofiler_query_callback_tracing_kind_operation_name(rocprofiler_callback_trac
}
case ROCPROFILER_CALLBACK_TRACING_ROCDECODE_API:
{
val = rocprofiler::rocdecode::name_by_id<ROCPROFILER_ROCDECODE_TABLE_ID>(operation);
val =
rocprofiler::rocdecode::name_by_id<ROCPROFILER_ROCDECODE_TABLE_ID_CORE>(operation);
break;
}
case ROCPROFILER_CALLBACK_TRACING_ROCJPEG_API:
{
val = rocprofiler::rocjpeg::name_by_id<ROCPROFILER_ROCJPEG_TABLE_ID_CORE>(operation);
break;
}
};
@@ -410,7 +419,12 @@ rocprofiler_iterate_callback_tracing_kind_operations(
}
case ROCPROFILER_CALLBACK_TRACING_ROCDECODE_API:
{
ops = rocprofiler::rocdecode::get_ids<ROCPROFILER_ROCDECODE_TABLE_ID>();
ops = rocprofiler::rocdecode::get_ids<ROCPROFILER_ROCDECODE_TABLE_ID_CORE>();
break;
}
case ROCPROFILER_CALLBACK_TRACING_ROCJPEG_API:
{
ops = rocprofiler::rocjpeg::get_ids<ROCPROFILER_ROCJPEG_TABLE_ID_CORE>();
break;
}
};
@@ -554,8 +568,9 @@ rocprofiler_iterate_callback_tracing_kind_operation_args(
case ROCPROFILER_CALLBACK_TRACING_MEMORY_COPY:
case ROCPROFILER_CALLBACK_TRACING_MEMORY_ALLOCATION:
case ROCPROFILER_CALLBACK_TRACING_RCCL_API:
case ROCPROFILER_CALLBACK_TRACING_ROCDECODE_API:
case ROCPROFILER_CALLBACK_TRACING_RUNTIME_INITIALIZATION:
case ROCPROFILER_CALLBACK_TRACING_ROCDECODE_API:
case ROCPROFILER_CALLBACK_TRACING_ROCJPEG_API:
{
return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED;
}
@@ -35,6 +35,7 @@
#include <hip/amd_detail/hip_api_trace.hpp>
#include "lib/rocprofiler-sdk/rccl/rccl.hpp"
#include "lib/rocprofiler-sdk/rocdecode/rocdecode.hpp"
#include "lib/rocprofiler-sdk/rocjpeg/rocjpeg.hpp"
#include <cstdint>
#include <mutex>
@@ -59,7 +60,8 @@ constexpr auto intercept_library_seq = library_sequence_t<ROCPROFILER_HSA_TABLE,
ROCPROFILER_MARKER_CONTROL_TABLE,
ROCPROFILER_MARKER_NAME_TABLE,
ROCPROFILER_RCCL_TABLE,
ROCPROFILER_ROCDECODE_TABLE>{};
ROCPROFILER_ROCDECODE_TABLE,
ROCPROFILER_ROCJPEG_TABLE>{};
// check that intercept_library_seq is up to date
static_assert((1 << (intercept_library_seq.size() - 1)) == ROCPROFILER_TABLE_LAST,
@@ -199,6 +201,11 @@ template void notify_intercept_table_registration(rocprofiler_intercept_table_t,
uint64_t,
uint64_t,
std::tuple<RocDecodeDispatchTable*>);
template void notify_intercept_table_registration(rocprofiler_intercept_table_t,
uint64_t,
uint64_t,
std::tuple<RocJpegDispatchTable*>);
} // namespace intercept_table
} // namespace rocprofiler
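The `static_assert` on `intercept_library_seq` relies on each table enumerator being a single bit, so the highest bit implied by the sequence length must equal the `*_LAST` value. A minimal sketch of that check, assuming bit-flag enumerators; every `DEMO_*` name and the `library_sequence` template here are hypothetical and only imitate the shape of the real types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bit-flag enum; each library occupies one bit.
enum demo_table_t : unsigned
{
    DEMO_HSA_TABLE       = (1u << 0),
    DEMO_HIP_TABLE       = (1u << 1),
    DEMO_ROCDECODE_TABLE = (1u << 2),
    DEMO_ROCJPEG_TABLE   = (1u << 3),
    DEMO_TABLE_LAST      = DEMO_ROCJPEG_TABLE,
};

// Hypothetical compile-time list of the tables that are handled.
template <demo_table_t... Tables>
struct library_sequence
{
    static constexpr std::size_t size() { return sizeof...(Tables); }
};

constexpr auto demo_seq = library_sequence<DEMO_HSA_TABLE,
                                           DEMO_HIP_TABLE,
                                           DEMO_ROCDECODE_TABLE,
                                           DEMO_ROCJPEG_TABLE>{};

// With N single-bit entries the last one is 1 << (N - 1). Adding an enumerator
// without extending the sequence makes this fail at compile time.
static_assert((1u << (demo_seq.size() - 1)) == DEMO_TABLE_LAST,
              "demo_seq is out of date with demo_table_t");
```

If `DEMO_TABLE_LAST` moved to a new fifth entry while `demo_seq` still listed four, the build would stop at the assertion, which is why this change extends both sequences alongside the enum.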
@@ -123,7 +123,8 @@ constexpr auto creation_notifier_library_seq = library_sequence_t<ROCPROFILER_LI
ROCPROFILER_HIP_LIBRARY,
ROCPROFILER_MARKER_LIBRARY,
ROCPROFILER_RCCL_LIBRARY,
ROCPROFILER_ROCDECODE_LIBRARY>{};
ROCPROFILER_ROCDECODE_LIBRARY,
ROCPROFILER_ROCJPEG_LIBRARY>{};
// check that creation_notifier_library_seq is up to date
static_assert((1 << (creation_notifier_library_seq.size() - 1)) == ROCPROFILER_LIBRARY_LAST,
+24 -2
@@ -47,6 +47,7 @@
#include "lib/rocprofiler-sdk/pc_sampling/service.hpp"
#include "lib/rocprofiler-sdk/rccl/rccl.hpp"
#include "lib/rocprofiler-sdk/rocdecode/rocdecode.hpp"
#include "lib/rocprofiler-sdk/rocjpeg/rocjpeg.hpp"
#include "lib/rocprofiler-sdk/runtime_initialization.hpp"
#include <rocprofiler-sdk/context.h>
@@ -1004,10 +1005,31 @@ rocprofiler_set_api_table(const char* name,
rocprofiler::intercept_table::notify_intercept_table_registration(
ROCPROFILER_ROCDECODE_TABLE, lib_version, lib_instance, std::make_tuple(rocdecode_api));
}
else if(std::string_view{name} == "rocjpeg")
{
ROCP_ERROR_IF(num_tables > 1)
<< "rocprofiler expected rocJPEG library to pass 1 API table, not " << num_tables;
auto* rocjpeg_api = static_cast<RocJpegDispatchTable*>(tables[0]);
// any internal modifications to the rocjpegApiFuncTable need to be done before we make
// the copy or else those modifications will be lost when rocJPEG API tracing is enabled
// because the rocJPEG API tracing invokes the function pointers from the copy below
rocprofiler::rocjpeg::copy_table(rocjpeg_api, lib_instance);
// install rocprofiler API wrappers
rocprofiler::rocjpeg::update_table(rocjpeg_api);
// Tracing notifications the runtime has initialized
rocprofiler::runtime_init::initialize(
ROCPROFILER_RUNTIME_INITIALIZATION_ROCJPEG, lib_version, lib_instance);
// allow tools to install API wrappers
rocprofiler::intercept_table::notify_intercept_table_registration(
ROCPROFILER_ROCJPEG_TABLE, lib_version, lib_instance, std::make_tuple(rocjpeg_api));
}
else
{
ROCP_ERROR << "rocprofiler does not accept API tables from " << name;
return ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENT;
}
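The rocJPEG branch above follows a copy-then-wrap order: the original function pointers are saved first, then the live table is overwritten with wrappers that forward to the saved copy. A minimal sketch of that pattern, assuming a one-entry table; `demo_dispatch_table`, `real_decode`, `traced_decode`, and the helper functions are all hypothetical and do not reproduce the rocprofiler-sdk implementation.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical one-entry dispatch table standing in for RocJpegDispatchTable.
struct demo_dispatch_table
{
    int (*pfn_decode)(int) = nullptr;
};

// Hypothetical library implementation behind the table entry.
int real_decode(int v) { return v * 2; }

// Saved copy of the original pointers; wrappers forward through this copy.
demo_dispatch_table& saved_table()
{
    static demo_dispatch_table tbl{};
    return tbl;
}

struct trace_entry
{
    std::uint64_t start_ns;
    std::uint64_t end_ns;
};

std::vector<trace_entry>& trace_log()
{
    static std::vector<trace_entry> log{};
    return log;
}

std::uint64_t now_ns()
{
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// Wrapper: records timestamps around the call to the saved original.
int traced_decode(int v)
{
    auto start = now_ns();
    int  ret   = saved_table().pfn_decode(v);
    trace_log().push_back({start, now_ns()});
    return ret;
}

// Step 1: snapshot the live table before anything overwrites it.
void copy_table(const demo_dispatch_table* live) { saved_table() = *live; }

// Step 2: install the wrapper into the live table.
void update_table(demo_dispatch_table* live) { live->pfn_decode = &traced_decode; }
```

Reversing the two steps would make the wrapper call itself, which is the failure mode the comment in the diff warns about when it says modifications must happen before the copy.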
@@ -555,6 +555,6 @@ using rocdecode_op_args_cb_t = rocprofiler_callback_tracing_operation_args_cb_t;
template std::vector<uint32_t> get_ids<TABLE_IDX>(); \
template std::vector<const char*> get_names<TABLE_IDX>();
INSTANTIATE_ROCDECODE_TABLE_FUNC(rocdecode_api_func_table_t, ROCPROFILER_ROCDECODE_TABLE_ID)
INSTANTIATE_ROCDECODE_TABLE_FUNC(rocdecode_api_func_table_t, ROCPROFILER_ROCDECODE_TABLE_ID_CORE)
} // namespace rocdecode
} // namespace rocprofiler
@@ -42,7 +42,7 @@ struct rocdecode_domain_info<ROCPROFILER_ROCDECODE_TABLE_ID_LAST>
};
template <>
struct rocdecode_domain_info<ROCPROFILER_ROCDECODE_TABLE_ID>
struct rocdecode_domain_info<ROCPROFILER_ROCDECODE_TABLE_ID_CORE>
: rocdecode_domain_info<ROCPROFILER_ROCDECODE_TABLE_ID_LAST>
{
using enum_type = rocprofiler_marker_core_api_id_t;
@@ -61,26 +61,26 @@ struct rocdecode_domain_info<ROCPROFILER_ROCDECODE_TABLE_ID>
ROCPROFILER_LIB_ROCPROFILER_SDK_ROCDECODE_ROCDECODE_CPP_IMPL == 1
// clang-format off
ROCDECODE_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_ROCDECODE_TABLE_ID, rocdecode_api_func_table_t)
ROCDECODE_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, rocdecode_api_func_table_t)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecCreateVideoParser, rocDecCreateVideoParser, pfn_rocdec_create_video_parser, parser_handle, params)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecParseVideoData, rocDecParseVideoData, pfn_rocdec_parse_video_data, parser_handle, packet)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecDestroyVideoParser, rocDecDestroyVideoParser, pfn_rocdec_destroy_video_parser, parser_handle)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecCreateDecoder, rocDecCreateDecoder, pfn_rocdec_create_decoder, decoder_handle, decoder_create_info)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecDestroyDecoder, rocDecDestroyDecoder, pfn_rocdec_destroy_decoder, decoder_handle)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecGetDecoderCaps, rocDecGetDecoderCaps, pfn_rocdec_get_gecoder_caps, decode_caps)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecDecodeFrame, rocDecDecodeFrame, pfn_rocdec_decode_frame, decoder_handle, pic_params)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecGetDecodeStatus, rocDecGetDecodeStatus, pfn_rocdec_get_decode_status, decoder_handle, pic_idx, decode_status)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecReconfigureDecoder, rocDecReconfigureDecoder, pfn_rocdec_reconfigure_decoder, decoder_handle, reconfig_params)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecGetVideoFrame, rocDecGetVideoFrame, pfn_rocdec_get_video_frame, decoder_handle, pic_idx, dev_mem_ptr, horizontal_pitch, vid_postproc_params)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecGetErrorName, rocDecGetErrorName, pfn_rocdec_get_error_name, rocdec_status)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecCreateVideoParser, rocDecCreateVideoParser, pfn_rocdec_create_video_parser, parser_handle, params)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecParseVideoData, rocDecParseVideoData, pfn_rocdec_parse_video_data, parser_handle, packet)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecDestroyVideoParser, rocDecDestroyVideoParser, pfn_rocdec_destroy_video_parser, parser_handle)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecCreateDecoder, rocDecCreateDecoder, pfn_rocdec_create_decoder, decoder_handle, decoder_create_info)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecDestroyDecoder, rocDecDestroyDecoder, pfn_rocdec_destroy_decoder, decoder_handle)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecGetDecoderCaps, rocDecGetDecoderCaps, pfn_rocdec_get_gecoder_caps, decode_caps)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecDecodeFrame, rocDecDecodeFrame, pfn_rocdec_decode_frame, decoder_handle, pic_params)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecGetDecodeStatus, rocDecGetDecodeStatus, pfn_rocdec_get_decode_status, decoder_handle, pic_idx, decode_status)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecReconfigureDecoder, rocDecReconfigureDecoder, pfn_rocdec_reconfigure_decoder, decoder_handle, reconfig_params)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecGetVideoFrame, rocDecGetVideoFrame, pfn_rocdec_get_video_frame, decoder_handle, pic_idx, dev_mem_ptr, horizontal_pitch, vid_postproc_params)
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecGetErrorName, rocDecGetErrorName, pfn_rocdec_get_error_name, rocdec_status)
#if ROCDECODE_RUNTIME_API_TABLE_STEP_VERSION >= 1
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecCreateBitstreamReader, rocDecCreateBitstreamReader, pfn_rocdec_create_bitstream_reader, bs_reader_handle, input_file_path);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecGetBitstreamCodecType, rocDecGetBitstreamCodecType, pfn_rocdec_get_bitstream_codec_type, bs_reader_handle, codec_type);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecGetBitstreamBitDepth, rocDecGetBitstreamBitDepth, pfn_rocdec_get_bitstream_bit_depth, bs_reader_handle, bit_depth);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecGetBitstreamPicData, rocDecGetBitstreamPicData, pfn_rocdec_get_bitstream_pic_data, bs_reader_handle, pic_data, pic_size, pts);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID, ROCPROFILER_ROCDECODE_API_ID_rocDecDestroyBitstreamReader, rocDecDestroyBitstreamReader, pfn_rocdec_destroy_bitstream_reader, bs_reader_handle);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecCreateBitstreamReader, rocDecCreateBitstreamReader, pfn_rocdec_create_bitstream_reader, bs_reader_handle, input_file_path);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecGetBitstreamCodecType, rocDecGetBitstreamCodecType, pfn_rocdec_get_bitstream_codec_type, bs_reader_handle, codec_type);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecGetBitstreamBitDepth, rocDecGetBitstreamBitDepth, pfn_rocdec_get_bitstream_bit_depth, bs_reader_handle, bit_depth);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecGetBitstreamPicData, rocDecGetBitstreamPicData, pfn_rocdec_get_bitstream_pic_data, bs_reader_handle, pic_data, pic_size, pts);
ROCDECODE_API_INFO_DEFINITION_V(ROCPROFILER_ROCDECODE_TABLE_ID_CORE, ROCPROFILER_ROCDECODE_API_ID_rocDecDestroyBitstreamReader, rocDecDestroyBitstreamReader, pfn_rocdec_destroy_bitstream_reader, bs_reader_handle);
#endif
#else
# error \
@@ -22,17 +22,7 @@
#pragma once
#if !defined(ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE)
# if defined __has_include
# if __has_include(<rocdecode/amd_detail/api_trace.h>)
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 1
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
# else
# define ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE 0
# endif
#endif
#include <rocprofiler-sdk/rocdecode/details/rocdecode_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCDECODE > 0
# include <rocdecode/amd_detail/rocdecode_api_trace.h>
@@ -0,0 +1,5 @@
set(ROCPROFILER_LIB_ROCJPEG_SOURCES abi.cpp rocjpeg.cpp)
set(ROCPROFILER_LIB_ROCJPEG_HEADERS defines.hpp rocjpeg.hpp)
target_sources(rocprofiler-sdk-object-library PRIVATE ${ROCPROFILER_LIB_ROCJPEG_SOURCES}
${ROCPROFILER_LIB_ROCJPEG_HEADERS})
@@ -0,0 +1,51 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#include "lib/rocprofiler-sdk/rocjpeg/rocjpeg.hpp"
#include "lib/common/abi.hpp"
#include "lib/common/defines.hpp"
#include <rocprofiler-sdk/rocjpeg.h>
#include <rocprofiler-sdk/version.h>
namespace rocprofiler
{
namespace rocjpeg
{
static_assert(ROCJPEG_RUNTIME_API_TABLE_MAJOR_VERSION == 0,
"Major version updated for rocJPEG dispatch table");
ROCP_SDK_ENFORCE_ABI_VERSIONING(::RocJpegDispatchTable, 9)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_stream_create, 0)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_stream_parse, 1)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_stream_destroy, 2)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_create, 3)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_destroy, 4)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_get_image_info, 5)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_decode, 6)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_decode_batched, 7)
ROCP_SDK_ENFORCE_ABI(::RocJpegDispatchTable, pfn_rocjpeg_get_error_name, 8)
} // namespace rocjpeg
} // namespace rocprofiler
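Review note: the `ROCP_SDK_ENFORCE_ABI` macros come from `lib/common/abi.hpp`, which is not part of this diff. Judging by the `static_assert` in `defines.hpp`, each table entry is expected to sit at `sizeof(size_t) + index * sizeof(void*)`. The sketch below illustrates that layout rule on a hypothetical table. `DemoDispatchTable`, `expected_offset`, and `has_entry` are illustrative names, not rocJPEG or SDK symbols, and the sketch assumes function pointers are the same size as `void*`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a versioned dispatch table: a leading size field
// followed by one function pointer per API entry.
struct DemoDispatchTable
{
    std::size_t size;
    int (*pfn_create)(void**);
    int (*pfn_parse)(const unsigned char*, std::size_t, void*);
    int (*pfn_destroy)(void*);
};

// Expected byte offset of the entry at position `idx`, following the
// sizeof(size_t) + idx * sizeof(void*) formula (assumes no padding).
constexpr std::size_t
expected_offset(std::size_t idx)
{
    return sizeof(std::size_t) + (idx * sizeof(void*));
}

// Reordering or inserting a member breaks one of these at compile time.
static_assert(offsetof(DemoDispatchTable, pfn_create) == expected_offset(0), "ABI break");
static_assert(offsetof(DemoDispatchTable, pfn_parse) == expected_offset(1), "ABI break");
static_assert(offsetof(DemoDispatchTable, pfn_destroy) == expected_offset(2), "ABI break");

// Runtime guard: an older library may hand over a shorter table, so an entry
// is only usable when its offset lies inside the reported size.
inline bool
has_entry(const DemoDispatchTable& table, std::size_t idx)
{
    return expected_offset(idx) < table.size;
}
```

The `has_entry` guard corresponds to the `_info.offset() >= _orig->size` early return in `copy_table` and `update_table`.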
@@ -0,0 +1,214 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#pragma once
#include "lib/common/defines.hpp"
#define ROCJPEG_API_INFO_DEFINITION_0( \
ROCJPEG_TABLE, ROCJPEG_API_ID, ROCJPEG_FUNC, ROCJPEG_FUNC_PTR) \
namespace rocprofiler \
{ \
namespace rocjpeg \
{ \
template <> \
struct rocjpeg_api_info<ROCJPEG_TABLE, ROCJPEG_API_ID> : rocjpeg_domain_info<ROCJPEG_TABLE> \
{ \
static constexpr auto table_idx = ROCJPEG_TABLE; \
static constexpr auto operation_idx = ROCJPEG_API_ID; \
static constexpr auto name = #ROCJPEG_FUNC; \
\
using domain_type = rocjpeg_domain_info<table_idx>; \
using this_type = rocjpeg_api_info<table_idx, operation_idx>; \
using base_type = rocjpeg_api_impl<table_idx, operation_idx>; \
\
using domain_type::callback_domain_idx; \
using domain_type::buffered_domain_idx; \
using domain_type::args_type; \
using domain_type::retval_type; \
using domain_type::callback_data_type; \
\
static constexpr auto offset() \
{ \
return offsetof(rocjpeg_table_lookup<table_idx>::type, ROCJPEG_FUNC_PTR); \
} \
\
static_assert(offsetof(rocjpeg_table_lookup<table_idx>::type, ROCJPEG_FUNC_PTR) == \
(sizeof(size_t) + (operation_idx * sizeof(void*))), \
"ABI error for " #ROCJPEG_FUNC); \
\
static auto& get_table() { return rocjpeg_table_lookup<table_idx>{}(); } \
\
template <typename TableT> \
static auto& get_table(TableT& _v) \
{ \
return rocjpeg_table_lookup<table_idx>{}(_v); \
} \
\
template <typename TableT> \
static auto& get_table_func(TableT& _table) \
{ \
if constexpr(std::is_pointer<TableT>::value) \
{ \
assert(_table != nullptr && "nullptr to rocJPEG table for " #ROCJPEG_FUNC \
" function"); \
return _table->ROCJPEG_FUNC_PTR; \
} \
else \
{ \
return _table.ROCJPEG_FUNC_PTR; \
} \
} \
\
static auto& get_table_func() { return get_table_func(get_table()); } \
\
template <typename DataT> \
static auto& get_api_data_args(DataT& _data) \
{ \
return _data.ROCJPEG_FUNC; \
} \
\
template <typename RetT, typename... Args> \
static auto get_functor(RetT (*)(Args...)) \
{ \
return &base_type::functor<RetT, Args...>; \
} \
\
static std::vector<void*> as_arg_addr(callback_data_type) { return std::vector<void*>{}; } \
\
static std::vector<common::stringified_argument> as_arg_list(callback_data_type, int32_t) \
{ \
return {}; \
} \
}; \
} \
}
#define ROCJPEG_API_INFO_DEFINITION_V( \
ROCJPEG_TABLE, ROCJPEG_API_ID, ROCJPEG_FUNC, ROCJPEG_FUNC_PTR, ...) \
namespace rocprofiler \
{ \
namespace rocjpeg \
{ \
template <> \
struct rocjpeg_api_info<ROCJPEG_TABLE, ROCJPEG_API_ID> : rocjpeg_domain_info<ROCJPEG_TABLE> \
{ \
static constexpr auto table_idx = ROCJPEG_TABLE; \
static constexpr auto operation_idx = ROCJPEG_API_ID; \
static constexpr auto name = #ROCJPEG_FUNC; \
\
using domain_type = rocjpeg_domain_info<table_idx>; \
using this_type = rocjpeg_api_info<table_idx, operation_idx>; \
using base_type = rocjpeg_api_impl<table_idx, operation_idx>; \
\
static constexpr auto callback_domain_idx = domain_type::callback_domain_idx; \
static constexpr auto buffered_domain_idx = domain_type::buffered_domain_idx; \
\
using domain_type::args_type; \
using domain_type::retval_type; \
using domain_type::callback_data_type; \
\
static constexpr auto offset() \
{ \
return offsetof(rocjpeg_table_lookup<table_idx>::type, ROCJPEG_FUNC_PTR); \
} \
\
static_assert(offsetof(rocjpeg_table_lookup<table_idx>::type, ROCJPEG_FUNC_PTR) == \
(sizeof(size_t) + (operation_idx * sizeof(void*))), \
"ABI error for " #ROCJPEG_FUNC); \
\
static auto& get_table() { return rocjpeg_table_lookup<table_idx>{}(); } \
\
template <typename TableT> \
static auto& get_table(TableT& _v) \
{ \
return rocjpeg_table_lookup<table_idx>{}(_v); \
} \
\
template <typename TableT> \
static auto& get_table_func(TableT& _table) \
{ \
if constexpr(std::is_pointer<TableT>::value) \
{ \
assert(_table != nullptr && "nullptr to rocJPEG table for " #ROCJPEG_FUNC \
" function"); \
return _table->ROCJPEG_FUNC_PTR; \
} \
else \
{ \
return _table.ROCJPEG_FUNC_PTR; \
} \
} \
\
static auto& get_table_func() { return get_table_func(get_table()); } \
\
template <typename DataT> \
static auto& get_api_data_args(DataT& _data) \
{ \
return _data.ROCJPEG_FUNC; \
} \
\
template <typename RetT, typename... Args> \
static auto get_functor(RetT (*)(Args...)) \
{ \
return &base_type::functor<RetT, Args...>; \
} \
\
static std::vector<void*> as_arg_addr(callback_data_type trace_data) \
{ \
return std::vector<void*>{ \
GET_ADDR_MEMBER_FIELDS(get_api_data_args(trace_data.args), __VA_ARGS__)}; \
} \
}; \
} \
}
#define ROCJPEG_API_TABLE_LOOKUP_DEFINITION(TABLE_ID, TYPE) \
namespace rocprofiler \
{ \
namespace rocjpeg \
{ \
namespace \
{ \
template <> \
auto* get_table<TABLE_ID>() \
{ \
return get_table_impl<TYPE>(); \
} \
} \
\
template <> \
struct rocjpeg_table_lookup<TABLE_ID> \
{ \
using type = TYPE; \
auto& operator()(type& _v) const { return _v; } \
auto& operator()(type* _v) const { return *_v; } \
auto& operator()() const { return (*this)(get_table<TABLE_ID>()); } \
}; \
\
template <> \
struct rocjpeg_table_id_lookup<TYPE> \
{ \
static constexpr auto value = TABLE_ID; \
}; \
} \
}
@@ -0,0 +1,560 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#include "lib/rocprofiler-sdk/rocjpeg/rocjpeg.hpp"
#include "lib/common/defines.hpp"
#include "lib/common/static_object.hpp"
#include "lib/common/utility.hpp"
#include "lib/rocprofiler-sdk/buffer.hpp"
#include "lib/rocprofiler-sdk/context/context.hpp"
#include "lib/rocprofiler-sdk/hip/utils.hpp"
#include "lib/rocprofiler-sdk/registration.hpp"
#include "lib/rocprofiler-sdk/tracing/tracing.hpp"
#include <rocprofiler-sdk/buffer.h>
#include <rocprofiler-sdk/callback_tracing.h>
#include <rocprofiler-sdk/fwd.h>
#include <rocprofiler-sdk/rocjpeg/table_id.h>
#include <hip/driver_types.h>
#include <hip/hip_runtime_api.h>
// must be included after runtime api
#include <hip/hip_deprecated.h>
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <type_traits>
#include <utility>
namespace rocprofiler
{
namespace rocjpeg
{
namespace
{
struct null_type
{};
template <typename Tp>
auto
get_default_retval()
{
if constexpr(std::is_pointer<Tp>::value)
{
Tp v = nullptr;
return v;
}
else if constexpr(std::is_same<Tp, RocJpegStatus>::value)
return ROCJPEG_STATUS_RUNTIME_ERROR;
else if constexpr(std::is_same<Tp, const char*>::value)
return "UnknownString";
else
static_assert(std::is_empty<Tp>::value, "Error! unsupported return type");
}
template <typename DataT, typename Tp>
void
set_data_retval(DataT& _data, Tp _val)
{
if constexpr(std::is_same<Tp, RocJpegStatus>::value)
{
_data.rocJpegStatus_retval = _val;
}
else if constexpr(std::is_same<Tp, const char*>::value)
{
_data.const_charp_retval = _val;
}
else
{
static_assert(std::is_empty<Tp>::value, "Error! unsupported return type");
}
}
template <typename Tp>
Tp*
get_table_impl()
{
static auto*& _v = common::static_object<Tp>::construct(common::init_public_api_struct(Tp{}));
return _v;
}
template <size_t TableIdx>
auto*
get_table();
} // namespace
template <size_t TableIdx, size_t OpIdx>
template <typename DataArgsT, typename... Args>
auto
rocjpeg_api_impl<TableIdx, OpIdx>::set_data_args(DataArgsT& _data_args, Args... args)
{
if constexpr(sizeof...(Args) == 0)
_data_args.no_args.empty = '\0';
else
_data_args = DataArgsT{args...};
}
template <size_t TableIdx, size_t OpIdx>
template <typename FuncT, typename... Args>
auto
rocjpeg_api_impl<TableIdx, OpIdx>::exec(FuncT&& _func, Args&&... args)
{
using return_type = std::decay_t<std::invoke_result_t<FuncT, Args...>>;
if(_func)
{
if constexpr(std::is_void<return_type>::value)
{
_func(std::forward<Args>(args)...);
return null_type{};
}
else
{
return _func(std::forward<Args>(args)...);
}
}
using info_type = rocjpeg_api_info<TableIdx, OpIdx>;
ROCP_ERROR << "nullptr to next rocjpeg function for " << info_type::name << " ("
<< info_type::operation_idx << ")";
return get_default_retval<return_type>();
}
template <size_t TableIdx, size_t OpIdx>
template <typename RetT, typename... Args>
RetT
rocjpeg_api_impl<TableIdx, OpIdx>::functor(Args... args)
{
using info_type = rocjpeg_api_info<TableIdx, OpIdx>;
using callback_api_data_t = typename rocjpeg_domain_info<TableIdx>::callback_data_type;
using buffered_api_data_t = typename rocjpeg_domain_info<TableIdx>::buffer_data_type;
constexpr auto external_corr_id_domain_idx =
rocjpeg_domain_info<TableIdx>::external_correlation_id_domain_idx;
if(registration::get_fini_status() != 0)
{
[[maybe_unused]] auto _ret = exec(info_type::get_table_func(), std::forward<Args>(args)...);
if constexpr(!std::is_void<RetT>::value)
return _ret;
else
return;
}
constexpr auto ref_count = 2;
auto thr_id = common::get_tid();
auto callback_contexts = tracing::callback_context_data_vec_t{};
auto buffered_contexts = tracing::buffered_context_data_vec_t{};
auto external_corr_ids = tracing::external_correlation_id_map_t{};
tracing::populate_contexts(info_type::callback_domain_idx,
info_type::buffered_domain_idx,
info_type::operation_idx,
callback_contexts,
buffered_contexts,
external_corr_ids);
if(callback_contexts.empty() && buffered_contexts.empty())
{
[[maybe_unused]] auto _ret = exec(info_type::get_table_func(), std::forward<Args>(args)...);
if constexpr(!std::is_void<RetT>::value)
return _ret;
else
return;
}
auto buffer_record = common::init_public_api_struct(buffered_api_data_t{});
auto tracer_data = common::init_public_api_struct(callback_api_data_t{});
auto* corr_id = tracing::correlation_service::construct(ref_count);
auto internal_corr_id = corr_id->internal;
tracing::populate_external_correlation_ids(external_corr_ids,
thr_id,
external_corr_id_domain_idx,
info_type::operation_idx,
internal_corr_id);
// invoke the callbacks
if(!callback_contexts.empty())
{
set_data_args(info_type::get_api_data_args(tracer_data.args), std::forward<Args>(args)...);
tracing::execute_phase_enter_callbacks(callback_contexts,
thr_id,
internal_corr_id,
external_corr_ids,
info_type::callback_domain_idx,
info_type::operation_idx,
tracer_data);
}
// enter callback may update the external correlation id field
tracing::update_external_correlation_ids(
external_corr_ids, thr_id, external_corr_id_domain_idx);
// record the start timestamp as close to the function call as possible
if(!buffered_contexts.empty())
{
buffer_record.start_timestamp = common::timestamp_ns();
}
// decrement the reference count before invoking
corr_id->sub_ref_count();
auto _ret = exec(info_type::get_table_func(), std::forward<Args>(args)...);
// record the end timestamp as close to the function call as possible
if(!buffered_contexts.empty())
{
buffer_record.end_timestamp = common::timestamp_ns();
}
if(!callback_contexts.empty())
{
set_data_retval(tracer_data.retval, _ret);
tracing::execute_phase_exit_callbacks(callback_contexts,
external_corr_ids,
info_type::callback_domain_idx,
info_type::operation_idx,
tracer_data);
}
if(!buffered_contexts.empty())
{
tracing::execute_buffer_record_emplace(buffered_contexts,
thr_id,
internal_corr_id,
external_corr_ids,
info_type::buffered_domain_idx,
info_type::operation_idx,
buffer_record);
}
// decrement the reference count after usage in the callback/buffers
corr_id->sub_ref_count();
context::pop_latest_correlation_id(corr_id);
if constexpr(!std::is_void<RetT>::value) return _ret;
}
} // namespace rocjpeg
} // namespace rocprofiler
#define ROCPROFILER_LIB_ROCPROFILER_SDK_ROCJPEG_ROCJPEG_CPP_IMPL 1
// template specializations
#include "rocjpeg.def.cpp"
namespace rocprofiler
{
namespace rocjpeg
{
namespace
{
template <size_t TableIdx, size_t OpIdx, size_t... OpIdxTail>
const char*
name_by_id(const uint32_t id, std::index_sequence<OpIdx, OpIdxTail...>)
{
if(OpIdx == id) return rocjpeg_api_info<TableIdx, OpIdx>::name;
if constexpr(sizeof...(OpIdxTail) > 0)
return name_by_id<TableIdx>(id, std::index_sequence<OpIdxTail...>{});
else
return nullptr;
}
template <size_t TableIdx, size_t OpIdx, size_t... OpIdxTail>
uint32_t
id_by_name(const char* name, std::index_sequence<OpIdx, OpIdxTail...>)
{
if(std::string_view{rocjpeg_api_info<TableIdx, OpIdx>::name} == std::string_view{name})
return rocjpeg_api_info<TableIdx, OpIdx>::operation_idx;
if constexpr(sizeof...(OpIdxTail) > 0)
return id_by_name<TableIdx>(name, std::index_sequence<OpIdxTail...>{});
else
return rocjpeg_domain_info<TableIdx>::none;
}
template <size_t TableIdx, size_t OpIdx, size_t... OpIdxTail>
void
get_ids(std::vector<uint32_t>& _id_list, std::index_sequence<OpIdx, OpIdxTail...>)
{
auto _idx = rocjpeg_api_info<TableIdx, OpIdx>::operation_idx;
if(_idx < rocjpeg_domain_info<TableIdx>::last) _id_list.emplace_back(_idx);
if constexpr(sizeof...(OpIdxTail) > 0)
get_ids<TableIdx>(_id_list, std::index_sequence<OpIdxTail...>{});
}
template <size_t TableIdx, size_t OpIdx, size_t... OpIdxTail>
void
get_names(std::vector<const char*>& _name_list, std::index_sequence<OpIdx, OpIdxTail...>)
{
auto&& _name = rocjpeg_api_info<TableIdx, OpIdx>::name;
if(_name != nullptr && strnlen(_name, 1) > 0) _name_list.emplace_back(_name);
if constexpr(sizeof...(OpIdxTail) > 0)
get_names<TableIdx>(_name_list, std::index_sequence<OpIdxTail...>{});
}
template <size_t TableIdx, typename DataT, size_t OpIdx, size_t... OpIdxTail>
void
iterate_args(const uint32_t id,
const DataT& data,
rocprofiler_callback_tracing_operation_args_cb_t func,
int32_t max_deref,
void* user_data,
std::index_sequence<OpIdx, OpIdxTail...>)
{
if(OpIdx == id)
{
using info_type = rocjpeg_api_info<TableIdx, OpIdx>;
auto&& arg_list = info_type::as_arg_list(data, max_deref);
auto&& arg_addr = info_type::as_arg_addr(data);
for(size_t i = 0; i < std::min(arg_list.size(), arg_addr.size()); ++i)
{
auto ret = func(info_type::callback_domain_idx, // kind
id, // operation
i, // arg_number
arg_addr.at(i), // arg_value_addr
arg_list.at(i).indirection_level, // indirection
arg_list.at(i).type, // arg_type
arg_list.at(i).name, // arg_name
arg_list.at(i).value.c_str(), // arg_value_str
arg_list.at(i).dereference_count, // num deref in str
user_data);
if(ret != 0) break;
}
return;
}
if constexpr(sizeof...(OpIdxTail) > 0)
iterate_args<TableIdx>(
id, data, func, max_deref, user_data, std::index_sequence<OpIdxTail...>{});
}
bool
should_wrap_functor(rocprofiler_callback_tracing_kind_t _callback_domain,
rocprofiler_buffer_tracing_kind_t _buffered_domain,
int _operation)
{
// we loop over all the *registered* contexts and see if any of them, at any point in time,
// might require callback or buffered API tracing
for(const auto& itr : context::get_registered_contexts())
{
if(!itr) continue;
// if there is a callback tracer enabled for the given domain and op, we need to wrap
if(itr->callback_tracer && itr->callback_tracer->domains(_callback_domain) &&
itr->callback_tracer->domains(_callback_domain, _operation))
return true;
// if there is a buffered tracer enabled for the given domain and op, we need to wrap
if(itr->buffered_tracer && itr->buffered_tracer->domains(_buffered_domain) &&
itr->buffered_tracer->domains(_buffered_domain, _operation))
return true;
}
return false;
}
template <size_t TableIdx, typename Tp, size_t OpIdx>
void
copy_table(Tp* _orig, uint64_t _tbl_instance, std::integral_constant<size_t, OpIdx>)
{
using table_type = typename rocjpeg_table_lookup<TableIdx>::type;
if constexpr(std::is_same<table_type, Tp>::value)
{
auto _info = rocjpeg_api_info<TableIdx, OpIdx>{};
// make sure we don't access a field that doesn't exist in input table
if(_info.offset() >= _orig->size) return;
// 1. get the sub-table containing the function pointer in original table
// 2. get reference to function pointer in sub-table in original table
auto& _orig_table = _info.get_table(_orig);
auto& _orig_func = _info.get_table_func(_orig_table);
// 3. get the sub-table containing the function pointer in saved table
// 4. get reference to function pointer in sub-table in saved table
// 5. save the original function in the saved table
auto& _copy_table = _info.get_table(*get_table<TableIdx>());
auto& _copy_func = _info.get_table_func(_copy_table);
ROCP_FATAL_IF(_copy_func && _tbl_instance == 0)
<< _info.name << " has non-null function pointer " << _copy_func
<< " despite this being the first instance of the library being copied";
if(!_copy_func)
{
ROCP_TRACE << "copying table entry for " << _info.name;
_copy_func = _orig_func;
}
else
{
ROCP_TRACE << "skipping copying table entry for " << _info.name
<< " from table instance " << _tbl_instance;
}
}
}
template <size_t TableIdx, typename Tp, size_t OpIdx>
void
update_table(Tp* _orig, std::integral_constant<size_t, OpIdx>)
{
using table_type = typename rocjpeg_table_lookup<TableIdx>::type;
if constexpr(std::is_same<table_type, Tp>::value)
{
auto _info = rocjpeg_api_info<TableIdx, OpIdx>{};
// make sure we don't access a field that doesn't exist in input table
if(_info.offset() >= _orig->size) return;
// check to see if there are any contexts which enable this operation in the rocJPEG API
// domain
if(!should_wrap_functor(
_info.callback_domain_idx, _info.buffered_domain_idx, _info.operation_idx))
return;
ROCP_TRACE << "updating table entry for " << _info.name;
// 1. get the sub-table containing the function pointer in original table
// 2. get reference to function pointer in sub-table in original table
// 3. update function pointer with wrapper
auto& _table = _info.get_table(_orig);
auto& _func = _info.get_table_func(_table);
_func = _info.get_functor(_func);
}
}
template <size_t TableIdx, typename Tp, size_t OpIdx, size_t... OpIdxTail>
void
copy_table(Tp* _orig, uint64_t _tbl_instance, std::index_sequence<OpIdx, OpIdxTail...>)
{
copy_table<TableIdx>(_orig, _tbl_instance, std::integral_constant<size_t, OpIdx>{});
if constexpr(sizeof...(OpIdxTail) > 0)
copy_table<TableIdx>(_orig, _tbl_instance, std::index_sequence<OpIdxTail...>{});
}
template <size_t TableIdx, typename Tp, size_t OpIdx, size_t... OpIdxTail>
void
update_table(Tp* _orig, std::index_sequence<OpIdx, OpIdxTail...>)
{
update_table<TableIdx>(_orig, std::integral_constant<size_t, OpIdx>{});
if constexpr(sizeof...(OpIdxTail) > 0)
update_table<TableIdx>(_orig, std::index_sequence<OpIdxTail...>{});
}
} // namespace
// check out the assembly here... this compiles to a switch statement
template <size_t TableIdx>
const char*
name_by_id(uint32_t id)
{
return name_by_id<TableIdx>(id,
std::make_index_sequence<rocjpeg_domain_info<TableIdx>::last>{});
}
template <size_t TableIdx>
uint32_t
id_by_name(const char* name)
{
return id_by_name<TableIdx>(name,
std::make_index_sequence<rocjpeg_domain_info<TableIdx>::last>{});
}
template <size_t TableIdx>
std::vector<uint32_t>
get_ids()
{
constexpr auto last_api_id = rocjpeg_domain_info<TableIdx>::last;
auto _data = std::vector<uint32_t>{};
_data.reserve(last_api_id);
get_ids<TableIdx>(_data, std::make_index_sequence<last_api_id>{});
return _data;
}
template <size_t TableIdx>
std::vector<const char*>
get_names()
{
constexpr auto last_api_id = rocjpeg_domain_info<TableIdx>::last;
auto _data = std::vector<const char*>{};
_data.reserve(last_api_id);
get_names<TableIdx>(_data, std::make_index_sequence<last_api_id>{});
return _data;
}
template <size_t TableIdx>
void
iterate_args(uint32_t id,
const rocprofiler_callback_tracing_rocjpeg_api_data_t& data,
rocprofiler_callback_tracing_operation_args_cb_t callback,
int32_t max_deref,
void* user_data)
{
if(callback)
iterate_args<TableIdx>(id,
data,
callback,
max_deref,
user_data,
std::make_index_sequence<rocjpeg_domain_info<TableIdx>::last>{});
}
template <typename TableT>
void
copy_table(TableT* _orig, uint64_t _tbl_instance)
{
constexpr auto TableIdx = rocjpeg_table_id_lookup<TableT>::value;
if(_orig)
copy_table<TableIdx>(
_orig, _tbl_instance, std::make_index_sequence<rocjpeg_domain_info<TableIdx>::last>{});
}
template <typename TableT>
void
update_table(TableT* _orig)
{
constexpr auto TableIdx = rocjpeg_table_id_lookup<TableT>::value;
if(_orig)
update_table<TableIdx>(_orig,
std::make_index_sequence<rocjpeg_domain_info<TableIdx>::last>{});
}
using rocjpeg_api_data_t = rocprofiler_callback_tracing_rocjpeg_api_data_t;
using rocjpeg_op_args_cb_t = rocprofiler_callback_tracing_operation_args_cb_t;
#define INSTANTIATE_ROCJPEG_TABLE_FUNC(TABLE_TYPE, TABLE_IDX) \
template void copy_table<TABLE_TYPE>(TABLE_TYPE * _tbl, uint64_t _instv); \
template void update_table<TABLE_TYPE>(TABLE_TYPE * _tbl); \
template const char* name_by_id<TABLE_IDX>(uint32_t); \
template uint32_t id_by_name<TABLE_IDX>(const char*); \
template std::vector<uint32_t> get_ids<TABLE_IDX>(); \
template std::vector<const char*> get_names<TABLE_IDX>();
INSTANTIATE_ROCJPEG_TABLE_FUNC(rocjpeg_api_func_table_t, ROCPROFILER_ROCJPEG_TABLE_ID_CORE)
} // namespace rocjpeg
} // namespace rocprofiler
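Review note: `copy_table` and `update_table` implement a save-then-wrap scheme. The original function pointer is stored in a private table and the caller's table entry is replaced with a tracing wrapper that forwards to the saved pointer. The sketch below is a heavily simplified illustration of that idea, not the SDK implementation: it omits contexts, correlation IDs, and callbacks, and every name in `demo_wrap` is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace demo_wrap
{
// Hypothetical one-entry dispatch table.
struct DemoTable
{
    std::size_t size;
    int (*pfn_decode)(int);
};

// Hypothetical buffered record: timestamps around the call plus its result.
struct DemoRecord
{
    std::uint64_t start_ns;
    std::uint64_t end_ns;
    int           retval;
};

inline DemoTable& saved_table()
{
    static DemoTable t{sizeof(DemoTable), nullptr};
    return t;
}

inline std::vector<DemoRecord>& records()
{
    static std::vector<DemoRecord> r;
    return r;
}

inline std::uint64_t now_ns()
{
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// Stand-in for the library's real implementation.
inline int real_decode(int v) { return v * 2; }

// Wrapper installed into the caller's table; forwards to the saved pointer.
inline int traced_decode(int v)
{
    auto  rec  = DemoRecord{};
    auto* next = saved_table().pfn_decode;
    rec.start_ns = now_ns();
    rec.retval   = next ? next(v) : -1;  // -1 plays the role of a default error value
    rec.end_ns   = now_ns();
    records().push_back(rec);
    return rec.retval;
}

// Step 1: remember the original entry (only the first table instance wins).
inline void copy_table(DemoTable* orig)
{
    if(orig && !saved_table().pfn_decode) saved_table().pfn_decode = orig->pfn_decode;
}

// Step 2: overwrite the caller's entry with the wrapper.
inline void update_table(DemoTable* orig)
{
    if(orig) orig->pfn_decode = &traced_decode;
}
}  // namespace demo_wrap
```

Copying before updating matters: if the order were reversed, the saved table would hold the wrapper itself and the call would recurse.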
@@ -0,0 +1,81 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#include "lib/rocprofiler-sdk/rocjpeg/defines.hpp"
#include "lib/rocprofiler-sdk/rocjpeg/rocjpeg.hpp"
#include <rocprofiler-sdk/external_correlation.h>
#include <rocprofiler-sdk/fwd.h>
#include <rocprofiler-sdk/rocjpeg.h>
#include <rocprofiler-sdk/rocjpeg/table_id.h>
namespace rocprofiler
{
namespace rocjpeg
{
template <>
struct rocjpeg_domain_info<ROCPROFILER_ROCJPEG_TABLE_ID_LAST>
{
using args_type = rocprofiler_rocjpeg_api_args_t;
using retval_type = rocprofiler_rocjpeg_api_retval_t;
using callback_data_type = rocprofiler_callback_tracing_rocjpeg_api_data_t;
using buffer_data_type = rocprofiler_buffer_tracing_rocjpeg_api_record_t;
};
template <>
struct rocjpeg_domain_info<ROCPROFILER_ROCJPEG_TABLE_ID_CORE>
: rocjpeg_domain_info<ROCPROFILER_ROCJPEG_TABLE_ID_LAST>
{
using enum_type = rocprofiler_rocjpeg_api_id_t;
static constexpr auto callback_domain_idx = ROCPROFILER_CALLBACK_TRACING_ROCJPEG_API;
static constexpr auto buffered_domain_idx = ROCPROFILER_BUFFER_TRACING_ROCJPEG_API;
static constexpr auto none = ROCPROFILER_ROCJPEG_API_ID_NONE;
static constexpr auto last = ROCPROFILER_ROCJPEG_API_ID_LAST;
static constexpr auto external_correlation_id_domain_idx =
ROCPROFILER_EXTERNAL_CORRELATION_REQUEST_ROCJPEG_API;
};
} // namespace rocjpeg
} // namespace rocprofiler
#if defined(ROCPROFILER_LIB_ROCPROFILER_SDK_ROCJPEG_ROCJPEG_CPP_IMPL) && \
ROCPROFILER_LIB_ROCPROFILER_SDK_ROCJPEG_ROCJPEG_CPP_IMPL == 1
// clang-format off
ROCJPEG_API_TABLE_LOOKUP_DEFINITION(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, rocjpeg_api_func_table_t)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegStreamCreate, rocJpegStreamCreate, pfn_rocjpeg_stream_create, jpeg_stream_handle)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegStreamParse, rocJpegStreamParse, pfn_rocjpeg_stream_parse, data, length, jpeg_stream_handle)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegStreamDestroy, rocJpegStreamDestroy, pfn_rocjpeg_stream_destroy, jpeg_stream_handle)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegCreate, rocJpegCreate, pfn_rocjpeg_create, backend, device_id, handle)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegDestroy, rocJpegDestroy, pfn_rocjpeg_destroy, handle)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegGetImageInfo, rocJpegGetImageInfo, pfn_rocjpeg_get_image_info, handle, jpeg_stream_handle, num_components, subsampling, widths, heights)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegDecode, rocJpegDecode, pfn_rocjpeg_decode, handle, jpeg_stream_handle, decode_params, destination)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegDecodeBatched, rocJpegDecodeBatched, pfn_rocjpeg_decode_batched, handle, jpeg_stream_handles, batch_size, decode_params, destinations)
ROCJPEG_API_INFO_DEFINITION_V(ROCPROFILER_ROCJPEG_TABLE_ID_CORE, ROCPROFILER_ROCJPEG_API_ID_rocJpegGetErrorName, rocJpegGetErrorName, pfn_rocjpeg_get_error_name, rocjpeg_status)
#else
# error \
"Do not compile this file directly. It is included by lib/rocprofiler-sdk/rocjpeg/rocjpeg.cpp"
#endif
@@ -0,0 +1,114 @@
// MIT License
//
// Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#pragma once
#include <rocprofiler-sdk/rocjpeg/details/rocjpeg_headers.h>
#if ROCPROFILER_SDK_USE_SYSTEM_ROCJPEG > 0
# include <rocjpeg/amd_detail/rocjpeg_api_trace.h>
# include <rocjpeg/rocjpeg.h>
#else
# include <rocprofiler-sdk/rocjpeg/details/rocjpeg.h>
# include <rocprofiler-sdk/rocjpeg/details/rocjpeg_api_trace.h>
#endif
#include <rocprofiler-sdk/rocprofiler.h>
#include <cstdint>
#include <vector>
namespace rocprofiler
{
namespace rocjpeg
{
using rocjpeg_api_func_table_t = ::RocJpegDispatchTable;
struct RocJPEGAPITable
{
rocjpeg_api_func_table_t* rocjpeg_api_table = nullptr;
};
using rocjpeg_api_table_t = RocJPEGAPITable;
rocjpeg_api_table_t&
get_table();
template <size_t OpIdx>
struct rocjpeg_table_lookup;
template <typename Tp>
struct rocjpeg_table_id_lookup;
template <size_t TableIdx>
struct rocjpeg_domain_info;
template <size_t TableIdx, size_t OpIdx>
struct rocjpeg_api_info;
template <size_t TableIdx, size_t OpIdx>
struct rocjpeg_api_impl : rocjpeg_domain_info<TableIdx>
{
template <typename DataArgsT, typename... Args>
static auto set_data_args(DataArgsT&, Args... args);
template <typename FuncT, typename... Args>
static auto exec(FuncT&&, Args&&... args);
template <typename RetT, typename... Args>
static RetT functor(Args... args);
};
template <size_t TableIdx>
const char*
name_by_id(uint32_t id);
template <size_t TableIdx>
uint32_t
id_by_name(const char* name);
template <size_t TableIdx>
std::vector<const char*>
get_names();
template <size_t TableIdx>
std::vector<uint32_t>
get_ids();
template <size_t TableIdx>
void
iterate_args(uint32_t id,
const rocprofiler_callback_tracing_rocjpeg_api_data_t& data,
rocprofiler_callback_tracing_operation_args_cb_t callback,
int32_t max_deref,
void* user_data);
template <typename TableT>
void
copy_table(TableT* _orig, uint64_t _tbl_instance);
template <typename TableT>
void
update_table(TableT* _orig);
} // namespace rocjpeg
} // namespace rocprofiler
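Review note: `name_by_id` and its siblings declared in this header are implemented in `rocjpeg.cpp` by recursing over a `std::index_sequence` of operation IDs. A minimal sketch of that lookup pattern follows. The `api_info` specializations and the `demo*` names are made up for illustration and do not correspond to rocJPEG entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <utility>

namespace demo_lookup
{
// Hypothetical per-operation metadata, one specialization per ID.
template <std::size_t OpIdx>
struct api_info;

template <>
struct api_info<0>
{
    static constexpr auto name = "demoCreate";
};

template <>
struct api_info<1>
{
    static constexpr auto name = "demoDecode";
};

template <>
struct api_info<2>
{
    static constexpr auto name = "demoDestroy";
};

constexpr std::size_t last = 3;  // one past the highest valid ID

// Compare against the head of the sequence, then recurse on the tail.
template <std::size_t OpIdx, std::size_t... Tail>
const char*
name_by_id(std::uint32_t id, std::index_sequence<OpIdx, Tail...>)
{
    if(OpIdx == id) return api_info<OpIdx>::name;
    if constexpr(sizeof...(Tail) > 0)
        return name_by_id(id, std::index_sequence<Tail...>{});
    else
        return nullptr;  // no operation with this ID
}

// Public entry point: expands to the sequence 0, 1, ..., last - 1.
inline const char*
name_by_id(std::uint32_t id)
{
    return name_by_id(id, std::make_index_sequence<last>{});
}
}  // namespace demo_lookup
```

Because every comparison is against a compile-time constant, an optimizing compiler is free to collapse the chain into a jump table, which is what the "compiles to a switch statement" comment in `rocjpeg.cpp` alludes to.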
@@ -57,7 +57,8 @@ SPECIALIZE_RUNTIME_INIT_INFO(HSA, "HSA runtime")
SPECIALIZE_RUNTIME_INIT_INFO(HIP, "HIP runtime")
SPECIALIZE_RUNTIME_INIT_INFO(MARKER, "Marker (ROCTx) runtime")
SPECIALIZE_RUNTIME_INIT_INFO(RCCL, "RCCL runtime")
-SPECIALIZE_RUNTIME_INIT_INFO(ROCDECODE, "ROCDecode runtime")
+SPECIALIZE_RUNTIME_INIT_INFO(ROCDECODE, "rocDecode runtime")
+SPECIALIZE_RUNTIME_INIT_INFO(ROCJPEG, "rocJPEG runtime")
#undef SPECIALIZE_RUNTIME_INIT_INFO
+2 -3
@@ -68,10 +68,9 @@ add_subdirectory(thread-trace)
add_subdirectory(pc_sampling)
add_subdirectory(hip-graph-tracing)
add_subdirectory(counter-collection)
+add_subdirectory(rocdecode)
+add_subdirectory(rocjpeg)
add_subdirectory(conversion-script)
-if(ROCPROFILER_BUILD_ROCDECODE_TESTS)
-add_subdirectory(rocdecode)
-endif()
if(ROCPROFILER_BUILD_OPENMP_TESTS)
add_subdirectory(openmp-tools)
+8 -1
@@ -9,6 +9,10 @@ set(CMAKE_BUILD_RPATH
"\$ORIGIN:\$ORIGIN/../lib:$<TARGET_FILE_DIR:rocprofiler-sdk-roctx::rocprofiler-sdk-roctx-shared-library>"
)
+# Find rocDecode and rocJPEG packages for testing
+find_package(rocDecode)
+find_package(rocJPEG)
# applications used by integration tests which DO link to rocprofiler-sdk-roctx
add_subdirectory(reproducible-runtime)
add_subdirectory(transpose)
@@ -29,7 +33,10 @@ add_subdirectory(hsa-queue-dependency)
add_subdirectory(hip-graph)
add_subdirectory(hsa-memory-allocation)
add_subdirectory(pc-sampling)
-if(ROCPROFILER_BUILD_ROCDECODE_TESTS)
+if(rocDecode_FOUND AND rocDecode_VERSION VERSION_GREATER 0.8.0)
add_subdirectory(rocdecode)
endif()
+if(rocJPEG_FOUND AND rocJPEG_VERSION VERSION_GREATER 0.6.0)
+add_subdirectory(rocjpeg)
+endif()
add_subdirectory(hsa-code-object)
+5 -4
@@ -33,11 +33,12 @@ set(CMAKE_HIP_EXTENSIONS OFF)
set(CMAKE_HIP_STANDARD_REQUIRED ON)
set_source_files_properties(rocdecode.cpp roc_video_dec.cpp PROPERTIES LANGUAGE HIP)
-add_executable(rocdecode)
-target_sources(rocdecode PRIVATE rocdecode.cpp roc_video_dec.cpp)
+add_executable(rocdecode-demo)
+target_sources(rocdecode-demo PRIVATE rocdecode.cpp roc_video_dec.cpp)
find_package(Threads REQUIRED)
find_package(rocDecode REQUIRED)
target_link_libraries(
-rocdecode PRIVATE rocprofiler-sdk::tests-build-flags Threads::Threads hsa-runtime64
-rocprofiler-sdk::tests-common-library rocDecode::rocDecode)
+rocdecode-demo
+PRIVATE rocprofiler-sdk::tests-build-flags Threads::Threads hsa-runtime64
+rocprofiler-sdk::tests-common-library rocDecode::rocDecode)
+49
@@ -0,0 +1,49 @@
#
#
#
cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)
if(NOT CMAKE_HIP_COMPILER)
find_program(
amdclangpp_EXECUTABLE
NAMES amdclang++
HINTS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATHS ${ROCM_PATH} ENV ROCM_PATH /opt/rocm
PATH_SUFFIXES bin llvm/bin NO_CACHE)
mark_as_advanced(amdclangpp_EXECUTABLE)
if(amdclangpp_EXECUTABLE)
set(CMAKE_HIP_COMPILER "${amdclangpp_EXECUTABLE}")
endif()
endif()
project(rocprofiler-tool-test-app-rocjpeg LANGUAGES CXX HIP)
foreach(_TYPE DEBUG MINSIZEREL RELEASE RELWITHDEBINFO)
if("${CMAKE_HIP_FLAGS_${_TYPE}}" STREQUAL "")
set(CMAKE_HIP_FLAGS_${_TYPE} "${CMAKE_CXX_FLAGS_${_TYPE}}")
endif()
endforeach()
find_path(
ROCJPEG_SHARE_DIR
NAMES images
PATHS ${ROCM_PATH}/share/rocjpeg/)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_HIP_STANDARD 17)
set(CMAKE_HIP_EXTENSIONS OFF)
set(CMAKE_HIP_STANDARD_REQUIRED ON)
set_source_files_properties(rocjpeg.cpp PROPERTIES LANGUAGE HIP)
add_executable(rocjpeg-demo)
target_sources(rocjpeg-demo PRIVATE rocjpeg.cpp)
find_package(Threads REQUIRED)
find_package(rocJPEG REQUIRED)
target_link_libraries(
rocjpeg-demo
PRIVATE Threads::Threads hsa-runtime64 rocprofiler-sdk::tests-common-library
rocprofiler-sdk::tests-build-flags rocJPEG::rocJPEG)
+279
@@ -0,0 +1,279 @@
/*
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#include <rocjpeg/rocjpeg.h>
#include "rocjpeg_samples_utils.h"
int
main(int argc, char** argv)
{
int device_id = 0;
bool save_images = false;
uint8_t num_components;
uint32_t widths[ROCJPEG_MAX_COMPONENT] = {};
uint32_t heights[ROCJPEG_MAX_COMPONENT] = {};
uint32_t channel_sizes[ROCJPEG_MAX_COMPONENT] = {};
uint32_t prior_channel_sizes[ROCJPEG_MAX_COMPONENT] = {};
uint32_t num_channels = 0;
int total_images = 0;
double time_per_image_all = 0;
std::string chroma_sub_sampling = "";
std::string input_path, output_file_path;
std::vector<std::string> file_paths = {};
bool is_dir = false;
bool is_file = false;
RocJpegChromaSubsampling subsampling;
RocJpegBackend rocjpeg_backend = ROCJPEG_BACKEND_HARDWARE;
RocJpegHandle rocjpeg_handle = nullptr;
RocJpegStreamHandle rocjpeg_stream_handle = nullptr;
RocJpegImage output_image = {};
RocJpegDecodeParams decode_params = {};
RocJpegUtils rocjpeg_utils;
uint64_t num_bad_jpegs = 0;
uint64_t num_jpegs_with_411_subsampling = 0;
uint64_t num_jpegs_with_unknown_subsampling = 0;
uint64_t num_jpegs_with_unsupported_resolution = 0;
RocJpegUtils::ParseCommandLine(input_path,
output_file_path,
save_images,
device_id,
rocjpeg_backend,
decode_params,
nullptr,
nullptr,
argc,
argv);
bool is_roi_valid = false;
uint32_t roi_width;
uint32_t roi_height;
roi_width = decode_params.crop_rectangle.right - decode_params.crop_rectangle.left;
roi_height = decode_params.crop_rectangle.bottom - decode_params.crop_rectangle.top;
if(!RocJpegUtils::GetFilePaths(input_path, file_paths, is_dir, is_file))
{
std::cerr << "ERROR: Failed to get input file paths!" << std::endl;
return EXIT_FAILURE;
}
if(!RocJpegUtils::InitHipDevice(device_id))
{
std::cerr << "ERROR: Failed to initialize HIP!" << std::endl;
return EXIT_FAILURE;
}
// CHECK_ROCJPEG(rocJpegCreate(rocjpeg_backend, device_id, &rocjpeg_handle));
if(rocJpegCreate(rocjpeg_backend, device_id, &rocjpeg_handle) != ROCJPEG_STATUS_SUCCESS)
{
std::cerr << "rocJPEG tests not supported" << std::endl;
return 0;
}
CHECK_ROCJPEG(rocJpegStreamCreate(&rocjpeg_stream_handle));
std::vector<char> file_data;
for(const auto& file_path : file_paths)
{
std::string base_file_name = file_path.substr(file_path.find_last_of("/\\") + 1);
int image_count = 0;
// Read an image from disk.
std::ifstream input(file_path.c_str(), std::ios::in | std::ios::binary | std::ios::ate);
if(!(input.is_open()))
{
std::cerr << "ERROR: Cannot open image: " << file_path << std::endl;
return EXIT_FAILURE;
}
// Get the size
std::streamsize file_size = input.tellg();
input.seekg(0, std::ios::beg);
// resize if buffer is too small
if(file_data.size() < static_cast<size_t>(file_size))
{
file_data.resize(file_size);
}
if(!input.read(file_data.data(), file_size))
{
std::cerr << "ERROR: Cannot read from file: " << file_path << std::endl;
return EXIT_FAILURE;
}
RocJpegStatus rocjpeg_status = rocJpegStreamParse(
reinterpret_cast<uint8_t*>(file_data.data()), file_size, rocjpeg_stream_handle);
if(rocjpeg_status != ROCJPEG_STATUS_SUCCESS)
{
if(is_dir)
{
num_bad_jpegs++;
continue;
}
else
{
std::cerr << "ERROR: Failed to parse the input jpeg stream with "
<< rocJpegGetErrorName(rocjpeg_status) << std::endl;
return EXIT_FAILURE;
}
}
CHECK_ROCJPEG(rocJpegGetImageInfo(
rocjpeg_handle, rocjpeg_stream_handle, &num_components, &subsampling, widths, heights));
if(roi_width > 0 && roi_height > 0 && roi_width <= widths[0] && roi_height <= heights[0])
{
is_roi_valid = true;
}
rocjpeg_utils.GetChromaSubsamplingStr(subsampling, chroma_sub_sampling);
if(widths[0] < 64 || heights[0] < 64)
{
std::cerr << "The image resolution is not supported by VCN Hardware" << std::endl;
if(is_dir)
{
num_jpegs_with_unsupported_resolution++;
continue;
}
else
return EXIT_FAILURE;
}
if(subsampling == ROCJPEG_CSS_411 || subsampling == ROCJPEG_CSS_UNKNOWN)
{
std::cerr << "The chroma sub-sampling is not supported by VCN Hardware" << std::endl;
if(is_dir)
{
if(subsampling == ROCJPEG_CSS_411) num_jpegs_with_411_subsampling++;
if(subsampling == ROCJPEG_CSS_UNKNOWN) num_jpegs_with_unknown_subsampling++;
continue;
}
else
return EXIT_FAILURE;
}
if(rocjpeg_utils.GetChannelPitchAndSizes(decode_params,
subsampling,
widths,
heights,
num_channels,
output_image,
channel_sizes))
{
std::cerr << "ERROR: Failed to get the channel pitch and sizes" << std::endl;
return EXIT_FAILURE;
}
// allocate memory for each channel and reuse them if the sizes remain unchanged for a new
// image.
for(uint32_t i = 0; i < num_channels; i++)
{
if(prior_channel_sizes[i] != channel_sizes[i])
{
if(output_image.channel[i] != nullptr)
{
CHECK_HIP(hipFree((void*) output_image.channel[i]));
output_image.channel[i] = nullptr;
}
CHECK_HIP(hipMalloc(&output_image.channel[i], channel_sizes[i]));
}
}
if(is_roi_valid)
{}
auto start_time = std::chrono::high_resolution_clock::now();
CHECK_ROCJPEG(
rocJpegDecode(rocjpeg_handle, rocjpeg_stream_handle, &decode_params, &output_image));
auto end_time = std::chrono::high_resolution_clock::now();
double time_per_image_in_milli_sec =
std::chrono::duration<double, std::milli>(end_time - start_time).count();
image_count++;
if(save_images)
{
std::string image_save_path = output_file_path;
// if ROI is present, need to pass roi_width and roi_height
uint32_t width = is_roi_valid ? roi_width : widths[0];
uint32_t height = is_roi_valid ? roi_height : heights[0];
if(is_dir)
{
rocjpeg_utils.GetOutputFileExt(decode_params.output_format,
base_file_name,
width,
height,
subsampling,
image_save_path);
}
rocjpeg_utils.SaveImage(image_save_path,
&output_image,
width,
height,
subsampling,
decode_params.output_format);
}
if(is_dir)
{
total_images += image_count;
time_per_image_all += time_per_image_in_milli_sec;
}
for(int i = 0; i < ROCJPEG_MAX_COMPONENT; i++)
{
prior_channel_sizes[i] = channel_sizes[i];
}
}
for(uint32_t i = 0; i < num_channels; i++)
{
if(output_image.channel[i] != nullptr)
{
CHECK_HIP(hipFree((void*) output_image.channel[i]));
output_image.channel[i] = nullptr;
}
}
if(is_dir)
{
// Avoid dividing by zero when every image in the directory was skipped.
if(total_images > 0)
{
time_per_image_all = time_per_image_all / total_images;
}
if(num_bad_jpegs || num_jpegs_with_411_subsampling || num_jpegs_with_unknown_subsampling ||
num_jpegs_with_unsupported_resolution)
{
if(num_bad_jpegs)
{
std::cout << " ,total images that cannot be parsed: " << num_bad_jpegs;
}
if(num_jpegs_with_411_subsampling)
{
std::cout << " ,total images with YUV 4:1:1 chroma subsampling: "
<< num_jpegs_with_411_subsampling;
}
if(num_jpegs_with_unknown_subsampling)
{
std::cout << " ,total images with unknown chroma subsampling: "
<< num_jpegs_with_unknown_subsampling;
}
if(num_jpegs_with_unsupported_resolution)
{
std::cout << " ,total images with unsupported_resolution: "
<< num_jpegs_with_unsupported_resolution;
}
std::cout << std::endl;
}
}
CHECK_ROCJPEG(rocJpegDestroy(rocjpeg_handle));
CHECK_ROCJPEG(rocJpegStreamDestroy(rocjpeg_stream_handle));
return EXIT_SUCCESS;
}
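The sample above derives the region of interest from `decode_params.crop_rectangle` and treats it as valid only when both extents are positive and fit inside the decoded image; the command-line parser additionally requires even extents. The check can be sketched on its own, free of rocJPEG types so that it compiles standalone. `CropRect` and `IsUsableRoi` are hypothetical names introduced here, and the signed arithmetic is an assumption of this sketch, chosen so that an inverted rectangle is rejected explicitly rather than through unsigned wraparound.

```cpp
#include <cstdint>

// Hypothetical mirror of a crop rectangle with 16-bit edges (left, top, right, bottom).
struct CropRect
{
    int16_t left;
    int16_t top;
    int16_t right;
    int16_t bottom;
};

// Returns true when the rectangle has positive, even extents that fit in the image.
bool IsUsableRoi(const CropRect& rect, uint32_t image_width, uint32_t image_height)
{
    // Subtract the edge values in a signed type so inverted rectangles go negative.
    const int32_t roi_width  = static_cast<int32_t>(rect.right) - rect.left;
    const int32_t roi_height = static_cast<int32_t>(rect.bottom) - rect.top;
    if(roi_width <= 0 || roi_height <= 0) return false;
    if(roi_width % 2 != 0 || roi_height % 2 != 0) return false;
    return static_cast<uint32_t>(roi_width) <= image_width &&
           static_cast<uint32_t>(roi_height) <= image_height;
}
```

With this helper, a 64x64 crop of a 128x128 image is accepted, while an odd-width crop, an inverted rectangle (`right < left`), or a crop wider than the image is rejected.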
+875
@@ -0,0 +1,875 @@
/*
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
#ifndef ROC_JPEG_SAMPLES_COMMON
#define ROC_JPEG_SAMPLES_COMMON
#pragma once
#include <algorithm>
#include <condition_variable>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <functional>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>
#if __cplusplus >= 201703L && __has_include(<filesystem>)
# include <filesystem>
namespace fs = std::filesystem;
#else
# include <experimental/filesystem>
namespace fs = std::experimental::filesystem;
#endif
#include <rocjpeg/rocjpeg.h>
#include <chrono>
#define CHECK_ROCJPEG(call) \
{ \
RocJpegStatus _rocjpeg_status = (call); \
if(_rocjpeg_status != ROCJPEG_STATUS_SUCCESS) \
{ \
std::cerr << #call << " returned " << rocJpegGetErrorName(_rocjpeg_status) << " at " \
<< __FILE__ << ":" << __LINE__ << std::endl; \
exit(1); \
} \
}
#define CHECK_HIP(call) \
{ \
hipError_t _hip_status = (call); \
if(_hip_status != hipSuccess) \
{ \
std::cout << "HIP failure: '#" << _hip_status << "' at " << __FILE__ << ":" \
<< __LINE__ << std::endl; \
exit(1); \
} \
}
/**
* @class RocJpegUtils
* @brief Utility class for rocJPEG samples.
*
* This class provides utility functions for rocJPEG samples, such as parsing command line
* arguments, getting file paths, initializing HIP device, getting chroma subsampling string,
* getting channel pitch and sizes, getting output file extension, and saving images.
*/
class RocJpegUtils
{
public:
/**
* @brief Parses the command line arguments.
*
* This function parses the command line arguments and sets the corresponding variables.
*
* @param input_path The input path.
* @param output_file_path The output file path.
* @param save_images Flag indicating whether to save images.
* @param device_id The device ID.
* @param rocjpeg_backend The rocJPEG backend.
* @param decode_params The rocJPEG decode parameters.
* @param num_threads Optional output for the number of threads; may be nullptr.
* @param batch_size Optional output for the batch size; may be nullptr.
* @param argc The number of command line arguments.
* @param argv The command line arguments.
*/
static void ParseCommandLine(std::string& input_path,
std::string& output_file_path,
bool& save_images,
int& device_id,
RocJpegBackend& rocjpeg_backend,
RocJpegDecodeParams& decode_params,
int* num_threads,
int* batch_size,
int argc,
char* argv[])
{
if(argc <= 1)
{
ShowHelpAndExit("", num_threads != nullptr, batch_size != nullptr);
}
for(int i = 1; i < argc; i++)
{
if(!strcmp(argv[i], "-h"))
{
ShowHelpAndExit("", num_threads != nullptr, batch_size != nullptr);
}
if(!strcmp(argv[i], "-i"))
{
if(++i == argc)
{
ShowHelpAndExit("-i", num_threads != nullptr, batch_size != nullptr);
}
input_path = argv[i];
continue;
}
if(!strcmp(argv[i], "-o"))
{
if(++i == argc)
{
ShowHelpAndExit("-o", num_threads != nullptr, batch_size != nullptr);
}
output_file_path = argv[i];
save_images = true;
continue;
}
if(!strcmp(argv[i], "-d"))
{
if(++i == argc)
{
ShowHelpAndExit("-d", num_threads != nullptr, batch_size != nullptr);
}
device_id = atoi(argv[i]);
continue;
}
if(!strcmp(argv[i], "-be"))
{
if(++i == argc)
{
ShowHelpAndExit("-be", num_threads != nullptr, batch_size != nullptr);
}
rocjpeg_backend = static_cast<RocJpegBackend>(atoi(argv[i]));
continue;
}
if(!strcmp(argv[i], "-fmt"))
{
if(++i == argc)
{
ShowHelpAndExit("-fmt", num_threads != nullptr, batch_size != nullptr);
}
std::string selected_output_format = argv[i];
if(selected_output_format == "native")
{
decode_params.output_format = ROCJPEG_OUTPUT_NATIVE;
}
else if(selected_output_format == "yuv_planar")
{
decode_params.output_format = ROCJPEG_OUTPUT_YUV_PLANAR;
}
else if(selected_output_format == "y")
{
decode_params.output_format = ROCJPEG_OUTPUT_Y;
}
else if(selected_output_format == "rgb")
{
decode_params.output_format = ROCJPEG_OUTPUT_RGB;
}
else if(selected_output_format == "rgb_planar")
{
decode_params.output_format = ROCJPEG_OUTPUT_RGB_PLANAR;
}
else
{
ShowHelpAndExit(argv[i], num_threads != nullptr, batch_size != nullptr);
}
continue;
}
if(!strcmp(argv[i], "-t"))
{
if(++i == argc)
{
ShowHelpAndExit("-t", num_threads != nullptr, batch_size != nullptr);
}
if(num_threads != nullptr)
{
*num_threads = atoi(argv[i]);
if(*num_threads <= 0 || *num_threads > 32)
{
ShowHelpAndExit(argv[i], num_threads != nullptr, batch_size != nullptr);
}
}
continue;
}
if(!strcmp(argv[i], "-b"))
{
if(++i == argc)
{
ShowHelpAndExit("-b", num_threads != nullptr, batch_size != nullptr);
}
if(batch_size != nullptr) *batch_size = atoi(argv[i]);
continue;
}
if(!strcmp(argv[i], "-crop"))
{
if(++i == argc || 4 != sscanf(argv[i],
"%hd,%hd,%hd,%hd",
&decode_params.crop_rectangle.left,
&decode_params.crop_rectangle.top,
&decode_params.crop_rectangle.right,
&decode_params.crop_rectangle.bottom))
{
ShowHelpAndExit("-crop", num_threads != nullptr, batch_size != nullptr);
}
// Compare the crop extents themselves, not the addresses of the rectangle members.
if((decode_params.crop_rectangle.right - decode_params.crop_rectangle.left) % 2 != 0 ||
(decode_params.crop_rectangle.bottom - decode_params.crop_rectangle.top) % 2 != 0)
{
std::cout << "output crop rectangle must have width and height of even numbers"
<< std::endl;
exit(1);
}
continue;
}
ShowHelpAndExit(argv[i], num_threads != nullptr, batch_size != nullptr);
}
}
/**
* Checks if a file is a JPEG file.
*
* @param filePath The path to the file to be checked.
* @return True if the file is a JPEG file, false otherwise.
*/
static bool IsJPEG(const std::string& filePath)
{
std::ifstream file(filePath, std::ios::binary);
if(!file.is_open())
{
std::cerr << "Failed to open file: " << filePath << std::endl;
return false;
}
// Zero-initialize so a file shorter than two bytes is not compared against garbage.
unsigned char buffer[2] = {};
file.read(reinterpret_cast<char*>(buffer), 2);
file.close();
// The first two bytes of every JPEG stream are always 0xFFD8, which represents the Start of
// Image (SOI) marker.
return buffer[0] == 0xFF && buffer[1] == 0xD8;
}
/**
* @brief Gets the file paths.
*
* This function gets the file paths based on the input path and sets the corresponding
* variables.
*
* @param input_path The input path.
* @param file_paths The vector to store the file paths.
* @param is_dir Flag indicating whether the input path is a directory.
* @param is_file Flag indicating whether the input path is a file.
* @return True if successful, false otherwise.
*/
static bool GetFilePaths(std::string& input_path,
std::vector<std::string>& file_paths,
bool& is_dir,
bool& is_file)
{
if(!fs::exists(input_path))
{
std::cerr << "ERROR: the input path does not exist!" << std::endl;
return false;
}
is_dir = fs::is_directory(input_path);
is_file = fs::is_regular_file(input_path);
if(is_dir)
{
for(const auto& entry : fs::recursive_directory_iterator(input_path))
{
if(fs::is_regular_file(entry) && IsJPEG(entry.path().string()))
{
file_paths.push_back(entry.path().string());
}
}
}
else if(is_file && IsJPEG(input_path))
{
file_paths.push_back(input_path);
}
else
{
std::cerr << "ERROR: the input path does not contain JPEG files!" << std::endl;
return false;
}
return true;
}
/**
* @brief Initializes the HIP device.
*
* This function initializes the HIP device with the specified device ID.
*
* @param device_id The device ID.
* @return True if successful, false otherwise.
*/
static bool InitHipDevice(int device_id)
{
int num_devices;
hipDeviceProp_t hip_dev_prop;
CHECK_HIP(hipGetDeviceCount(&num_devices));
if(num_devices < 1)
{
std::cerr << "ERROR: didn't find any GPU!" << std::endl;
return false;
}
if(device_id >= num_devices)
{
std::cerr << "ERROR: the requested device_id is not found!" << std::endl;
return false;
}
CHECK_HIP(hipSetDevice(device_id));
CHECK_HIP(hipGetDeviceProperties(&hip_dev_prop, device_id));
return true;
}
/**
* @brief Gets the chroma subsampling string.
*
* This function gets the chroma subsampling string based on the specified subsampling value.
*
* @param subsampling The chroma subsampling value.
* @param chroma_sub_sampling The string to store the chroma subsampling.
*/
void GetChromaSubsamplingStr(RocJpegChromaSubsampling subsampling,
std::string& chroma_sub_sampling)
{
switch(subsampling)
{
case ROCJPEG_CSS_444: chroma_sub_sampling = "YUV 4:4:4"; break;
case ROCJPEG_CSS_440: chroma_sub_sampling = "YUV 4:4:0"; break;
case ROCJPEG_CSS_422: chroma_sub_sampling = "YUV 4:2:2"; break;
case ROCJPEG_CSS_420: chroma_sub_sampling = "YUV 4:2:0"; break;
case ROCJPEG_CSS_411: chroma_sub_sampling = "YUV 4:1:1"; break;
case ROCJPEG_CSS_400: chroma_sub_sampling = "YUV 4:0:0"; break;
case ROCJPEG_CSS_UNKNOWN: chroma_sub_sampling = "UNKNOWN"; break;
default: chroma_sub_sampling = ""; break;
}
}
/**
* @brief Gets the channel pitch and sizes.
*
* This function gets the channel pitch and sizes based on the specified output format, chroma
* subsampling, output image, and channel sizes.
*
* @param decode_params The decode parameters that specify the output format and crop rectangle.
* @param subsampling The chroma subsampling.
* @param widths The per-component image widths (input).
* @param heights The per-component image heights (input).
* @param num_channels Receives the number of output channels.
* @param output_image The output image whose channel pitches are set.
* @param channel_sizes The array that receives the aligned channel sizes in bytes.
* @return EXIT_SUCCESS on success, EXIT_FAILURE for an unknown subsampling or output format.
*/
int GetChannelPitchAndSizes(RocJpegDecodeParams decode_params,
RocJpegChromaSubsampling subsampling,
uint32_t* widths,
uint32_t* heights,
uint32_t& num_channels,
RocJpegImage& output_image,
uint32_t* channel_sizes)
{
bool is_roi_valid = false;
uint32_t roi_width;
uint32_t roi_height;
roi_width = decode_params.crop_rectangle.right - decode_params.crop_rectangle.left;
roi_height = decode_params.crop_rectangle.bottom - decode_params.crop_rectangle.top;
if(roi_width > 0 && roi_height > 0 && roi_width <= widths[0] && roi_height <= heights[0])
{
is_roi_valid = true;
}
switch(decode_params.output_format)
{
case ROCJPEG_OUTPUT_NATIVE:
switch(subsampling)
{
case ROCJPEG_CSS_444:
num_channels = 3;
output_image.pitch[2] = output_image.pitch[1] = output_image.pitch[0] =
is_roi_valid ? roi_width : widths[0];
channel_sizes[2] = channel_sizes[1] = channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
break;
case ROCJPEG_CSS_440:
num_channels = 3;
output_image.pitch[2] = output_image.pitch[1] = output_image.pitch[0] =
is_roi_valid ? roi_width : widths[0];
channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
channel_sizes[2] = channel_sizes[1] = align(
output_image.pitch[0] * ((is_roi_valid ? roi_height : heights[0]) >> 1),
mem_alignment);
break;
case ROCJPEG_CSS_422:
num_channels = 1;
output_image.pitch[0] = (is_roi_valid ? roi_width : widths[0]) * 2;
channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
break;
case ROCJPEG_CSS_420:
num_channels = 2;
output_image.pitch[1] = output_image.pitch[0] =
is_roi_valid ? roi_width : widths[0];
channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
channel_sizes[1] = align(
output_image.pitch[1] * ((is_roi_valid ? roi_height : heights[0]) >> 1),
mem_alignment);
break;
case ROCJPEG_CSS_400:
num_channels = 1;
output_image.pitch[0] = is_roi_valid ? roi_width : widths[0];
channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
break;
default:
std::cout << "Unknown chroma subsampling!" << std::endl;
return EXIT_FAILURE;
}
break;
case ROCJPEG_OUTPUT_YUV_PLANAR:
if(subsampling == ROCJPEG_CSS_400)
{
num_channels = 1;
output_image.pitch[0] = is_roi_valid ? roi_width : widths[0];
channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
}
else
{
num_channels = 3;
output_image.pitch[0] = is_roi_valid ? roi_width : widths[0];
output_image.pitch[1] = is_roi_valid ? roi_width : widths[1];
output_image.pitch[2] = is_roi_valid ? roi_width : widths[2];
channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
channel_sizes[1] =
align(output_image.pitch[1] * (is_roi_valid ? roi_height : heights[1]),
mem_alignment);
channel_sizes[2] =
align(output_image.pitch[2] * (is_roi_valid ? roi_height : heights[2]),
mem_alignment);
}
break;
case ROCJPEG_OUTPUT_Y:
num_channels = 1;
output_image.pitch[0] = is_roi_valid ? roi_width : widths[0];
channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
break;
case ROCJPEG_OUTPUT_RGB:
num_channels = 1;
output_image.pitch[0] = (is_roi_valid ? roi_width : widths[0]) * 3;
channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
break;
case ROCJPEG_OUTPUT_RGB_PLANAR:
num_channels = 3;
output_image.pitch[2] = output_image.pitch[1] = output_image.pitch[0] =
is_roi_valid ? roi_width : widths[0];
channel_sizes[2] = channel_sizes[1] = channel_sizes[0] =
align(output_image.pitch[0] * (is_roi_valid ? roi_height : heights[0]),
mem_alignment);
break;
default: std::cout << "Unknown output format!" << std::endl; return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
/**
* @brief Gets the output file extension.
*
* This function gets the output file extension based on the specified output format, base file
* name, image width, image height, and file name for saving.
*
* @param output_format The output format.
* @param base_file_name The base file name.
* @param image_width The image width.
* @param image_height The image height.
* @param subsampling The chroma subsampling, used to describe the native output format.
* @param file_name_for_saving The string to store the file name for saving.
*/
void GetOutputFileExt(RocJpegOutputFormat output_format,
std::string& base_file_name,
uint32_t image_width,
uint32_t image_height,
RocJpegChromaSubsampling subsampling,
std::string& file_name_for_saving)
{
std::string file_extension;
std::string::size_type const p(base_file_name.find_last_of('.'));
std::string file_name_no_ext = base_file_name.substr(0, p);
std::string format_description = "";
switch(output_format)
{
case ROCJPEG_OUTPUT_NATIVE:
file_extension = "yuv";
switch(subsampling)
{
case ROCJPEG_CSS_444: format_description = "444"; break;
case ROCJPEG_CSS_440: format_description = "440"; break;
case ROCJPEG_CSS_422: format_description = "422_yuyv"; break;
case ROCJPEG_CSS_420: format_description = "nv12"; break;
case ROCJPEG_CSS_400: format_description = "400"; break;
default: std::cout << "Unknown chroma subsampling!" << std::endl; return;
}
break;
case ROCJPEG_OUTPUT_YUV_PLANAR:
file_extension = "yuv";
format_description = "planar";
break;
case ROCJPEG_OUTPUT_Y:
file_extension = "yuv";
format_description = "400";
break;
case ROCJPEG_OUTPUT_RGB:
file_extension = "rgb";
format_description = "packed";
break;
case ROCJPEG_OUTPUT_RGB_PLANAR:
file_extension = "rgb";
format_description = "planar";
break;
default: file_extension = ""; break;
}
file_name_for_saving += "//" + file_name_no_ext + "_" + std::to_string(image_width) + "x" +
std::to_string(image_height) + "_" + format_description + "." +
file_extension;
}
/**
* @brief Saves the image.
*
* This function saves the image to the specified output file name based on the output image,
* image width, image height, chroma subsampling, and output format.
*
* @param output_file_name The output file name.
* @param output_image The output image.
* @param img_width The image width.
* @param img_height The image height.
* @param subsampling The chroma subsampling.
* @param output_format The output format.
*/
void SaveImage(std::string output_file_name,
RocJpegImage* output_image,
uint32_t img_width,
uint32_t img_height,
RocJpegChromaSubsampling subsampling,
RocJpegOutputFormat output_format)
{
uint8_t* hst_ptr = nullptr;
FILE* fp;
if(output_image == nullptr || output_image->channel[0] == nullptr ||
output_image->pitch[0] == 0)
{
return;
}
uint32_t widths[ROCJPEG_MAX_COMPONENT] = {};
uint32_t heights[ROCJPEG_MAX_COMPONENT] = {};
switch(output_format)
{
case ROCJPEG_OUTPUT_NATIVE:
switch(subsampling)
{
case ROCJPEG_CSS_444:
widths[2] = widths[1] = widths[0] = img_width;
heights[2] = heights[1] = heights[0] = img_height;
break;
case ROCJPEG_CSS_440:
widths[2] = widths[1] = widths[0] = img_width;
heights[0] = img_height;
heights[2] = heights[1] = img_height >> 1;
break;
case ROCJPEG_CSS_422:
widths[0] = img_width * 2;
heights[0] = img_height;
break;
case ROCJPEG_CSS_420:
widths[1] = widths[0] = img_width;
heights[0] = img_height;
heights[1] = img_height >> 1;
break;
case ROCJPEG_CSS_400:
widths[0] = img_width;
heights[0] = img_height;
break;
default: std::cout << "Unknown chroma subsampling!" << std::endl; return;
}
break;
case ROCJPEG_OUTPUT_YUV_PLANAR:
switch(subsampling)
{
case ROCJPEG_CSS_444:
widths[2] = widths[1] = widths[0] = img_width;
heights[2] = heights[1] = heights[0] = img_height;
break;
case ROCJPEG_CSS_440:
widths[2] = widths[1] = widths[0] = img_width;
heights[0] = img_height;
heights[2] = heights[1] = img_height >> 1;
break;
case ROCJPEG_CSS_422:
widths[0] = img_width;
widths[2] = widths[1] = widths[0] >> 1;
heights[2] = heights[1] = heights[0] = img_height;
break;
case ROCJPEG_CSS_420:
widths[0] = img_width;
widths[2] = widths[1] = widths[0] >> 1;
heights[0] = img_height;
heights[2] = heights[1] = img_height >> 1;
break;
case ROCJPEG_CSS_400:
widths[0] = img_width;
heights[0] = img_height;
break;
default: std::cout << "Unknown chroma subsampling!" << std::endl; return;
}
break;
case ROCJPEG_OUTPUT_Y:
widths[0] = img_width;
heights[0] = img_height;
break;
case ROCJPEG_OUTPUT_RGB:
widths[0] = img_width * 3;
heights[0] = img_height;
break;
case ROCJPEG_OUTPUT_RGB_PLANAR:
widths[2] = widths[1] = widths[0] = img_width;
heights[2] = heights[1] = heights[0] = img_height;
break;
default: std::cout << "Unknown output format!" << std::endl; return;
}
uint32_t channel0_size = output_image->pitch[0] * heights[0];
uint32_t channel1_size = output_image->pitch[1] * heights[1];
uint32_t channel2_size = output_image->pitch[2] * heights[2];
uint32_t output_image_size = channel0_size + channel1_size + channel2_size;
if(hst_ptr == nullptr)
{
hst_ptr = new uint8_t[output_image_size];
}
CHECK_HIP(hipMemcpyDtoH((void*) hst_ptr, output_image->channel[0], channel0_size));
uint8_t* tmp_hst_ptr = hst_ptr;
fp = fopen(output_file_name.c_str(), "wb");
if(fp)
{
// write channel0
if(widths[0] == output_image->pitch[0])
{
fwrite(hst_ptr, 1, channel0_size, fp);
}
else
{
for(uint32_t i = 0; i < heights[0]; i++)
{
fwrite(tmp_hst_ptr, 1, widths[0], fp);
tmp_hst_ptr += output_image->pitch[0];
}
}
// write channel1
if(channel1_size != 0 && output_image->channel[1] != nullptr)
{
uint8_t* channel1_hst_ptr = hst_ptr + channel0_size;
CHECK_HIP(hipMemcpyDtoH(
(void*) channel1_hst_ptr, output_image->channel[1], channel1_size));
if(widths[1] == output_image->pitch[1])
{
fwrite(channel1_hst_ptr, 1, channel1_size, fp);
}
else
{
for(uint32_t i = 0; i < heights[1]; i++)
{
fwrite(channel1_hst_ptr, 1, widths[1], fp);
channel1_hst_ptr += output_image->pitch[1];
}
}
}
// write channel2
if(channel2_size != 0 && output_image->channel[2] != nullptr)
{
uint8_t* channel2_hst_ptr = hst_ptr + channel0_size + channel1_size;
CHECK_HIP(hipMemcpyDtoH(
(void*) channel2_hst_ptr, output_image->channel[2], channel2_size));
if(widths[2] == output_image->pitch[2])
{
fwrite(channel2_hst_ptr, 1, channel2_size, fp);
}
else
{
for(uint32_t i = 0; i < heights[2]; i++)
{
fwrite(channel2_hst_ptr, 1, widths[2], fp);
channel2_hst_ptr += output_image->pitch[2];
}
}
}
fclose(fp);
}
if(hst_ptr != nullptr)
{
delete[] hst_ptr;
hst_ptr = nullptr;
tmp_hst_ptr = nullptr;
}
}
private:
static const int mem_alignment = 4 * 1024 * 1024;
/**
* @brief Shows the help message and exits.
*
* This function shows the help message and exits the program.
*
* @param option The option to display in the help message (optional).
* @param show_threads Flag indicating whether to show the number of threads in the help
* message.
* @param show_batch_size Flag indicating whether to show the batch size in the help message.
*/
static void ShowHelpAndExit(const char* option = nullptr,
bool show_threads = false,
bool show_batch_size = false)
{
(void) option;
std::cout
<< "Options:\n"
"-i [input path] - input path to a single JPEG image or a directory containing "
"JPEG images - [required]\n"
"-be [backend] - select rocJPEG backend (0 for hardware-accelerated JPEG "
"decoding using VCN,\n"
" 1 for hybrid JPEG decoding using CPU "
"and GPU HIP kernels (currently not supported)) [optional - default: 0]\n"
"-fmt [output format] - select rocJPEG output format for decoding, one of the "
"[native, yuv_planar, y, rgb, rgb_planar] - [optional - default: native]\n"
"-o [output path] - path to an output file or a path to an existing directory - "
"write decoded images to a file or an existing directory based on selected output "
"format - [optional]\n"
"-crop [crop rectangle] - crop rectangle for output in a comma-separated format: "
"left,top,right,bottom - [optional]\n"
"-d [device id] - specify the GPU device id for the desired device (use 0 for "
"the first device, 1 for the second device, and so on) [optional - default: 0]\n";
if(show_threads)
{
std::cout << "-t [threads] - number of threads (<= 32) for parallel JPEG decoding "
"- [optional - default: 1]\n";
}
if(show_batch_size)
{
std::cout << "-b [batch_size] - decode images from input by batches of a specified "
"size - [optional - default: 1]\n";
}
exit(0);
}
/**
* @brief Aligns a value to a specified alignment.
*
* This function takes a value and aligns it to the specified alignment. It returns the aligned
* value.
*
* @param value The value to be aligned.
* @param alignment The alignment value.
* @return The aligned value.
*/
static inline int align(int value, int alignment)
{
return (value + alignment - 1) & ~(alignment - 1);
}
};
class ThreadPool
{
public:
ThreadPool(int nthreads)
: shutdown_(false)
{
// Create the specified number of threads
threads_.reserve(nthreads);
for(int i = 0; i < nthreads; ++i)
threads_.emplace_back(std::bind(&ThreadPool::ThreadEntry, this, i));
}
~ThreadPool() {}
void JoinThreads()
{
{
// Unblock any threads and tell them to stop
std::unique_lock<std::mutex> lock(mutex_);
shutdown_ = true;
cond_var_.notify_all();
}
// Wait for all threads to stop
for(auto& thread : threads_)
thread.join();
}
void ExecuteJob(std::function<void()> func)
{
// Place a job on the queue and unblock a thread
std::unique_lock<std::mutex> lock(mutex_);
decode_jobs_queue_.emplace(std::move(func));
cond_var_.notify_one();
}
protected:
void ThreadEntry(int i)
{
(void) i;
std::function<void()> execute_decode_job;
while(true)
{
{
std::unique_lock<std::mutex> lock(mutex_);
cond_var_.wait(lock, [&] { return shutdown_ || !decode_jobs_queue_.empty(); });
if(decode_jobs_queue_.empty())
{
// No jobs to do; shutting down
return;
}
execute_decode_job = std::move(decode_jobs_queue_.front());
decode_jobs_queue_.pop();
}
// Execute the decode job without holding any locks
execute_decode_job();
}
}
std::mutex mutex_;
std::condition_variable cond_var_;
bool shutdown_;
std::queue<std::function<void()>> decode_jobs_queue_;
std::vector<std::thread> threads_;
};
#endif // ROC_JPEG_SAMPLES_COMMON
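The `align` helper in the header above rounds a value up with a bit mask, which only works when the alignment is a power of two. The Python sketch below reproduces the same arithmetic so the behaviour can be checked quickly; the power-of-two validation is an addition for illustration and is not present in the C++ helper.

```python
def align(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (power of two only)."""
    # Assumption added for this sketch: reject alignments the mask trick cannot handle.
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Same expression as the C++ helper: add (alignment - 1), then clear the low bits.
    return (value + alignment - 1) & ~(alignment - 1)


MEM_ALIGNMENT = 4 * 1024 * 1024  # same value as mem_alignment in the header

print(align(1, MEM_ALIGNMENT))                  # 4194304
print(align(MEM_ALIGNMENT, MEM_ALIGNMENT))      # 4194304 (already aligned)
print(align(MEM_ALIGNMENT + 1, MEM_ALIGNMENT))  # 8388608
```

With a non-power-of-two alignment such as 3, the mask `~(alignment - 1)` no longer clears a contiguous run of low bits, so the result would not be a multiple of the alignment.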
@@ -34,6 +34,7 @@ def test_perfetto_data(
"memory_copy",
"memory_allocation",
"rocdecode_api",
"rocjpeg_api",
),
):
@@ -45,6 +46,7 @@ def test_perfetto_data(
"memory_copy": ("memory_copy", "memory_copy"),
"memory_allocation": ("memory_allocation", "memory_allocation"),
"rocdecode_api": ("rocdecode_api", "rocdecode_api"),
"rocjpeg_api": ("rocjpeg_api", "rocjpeg_api"),
}
# make sure they specified valid categories
@@ -83,6 +85,7 @@ def test_otf2_data(
"memory_copy": ("memory_copy", "memory_copy"),
"memory_allocation": ("memory_allocation", "memory_allocation"),
"rocdecode_api": ("rocdecode_api", "rocdecode_api"),
"rocjpeg_api": ("rocjpeg_api", "rocjpeg_api"),
}
# make sure they specified valid categories
+29 -10
@@ -9,6 +9,7 @@ project(
VERSION 0.0.0)
find_package(rocprofiler-sdk REQUIRED)
find_package(rocDecode)
if(ROCPROFILER_MEMCHECK_PRELOAD_ENV)
set(PRELOAD_ENV
@@ -18,14 +19,18 @@ else()
endif()
set(ROCDECODE_VIDEO_FILE
"${ROCM_PATH}/share/rocdecode/video/AMD_driving_virtual_20-H265.265")
if(NOT EXISTS "${ROCDECODE_VIDEO_FILE}")
"${rocDecode_ROOT_DIR}/share/rocdecode/video/AMD_driving_virtual_20-H265.265")
if(TARGET rocdecode-demo AND NOT EXISTS "${ROCDECODE_VIDEO_FILE}")
message(
FATAL_ERROR
"Unable to find video file for rocdecode tests: ${ROCDECODE_VIDEO_FILE}")
endif()
add_test(NAME test-rocdecode-tracing-execute COMMAND $<TARGET_FILE:rocdecode> -i
${ROCDECODE_VIDEO_FILE})
add_test(
NAME test-rocdecode-tracing-execute
COMMAND
$<IF:$<TARGET_EXISTS:rocdecode-demo>,$<$<TARGET_EXISTS:rocdecode-demo>:$<TARGET_FILE:rocdecode-demo>>,${CMAKE_COMMAND}>
-i ${ROCDECODE_VIDEO_FILE})
set(rocdecode-tracing-env
"${PRELOAD_ENV}"
@@ -35,9 +40,16 @@ set(rocdecode-tracing-env
set_tests_properties(
test-rocdecode-tracing-execute
PROPERTIES TIMEOUT 45 LABELS "integration-tests" ENVIRONMENT
"${rocdecode-tracing-env}" FAIL_REGULAR_EXPRESSION
"${ROCPROFILER_DEFAULT_FAIL_REGEX}")
PROPERTIES TIMEOUT
45
LABELS
"integration-tests"
ENVIRONMENT
"${rocdecode-tracing-env}"
FAIL_REGULAR_EXPRESSION
"${ROCPROFILER_DEFAULT_FAIL_REGEX}"
DISABLED
$<NOT:$<TARGET_EXISTS:rocdecode-demo>>)
# copy to binary directory
rocprofiler_configure_pytest_files(COPY validate.py conftest.py CONFIG pytest.ini)
@@ -48,6 +60,13 @@ add_test(NAME test-rocdecode-tracing-validate
set_tests_properties(
test-rocdecode-tracing-validate
PROPERTIES TIMEOUT 45 LABELS "integration-tests" DEPENDS
test-rocdecode-tracing-execute FAIL_REGULAR_EXPRESSION
"${ROCPROFILER_DEFAULT_FAIL_REGEX}")
PROPERTIES TIMEOUT
45
LABELS
"integration-tests"
DEPENDS
test-rocdecode-tracing-execute
FAIL_REGULAR_EXPRESSION
"${ROCPROFILER_DEFAULT_FAIL_REGEX}"
DISABLED
$<NOT:$<TARGET_EXISTS:rocdecode-demo>>)
+3
@@ -1,6 +1,7 @@
#!/usr/bin/env python3
import json
import os
import pytest
from rocprofiler_sdk.pytest_utils.dotdict import dotdict
@@ -18,5 +19,7 @@ def pytest_addoption(parser):
@pytest.fixture
def input_data(request):
filename = request.config.getoption("--input")
if not os.path.isfile(filename):
return pytest.skip("rocdecode tracing unavailable")
with open(filename, "r") as inp:
return dotdict(json.load(inp))
+8 -64
@@ -32,16 +32,12 @@ def test_data_structure(input_data):
node_exists("buffer_records", sdk_data)
node_exists("names", sdk_data["callback_records"])
node_exists("hsa_api_traces", sdk_data["callback_records"])
node_exists("hip_api_traces", sdk_data["callback_records"])
node_exists("memory_allocations", sdk_data["callback_records"])
node_exists("rocdecode_api_traces", sdk_data["callback_records"])
# Uncomment when rocprofiler register mainline supports rocdecode
# node_exists("rocdecode_api_traces", sdk_data["callback_records"])
node_exists("names", sdk_data["buffer_records"])
node_exists("hsa_api_traces", sdk_data["buffer_records"])
node_exists("hip_api_traces", sdk_data["buffer_records"])
node_exists("memory_allocations", sdk_data["buffer_records"])
node_exists("rocdecode_api_traces", sdk_data["buffer_records"])
# Uncomment when rocprofiler register mainline supports rocdecode
# node_exists("rocdecode_api_traces", sdk_data["buffer_records"])
def test_size_entries(input_data):
@@ -77,7 +73,7 @@ def test_timestamps(input_data):
cb_start = {}
cb_end = {}
for titr in ["hsa_api_traces", "hip_api_traces", "rocdecode_api_traces"]:
for titr in ["rocdecode_api_traces"]:
for itr in sdk_data["callback_records"][titr]:
cid = itr["correlation_id"]["internal"]
phase = itr["phase"]
@@ -92,29 +88,6 @@ def test_timestamps(input_data):
for itr in sdk_data["buffer_records"][titr]:
assert itr["start_timestamp"] <= itr["end_timestamp"]
for titr in ["memory_allocations"]:
for itr in sdk_data["buffer_records"][titr]:
assert itr["start_timestamp"] < itr["end_timestamp"], f"[{titr}] {itr}"
assert itr["correlation_id"]["internal"] > 0, f"[{titr}] {itr}"
assert itr["correlation_id"]["external"] > 0, f"[{titr}] {itr}"
assert (
sdk_data["metadata"]["init_time"] < itr["start_timestamp"]
), f"[{titr}] {itr}"
assert (
sdk_data["metadata"]["init_time"] < itr["end_timestamp"]
), f"[{titr}] {itr}"
assert (
sdk_data["metadata"]["fini_time"] > itr["start_timestamp"]
), f"[{titr}] {itr}"
assert (
sdk_data["metadata"]["fini_time"] > itr["end_timestamp"]
), f"[{titr}] {itr}"
api_start = cb_start[itr["correlation_id"]["internal"]]
# api_end = cb_end[itr["correlation_id"]["internal"]]
assert api_start < itr["start_timestamp"], f"[{titr}] {itr}"
# assert api_end <= itr["end_timestamp"], f"[{titr}] {itr}"
def test_internal_correlation_ids(input_data):
"""Assure correlation ids are unique"""
@@ -140,37 +113,6 @@ def test_internal_correlation_ids(input_data):
assert max(api_corr_ids_sorted) == len_corr_id_unq
def test_external_correlation_ids(input_data):
data = input_data
sdk_data = data["rocprofiler-sdk-json-tool"]
extern_corr_ids = []
for titr in ["hsa_api_traces", "hip_api_traces", "rocdecode_api_traces"]:
for itr in sdk_data["callback_records"][titr]:
assert itr["correlation_id"]["external"] > 0
assert itr["thread_id"] == itr["correlation_id"]["external"]
extern_corr_ids.append(itr["correlation_id"]["external"])
extern_corr_ids = list(set(sorted(extern_corr_ids)))
for titr in ["hsa_api_traces", "hip_api_traces", "rocdecode_api_traces"]:
for itr in sdk_data["buffer_records"][titr]:
assert itr["correlation_id"]["external"] > 0, f"[{titr}] {itr}"
assert (
itr["thread_id"] == itr["correlation_id"]["external"]
), f"[{titr}] {itr}"
assert itr["thread_id"] in extern_corr_ids, f"[{titr}] {itr}"
assert itr["correlation_id"]["external"] in extern_corr_ids, f"[{titr}] {itr}"
for titr in ["memory_allocations"]:
for itr in sdk_data["buffer_records"][titr]:
assert itr["correlation_id"]["external"] > 0, f"[{titr}] {itr}"
assert itr["correlation_id"]["external"] in extern_corr_ids, f"[{titr}] {itr}"
for itr in sdk_data["callback_records"][titr]:
assert itr["correlation_id"]["external"] > 0, f"[{titr}] {itr}"
assert itr["correlation_id"]["external"] in extern_corr_ids, f"[{titr}] {itr}"
def get_operation(record, kind_name, op_name=None):
for idx, itr in enumerate(record["names"]):
if kind_name == itr["kind"]:
@@ -196,7 +138,9 @@ def test_rocdecode_traces(input_data):
rocdecode_cb_traces = sdk_data["callback_records"]["rocdecode_api_traces"]
rocdecode_api_cb_ops = get_operation(callback_records, "ROCDECODE_API")
# If rocDecode tracing is not supported, end early
if len(rocdecode_bf_traces) == 0:
return pytest.skip("rocdecode tracing unavailable")
assert (
rocdecode_api_bf_ops[1] == rocdecode_api_cb_ops[1]
and len(rocdecode_api_cb_ops[1]) == 16
+68
@@ -0,0 +1,68 @@
#
#
#
cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)
project(
rocprofiler-tests-rocjpeg-tracing
LANGUAGES CXX
VERSION 0.0.0)
find_package(rocprofiler-sdk REQUIRED)
find_package(rocJPEG)
string(REPLACE "LD_PRELOAD=" "ROCPROF_PRELOAD=" PRELOAD_ENV
"${ROCPROFILER_MEMCHECK_PRELOAD_ENV}")
set(rocjpeg-tracing-env "${PRELOAD_ENV}")
set(rocJPEG_IMAGE_DIR "${rocJPEG_ROOT_DIR}/share/rocjpeg/images")
if(TARGET rocjpeg-demo AND NOT EXISTS "${rocJPEG_IMAGE_DIR}")
message(
FATAL_ERROR
"Unable to find image directory for rocjpeg tests: ${rocJPEG_IMAGE_DIR}")
endif()
add_test(
NAME test-rocjpeg-tracing-execute
COMMAND
$<IF:$<TARGET_EXISTS:rocjpeg-demo>,$<$<TARGET_EXISTS:rocjpeg-demo>:$<TARGET_FILE:rocjpeg-demo>>,${CMAKE_COMMAND}>
-i ${rocJPEG_IMAGE_DIR})
set(rocjpeg-tracing-env
"${PRELOAD_ENV}"
"ROCPROFILER_TOOL_OUTPUT_FILE=rocjpeg-tracing-test.json"
"LD_LIBRARY_PATH=$<TARGET_FILE_DIR:rocprofiler-sdk::rocprofiler-sdk-shared-library>:$ENV{LD_LIBRARY_PATH}"
)
set_tests_properties(
test-rocjpeg-tracing-execute
PROPERTIES TIMEOUT
45
LABELS
"integration-tests"
ENVIRONMENT
"${rocjpeg-tracing-env}"
FAIL_REGULAR_EXPRESSION
"${ROCPROFILER_DEFAULT_FAIL_REGEX}"
DISABLED
$<NOT:$<TARGET_EXISTS:rocjpeg-demo>>)
# copy to binary directory
rocprofiler_configure_pytest_files(COPY validate.py conftest.py CONFIG pytest.ini)
add_test(NAME test-rocjpeg-tracing-validate
COMMAND ${Python3_EXECUTABLE} ${CMAKE_CURRENT_BINARY_DIR}/validate.py --input
${CMAKE_CURRENT_BINARY_DIR}/rocjpeg-tracing-test.json)
set_tests_properties(
test-rocjpeg-tracing-validate
PROPERTIES TIMEOUT
45
LABELS
"integration-tests"
DEPENDS
test-rocjpeg-tracing-execute
FAIL_REGULAR_EXPRESSION
"${ROCPROFILER_DEFAULT_FAIL_REGEX}"
DISABLED
$<NOT:$<TARGET_EXISTS:rocjpeg-demo>>)
+25
@@ -0,0 +1,25 @@
#!/usr/bin/env python3
import json
import os
import pytest
from rocprofiler_sdk.pytest_utils.dotdict import dotdict
def pytest_addoption(parser):
parser.addoption(
"--input",
action="store",
default="rocjpeg-tracing-test.json",
help="Input JSON",
)
@pytest.fixture
def input_data(request):
filename = request.config.getoption("--input")
if not os.path.isfile(filename):
return pytest.skip("rocjpeg tracing unavailable")
with open(filename, "r") as inp:
return dotdict(json.load(inp))
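The `input_data` fixture above skips the test when the trace file was never written, instead of failing on `open`. A minimal stdlib-only sketch of that guard follows; `load_trace` is a hypothetical helper, and it returns `None` where the real fixture calls `pytest.skip(...)`, so that the example runs without pytest.

```python
import json
import os
import tempfile


def load_trace(path):
    """Return the parsed JSON trace, or None when the file was never produced."""
    if not os.path.isfile(path):
        # The fixture in this change calls pytest.skip(...) at this point.
        return None
    with open(path, "r") as inp:
        return json.load(inp)


with tempfile.TemporaryDirectory() as tmp:
    trace_file = os.path.join(tmp, "rocjpeg-tracing-test.json")
    print(load_trace(trace_file))  # None: the execute step did not run

    # Simulate the execute step writing an (empty) trace.
    with open(trace_file, "w") as out:
        json.dump({"rocprofiler-sdk-json-tool": {}}, out)
    print(load_trace(trace_file))
```

Checking `os.path.isfile` rather than catching the exception from `open` keeps the "tracing unavailable" case distinct from a trace file that exists but is malformed, which should still fail.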
+5
@@ -0,0 +1,5 @@
[pytest]
addopts = --durations=20 -rA -s -vv
testpaths = validate.py
pythonpath = @ROCPROFILER_SDK_TESTS_BINARY_DIR@/pytest-packages
+250
@@ -0,0 +1,250 @@
#!/usr/bin/env python3
import sys
import pytest
# helper function
def node_exists(name, data, min_len=1):
assert name in data
assert data[name] is not None
if isinstance(data[name], (list, tuple, dict, set)):
assert len(data[name]) >= min_len, f"{name}:\n{data}"
def test_data_structure(input_data):
"""verify minimum amount of expected data is present"""
data = input_data
node_exists("rocprofiler-sdk-json-tool", data)
sdk_data = data["rocprofiler-sdk-json-tool"]
node_exists("metadata", sdk_data)
node_exists("pid", sdk_data["metadata"])
node_exists("main_tid", sdk_data["metadata"])
node_exists("init_time", sdk_data["metadata"])
node_exists("fini_time", sdk_data["metadata"])
node_exists("agents", sdk_data)
node_exists("call_stack", sdk_data)
node_exists("callback_records", sdk_data)
node_exists("buffer_records", sdk_data)
node_exists("names", sdk_data["callback_records"])
# Uncomment once mainline rocprofiler register supports rocJPEG
# node_exists("rocjpeg_api_traces", sdk_data["callback_records"])
node_exists("names", sdk_data["buffer_records"])
# Uncomment once mainline rocprofiler register supports rocJPEG
# node_exists("rocjpeg_api_traces", sdk_data["buffer_records"])
def test_size_entries(input_data):
# check that size fields are > 0 but account for function arguments
# which are named "size"
def check_size(data, bt):
if "size" in data.keys():
if isinstance(data["size"], str) and bt.endswith('["args"]'):
pass
else:
assert data["size"] > 0, f"origin: {bt}"
# recursively check the entire data structure
def iterate_data(data, bt):
if isinstance(data, (list, tuple)):
for i, itr in enumerate(data):
if isinstance(itr, dict):
check_size(itr, f"{bt}[{i}]")
iterate_data(itr, f"{bt}[{i}]")
elif isinstance(data, dict):
check_size(data, f"{bt}")
for key, itr in data.items():
iterate_data(itr, f'{bt}["{key}"]')
# start recursive check over entire JSON dict
iterate_data(input_data, "input_data")
def test_timestamps(input_data):
"""Verify starting timestamps are less than ending timestamps"""
data = input_data
sdk_data = data["rocprofiler-sdk-json-tool"]
cb_start = {}
cb_end = {}
for titr in ["rocjpeg_api_traces"]:
for itr in sdk_data["callback_records"][titr]:
cid = itr["correlation_id"]["internal"]
phase = itr["phase"]
if phase == 1:
cb_start[cid] = itr["timestamp"]
elif phase == 2:
cb_end[cid] = itr["timestamp"]
assert cb_start[cid] <= itr["timestamp"]
else:
assert phase == 1 or phase == 2
for itr in sdk_data["buffer_records"][titr]:
assert itr["start_timestamp"] <= itr["end_timestamp"]
def test_internal_correlation_ids(input_data):
"""Assure correlation ids are unique"""
data = input_data
sdk_data = data["rocprofiler-sdk-json-tool"]
api_corr_ids = []
for titr in ["hsa_api_traces", "hip_api_traces", "rocjpeg_api_traces"]:
for itr in sdk_data["callback_records"][titr]:
api_corr_ids.append(itr["correlation_id"]["internal"])
for itr in sdk_data["buffer_records"][titr]:
api_corr_ids.append(itr["correlation_id"]["internal"])
api_corr_ids_sorted = sorted(api_corr_ids)
api_corr_ids_unique = list(set(api_corr_ids))
for itr in sdk_data["buffer_records"]["memory_allocations"]:
assert itr["correlation_id"]["internal"] in api_corr_ids_unique
len_corr_id_unq = len(api_corr_ids_unique)
assert len(api_corr_ids) != len_corr_id_unq
assert max(api_corr_ids_sorted) == len_corr_id_unq
def test_external_correlation_ids(input_data):
data = input_data
sdk_data = data["rocprofiler-sdk-json-tool"]
extern_corr_ids = []
for titr in ["rocjpeg_api_traces"]:
for itr in sdk_data["callback_records"][titr]:
assert itr["correlation_id"]["external"] > 0
assert itr["thread_id"] == itr["correlation_id"]["external"]
extern_corr_ids.append(itr["correlation_id"]["external"])
extern_corr_ids = list(set(sorted(extern_corr_ids)))
for titr in ["rocjpeg_api_traces"]:
for itr in sdk_data["buffer_records"][titr]:
assert itr["correlation_id"]["external"] > 0, f"[{titr}] {itr}"
assert (
itr["thread_id"] == itr["correlation_id"]["external"]
), f"[{titr}] {itr}"
assert itr["thread_id"] in extern_corr_ids, f"[{titr}] {itr}"
assert itr["correlation_id"]["external"] in extern_corr_ids, f"[{titr}] {itr}"
for itr in sdk_data["callback_records"][titr]:
assert itr["correlation_id"]["external"] > 0, f"[{titr}] {itr}"
assert itr["correlation_id"]["external"] in extern_corr_ids, f"[{titr}] {itr}"
def get_operation(record, kind_name, op_name=None):
for idx, itr in enumerate(record["names"]):
if kind_name == itr["kind"]:
if op_name is None:
return idx, itr["operations"]
else:
for oidx, oname in enumerate(itr["operations"]):
if op_name == oname:
return oidx
return None
def test_rocjpeg_traces(input_data):
data = input_data
sdk_data = data["rocprofiler-sdk-json-tool"]
callback_records = sdk_data["callback_records"]
buffer_records = sdk_data["buffer_records"]
rocjpeg_bf_traces = sdk_data["buffer_records"]["rocjpeg_api_traces"]
rocjpeg_api_bf_ops = get_operation(buffer_records, "ROCJPEG_API")
assert len(rocjpeg_api_bf_ops[1]) == 9
rocjpeg_cb_traces = sdk_data["callback_records"]["rocjpeg_api_traces"]
rocjpeg_api_cb_ops = get_operation(callback_records, "ROCJPEG_API")
# If rocJPEG tracing is not supported, end early
if len(rocjpeg_bf_traces) <= 2:
return pytest.skip("rocjpeg tracing unavailable")
assert (
rocjpeg_api_bf_ops[1] == rocjpeg_api_cb_ops[1] and len(rocjpeg_api_cb_ops[1]) == 9
)
# check that buffer and callback records agree
phase_enter_count = 0
phase_end_count = 0
api_calls = []
for api_call in rocjpeg_cb_traces:
if api_call["phase"] == 1:
phase_enter_count += 1
api_calls.append(rocjpeg_api_cb_ops[1][api_call["operation"]])
if api_call["phase"] == 2:
phase_end_count += 1
assert phase_enter_count == phase_end_count == len(rocjpeg_bf_traces)
for call in [
"rocJpegCreate",
"rocJpegStreamCreate",
"rocJpegStreamParse",
"rocJpegGetImageInfo",
"rocJpegDecode",
"rocJpegDestroy",
"rocJpegStreamDestroy",
]:
assert call in api_calls
def test_retired_correlation_ids(input_data):
data = input_data
sdk_data = data["rocprofiler-sdk-json-tool"]
def _sort_dict(inp):
return dict(sorted(inp.items()))
api_corr_ids = {}
for titr in ["hsa_api_traces", "hip_api_traces", "rocjpeg_api_traces"]:
for itr in sdk_data["buffer_records"][titr]:
corr_id = itr["correlation_id"]["internal"]
assert corr_id not in api_corr_ids.keys()
api_corr_ids[corr_id] = itr
alloc_corr_ids = {}
for titr in ["memory_allocations"]:
for itr in sdk_data["buffer_records"][titr]:
corr_id = itr["correlation_id"]["internal"]
assert corr_id not in alloc_corr_ids.keys()
alloc_corr_ids[corr_id] = itr
retired_corr_ids = {}
for itr in sdk_data["buffer_records"]["retired_correlation_ids"]:
corr_id = itr["internal_correlation_id"]
assert corr_id not in retired_corr_ids.keys()
retired_corr_ids[corr_id] = itr
api_corr_ids = _sort_dict(api_corr_ids)
alloc_corr_ids = _sort_dict(alloc_corr_ids)
retired_corr_ids = _sort_dict(retired_corr_ids)
for cid, itr in alloc_corr_ids.items():
assert cid in retired_corr_ids.keys()
retired_ts = retired_corr_ids[cid]["timestamp"]
end_ts = itr["end_timestamp"]
assert (retired_ts - end_ts) > 0, f"correlation-id: {cid}, data: {itr}"
for cid, itr in api_corr_ids.items():
assert cid in retired_corr_ids.keys()
retired_ts = retired_corr_ids[cid]["timestamp"]
end_ts = itr["end_timestamp"]
assert (retired_ts - end_ts) > 0, f"correlation-id: {cid}, data: {itr}"
assert len(api_corr_ids.keys()) == (len(retired_corr_ids.keys()))
if __name__ == "__main__":
exit_code = pytest.main(["-x", __file__] + sys.argv[1:])
sys.exit(exit_code)
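`test_rocjpeg_traces` above checks that every API call produces one enter callback (phase 1), one exit callback (phase 2), and exactly one buffer record. The sketch below isolates that bookkeeping; `count_phases` and all records are hypothetical stand-ins, not real tool output.

```python
def count_phases(callback_records):
    """Return (enter_count, exit_count, operation indices seen on enter)."""
    enter_count = 0
    exit_count = 0
    operations = []
    for record in callback_records:
        if record["phase"] == 1:  # enter phase
            enter_count += 1
            operations.append(record["operation"])
        elif record["phase"] == 2:  # exit phase
            exit_count += 1
    return enter_count, exit_count, operations


# Hypothetical data: two API calls, each with an enter and an exit callback.
op_names = ["rocJpegCreate", "rocJpegDecode"]
callbacks = [
    {"phase": 1, "operation": 0},
    {"phase": 2, "operation": 0},
    {"phase": 1, "operation": 1},
    {"phase": 2, "operation": 1},
]
buffers = [{"operation": 0}, {"operation": 1}]

enter_count, exit_count, operations = count_phases(callbacks)
# Buffer and callback tracing must agree on the number of calls.
assert enter_count == exit_count == len(buffers)
print([op_names[idx] for idx in operations])  # ['rocJpegCreate', 'rocJpegDecode']
```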
+2 -3
@@ -36,9 +36,8 @@ add_subdirectory(roctracer-roctx)
add_subdirectory(scratch-memory)
add_subdirectory(pc-sampling)
add_subdirectory(collection-period)
if(ROCPROFILER_BUILD_ROCDECODE_TESTS)
add_subdirectory(rocdecode-trace)
endif()
add_subdirectory(rocdecode-trace)
add_subdirectory(rocjpeg-trace)
if(TARGET att_decoder_testing)
add_subdirectory(advanced-thread-trace)
endif()
@@ -9,6 +9,7 @@ project(
VERSION 0.0.0)
find_package(rocprofiler-sdk REQUIRED)
find_package(rocDecode)
rocprofiler_configure_pytest_files(CONFIG pytest.ini COPY validate.py conftest.py)
@@ -18,23 +19,34 @@ string(REPLACE "LD_PRELOAD=" "ROCPROF_PRELOAD=" PRELOAD_ENV
set(rocdecode-tracing-env "${PRELOAD_ENV}")
set(ROCDECODE_VIDEO_FILE
"${ROCM_PATH}/share/rocdecode/video/AMD_driving_virtual_20-H265.265")
if(NOT EXISTS "${ROCDECODE_VIDEO_FILE}")
"${rocDecode_ROOT_DIR}/share/rocdecode/video/AMD_driving_virtual_20-H265.265")
if(TARGET rocdecode-demo AND NOT EXISTS "${ROCDECODE_VIDEO_FILE}")
message(
FATAL_ERROR
"Unable to find video file for rocdecode tests: ${ROCDECODE_VIDEO_FILE}")
endif()
add_test(
NAME rocprofv3-test-rocdecode-tracing-execute
COMMAND
$<TARGET_FILE:rocprofiler-sdk::rocprofv3> --rocdecode-trace -d
${CMAKE_CURRENT_BINARY_DIR}/%tag%-trace -o out --output-format json otf2 pftrace
csv --log-level env -- $<TARGET_FILE:rocdecode> -i ${ROCDECODE_VIDEO_FILE})
${CMAKE_CURRENT_BINARY_DIR}/%tag%-trace -o out --output-format json csv
--log-level env --
$<IF:$<TARGET_EXISTS:rocdecode-demo>,$<$<TARGET_EXISTS:rocdecode-demo>:$<TARGET_FILE:rocdecode-demo>>,rocdecode-demo>
-i ${ROCDECODE_VIDEO_FILE})
set_tests_properties(
rocprofv3-test-rocdecode-tracing-execute
PROPERTIES TIMEOUT 45 LABELS "integration-tests" ENVIRONMENT
"${rocdecode-tracing-env}" FAIL_REGULAR_EXPRESSION "threw an exception")
PROPERTIES TIMEOUT
45
LABELS
"integration-tests"
ENVIRONMENT
"${rocdecode-tracing-env}"
FAIL_REGULAR_EXPRESSION
"threw an exception"
DISABLED
$<NOT:$<TARGET_EXISTS:rocdecode-demo>>)
add_test(
NAME rocprofv3-test-rocdecode-tracing-validate
@@ -47,6 +59,13 @@ add_test(
set_tests_properties(
rocprofv3-test-rocdecode-tracing-validate
PROPERTIES TIMEOUT 45 LABELS "integration-tests" DEPENDS
rocprofv3-test-rocdecode-tracing-execute FAIL_REGULAR_EXPRESSION
"AssertionError")
PROPERTIES TIMEOUT
45
LABELS
"integration-tests"
DEPENDS
rocprofv3-test-rocdecode-tracing-execute
FAIL_REGULAR_EXPRESSION
"AssertionError"
DISABLED
$<NOT:$<TARGET_EXISTS:rocdecode-demo>>)
+16 -7
@@ -41,6 +41,8 @@ def pytest_addoption(parser):
@pytest.fixture
def json_data(request):
filename = request.config.getoption("--json-input")
if not os.path.isfile(filename):
return pytest.skip("rocdecode tracing unavailable")
with open(filename, "r") as inp:
return dotdict(collapse_dict_list(json.load(inp)))
@@ -49,23 +51,30 @@ def json_data(request):
def csv_data(request):
filename = request.config.getoption("--csv-input")
data = []
with open(filename, "r") as inp:
reader = csv.DictReader(inp)
for row in reader:
data.append(row)
if not os.path.isfile(filename):
# The CSV file was not generated because the test responsible
# for producing it was skipped or failed, so skip this test
# as well.
return pytest.skip("rocdecode tracing unavailable")
else:
with open(filename, "r") as inp:
reader = csv.DictReader(inp)
for row in reader:
data.append(row)
return data
@pytest.fixture
def otf2_data(request):
filename = request.config.getoption("--otf2-input")
if not os.path.exists(filename):
raise FileExistsError(f"{filename} does not exist")
if not os.path.isfile(filename):
return pytest.skip("rocdecode tracing unavailable")
return OTF2Reader(filename).read()[0]
@pytest.fixture
def pftrace_data(request):
filename = request.config.getoption("--pftrace-input")
if not os.path.isfile(filename):
return pytest.skip("rocdecode tracing unavailable")
return PerfettoReader(filename).read()[0]
+22 -2
@@ -32,6 +32,9 @@ def test_rocdeocde(json_data):
buffer_records = data["buffer_records"]
rocdecode_data = buffer_records["rocdecode_api"]
# If rocDecode tracing is not supported, end early
if len(rocdecode_data) == 0:
return pytest.skip("rocdecode tracing unavailable")
_, bf_op_names = get_operation(data, "ROCDECODE_API")
@@ -62,6 +65,9 @@ def test_rocdeocde(json_data):
def test_csv_data(csv_data):
# If rocDecode tracing is not supported, end early
if len(csv_data) == 0:
return pytest.skip("rocdecode tracing unavailable")
assert len(csv_data) > 0, "Expected non-empty csv data"
api_calls = []
@@ -116,20 +122,34 @@ def test_csv_data(csv_data):
def test_perfetto_data(pftrace_data, json_data):
import rocprofiler_sdk.tests.rocprofv3 as rocprofv3
# If rocDecode tracing is not supported, end early
if (
pftrace_data is None
or len(json_data["rocprofiler-sdk-tool"]["buffer_records"]["rocdecode_api"]) == 0
):
return pytest.skip("rocdecode tracing unavailable")
rocprofv3.test_perfetto_data(
pftrace_data,
json_data,
("hip", "hsa", "memory_allocation", "rocdecode_api"),
("rocdecode_api",),
)
def test_otf2_data(otf2_data, json_data):
import rocprofiler_sdk.tests.rocprofv3 as rocprofv3
# If rocDecode tracing is not supported, end early
if (
otf2_data is None
or len(json_data["rocprofiler-sdk-tool"]["buffer_records"]["rocdecode_api"]) == 0
):
return pytest.skip("rocdecode tracing unavailable")
rocprofv3.test_otf2_data(
otf2_data,
json_data,
("hip", "hsa", "memory_allocation", "rocdecode_api"),
("rocdecode_api",),
)
@@ -0,0 +1,72 @@
#
#
#
cmake_minimum_required(VERSION 3.21.0 FATAL_ERROR)
project(
rocprofiler-tests-rocprofv3-rocjpeg-tracing
LANGUAGES CXX
VERSION 0.0.0)
find_package(rocprofiler-sdk REQUIRED)
find_package(rocJPEG)
rocprofiler_configure_pytest_files(CONFIG pytest.ini COPY validate.py conftest.py)
string(REPLACE "LD_PRELOAD=" "ROCPROF_PRELOAD=" PRELOAD_ENV
"${ROCPROFILER_MEMCHECK_PRELOAD_ENV}")
set(rocjpeg-tracing-env "${PRELOAD_ENV}")
set(ROCJPEG_IMAGE_DIR "${ROCM_PATH}/share/rocjpeg/images")
if(TARGET rocjpeg-demo AND NOT EXISTS "${ROCJPEG_IMAGE_DIR}")
message(
FATAL_ERROR
"Unable to find image directory for rocjpeg tests: ${ROCJPEG_IMAGE_DIR}")
endif()
# The CI sanitizer run fails with: No target "rocjpeg-demo". The generator expressions
# below fall back to a placeholder command and disable the tests when the target is missing.
add_test(
NAME rocprofv3-test-rocjpeg-tracing-execute
COMMAND
$<TARGET_FILE:rocprofiler-sdk::rocprofv3> --rocjpeg-trace -d
${CMAKE_CURRENT_BINARY_DIR}/%tag%-trace -o out --output-format json csv
--log-level env --
$<IF:$<TARGET_EXISTS:rocjpeg-demo>,$<$<TARGET_EXISTS:rocjpeg-demo>:$<TARGET_FILE:rocjpeg-demo>>,rocjpeg-demo>
-i ${ROCJPEG_IMAGE_DIR})
set_tests_properties(
rocprofv3-test-rocjpeg-tracing-execute
PROPERTIES TIMEOUT
45
LABELS
"integration-tests"
ENVIRONMENT
"${rocjpeg-tracing-env}"
FAIL_REGULAR_EXPRESSION
"threw an exception"
DISABLED
$<NOT:$<TARGET_EXISTS:rocjpeg-demo>>)
add_test(
NAME rocprofv3-test-rocjpeg-tracing-validate
COMMAND
${Python3_EXECUTABLE} ${CMAKE_CURRENT_BINARY_DIR}/validate.py --json-input
${CMAKE_CURRENT_BINARY_DIR}/rocjpeg-trace/out_results.json --otf2-input
${CMAKE_CURRENT_BINARY_DIR}/rocjpeg-trace/out_results.otf2 --pftrace-input
${CMAKE_CURRENT_BINARY_DIR}/rocjpeg-trace/out_results.pftrace --csv-input
${CMAKE_CURRENT_BINARY_DIR}/rocjpeg-trace/out_rocjpeg_api_trace.csv)
set_tests_properties(
rocprofv3-test-rocjpeg-tracing-validate
PROPERTIES TIMEOUT
45
LABELS
"integration-tests"
DEPENDS
rocprofv3-test-rocjpeg-tracing-execute
FAIL_REGULAR_EXPRESSION
"AssertionError"
DISABLED
$<NOT:$<TARGET_EXISTS:rocjpeg-demo>>)
@@ -0,0 +1,80 @@
#!/usr/bin/env python3
import csv
import json
import os
import pytest
from rocprofiler_sdk.pytest_utils.dotdict import dotdict
from rocprofiler_sdk.pytest_utils import collapse_dict_list
from rocprofiler_sdk.pytest_utils.perfetto_reader import PerfettoReader
from rocprofiler_sdk.pytest_utils.otf2_reader import OTF2Reader
def pytest_addoption(parser):
parser.addoption(
"--json-input",
action="store",
default="rocjpeg-tracing/out_results.json",
help="Input JSON",
)
parser.addoption(
"--otf2-input",
action="store",
default="rocjpeg-tracing/out_results.otf2",
help="Input OTF2",
)
parser.addoption(
"--pftrace-input",
action="store",
default="rocjpeg-tracing/out_results.pftrace",
help="Input pftrace file",
)
parser.addoption(
"--csv-input",
action="store",
default="rocjpeg-tracing/out_rocjpeg_api_trace.csv",
help="Input CSV",
)
@pytest.fixture
def json_data(request):
filename = request.config.getoption("--json-input")
if not os.path.isfile(filename):
return pytest.skip("rocjpeg tracing unavailable")
with open(filename, "r") as inp:
return dotdict(collapse_dict_list(json.load(inp)))
@pytest.fixture
def csv_data(request):
filename = request.config.getoption("--csv-input")
data = []
if not os.path.isfile(filename):
# The CSV file was not generated because the test responsible
# for producing it was skipped or failed, so skip this test
# as well.
return pytest.skip("rocjpeg tracing unavailable")
else:
with open(filename, "r") as inp:
reader = csv.DictReader(inp)
for row in reader:
data.append(row)
return data
@pytest.fixture
def otf2_data(request):
filename = request.config.getoption("--otf2-input")
if not os.path.isfile(filename):
return pytest.skip("rocjpeg tracing unavailable")
return OTF2Reader(filename).read()[0]
@pytest.fixture
def pftrace_data(request):
filename = request.config.getoption("--pftrace-input")
if not os.path.isfile(filename):
return pytest.skip("rocjpeg tracing unavailable")
return PerfettoReader(filename).read()[0]
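The `csv_data` fixture above loads the trace with `csv.DictReader`, so each row becomes a dict keyed by the header line. The sketch below shows that shape and the kind of per-row checks `test_csv_data` performs in this change; the column names are taken from that test, while `load_rows`, `missing_columns`, and the sample row are made up for illustration.

```python
import csv
import io

# Column names asserted by test_csv_data in this change.
REQUIRED_COLUMNS = (
    "Domain",
    "Function",
    "Process_Id",
    "Thread_Id",
    "Correlation_Id",
    "Start_Timestamp",
    "End_Timestamp",
)


def load_rows(stream):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(stream))


def missing_columns(row):
    """Return the required column names that are absent from a row."""
    return [name for name in REQUIRED_COLUMNS if name not in row]


# Hypothetical one-row trace; real values come from rocprofv3 output.
sample = io.StringIO(
    "Domain,Function,Process_Id,Thread_Id,Correlation_Id,"
    "Start_Timestamp,End_Timestamp\n"
    "ROCJPEG_API,rocJpegCreate,100,100,1,10,20\n"
)

rows = load_rows(sample)
for row in rows:
    assert not missing_columns(row)
    assert row["Domain"] == "ROCJPEG_API"
    # DictReader yields strings, so numeric fields need an explicit conversion.
    assert int(row["Start_Timestamp"]) < int(row["End_Timestamp"])
print(rows[0]["Function"])  # rocJpegCreate
```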
@@ -0,0 +1,5 @@
[pytest]
addopts = --durations=20 -rA -s -vv
testpaths = validate.py
pythonpath = @ROCPROFILER_SDK_TESTS_BINARY_DIR@/pytest-packages
+153
@@ -0,0 +1,153 @@
#!/usr/bin/env python3
import sys
import pytest
import json
from collections import defaultdict
# helper function
def node_exists(name, data, min_len=1):
assert name in data
assert data[name] is not None
if isinstance(data[name], (list, tuple, dict, set)):
assert len(data[name]) >= min_len
def get_operation(record, kind_name, op_name=None):
for idx, itr in enumerate(record["strings"]["buffer_records"]):
if kind_name == itr["kind"]:
if op_name is None:
return idx, itr["operations"]
else:
for oidx, oname in enumerate(itr["operations"]):
if op_name == oname:
return oidx
return None
def test_rocjpeg(json_data):
data = json_data["rocprofiler-sdk-tool"]
buffer_records = data["buffer_records"]
rocjpeg_data = buffer_records["rocjpeg_api"]
# If rocJPEG tracing is not supported, end early
if len(rocjpeg_data) == 0:
return pytest.skip("rocjpeg tracing unavailable")
_, bf_op_names = get_operation(data, "ROCJPEG_API")
assert len(bf_op_names) == 9
rocjpeg_reported_agent_ids = set()
# check buffering data
for node in rocjpeg_data:
assert "size" in node
assert "kind" in node
assert "operation" in node
assert "correlation_id" in node
assert "end_timestamp" in node
assert "start_timestamp" in node
assert "thread_id" in node
assert node.size > 0
assert node.thread_id > 0
assert node.start_timestamp > 0
assert node.end_timestamp > 0
assert node.start_timestamp < node.end_timestamp
assert data.strings.buffer_records[node.kind].kind == "ROCJPEG_API"
assert (
data.strings.buffer_records[node.kind].operations[node.operation]
in bf_op_names
)


def test_csv_data(csv_data):
    # If rocJPEG tracing is not supported, end early
    if len(csv_data) <= 2:
        return pytest.skip("rocjpeg tracing unavailable")

    assert len(csv_data) > 0, "Expected non-empty csv data"

    api_calls = []

    for row in csv_data:
        assert "Domain" in row, "'Domain' was not present in csv data for rocjpeg-trace"
        assert (
            "Function" in row
        ), "'Function' was not present in csv data for rocjpeg-trace"
        assert (
            "Process_Id" in row
        ), "'Process_Id' was not present in csv data for rocjpeg-trace"
        assert (
            "Thread_Id" in row
        ), "'Thread_Id' was not present in csv data for rocjpeg-trace"
        assert (
            "Correlation_Id" in row
        ), "'Correlation_Id' was not present in csv data for rocjpeg-trace"
        assert (
            "Start_Timestamp" in row
        ), "'Start_Timestamp' was not present in csv data for rocjpeg-trace"
        assert (
            "End_Timestamp" in row
        ), "'End_Timestamp' was not present in csv data for rocjpeg-trace"

        api_calls.append(row["Function"])
        assert row["Domain"] == "ROCJPEG_API"
        assert int(row["Process_Id"]) > 0
        assert int(row["Thread_Id"]) > 0
        assert int(row["Start_Timestamp"]) > 0
        assert int(row["End_Timestamp"]) > 0
        assert int(row["Start_Timestamp"]) < int(row["End_Timestamp"])

    # Every expected rocJPEG entry point must appear at least once in the trace
    for call in [
        "rocJpegCreate",
        "rocJpegStreamCreate",
        "rocJpegStreamParse",
        "rocJpegGetImageInfo",
        "rocJpegDecode",
        "rocJpegDestroy",
        "rocJpegStreamDestroy",
    ]:
        assert call in api_calls


def test_perfetto_data(pftrace_data, json_data):
    import rocprofiler_sdk.tests.rocprofv3 as rocprofv3

    # If rocJPEG tracing is not supported, end early
    if (
        pftrace_data is None
        or len(json_data["rocprofiler-sdk-tool"]["buffer_records"]["rocjpeg_api"]) == 0
    ):
        return pytest.skip("rocjpeg tracing unavailable")

    rocprofv3.test_perfetto_data(
        pftrace_data,
        json_data,
        ("rocjpeg_api",),
    )


def test_otf2_data(otf2_data, json_data):
    import rocprofiler_sdk.tests.rocprofv3 as rocprofv3

    # If rocJPEG tracing is not supported, end early
    if (
        otf2_data is None
        or len(json_data["rocprofiler-sdk-tool"]["buffer_records"]["rocjpeg_api"]) == 0
    ):
        return pytest.skip("rocjpeg tracing unavailable")

    rocprofv3.test_otf2_data(
        otf2_data,
        json_data,
        ("rocjpeg_api",),
    )


if __name__ == "__main__":
    exit_code = pytest.main(["-x", __file__] + sys.argv[1:])
    sys.exit(exit_code)
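
Note on the `get_operation` helper above: it has three distinct return shapes (a tuple, a bare index, or `None`), which is easy to miss when reading the flattened diff. The sketch below exercises the same lookup on a small hand-built record. The `record` dictionary is hypothetical and only mimics the `strings.buffer_records` layout that the tests read; real tool output contains more kinds and operations.

```python
def get_operation(record, kind_name, op_name=None):
    """Same lookup logic as the helper in validate.py."""
    for idx, itr in enumerate(record["strings"]["buffer_records"]):
        if kind_name == itr["kind"]:
            if op_name is None:
                # No operation requested: return the kind index and its operation names
                return idx, itr["operations"]
            for oidx, oname in enumerate(itr["operations"]):
                if op_name == oname:
                    # Operation requested and found: return only its index
                    return oidx
    # Kind or operation not found
    return None


# Hypothetical sample data (not real tool output)
record = {
    "strings": {
        "buffer_records": [
            {"kind": "HIP_RUNTIME_API", "operations": ["hipMalloc"]},
            {"kind": "ROCJPEG_API", "operations": ["rocJpegCreate", "rocJpegDecode"]},
        ]
    }
}

kind_idx, op_names = get_operation(record, "ROCJPEG_API")
decode_idx = get_operation(record, "ROCJPEG_API", "rocJpegDecode")
missing = get_operation(record, "ROCJPEG_API", "notAnOperation")
```

Because a failed lookup returns `None`, unpacking the result as `_, bf_op_names = ...` would raise a `TypeError` if the kind were absent from the string table. In `test_rocjpeg` the early skip on empty `rocjpeg_data` runs before that unpacking.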
@@ -414,6 +414,23 @@ struct rocdecode_api_callback_record_t
}
};
struct rocjpeg_api_callback_record_t
{
    uint64_t timestamp = 0;
    rocprofiler_callback_tracing_record_t record = {};
    rocprofiler_callback_tracing_rocjpeg_api_data_t payload = {};
    callback_arg_array_t args = {};

    template <typename ArchiveT>
    void save(ArchiveT& ar) const
    {
        ar(cereal::make_nvp("timestamp", timestamp));
        cereal::save(ar, record);
        ar(cereal::make_nvp("payload", payload));
        serialize_args(ar, args);
    }
};
struct ompt_callback_record_t
{
uint64_t timestamp = 0;
@@ -573,6 +590,7 @@ auto memory_copy_cb_records = std::deque<memory_copy_callback_record_t>{}
auto memory_allocation_cb_records = std::deque<memory_allocation_callback_record_t>{};
auto rccl_api_cb_records = std::deque<rccl_api_callback_record_t>{};
auto rocdecode_api_cb_records = std::deque<rocdecode_api_callback_record_t>{};
auto rocjpeg_api_cb_records = std::deque<rocjpeg_api_callback_record_t>{};
auto ompt_cb_records = std::deque<ompt_callback_record_t>{};
int
@@ -856,6 +874,19 @@ tool_tracing_callback(rocprofiler_callback_tracing_record_t record,
rocdecode_api_cb_records.emplace_back(
rocdecode_api_callback_record_t{ts, record, *data, std::move(args)});
}
else if(record.kind == ROCPROFILER_CALLBACK_TRACING_ROCJPEG_API)
{
auto* data = static_cast<rocprofiler_callback_tracing_rocjpeg_api_data_t*>(record.payload);
auto args = callback_arg_array_t{};
if(record.phase == ROCPROFILER_CALLBACK_PHASE_EXIT)
rocprofiler_iterate_callback_tracing_kind_operation_args(
record, save_args, record.phase, &args);
static auto _mutex = std::mutex{};
auto _lk = std::unique_lock<std::mutex>{_mutex};
rocjpeg_api_cb_records.emplace_back(
rocjpeg_api_callback_record_t{ts, record, *data, std::move(args)});
}
else
{
throw std::runtime_error{"unsupported callback kind"};
@@ -877,6 +908,7 @@ auto corr_id_retire_records =
std::deque<rocprofiler_buffer_tracing_correlation_id_retirement_record_t>{};
auto rccl_api_bf_records = std::deque<rocprofiler_buffer_tracing_rccl_api_record_t>{};
auto rocdecode_api_bf_records = std::deque<rocprofiler_buffer_tracing_rocdecode_api_record_t>{};
auto rocjpeg_api_bf_records = std::deque<rocprofiler_buffer_tracing_rocjpeg_api_record_t>{};
auto ompt_bf_records = std::deque<rocprofiler_buffer_tracing_ompt_record_t>{};
void
@@ -1011,6 +1043,13 @@ tool_tracing_buffered(rocprofiler_context_id_t /*context*/,
rocdecode_api_bf_records.emplace_back(*record);
}
else if(header->kind == ROCPROFILER_BUFFER_TRACING_ROCJPEG_API)
{
auto* record =
static_cast<rocprofiler_buffer_tracing_rocjpeg_api_record_t*>(header->payload);
rocjpeg_api_bf_records.emplace_back(*record);
}
else
{
throw std::runtime_error{
@@ -1111,6 +1150,8 @@ rocprofiler_context_id_t runtime_init_callback_ctx = {};
rocprofiler_context_id_t runtime_init_buffered_ctx = {};
rocprofiler_context_id_t rocdecode_api_callback_ctx = {0};
rocprofiler_context_id_t rocdecode_api_buffered_ctx = {0};
rocprofiler_context_id_t rocjpeg_api_callback_ctx = {0};
rocprofiler_context_id_t rocjpeg_api_buffered_ctx = {0};
// buffers
rocprofiler_buffer_id_t runtime_init_buffered_buffer = {};
@@ -1126,6 +1167,7 @@ rocprofiler_buffer_id_t scratch_memory_buffer = {};
rocprofiler_buffer_id_t corr_id_retire_buffer = {};
rocprofiler_buffer_id_t rccl_api_buffered_buffer = {};
rocprofiler_buffer_id_t rocdecode_api_buffer = {};
rocprofiler_buffer_id_t rocjpeg_api_buffer = {};
rocprofiler_buffer_id_t ompt_buffered_buffer = {};
auto contexts = std::unordered_map<std::string_view, rocprofiler_context_id_t*>{
@@ -1153,10 +1195,12 @@ auto contexts = std::unordered_map<std::string_view, rocprofiler_context_id_t*>{
{"RCCL_API_BUFFERED", &rccl_api_buffered_ctx},
{"ROCDECODE_API_CALLBACK", &rocdecode_api_callback_ctx},
{"ROCDECODE_API_BUFFERED", &rocdecode_api_buffered_ctx},
{"ROCJPEG_API_CALLBACK", &rocjpeg_api_callback_ctx},
{"ROCJPEG_API_BUFFERED", &rocjpeg_api_buffered_ctx},
{"OMPT_BUFFERED", &ompt_buffered_ctx},
};
auto buffers = std::array<rocprofiler_buffer_id_t*, 14>{&runtime_init_buffered_buffer,
auto buffers = std::array<rocprofiler_buffer_id_t*, 15>{&runtime_init_buffered_buffer,
&hsa_api_buffered_buffer,
&hip_api_buffered_buffer,
&marker_api_buffered_buffer,
@@ -1169,7 +1213,8 @@ auto buffers = std::array<rocprofiler_buffer_id_t*, 14>{&runtime_init_buffered_b
&corr_id_retire_buffer,
&rccl_api_buffered_buffer,
&ompt_buffered_buffer,
&rocdecode_api_buffer};
&rocdecode_api_buffer,
&rocjpeg_api_buffer};
auto agents = std::vector<rocprofiler_agent_t>{};
auto agents_map = std::unordered_map<rocprofiler_agent_id_t, rocprofiler_agent_t>{};
@@ -1344,6 +1389,15 @@ tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
nullptr),
"rocdecode api callback tracing service configure");
ROCPROFILER_CALL(
rocprofiler_configure_callback_tracing_service(rocjpeg_api_callback_ctx,
ROCPROFILER_CALLBACK_TRACING_ROCJPEG_API,
nullptr,
0,
tool_tracing_callback,
nullptr),
"rocjpeg api callback tracing service configure");
ROCPROFILER_CALL(
rocprofiler_configure_callback_tracing_service(ompt_callback_ctx,
ROCPROFILER_CALLBACK_TRACING_OMPT,
@@ -1473,6 +1527,15 @@ tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
&rocdecode_api_buffer),
"buffer creation");
ROCPROFILER_CALL(rocprofiler_create_buffer(rocjpeg_api_buffered_ctx,
buffer_size,
watermark,
ROCPROFILER_BUFFER_POLICY_LOSSLESS,
tool_tracing_buffered,
tool_data,
&rocjpeg_api_buffer),
"buffer creation");
ROCPROFILER_CALL(rocprofiler_create_buffer(ompt_buffered_ctx,
buffer_size,
watermark,
@@ -1605,6 +1668,14 @@ tool_init(rocprofiler_client_finalize_t fini_func, void* tool_data)
rocdecode_api_buffer),
"buffer tracing service for rocdecode api configure");
ROCPROFILER_CALL(
rocprofiler_configure_buffer_tracing_service(rocjpeg_api_buffered_ctx,
ROCPROFILER_BUFFER_TRACING_ROCJPEG_API,
nullptr,
0,
rocjpeg_api_buffer),
"buffer tracing service for rocjpeg api configure");
ROCPROFILER_CALL(
rocprofiler_configure_buffer_tracing_service(
ompt_buffered_ctx, ROCPROFILER_BUFFER_TRACING_OMPT, nullptr, 0, ompt_buffered_buffer),
@@ -1775,7 +1846,9 @@ tool_fini(void* tool_data)
<< ", ompt_bf_records=" << ompt_bf_records.size()
<< ", counter_collection_value_records=" << counter_collection_bf_records.size()
<< ", rocdecode_api_callback_records=" << rocdecode_api_cb_records.size()
<< ", rocdecode_api_bf_records=" << rocdecode_api_bf_records.size() << "...\n"
<< ", rocdecode_api_bf_records=" << rocdecode_api_bf_records.size()
<< ", rocjpeg_api_callback_records=" << rocjpeg_api_cb_records.size()
<< ", rocjpeg_api_bf_records=" << rocjpeg_api_bf_records.size() << "...\n"
<< std::flush;
auto* _call_stack = static_cast<call_stack_t*>(tool_data);
@@ -1872,6 +1945,7 @@ write_json(call_stack_t* _call_stack)
json_ar(cereal::make_nvp("memory_copies", memory_copy_cb_records));
json_ar(cereal::make_nvp("memory_allocations", memory_allocation_cb_records));
json_ar(cereal::make_nvp("rocdecode_api_traces", rocdecode_api_cb_records));
json_ar(cereal::make_nvp("rocjpeg_api_traces", rocjpeg_api_cb_records));
} catch(std::exception& e)
{
std::cerr << "[" << getpid() << "][" << __FUNCTION__
@@ -1899,6 +1973,7 @@ write_json(call_stack_t* _call_stack)
json_ar(cereal::make_nvp("retired_correlation_ids", corr_id_retire_records));
json_ar(cereal::make_nvp("counter_collection", counter_collection_bf_records));
json_ar(cereal::make_nvp("rocdecode_api_traces", rocdecode_api_bf_records));
json_ar(cereal::make_nvp("rocjpeg_api_traces", rocjpeg_api_bf_records));
} catch(std::exception& e)
{
std::cerr << "[" << getpid() << "][" << __FUNCTION__
@@ -1972,6 +2047,8 @@ write_perfetto()
tids.emplace(itr.thread_id);
for(auto itr : rocdecode_api_bf_records)
tids.emplace(itr.thread_id);
for(auto itr : rocjpeg_api_bf_records)
tids.emplace(itr.thread_id);
for(auto itr : memory_copy_bf_records)
{
@@ -2266,6 +2343,47 @@ write_perfetto()
itr.end_timestamp);
}
for(auto itr : rocjpeg_api_bf_records)
{
auto name = buffer_names.at(itr.kind, itr.operation);
auto& track = thread_tracks.at(itr.thread_id);
auto _args = callback_arg_array_t{};
auto ritr = std::find_if(
rocjpeg_api_cb_records.begin(),
rocjpeg_api_cb_records.end(),
[&itr](const auto& citr) {
return (citr.record.correlation_id.internal == itr.correlation_id.internal &&
!citr.args.empty());
});
if(ritr != rocjpeg_api_cb_records.end()) _args = ritr->args;
TRACE_EVENT_BEGIN(sdk::perfetto_category<sdk::category::rocjpeg_api>::name,
::perfetto::StaticString(name.data()),
track,
itr.start_timestamp,
::perfetto::Flow::ProcessScoped(itr.correlation_id.internal),
"begin_ns",
itr.start_timestamp,
"tid",
itr.thread_id,
"kind",
itr.kind,
"operation",
itr.operation,
"corr_id",
itr.correlation_id.internal,
[&](::perfetto::EventContext ctx) {
for(const auto& aitr : _args)
sdk::add_perfetto_annotation(ctx, aitr.first, aitr.second);
});
TRACE_EVENT_END(sdk::perfetto_category<sdk::category::rocjpeg_api>::name,
track,
itr.end_timestamp,
"end_ns",
itr.end_timestamp);
}
for(auto itr : ompt_bf_records)
{
auto name = buffer_names.at(itr.kind, itr.operation);