From eece5aee73f3767be158e24d7a82adaefd1b9593 Mon Sep 17 00:00:00 2001
From: Aryan Salmanpour
Date: Mon, 8 Jul 2024 13:29:01 -0400
Subject: [PATCH] Update documentation for using rocJPEG (#37)

* Update documentation for using rocjpeg
* clean up
* Update the jpeg table
* update the sample code in section 11
* Update Readme

[ROCm/rocjpeg commit: d9df1e04b585718f48a6322ec2461f060abadfdb]
---
 projects/rocjpeg/README.md                    |  10 +-
 .../docker/rocJPEG-on-ubuntu20.dockerfile     |   4 +-
 .../docker/rocJPEG-on-ubuntu22.dockerfile     |   4 +-
 .../rocjpeg/docs/how-to/using-rocjpeg.rst     | 159 +++++++++++++++++-
 projects/rocjpeg/docs/install/install.rst     |  22 +--
 projects/rocjpeg/samples/README.md            |   4 +
 6 files changed, 179 insertions(+), 24 deletions(-)

diff --git a/projects/rocjpeg/README.md b/projects/rocjpeg/README.md
index 67457c0780..5429c9fb0e 100644
--- a/projects/rocjpeg/README.md
+++ b/projects/rocjpeg/README.md
@@ -207,9 +207,9 @@ page.
   * Ubuntu - `20.04` / `22.04`
   * RHEL - `8` / `9`
 * ROCm:
-  * rocm-core - `6.2.0.60200-14004`
-  * amdgpu-core - `6.2.60200-1775523`
+  * rocm-core - `6.3.0.60300-14317`
+  * amdgpu-core - `6.3.60300-1798298`
 * libva-dev - `2.7.0-2` / `2.14.0-1`
-* mesa-amdgpu-va-drivers - `24.1.0.60200-1775523`
-* mesa-amdgpu-dri-drivers - `24.1.0.60200-1775523`
-* rocJPEG Setup Script - `V1.2.0`
+* mesa-amdgpu-va-drivers - `24.2.0.60300-1798298`
+* mesa-amdgpu-dri-drivers - `24.2.0.60300-1798298`
+* rocJPEG Setup Script - `V2.0.0`
\ No newline at end of file
diff --git a/projects/rocjpeg/docker/rocJPEG-on-ubuntu20.dockerfile b/projects/rocjpeg/docker/rocJPEG-on-ubuntu20.dockerfile
index 5c1aed2c24..82d1b26be3 100644
--- a/projects/rocjpeg/docker/rocJPEG-on-ubuntu20.dockerfile
+++ b/projects/rocjpeg/docker/rocJPEG-on-ubuntu20.dockerfile
@@ -7,8 +7,8 @@ RUN DEBIAN_FRONTEND=noninteractive apt-get -y install gcc g++ cmake pkg-config g
 
 # install ROCm
 RUN DEBIAN_FRONTEND=noninteractive apt-get -y install initramfs-tools libnuma-dev wget keyboard-configuration && \
-        wget https://repo.radeon.com/amdgpu-install/6.1/ubuntu/focal/amdgpu-install_6.1.60100-1_all.deb && \
-        sudo apt-get install ./amdgpu-install_6.1.60100-1_all.deb && \
+        wget https://repo.radeon.com/amdgpu-install/6.3/ubuntu/focal/amdgpu-install_6.3.60100-1_all.deb && \
+        sudo apt-get install ./amdgpu-install_6.3.60100-1_all.deb && \
         sudo amdgpu-install -y --usecase=rocm
 
 WORKDIR /workspace
diff --git a/projects/rocjpeg/docker/rocJPEG-on-ubuntu22.dockerfile b/projects/rocjpeg/docker/rocJPEG-on-ubuntu22.dockerfile
index fd2f4124ae..3cfab872d0 100644
--- a/projects/rocjpeg/docker/rocJPEG-on-ubuntu22.dockerfile
+++ b/projects/rocjpeg/docker/rocJPEG-on-ubuntu22.dockerfile
@@ -7,8 +7,8 @@ RUN DEBIAN_FRONTEND=noninteractive apt-get -y install gcc g++ cmake pkg-config g
 
 # install ROCm
 RUN DEBIAN_FRONTEND=noninteractive apt-get -y install initramfs-tools libnuma-dev wget keyboard-configuration && \
-        wget https://repo.radeon.com/amdgpu-install/6.1/ubuntu/jammy/amdgpu-install_6.1.60100-1_all.deb && \
-        sudo apt-get install ./amdgpu-install_6.1.60100-1_all.deb && \
+        wget https://repo.radeon.com/amdgpu-install/6.3/ubuntu/jammy/amdgpu-install_6.3.60100-1_all.deb && \
+        sudo apt-get install ./amdgpu-install_6.3.60100-1_all.deb && \
         sudo amdgpu-install -y --usecase=rocm
 
 WORKDIR /workspace
diff --git a/projects/rocjpeg/docs/how-to/using-rocjpeg.rst b/projects/rocjpeg/docs/how-to/using-rocjpeg.rst
index 1ac02162b8..c60db2eb1b 100644
--- a/projects/rocjpeg/docs/how-to/using-rocjpeg.rst
+++ b/projects/rocjpeg/docs/how-to/using-rocjpeg.rst
@@ -177,17 +177,168 @@ the required size for the output buffers for a single decode JPEG. To optimally
     "ROCJPEG_OUTPUT_Y", "Any of the supported chroma subsampling", "destination.pitch[0] = widths[0]", "destination.channel[0] = destination.pitch[0] * heights[0]"
     "ROCJPEG_OUTPUT_RGB", "Any of the supported chroma subsampling", "destination.pitch[0] = widths[0] * 3", "destination.channel[0] = destination.pitch[0] * heights[0]"
     "ROCJPEG_OUTPUT_RGB_PLANAR", "Any of the supported chroma subsampling", "destination.pitch[c] = widths[c] for c = 0, 1, 2", "destination.channel[c] = destination.pitch[c] * heights[c] for c = 0, 1, 2"
-7. Destroy the decoder
+
+7. Decode a batch of JPEG streams
+====================================================
+The ``rocJpegDecodeBatched()`` function decodes a batch of JPEG images using the rocJPEG library.
+
+Below is the signature of the ``rocJpegDecodeBatched()`` function:
+
+.. code:: cpp
+
+    RocJpegStatus rocJpegDecodeBatched(
+        RocJpegHandle handle,
+        RocJpegStreamHandle *jpeg_stream_handles,
+        int batch_size,
+        const RocJpegDecodeParams *decode_params,
+        RocJpegImage *destinations);
+
+The ``rocJpegDecodeBatched()`` function takes the following arguments:
+
+* ``handle``: The rocJPEG handle.
+* ``jpeg_stream_handles``: An array of rocJPEG stream handles, each representing a JPEG image.
+* ``batch_size``: The number of images in the batch.
+* ``decode_params``: The decode parameters for the JPEG images.
+* ``destinations``: An array of rocJPEG images to store the decoded images.
+
+To use the ``rocJpegDecodeBatched()`` function, provide the rocJPEG handle, the stream handles, the decode parameters, and the destination images. The function decodes the batch of JPEG images and stores the decoded images in the ``destinations`` array.
+Remember to allocate device memory for each channel of each destination image before calling ``rocJpegDecodeBatched()``. The API then writes the decoded images to the destination images in the output format requested in ``RocJpegDecodeParams``.
+
+8. Destroy the decoder
 ====================================================
 
 You must call the ``rocJpegDestroy()`` to destroy the session and free up resources.
 
-8. Destroy the JPEG stream handle
+9. Destroy the JPEG stream handle
 ====================================================
 
 You must call the ``rocJpegStreamDestroy()`` to release the stream parser object and resources.
 
-9. Get Error name
+10. Get Error name
 ====================================================
 
-You can call ``rocJpegGetErrorName`` to retrieve the name of the specified error code in text form returned from rocJPEG APIs.
\ No newline at end of file
+You can call ``rocJpegGetErrorName`` to retrieve the name of the specified error code in text form returned from rocJPEG APIs.
+
+11. Sample code snippet for decoding a JPEG stream using the rocJPEG APIs
+=========================================================================
+
+The following code snippet demonstrates how to decode a JPEG stream using the rocJPEG library.
+First, the code reads the JPEG image file and stores the data in a vector. Then, it initializes the rocJPEG handle using the ``rocJpegCreate()`` function. If the handle creation is successful, it proceeds to create a JPEG stream using the ``rocJpegStreamCreate()`` function.
+Next, the code parses the JPEG stream by calling the ``rocJpegStreamParse()`` function with the JPEG data and its size. If the parsing is successful, it retrieves the image information using the ``rocJpegGetImageInfo()`` function.
+After obtaining the image information, the code allocates HIP device memory for the decoded image using the ``RocJpegImage`` structure. It sets the channel and pitch values based on the image width and height.
+Finally, the code decodes the JPEG stream by calling the ``rocJpegDecode()`` function with the rocJPEG handle, the stream handle, the decode parameters, and the decoded image structure. If the decoding is successful, the decoded image can be further processed or displayed.
+
+.. code:: cpp
+
+    #include <iostream>
+    #include <fstream>
+    #include <vector>
+    #include "rocjpeg.h"
+
+    int main() {
+        // Read the JPEG image file
+        std::ifstream input("mug_420.jpg", std::ios::in | std::ios::binary | std::ios::ate);
+
+        // Get the JPEG image file size
+        std::streamsize file_size = input.tellg();
+        input.seekg(0, std::ios::beg);
+
+        std::vector<char> file_data;
+        // resize if buffer is too small
+        if (file_data.size() < file_size) {
+            file_data.resize(file_size);
+        }
+        // Read the JPEG stream
+        if (!input.read(file_data.data(), file_size)) {
+            std::cerr << "ERROR: cannot read from file" << std::endl;
+            return EXIT_FAILURE;
+        }
+
+        // Initialize rocJPEG
+        RocJpegHandle handle;
+        RocJpegStatus status = rocJpegCreate(ROCJPEG_BACKEND_HARDWARE, 0, &handle);
+        if (status != ROCJPEG_STATUS_SUCCESS) {
+            std::cerr << "Failed to create rocJPEG handle with error code: " << rocJpegGetErrorName(status) << std::endl;
+            return EXIT_FAILURE;
+        }
+
+        // Create a JPEG stream
+        RocJpegStreamHandle rocjpeg_stream_handle;
+        status = rocJpegStreamCreate(&rocjpeg_stream_handle);
+        if (status != ROCJPEG_STATUS_SUCCESS) {
+            std::cerr << "Failed to create JPEG stream with error code: " << rocJpegGetErrorName(status) << std::endl;
+            rocJpegDestroy(handle);
+            return EXIT_FAILURE;
+        }
+
+        // Parse the JPEG stream
+        status = rocJpegStreamParse(reinterpret_cast<uint8_t *>(file_data.data()), file_size, rocjpeg_stream_handle);
+        if (status != ROCJPEG_STATUS_SUCCESS) {
+            std::cerr << "Failed to parse JPEG stream with error code: " << rocJpegGetErrorName(status) << std::endl;
+            rocJpegStreamDestroy(rocjpeg_stream_handle);
+            rocJpegDestroy(handle);
+            return EXIT_FAILURE;
+        }
+
+        // Get the image info
+        uint8_t num_components;
+        RocJpegChromaSubsampling subsampling;
+        uint32_t widths[ROCJPEG_MAX_COMPONENT] = {};
+        uint32_t heights[ROCJPEG_MAX_COMPONENT] = {};
+
+        status = rocJpegGetImageInfo(handle, rocjpeg_stream_handle, &num_components, &subsampling, widths, heights);
+        if (status != ROCJPEG_STATUS_SUCCESS) {
+            std::cerr << "Failed to get image info with error code: " << rocJpegGetErrorName(status) << std::endl;
+            rocJpegStreamDestroy(rocjpeg_stream_handle);
+            rocJpegDestroy(handle);
+            return EXIT_FAILURE;
+        }
+
+        // Allocate device memory for the decoded output image
+        RocJpegImage output_image = {};
+        RocJpegDecodeParams decode_params = {};
+        decode_params.output_format = ROCJPEG_OUTPUT_NATIVE;
+
+        // This sample assumes the input image uses YUV420 chroma subsampling.
+        // For YUV420 subsampling, the native decoded output image is NV12 (i.e., the rocJpegDecode API copies Y to the first channel and interleaved UV to the second channel of RocJpegImage)
+        output_image.pitch[1] = output_image.pitch[0] = widths[0];
+        hipError_t hip_status;
+        hip_status = hipMalloc(&output_image.channel[0], output_image.pitch[0] * heights[0]);
+        if (hip_status != hipSuccess) {
+            std::cerr << "Failed to allocate device memory for the first channel" << std::endl;
+            rocJpegStreamDestroy(rocjpeg_stream_handle);
+            rocJpegDestroy(handle);
+            return EXIT_FAILURE;
+        }
+
+        hip_status = hipMalloc(&output_image.channel[1], output_image.pitch[1] * (heights[0] >> 1));
+        if (hip_status != hipSuccess) {
+            std::cerr << "Failed to allocate device memory for the second channel" << std::endl;
+            hipFree((void *)output_image.channel[0]);
+            rocJpegStreamDestroy(rocjpeg_stream_handle);
+            rocJpegDestroy(handle);
+            return EXIT_FAILURE;
+        }
+
+        // Decode the JPEG stream
+        status = rocJpegDecode(handle, rocjpeg_stream_handle, &decode_params, &output_image);
+        if (status != ROCJPEG_STATUS_SUCCESS) {
+            std::cerr << "Failed to decode JPEG stream with error code: " << rocJpegGetErrorName(status) << std::endl;
+            hipFree((void *)output_image.channel[0]);
+            hipFree((void *)output_image.channel[1]);
+            rocJpegStreamDestroy(rocjpeg_stream_handle);
+            rocJpegDestroy(handle);
+            return EXIT_FAILURE;
+        }
+
+        // Perform additional post-processing on the decoded image or optionally save it
+        // ...
+
+        // Clean up resources
+        hipFree((void *)output_image.channel[0]);
+        hipFree((void *)output_image.channel[1]);
+        rocJpegStreamDestroy(rocjpeg_stream_handle);
+        rocJpegDestroy(handle);
+
+        return EXIT_SUCCESS;
+    }
diff --git a/projects/rocjpeg/docs/install/install.rst b/projects/rocjpeg/docs/install/install.rst
index 1fd2683f76..3b45a27ee1 100644
--- a/projects/rocjpeg/docs/install/install.rst
+++ b/projects/rocjpeg/docs/install/install.rst
@@ -20,8 +20,8 @@ Tested configurations
 
 * ROCm
 
-  * rocm-core: 6.1.0.60100-65
-  * amdgpu-core: 1:6.1.60100-1744891
+  * rocm-core: 6.3.0.60300-14317
+  * amdgpu-core: 6.3.60300-1798298
 
 * rocJPEG Setup Script: V1.0
 
@@ -229,16 +229,16 @@ For more information on documentation builds, refer to the
 Hardware capabilities
 ===================================================
 
-The following table shows the capabilities of the VCN and JPEG cores for each supported GPU
+The following table shows the capabilities of the VCN and total number of JPEG cores for each supported GPU
 architecture.
 
 .. csv-table::
-    :header: "GPU Architecture", "VCN Generation", "Number of VCNs", "Number of JPEG cores per VCN", "Max width, Max height"
+    :header: "GPU Architecture", "VCN Generation", "Total number of JPEG cores", "Max width, Max height"
 
-    "gfx908 - MI1xx", "VCN 2.5.0", "2", "1", "4096, 4096"
-    "gfx90a - MI2xx", "VCN 2.6.0", "2", "1", "4096, 4096"
-    "gfx940, gfx942 - MI300A", "VCN 3.0", "3", "8", "16384, 16384"
-    "gfx941, gfx942 - MI300X", "VCN 3.0", "4", "8", "16384, 16384"
-    "gfx1030, gfx1031, gfx1032 - Navi2x", "VCN 3.x", "2", "1", "16384, 16384"
-    "gfx1100, gfx1102 - Navi3x", "VCN 4.0", "2", "1", "16384, 16384"
-    "gfx1101 - Navi3x", "VCN 4.0", "1", "1", "16384, 16384"
+    "gfx908 - MI1xx", "VCN 2.5.0", "2", "4096, 4096"
+    "gfx90a - MI2xx", "VCN 2.6.0", "4", "4096, 4096"
+    "gfx940, gfx942 - MI300A", "VCN 3.0", "24", "16384, 16384"
+    "gfx941, gfx942 - MI300X", "VCN 3.0", "32", "16384, 16384"
+    "gfx1030, gfx1031, gfx1032 - Navi2x", "VCN 3.x", "1", "16384, 16384"
+    "gfx1100, gfx1102 - Navi3x", "VCN 4.0", "1", "16384, 16384"
+    "gfx1101 - Navi3x", "VCN 4.0", "1", "16384, 16384"
diff --git a/projects/rocjpeg/samples/README.md b/projects/rocjpeg/samples/README.md
index 62d953ff6c..8aac29ed1b 100644
--- a/projects/rocjpeg/samples/README.md
+++ b/projects/rocjpeg/samples/README.md
@@ -6,6 +6,10 @@ rocJPEG samples
 
 The jpeg decode sample illustrates decoding a JPEG images using rocJPEG library to get the individual decoded images in one of the supported output format (i.e., native, yuv, y, rgb, rgb_planar). This sample can be configured with a device ID and optionally able to dump the output to a file.
 
+## [JPEG decode batched](jpegDecodeBatched)
+
+The JPEG decode batched sample illustrates decoding JPEG images in batches of a specified size using the rocJPEG library to get the individual decoded images in one of the supported output formats (i.e., native, yuv, y, rgb, rgb_planar). This sample can be configured with a device ID and can optionally dump the output to a file.
+
 ## [JPEG decode multi-threads](jpegDecodeMultiThreads)
 
 The jpeg decode multi threads sample illustrates decoding JPEG images using rocJPEG library with multiple threads to get the individual decoded images in one of the supported output format (i.e., native, yuv, y, rgb, rgb_planar). This sample can be configured with a device ID and optionally able to dump the output to a file.
\ No newline at end of file
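
Review note: the pitch and channel-size rules in the three `ROCJPEG_OUTPUT_*` table rows that appear as context in the `using-rocjpeg.rst` hunk can be sketched as a small helper. This is only an illustration under stated assumptions. `OutputFormat`, `ChannelLayout`, and `ComputeLayout` are hypothetical names invented here so the snippet compiles without `rocjpeg.h`; they are not part of the rocJPEG API, and real code would fill in `RocJpegImage::pitch` and size the `hipMalloc` calls instead.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the ROCJPEG_OUTPUT_Y / _RGB / _RGB_PLANAR values.
enum class OutputFormat { kY, kRgb, kRgbPlanar };

// Hypothetical result type: row pitch and allocation size (in bytes) per channel.
struct ChannelLayout {
    int num_channels = 0;
    std::array<std::uint32_t, 3> pitch{};
    std::array<std::size_t, 3> size{};
};

// Applies the table rules: pitch comes from the component width
// (times 3 for interleaved RGB), and size = pitch * height.
ChannelLayout ComputeLayout(OutputFormat format,
                            const std::array<std::uint32_t, 3> &widths,
                            const std::array<std::uint32_t, 3> &heights) {
    ChannelLayout layout;
    switch (format) {
        case OutputFormat::kY:
            layout.num_channels = 1;
            layout.pitch[0] = widths[0];
            break;
        case OutputFormat::kRgb:
            layout.num_channels = 1;
            layout.pitch[0] = widths[0] * 3;
            break;
        case OutputFormat::kRgbPlanar:
            layout.num_channels = 3;
            for (int c = 0; c < 3; ++c) {
                layout.pitch[c] = widths[c];
            }
            break;
    }
    for (int c = 0; c < layout.num_channels; ++c) {
        layout.size[c] = static_cast<std::size_t>(layout.pitch[c]) * heights[c];
    }
    return layout;
}
```

For a batched decode, the same computation would presumably be repeated for each image before allocating the channels of the corresponding entry in `destinations`.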