Update documentation

Sync with the latest changes from upstream repo

Change-Id: I309880f5c7f77c58a8b186db320bbc0f0e634089


[ROCm/ROCR-Runtime commit: c48b858093]
This commit is contained in:
Chris Freehill
2024-07-23 19:05:40 -05:00
Committed by David Yat Sin
Parent 3bdfe00bb7
Commit 5f20d2f242
12 changed files with 368 additions and 35 deletions
@@ -1,3 +1,9 @@
.. meta::
:description: HSA runtime implementation
:keywords: ROCR, ROCm, library, tool, runtime
.. _rocr-api:
API
===
+16
@@ -0,0 +1,16 @@
.. meta::
:description: HSA runtime implementation
:keywords: ROCR, ROCm, library, tool, runtime
.. _c-interface-adaptors:
C interface adaptors
=====================
The C interface layer is the :ref:`top layer in ROCR <runtime-design>`. It provides the C APIs defined in the `HSA Runtime Specification 1.2 <https://hsafoundation.com/wp-content/uploads/2021/02/HSA-Runtime-1.2.pdf>`_. This layer also contains the interfaces and default definitions for the standard extensions. The extension interface functions simply forward to a function pointer table defined in this layer. The table is initialized to point to the default definitions, which simply return an appropriate error code. If the extension library is available, it is loaded as part of runtime initialization and the table is updated to point to the extension library.
Files present in this layer:
- ``hsa.h`` (cpp)
- ``hsa_ext_interface.h`` (cpp)
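
The forwarding scheme described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not ROCR's actual code: the names ``status_t``, ``ext_table_t``, ``default_image_create``, ``ext_image_create``, ``install_extension``, and ``api_image_create`` are all hypothetical, and the real status codes come from ``hsa.h``.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Hypothetical status codes; the real ones are defined in hsa.h. */
   typedef enum { STATUS_SUCCESS = 0, STATUS_ERROR_NOT_SUPPORTED = 1 } status_t;

   /* Hypothetical table with one entry per extension function. */
   typedef struct {
       status_t (*image_create)(int width, int height);
   } ext_table_t;

   /* Default definition: the extension is unavailable, so report an error. */
   status_t default_image_create(int width, int height) {
       (void)width;
       (void)height;
       return STATUS_ERROR_NOT_SUPPORTED;
   }

   /* Stand-in for a function exported by a loaded extension library. */
   status_t ext_image_create(int width, int height) {
       return (width > 0 && height > 0) ? STATUS_SUCCESS
                                        : STATUS_ERROR_NOT_SUPPORTED;
   }

   /* The table starts out pointing at the default definitions. */
   static ext_table_t ext_table = { default_image_create };

   /* Called during initialization if the extension library was found. */
   void install_extension(const ext_table_t *loaded) {
       assert(loaded != NULL && loaded->image_create != NULL);
       ext_table = *loaded;
   }

   /* Public C entry point: simply forwards through the table. */
   status_t api_image_create(int width, int height) {
       return ext_table.image_create(width, height);
   }

Because callers only ever reach the extension through the table, an application that calls ``api_image_create`` before or without the extension being installed gets an error code rather than an unresolved symbol.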
+34
@@ -0,0 +1,34 @@
.. meta::
:description: HSA runtime implementation
:keywords: ROCR, ROCm, library, tool, runtime
.. _environment-variables:
Environment variables
========================
The following table lists the most commonly used ROCR environment variables.
.. list-table:: ROCR environment variables
:header-rows: 1
* - Environment variable
- Possible values
- Description
* - HSA_ENABLE_SDMA
-
* 0: Disabled
* 1: Enabled (default)
- Controls the use of DMA engines in all copy directions (Host-to-Device, Device-to-Host, Device-to-Device) when using the
``hsa_memory_copy``, ``hsa_amd_memory_fill``, ``hsa_amd_memory_async_copy``, and ``hsa_amd_memory_async_copy_on_engine`` APIs.
* - HSA_ENABLE_PEER_SDMA
-
* 0: Disabled
* 1: Enabled (default)
- Controls the use of DMA engines for Device-to-Device copies when using the ``hsa_memory_copy``, ``hsa_amd_memory_async_copy``, and ``hsa_amd_memory_async_copy_on_engine`` APIs.
.. note::
The value of ``HSA_ENABLE_PEER_SDMA`` is ignored if ``HSA_ENABLE_SDMA`` is used to disable the use of DMA engines.
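
The precedence between the two variables can be sketched in C as follows. This is only an illustration of the rule stated in the note, not how ROCR parses its environment: the helper names are hypothetical, and treating an unset variable as enabled and only the exact value ``0`` as disabled is an assumption based on the defaults in the table.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stdlib.h>
   #include <string.h>

   /* Assumption: an unset variable means "enabled" (the documented default)
      and only the exact value "0" disables the feature. */
   bool flag_is_enabled(const char *value) {
       return value == NULL || strcmp(value, "0") != 0;
   }

   /* Looks up one variable in the process environment. */
   bool env_flag_is_enabled(const char *name) {
       assert(name != NULL);
       return flag_is_enabled(getenv(name));
   }

   /* Peer (Device-to-Device) DMA copies need both switches on, so
      HSA_ENABLE_PEER_SDMA has no effect once HSA_ENABLE_SDMA is 0. */
   bool peer_sdma_in_effect(const char *enable_sdma,
                            const char *enable_peer_sdma) {
       return flag_is_enabled(enable_sdma) && flag_is_enabled(enable_peer_sdma);
   }

   /* Same rule, evaluated against the current process environment. */
   bool peer_sdma_in_effect_from_env(void) {
       return env_flag_is_enabled("HSA_ENABLE_SDMA") &&
              env_flag_is_enabled("HSA_ENABLE_PEER_SDMA");
   }

For example, ``peer_sdma_in_effect("0", "1")`` is false: disabling ``HSA_ENABLE_SDMA`` overrides an explicit request to enable peer copies.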
+2
@@ -28,6 +28,8 @@ release = version_number
external_toc_path = "./sphinx/_toc.yml"
external_projects_current_project = "rocr-runtime"
docs_core = ROCmDocs(left_nav_title)
docs_core.run_doxygen(doxygen_root="doxygen", doxygen_path="doxygen/xml")
docs_core.setup()
+93
@@ -0,0 +1,93 @@
.. meta::
:description: HSA runtime implementation
:keywords: ROCR, ROCm, library, tool, runtime
.. _contributing-to-rocr:
Contributing to ROCR
========================
This document contains useful information required to contribute to ROCR.
.. _runtime-design:
Runtime design
-----------------
ROCR consists of the following primary layers:
1. :ref:`C interface adaptors <c-interface-adaptors>`
2. C++ interface classes and common functions
3. Device-specific implementations
The first layer provides interfaces to make ROCR APIs available to the user applications.
The second and third layers comprise the internal ROCR implementation, which is available for contribution.
Additionally, the runtime depends on a small utility library that provides simple common functions, limited operating system and compiler abstraction, and atomic operation interfaces.
The following sections list the important files present in the second and third layer.
C++ interface classes and common functions
----------------------------------------------
The C++ interface layer provides abstract interface classes encapsulating commands to HSA signals, agents, and queues. This layer also contains the implementation of device-independent commands, such as ``hsa_init``, ``hsa_system_get_info``, and a default signal and queue implementation.
Files present in this layer:
- ``runtime.h`` (cpp)
- ``agent.h``
- ``queue.h``
- ``signal.h``
- ``memory_region.h`` (cpp)
- ``checked.h``
- ``memory_database.h`` (cpp)
- ``default_signal.h`` (cpp)
Device-specific implementations
----------------------------------
The device-specific layer contains implementations of the C++ interface classes that implement HSA functionality for ROCm supported devices.
Files present in this layer:
- ``amd_cpu_agent.h`` (cpp)
- ``amd_gpu_agent.h`` (cpp)
- ``amd_hw_aql_command_processor.h`` (cpp)
- ``amd_memory_region.h`` (cpp)
- ``amd_memory_registration.h`` (cpp)
- ``amd_topology.h`` (cpp)
- ``host_queue.h`` (cpp)
- ``interrupt_signal.h`` (cpp)
- ``hsa_ext_private_amd.h`` (cpp)
Source and include directories
--------------------------------
- ``core``: Source code for AMD’s implementation of the core HSA Runtime APIs
- ``cmake_modules``: CMake support modules and files
- ``inc``: Public and AMD-specific header files exposing the HSA Runtime's interfaces
- ``libamdhsacode``: Code object definitions and interfaces
- ``loader``: Loads code objects
- ``utils``: Utilities required to build the core runtime
-23
@@ -1,23 +0,0 @@
# Environment Variables
## HSA_ENABLE_SDMA
Possible values:
* 0:Disabled
* 1:Enabled (Default Value)
This will enable or disable the use of DMA engines in all copy directions (Host-to-Device, Device-to-Host, Device-to-Device) when using the following APIs:
`hsa_memory_copy`, `hsa_amd_memory_fill`, `hsa_amd_memory_async_copy`, `hsa_amd_memory_async_copy_on_engine`
## HSA_ENABLE_PEER_SDMA
Possible values:
* 0:Disabled
* 1:Enabled (Default Value)
This will enable or disable the use of DMA engines for Device-to-Device copies when using the following APIs:
`hsa_memory_copy`, `hsa_amd_memory_async_copy`, `hsa_amd_memory_async_copy_on_engine`
The value of `HSA_ENABLE_PEER_SDMA` is ignored if `HSA_ENABLE_SDMA` is used to disable the use of DMA engines.
+37
@@ -0,0 +1,37 @@
.. meta::
:description: HSA runtime implementation
:keywords: ROCR, ROCm, library, tool, runtime
.. _index:
=====================
ROCR documentation
=====================
The ROCm runtime (ROCR) is AMD's implementation of HSA runtime, which is a thin, user-mode API that exposes the necessary interfaces to access and interact with graphics hardware driven by the AMDGPU driver set and the ROCK kernel driver. To learn more, see :ref:`what-is-rocr-runtime`.
You can access ROCR code on our `GitHub repository <https://github.com/ROCm/ROCR-Runtime>`_.
The documentation is structured as follows:
.. grid:: 2
:gutter: 3
.. grid-item-card:: Install
* :ref:`installation`
.. grid-item-card:: API reference
* :ref:`c-interface-adaptors`
* :ref:`environment-variables`
* :ref:`rocr-api`
.. grid-item-card:: Contribution
* :ref:`contributing-to-rocr`
To contribute to the documentation, refer to
`Contributing to ROCm <https://rocm.docs.amd.com/en/latest/contribute/contributing.html>`_.
You can find licensing information on the `Licensing <https://rocm.docs.amd.com/en/latest/about/license.html>`_ page.
+129
@@ -0,0 +1,129 @@
.. meta::
:description: HSA runtime implementation
:keywords: ROCR, ROCm, library, tool, runtime
.. _installation:
====================
Installation
====================
This document provides information required to build and install ROCR using prebuilt binaries or from source.
Build and install using prebuilt binaries
-------------------------------------------
Here is how you can install ROCR using prebuilt binaries.
Prerequisites
*******************
- A system supporting ROCm. See the `supported operating systems <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-operating-systems>`_.
- Install ROCm. See `how to install ROCm <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/>`_.
- Install ``libdrm`` package.
.. code-block:: shell
sudo apt install libdrm-dev
The ROCR prebuilt binaries include:
**Core runtime package:**
- HSA include files to support application development on the HSA runtime for the ROCR runtime
- A 64-bit version of AMD’s HSA core runtime for the ROCR runtime
**Runtime extension package:**
- A 64-bit version of AMD’s runtime tools library
- A 64-bit version of AMD’s runtime image library
The contents of these packages are installed in ``/opt/rocm/hsa`` and ``/opt/rocm`` by default. The core runtime package depends on the ``hsakmt-roct-dev`` package.
Build and install from source
--------------------------------
Here is how you can build ROCR from source.
Prerequisites
***************
- CMake 3.7 or later. Add the CMake ``bin`` directory to your ``PATH``.
- Support packages ``libelf-dev`` and ``g++``.
.. code-block:: shell
sudo apt install libelf-dev g++
- A compatible version of the ``libhsakmt`` library and the ``hsakmt.h`` header file. Obtain the latest version of these files from the `ROCT-Thunk-Interface repository <https://github.com/ROCm/ROCT-Thunk-Interface>`_.
- Install ``xxd``.
.. code-block:: shell
sudo apt install xxd
Building the runtime
----------------------
The ``libhsakmt`` development packages include a CMake package config file. The runtime locates ``libhsakmt`` via ``find_package`` if ``libhsakmt`` is installed in a standard location. For installations that don't use standard ROCm paths, set CMake variables ``CMAKE_PREFIX_PATH`` or ``hsakmt_DIR`` to override ``find_package`` search paths.
The runtime includes an optional image support module (previously ``hsa-ext-rocr-dev``). By default this module is included in the runtime builds. To exclude the image module from the runtime, set the CMake variable ``IMAGE_SUPPORT`` to OFF.
To build the optional image module, install AMDGCN-compatible clang and device library. You can find the latest version of these additional build dependencies in the `ROCm package repository <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/native-install/package-manager-integration.html#packages-in-rocm-programming-models>`_.
The latest source for these projects is available in the `llvm project <https://github.com/ROCm/llvm-project>`_ and `ROCm device libs <https://github.com/ROCm/ROCm-Device-Libs>`_ repositories.
The runtime optionally supports use of the CMake user package registry. By default the registry is not modified. Set CMake variable ``EXPORT_TO_USER_PACKAGE_REGISTRY`` to ON to enable updating the package registry.
To build, install, and produce packages on a system with standard ROCm packages installed, clone your copy of ROCR and run the following from ``src/``:
.. code-block:: shell
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=/opt/rocm ..
make
make install
make package
Example with a custom installation path, build dependency path, and options:
.. code-block:: shell
cmake -DIMAGE_SUPPORT=OFF \
-DEXPORT_TO_USER_PACKAGE_REGISTRY=ON \
-DCMAKE_VERBOSE_MAKEFILE=1 \
-DCMAKE_PREFIX_PATH=<alternate path(s) to build dependencies> \
-DCMAKE_INSTALL_PREFIX=<custom install path for this build> \
..
Alternatively, use ``ccmake`` or ``cmake-gui``:
.. code-block:: shell
mkdir build
cd build
ccmake ..
# In the interactive ccmake interface:
#   press c to configure
#   populate variables as desired
#   press c again
#   press g to generate and exit
make
Building against the runtime
---------------------------------
The runtime provides a CMake package config file, installed by default to ``/opt/rocm/lib/cmake/hsa-runtime64``. The runtime exports CMake target ``hsa-runtime64`` in namespace ``hsa-runtime64``. A CMake project (``Foo``) using the runtime may locate, include, and link the runtime using the following template:
.. code-block:: cmake
# Add /opt/rocm to CMAKE_PREFIX_PATH.
find_package(hsa-runtime64 1.0 REQUIRED)
...
add_library(Foo ...)
...
target_link_libraries(Foo PRIVATE hsa-runtime64::hsa-runtime64)
+4 -4
@@ -1,4 +1,4 @@
License
=======
.. include:: ../LICENSE.txt
License
=======
.. include:: ../LICENSE.txt
+13 -4
@@ -2,11 +2,20 @@
# These comments will also be removed.
root: index
subtrees:
- numbered: False
- caption: Install
entries:
- file: structure
- file: api
- file: environment_variables
- file: install/installation
- caption: API reference
entries:
- file: api-reference/c-interface-adaptors
- file: api-reference/environment_variables
- file: api-reference/api
- caption: Contribution
entries:
- file: contribution/contributing-to-rocr
- caption: About
entries:
- file: license
+2 -4
@@ -84,9 +84,7 @@ pygments==2.15.0
# pydata-sphinx-theme
# sphinx
pyjwt[crypto]==2.6.0
# via
# pygithub
# pyjwt
# via pygithub
pynacl==1.5.0
# via pygithub
pytz==2023.3
@@ -100,7 +98,7 @@ requests==2.28.2
# via
# pygithub
# sphinx
rocm-docs-core==0.31.0
rocm-docs-core==0.38.1
# via -r requirements.in
smmap==5.0.0
# via gitdb
+32
@@ -0,0 +1,32 @@
.. meta::
:description: HSA runtime implementation
:keywords: ROCR, ROCm, library, tool, runtime
.. _what-is-rocr-runtime:
What is ROCR?
========================
The ROCm runtime (ROCR) is AMD's implementation of HSA runtime, which is a thin, user-mode API that exposes the necessary interfaces to access and interact with graphics hardware driven by the AMDGPU driver set and the ROCK kernel driver. Together they enable you to directly harness the power of discrete AMD graphics devices by allowing host applications to launch compute kernels directly to the graphics hardware.
The ROCR APIs are capable of the following:
- Error handling
- Runtime initialization and shutdown
- System and agent information
- Signals and synchronization
- Architected dispatch
- Memory management
- Fitting into a typical software architecture stack
ROCR provides direct access to the graphics hardware, allowing you more control over execution. An example of low-level hardware access is the support for one or more user-mode queues, which provides a low-latency kernel dispatch interface, allowing you to develop customized dispatch algorithms specific to your application.
The HSA Architected Queuing Language (AQL) is an open standard defined by the HSA Foundation, which specifies the packet syntax used to control supported AMD or ATI Radeon © graphics devices. AQL supports several packet types, including packets that command the hardware to automatically resolve inter-packet dependencies (barrier-AND and barrier-OR packets), kernel dispatch packets, and agent dispatch packets.
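
The difference between the two barrier packet types can be sketched in C as follows. This is a simplified model, not the packet processor's implementation: it assumes a dependency counts as resolved once its signal value reaches 0, and the names ``barrier_kind_t`` and ``barrier_satisfied`` are hypothetical.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   typedef enum { BARRIER_AND, BARRIER_OR } barrier_kind_t;

   /* Returns true once the barrier's dependencies allow later packets to run.
      Assumption: a dependency is resolved when its signal value is 0. */
   bool barrier_satisfied(barrier_kind_t kind, const int64_t *signal_values,
                          size_t count) {
       assert(signal_values != NULL && count > 0);
       size_t resolved = 0;
       for (size_t i = 0; i < count; ++i) {
           if (signal_values[i] == 0) {
               ++resolved;
           }
       }
       /* AND waits for every dependency; OR needs only one of them. */
       return kind == BARRIER_AND ? resolved == count : resolved > 0;
   }

With signal values ``{0, 3}``, a barrier-OR packet is already satisfied while a barrier-AND packet keeps waiting until the second signal also reaches 0.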
In addition to user-mode queues and AQL, the HSA runtime exposes various virtual address ranges that can be accessed by one or more of the system’s graphics devices and possibly also by the host. The exposed virtual address ranges support either fine-grained or coarse-grained access. Updates to memory in a fine-grained region are immediately visible to all devices that can access it, but only one device can have access to a coarse-grained allocation at a time. You can change the ownership of a coarse-grained region using the HSA runtime memory APIs, but the host application must perform this transfer of ownership explicitly.
For a complete description of the HSA Runtime APIs, AQL, and the HSA memory policy, refer to the `HSA Runtime Programmer’s Reference Manual <https://hsafoundation.com/wp-content/uploads/2021/02/HSA-Runtime-1.2.pdf>`_.