Update documentation
Sync with the latest changes from upstream repo
Change-Id: I309880f5c7f77c58a8b186db320bbc0f0e634089
[ROCm/ROCR-Runtime commit: c48b858093]
@@ -1,3 +1,9 @@
|
||||
.. meta::
|
||||
:description: HSA runtime implementation
|
||||
:keywords: ROCR, ROCm, library, tool, runtime
|
||||
|
||||
.. _rocr-api:
|
||||
|
||||
API
|
||||
===
|
||||
|
||||
@@ -0,0 +1,16 @@
|
||||
.. meta::
|
||||
:description: HSA runtime implementation
|
||||
:keywords: ROCR, ROCm, library, tool, runtime
|
||||
|
||||
.. _c-interface-adaptors:
|
||||
|
||||
C interface adaptors
|
||||
=====================
|
||||
|
||||
The C interface layer is the :ref:`top layer in ROCR <runtime-design>` that provides C APIs as defined in the `HSA Runtime Specification 1.2 <https://hsafoundation.com/wp-content/uploads/2021/02/HSA-Runtime-1.2.pdf>`_. The C interface layer also contains the interfaces and default definitions for the standard extensions. The interface functions simply forward to a function pointer table defined in this layer. The table is initialized to point to the default definitions, which simply return an appropriate error code. If the extension library is available, it is loaded as part of runtime initialization and the table is updated to point to the extension library.
|
||||
|
||||
Files present in this layer:
|
||||
|
||||
- ``hsa.h`` (cpp)
|
||||
|
||||
- ``hsa_ext_interface.h`` (cpp)
|
||||
@@ -0,0 +1,34 @@
|
||||
.. meta::
|
||||
:description: HSA runtime implementation
|
||||
:keywords: ROCR, ROCm, library, tool, runtime
|
||||
|
||||
.. _environment-variables:
|
||||
|
||||
Environment variables
|
||||
========================
|
||||
|
||||
The following table lists the most commonly used environment variables.
|
||||
|
||||
.. list-table:: ROCR environment variables
|
||||
:header-rows: 1
|
||||
|
||||
* - Environment variable
|
||||
- Possible values
|
||||
- Description
|
||||
|
||||
* - HSA_ENABLE_SDMA
|
||||
-
|
||||
* 0: Disabled
|
||||
* 1: Enabled (default)
|
||||
- This controls the use of DMA engines in all copy directions (Host-to-Device, Device-to-Host, Device-to-Device) when using the ``hsa_memory_copy``, ``hsa_amd_memory_fill``, ``hsa_amd_memory_async_copy``, and ``hsa_amd_memory_async_copy_on_engine`` APIs.
|
||||
|
||||
* - HSA_ENABLE_PEER_SDMA
|
||||
-
|
||||
* 0: Disabled
|
||||
* 1: Enabled (default)
|
||||
- This controls the use of DMA engines for Device-to-Device copies when using the ``hsa_memory_copy``, ``hsa_amd_memory_async_copy``, and ``hsa_amd_memory_async_copy_on_engine`` APIs.
|
||||
|
||||
.. note::
|
||||
|
||||
The value of ``HSA_ENABLE_PEER_SDMA`` is ignored if ``HSA_ENABLE_SDMA`` is used to disable the use of DMA engines.
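
The precedence between the two variables can be sketched in shell. This is only an illustration of the rule stated in the note above, not ROCR code: ``effective_peer_sdma`` is a hypothetical helper, and it assumes that an unset variable takes the documented default of ``1``.

.. code-block:: shell

   # Hypothetical helper: prints 1 if DMA engines would be used for
   # Device-to-Device copies under the documented rules, 0 otherwise.
   effective_peer_sdma() {
       # Unset variables fall back to the documented default (1: Enabled).
       sdma="${HSA_ENABLE_SDMA:-1}"
       peer="${HSA_ENABLE_PEER_SDMA:-1}"
       if [ "$sdma" = "0" ]; then
           # HSA_ENABLE_SDMA=0 disables DMA engines in all copy directions,
           # so the value of HSA_ENABLE_PEER_SDMA is ignored.
           echo 0
       else
           echo "$peer"
       fi
   }

To apply a setting to a single run, prefix the command with the variable, for example ``HSA_ENABLE_PEER_SDMA=0 ./my_app``, where ``./my_app`` stands for your own application.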
|
||||
@@ -28,6 +28,8 @@ release = version_number
|
||||
|
||||
external_toc_path = "./sphinx/_toc.yml"
|
||||
|
||||
external_projects_current_project = "rocr-runtime"
|
||||
|
||||
docs_core = ROCmDocs(left_nav_title)
|
||||
docs_core.run_doxygen(doxygen_root="doxygen", doxygen_path="doxygen/xml")
|
||||
docs_core.setup()
|
||||
|
||||
@@ -0,0 +1,93 @@
|
||||
.. meta::
|
||||
:description: HSA runtime implementation
|
||||
:keywords: ROCR, ROCm, library, tool, runtime
|
||||
|
||||
.. _contributing-to-rocr:
|
||||
|
||||
Contributing to ROCR
|
||||
========================
|
||||
|
||||
This document contains useful information required to contribute to ROCR.
|
||||
|
||||
.. _runtime-design:
|
||||
|
||||
Runtime design
|
||||
-----------------
|
||||
|
||||
ROCR consists of the following primary layers:
|
||||
|
||||
1. :ref:`C interface adaptors <c-interface-adaptors>`
|
||||
|
||||
2. C++ interface classes and common functions
|
||||
|
||||
3. Device-specific implementations
|
||||
|
||||
The first layer provides interfaces that make the ROCR APIs available to user applications.
|
||||
The second and third layers comprise the internal ROCR implementation, which is available for contribution.
|
||||
|
||||
Additionally, the runtime depends on a small utility library that provides simple common functions, limited operating system and compiler abstraction, and atomic operation interfaces.
|
||||
|
||||
The following sections list the important files present in the second and third layers.
|
||||
|
||||
C++ interface classes and common functions
|
||||
----------------------------------------------
|
||||
|
||||
The C++ interface layer provides abstract interface classes encapsulating commands to HSA signals, agents, and queues. This layer also contains the implementation of device-independent commands, such as ``hsa_init``, ``hsa_system_get_info``, and a default signal and queue implementation.
|
||||
|
||||
Files present in this layer:
|
||||
|
||||
- ``runtime.h`` (cpp)
|
||||
|
||||
- ``agent.h``
|
||||
|
||||
- ``queue.h``
|
||||
|
||||
- ``signal.h``
|
||||
|
||||
- ``memory_region.h`` (cpp)
|
||||
|
||||
- ``checked.h``
|
||||
|
||||
- ``memory_database.h`` (cpp)
|
||||
|
||||
- ``default_signal.h`` (cpp)
|
||||
|
||||
Device-specific implementations
|
||||
----------------------------------
|
||||
|
||||
The device-specific layer contains implementations of the C++ interface classes that implement HSA functionality for ROCm-supported devices.
|
||||
|
||||
Files present in this layer:
|
||||
|
||||
- ``amd_cpu_agent.h`` (cpp)
|
||||
|
||||
- ``amd_gpu_agent.h`` (cpp)
|
||||
|
||||
- ``amd_hw_aql_command_processor.h`` (cpp)
|
||||
|
||||
- ``amd_memory_region.h`` (cpp)
|
||||
|
||||
- ``amd_memory_registration.h`` (cpp)
|
||||
|
||||
- ``amd_topology.h`` (cpp)
|
||||
|
||||
- ``host_queue.h`` (cpp)
|
||||
|
||||
- ``interrupt_signal.h`` (cpp)
|
||||
|
||||
- ``hsa_ext_private_amd.h`` (cpp)
|
||||
|
||||
Source and include directories
|
||||
--------------------------------
|
||||
|
||||
- ``core``: Source code for AMD’s implementation of the core HSA Runtime APIs
|
||||
|
||||
- ``cmake_modules``: CMake support modules and files
|
||||
|
||||
- ``inc``: Public and AMD-specific header files exposing the HSA Runtime's interfaces
|
||||
|
||||
- ``libamdhsacode``: Code object definitions and interfaces
|
||||
|
||||
- ``loader``: Loads code objects
|
||||
|
||||
- ``utils``: Utilities required to build the core runtime
|
||||
@@ -1,23 +0,0 @@
|
||||
# Environment Variables
|
||||
|
||||
## HSA_ENABLE_SDMA
|
||||
|
||||
Possible values:
|
||||
|
||||
* 0:Disabled
|
||||
* 1:Enabled (Default Value)
|
||||
|
||||
This will enable or disable the use of DMA engines in all copy directions (Host-to-Device, Device-to-Host, Device-to-Device) when using the following APIs:
|
||||
`hsa_memory_copy`, `hsa_amd_memory_fill`, `hsa_amd_memory_async_copy`, `hsa_amd_memory_async_copy_on_engine`
|
||||
|
||||
## HSA_ENABLE_PEER_SDMA
|
||||
|
||||
Possible values:
|
||||
|
||||
* 0:Disabled
|
||||
* 1:Enabled (Default Value)
|
||||
|
||||
This will enable or disable the use of DMA engines for Device-to-Device copies when using the following APIs:
|
||||
`hsa_memory_copy`, `hsa_amd_memory_async_copy`, `hsa_amd_memory_async_copy_on_engine`
|
||||
|
||||
The value of `HSA_ENABLE_PEER_SDMA` is ignored if `HSA_ENABLE_SDMA` is used to disable the use of DMA engines.
|
||||
@@ -0,0 +1,37 @@
|
||||
.. meta::
|
||||
:description: HSA runtime implementation
|
||||
:keywords: ROCR, ROCm, library, tool, runtime
|
||||
|
||||
.. _index:
|
||||
|
||||
=====================
|
||||
ROCR documentation
|
||||
=====================
|
||||
|
||||
The ROCm runtime (ROCR) is AMD's implementation of the HSA runtime, which is a thin, user-mode API that exposes the necessary interfaces to access and interact with graphics hardware driven by the AMDGPU driver set and the ROCK kernel driver. To learn more, see :ref:`what-is-rocr-runtime`.
|
||||
|
||||
You can access ROCR code on our `GitHub repository <https://github.com/ROCm/ROCR-Runtime>`_.
|
||||
|
||||
The documentation is structured as follows:
|
||||
|
||||
.. grid:: 2
|
||||
:gutter: 3
|
||||
|
||||
.. grid-item-card:: Install
|
||||
|
||||
* :ref:`installation`
|
||||
|
||||
.. grid-item-card:: API reference
|
||||
|
||||
* :ref:`c-interface-adaptors`
|
||||
* :ref:`environment-variables`
|
||||
* :ref:`rocr-api`
|
||||
|
||||
.. grid-item-card:: Contribution
|
||||
|
||||
* :ref:`contributing-to-rocr`
|
||||
|
||||
To contribute to the documentation, refer to
|
||||
`Contributing to ROCm <https://rocm.docs.amd.com/en/latest/contribute/contributing.html>`_.
|
||||
|
||||
You can find licensing information on the `Licensing <https://rocm.docs.amd.com/en/latest/about/license.html>`_ page.
|
||||
@@ -0,0 +1,129 @@
|
||||
.. meta::
|
||||
:description: HSA runtime implementation
|
||||
:keywords: ROCR, ROCm, library, tool, runtime
|
||||
|
||||
.. _installation:
|
||||
|
||||
====================
|
||||
Installation
|
||||
====================
|
||||
|
||||
This document provides information required to build and install ROCR using prebuilt binaries or from source.
|
||||
|
||||
Build and install using prebuilt binaries
|
||||
-------------------------------------------
|
||||
|
||||
Here is how you can install ROCR using prebuilt binaries.
|
||||
|
||||
Prerequisites
|
||||
*******************
|
||||
|
||||
- A system supporting ROCm. See the `supported operating systems <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-operating-systems>`_.
|
||||
|
||||
- Install ROCm. See `how to install ROCm <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/>`_.
|
||||
|
||||
- Install the ``libdrm`` package.
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
sudo apt install libdrm-dev
|
||||
|
||||
The ROCR prebuilt binaries include:
|
||||
|
||||
**Core runtime package:**
|
||||
|
||||
- HSA include files to support application development on the HSA runtime for the ROCR runtime
|
||||
|
||||
- A 64-bit version of AMD’s HSA core runtime for the ROCR runtime
|
||||
|
||||
**Runtime extension package:**
|
||||
|
||||
- A 64-bit version of AMD’s runtime tools library
|
||||
|
||||
- A 64-bit version of AMD’s runtime image library
|
||||
|
||||
The contents of these packages are installed in ``/opt/rocm/hsa`` and ``/opt/rocm`` by default. The core runtime package depends on the ``hsakmt-roct-dev`` package.
|
||||
|
||||
Build and install from source
|
||||
--------------------------------
|
||||
|
||||
Here is how you can build ROCR from source.
|
||||
|
||||
Prerequisites
|
||||
***************
|
||||
|
||||
- CMake 3.7 or later. Add the CMake ``bin`` directory to your ``PATH``.
|
||||
|
||||
- Support packages ``libelf-dev`` and ``g++``.
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
sudo apt install libelf-dev g++
|
||||
|
||||
- A compatible version of the ``libhsakmt`` library and the ``hsakmt.h`` header file. Obtain the latest version of these files from the `ROCT-Thunk-Interface repository <https://github.com/ROCm/ROCT-Thunk-Interface>`_.
|
||||
|
||||
- Install ``xxd``.
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
sudo apt install xxd
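
Before configuring the build, you can check that the tools listed above are on your ``PATH``. The following is a small sketch. ``missing_tools`` is a hypothetical helper name and is not part of ROCR.

.. code-block:: shell

   # Hypothetical helper: prints the name of every given tool that is
   # not found on PATH, one per line. Prints nothing if all are present.
   missing_tools() {
       for tool in "$@"; do
           command -v "$tool" >/dev/null 2>&1 || echo "$tool"
       done
   }

For example, ``missing_tools cmake g++ xxd`` prints nothing when all three tools are installed. To compare your CMake version against the 3.7 minimum, run ``cmake --version``.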
|
||||
|
||||
Building the runtime
|
||||
----------------------
|
||||
|
||||
The ``libhsakmt`` development packages include a CMake package config file. The runtime locates ``libhsakmt`` via ``find_package`` if ``libhsakmt`` is installed in a standard location. For installations that don't use standard ROCm paths, set CMake variables ``CMAKE_PREFIX_PATH`` or ``hsakmt_DIR`` to override ``find_package`` search paths.
|
||||
The runtime includes an optional image support module (previously ``hsa-ext-rocr-dev``). By default this module is included in the runtime builds. To exclude the image module from the runtime, set the CMake variable ``IMAGE_SUPPORT`` to OFF.
|
||||
To build the optional image module, install AMDGCN-compatible clang and device library. You can find the latest version of these additional build dependencies in the `ROCm package repository <https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/native-install/package-manager-integration.html#packages-in-rocm-programming-models>`_.
|
||||
The latest source for these projects is available in the `llvm project <https://github.com/ROCm/llvm-project>`_ and `ROCm device libs <https://github.com/ROCm/ROCm-Device-Libs>`_ repositories.
|
||||
|
||||
The runtime optionally supports use of the CMake user package registry. By default the registry is not modified. Set CMake variable ``EXPORT_TO_USER_PACKAGE_REGISTRY`` to ON to enable updating the package registry.
|
||||
|
||||
To build, install, and produce packages on a system with standard ROCm packages installed, clone your copy of ROCR and run the following from ``src/``:
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
mkdir build
|
||||
cd build
|
||||
cmake -DCMAKE_INSTALL_PREFIX=/opt/rocm ..
|
||||
make
|
||||
make install
|
||||
make package
|
||||
|
||||
Example with a custom installation path, build dependency path, and options:
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
cmake -DIMAGE_SUPPORT=OFF \
|
||||
-DEXPORT_TO_USER_PACKAGE_REGISTRY=ON \
|
||||
-DCMAKE_VERBOSE_MAKEFILE=1 \
|
||||
-DCMAKE_PREFIX_PATH=<alternate path(s) to build dependencies> \
|
||||
-DCMAKE_INSTALL_PREFIX=<custom install path for this build> \
|
||||
..
|
||||
|
||||
Alternatively, use ``ccmake`` or ``cmake-gui``. For example, with ``ccmake``:
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
mkdir build
|
||||
cd build
|
||||
ccmake ..
|
||||
# press c to configure
# populate variables as desired
# press c again
# press g to generate and exit
|
||||
make
|
||||
|
||||
Building against the runtime
|
||||
---------------------------------
|
||||
|
||||
The runtime provides a CMake package config file, installed by default to ``/opt/rocm/lib/cmake/hsa-runtime64``. The runtime exports CMake target ``hsa-runtime64`` in namespace ``hsa-runtime64``. A CMake project (``Foo``) using the runtime may locate, include, and link the runtime using the following template:
|
||||
|
||||
.. code-block:: cmake
|
||||
|
||||
# Add /opt/rocm to CMAKE_PREFIX_PATH.
|
||||
|
||||
find_package(hsa-runtime64 1.0 REQUIRED)
|
||||
...
|
||||
add_library(Foo ...)
|
||||
...
|
||||
target_link_libraries(Foo PRIVATE hsa-runtime64::hsa-runtime64)
|
||||
@@ -1,4 +1,4 @@
|
||||
License
|
||||
=======
|
||||
|
||||
.. include:: ../LICENSE.txt
|
||||
License
|
||||
=======
|
||||
|
||||
.. include:: ../LICENSE.txt
|
||||
|
||||
@@ -2,11 +2,20 @@
|
||||
# These comments will also be removed.
|
||||
root: index
|
||||
subtrees:
|
||||
- numbered: False
|
||||
- caption: Install
|
||||
entries:
|
||||
- file: structure
|
||||
- file: api
|
||||
- file: environment_variables
|
||||
- file: install/installation
|
||||
|
||||
- caption: API reference
|
||||
entries:
|
||||
- file: api-reference/c-interface-adaptors
|
||||
- file: api-reference/environment_variables
|
||||
- file: api-reference/api
|
||||
|
||||
- caption: Contribution
|
||||
entries:
|
||||
- file: contribution/contributing-to-rocr
|
||||
|
||||
- caption: About
|
||||
entries:
|
||||
- file: license
|
||||
|
||||
@@ -84,9 +84,7 @@ pygments==2.15.0
|
||||
# pydata-sphinx-theme
|
||||
# sphinx
|
||||
pyjwt[crypto]==2.6.0
|
||||
# via
|
||||
# pygithub
|
||||
# pyjwt
|
||||
# via pygithub
|
||||
pynacl==1.5.0
|
||||
# via pygithub
|
||||
pytz==2023.3
|
||||
@@ -100,7 +98,7 @@ requests==2.28.2
|
||||
# via
|
||||
# pygithub
|
||||
# sphinx
|
||||
rocm-docs-core==0.31.0
|
||||
rocm-docs-core==0.38.1
|
||||
# via -r requirements.in
|
||||
smmap==5.0.0
|
||||
# via gitdb
|
||||
|
||||
@@ -0,0 +1,32 @@
|
||||
.. meta::
|
||||
:description: HSA runtime implementation
|
||||
:keywords: ROCR, ROCm, library, tool, runtime
|
||||
|
||||
.. _what-is-rocr-runtime:
|
||||
|
||||
What is ROCR?
|
||||
========================
|
||||
|
||||
The ROCm runtime (ROCR) is AMD's implementation of the HSA runtime, which is a thin, user-mode API that exposes the necessary interfaces to access and interact with graphics hardware driven by the AMDGPU driver set and the ROCK kernel driver. Together they enable you to directly harness the power of discrete AMD graphics devices by allowing host applications to launch compute kernels directly to the graphics hardware.
|
||||
|
||||
The ROCR APIs are capable of the following:
|
||||
|
||||
- Error handling
|
||||
|
||||
- Runtime initialization and shutdown
|
||||
|
||||
- System and agent information
|
||||
|
||||
- Signals and synchronization
|
||||
|
||||
- Architected dispatch
|
||||
|
||||
- Memory management
|
||||
|
||||
- Fitting into a typical software architecture stack
|
||||
|
||||
ROCR provides direct access to the graphics hardware, allowing you more control over execution. An example of low-level hardware access is the support for one or more user-mode queues, which provides a low-latency kernel dispatch interface, allowing you to develop customized dispatch algorithms specific to your application.
|
||||
The HSA Architected Queuing Language (AQL) is an open standard defined by the HSA Foundation, which specifies the packet syntax used to control supported AMD or ATI Radeon © graphics devices. AQL supports several packet types, including packets that can command the hardware to automatically resolve inter-packet dependencies (barrier AND and barrier OR packets), kernel dispatch packets, and agent dispatch packets.
|
||||
In addition to user-mode queues and AQL, the HSA runtime exposes various virtual address ranges that can be accessed by one or more of the system’s graphics devices and possibly also by the host. The exposed virtual address ranges support either fine-grained or coarse-grained access. Updates to memory in a fine-grained region are immediately visible to all devices that can access it, but only one device can have access to a coarse-grained allocation at a time. You can change the ownership of a coarse-grained region using the HSA runtime memory APIs, but the host application must perform this transfer of ownership explicitly.
|
||||
|
||||
For a complete description of the HSA Runtime APIs, AQL, and the HSA memory policy, refer to the `HSA Runtime Programmer’s Reference Manual <https://hsafoundation.com/wp-content/uploads/2021/02/HSA-Runtime-1.2.pdf>`_.
|
||||