.. meta::
  :description: Compilation workflow of the HIP compilers.
  :keywords: AMD, ROCm, HIP, CUDA, HIP runtime API

.. _hip_compilers:

********************************************************************************
HIP compilers
********************************************************************************
ROCm provides the compiler driver ``hipcc``, which can be used on AMD ROCm and
NVIDIA CUDA platforms.

On ROCm, ``hipcc`` takes care of the following:

- Setting the default library and include paths for HIP
- Setting some environment variables
- Invoking the appropriate compiler, ``amdclang++``

On the NVIDIA CUDA platform, ``hipcc`` invokes the ``nvcc`` compiler.

``amdclang++`` is based on the ``clang++`` compiler. For more
details, see the :doc:`LLVM project<llvm-project:index>`.

HIPCC
================================================================================
Common Compiler Options
--------------------------------------------------------------------------------
The following table shows the most common compiler options supported by
``hipcc``.

.. list-table::
    :header-rows: 1

    *
      - Option
      - Description
    *
      - ``-fgpu-rdc``
      - Generate relocatable device code, which allows kernels or device functions
        to call device functions in different translation units.
    *
      - ``-ggdb``
      - Equivalent to ``-g`` plus tuning for GDB. This is recommended when using
        ROCm's GDB to debug GPU code.
    *
      - ``--gpu-max-threads-per-block=<num>``
      - Generate code to support up to the specified number of threads per block.
    *
      - ``--offload-arch=<target>``
      - Generate code for the given GPU target.
        For a full list of supported compilation targets, see the `processor names in AMDGPU's LLVM documentation <https://llvm.org/docs/AMDGPUUsage.html#processors>`_.
        This option can appear multiple times to generate a fat binary for multiple
        targets.
        The targets actually supported by the platform's runtime may differ.
    *
      - ``-save-temps``
      - Save the compiler-generated intermediate files.
    *
      - ``-v``
      - Show the compilation steps.

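
As an illustration of how these options combine, the following sketch builds a
fat binary for two GPU targets while keeping the intermediate files and printing
the compilation steps. The source file name ``saxpy.cpp`` and the targets
``gfx90a`` and ``gfx1100`` are assumptions chosen for the example; substitute
the targets that match your hardware. The command is only executed if ``hipcc``
and the source file are present.

.. code-block:: shell

    # Hypothetical source file and GPU targets; adjust to your project.
    SRC=saxpy.cpp
    ARCH_FLAGS=""
    for arch in gfx90a gfx1100; do
        # One --offload-arch option per target produces a fat binary.
        ARCH_FLAGS="$ARCH_FLAGS --offload-arch=$arch"
    done
    CMD="hipcc$ARCH_FLAGS -save-temps -v $SRC -o saxpy"
    echo "$CMD"

    # Run the command only if hipcc is installed and the source file exists.
    if command -v hipcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
        $CMD
    fi
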
Linking
--------------------------------------------------------------------------------
``hipcc`` adds the necessary libraries for HIP as well as for the accelerator
compiler (``nvcc`` or ``amdclang++``). We recommend linking with ``hipcc`` since
it automatically links the binary to the necessary HIP runtime libraries.

Linking Code With Other Compilers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
By default, ``nvcc`` uses ``g++`` to generate the host code.

``amdclang++`` generates both device and host code. The generated code uses the
same ABI as ``gcc``, which allows code generated by different ``gcc``-compatible
compilers to be linked together. For example, code compiled using ``amdclang++``
can link with code compiled using compilers such as ``gcc``, ``icc`` and
``clang``. Take care to ensure that all compilers use the same C++ standard
library headers and library formats.

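
A minimal sketch of such a mixed build is shown below, assuming a project with
two hypothetical files: ``main.cpp``, which contains host-only C++ code, and
``kernels.cpp``, which contains HIP code. The host-only file is compiled with
``g++``, the HIP file with ``hipcc``, and the final link is done with ``hipcc``
so that the HIP runtime libraries are added automatically. The commands are only
executed if ``hipcc`` and both source files are present.

.. code-block:: shell

    # Hypothetical file names: main.cpp is host-only C++, kernels.cpp contains HIP code.
    HOST_CMD="g++ -c main.cpp -o main.o"
    DEVICE_CMD="hipcc -c kernels.cpp -o kernels.o"
    LINK_CMD="hipcc main.o kernels.o -o app"

    # Print the three steps, then run them only if the tools and sources are present.
    printf '%s\n' "$HOST_CMD" "$DEVICE_CMD" "$LINK_CMD"
    if command -v hipcc >/dev/null 2>&1 && [ -f main.cpp ] && [ -f kernels.cpp ]; then
        $HOST_CMD && $DEVICE_CMD && $LINK_CMD
    fi
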
libc++ and libstdc++
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``hipcc`` links to ``libstdc++`` by default. This provides better compatibility
between ``g++`` and HIP.

To link to ``libc++`` instead, pass ``--stdlib=libc++`` to ``hipcc``.
Generally, ``libc++`` provides a broader set of C++ features, while
``libstdc++`` is the standard for more compilers, notably including ``g++``.

When cross-linking C++ code, any C++ functions that use types from the C++
standard library, such as ``std::string``, ``std::vector`` and other containers,
must use the same standard-library implementation. This includes cross-linking
between ``amdclang++`` and other compilers.

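
The following sketch shows one way to build against ``libc++`` and check the
result. It assumes a hypothetical source file ``main.cpp``, a system with
``libc++`` installed, and a dynamically linked binary, so that ``ldd`` can
report which standard library it depends on. The build is only attempted if
``hipcc`` and the source file are present.

.. code-block:: shell

    SRC=main.cpp   # hypothetical source file
    CMD="hipcc --stdlib=libc++ $SRC -o app"
    echo "$CMD"

    if command -v hipcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
        $CMD
        # List the C++ standard library the binary was linked against.
        ldd ./app | grep -E 'libc\+\+|libstdc\+\+'
    fi
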
HIP compilation workflow
================================================================================
HIP provides a flexible compilation workflow that supports both offline
compilation and runtime or just-in-time (JIT) compilation. Each approach has
advantages depending on the use case, target architecture, and performance
needs.

Offline compilation is ideal for production environments, where performance is
critical and the target GPU architecture is known in advance.

Runtime compilation is useful in development environments or when distributing
software that must run on a wide range of hardware without knowing the GPU in
advance. It provides flexibility at the cost of some performance overhead.

Offline compilation
--------------------------------------------------------------------------------
HIP code is compiled in two stages: a device-code compilation stage and a
host-code compilation stage.

- Device-code compilation stage: The compiled device code is embedded into the
  host object file. Depending on the platform, the device code can be compiled
  into assembly or binary. ``nvcc`` and ``amdclang++`` target different
  architectures and use different code object formats. ``nvcc`` uses the binary
  ``cubin`` or the assembly PTX files, while ``amdclang++`` uses the binary
  ``hsaco`` format. On CUDA platforms, the driver compiles the PTX files to
  executable code at runtime.

- Host-code compilation stage: On the host side, ``hipcc`` or ``amdclang++`` can
  compile the host code in one step without other C++ compilers. ``nvcc``, on
  the other hand, only replaces the ``<<<...>>>`` kernel launch syntax with the
  appropriate CUDA runtime function call and passes the modified host code to
  the default host compiler.

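
To look at the device-code stage in isolation on ROCm, the device code can be
compiled into a standalone code object. The sketch below assumes that your
``hipcc`` version accepts the ``--genco`` option for this purpose; the file
names ``kernels.cpp`` and ``kernels.hsaco`` and the target ``gfx90a`` are
placeholders chosen for the example. The command is only executed if ``hipcc``
and the source file are present.

.. code-block:: shell

    ARCH=gfx90a        # assumed GPU target; pick one that matches your hardware
    SRC=kernels.cpp    # hypothetical file containing only device code
    CMD="hipcc --genco --offload-arch=$ARCH $SRC -o kernels.hsaco"
    echo "$CMD"

    # Run the command only if hipcc is installed and the source file exists.
    if command -v hipcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
        $CMD
    fi
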
For an example of how to compile HIP from the command line, see the :ref:`SAXPY
tutorial<compiling_on_the_command_line>`.

Runtime compilation
--------------------------------------------------------------------------------
HIP allows you to compile kernels at runtime using the ``hiprtc*`` API. Kernels
are stored as a text string, which is passed to HIPRTC alongside options to
guide the compilation.

For more details, see
:doc:`HIP runtime compiler <../how-to/hip_rtc>`.

Static libraries
================================================================================
``hipcc`` supports generating two types of static libraries.

- The first type of static library only exports and launches host functions
  within the same library and not the device functions. This library type offers
  the ability to link with a non-hipcc compiler such as ``gcc``. Additionally,
  this library type contains host objects with device code embedded as fat
  binaries. This library type is generated using the flag ``--emit-static-lib``:

  .. code-block:: shell

    hipcc hipOptLibrary.cpp --emit-static-lib -fPIC -o libHipOptLibrary.a
    gcc test.cpp -L. -lHipOptLibrary -L/path/to/hip/lib -lamdhip64 -o test.out

- The second type of static library exports device functions to be linked by
  other code objects by using ``hipcc`` as the linker. This library type
  contains relocatable device objects and is generated using ``ar``:

  .. code-block:: shell

    hipcc hipDevice.cpp -c -fgpu-rdc -o hipDevice.o
    ar rcsD libHipDevice.a hipDevice.o
    hipcc libHipDevice.a test.cpp -fgpu-rdc -o test.out

Full examples for both library types are available in the rocm-examples
repository: see the examples for
`static host libraries <https://github.com/ROCm/rocm-examples/tree/develop/HIP-Basic/static_host_library>`_
and `static device libraries <https://github.com/ROCm/rocm-examples/tree/develop/HIP-Basic/static_device_library>`_.