Fichiers
rocm-systems/docs/install/docker-install.rst
T
Jeffrey Novotny bf7c130631 Refactor RCCL install guide into several pages (#1427)
* Refactor RCCL install guide into several pages

* Changes from code review and new docker guide

* Add missing entries to ToC

* Minor fixes

* Fix help strings

* Edits after review and remove extra white space
2024-11-27 15:34:26 -05:00

45 lignes
1.9 KiB
ReStructuredText

.. meta::
:description: Instruction on how to install the RCCL library for collective communication primitives using Docker
:keywords: RCCL, ROCm, library, API, install, Docker
.. _install-docker:
*****************************************
Running RCCL using Docker
*****************************************
To use Docker to run RCCL, Docker must already be installed on the system.
To build the Docker image and run the container, follow these steps.
#. Build the Docker image
By default, the Dockerfile uses ``docker.io/rocm/dev-ubuntu-22.04:latest`` as the base Docker image.
It then installs RCCL and rccl-tests (in both cases, it uses the version from the RCCL ``develop`` branch).
Use this command to build the Docker image:
.. code-block:: shell
docker build -t rccl-tests -f Dockerfile.ubuntu --pull .
The base Docker image, rccl repository, and rccl-tests repository can be modified
by using ``--build-args`` in the ``docker build`` command above. For example, to use a different base Docker image,
use this command:
.. code-block:: shell
docker build -t rccl-tests -f Dockerfile.ubuntu --build-arg="ROCM_IMAGE_NAME=rocm/dev-ubuntu-20.04" --build-arg="ROCM_IMAGE_TAG=6.2" --pull .
#. Launch an interactive Docker container on a system with AMD GPUs:
.. code-block:: shell
docker run -it --rm --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --network=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined rccl-tests /bin/bash
To run, for example, the ``all_reduce_perf`` test from rccl-tests on 8 AMD GPUs from inside the Docker container, use this command:
.. code-block:: shell
mpirun --allow-run-as-root -np 8 --mca pml ucx --mca btl ^openib -x NCCL_DEBUG=VERSION /workspace/rccl-tests/build/all_reduce_perf -b 1 -e 16G -f 2 -g 1
For more information on the rccl-tests options, see the `Usage guidelines <https://github.com/ROCm/rccl-tests#usage>`_ in the GitHub repository.