From d23d18f423d661556a5091f0dc9f27dec2adf67f Mon Sep 17 00:00:00 2001
From: David DeBonis
Date: Mon, 29 Sep 2025 10:11:21 -0600
Subject: [PATCH] Adding usage tip for ignore cpu affinity (#1948)

* Adding usage tip for ignore cpu affinity

* Update docs/how-to/rccl-usage-tips.rst

Co-authored-by: Jeffrey Novotny

* Update docs/how-to/rccl-usage-tips.rst

Co-authored-by: Jeffrey Novotny

---------

Co-authored-by: Jeffrey Novotny
---
 docs/how-to/rccl-usage-tips.rst | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/docs/how-to/rccl-usage-tips.rst b/docs/how-to/rccl-usage-tips.rst
index 2c08f63b38..352ccdef83 100644
--- a/docs/how-to/rccl-usage-tips.rst
+++ b/docs/how-to/rccl-usage-tips.rst
@@ -82,6 +82,20 @@ set the HSA environment variable as follows:
 
 This feature requires GPUs that support peer-to-peer access along with proper large BAR addressing support.
 
+Ignoring CPU affinity with multi-node
+=====================================
+
+Depending on the job launcher and the requirements of your workload, performance as the communication workload scales
+can be improved by setting ``NCCL_IGNORE_CPU_AFFINITY``. This allows the RCCL communication library to
+ignore the job's supplied CPU affinity and use the GPU affinity only.
+
+.. code-block:: shell
+
+    NCCL_IGNORE_CPU_AFFINITY=1
+
+For general usage, this environment variable is not set so it doesn't interfere with the user or launcher
+supplied preferences.
+
 Improving performance on the MI300X
 ===================================
 
@@ -262,4 +276,4 @@ To disable context tracking for Radeon GPUs, set the following environment varia
 
 .. code-block:: shell
 
-    export RCCL_DISABLE_CONTEXT_TRACKING=1
\ No newline at end of file
+    export RCCL_DISABLE_CONTEXT_TRACKING=1
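
Note on the added tip: the patch shows the bare assignment ``NCCL_IGNORE_CPU_AFFINITY=1`` and says the variable is normally left unset. One way to honor both, sketched below, is to scope the setting to a single launch. This is only an illustration under assumptions: `launch_job` is a hypothetical placeholder for whatever launcher command the job really uses, and here it merely prints the value a launched process would inherit.

```shell
#!/bin/sh
# Start from the documented default: the variable is not set.
unset NCCL_IGNORE_CPU_AFFINITY

# Hypothetical stand-in for the real job launcher command.
# It starts a child process and prints the value that child inherits.
launch_job() {
    sh -c 'echo "NCCL_IGNORE_CPU_AFFINITY=${NCCL_IGNORE_CPU_AFFINITY:-unset}"'
}

# Default launch: the user or launcher supplied CPU affinity is left alone.
default_view=$(launch_job)

# Opt-in launch: export inside the command-substitution subshell so the
# setting applies to this one launch only.
optin_view=$(export NCCL_IGNORE_CPU_AFFINITY=1; launch_job)

# Later launches from the same shell see the default again.
after_view=$(launch_job)

echo "default: $default_view"
echo "opt-in:  $optin_view"
echo "after:   $after_view"
```

Exporting inside a subshell keeps the opt-in from leaking into other jobs started from the same session. Whether a given launcher forwards the variable to remote nodes depends on that launcher and is not covered by the patch.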