Update plugin to look for librccl-net.so. (#1768)

[ROCm/rccl commit: 71c788d4d7]
Author: Arm Patinyasakdikul
Date: 2025-06-26 16:59:38 -05:00
Committed by: GitHub
Parent: 2c02ee0a99
Commit: 32e80aedc0
2 changed files with 7 additions and 7 deletions
+6 -6
@@ -16,20 +16,20 @@ particular version of the GPU stack (such as NVIDIA CUDA), from the network code
particular version of the networking stack. Using this method, you can easily integrate any CUDA version
with any network stack version.
-NCCL network plugins are packaged as a shared library called ``libnccl-net.so``. The shared library
+NCCL network plugins are packaged as a shared library called ``librccl-net.so``. The shared library
contains one or more implementations of the NCCL Net API in the form of versioned structs,
which are filled with pointers to all required functions.
Plugin architecture
===================
-When NCCL is initialized, it searches for a ``libnccl-net.so`` library and dynamically loads it,
+When NCCL is initialized, it searches for a ``librccl-net.so`` library and dynamically loads it,
then searches for symbols inside the library.
The ``NCCL_NET_PLUGIN`` environment variable allows multiple plugins to coexist. If it's set, NCCL
-looks for a library named ``libnccl-net-${NCCL_NET_PLUGIN}.so``. It is therefore
-recommended that you name the library according to that pattern, with a symlink pointing from ``libnccl-net.so``
-to ``libnccl-net-${NCCL_NET_PLUGIN}.so``. This lets users select the correct plugin
+looks for a library named ``librccl-net-${NCCL_NET_PLUGIN}.so``. It is therefore
+recommended that you name the library according to that pattern, with a symlink pointing from ``librccl-net.so``
+to ``librccl-net-${NCCL_NET_PLUGIN}.so``. This lets users select the correct plugin
if there are multiple plugins in the path.
Struct versioning
@@ -169,7 +169,7 @@ Initialization
Setting ``NCCL_NET=<plugin name>`` ensures a specific network implementation is used, with
a matching ``name``. This is not to be confused with ``NCCL_NET_PLUGIN`` which defines a suffix for the
-``libnccl-net.so`` library name to load.
+``librccl-net.so`` library name to load.
* ``init`` - As soon as NCCL finds the plugin and the correct ``ncclNet`` symbol, it calls the ``init`` function. This allows the plugin to discover network devices and ensure they are usable.
If the ``init`` function does not return ``ncclSuccess``, then NCCL does not use the plugin and falls back to internal ones.
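
Note on the ``init`` fallback described in the hunk above: the rule "use the plugin only if ``init`` succeeds, otherwise fall back to an internal implementation" can be sketched as below. This is a minimal illustration, not RCCL's actual code. The types ``sketch_result_t`` and ``sketch_net_t`` and the functions ``init_ok``, ``init_fail`` and ``select_net`` are hypothetical names invented for this sketch; they only stand in for ``ncclResult_t``, the versioned ``ncclNet`` struct and the real selection logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ncclResult_t; only success vs. failure matters here. */
typedef enum { SKETCH_SUCCESS = 0, SKETCH_ERROR = 1 } sketch_result_t;

/* Hypothetical stand-in for a versioned net struct: a name plus an init pointer. */
typedef struct {
  const char *name;
  sketch_result_t (*init)(void);
} sketch_net_t;

/* Two example init functions: one that succeeds and one that fails. */
static sketch_result_t init_ok(void) { return SKETCH_SUCCESS; }
static sketch_result_t init_fail(void) { return SKETCH_ERROR; }

/* Return the plugin if it exists and its init succeeds, else the internal net. */
static const sketch_net_t *select_net(const sketch_net_t *plugin,
                                      const sketch_net_t *internal) {
  assert(internal != NULL); /* the sketch assumes an internal net always exists */
  if (plugin != NULL && plugin->init != NULL &&
      plugin->init() == SKETCH_SUCCESS) {
    return plugin;
  }
  return internal;
}
```

With a plugin whose ``init`` points at ``init_fail``, ``select_net`` returns the internal entry, which mirrors the documented fallback.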
+1 -1
@@ -22,7 +22,7 @@ enum ncclPluginType {
#define NUM_LIBS 3
static void *libHandles[NUM_LIBS];
static const char *pluginNames[NUM_LIBS] = { "NET", "TUNER", "PROFILER" };
-static const char *pluginPrefix[NUM_LIBS] = { "libnccl-net", "libnccl-tuner", "libnccl-profiler" };
+static const char *pluginPrefix[NUM_LIBS] = { "librccl-net", "librccl-tuner", "librccl-profiler" };
static const char *pluginFallback[NUM_LIBS] = { "Using internal net plugin.", "Using internal tuner plugin.", "" };
static unsigned long subsys[NUM_LIBS] = { NCCL_INIT|NCCL_NET, NCCL_INIT|NCCL_TUNING, NCCL_INIT };
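
Note on the naming rule touched by this commit: taken together, the prefix table and the documentation suggest that the library file name is formed from a per-type prefix plus an optional suffix. The sketch below illustrates that rule under stated assumptions; it is not the loader in the RCCL source. ``sketch_plugin_type``, ``sketch_prefix`` and ``build_plugin_name`` are hypothetical names, and applying the suffix pattern to the tuner and profiler prefixes is an assumption, since the documentation in this diff only describes ``NCCL_NET_PLUGIN`` for the net plugin.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the plugin types, used only to index the table. */
enum sketch_plugin_type { SKETCH_NET = 0, SKETCH_TUNER = 1, SKETCH_PROFILER = 2 };

/* Prefixes as they read after this commit. */
static const char *sketch_prefix[3] = {
  "librccl-net", "librccl-tuner", "librccl-profiler"
};

/* Write "<prefix>.so" or "<prefix>-<suffix>.so" into buf.
 * suffix would typically come from getenv("NCCL_NET_PLUGIN") and may be NULL.
 * Returns 0 on success, -1 if the name does not fit into buf. */
static int build_plugin_name(char *buf, size_t len,
                             enum sketch_plugin_type type, const char *suffix) {
  int n;
  assert(buf != NULL && len > 0);
  if (suffix != NULL && suffix[0] != '\0') {
    n = snprintf(buf, len, "%s-%s.so", sketch_prefix[type], suffix);
  } else {
    n = snprintf(buf, len, "%s.so", sketch_prefix[type]);
  }
  return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Under this sketch, an existing plugin installed only as ``libnccl-net.so`` would no longer match the generated name, which is why the documentation and the prefix table are updated in the same commit.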