Files
nawrinsu 6d22ce9b1a Fix protocol and channel override when tuner is used (#1985)
* Fix protocol and channel override when tuner is used

* Added comment

* Fix README for basic tuner implementation

[ROCm/rccl commit: 166268d715]
2025-11-03 13:56:34 -08:00

198 righe
5.9 KiB
Markdown

# Basic NCCL Tuner Plugin
This directory contains a minimal placeholder implementation of an NCCL tuner plugin. It serves as a starting point for developing custom tuner plugins by providing the essential function stubs and interface structure required by NCCL.
## Purpose
This basic plugin is designed to:
- Provide a minimal working example of the NCCL tuner plugin interface
- Serve as a template for developing custom tuner plugins
- Demonstrate the required function signatures and structure
- Implement placeholder functionality that can be extended
## Implementation Details
The plugin implements the following functions:
### `pluginInit`
```c
ncclResult_t pluginInit(size_t nRanks, size_t nNodes, ncclDebugLogger_t logFunction, void **context)
```
- **Purpose**: Initialize the plugin with communicator information
- **Current Implementation**: Simple placeholder that returns success
- **Parameters**:
- `nRanks`: Total number of ranks in the communicator
- `nNodes`: Total number of nodes in the communicator
- `logFunction`: NCCL debug logging function
- `context`: Plugin context pointer (output)
### `pluginGetCollInfo`
```c
ncclResult_t pluginGetCollInfo(void* context, ncclFunc_t collType, size_t nBytes,
int numPipeOps, float** collCostTable, int numAlgo, int numProto,
int regBuff, int* nChannels)
```
- **Purpose**: Modify cost tables for collective operations
- **Current Implementation**:
- Sets RING+SIMPLE algorithm to cost 0.0 (highest preference)
- Sets channel count to 0
- **Parameters**:
- `context`: Plugin context from init
- `collType`: Type of collective operation
- `nBytes`: Message size in bytes
- `numPipeOps`: Number of pipeline operations
- `collCostTable`: Cost table to modify
- `numAlgo`: Number of algorithms
- `numProto`: Number of protocols
- `regBuff`: Whether buffer can be registered
- `nChannels`: Number of channels to use (output)
### `pluginDestroy`
```c
ncclResult_t pluginDestroy(void* context)
```
- **Purpose**: Clean up plugin resources
- **Current Implementation**: Simple placeholder that returns success
## Cost Table Structure
The plugin demonstrates how to modify NCCL's cost tables:
```c
float (*table)[NCCL_NUM_PROTOCOLS] = (float (*)[NCCL_NUM_PROTOCOLS])collCostTable;
```
The cost table is a 2D array where:
- First dimension: Algorithm index (e.g., `NCCL_ALGO_RING`)
- Second dimension: Protocol index (e.g., `NCCL_PROTO_SIMPLE`)
- Values: Cost for that algorithm/protocol combination
### Cost Values
- **0.0**: Highest preference (lowest cost)
- **Positive values**: Relative costs (lower is better)
- **`NCCL_ALGO_PROTO_IGNORE`**: Disable this combination
## Building
```bash
make
```
This creates `libnccl-tuner-basic.so` which can be loaded by NCCL.
## Usage
### Loading the Plugin
```bash
export LD_LIBRARY_PATH=/path/to/basic:$LD_LIBRARY_PATH
mpirun -np 4 your_nccl_application
```
```bash
export NCCL_TUNER_PLUGIN=basic
export NCCL_TUNER_PLUGIN=libnccl-tuner-basic.so
export NCCL_TUNER_PLUGIN=/path/to/your/plugin/libnccl-tuner-basic.so
```
### Verifying Plugin Loading
Enable NCCL debug output to see if the plugin is loaded:
```bash
export NCCL_DEBUG=INFO
```
You should see messages indicating the tuner plugin is being used.
## Extending the Plugin
This basic plugin provides a foundation that you can extend:
### 1. Add Configuration Logic
Modify `pluginGetCollInfo` to implement your tuning strategy:
```c
__hidden ncclResult_t pluginGetCollInfo(void* context, ncclFunc_t collType, size_t nBytes,
int numPipeOps, float** collCostTable, int numAlgo, int numProto,
int regBuff, int* nChannels) {
// Your custom tuning logic here
if (nBytes < 1024) {
// Small message optimization
table[NCCL_ALGO_TREE][NCCL_PROTO_SIMPLE] = 0.0;
} else {
// Large message optimization
table[NCCL_ALGO_RING][NCCL_PROTO_LL128] = 0.0;
}
// Dynamic channel selection
*nChannels = (nBytes > 1024*1024) ? 4 : 1;
return ncclSuccess;
}
```
### 2. Add Context Management
Use the context pointer to store plugin state:
```c
struct pluginContext {
int initialized;
size_t nRanks;
size_t nNodes;
// Add your plugin-specific data here
};
```
### 3. Add File-Based Configuration
Read configuration from files, environment variables, or other sources.
### 4. Add Topology Awareness
Use the `nRanks` and `nNodes` parameters to implement topology-specific tuning.
## File Structure
```
basic/
├── README.md # This file
├── plugin.c # Plugin implementation
├── Makefile # Build configuration
└── nccl/ # NCCL header files
└── tuner.h # Tuner plugin interface definitions
```
## Next Steps
1. **Understand the Interface**: Study the function signatures and parameters
2. **Implement Your Logic**: Add your tuning strategy to `pluginGetCollInfo`
3. **Test Thoroughly**: Verify your plugin works with different message sizes and topologies
4. **Add Error Handling**: Implement proper error checking and resource management
5. **Document Your Changes**: Update this README with your specific implementation details
## Comparison with Example Plugin
- **Basic Plugin**: Minimal implementation, good for learning and simple use cases
- **Example Plugin**: Full-featured CSV-based configuration system, good for production use
Choose the basic plugin if you want to:
- Learn the tuner plugin interface
- Implement simple, hardcoded tuning strategies
- Build a custom plugin from scratch
Choose the example plugin if you want:
- File-based configuration
- Complex tuning strategies
- Production-ready features
## Resources
- [Parent Directory README](../README.md) - General tuner plugin development guide
- [Example Plugin](../example/README.md) - Fully featured implementation
This basic plugin provides the foundation you need to start developing custom NCCL tuner plugins. Extend it with your specific tuning logic and requirements.