From 90c5fe9f83ceb9859ba70e31bbd832dd1ad6d9b9 Mon Sep 17 00:00:00 2001
From: Donato Capitella
Date: Tue, 3 Feb 2026 08:34:56 +0000
Subject: [PATCH] docs: Standardize Fedora OS version references and update IOMMU kernel parameter from `amd_iommu=off` to `iommu=pt` in documentation.

---
 README.md                            | 6 +++---
 docs/index.html                      | 2 +-
 rdma_cluster/setup_guide.md          | 4 ++--
 rdma_cluster/troubleshooting_rccl.md | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 92c7cc8..aa18819 100644
--- a/README.md
+++ b/README.md
@@ -176,17 +176,17 @@ This should work on any Strix Halo. For a complete list of available hardware, s
 | **CPU** | Ryzen AI MAX+ 395 "Strix Halo" |
 | **System Memory** | 128 GB RAM |
 | **GPU Memory** | 512 MB allocated in BIOS |
-| **Host OS** | Fedora 43 (Rawhide), Linux 6.18.5-200.fc43.x86_64 |
+| **Host OS** | Fedora 43, Linux 6.18.5-200.fc43.x86_64 |

 ### 6.2 Kernel Parameters (tested on Fedora 42)

 Add these boot parameters to enable unified memory while reserving a minimum of 4 GiB for the OS (max 124 GiB for iGPU):

-amd_iommu=pt amdgpu.gttsize=126976 ttm.pages_limit=32505856
+iommu=pt amdgpu.gttsize=126976 ttm.pages_limit=32505856

 | Parameter | Purpose |
 |-----------------------------|--------------------------------------------------------------------------------------------|
-| `amd_iommu=off` | Disables AMD IOMMU to reduce overhead for better performance |
+| `iommu=pt` | Sets IOMMU to "Pass-Through" mode. This helps performance, reducing overhead for both the RDMA NIC and the iGPU unified memory access. |
 | `amdgpu.gttsize=126976` | Caps GPU unified memory to 124 GiB; 126976 MiB ÷ 1024 = 124 GiB |
 | `ttm.pages_limit=32505856` | Caps pinned memory to 124 GiB; 32505856 × 4 KiB = 126976 MiB = 124 GiB |
diff --git a/docs/index.html b/docs/index.html
index 43d15c2..ecc9862 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -506,7 +506,7 @@
 OS/Kernel
-Fedora 43 (Rawhide) · Linux 6.18.5-200.fc43.x86_64
+Fedora 43 · Linux 6.18.5-200.fc43.x86_64
 Interconnect
diff --git a/rdma_cluster/setup_guide.md b/rdma_cluster/setup_guide.md
index 32bf684..4c649de 100644
--- a/rdma_cluster/setup_guide.md
+++ b/rdma_cluster/setup_guide.md
@@ -73,8 +73,8 @@ Perform these steps on the **Host OS** (Fedora 43) of **both nodes**.
 | Node | Kernel | OS | IP (RDMA Interface) |
 | :--- | :--- | :--- | :--- |
-| **Node 1** | `6.18.5-200.fc43.x86_64` | Fedora Linux 43 (Rawhide) | `192.168.100.1/30` |
-| **Node 2** | `6.18.6-200.fc43.x86_64` | Fedora Linux 43 (Rawhide) | `192.168.100.2/30` |
+| **Node 1** | `6.18.5-200.fc43.x86_64` | Fedora Linux 43 | `192.168.100.1/30` |
+| **Node 2** | `6.18.6-200.fc43.x86_64` | Fedora Linux 43 | `192.168.100.2/30` |

 > **Note:** These specific kernel versions were verified to work. Fedora 43 is recommended.
diff --git a/rdma_cluster/troubleshooting_rccl.md b/rdma_cluster/troubleshooting_rccl.md
index e172231..0644df4 100644
--- a/rdma_cluster/troubleshooting_rccl.md
+++ b/rdma_cluster/troubleshooting_rccl.md
@@ -43,7 +43,7 @@ I have established a stable low-latency RDMA link and a functional Ray cluster o
 * **Interconnect:** Direct connection via Intel Ethernet Controller E810-CQDA1.
 * **Protocol:** RoCE v2 (RDMA over Converged Ethernet).

-### Host Software (Fedora Rawhide)
+### Host Software (Fedora)
 | Node | Hostname | Kernel | OS | IP (RDMA Interface) |
 | :--- | :--- | :--- | :--- | :--- |
 | **Node 1** | `frmwk-dsk` | `6.18.5-200.fc43.x86_64` | Fedora Linux 43 | `192.168.100.1/30` |
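
Reviewer note (not part of the patch): the two 124 GiB caps in the README kernel-parameter table come from the same arithmetic, so they can be cross-checked or recomputed for a different reservation. The following is a minimal sketch, assuming 128 GiB of system memory, a 4 GiB reservation for the OS, and 4 KiB pages, as the README states. The function name `unified_memory_params` is hypothetical and does not exist in the repository.

```python
def unified_memory_params(total_gib=128, reserved_gib=4, page_kib=4):
    """Return (amdgpu.gttsize in MiB, ttm.pages_limit in pages).

    Assumptions (taken from the README text, not from the kernel docs):
    gttsize is expressed in MiB and pages_limit counts 4 KiB pages.
    """
    usable_gib = total_gib - reserved_gib          # 124 GiB left for the iGPU
    gttsize_mib = usable_gib * 1024                # GiB -> MiB
    pages_limit = gttsize_mib * 1024 // page_kib   # MiB -> KiB -> pages
    return gttsize_mib, pages_limit


gttsize, pages = unified_memory_params()
# Assemble the boot parameter string in the form used by the README.
print(f"iommu=pt amdgpu.gttsize={gttsize} ttm.pages_limit={pages}")
```

With the defaults this reproduces the values in the patched README line; passing a larger `reserved_gib` lowers both caps together, which keeps the two parameters consistent with each other.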