Fixed typos due to copy/paste

Donato Capitella
2025-09-04 17:22:18 +01:00
parent 7e17fa8660
commit f8db65e8d7
+2 -2
@@ -23,8 +23,8 @@ This setup is **highly experimental** on ROCm/Strix Halo. Some models work; **ma
| `meta-llama/Llama-2-7b-chat-hf` | 7B FP16 | ✅ Works | (recommended) `--dtype float16` | Stable. |
| `Qwen/Qwen3-30B-A3B-Instruct-2507` | 30B (A3B) FP16 | ✅ Works | (recommended) `--dtype float16` | |
| `Google/Gemma3-27B-Instruct` | 27B FP16 | ✅ Works | (recommended) `--dtype float16` | Slow |
-| `Google/Gemma3-12B-Instruct` | 12B FP16 | ✅ Works | (recommended) `--dtype float16` | Slow |
-| `Google/Gemma3-4B-Instruct` |4B FP16 | ✅ Works | (recommended) `--dtype float16` | Slow |
+| `Google/Gemma3-12B-Instruct` | 12B FP16 | ✅ Works | (recommended) `--dtype float16` | |
+| `Google/Gemma3-4B-Instruct` |4B FP16 | ✅ Works | (recommended) `--dtype float16` | |
| `Qwen/Qwen3-14B-AWQ` | 14B AWQ | ✅ Works (with flags) | `--quantization awq --dtype float16 --enforce-eager` | On ROCm, eager avoids missing `awq_dequantize` during compile; vLLM autosets `VLLM_USE_TRITON_AWQ`. |
| `openai/gpt-oss-20b` | 20B MXFP4 | ❌ Fails | — | `ModuleNotFoundError: triton_kernels.matmul_ogs` (MXFP4 path not available in this image). |
| `zai-org/GLM-4.5-Air-FP8` | FP8 | ❌ Fails | — | `ValueError: type fp8e4nv not supported (only 'fp8e5')`. |