AI/amd-strix-halo-vllm-toolboxes
74 commits, 3 branches, 0 tags
e0fadf426b767aaa0d4fafafa259fbfa1ce1a4b0
Commit Graph
7 commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
| Donato Capitella | 90c5fe9f83 | docs: Standardize Fedora OS version references and update IOMMU kernel parameter from `amd_iommu=off` to `iommu=pt` in documentation. | 2026-02-03 08:34:56 +00:00 |
| Donato Capitella | 8ff52abf4e | perf: Increase `max_num_seqs` for bus batch scaling and `OFF_NUM_PROMPTS` for steady-state throughput measurement on Strix Halo. | 2026-02-02 22:36:15 +00:00 |
| Donato Capitella | 4d3b046870 | feat: Add new benchmark results for various models and configurations, and update documentation UI with filtering for attention and tensor parallelism. | 2026-02-02 21:30:17 +00:00 |
| Donato Capitella | 1f96c391fb | feat: Add comprehensive RDMA cluster setup guide, enforce eager mode in cluster benchmarks, and update documentation with cluster details. | 2026-02-02 19:34:33 +00:00 |
| Donato Capitella | a1105a0b96 | feat: Enhance vLLM benchmarking to compare Triton and ROCm attention, introduce a new script for cluster configuration, and update Dockerfile for new tools and dependencies. | 2026-02-01 19:36:07 +00:00 |
| Donato Capitella | 711de530f6 | added ROCm/Triton attention comparison | 2025-12-20 11:49:03 +00:00 |
| Donato Capitella | 5e8b6bb545 | updates | 2025-12-20 11:37:06 +00:00 |