| Author | Commit | Message | Date |
|---|---|---|---|
| Donato Capitella | 90c5fe9f83 | docs: Standardize Fedora OS version references and update IOMMU kernel parameter from amd_iommu=off to iommu=pt in documentation. | 2026-02-03 08:34:56 +00:00 |
| Donato Capitella | fde8f520d9 | feat: Update benchmark results across various models and configurations, increasing num_requests from 100 to 200. | 2026-02-03 08:31:54 +00:00 |
| Donato Capitella | 8ff52abf4e | perf: Increase max_num_seqs for bus batch scaling and OFF_NUM_PROMPTS for steady-state throughput measurement on Strix Halo. | 2026-02-02 22:36:15 +00:00 |
| Donato Capitella | 4d3b046870 | feat: Add new benchmark results for various models and configurations, and update documentation UI with filtering for attention and tensor parallelism. | 2026-02-02 21:30:17 +00:00 |
| Donato Capitella | 1f96c391fb | feat: Add comprehensive RDMA cluster setup guide, enforce eager mode in cluster benchmarks, and update documentation with cluster details. | 2026-02-02 19:34:33 +00:00 |
| Donato Capitella | 6f118ff936 | feat: Update ROCm benchmark result paths, improve cluster node discovery and cache clearing, and refine cluster benchmark result directory. | 2026-02-02 07:35:50 +00:00 |
| Donato Capitella | a1105a0b96 | feat: Enhance vLLM benchmarking to compare Triton and ROCm attention, introduce a new script for cluster configuration, and update Dockerfile for new tools and dependencies. | 2026-02-01 19:36:07 +00:00 |
| Donato Capitella | 711de530f6 | added ROCm/Triton attention comparison | 2025-12-20 11:49:03 +00:00 |
| Donato Capitella | 5e8b6bb545 | updates | 2025-12-20 11:37:06 +00:00 |