AI/amd-strix-halo-vllm-toolboxes
693757f5d945bb9f28949ea52ce76c7ea3cf1e42
amd-strix-halo-vllm-toolboxes/benchmarks
Donato Capitella
4d3b046870
feat: Add new benchmark results for various models and configurations, and update documentation UI with filtering for attention and tensor parallelism.
2026-02-02 21:30:17 +00:00
benchmark_results
feat: Add new benchmark results for various models and configurations, and update documentation UI with filtering for attention and tensor parallelism.
2026-02-02 21:30:17 +00:00
benchmark_results_rocm
feat: Add new benchmark results for various models and configurations, and update documentation UI with filtering for attention and tensor parallelism.
2026-02-02 21:30:17 +00:00
find_max_context.py
feat: Optimize model `max_num_seqs` and global benchmark parameters for Strix Halo, and centralize configurations in `models.py`.
2026-02-02 08:45:13 +00:00
max_context_results.json
updating max context results
2026-02-02 11:56:26 +00:00
run_vllm_bench.py
feat: Configure ROCm attention via `--attention-backend` CLI argument, disable the Ray dashboard, and make eager mode configurable for cluster benchmarks.
2026-02-02 15:40:16 +00:00
vllm_cluster_bench.py
feat: Add comprehensive RDMA cluster setup guide, enforce eager mode in cluster benchmarks, and update documentation with cluster details.
2026-02-02 19:34:33 +00:00