amd-strix-halo-vllm-toolboxes/benchmarks at 0d8afba0935edd7ea5c6971294fa4ed0a6ec573d - amd-strix-halo-vllm-toolboxes - BadStorm.xyz - Code Hub

AI/amd-strix-halo-vllm-toolboxes

Donato Capitella 965cd2c339 feat: Improve Ray node detection, enable cluster-wide vLLM cache clearing, and enforce eager mode for benchmarks.

2026-02-01 21:35:27 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| benchmark_results | updates | 2025-12-20 11:37:06 +00:00 |
| benchmark_results_rocm_attn/benchmark_results | added ROCm/Triton attention comparison | 2025-12-20 11:49:03 +00:00 |
| find_max_context.py | updates | 2025-12-20 11:37:06 +00:00 |
| max_context_results.json | feat: Enhance vLLM benchmarking to compare Triton and ROCm attention, introduce a new script for cluster configuration, and update Dockerfile for new tools and dependencies. | 2026-02-01 19:36:07 +00:00 |
| run_vllm_bench.py | feat: centralize model configurations and benchmark settings into a new models.py module and update Dockerfile and scripts to use it. | 2026-02-01 21:17:15 +00:00 |
| vllm_cluster_bench.py | feat: Improve Ray node detection, enable cluster-wide vLLM cache clearing, and enforce eager mode for benchmarks. | 2026-02-01 21:35:27 +00:00 |