amd-strix-halo-vllm-toolboxes

Author	SHA1	Message	Date
Donato Capitella	16405e8943	config: Add VLLM_DISABLE_COMPILE_CACHE=1 to environment variables across VLLM scripts.	2026-03-09 14:07:43 +00:00
Donato Capitella	b035bcb482	updated benchmarks including thunderbolt and configuration guides	2026-02-25 10:48:42 +00:00
Donato Capitella	6875f62ccf	improve benchmarks	2026-02-25 09:29:46 +00:00
Donato Capitella	1ddcb9a202	feat: Configure ROCm attention via `--attention-backend` CLI argument, disable the Ray dashboard, and make eager mode configurable for cluster benchmarks.	2026-02-02 15:40:16 +00:00
Donato Capitella	6f118ff936	feat: Update ROCm benchmark result paths, improve cluster node discovery and cache clearing, and refine cluster benchmark result directory.	2026-02-02 07:35:50 +00:00
Donato Capitella	ba503f6e61	feat: centralize model configurations and benchmark settings into a new `models.py` module and update Dockerfile and scripts to use it.	2026-02-01 21:17:15 +00:00
Donato Capitella	a1105a0b96	feat: Enhance vLLM benchmarking to compare Triton and ROCm attention, introduce a new script for cluster configuration, and update Dockerfile for new tools and dependencies.	2026-02-01 19:36:07 +00:00
Donato Capitella	e5cc96bf48	feat: Introduce vLLM cluster benchmarking and setup scripts, and expand the list of models for local benchmarks.	2026-02-01 15:43:56 +00:00
Donato Capitella	5e8b6bb545	updates	2025-12-20 11:37:06 +00:00