amd-strix-halo-vllm-toolboxes

Author	SHA1	Message	Date
Donato Capitella	16405e8943	config: Add VLLM_DISABLE_COMPILE_CACHE=1 to environment variables across VLLM scripts.	2026-03-09 14:07:43 +00:00
Donato Capitella	e726d406fa	updated benchmarks, fix start-vllm	2026-02-23 19:39:19 +00:00
Donato Capitella	49b85fc1fb	add MiniMax	2026-02-18 15:22:12 +00:00
Donato Capitella	1ddcb9a202	feat: Configure ROCm attention via `--attention-backend` CLI argument, disable the Ray dashboard, and make eager mode configurable for cluster benchmarks.	2026-02-02 15:40:16 +00:00
Donato Capitella	ba503f6e61	feat: centralize model configurations and benchmark settings into a new `models.py` module and update Dockerfile and scripts to use it.	2026-02-01 21:17:15 +00:00
Donato Capitella	039484a41e	Updated name of card	2025-12-24 08:13:34 +00:00
Donato Capitella	3b0e736c94	feat: Implement dynamic model discovery from benchmark results, add benchmark notes, and include `dialog` dependency.	2025-12-20 12:31:20 +00:00
Donato Capitella	5e8b6bb545	updates	2025-12-20 11:37:06 +00:00