amd-strix-halo-vllm-toolboxes

Author	SHA1	Message	Date
Donato Capitella	8ff52abf4e	perf: Increase `max_num_seqs` for bus batch scaling and `OFF_NUM_PROMPTS` for steady-state throughput measurement on Strix Halo.	2026-02-02 22:36:15 +00:00
Donato Capitella	0109e6a19b	feat: Optimize model `max_num_seqs` and global benchmark parameters for Strix Halo, and centralize configurations in `models.py`.	2026-02-02 08:45:13 +00:00
Donato Capitella	ba503f6e61	feat: centralize model configurations and benchmark settings into a new `models.py` module and update Dockerfile and scripts to use it.	2026-02-01 21:17:15 +00:00