amd-strix-halo-vllm-toolboxes

Author	SHA1	Message	Date
Donato Capitella	e0fadf426b	force eager mode to make gemma stable	2026-02-23 18:19:15 +00:00
Donato Capitella	49b85fc1fb	add MiniMax	2026-02-18 15:22:12 +00:00
Donato Capitella	8ff52abf4e	perf: Increase `max_num_seqs` for bus batch scaling and `OFF_NUM_PROMPTS` for steady-state throughput measurement on Strix Halo.	2026-02-02 22:36:15 +00:00
Donato Capitella	0109e6a19b	feat: Optimize model `max_num_seqs` and global benchmark parameters for Strix Halo, and centralize configurations in `models.py`.	2026-02-02 08:45:13 +00:00
Donato Capitella	ba503f6e61	feat: centralize model configurations and benchmark settings into a new `models.py` module and update Dockerfile and scripts to use it.	2026-02-01 21:17:15 +00:00