AI/amd-strix-halo-vllm-toolboxes
81 commits, 3 branches, 0 tags
8a20ec27b228905ffda2c8fbdf9abb121ce90464
Commit graph
5 Commits
Author
SHA1
Message
Date
Donato Capitella
e0fadf426b
force egaer mode to make gemma stable
2026-02-23 18:19:15 +00:00
Donato Capitella
49b85fc1fb
add MiniMax
2026-02-18 15:22:12 +00:00
Donato Capitella
8ff52abf4e
perf: Increase `max_num_seqs` for bus batch scaling and `OFF_NUM_PROMPTS` for steady-state throughput measurement on Strix Halo.
2026-02-02 22:36:15 +00:00
Donato Capitella
0109e6a19b
feat: Optimize model `max_num_seqs` and global benchmark parameters for Strix Halo, and centralize configurations in `models.py`.
2026-02-02 08:45:13 +00:00
Donato Capitella
ba503f6e61
feat: centralize model configurations and benchmark settings into a new `models.py` module and update Dockerfile and scripts to use it.
2026-02-01 21:17:15 +00:00