AI/amd-strix-halo-vllm-toolboxes
0d8afba0935edd7ea5c6971294fa4ed0a6ec573d
Commit graph
6 commits
Author
SHA1
Message
Date
Donato Capitella
965cd2c339
feat: Improve Ray node detection, enable cluster-wide vLLM cache clearing, and enforce eager mode for benchmarks.
2026-02-01 21:35:27 +00:00
Donato Capitella
ba503f6e61
feat: centralize model configurations and benchmark settings into a new models.py module and update Dockerfile and scripts to use it.
2026-02-01 21:17:15 +00:00
Donato Capitella
a1105a0b96
feat: Enhance vLLM benchmarking to compare Triton and ROCm attention, introduce a new script for cluster configuration, and update Dockerfile for new tools and dependencies.
2026-02-01 19:36:07 +00:00
Donato Capitella
e5cc96bf48
feat: Introduce vLLM cluster benchmarking and setup scripts, and expand the list of models for local benchmarks.
2026-02-01 15:43:56 +00:00
Donato Capitella
711de530f6
added ROCm/Triton attention comparison
2025-12-20 11:49:03 +00:00
Donato Capitella
5e8b6bb545
updates
2025-12-20 11:37:06 +00:00