AI/amd-strix-halo-vllm-toolboxes
amd-strix-halo-vllm-toolboxes/benchmarks at commit 1f96c391fb69b9f22834124e7dd08f43b4c818e7
Latest commit: 1f96c391fb by Donato Capitella, 2026-02-02 19:34:33 +00:00
feat: Add comprehensive RDMA cluster setup guide, enforce eager mode in cluster benchmarks, and update documentation with cluster details.
benchmark_results
updates
2025-12-20 11:37:06 +00:00
benchmark_results_rocm_attn/benchmark_results
added ROCm/Triton attention comparison
2025-12-20 11:49:03 +00:00
find_max_context.py
feat: Optimize model `max_num_seqs` and global benchmark parameters for Strix Halo, and centralize configurations in `models.py`.
2026-02-02 08:45:13 +00:00
max_context_results.json
updating max context results
2026-02-02 11:56:26 +00:00
run_vllm_bench.py
feat: Configure ROCm attention via `--attention-backend` CLI argument, disable the Ray dashboard, and make eager mode configurable for cluster benchmarks.
2026-02-02 15:40:16 +00:00
vllm_cluster_bench.py
feat: Add comprehensive RDMA cluster setup guide, enforce eager mode in cluster benchmarks, and update documentation with cluster details.
2026-02-02 19:34:33 +00:00