Donato Capitella | 4d3b046870 | feat: Add new benchmark results for various models and configurations, and update documentation UI with filtering for attention and tensor parallelism. | 2026-02-02 21:30:17 +00:00
Donato Capitella | 1f96c391fb | feat: Add comprehensive RDMA cluster setup guide, enforce eager mode in cluster benchmarks, and update documentation with cluster details. | 2026-02-02 19:34:33 +00:00
Donato Capitella | 1ddcb9a202 | feat: Configure ROCm attention via --attention-backend CLI argument, disable the Ray dashboard, and make eager mode configurable for cluster benchmarks. | 2026-02-02 15:40:16 +00:00
Donato Capitella | 9c6d32e326 | updating max context results | 2026-02-02 11:56:26 +00:00
Donato Capitella | 0109e6a19b | feat: Optimize model max_num_seqs and global benchmark parameters for Strix Halo, and centralize configurations in models.py. | 2026-02-02 08:45:13 +00:00
Donato Capitella | 6f118ff936 | feat: Update ROCm benchmark result paths, improve cluster node discovery and cache clearing, and refine cluster benchmark result directory. | 2026-02-02 07:35:50 +00:00
Donato Capitella | c587981d73 | refactor: Centralize Ray/vLLM cluster management into a new cluster_manager.py module and refactor start_vllm_cluster.py to use it. | 2026-02-01 22:19:34 +00:00
Donato Capitella | 128ddade14 | fix: improve RDMA stability by configuring NCCL IB timeout and retry count. | 2026-02-01 22:04:34 +00:00
Donato Capitella | 965cd2c339 | feat: Improve Ray node detection, enable cluster-wide vLLM cache clearing, and enforce eager mode for benchmarks. | 2026-02-01 21:35:27 +00:00
Donato Capitella | ba503f6e61 | feat: centralize model configurations and benchmark settings into a new models.py module and update Dockerfile and scripts to use it. | 2026-02-01 21:17:15 +00:00
Donato Capitella | a1105a0b96 | feat: Enhance vLLM benchmarking to compare Triton and ROCm attention, introduce a new script for cluster configuration, and update Dockerfile for new tools and dependencies. | 2026-02-01 19:36:07 +00:00
Donato Capitella | e5cc96bf48 | feat: Introduce vLLM cluster benchmarking and setup scripts, and expand the list of models for local benchmarks. | 2026-02-01 15:43:56 +00:00
Donato Capitella | 711de530f6 | added ROCm/Triton attention comparison | 2025-12-20 11:49:03 +00:00
Donato Capitella | 5e8b6bb545 | updates | 2025-12-20 11:37:06 +00:00