Donato Capitella
|
6f118ff936
|
feat: Update ROCm benchmark result paths, improve cluster node discovery and cache clearing, and refine cluster benchmark result directory.
|
2026-02-02 07:35:50 +00:00 |
|
Donato Capitella
|
c587981d73
|
refactor: Centralize Ray/vLLM cluster management into a new cluster_manager.py module and refactor start_vllm_cluster.py to use it.
|
2026-02-01 22:19:34 +00:00 |
|
Donato Capitella
|
128ddade14
|
fix: improve RDMA stability by configuring NCCL IB timeout and retry count.
|
2026-02-01 22:04:34 +00:00 |
|
Donato Capitella
|
b458b287d0
|
docs: update quickstart to recommend refresh_toolbox.sh for toolbox creation and detail its InfiniBand/RDMA detection capabilities.
|
2026-02-01 21:55:46 +00:00 |
|
Donato Capitella
|
0d8afba093
|
feat: Add RAY_DISABLE_METRICS=1 to disable Ray metrics across cluster configurations and scripts.
|
2026-02-01 21:52:48 +00:00 |
|
Donato Capitella
|
965cd2c339
|
feat: Improve Ray node detection, enable cluster-wide vLLM cache clearing, and enforce eager mode for benchmarks.
|
2026-02-01 21:35:27 +00:00 |
|
Donato Capitella
|
ba503f6e61
|
feat: centralize model configurations and benchmark settings into a new models.py module and update Dockerfile and scripts to use it.
|
2026-02-01 21:17:15 +00:00 |
|
Donato Capitella
|
4b09188776
|
feat: add refresh_toolbox.sh script to automate creation and refresh of the vLLM Podman toolbox.
|
2026-02-01 20:44:54 +00:00 |
|
Donato Capitella
|
a1105a0b96
|
feat: Enhance vLLM benchmarking to compare Triton and ROCm attention, introduce a new script for cluster configuration, and update Dockerfile for new tools and dependencies.
|
2026-02-01 19:36:07 +00:00 |
|
Donato Capitella
|
e5cc96bf48
|
feat: Introduce vLLM cluster benchmarking and setup scripts, and expand the list of models for local benchmarks.
|
2026-02-01 15:43:56 +00:00 |
|
Donato Capitella
|
47bf7daba3
|
feat: add input to specify RCCL artifact run ID for download in build-and-publish workflow
|
2026-02-01 14:58:10 +00:00 |
|
Donato Capitella
|
b10aa50745
|
feat: Modularize Dockerfile dependency and ROCm SDK installations into dedicated scripts and add a GitHub Actions workflow to build and consume a custom RCCL library.
|
2026-02-01 14:50:37 +00:00 |
|
Donato Capitella
|
a8added616
|
feat: Introduce custom RCCL library management for gfx1151, including build scripts, Docker integration, and vLLM benchmarks.
|
2026-02-01 13:23:10 +00:00 |
|
Donato Capitella
|
13caab0634
|
typos
|
2026-01-31 14:39:04 +00:00 |
|
Donato Capitella
|
36424706ee
|
added troubleshooting steps for RDMA
|
2026-01-31 14:37:46 +00:00 |
|
Donato Capitella
|
8ebd432ac6
|
adding patch dependency
|
2026-01-31 12:43:42 +00:00 |
|
Donato Capitella
|
57b592b912
|
added dependencies for RDMA/way
|
2026-01-30 14:47:09 +00:00 |
|
Donato Capitella
|
039484a41e
|
Updated name of card
|
2025-12-24 08:13:34 +00:00 |
|
Donato Capitella
|
255c167734
|
fix
|
2025-12-22 16:40:44 +00:00 |
|
Donato Capitella
|
bc7c8e271b
|
updated table with host configuration
|
2025-12-22 16:40:25 +00:00 |
|
Donato Capitella
|
86eac2889b
|
docs: Update README to specify Fedora 43
|
2025-12-21 09:55:31 +00:00 |
|
Donato Capitella
|
15f1889c6f
|
fixes
|
2025-12-20 12:32:46 +00:00 |
|
Donato Capitella
|
3b0e736c94
|
feat: Implement dynamic model discovery from benchmark results, add benchmark notes, and include dialog dependency.
|
2025-12-20 12:31:20 +00:00 |
|
Donato Capitella
|
711de530f6
|
added ROCm/Triton attention comparison
|
2025-12-20 11:49:03 +00:00 |
|
Donato Capitella
|
5e8b6bb545
|
updates
|
2025-12-20 11:37:06 +00:00 |
|
Donato Capitella
|
f19932b360
|
updated envs for better strix halo support on vllm
|
2025-12-19 08:30:02 +00:00 |
|
Donato Capitella
|
69f869ae41
|
restore staging
|
2025-12-19 08:06:51 +00:00 |
|
Donato Capitella
|
2b48cae736
|
feat: Update Dockerfile with pgrep and PyTorch nightly URL.
|
2025-12-19 07:45:07 +00:00 |
|
Donato Capitella
|
f91dc685ad
|
add bits and bytes
|
2025-12-18 08:56:14 +00:00 |
|
Donato Capitella
|
b8678b08ba
|
Installing flash_attn, as this is now needed by vLLM
|
2025-11-30 17:49:29 +00:00 |
|
Donato Capitella
|
30bd06b1bd
|
more dockerfile AI SLOP
|
2025-11-30 15:45:48 +00:00 |
|
Donato Capitella
|
c9cc843787
|
fix
|
2025-11-30 15:41:01 +00:00 |
|
Donato Capitella
|
52814ef9a2
|
fixing Dockerfile
|
2025-11-30 15:37:12 +00:00 |
|
Donato Capitella
|
1fe0b82853
|
updated Dockerfile
|
2025-11-30 15:29:02 +00:00 |
|
Donato Capitella
|
74a2e5254a
|
Updating toolbox and pushing GitHub Action
|
2025-11-30 14:57:37 +00:00 |
|
Donato Capitella
|
7c85688924
|
fixed missing model provider in model tag
|
2025-09-04 17:27:38 +01:00 |
|
Donato Capitella
|
f8db65e8d7
|
Fixed typos due to copy/paste
|
2025-09-04 17:22:18 +01:00 |
|
Donato Capitella
|
7e17fa8660
|
Added gemma models
|
2025-09-04 17:20:24 +01:00 |
|
Donato Capitella
|
8ee405f07e
|
Fixed Docker/Podman commands
|
2025-09-04 15:02:00 +01:00 |
|
Donato Capitella
|
fb54a2a9b9
|
Fixed missing parameters in start-vllm
|
2025-09-04 13:58:51 +01:00 |
|
Donato Capitella
|
e9460b20ad
|
updated with set of working models
|
2025-09-04 13:33:53 +01:00 |
|
Donato Capitella
|
8509fe2d92
|
another patch for amdsmi
|
2025-09-04 07:34:55 +01:00 |
|
Donato Capitella
|
fc12e2cc63
|
fixing quant
|
2025-09-03 23:08:45 +01:00 |
|
Donato Capitella
|
0212638d6a
|
fixes
|
2025-09-03 22:59:16 +01:00 |
|
Donato Capitella
|
e17d61916b
|
typo
|
2025-09-03 22:42:06 +01:00 |
|
Donato Capitella
|
46f4003f79
|
added start-vllm script
|
2025-09-03 22:37:26 +01:00 |
|
Donato Capitella
|
a1501febb4
|
first commit
|
2025-09-03 20:42:44 +01:00 |
|