88 Commits

Author SHA1 Message Date
Donato Capitella e5cc96bf48 feat: Introduce vLLM cluster benchmarking and setup scripts, and expand the list of models for local benchmarks. 2026-02-01 15:43:56 +00:00
Donato Capitella 47bf7daba3 feat: add input to specify RCCL artifact run ID for download in build-and-publish workflow 2026-02-01 14:58:10 +00:00
Donato Capitella b10aa50745 feat: Modularize Dockerfile dependency and ROCm SDK installations into dedicated scripts and add a GitHub Actions workflow to build and consume a custom RCCL library. 2026-02-01 14:50:37 +00:00
Donato Capitella a8added616 feat: Introduce custom RCCL library management for gfx1151, including build scripts, Docker integration, and VLLM benchmarks. 2026-02-01 13:23:10 +00:00
Donato Capitella 13caab0634 typos 2026-01-31 14:39:04 +00:00
Donato Capitella 36424706ee added troubleshooting steps for RDMA 2026-01-31 14:37:46 +00:00
Donato Capitella 8ebd432ac6 adding patch dependency 2026-01-31 12:43:42 +00:00
Donato Capitella 57b592b912 added dependecies for RDMA/way 2026-01-30 14:47:09 +00:00
Donato Capitella 039484a41e Updated name of card 2025-12-24 08:13:34 +00:00
Donato Capitella 255c167734 fix 2025-12-22 16:40:44 +00:00
Donato Capitella bc7c8e271b updated table with host configuration 2025-12-22 16:40:25 +00:00
Donato Capitella 86eac2889b docs: Update README to specify Fedora 43 2025-12-21 09:55:31 +00:00
Donato Capitella 15f1889c6f fixes 2025-12-20 12:32:46 +00:00
Donato Capitella 3b0e736c94 feat: Implement dynamic model discovery from benchmark results, add benchmark notes, and include dialog dependency. 2025-12-20 12:31:20 +00:00
Donato Capitella 711de530f6 added ROCm/Triton attention comparison 2025-12-20 11:49:03 +00:00
Donato Capitella 5e8b6bb545 updates 2025-12-20 11:37:06 +00:00
Donato Capitella f19932b360 updated envs for better strix halo support on vllm 2025-12-19 08:30:02 +00:00
Donato Capitella 69f869ae41 restore staging 2025-12-19 08:06:51 +00:00
Donato Capitella 2b48cae736 feat: Update Dockerfile with pgrep and PyTorch nightly URL. 2025-12-19 07:45:07 +00:00
Donato Capitella f91dc685ad add bits and bytes 2025-12-18 08:56:14 +00:00
Donato Capitella b8678b08ba Installing flash_attn, as this is now neded by vLLM 2025-11-30 17:49:29 +00:00
Donato Capitella 30bd06b1bd more dockerfile AI SLOP 2025-11-30 15:45:48 +00:00
Donato Capitella c9cc843787 fix 2025-11-30 15:41:01 +00:00
Donato Capitella 52814ef9a2 fixing Dockerfile 2025-11-30 15:37:12 +00:00
Donato Capitella 1fe0b82853 updated Dockerfile 2025-11-30 15:29:02 +00:00
Donato Capitella 74a2e5254a Updating toolbox and pushing GitHub Action 2025-11-30 14:57:37 +00:00
Donato Capitella 7c85688924 fixed missing model provider in model tag 2025-09-04 17:27:38 +01:00
Donato Capitella f8db65e8d7 Fixed typos due to copy/paste 2025-09-04 17:22:18 +01:00
Donato Capitella 7e17fa8660 Added gemma models 2025-09-04 17:20:24 +01:00
Donato Capitella 8ee405f07e Fixed Docker/Podman commands 2025-09-04 15:02:00 +01:00
Donato Capitella fb54a2a9b9 Fixed missing parameters in start-vllm 2025-09-04 13:58:51 +01:00
Donato Capitella e9460b20ad updated with set of working models 2025-09-04 13:33:53 +01:00
Donato Capitella 8509fe2d92 another patch for amdsmi 2025-09-04 07:34:55 +01:00
Donato Capitella fc12e2cc63 fixing quant 2025-09-03 23:08:45 +01:00
Donato Capitella 0212638d6a fixes 2025-09-03 22:59:16 +01:00
Donato Capitella e17d61916b typo 2025-09-03 22:42:06 +01:00
Donato Capitella 46f4003f79 added start-vllm script 2025-09-03 22:37:26 +01:00
Donato Capitella a1501febb4 first commit 2025-09-03 20:42:44 +01:00