Commit Graph

33 Commits

Author SHA1 Message Date
Donato Capitella fb0aef0864 Downgrade Python to 3.12 and remove the --no-deps flag from a pip install command in the Dockerfile. 2026-03-09 11:08:11 +00:00
Donato Capitella 9997faaa1e build: Add --no-deps flag to local wheel installation. 2026-03-08 16:31:16 +00:00
Donato Capitella 6875f62ccf improve benchmarks 2026-02-25 09:29:46 +00:00
Donato Capitella 1af159af81 removing llvm flags as they have no impact on performance 2026-02-24 08:27:57 +00:00
Donato Capitella f968cb1f30 most of the time spent by devs is to ensure there is no standard way of passing flags - I have no idea why 2026-02-23 12:08:57 +00:00
Donato Capitella fedfa3c682 Trying fix for ROCm/llvm loop unrolling bug, to see if performance improves on custom compiled kernels 2026-02-23 11:43:44 +00:00
Donato Capitella 13c5a929a3 feat: refactor vLLM Strix Halo patching into a dedicated script 2026-02-23 10:33:20 +00:00
Donato Capitella 5a7f0cc676 feat: Implement temporary patch for C10_CHECK macro import missing 2026-02-23 09:49:42 +00:00
Donato Capitella b3fcb0091f feat: Enhance find_max_context.py with Ray cluster support and fix C10_HIP_CHECK build error in Dockerfile. 2026-02-23 09:11:30 +00:00
Donato Capitella 726cd5ae53 remove clang patch 2026-02-18 15:23:02 +00:00
Donato Capitella 290beffb05 feat: Enhance quantization support for MoE layers with new FP8/INT8 configs and model-specific optimizations across various devices. 2026-02-12 11:10:28 +00:00
Donato Capitella 6754095398 feat: Introduce measure_bandwidth.sh script, install perfquery, and add the script to the Docker image for RDMA bandwidth monitoring. 2026-02-07 10:40:53 +00:00
Donato Capitella 6f118ff936 feat: Update ROCm benchmark result paths, improve cluster node discovery and cache clearing, and refine cluster benchmark result directory. 2026-02-02 07:35:50 +00:00
Donato Capitella c587981d73 refactor: Centralize Ray/vLLM cluster management into a new cluster_manager.py module and refactor start_vllm_cluster.py to use it. 2026-02-01 22:19:34 +00:00
Donato Capitella ba503f6e61 feat: centralize model configurations and benchmark settings into a new models.py module and update Dockerfile and scripts to use it. 2026-02-01 21:17:15 +00:00
Donato Capitella a1105a0b96 feat: Enhance vLLM benchmarking to compare Triton and ROCm attention, introduce a new script for cluster configuration, and update Dockerfile for new tools and dependencies. 2026-02-01 19:36:07 +00:00
Donato Capitella e5cc96bf48 feat: Introduce vLLM cluster benchmarking and setup scripts, and expand the list of models for local benchmarks. 2026-02-01 15:43:56 +00:00
Donato Capitella b10aa50745 feat: Modularize Dockerfile dependency and ROCm SDK installations into dedicated scripts and add a GitHub Actions workflow to build and consume a custom RCCL library. 2026-02-01 14:50:37 +00:00
Donato Capitella a8added616 feat: Introduce custom RCCL library management for gfx1151, including build scripts, Docker integration, and VLLM benchmarks. 2026-02-01 13:23:10 +00:00
Donato Capitella 36424706ee added troubleshooting steps for RDMA 2026-01-31 14:37:46 +00:00
Donato Capitella 8ebd432ac6 adding patch dependency 2026-01-31 12:43:42 +00:00
Donato Capitella 57b592b912 added dependencies for RDMA/way 2026-01-30 14:47:09 +00:00
Donato Capitella 3b0e736c94 feat: Implement dynamic model discovery from benchmark results, add benchmark notes, and include dialog dependency. 2025-12-20 12:31:20 +00:00
Donato Capitella 5e8b6bb545 updates 2025-12-20 11:37:06 +00:00
Donato Capitella 69f869ae41 restore staging 2025-12-19 08:06:51 +00:00
Donato Capitella 2b48cae736 feat: Update Dockerfile with pgrep and PyTorch nightly URL. 2025-12-19 07:45:07 +00:00
Donato Capitella f91dc685ad add bits and bytes 2025-12-18 08:56:14 +00:00
Donato Capitella b8678b08ba Installing flash_attn, as this is now needed by vLLM 2025-11-30 17:49:29 +00:00
Donato Capitella 30bd06b1bd more dockerfile AI SLOP 2025-11-30 15:45:48 +00:00
Donato Capitella c9cc843787 fix 2025-11-30 15:41:01 +00:00
Donato Capitella 52814ef9a2 fixing Dockerfile 2025-11-30 15:37:12 +00:00
Donato Capitella 1fe0b82853 updated Dockerfile 2025-11-30 15:29:02 +00:00
Donato Capitella 74a2e5254a Updating toolbox and pushing GitHub Action 2025-11-30 14:57:37 +00:00