amd-strix-halo-vllm-toolboxes/scripts at 8ff52abf4ec6c3a6e56b3569e997f68aa0d31446 - amd-strix-halo-vllm-toolboxes - BadStorm.xyz - Code Hub

AI/amd-strix-halo-vllm-toolboxes

Donato Capitella 8ff52abf4e perf: Increase max_num_seqs for bus batch scaling and OFF_NUM_PROMPTS for steady-state throughput measurement on Strix Halo.

2026-02-02 22:36:15 +00:00

01-rocm-env-for-triton.sh

updated envs for better strix halo support on vllm

2025-12-19 08:30:02 +00:00

99-toolbox-banner.sh

feat: Introduce vLLM cluster benchmarking and setup scripts, and expand the list of models for local benchmarks.

2026-02-01 15:43:56 +00:00

build_rccl_gfx1151.sh

feat: Introduce custom RCCL library management for gfx1151, including build scripts, Docker integration, and VLLM benchmarks.

2026-02-01 13:23:10 +00:00

cluster_manager.py

feat: Configure ROCm attention via --attention-backend CLI argument, disable the Ray dashboard, and make eager mode configurable for cluster benchmarks.

2026-02-02 15:40:16 +00:00

configure_cluster.sh

feat: Add RAY_DISABLE_METRICS=1 to disable Ray metrics across cluster configurations and scripts.

2026-02-01 21:52:48 +00:00

generate_readme_table.py

feat: Add script to automate README benchmark table generation and update max context benchmarks with new models and a kernel parameter change.

2026-02-02 22:32:12 +00:00

install_deps.sh

feat: Modularize Dockerfile dependency and ROCm SDK installations into dedicated scripts and add a GitHub Actions workflow to build and consume a custom RCCL library.

2026-02-01 14:50:37 +00:00

install_rocm_sdk.sh

feat: Modularize Dockerfile dependency and ROCm SDK installations into dedicated scripts and add a GitHub Actions workflow to build and consume a custom RCCL library.

2026-02-01 14:50:37 +00:00

manage_rccl_install.sh

feat: Introduce custom RCCL library management for gfx1151, including build scripts, Docker integration, and VLLM benchmarks.

2026-02-01 13:23:10 +00:00

models.py

perf: Increase max_num_seqs for bus batch scaling and OFF_NUM_PROMPTS for steady-state throughput measurement on Strix Halo.

2026-02-02 22:36:15 +00:00

start_vllm_cluster.py

feat: Add comprehensive RDMA cluster setup guide, enforce eager mode in cluster benchmarks, and update documentation with cluster details.

2026-02-02 19:34:33 +00:00

start_vllm.py

feat: Configure ROCm attention via --attention-backend CLI argument, disable the Ray dashboard, and make eager mode configurable for cluster benchmarks.

2026-02-02 15:40:16 +00:00

zz-venv-last.sh

Updating toolbox and pushing GitHub Action

2025-11-30 14:57:37 +00:00