This repository serves as the Docker image pack center for GPUStack Runner. It provides a collection of Dockerfiles to build images for various inference services across different accelerated backends.
The following tables list the supported accelerated backends and, for each backend version, the available inference services and their versions.

| CANN Version (Variant) | MindIE | vLLM | SGLang |
|---|---|---|---|
| 8.5 (A3/910C) | 2.3.0 | 0.15.0(rc), 0.14.1(rc), 0.13.0 | 0.5.9, 0.5.8.post1 |
| 8.5 (910B) | 2.3.0 | 0.15.0(rc), 0.14.1(rc), 0.13.0 | 0.5.9, 0.5.8.post1 |
| 8.5 (310P) | 2.3.0 | 0.15.0(rc), 0.14.1(rc) | |
| 8.3 (A3/910C) | 2.2.rc1 | 0.12.0(rc), 0.11.0 | 0.5.7, 0.5.6.post2 |
| 8.3 (910B) | 2.2.rc1 | 0.12.0(rc), 0.11.0 | 0.5.7, 0.5.6.post2 |
| 8.3 (310P) | 2.2.rc1 | | |
| 8.2 (A3/910C) | 2.1.rc2 | 0.10.2(rc) | 0.5.2, 0.5.1.post3 |
| 8.2 (910B) | 2.1.rc2 | 0.10.2(rc), 0.10.0(rc), 0.9.1 | 0.5.2, 0.5.1.post3 |
| 8.2 (310P) | 2.1.rc2 | 0.10.0(rc), 0.9.1 | |

| CoreX Version (Variant) | vLLM |
|---|---|
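| 4.2 | 0.8.3 |

NOTE
- CUDA 12.9 supports compute capabilities `7.5 8.0+PTX 8.9 9.0 10.0 10.3 12.0 12.1+PTX`.
- CUDA 12.8 supports compute capabilities `7.5 8.0+PTX 8.9 9.0 10.0+PTX 12.0+PTX`.
- CUDA 12.6 supports compute capabilities `7.5 8.0+PTX 8.9 9.0+PTX`.

| CUDA Version (Variant) | vLLM | SGLang | VoxBox |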
|---|---|---|---|
| 12.9 | 0.16.0, 0.15.1, 0.14.1, 0.13.0, 0.12.0, 0.11.2 | 0.5.9, 0.5.8.post1, 0.5.7, 0.5.6.post2 | |
| 12.8 | 0.16.0, 0.15.1, 0.14.1, 0.13.0, 0.12.0, 0.11.2, 0.10.2 | 0.5.9, 0.5.8.post1, 0.5.7, 0.5.6.post2, 0.5.5.post3 | 0.0.21 |
| 12.6 | 0.15.1, 0.14.1, 0.13.0, 0.12.0, 0.11.2, 0.10.2 | | 0.0.21 |

| DTK Version (Variant) | vLLM |
|---|---|
| 25.04 | 0.11.0, 0.9.2, 0.8.5 |

| HGGC Version (Variant) | vLLM | SGLang |
|---|---|---|
| 12.3 | 0.12.0, 0.11.1 | 0.5.6, 0.5.5 |

| MACA Version (Variant) | vLLM | SGLang |
|---|---|---|
| 3.3 | 0.11.2 | 0.5.6 |
| 3.2 | 0.10.2 | |
| 3.0 | 0.9.1 | |

| MUSA Version (Variant) | vLLM | SGLang |
|---|---|---|
| 4.3.2 | 0.5.7 | |
| 4.1.0 | 0.9.2 | |

NOTE
- ROCm 7.0 supports GPU targets `gfx908 gfx90a gfx942 gfx950 gfx1030 gfx1100 gfx1101 gfx1200 gfx1201 gfx1150 gfx1151`.
- ROCm 6.4 supports GPU targets `gfx908 gfx90a gfx942 gfx1030 gfx1100`.

WARNING
- The vLLM `0.11.2` images reuse the official ROCm 6.4 PyTorch 2.9 wheel package rather than a ROCm 7.0-specific PyTorch build. Although vLLM `0.11.2` supports ROCm 7.0, `gfx1150/gfx1151` are not supported yet.
- vLLM `0.13.0` supports `gfx903 gfx90a gfx942` only.
- `gfx942` only.
- `gfx950` only.

| ROCm Version (Variant) | vLLM | SGLang |
|---|---|---|
| 7.0 | 0.16.0, 0.15.1, 0.14.1, 0.13.0, 0.12.0, 0.11.2 | 0.5.9, 0.5.8.post1, 0.5.7, 0.5.6.post2 |
| 6.4 | 0.16.0, 0.15.1, 0.14.1, 0.13.0, 0.12.0, 0.11.2, 0.10.2 | 0.5.8.post1, 0.5.7, 0.5.6.post2, 0.5.5.post3 |
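To query a support matrix from a script, one option is to parse the table as plain text. The sketch below is illustrative only: `parse_support_table` and `backends_with` are hypothetical helpers that do not exist in this repository, and the embedded table is copied from the ROCm rows above.

```python
# Hypothetical helpers for reading a support table; not part of the repository.
ROCM_TABLE = """
| ROCm Version (Variant) | vLLM | SGLang |
|---|---|---|
| 7.0 | 0.16.0, 0.15.1, 0.14.1, 0.13.0, 0.12.0, 0.11.2 | 0.5.9, 0.5.8.post1, 0.5.7, 0.5.6.post2 |
| 6.4 | 0.16.0, 0.15.1, 0.14.1, 0.13.0, 0.12.0, 0.11.2, 0.10.2 | 0.5.8.post1, 0.5.7, 0.5.6.post2, 0.5.5.post3 |
"""


def parse_support_table(markdown: str) -> dict[str, dict[str, list[str]]]:
    """Parse a pipe table into {backend_version: {service: [versions]}}."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in markdown.strip().splitlines()
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---| separator row
    table: dict[str, dict[str, list[str]]] = {}
    for row in body:
        # Pad short rows so a missing trailing cell reads as "no versions".
        row = row + [""] * (len(header) - len(row))
        table[row[0]] = {
            service: [v.strip() for v in cell.split(",") if v.strip()]
            for service, cell in zip(header[1:], row[1:])
        }
    return table


def backends_with(table: dict, service: str, version: str) -> list[str]:
    """Return the backend versions that list the given service version."""
    return [
        backend_version
        for backend_version, services in table.items()
        if version in services.get(service, [])
    ]


rocm = parse_support_table(ROCM_TABLE)
print(backends_with(rocm, "vLLM", "0.10.2"))
print(backends_with(rocm, "SGLang", "0.5.9"))
```

The parser assumes comma-separated versions in each cell and treats an empty cell as "not available", which matches how the tables in this document are written.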
The pack skeleton is organized by backend:

    pack
    ├── {BACKEND 1}
    │   └── Dockerfile
    ├── {BACKEND 2}
    │   └── Dockerfile
    ├── {BACKEND 3}
    │   └── Dockerfile
    ├── ...
    │   └── Dockerfile
    └── {BACKEND N}
        └── Dockerfile
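Because every backend is a directory under `pack` holding one `Dockerfile`, the available backends can be discovered from the file system. The following is a minimal sketch under that assumption; `list_backends` is a hypothetical helper, not a function provided by this repository.

```python
from pathlib import Path


def list_backends(pack_dir: str) -> list[str]:
    """Return the names of the directories under pack_dir that contain a Dockerfile."""
    root = Path(pack_dir)
    return sorted(
        dockerfile.parent.name
        for dockerfile in root.glob("*/Dockerfile")
        if dockerfile.is_file()
    )


if __name__ == "__main__":
    # Prints one backend name per line when run from the repository root.
    for backend in list_backends("pack"):
        print(backend)
```

Directories without a `Dockerfile` are skipped, so an unfinished backend directory does not show up as buildable.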
Each Dockerfile follows these conventions:

- Begin with a comment that describes the package logic and the usage of the build arguments (`ARG`s).
- Use `ARG` for all required and optional build arguments. If a required argument is unused, mark it as `(PLACEHOLDER)`.
- Use heredoc syntax for `RUN` commands to improve readability.

The following template illustrates these conventions:

    # Describe package logic and ARG usage.
    #
    ARG PYTHON_VERSION=... # REQUIRED
    ARG CMAKE_MAX_JOBS=... # REQUIRED
    ARG {OTHERS} # OPTIONAL
    ARG {BACKEND}_VERSION=... # REQUIRED
    ARG {BACKEND}_VERSION_EXTRA=... # OPTIONAL
    ARG {BACKEND}_ARCHS=... # REQUIRED
    ARG {BACKEND}_{OTHERS}=... # OPTIONAL
    ARG {SERVICE}_BASE_IMAGE=... # REQUIRED
    ARG {SERVICE}_VERSION=... # REQUIRED
    ARG {SERVICE}_{OTHERS}=... # OPTIONAL
    ARG {SERVICE}_{FRAMEWORK}_VERSION=... # REQUIRED
    ARG {SERVICE}_{FRAMEWORK}_{OTHERS}=... # OPTIONAL

    # Stage Bake Runtime
    FROM {BACKEND DEVEL IMAGE} AS runtime
    SHELL ["/bin/bash", "-eo", "pipefail", "-c"]

    ARG TARGETPLATFORM
    ARG TARGETOS
    ARG TARGETARCH

    ARG ...

    RUN <<EOF
        # TODO: install runtime dependencies
    EOF

    # Stage Install Service
    FROM {BACKEND}_BASE_IMAGE AS {service}
    SHELL ["/bin/bash", "-eo", "pipefail", "-c"]

    ARG TARGETPLATFORM
    ARG TARGETOS
    ARG TARGETARCH

    ARG ...

    RUN <<EOF
        # TODO: install service and dependencies
    EOF

    WORKDIR /
    ENTRYPOINT [ "tini", "--" ]
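The `# REQUIRED` and `# OPTIONAL` annotations in the template can be read mechanically, for example to check that a build supplies every required argument. The sketch below assumes the annotation style shown in the template; `classify_args` and the sample argument names are hypothetical and are not taken from an actual Dockerfile in this repository.

```python
import re

# Matches template-style lines such as "ARG CMAKE_MAX_JOBS=... # REQUIRED".
# Lines without a REQUIRED/OPTIONAL marker (e.g. "ARG TARGETOS") are ignored.
_ARG_LINE = re.compile(
    r"^ARG\s+(?P<name>[^\s=#]+)(?:=(?P<default>.*?))?\s*#\s*(?P<kind>REQUIRED|OPTIONAL)\b"
)


def classify_args(dockerfile_text: str) -> dict[str, list[str]]:
    """Group annotated ARG names by their REQUIRED/OPTIONAL marker."""
    result: dict[str, list[str]] = {"REQUIRED": [], "OPTIONAL": []}
    for line in dockerfile_text.splitlines():
        match = _ARG_LINE.match(line.strip())
        if match:
            result[match.group("kind")].append(match.group("name"))
    return result


# Hypothetical header used only to exercise the parser.
SAMPLE = """\
# Describe package logic and ARG usage.
ARG PYTHON_VERSION=3.12 # REQUIRED
ARG CMAKE_MAX_JOBS=8 # REQUIRED
ARG CUDA_VERSION_EXTRA= # OPTIONAL
ARG TARGETOS
"""

print(classify_args(SAMPLE))
```

Only lines that start with `ARG` are considered, so a commented-out declaration is not counted as a build argument.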
The Docker image naming convention is as follows:
- Image name: `{NAMESPACE}/{REPOSITORY}:{TAG}`.
- Single-architecture tag: `{BACKEND}{BACKEND_VERSION%.*}[-{BACKEND_VARIANT}]-{SERVICE}{SERVICE_VERSION}-{OS}-{ARCH}`.
- Multi-architecture tag: `{BACKEND}{BACKEND_VERSION%.*}[-{BACKEND_VARIANT}]-{SERVICE}{SERVICE_VERSION}[-dev]`.
- The examples below use the namespace `gpustack` and the repository `runner`.

| Accelerated Backend | OS/ARCH | Inference Service | Single-Arch Image Name | Multi-Arch Image Name |
|---|---|---|---|---|
| Ascend CANN 910b | linux/amd64 | vLLM | gpustack/runner:cann8.1-910b-vllm0.9.2-linux-amd64 | gpustack/runner:cann8.1-910b-vllm0.9.2 |
| Ascend CANN 910b | linux/arm64 | vLLM | gpustack/runner:cann8.1-910b-vllm0.9.2-linux-arm64 | gpustack/runner:cann8.1-910b-vllm0.9.2 |
| NVIDIA CUDA 12.8 | linux/amd64 | vLLM | gpustack/runner:cuda12.8-vllm0.9.2-linux-amd64 | gpustack/runner:cuda12.8-vllm0.9.2 |
| NVIDIA CUDA 12.8 | linux/arm64 | vLLM | gpustack/runner:cuda12.8-vllm0.9.2-linux-arm64 | gpustack/runner:cuda12.8-vllm0.9.2 |
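The tag templates can be exercised in a few lines of Python. This is a sketch under stated assumptions, not the repository's implementation: `trim_version`, `image_tag`, and `TAG_RE` are hypothetical names, `TAG_RE` is not the `_RE_DOCKER_IMAGE` pattern from `runner.py`, the full backend versions `8.1.rc1` and `12.8.1` are made-up inputs, and lowercasing the result is inferred from the examples in the table. `trim_version` mirrors the shell expansion `${BACKEND_VERSION%.*}`, which removes the last dot-separated component.

```python
import re


def trim_version(version: str) -> str:
    """Mimic ${VERSION%.*}: drop the last dot-separated component, if any."""
    head, sep, _ = version.rpartition(".")
    return head if sep else version


def image_tag(backend, backend_version, service, service_version,
              variant=None, os_name=None, arch=None, dev=False):
    """Compose a tag; passing os_name and arch yields a single-architecture tag."""
    parts = [f"{backend}{trim_version(backend_version)}"]
    if variant:
        parts.append(variant)
    parts.append(f"{service}{service_version}")
    if os_name and arch:
        parts += [os_name, arch]
    elif dev:
        parts.append("dev")
    return "-".join(parts).lower()


# Hypothetical pattern for illustration; NOT the _RE_DOCKER_IMAGE from runner.py.
TAG_RE = re.compile(
    r"^(?P<backend>[a-z]+)(?P<backend_version>\d+(?:\.\d+)*)"
    r"(?:-(?P<variant>[a-z0-9]+))?"
    r"-(?P<service>[a-z]+)(?P<service_version>\d[a-z0-9.]*)"
    r"(?:-(?P<os>[a-z]+)-(?P<arch>[a-z0-9]+)|-(?P<dev>dev))?$"
)

single = image_tag("cann", "8.1.rc1", "vllm", "0.9.2",
                   variant="910b", os_name="linux", arch="amd64")
multi = image_tag("cuda", "12.8.1", "vLLM", "0.9.2")
print(f"gpustack/runner:{single}")
print(f"gpustack/runner:{multi}")
print(TAG_RE.match(single).groupdict())
```

Parsing a tag back into its parts is the kind of check a new backend or service name has to pass, which is why a pattern like this needs updating whenever a name is added.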

For the Ascend CANN 910b build in the table above, the resulting image names are:

- Single-architecture: `gpustack/runner:cann8.1-910b-vllm0.9.2-linux-amd64`.
- Multi-architecture with the `-dev` suffix: `gpustack/runner:cann8.1-910b-vllm0.9.2-dev`.
- Multi-architecture: `gpustack/runner:cann8.1-910b-vllm0.9.2`.

To add support for a new accelerated backend:

1. Create a new directory under `pack/` named with the new backend.
2. Add a `Dockerfile` in the new directory following the Dockerfile Convention.
3. Update `_RE_DOCKER_IMAGE` in `runner.py` to recognize the new backend.

To add support for a new inference service:

1. Modify the `Dockerfile` of the relevant backend in `pack/{BACKEND}/Dockerfile` to include the new service.
2. Update `_RE_DOCKER_IMAGE` in `runner.py` to recognize the new service.

Copyright (c) 2025 The GPUStack authors

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. See the LICENSE file for a copy of the License and further details.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.