
Commit e4e9ea0

Upgrade vLLM version to v0.9.2 (#1652)
### What this PR does / why we need it?

This patch upgrades the vLLM version to v0.9.2. The v0.9.1-compatible code is intentionally kept in place to ease review.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

- vLLM version: v0.9.1
- vLLM main: vllm-project/vllm@14601f5
- Accuracy test with 0.9.2: https://github.com/vllm-project/vllm-ascend/actions/runs/16121612087

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
1 parent 71de52d commit e4e9ea0

File tree

10 files changed: +15 additions, −14 deletions

.github/workflows/accuracy_test.yaml

Lines changed: 2 additions & 1 deletion

```diff
@@ -37,6 +37,7 @@ on:
         # Current supported vLLM versions
         options:
           - main
+          - v0.9.2
           - v0.9.1
           - v0.7.3
       vllm-ascend-version:
@@ -163,7 +164,7 @@ jobs:
           repository: vllm-project/vllm
           path: ./vllm-empty
           # Please also update this when bump matched version
-          ref: ${{ github.event.inputs.vllm-version || 'v0.9.1' }}
+          ref: ${{ github.event.inputs.vllm-version || 'v0.9.2' }}

       - name: Install vllm-project/vllm from source
         working-directory: ./vllm-empty
```
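For context, the option list above belongs to a `workflow_dispatch` input, and the `||` expression below it falls back to `v0.9.2` when the workflow is triggered without an explicit choice. A minimal sketch of such an input block (the `description` wording is an assumption; the input name and options are taken from the diff):

```yaml
on:
  workflow_dispatch:
    inputs:
      vllm-version:
        # Current supported vLLM versions
        description: "vLLM version to test against (assumed wording)"
        required: false
        type: choice
        options:
          - main
          - v0.9.2
          - v0.9.1
          - v0.7.3
```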

.github/workflows/nightly_benchmarks.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -50,7 +50,7 @@ jobs:
     strategy:
       matrix:
         include:
-          - vllm_branch: v0.9.1
+          - vllm_branch: v0.9.2
            vllm_ascend_branch: main
            vllm_use_v1: 1
     max-parallel: 1
```

.github/workflows/vllm_ascend_test.yaml

Lines changed: 4 additions & 4 deletions

```diff
@@ -138,13 +138,13 @@ jobs:
     if: ${{ needs.lint.result == 'success' || github.event_name == 'push' }}
     runs-on: ubuntu-latest
     container:
-      image: m.daocloud.io/quay.io/ascend/cann:8.1.rc1-910b-ubuntu22.04-py3.10
+      image: quay.io/ascend/cann:8.1.rc1-910b-ubuntu22.04-py3.10
     env:
       VLLM_LOGGING_LEVEL: ERROR
       VLLM_USE_MODELSCOPE: True
     strategy:
       matrix:
-        vllm_version: [main, v0.9.1]
+        vllm_version: [main, v0.9.2]
     steps:
       - name: Install packages
         run: |
@@ -201,7 +201,7 @@ jobs:
       max-parallel: 2
       matrix:
         os: [linux-arm64-npu-1]
-        vllm_version: [main, v0.9.1]
+        vllm_version: [main, v0.9.2]
     name: singlecard e2e test
     runs-on: ${{ matrix.os }}
     container:
@@ -302,7 +302,7 @@ jobs:
       max-parallel: 1
      matrix:
        os: [linux-arm64-npu-4]
-        vllm_version: [main, v0.9.1]
+        vllm_version: [main, v0.9.2]
     name: multicard e2e test
     runs-on: ${{ matrix.os }}
     container:
```
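Each `matrix` entry above fans out into one job per combination of axis values, so bumping `vllm_version` to `[main, v0.9.2]` keeps the CI running against both the development branch and the new release. A hedged Python sketch of that expansion, for illustration only (this is not how GitHub Actions is actually implemented):

```python
from itertools import product

# Illustrative expansion of an Actions-style matrix into job configs.
# The axis names and values mirror the singlecard matrix in the diff above.
matrix = {
    "os": ["linux-arm64-npu-1"],
    "vllm_version": ["main", "v0.9.2"],
}

# One job per element of the Cartesian product of all matrix axes.
jobs = [dict(zip(matrix.keys(), combo)) for combo in product(*matrix.values())]
```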

.github/workflows/vllm_ascend_test_long_term.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -43,7 +43,7 @@ jobs:
       max-parallel: 2
       matrix:
         os: [linux-arm64-npu-1, linux-arm64-npu-4]
-        vllm_version: [main, v0.9.1]
+        vllm_version: [main, v0.9.2]
     name: vLLM Ascend long term test
     runs-on: ${{ matrix.os }}
     container:
```

Dockerfile

Lines changed: 1 addition & 1 deletion

```diff
@@ -37,7 +37,7 @@ RUN pip config set global.index-url ${PIP_INDEX_URL}

 # Install vLLM
 ARG VLLM_REPO=https://github.com/vllm-project/vllm.git
-ARG VLLM_TAG=v0.9.1
+ARG VLLM_TAG=v0.9.2
 RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /vllm-workspace/vllm
 # In x86, triton will be installed by vllm. But in Ascend, triton doesn't work correctly. we need to uninstall it.
 RUN VLLM_TARGET_DEVICE="empty" python3 -m pip install -v -e /vllm-workspace/vllm/ --extra-index https://download.pytorch.org/whl/cpu/ && \
```
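Because `VLLM_TAG` is a Dockerfile `ARG`, the pinned vLLM tag can be overridden at build time without editing the file; the same applies to the other Dockerfile variants below. A hedged sketch of such an invocation (the image tag `vllm-ascend:dev` is illustrative, not a name used by the project):

```shell
# Build with the default tag (v0.9.2 after this commit), or pin another one
# via --build-arg without touching the Dockerfile.
docker build -f Dockerfile --build-arg VLLM_TAG=v0.9.2 -t vllm-ascend:dev .
```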

Dockerfile.310p

Lines changed: 1 addition & 1 deletion

```diff
@@ -37,7 +37,7 @@ RUN pip config set global.index-url ${PIP_INDEX_URL}

 # Install vLLM
 ARG VLLM_REPO=https://github.com/vllm-project/vllm.git
-ARG VLLM_TAG=v0.9.1
+ARG VLLM_TAG=v0.9.2
 RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /vllm-workspace/vllm
 # In x86, triton will be installed by vllm. But in Ascend, triton doesn't work correctly. we need to uninstall it.
 RUN VLLM_TARGET_DEVICE="empty" python3 -m pip install -v -e /vllm-workspace/vllm/ --extra-index https://download.pytorch.org/whl/cpu/ && \
```

Dockerfile.310p.openEuler

Lines changed: 1 addition & 1 deletion

```diff
@@ -34,7 +34,7 @@ COPY . /vllm-workspace/vllm-ascend/

 # Install vLLM
 ARG VLLM_REPO=https://github.com/vllm-project/vllm.git
-ARG VLLM_TAG=v0.9.1
+ARG VLLM_TAG=v0.9.2

 RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /vllm-workspace/vllm
 # In x86, triton will be installed by vllm. But in Ascend, triton doesn't work correctly. we need to uninstall it.
```

Dockerfile.openEuler

Lines changed: 1 addition & 1 deletion

```diff
@@ -34,7 +34,7 @@ COPY . /vllm-workspace/vllm-ascend/

 # Install vLLM
 ARG VLLM_REPO=https://github.com/vllm-project/vllm.git
-ARG VLLM_TAG=v0.9.1
+ARG VLLM_TAG=v0.9.2

 RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /vllm-workspace/vllm
 # In x86, triton will be installed by vllm. But in Ascend, triton doesn't work correctly. we need to uninstall it.
```

docs/source/community/versioning_policy.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -74,8 +74,8 @@ Usually, each minor version of vLLM (such as 0.7) will correspond to a vLLM Asce

 | Branch     | Status       | Note                                 |
 |------------|--------------|--------------------------------------|
-| main       | Maintained   | CI commitment for vLLM main branch and vLLM 0.9.x branch |
-| v0.9.1-dev | Maintained   | CI commitment for vLLM 0.9.0 and 0.9.1 version |
+| main       | Maintained   | CI commitment for vLLM main branch and vLLM 0.9.2 branch |
+| v0.9.1-dev | Maintained   | CI commitment for vLLM 0.9.1 version |
 | v0.7.3-dev | Maintained   | CI commitment for vLLM 0.7.3 version |
 | v0.7.1-dev | Unmaintained | Replaced by v0.7.3-dev |
```

docs/source/user_guide/graph_mode.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -12,7 +12,7 @@ From v0.9.1rc1 with V1 Engine, vLLM Ascend will run models in graph mode by defa

 There are two kinds for graph mode supported by vLLM Ascend:
 - **ACLGraph**: This is the default graph mode supported by vLLM Ascend. In v0.9.1rc1, only Qwen series models are well tested.
-- **TorchAirGraph**: This is the GE graph mode. In v0.9.1rc1, only DeepSeek series models are supported. In v0.9.1rc2, we also support PanguProMoe with torchair.
+- **TorchAirGraph**: This is the GE graph mode. In v0.9.1rc1, only DeepSeek series models are supported.

 ## Using ACLGraph
 ACLGraph is enabled by default. Take Qwen series models as an example, just set to use V1 Engine is enough.
```
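Per the doc text above, enabling the V1 Engine is enough to get ACLGraph for Qwen series models. A hedged example invocation (the model name is illustrative; `VLLM_USE_V1` is the same switch the benchmark workflow sets via `vllm_use_v1: 1`):

```shell
# With the V1 engine enabled, ACLGraph is the default graph mode on Ascend.
VLLM_USE_V1=1 vllm serve Qwen/Qwen2.5-7B-Instruct
```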
