Commit 5903547

[doc] add 0.7.3.post1 release note (#1008)

Add release note for 0.7.3.post1. Add the missing release note back for 0.7.3.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

1 parent c464c32 commit 5903547
5 files changed: +60 −15 lines

README.md

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ Below is maintained branches:

  | Branch | Status | Note |
  |------------|--------------|--------------------------------------|
- | main | Maintained | CI commitment for vLLM main branch and vLLM 0.8.x branch |
+ | main | Maintained | CI commitment for vLLM main branch and vLLM 0.9.x branch |
  | v0.7.1-dev | Unmaintained | Only doc fixes are allowed |
  | v0.7.3-dev | Maintained | CI commitment for vLLM 0.7.3 version |

docs/source/_templates/sections/header.html

Lines changed: 1 addition & 1 deletion
@@ -54,5 +54,5 @@
  </style>

  <div class="notification-bar">
-   <p>You are viewing the latest developer preview docs. <a href="https://vllm-ascend.readthedocs.io/en/stable/">Click here</a> to view docs for the latest stable release(v0.7.3).</p>
+   <p>You are viewing the latest developer preview docs. <a href="https://vllm-ascend.readthedocs.io/en/v0.7.3-dev">Click here</a> to view docs for the latest stable release(v0.7.3.post1).</p>
  </div>

docs/source/developer_guide/versioning_policy.md

Lines changed: 4 additions & 2 deletions
@@ -24,14 +24,16 @@ Following is the Release Compatibility Matrix for vLLM Ascend Plugin:
  |-------------|--------------|------------------|-------------|--------------------|--------------|
  | v0.8.5rc1 | v0.8.5.post1 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | |
  | v0.8.4rc2 | v0.8.4 | >= 3.9, < 3.12 | 8.0.0 | 2.5.1 / 2.5.1 | |
- | v0.7.3 | v0.7.3 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | |
+ | v0.7.3.post1| v0.7.3 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | 2.0rc1 |
+ | v0.7.3 | v0.7.3 | >= 3.9, < 3.12 | 8.1.RC1 | 2.5.1 / 2.5.1 | 2.0rc1 |

  ## Release cadence

  ### release window

  | Date | Event |
  |------------|-------------------------------------------|
+ | 2025.05.29 | v0.7.x post release, v0.7.3.post1 |
  | 2025.05.08 | v0.7.x Final release, v0.7.3 |
  | 2025.05.06 | Release candidates, v0.8.5rc1 |
  | 2025.04.28 | Release candidates, v0.8.4rc2 |
@@ -66,7 +68,7 @@ Usually, each minor version of vLLM (such as 0.7) will correspond to a vLLM Asce
  | Branch | Status | Note |
  |------------|--------------|--------------------------------------|
- | main | Maintained | CI commitment for vLLM main branch and vLLM 0.8.x branch |
+ | main | Maintained | CI commitment for vLLM main branch and vLLM 0.9.x branch |
  | v0.7.3-dev | Maintained | CI commitment for vLLM 0.7.3 version |
  | v0.7.1-dev | Unmaintained | Replaced by v0.7.3-dev |

docs/source/faqs.md

Lines changed: 1 addition & 5 deletions
@@ -2,11 +2,7 @@

  ## Version Specific FAQs

- - [[v0.7.1rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/19)
- - [[v0.7.3rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/267)
- - [[v0.7.3rc2] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/418)
- - [[v0.8.4rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/546)
- - [[v0.8.4rc2] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/707)
+ - [[v0.7.3.post1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/1007)
  - [[v0.8.5rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-ascend/issues/754)

  ## General FAQs

docs/source/user_guide/release_notes.md

Lines changed: 53 additions & 6 deletions
@@ -1,6 +1,53 @@
  # Release note

- ## v0.8.5rc1
+ ## v0.7.3.post1 - 2025.05.29
+
+ This is the first post release of 0.7.3. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/v0.7.3-dev) to start the journey. It includes the following changes:
+
+ ### Highlights
+
+ - Qwen3 and Qwen3MOE are supported now. The performance and accuracy of Qwen3 are well tested. You can try it now (see the sketch below). MindIE Turbo is recommended to improve the performance of Qwen3. [#903](https://github.com/vllm-project/vllm-ascend/pull/903) [#915](https://github.com/vllm-project/vllm-ascend/pull/915)
+ - Added a new performance guide. The guide aims to help users improve vllm-ascend performance at the system level. It covers OS configuration, library optimization, deployment guidance, and so on. [#878](https://github.com/vllm-project/vllm-ascend/pull/878) [Doc Link](https://vllm-ascend.readthedocs.io/en/v0.7.3-dev/developer_guide/performance/optimization_and_tuning.html)
+
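
A minimal sketch of trying the Qwen3 highlight above, assuming a working vllm + vllm-ascend 0.7.3.post1 install on an Ascend host; the model ID below is only an illustrative choice, not taken from the release note:

```python
# Hypothetical quick-start for Qwen3 on vLLM Ascend.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3-8B")  # illustrative Qwen3 checkpoint ID
params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Introduce vLLM Ascend in one sentence."], params)
print(outputs[0].outputs[0].text)
```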
+ ### Bug Fix
+
+ - Qwen2.5-VL works for RLHF scenarios now. [#928](https://github.com/vllm-project/vllm-ascend/pull/928)
+ - Users can now launch the model from online weights, e.g. directly from Hugging Face or ModelScope (see the sketch below). [#858](https://github.com/vllm-project/vllm-ascend/pull/858) [#918](https://github.com/vllm-project/vllm-ascend/pull/918)
+ - The meaningless log info `UserWorkspaceSize0` has been cleaned up. [#911](https://github.com/vllm-project/vllm-ascend/pull/911)
+ - The log level for `Failed to import vllm_ascend_C` has been changed to `warning` instead of `error`. [#956](https://github.com/vllm-project/vllm-ascend/pull/956)
+ - DeepSeek MLA now works with chunked prefill in the V1 engine. Please note that the V1 engine in 0.7.3 is experimental and for test usage only. [#849](https://github.com/vllm-project/vllm-ascend/pull/849) [#936](https://github.com/vllm-project/vllm-ascend/pull/936)
+
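
A sketch of the online-weights path mentioned in the second bug-fix bullet, assuming vLLM's standard `VLLM_USE_MODELSCOPE` switch; the hub ID is illustrative:

```python
import os

# Must be set before vllm is imported; otherwise weights resolve via Hugging Face.
os.environ["VLLM_USE_MODELSCOPE"] = "True"

from vllm import LLM

# The engine pulls the checkpoint straight from the hub by ID,
# with no separate local download step beforehand.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # illustrative hub ID
```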
+ ### Docs
+
+ - The benchmark doc is updated for Qwen2.5 and Qwen2.5-VL. [#792](https://github.com/vllm-project/vllm-ascend/pull/792)
+ - Added a note to clarify that only `modelscope<1.23.0` works with 0.7.3. [#954](https://github.com/vllm-project/vllm-ascend/pull/954)
+
+ ## v0.7.3 - 2025.05.08
+
+ 🎉 Hello, World!
+
+ We are excited to announce the release of 0.7.3 for vllm-ascend. This is the first official release. The functionality, performance, and stability of this release are fully tested and verified. We encourage you to try it out and provide feedback. We'll post bug fix versions in the future if needed. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/v0.7.3-dev) to start the journey.
+
+ ### Highlights
+ - This release includes all features landed in the previous release candidates ([v0.7.1rc1](https://github.com/vllm-project/vllm-ascend/releases/tag/v0.7.1rc1), [v0.7.3rc1](https://github.com/vllm-project/vllm-ascend/releases/tag/v0.7.3rc1), [v0.7.3rc2](https://github.com/vllm-project/vllm-ascend/releases/tag/v0.7.3rc2)), and all of them are fully tested and verified. Visit the official doc to get the detailed [feature](https://vllm-ascend.readthedocs.io/en/v0.7.3-dev/user_guide/suppoted_features.html) and [model](https://vllm-ascend.readthedocs.io/en/v0.7.3-dev/user_guide/supported_models.html) support matrix.
+ - Upgrade CANN to 8.1.RC1 to enable the chunked prefill and automatic prefix caching features. You can enable them now (see the sketch below).
+ - Upgrade PyTorch to 2.5.1. vLLM Ascend no longer relies on the dev version of torch-npu, so users don't need to install torch-npu by hand; the 2.5.1 version of torch-npu is installed automatically. [#662](https://github.com/vllm-project/vllm-ascend/pull/662)
+ - Integrate MindIE Turbo into vLLM Ascend to improve the performance of DeepSeek V3/R1 and the Qwen2 series. [#708](https://github.com/vllm-project/vllm-ascend/pull/708)
+
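
A sketch of enabling the two features the CANN highlight above unlocks, using vLLM's regular engine arguments (`enable_chunked_prefill`, `enable_prefix_caching`); the model ID is illustrative:

```python
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",  # illustrative model ID
    enable_chunked_prefill=True,       # split long prefills into schedulable chunks
    enable_prefix_caching=True,        # reuse KV cache across shared prompt prefixes
)
```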
+ ### Core
+ - LoRA, Multi-LoRA, and dynamic serving are supported now (see the sketch below). The performance will be improved in the next release. Please follow the official doc for more usage information. Thanks for the contribution from China Merchants Bank. [#700](https://github.com/vllm-project/vllm-ascend/pull/700)
+
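
A minimal Multi-LoRA sketch using vLLM's standard LoRA interfaces (`enable_lora`, `LoRARequest`); the base model and adapter path are illustrative:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# enable_lora turns on adapter support when the engine starts.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", enable_lora=True)  # illustrative base model

# Attach a local adapter per request; the integer is a unique adapter ID.
lora = LoRARequest("my-adapter", 1, "/path/to/lora_adapter")  # illustrative path
out = llm.generate(["Summarize this release note."],
                   SamplingParams(max_tokens=64),
                   lora_request=lora)
print(out[0].outputs[0].text)
```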
+ ### Model
+ - The performance of Qwen2-VL and Qwen2.5-VL is improved. [#702](https://github.com/vllm-project/vllm-ascend/pull/702)
+ - The performance of the `apply_penalties` and `topKtopP` ops is improved. [#525](https://github.com/vllm-project/vllm-ascend/pull/525)
+
+ ### Other
+ - Fixed an issue that may lead to a CPU memory leak. [#691](https://github.com/vllm-project/vllm-ascend/pull/691) [#712](https://github.com/vllm-project/vllm-ascend/pull/712)
+ - A new environment variable `SOC_VERSION` is added. If you hit any SoC detection error when building with custom ops enabled, set `SOC_VERSION` to a suitable value (see the sketch below). [#606](https://github.com/vllm-project/vllm-ascend/pull/606)
+ - The openEuler container image is supported with the v0.7.3-openeuler tag. [#665](https://github.com/vllm-project/vllm-ascend/pull/665)
+ - The prefix cache feature works on the V1 engine now. [#559](https://github.com/vllm-project/vllm-ascend/pull/559)
+
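
A sketch of the `SOC_VERSION` workaround above for a from-source build, kept in Python for consistency with the other sketches; the SoC value and checkout path are illustrative:

```python
import os
import subprocess

# Pin the SoC explicitly when auto-detection fails during the custom-ops build.
env = dict(os.environ, SOC_VERSION="ASCEND910B1")  # illustrative SoC value
subprocess.run(
    ["pip", "install", "-e", "."],  # run inside a vllm-ascend source checkout
    cwd="/path/to/vllm-ascend",     # illustrative path
    env=env,
    check=True,
)
```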
+ ## v0.8.5rc1 - 2025.05.06

  This is the 1st release candidate of v0.8.5 for vllm-ascend. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/) to start the journey. Now you can enable the V1 engine by setting the environment variable `VLLM_USE_V1=1` (see the sketch below); find the feature support status of vLLM Ascend [here](https://vllm-ascend.readthedocs.io/en/latest/user_guide/suppoted_features.html).
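
A sketch of switching on the V1 engine as described above; the variable must be set before `vllm` is imported, and the model ID is illustrative:

```python
import os
os.environ["VLLM_USE_V1"] = "1"  # opt in to the V1 engine before importing vllm

from vllm import LLM

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # illustrative model ID
```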

@@ -23,7 +70,7 @@ This is the 1st release candidate of v0.8.5 for vllm-ascend. Please follow the [
  - Add nightly CI [#668](https://github.com/vllm-project/vllm-ascend/pull/668)
  - Add accuracy test report [#542](https://github.com/vllm-project/vllm-ascend/pull/542)

- ## v0.8.4rc2
+ ## v0.8.4rc2 - 2025.04.29

  This is the second release candidate of v0.8.4 for vllm-ascend. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/) to start the journey. Some experimental features are included in this version, such as W8A8 quantization and EP/DP support. We'll make them stable enough in the next release.

@@ -43,7 +90,7 @@ This is the second release candidate of v0.8.4 for vllm-ascend. Please follow th
  - Add "Using EvalScope evaluation" doc [#611](https://github.com/vllm-project/vllm-ascend/pull/611)
  - Add a `VLLM_VERSION` environment variable to make the vLLM version configurable, so developers can set the correct vLLM version if the vLLM code has been changed locally by hand. [#651](https://github.com/vllm-project/vllm-ascend/pull/651)

- ## v0.8.4rc1
+ ## v0.8.4rc1 - 2025.04.18

  This is the first release candidate of v0.8.4 for vllm-ascend. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/) to start the journey. From this version, vllm-ascend will follow the newest version of vllm and release every two weeks. For example, if vllm releases v0.8.5 in the next two weeks, vllm-ascend will release v0.8.5rc1 instead of v0.8.4rc2. Please find the details in the [official documentation](https://vllm-ascend.readthedocs.io/en/latest/developer_guide/versioning_policy.html#release-window).

@@ -66,7 +113,7 @@ This is the first release candidate of v0.8.4 for vllm-ascend. Please follow the
  - The custom ops build is enabled by default. You should install packages like `gcc` and `cmake` first to build `vllm-ascend` from source. Set the `COMPILE_CUSTOM_KERNELS=0` environment variable to disable the compilation if you don't need it. [#466](https://github.com/vllm-project/vllm-ascend/pull/466)
  - The custom op `rotary embedding` is enabled by default now to improve performance. [#555](https://github.com/vllm-project/vllm-ascend/pull/555)

- ## v0.7.3rc2
+ ## v0.7.3rc2 - 2025.03.29

  This is the 2nd release candidate of v0.7.3 for vllm-ascend. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/v0.7.3-dev) to start the journey.
  - Quickstart with container: https://vllm-ascend.readthedocs.io/en/v0.7.3-dev/quick_start.html
@@ -88,7 +135,7 @@ This is 2nd release candidate of v0.7.3 for vllm-ascend. Please follow the [offi
  - Fixed a bug to make sure the multi-step scheduler feature works. [#349](https://github.com/vllm-project/vllm-ascend/pull/349)
  - Fixed a bug to make the prefix cache feature work with correct accuracy. [#424](https://github.com/vllm-project/vllm-ascend/pull/424)

- ## v0.7.3rc1
+ ## v0.7.3rc1 - 2025.03.14

  🎉 Hello, World! This is the first release candidate of v0.7.3 for vllm-ascend. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/v0.7.3-dev) to start the journey.
  - Quickstart with container: https://vllm-ascend.readthedocs.io/en/v0.7.3-dev/quick_start.html
@@ -116,7 +163,7 @@ This is 2nd release candidate of v0.7.3 for vllm-ascend. Please follow the [offi
  - In [some cases](https://github.com/vllm-project/vllm-ascend/issues/324), especially when the input/output is very long, the accuracy of the output may be incorrect. We are working on it; it'll be fixed in the next release.
  - Improved and reduced the garbled code in model output. But if you still hit the issue, try to change generation config values, such as `temperature`, and try again (see the sketch below). There is also a known issue shown below. Any [feedback](https://github.com/vllm-project/vllm-ascend/issues/267) is welcome. [#277](https://github.com/vllm-project/vllm-ascend/pull/277)
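
A sketch of the generation-config tweak suggested above, via vLLM's standard `SamplingParams`; the values are illustrative starting points:

```python
from vllm import SamplingParams

# Lower temperature (and a bounded top_p) makes sampling more conservative,
# which is the knob the note above suggests trying against garbled output.
params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=128)
```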

- ## v0.7.1rc1
+ ## v0.7.1rc1 - 2025.02.19

  🎉 Hello, World!