Commit e4e0b7a

[Doc] Add patch doc (#1414)
1. Format the developer guide content to make it more clear
2. Add the patch doc for developer guide

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
1 parent 52317f9 commit e4e0b7a

File tree

10 files changed

+112
-12
lines changed

docs/source/developer_guide/contributing.md renamed to docs/source/developer_guide/contribution/index.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -85,3 +85,10 @@ If the PR spans more than one category, please include all relevant prefixes.
8585

8686
You may find more information about contributing to vLLM Ascend backend plugin on [docs.vllm.ai](https://docs.vllm.ai/en/latest/contributing/overview.html).
8787
If you run into any problem while contributing, feel free to submit a PR that improves the doc to help other developers.
88+
89+
90+
:::{toctree}
91+
:caption: Index
92+
:maxdepth: 1
93+
testing
94+
:::
Lines changed: 2 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,17 +1,10 @@
1-
# Evaluation
1+
# Accuracy
22

33
:::{toctree}
44
:caption: Accuracy
55
:maxdepth: 1
6+
using_evalscope
67
using_lm_eval
78
using_opencompass
8-
using_evalscope
99
accuracy_report/index
1010
:::
11-
12-
:::{toctree}
13-
:caption: Performance
14-
:maxdepth: 1
15-
performance_benchmark
16-
profile_execute_duration
17-
:::
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
# Feature Guide
2+
3+
This section provides an overview of the features implemented in vLLM Ascend. Developers can refer to this guide to understand how vLLM Ascend works.
4+
5+
:::{toctree}
6+
:caption: Feature Guide
7+
:maxdepth: 1
8+
patch
9+
:::
Lines changed: 82 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,82 @@
1+
# Patch in vLLM Ascend
2+
3+
vLLM Ascend is a platform plugin for vLLM. Because vLLM and vLLM Ascend have different release cycles, and because of hardware limitations in some cases, we need to patch some code in vLLM to make it compatible with vLLM Ascend.
4+
5+
The vLLM Ascend code provides a patch module, `vllm_ascend/patch`, that holds these changes to vLLM.
6+
7+
## Principle
8+
9+
Keep in mind that patching is not the best way to make vLLM Ascend compatible; it is only a temporary solution. The best way is to contribute the change to vLLM itself so that it is compatible with vLLM Ascend out of the box. vLLM Ascend follows these basic principles for its patch strategy:
10+
11+
1. Less is more. Do not patch unless it is currently the only option.
12+
2. Every patch that is added must describe the future plan for removing it.
13+
3. Cleaning up patch code is welcome at any time.
14+
15+
## How it works
16+
17+
In `vllm_ascend/patch`, you can see the code structure as follows:
18+
19+
```
20+
vllm_ascend
21+
├── patch
22+
│ ├── platform
23+
│ │ ├── patch_0_9_1
24+
│ │ ├── patch_common
25+
│ │ ├── patch_main
26+
│ ├── worker
27+
│ │ ├── patch_0_9_1
28+
│ │ ├── patch_common
29+
│ │ ├── patch_main
30+
└───────────
31+
```
32+
33+
- **platform**: The patch code in this directory patches code in the vLLM main process. It is called by `vllm_ascend/platform::NPUPlatform::pre_register_and_update` very early, when vLLM is initialized.
34+
  - For online mode, the vLLM process calls the platform patch from `vllm/vllm/engine/arg_utils.py::AsyncEngineArgs.add_cli_args` when parsing the CLI arguments.
35+
  - For offline mode, the vLLM process calls the platform patch from `vllm/vllm/engine/arg_utils.py::EngineArgs.create_engine_config` when parsing the input parameters.
36+
- **worker**: The patch code in this directory patches code in the vLLM worker process. It is called by `vllm_ascend/worker/worker_v1::NPUWorker::__init__` when the vLLM worker process is initialized.
37+
  - For both online and offline mode, the vLLM engine core process calls the worker patch from `vllm/vllm/worker/worker_base.py::WorkerWrapperBase.init_worker` when initializing the worker process.
38+
39+
Both the **platform** and **worker** folders contain several patch modules. They are used for patching different versions of vLLM.
40+
41+
- `patch_0_9_1`: This module patches vLLM 0.9.1. The version in the name always tracks the latest vLLM release. When a new vLLM version is released, we drop this patch module and add one for the new version. For example, `patch_0_9_2` is used for patching vLLM 0.9.2.
42+
- `patch_main`: This module patches the code in the vLLM main branch.
43+
- `patch_common`: This module patches both vLLM 0.9.1 and the vLLM main branch.
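
As a rough illustration of how the three module kinds relate, the selection can be sketched as below. This is a minimal sketch, not the actual vLLM Ascend loader: the helper `select_patch_modules`, its `pinned` parameter, and the load order are all assumptions made for the example.

```python
def select_patch_modules(vllm_version: str, pinned: str = "0.9.1") -> list[str]:
    """Return the patch module names to load for a given vLLM version.

    Hypothetical helper: `pinned` is the released vLLM version that the
    versioned patch module targets; any other version is treated as main.
    """
    versioned = "patch_" + pinned.replace(".", "_")
    specific = versioned if vllm_version == pinned else "patch_main"
    # patch_common applies to both the pinned release and the main branch.
    # The order used here (common first) is an assumption of this sketch.
    return ["patch_common", specific]


release_modules = select_patch_modules("0.9.1")
main_modules = select_patch_modules("0.9.2.dev1")
```

With these assumptions, a 0.9.1 install gets `patch_common` plus `patch_0_9_1`, and anything else gets `patch_common` plus `patch_main`.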
44+
45+
## How to write a patch
46+
47+
Before writing a patch, follow the principle above and patch as little code as possible. If a patch is necessary, put it in either the **platform** or the **worker** folder. Here is an example that patches the `distributed` module in vLLM.
48+
49+
1. Decide which version of vLLM to patch. In this example, after analysis, we want to patch both vLLM 0.9.1 and the vLLM main branch.
50+
2. Decide which process to patch. In this example, `distributed` belongs to the vLLM main process, so we should patch `platform`.
51+
3. Create the patch file in the right folder. The file should be named `patch_{module_name}.py`. The example here is `vllm_ascend/patch/platform/patch_common/patch_distributed.py`.
52+
4. Write your patch code in the new file. Here is an example:
53+
```python
54+
import vllm
55+
56+
def patch_destroy_model_parallel():
57+
    # your patch code
58+
    ...
59+
60+
vllm.distributed.parallel_state.destroy_model_parallel = patch_destroy_model_parallel
61+
```
62+
5. Import the patch file in `__init__.py`. In this example, add `import vllm_ascend.patch.platform.patch_common.patch_distributed` into `vllm_ascend/patch/platform/patch_common/__init__.py`.
63+
6. Add the description of the patch in `vllm_ascend/patch/__init__.py`. The description format is as follows:
64+
```
65+
# ** File: <The patch file name> **
66+
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
67+
# 1. `<The target patch module in vLLM>`
68+
# Why:
69+
# <Describe the reason why we need to patch>
70+
# How:
71+
# <Describe the way to patch>
72+
# Related PR (if no, explain why):
73+
# <Add a link to the related PR in vLLM. If there is no related PR, explain why>
74+
# Future Plan:
75+
# <Describe the future plan to remove the patch>
76+
```
77+
7. Add a unit test and an E2E test. Any newly added code in vLLM Ascend should come with both. You can find more details in the [test guide](../contribution/testing.md).
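
The mechanism behind step 4 is ordinary attribute rebinding on a module object. The sketch below shows it on a stand-in module so that it runs without vLLM installed; `fake_parallel_state` and the string return values are made up for the illustration and are not part of vLLM or vLLM Ascend.

```python
import types

# Stand-in for a vLLM module such as vllm.distributed.parallel_state
# (an assumption for this sketch; nothing here imports vLLM).
parallel_state = types.ModuleType("fake_parallel_state")


def _original_destroy():
    return "original"


parallel_state.destroy_model_parallel = _original_destroy

# A caller that bound the function by name BEFORE the patch was applied.
early_reference = parallel_state.destroy_model_parallel


def patch_destroy_model_parallel():
    return "patched"


# The patch itself: rebind the attribute on the module object.
parallel_state.destroy_model_parallel = patch_destroy_model_parallel

# Lookups through the module now see the patch ...
late_result = parallel_state.destroy_model_parallel()
# ... but a reference taken before the patch still calls the old function.
early_result = early_reference()
```

The last two lines suggest why it matters that the patch modules are imported very early: code that captured the original function before the rebinding keeps using it.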
78+
79+
80+
## Limitation
81+
1. In the V1 engine, vLLM starts three kinds of processes: the main process, the EngineCore process, and the worker process. By default, vLLM Ascend only supports patching code in the main process and the worker process. If you want to patch code that runs in the EngineCore process, you must patch the EngineCore process as a whole during setup. The entry code is in `vllm.v1.engine.core`; override `EngineCoreProc` and `DPEngineCoreProc` entirely.
82+
2. If you run a modified copy of vLLM, its version may change automatically. For example, if you run a modified vLLM based on v0.9.1, the version may become v0.9.2xxx. In that case the patch for v0.9.1 in vLLM Ascend will not work as expected, because vLLM Ascend cannot tell which version of vLLM you are using. To work around this, set the environment variable `VLLM_VERSION` to the version of vLLM you are using; the patch for v0.9.1 should then work.
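
The override described in the second limitation can be pictured as follows. This is a hedged sketch only: the environment variable name `VLLM_VERSION` comes from the text above, while the helper `resolve_vllm_version` is hypothetical and is not the function vLLM Ascend actually uses.

```python
import os


def resolve_vllm_version(installed_version: str) -> str:
    """Prefer the VLLM_VERSION environment variable over the detected version.

    Hypothetical helper for illustration only.
    """
    return os.environ.get("VLLM_VERSION") or installed_version


# A modified build reports a dev version, so a v0.9.1 patch would be skipped.
os.environ.pop("VLLM_VERSION", None)
detected = resolve_vllm_version("0.9.2.dev0")

# Pinning the version makes the v0.9.1 patch selection apply again.
os.environ["VLLM_VERSION"] = "0.9.1"
pinned = resolve_vllm_version("0.9.2.dev0")
```

In practice the variable would be set in the shell before launching vLLM rather than from Python.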
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
# Performance
2+
3+
:::{toctree}
4+
:caption: Performance
5+
:maxdepth: 1
6+
performance_benchmark
7+
profile_execute_duration
8+
:::

docs/source/index.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -56,10 +56,10 @@ user_guide/release_notes
5656
:::{toctree}
5757
:caption: Developer Guide
5858
:maxdepth: 1
59-
developer_guide/contributing
60-
developer_guide/testing
61-
developer_guide/versioning_policy
59+
developer_guide/contribution/index
60+
developer_guide/feature_guide/index
6261
developer_guide/evaluation/index
62+
developer_guide/performance/index
6363
:::
6464

6565
% How to involve vLLM Ascend
@@ -68,5 +68,6 @@ developer_guide/evaluation/index
6868
:maxdepth: 1
6969
community/governance
7070
community/contributors
71+
community/versioning_policy
7172
community/user_stories/index
7273
:::
