
Commit e112317

[Doc] Add reinstall instructions doc (#1303)
Add a new FAQ: if users reinstall vllm-ascend with pip, the `build` folder should be removed first.

Signed-off-by: rjg-lyh <1318825571@qq.com>
Signed-off-by: weiguihua <weiguihua2@huawei.com>
Signed-off-by: weiguihua2 <weiguihua2@huawei.com>
1 parent 15592c0 commit e112317

File tree

1 file changed: +4 −1 lines changed


docs/source/faqs.md

Lines changed: 4 additions & 1 deletion
@@ -114,7 +114,7 @@ In scenarios where NPUs have limited HBM (High Bandwidth Memory) capacity, dynam

- **Configure `PYTORCH_NPU_ALLOC_CONF`**: Set this environment variable to optimize NPU memory management. For example, you can `export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True` to enable virtual memory feature to mitigate memory fragmentation caused by frequent dynamic memory size adjustments during runtime, see more note in: [PYTORCH_NPU_ALLOC_CONF](https://www.hiascend.com/document/detail/zh/Pytorch/700/comref/Envvariables/Envir_012.html).
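As a quick reference, the variable can be exported in the shell that launches vLLM; a minimal sketch of the command already quoted above:

```shell
# Enable the expandable-segments virtual-memory feature of the NPU allocator
# to mitigate fragmentation from frequent dynamic memory-size adjustments.
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True

# Confirm the setting before starting the server.
echo "PYTORCH_NPU_ALLOC_CONF=$PYTORCH_NPU_ALLOC_CONF"
```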

-### 15. Failed to enable NPU graph mode when running DeepSeek?
+### 16. Failed to enable NPU graph mode when running DeepSeek?
You may encounter the following error when running DeepSeek with NPU graph mode enabled. When both MLA and graph mode are enabled, the allowed number of query heads per kv head is restricted to {32, 64, 128}. **This is therefore not supported for DeepSeek-V2-Lite**, which has only 16 attention heads. NPU graph mode support for DeepSeek-V2-Lite will be added in the future.

If you are using DeepSeek-V3 or DeepSeek-R1, please make sure that, after the tensor-parallel split, num_heads / num_kv_heads is in {32, 64, 128}.
@@ -123,3 +123,6 @@ And if you're using DeepSeek-V3 or DeepSeek-R1, please make sure after the tenso
[rank0]: RuntimeError: EZ9999: Inner Error!
[rank0]: EZ9999: [PID: 62938] 2025-05-27-06:52:12.455.807 numHeads / numKvHeads = 8, MLA only support {32, 64, 128}.[FUNC:CheckMlaAttrs][FILE:incre_flash_attention_tiling_check.cc][LINE:1218]
```
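The head-count constraint can be sanity-checked before launch. A hedged sketch with illustrative values (the head counts below are assumptions for demonstration, not read from any real model config):

```shell
# Assumed example values: head counts per rank after the tensor-parallel split.
NUM_HEADS=128       # attention heads per rank (illustrative assumption)
NUM_KV_HEADS=1      # kv heads per rank (illustrative assumption)

RATIO=$(( NUM_HEADS / NUM_KV_HEADS ))
case "$RATIO" in
  32|64|128) echo "num_heads / num_kv_heads = $RATIO: supported" ;;
  *)         echo "num_heads / num_kv_heads = $RATIO: not supported, expect EZ9999" ;;
esac
```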
+
+### 17. Failed to reinstall vllm-ascend from source after uninstalling vllm-ascend?
+You may encounter a C compilation failure when reinstalling vllm-ascend from source with pip. If the installation fails, it is recommended to install with `python setup.py install`, or to run `python setup.py clean` first to clear the stale build cache.
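The recovery steps can be sketched as follows. The `build/` directory here is simulated so the snippet is self-contained; in practice these commands run inside the vllm-ascend source checkout (an assumption):

```shell
# Simulate the stale build cache that a previous pip source install leaves behind.
mkdir -p build

# Step 1: clear the stale cache (similar in effect to `python setup.py clean`).
rm -rf build
[ -d build ] && echo "build cache still present" || echo "build cache cleared"

# Step 2: reinstall from the clean tree (run inside the vllm-ascend checkout):
#   python setup.py install
```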
