Commit 3ef45d0

feat: Improve the offline_inference npu v0/v1 scripts (#1669)
### What this PR does / why we need it?

Improvements:

- Keep the same file name format as v1: `offline_inference_npu_v0.py`, `offline_inference_npu_v1.py`
- Set `VLLM_USE_V1` = 0/1 explicitly in the Python scripts
- Fix some run errors in `offline_inference_npu_v1.py`, e.g. `deepseekv3-lite-base-latest` does not exist on ModelScope or HF.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

- vLLM version: v0.9.2
- vLLM main: vllm-project/vllm@baed180

Signed-off-by: xleoken <xleoken@163.com>
1 parent 6af35f6 commit 3ef45d0

2 files changed: +9 -3 lines changed

examples/offline_inference_npu.py renamed to examples/offline_inference_npu_v0.py

Lines changed: 5 additions & 0 deletions
@@ -17,6 +17,11 @@
 # Adapted from vllm-project/vllm/examples/offline_inference/basic.py
 #
 
+import os
+
+os.environ["VLLM_USE_V1"] = "0"
+os.environ["VLLM_USE_MODELSCOPE"] = "True"
+
 from vllm import LLM, SamplingParams
 
 prompts = [
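
The hunk above places the environment settings ahead of the `vllm` import, presumably so they are already in place when vLLM reads its configuration. A minimal sketch of that ordering follows; `configure_vllm_env` is a hypothetical helper introduced only for illustration and is not part of this commit or of vLLM, and the `vllm` import is left commented out so the snippet runs without vLLM installed.

```python
import os


def configure_vllm_env(use_v1: bool, use_modelscope: bool = True) -> dict:
    """Hypothetical helper: set the variables the example scripts set."""
    settings = {
        # "0" selects the V0 engine, "1" the V1 engine (values as used in the diff).
        "VLLM_USE_V1": "1" if use_v1 else "0",
        # The scripts use the string "True" to enable ModelScope downloads.
        "VLLM_USE_MODELSCOPE": "True" if use_modelscope else "False",
    }
    os.environ.update(settings)
    return settings


# Configure the environment first ...
configure_vllm_env(use_v1=False)

# ... and only then import vLLM, mirroring the order in the v0 script.
# from vllm import LLM, SamplingParams
```

Keeping the assignments in one place makes it harder to reintroduce the ordering that this commit moves away from in the v1 script, where the variables were set after the import.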

examples/offline_inference_npu_v1.py

Lines changed: 4 additions & 3 deletions
@@ -19,10 +19,11 @@
 
 import os
 
+os.environ["VLLM_USE_MODELSCOPE"] = "True"
+os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"
+
 from vllm import LLM, SamplingParams
 
-os.environ["VLLM_USE_V1"] = "1"
-os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"
 
 
 if __name__ == "__main__":
     prompts = [
@@ -35,7 +36,7 @@
     # Create a sampling params object.
     sampling_params = SamplingParams(max_tokens=100, temperature=0.0)
     # Create an LLM.
-    llm = LLM(model="/data/weights/deepseek-ai/deepseekv3-lite-base-latest",
+    llm = LLM(model="deepseek-ai/DeepSeek-V2-Lite",
               tensor_parallel_size=2,
               enforce_eager=True,
               trust_remote_code=True,
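
The last hunk replaces an absolute path under `/data/weights` with a hub-style identifier, which matches the commit message's note that the old name could not be resolved. As a rough sketch, a script could classify the `model` string before constructing `LLM`; `describe_model_ref` is a hypothetical helper for illustration only and does not reflect how vLLM itself resolves model names.

```python
import os


def describe_model_ref(model: str) -> str:
    """Hypothetical helper: classify a model string before passing it to LLM()."""
    if os.path.isdir(model):
        # An existing directory can be loaded from disk without a download.
        return "local directory"
    if os.path.isabs(model):
        # An absolute path that does not exist on this machine cannot be
        # resolved locally, as with the old /data/weights/... value.
        return "missing local path"
    # Anything else is treated here as an "org/name" style hub repo ID.
    return "hub repo ID"


for ref in ("/data/weights/deepseek-ai/deepseekv3-lite-base-latest",
            "deepseek-ai/DeepSeek-V2-Lite"):
    print(ref, "->", describe_model_ref(ref))
```

A check like this only distinguishes the shape of the string; whether a hub repo ID actually exists still has to be confirmed against ModelScope or Hugging Face.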
