Skip to content

Commit 99e6855

Browse files
[Doc] Add Qwen2.5-VL eager mode doc (#1394)
### What this PR does / why we need it? Add Qwen2.5-VL eager mode doc. --------- Signed-off-by: shen-shanshan <467638484@qq.com>
1 parent d59e7fa commit 99e6855

File tree

1 file changed

+10
-3
lines changed

1 file changed

+10
-3
lines changed

docs/source/tutorials/single_npu_multimodal.md

Lines changed: 10 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,9 @@ docker run --rm \
2929
Setup environment variables:
3030

3131
```bash
32+
# Use vllm v1 engine
33+
export VLLM_USE_V1=1
34+
3235
# Load model from ModelScope to speed up download
3336
export VLLM_USE_MODELSCOPE=True
3437

@@ -57,6 +60,7 @@ llm = LLM(
5760
model=MODEL_PATH,
5861
max_model_len=16384,
5962
limit_mm_per_prompt={"image": 10},
63+
enforce_eager=True,
6064
)
6165

6266
sampling_params = SamplingParams(
@@ -103,13 +107,11 @@ outputs = llm.generate([llm_inputs], sampling_params=sampling_params)
103107
generated_text = outputs[0].outputs[0].text
104108

105109
print(generated_text)
106-
107110
```
108111

109112
If you run this script successfully, you can see the info shown below:
110113

111114
```bash
112-
Processed prompts: 100%|███████████████| 1/1 [00:11<00:00, 11.29s/it, est. speed input: 9.48 toks/s, output: 20.55 toks/s]
113115
The image displays a logo consisting of two main elements: a stylized geometric design and a pair of text elements.
114116

115117
1. **Geometric Design**: On the left side of the image, there is a blue geometric design that appears to be made up of interconnected shapes. These shapes resemble a network or a complex polygonal structure, possibly hinting at a technological or interconnected theme. The design is monochromatic and uses only blue as its color, which could be indicative of a specific brand or company.
@@ -141,10 +143,15 @@ docker run --rm \
141143
-v /etc/ascend_install.info:/etc/ascend_install.info \
142144
-v /root/.cache:/root/.cache \
143145
-p 8000:8000 \
146+
-e VLLM_USE_V1=1 \
144147
-e VLLM_USE_MODELSCOPE=True \
145148
-e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
146149
-it $IMAGE \
147-
vllm serve Qwen/Qwen2.5-VL-7B-Instruct --dtype bfloat16 --max_model_len 16384 --max-num-batched-tokens 16384
150+
vllm serve Qwen/Qwen2.5-VL-7B-Instruct \
151+
--dtype bfloat16 \
152+
--max_model_len 16384 \
153+
--max-num-batched-tokens 16384 \
154+
--enforce-eager
148155
```
149156
150157
:::{note}

0 commit comments

Comments
 (0)