
Commit c991d18

[Misc] small update (vllm-project#20462)

Authored by reidliu41, committed by huydhn.

Signed-off-by: reidliu41 <reid201711@gmail.com>

1 parent: 7b37ecc

3 files changed: 18 additions, 9 deletions

examples/offline_inference/profiling_tpu/README.md

Lines changed: 4 additions & 1 deletion
````diff
@@ -57,7 +57,10 @@ Once you have collected your profiles with this script, you can visualize them u
 Here are most likely the dependencies you need to install:
 
 ```bash
-pip install tensorflow-cpu tensorboard-plugin-profile etils importlib_resources
+pip install tensorflow-cpu \
+    tensorboard-plugin-profile \
+    etils \
+    importlib_resources
 ```
 
 Then you just need to point TensorBoard to the directory where you saved the profiles and visit `http://localhost:6006/` in your browser:
````

examples/online_serving/structured_outputs/README.md

Lines changed: 7 additions & 3 deletions
````diff
@@ -13,13 +13,15 @@ vllm serve Qwen/Qwen2.5-3B-Instruct
 To serve a reasoning model, you can use the following command:
 
 ```bash
-vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B --reasoning-parser deepseek_r1
+vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B \
+    --reasoning-parser deepseek_r1
 ```
 
 If you want to run this script standalone with `uv`, you can use the following:
 
 ```bash
-uvx --from git+https://github.com/vllm-project/vllm#subdirectory=examples/online_serving/structured_outputs structured-output
+uvx --from git+https://github.com/vllm-project/vllm#subdirectory=examples/online_serving/structured_outputs \
+    structured-output
 ```
 
 See [feature docs](https://docs.vllm.ai/en/latest/features/structured_outputs.html) for more information.
@@ -44,7 +46,9 @@ uv run structured_outputs.py --stream
 Run certain constraints, for example `structural_tag` and `regex`, streaming:
 
 ```bash
-uv run structured_outputs.py --constraint structural_tag regex --stream
+uv run structured_outputs.py \
+    --constraint structural_tag regex \
+    --stream
 ```
 
 Run all constraints, with reasoning models and streaming:
````
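Both README patches only re-wrap long one-liners: a trailing backslash tells the shell the command continues on the next line, so the wrapped and unwrapped forms are identical. A minimal sketch of that equivalence, using `echo` as a stand-in for the real `pip`/`vllm`/`uvx` commands so nothing is actually installed:

```shell
# A trailing backslash joins the next line onto the current command,
# so both invocations below receive exactly the same arguments and
# print the same string.
echo install tensorflow-cpu tensorboard-plugin-profile etils importlib_resources

echo install tensorflow-cpu \
    tensorboard-plugin-profile \
    etils \
    importlib_resources
```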

examples/others/tensorize_vllm_model.py

Lines changed: 7 additions & 5 deletions
```diff
@@ -202,7 +202,7 @@ def parse_args():
 
 
 
-def deserialize():
+def deserialize(args, tensorizer_config):
     if args.lora_path:
         tensorizer_config.lora_dir = tensorizer_config.tensorizer_dir
     llm = LLM(model=args.model,
@@ -242,7 +242,7 @@ def deserialize():
     return llm
 
 
-if __name__ == '__main__':
+def main():
     args = parse_args()
 
     s3_access_key_id = (getattr(args, 's3_access_key_id', None)
@@ -260,8 +260,6 @@ def deserialize():
 
     model_ref = args.model
 
-    model_name = model_ref.split("/")[1]
-
     if args.command == "serialize" or args.command == "deserialize":
         keyfile = args.keyfile
     else:
@@ -309,6 +307,10 @@ def deserialize():
             encryption_keyfile = keyfile,
             **credentials
         )
-        deserialize()
+        deserialize(args, tensorizer_config)
     else:
         raise ValueError("Either serialize or deserialize must be specified.")
+
+
+if __name__ == "__main__":
+    main()
```
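The `tensorize_vllm_model.py` change follows a common script refactor: top-level code moves into a `main()` function guarded by `if __name__ == "__main__":`, and `deserialize()` receives its dependencies (`args`, `tensorizer_config`) as parameters instead of reading module-level names. A minimal sketch of the same pattern, with hypothetical stand-ins (`--model` default, dict config) rather than the real vLLM objects:

```python
import argparse


def deserialize(args: argparse.Namespace, tensorizer_config: dict) -> str:
    # Dependencies arrive as parameters, so this function no longer
    # relies on variables created at module import time.
    return f"deserialize {args.model} from {tensorizer_config['dir']}"


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default="facebook/opt-125m")
    # Parse an empty argv so the sketch runs without command-line input.
    args = parser.parse_args([])
    tensorizer_config = {"dir": "/tmp/tensors"}  # hypothetical config
    print(deserialize(args, tensorizer_config))


if __name__ == "__main__":
    # The guard keeps importing this module side-effect free;
    # only direct execution runs main().
    main()
```

Passing dependencies explicitly makes `deserialize()` importable and testable on its own, which is the practical payoff of this kind of cleanup.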
