How to stream model generation with the vllm.LLM programming API?
#15239
Closed · CNSeniorious000 announced in Q&A · Replies: 1 comment
CNSeniorious000:
I know that running vLLM as a server and using the OpenAI-compatible API can achieve streaming, but I don't want to start a server. Can I generate streamed output with the programming API instead? I've walked through the docs and found nothing about this.

Reply:
This cannot be done.