Is it possible to do continuous batching with an OpenAI ChatCompletion-compatible interface? #1605
Closed
msugiyama57 announced in Q&A
Replies: 1 comment
-
The following code shows an example of continuous batching. Can the same thing be done through the OpenAI ChatCompletion-compatible interface?
https://github.com/vllm-project/vllm/blob/main/examples/offline_inference.py
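For reference, here is a condensed sketch of what that linked example does: it hands a whole batch of prompts to vLLM's `LLM.generate`, and the engine schedules them with continuous batching internally. The model name and sampling settings below are illustrative assumptions, not requirements.

```python
# Minimal sketch of offline batched inference with vLLM, loosely following
# examples/offline_inference.py. Model and sampling settings are assumptions.
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# The engine continuously batches these prompts internally; no manual
# batching logic is needed on the caller's side.
llm = LLM(model="facebook/opt-125m")  # assumed model; any supported model works
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")
```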
-
The OpenAI-compatible server already batches concurrent requests automatically; just try sending concurrent requests from any OpenAI-compatible client!
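A minimal sketch of how that looks from the client side, assuming a vLLM OpenAI-compatible server running locally on port 8000 and the `openai` Python package (v1+). The base URL, API key, and model name are placeholders for your deployment:

```python
# Sketch: issue concurrent ChatCompletion requests so the server's scheduler
# can batch them together. Endpoint, key, and model name are assumptions.
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:8000/v1",  # assumed vLLM server address
    api_key="EMPTY",  # vLLM does not require a real key by default
)

async def ask(prompt: str) -> str:
    resp = await client.chat.completions.create(
        model="meta-llama/Llama-2-7b-chat-hf",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def main() -> None:
    prompts = [
        "What is continuous batching?",
        "Explain KV caching.",
        "What is vLLM?",
    ]
    # Sending the requests concurrently is all that is needed; the server
    # interleaves them at the iteration level on its own.
    answers = await asyncio.gather(*(ask(p) for p in prompts))
    for prompt, answer in zip(prompts, answers):
        print(prompt, "->", answer)

asyncio.run(main())
```

Because the server handles scheduling itself, there is no client-side batching API to call; concurrency alone triggers the batching.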