Is it possible to do continuous batching with an OpenAI ChatCompletion-compatible interface? #1605
Closed
msugiyama57 announced in Q&A
Replies: 1 comment
-
The following code shows an example of continuous batching. Can the same thing be done through the OpenAI ChatCompletion-compatible interface?
https://github.com/vllm-project/vllm/blob/main/examples/offline_inference.py
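For reference, here is a condensed sketch of what that linked example does: it hands a whole batch of prompts to vLLM's `LLM.generate`, and the engine schedules them with continuous batching internally. The model name and sampling settings below are illustrative assumptions, not requirements.

```python
# Minimal sketch of offline batched inference with vLLM, loosely following
# examples/offline_inference.py. Model and sampling settings are assumptions.
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# The engine continuously batches these prompts internally; no manual
# batching logic is needed on the caller's side.
llm = LLM(model="facebook/opt-125m")  # assumed model; any supported model works
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")
```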
-
The OpenAI-compatible server already batches concurrent requests automatically; just try sending concurrent requests from any OpenAI-compatible client!
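A minimal sketch of how that looks from the client side, assuming a vLLM OpenAI-compatible server running locally on port 8000 and the `openai` Python package (v1+). The base URL, API key, and model name are placeholders for your deployment:

```python
# Sketch: issue concurrent ChatCompletion requests so the server's scheduler
# can batch them together. Endpoint, key, and model name are assumptions.
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:8000/v1",  # assumed vLLM server address
    api_key="EMPTY",  # vLLM does not require a real key by default
)

async def ask(prompt: str) -> str:
    resp = await client.chat.completions.create(
        model="meta-llama/Llama-2-7b-chat-hf",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def main() -> None:
    prompts = [
        "What is continuous batching?",
        "Explain KV caching.",
        "What is vLLM?",
    ]
    # Sending the requests concurrently is all that is needed; the server
    # interleaves them at the iteration level on its own.
    answers = await asyncio.gather(*(ask(p) for p in prompts))
    for prompt, answer in zip(prompts, answers):
        print(prompt, "->", answer)

asyncio.run(main())
```

Because the server handles scheduling itself, there is no client-side batching API to call; concurrency alone triggers the batching.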