What’s the recommended way to use the vLLM OpenAI server for batch processing? #7639
ktrapeznikov
announced in
Q&A
Replies: 0 comments
I want to process a batch of requests. What is the recommended way?
I typically use multiple workers with ThreadPoolExecutor. Is there a better way?