Is there a way to set max-concurrency #2691
tommasofavaron1 announced in Q&A
Hi everyone,
I wanted to ask whether it is possible to configure a maximum number of concurrent requests. Does it make sense to do this, thereby increasing num_requests_waiting, or is it just overhead that doesn't speed up the requests in the queue?

Replies: 2 comments
- Any pointers on this would be greatly appreciated! 🙏
- Is this what you're looking for? https://docs.vllm.ai/en/v0.6.1.post2/dev/engine/async_llm_engine.html#vllm.AsyncLLMEngine.limit_concurrency
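Besides the engine-side `limit_concurrency` option linked above, a common alternative is to cap concurrency on the client side. Below is a minimal sketch using `asyncio.Semaphore`: at most `MAX_CONCURRENCY` requests are in flight at once, and the rest wait (forming the queue). The `generate` function here is a hypothetical stand-in for your actual async call into the engine or server, not a real vLLM API.

```python
import asyncio

MAX_CONCURRENCY = 4  # cap on simultaneous in-flight requests

# Counters to observe the effect of the cap (for demonstration only).
in_flight = 0
peak = 0

async def generate(prompt: str) -> str:
    # Hypothetical stand-in for a real async engine/server call.
    await asyncio.sleep(0.01)
    return f"completion for: {prompt}"

async def bounded_generate(sem: asyncio.Semaphore, prompt: str) -> str:
    global in_flight, peak
    # The semaphore admits at most MAX_CONCURRENCY coroutines past this
    # point; excess requests wait here instead of hitting the engine.
    async with sem:
        in_flight += 1
        peak = max(peak, in_flight)
        try:
            return await generate(prompt)
        finally:
            in_flight -= 1

async def main(prompts: list[str]) -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    return await asyncio.gather(*(bounded_generate(sem, p) for p in prompts))

results = asyncio.run(main([f"prompt {i}" for i in range(10)]))
```

With this pattern, queuing happens in your process rather than inside the engine; whether that improves anything depends on where the bottleneck is, since the cap only reorders waiting and does not add throughput.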