Is there a way to terminate requests on the model server? #5020
gabohouhou
announced in
Q&A
Replies: 0 comments
For example, 10 people call the model API, so there are 10 requests running. 3 of them wait too long and cancel their requests at the front-end. Is there a way to terminate the model server requests for those 3 people while the other 7 keep waiting? Can this be done with vLLM?
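One common pattern is to treat each API call as an independent cancellable task, so aborting one request never disturbs the others. (vLLM's async engine exposes a per-request abort by request ID for this purpose; the exact call below is only referenced in a comment and should be checked against the vLLM version in use.) Here is a minimal, framework-free asyncio sketch of the idea, with `handle_request` standing in for a model inference call:

```python
import asyncio


async def handle_request(request_id: str, duration: float) -> str:
    # Stand-in for model inference. With vLLM you would instead consume
    # the async generate stream and, on client disconnect, abort just
    # this request by its request_id (API name assumed; verify against
    # your vLLM version).
    await asyncio.sleep(duration)
    return f"{request_id}: done"


async def main() -> list[str]:
    # 10 concurrent requests, each wrapped in its own asyncio task.
    tasks = {
        f"req-{i}": asyncio.create_task(handle_request(f"req-{i}", 0.05))
        for i in range(10)
    }
    # 3 clients cancel at the front-end: cancel only their tasks.
    for rid in ("req-0", "req-1", "req-2"):
        tasks[rid].cancel()
    # The remaining 7 tasks run to completion untouched.
    results = []
    for rid, task in tasks.items():
        try:
            results.append(await task)
        except asyncio.CancelledError:
            results.append(f"{rid}: aborted")
    return results


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

In a real server the cancellation trigger would be the front-end closing the HTTP connection (most async web frameworks surface this as a disconnect event), at which point the handler cancels or aborts only that request's task.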