neural-chat-7b-v3-1: vllm.entrypoints.openai.api_server request returns multiple turns of dialogue instead of one #2256
cninnovationai
announced in
Q&A
Replies: 1 comment
-
Add IMO this should be something you should be able to configure as additional stop-words when you start vLLM, so you don't have to deal with adding them manually.
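For anyone landing here, a rough sketch of the per-request workaround, assuming the extra turns always begin with the `[INST]` / `[/INST]` tags shown in the question. The OpenAI-style completions body accepts a `stop` list, so the markers can be sent with each request. The model name, URL, and helper names (`build_payload`, `truncate_at_stop`, `complete`) are placeholders of mine, not something from the original post, and the client-side truncation is only a fallback in case a marker still slips through.

```python
import json
import urllib.request

# Assumed stop markers: the [INST] / [/INST] tags seen in the bad responses.
STOP_MARKERS = ["[INST]", "[/INST]"]


def build_payload(prompt, model="neural-chat-7b-v3-1", max_tokens=256):
    """Build a /v1/completions body that asks the server to stop at the markers.

    The model name is a placeholder; use whatever name the server was started with.
    """
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "stop": STOP_MARKERS,
    }


def truncate_at_stop(text, markers=STOP_MARKERS):
    """Client-side fallback: cut the text at the earliest marker, if any."""
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


def complete(prompt, url="http://localhost:8000/v1/completions"):
    """Send the request; the URL is a placeholder for a local vLLM server."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Apply the fallback truncation to the first choice's text.
    return truncate_at_stop(body["choices"][0]["text"])
```

Calling `complete("...")` with your formatted prompt should then return only the first assistant turn. This still means passing the stop list on every request, which is the manual step the comment above would like to avoid.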
-
I'm loading the checkpoints via vLLM.
The inference command I'm running is:
The problem is that a single request sometimes returns multiple turns of dialogue.
The curl request looks like this:
Most of the time the response is normal, like this:
But sometimes it goes wrong and the response contains multiple rounds of dialogue (-> [INST] xxxxxx [/INST]), like this:
Does anyone know how to solve this problem? I would be very grateful.