llama.cpp server support for alternate EOS/antiprompt settings to support non-llama prompt formats #4474
SanDiegoDude started this conversation in Ideas
Replies: 1 comment, 1 reply
-
Same problems here. Even with main, anything other than the Llama 2 70B model with the Llama 2 prompt format just starts outputting a conversation without any control. I don't know whether the reverse prompt even works with Mixtral; for me it sometimes keeps going right past the prompt I give it. Adding ### Response: did not stop it from continuing, and it just seems "out of control" compared to 70B with the Llama 2 syntax. I haven't had anything else handle prompting for any length of time without completely falling apart into repetition or continuous output, unless it is the Llama 2 70B chat model.
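For context on the two prompt styles compared above, here is a minimal sketch of each. The system message, whitespace, and instruction text are illustrative assumptions, not settings taken from this thread, and BOS/EOS token handling is left to the loader:

```python
# Sketches of the two prompt formats discussed above.
# The system text, whitespace, and example instruction are assumptions
# for illustration only; BOS/EOS tokens are left to the tokenizer/loader.

def llama2_chat_prompt(system: str, user: str) -> str:
    """Llama 2 chat template, the format the 70B chat model expects."""
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

def alpaca_prompt(instruction: str) -> str:
    """Alpaca-style '###' template; '### Response:' doubles as the antiprompt."""
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
    )

if __name__ == "__main__":
    print(llama2_chat_prompt("You are a helpful assistant.", "Describe this image."))
    print(alpaca_prompt("Describe this image."))
```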
-
Hi there,
Support for the Obsidian 3B models was added recently; however, attempting to use them in multimodal form with the llama.cpp server is an exercise in frustration because there is no way to set the EOS/antiprompt for the model, so it keeps repeating itself until it caps out on tokens. I'm building a captioning tool that depends on speed, so simply filtering out the response past the first ### isn't a viable option. llama.cpp main supports manually setting --reverse-prompt and even works in instruction mode (which catches the ### properly), but that doesn't work as a server.
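To illustrate the kind of control being requested, here is a minimal client-side sketch against the server's /completion endpoint, mirroring what main's --reverse-prompt does. The host, port, prompt text, and "###" stop string are placeholder assumptions, and whether a given server build honors them for this model is exactly the open question in this post:

```python
# Sketch of a client request to the llama.cpp server's /completion endpoint.
# Host, port, prompt, and the "###" stop string are placeholder assumptions;
# this shows the antiprompt-style control being asked for, analogous to
# main's --reverse-prompt, not a confirmed working configuration.
import json
import urllib.request

payload = {
    "prompt": "### Instruction:\nDescribe this image.\n\n### Response:\n",
    "n_predict": 256,   # cap generation so a runaway model cannot exhaust tokens
    "stop": ["###"],    # stopping string(s) to cut off repeated '###' turns
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```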