-
Please see here. In my experience Gemma does not work like other models with a repeat penalty other than 1.0 (i.e. disabled), and llama.cpp defaults to 1.1 if you don't specify one.
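For reference, a minimal sketch of what that looks like on the command line; the model filename here is just a placeholder, so substitute your own GGUF path:

```sh
# Disable the repeat penalty explicitly; llama.cpp defaults to 1.1 when the
# flag is omitted, which (in my experience) Gemma does not handle well.
./main -m gemma-7b-it-q4_0.gguf -i --color --repeat-penalty 1.0
```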
-
I seem to get poor results from the Gemma-7B-it 4-bit quantized model. It behaves very differently from other popular LLMs such as Llama-2-7B or Mistral-7B. Almost all other models work well with default settings: I only need to run "main -m MODEL -i --color" to get a good interactive chat going. But when I do the same with gemma-7b, it is both too long-winded and ignores the latest prompt, rambling on based on the first prompt instead. Do you have a similar experience?
What command-line parameters are you using with llama.cpp for gemma-7b?