-
Please see here. In my experience Gemma does not work like other models with a repeat penalty other than 1.0 (i.e. disabled), and llama.cpp defaults to 1.1 if you don't specify one.
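For reference, a minimal sketch of what that looks like on the command line; the model filename here is just a placeholder, so substitute your own GGUF path:

```sh
# Disable the repeat penalty explicitly; llama.cpp defaults to 1.1 when the
# flag is omitted, which (in my experience) Gemma does not handle well.
./main -m gemma-7b-it-q4_0.gguf -i --color --repeat-penalty 1.0
```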
-
I seem to get poor results from the Gemma-7B-it 4-bit quantized model. It behaves very differently from other popular LLMs such as Llama-2-7B or Mistral-7B. Almost all other models work well with default settings: I only need to run "main -m MODEL -i --color" to get a good interactive chat going. But when I do the same with gemma-7b, it is both too long-winded and ignores the latest prompt, rambling on based on the first prompt instead. Do you have a similar experience?
What command-line parameters are you using with llama.cpp for gemma-7b?