Endless answering in one conversation. #10150

Closed · Answered by danbev
Jayoprell asked this question in Q&A

The output looks like what you would get from a base model rather than a chat/instruct-tuned model; a base model has no notion of turns, so it simply keeps generating tokens.

Could you double-check that the model you are using is indeed a chat/instruct model? (I can see that it has chat in the path, but perhaps it got overwritten or something.)

If I run the same prompt using llama-2-7b-chat.Q4_K_M.gguf I get:

$ ./llama-cli -m ./models/llama-2-7b-chat.Q4_K_M.gguf -p "You are a helpful assistant" -cnv
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (12th Gen Intel(R) Core(TM) i7-1260P)
build: 4003 (48e6e4c2) with cc (Ubuntu 13.2.0-23ubuntu4) 13.2.0 for x86_64-linux-gnu (debug)
m…
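One way to double-check the file itself is to look for a tokenizer.chat_template key in the GGUF metadata, which chat/instruct conversions usually carry and base-model conversions usually lack. Below is a minimal sketch of such a check; it assumes the GGUF v2/v3 little-endian layout, and the helper name gguf_metadata_keys is my own, not part of llama.cpp:

```python
import struct

# Byte sizes of fixed-width GGUF metadata value types, keyed by type id
# (0=uint8, 1=int8, 2=uint16, 3=int16, 4=uint32, 5=int32,
#  6=float32, 7=bool, 10=uint64, 11=int64, 12=float64)
FIXED_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}

def _read_string(f):
    # GGUF v2/v3 strings: uint64 length followed by UTF-8 bytes
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8", errors="replace")

def _skip_value(f, vtype):
    # Advance past one metadata value without keeping it
    if vtype in FIXED_SIZES:
        f.read(FIXED_SIZES[vtype])
    elif vtype == 8:                      # string
        _read_string(f)
    elif vtype == 9:                      # array: element type, count, elements
        (etype,) = struct.unpack("<I", f.read(4))
        (count,) = struct.unpack("<Q", f.read(8))
        for _ in range(count):
            _skip_value(f, etype)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")

def gguf_metadata_keys(path):
    """Return the metadata key names of a GGUF file (v2/v3, little-endian)."""
    keys = []
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        for _ in range(n_kv):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            _skip_value(f, vtype)
            keys.append(key)
    return keys

# Usage idea: if "tokenizer.chat_template" is not among the returned keys,
# the file is very likely a base-model conversion, which would explain the
# endless generation.
```

The gguf-py tooling that ships with llama.cpp can dump this metadata as well; the sketch above is only meant to show what to look for.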

Replies: 1 comment 3 replies

danbev (Collaborator) · Nov 5, 2024

Answer selected by Jayoprell
Category: Q&A