-
Try passing the `-cnv` flag so llama-cli runs in conversation (chat) mode:
./build/bin/llama-cli -m /root/autodl-fs/poc1k.gguf -n 512 --top-p 0.7 --temp 0.95 -t 50 -cnv
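Without `-cnv`, llama-cli treats the prompt as plain text completion, so a chat-tuned Llama-3 model never sees the special tokens it was trained on and tends to ramble. As a rough illustration (not code from this thread; the helper name is hypothetical, and the token layout follows the published Llama-3 chat format), this is the structure that conversation mode wraps around each message:

```python
def llama3_chat_prompt(user_msg: str,
                       system_msg: str = "You are a helpful assistant.") -> str:
    """Wrap a user message in the Llama-3 chat template.

    Chat-tuned checkpoints expect this token structure; a bare prompt
    (what llama-cli sends without -cnv) is effectively out-of-distribution
    input, which is one common cause of irrelevant replies.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_msg}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_msg}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(llama3_chat_prompt("Give me the config as YAML."))
```

If the GGUF file carries the correct chat template metadata, `-cnv` applies this wrapping automatically.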
-
Environment
Model: shenzhi-wang/Llama3-8B-Chinese-Chat
Finetuning method: LoRA
Dataset: data in YAML format
Compute type: f16
Steps:
1. Run inference through Hugging Face: the output is YAML only, which is exactly the format I expect.
2. Convert the HF model (merged and exported with the LoRA adapter via LLaMA-Factory) to GGUF format.
3. Run it with llama-cli.
response:
As you can see, the replies are irrelevant and contain a lot of unrelated output.
Why does this happen, and how can I solve it?
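For reference, the merge-and-export step above can be sketched with a LLaMA-Factory export config like the one below. This is an assumption about the setup, not taken from the thread: the adapter path, export directory, and file names are placeholders, and the keys follow LLaMA-Factory's export config format.

```yaml
# merge_lora.yaml -- hypothetical LLaMA-Factory export config;
# adapter_name_or_path and export_dir are placeholder paths
model_name_or_path: shenzhi-wang/Llama3-8B-Chinese-Chat
adapter_name_or_path: saves/llama3-8b/lora   # path to the trained LoRA adapter
template: llama3            # must match the template used during finetuning
finetuning_type: lora
export_dir: merged-model
export_size: 2              # shard size in GB
export_legacy_format: false
```

One would then run `llamafactory-cli export merge_lora.yaml` and convert the merged model with llama.cpp's converter, e.g. `python convert_hf_to_gguf.py merged-model --outfile model-f16.gguf --outtype f16`. If the `template` here does not match the one used during finetuning, the merged model can produce off-format output even when the HF inference looked correct.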