how to use the feature "--prompt-cache" in llama-cli #14025
Unanswered
shenyunjason asked this question in Q&A
Replies: 0 comments
Hello,
I want to use the "--prompt-cache" feature in llama-cli (version b5593) with the model qwen2.5-0.5b-instruct-q8_0.gguf.
My command is: ./llama-cli --no-warmup -m qwen2.5-0.5b-instruct-q8_0.gguf --prompt-cache prompt.bin
After finishing a few questions, I find that prompt.bin has not been created, so it seems the feature is not working.
Am I doing this correctly? Please give me some help.
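For reference, a hedged sketch of how the prompt cache is usually exercised: the cached state is saved and restored around evaluation of an explicit prompt, so passing one with -p is the typical pattern (I am not certain the file gets written when llama-cli runs in its default interactive/conversation mode). The flags -p, -n, --prompt-cache, and --prompt-cache-all do exist in llama-cli; the prompt text below is only an illustration.

```shell
# First run: evaluates the prompt and saves its state to prompt.bin.
./llama-cli --no-warmup -m qwen2.5-0.5b-instruct-q8_0.gguf \
    --prompt-cache prompt.bin \
    -p "You are a helpful assistant. Answer briefly." -n 64

# Second run sharing the same prompt prefix: the state in prompt.bin is
# loaded, so the shared prefix does not need to be re-evaluated.
# --prompt-cache-all additionally caches generated tokens, not just the prompt.
./llama-cli --no-warmup -m qwen2.5-0.5b-instruct-q8_0.gguf \
    --prompt-cache prompt.bin --prompt-cache-all \
    -p "You are a helpful assistant. Answer briefly. What is GGUF?" -n 64
```

If the file still does not appear with an explicit -p, that would suggest the cache path is not being reached in your build or mode.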