Replies: 2 comments
-
I am trying to understand this too. I am studying main.cpp, deleting everything I do not need so I can restart from the basics. So far I've found a suggestion from SuperMonkeyCollider.
-
Is there a way to load the cache dynamically for the server or llama-cpp-python? It would be very useful for use cases like extracting information based on in-context learning, because prompt processing takes time.
-
Hello,
I'm trying to understand how to load a session from disk for in-context learning. I would like to preprocess the in-context learning prompt, persist it to disk, and then, for each future use, load the saved session and append a new prompt (different each time) to generate a response.
I initially thought this was the purpose of `--prompt-cache`. I expect to be able to run:
./main -ngl 84 -m models/llama-2-7b.Q4_0.gguf -c 4096 -n 400 -s 42 --temp 0.7 --repeat_penalty 1.1 --prompt-cache prompt.cache.bin -f ./prompts/chat-with-bob.txt
to create a cache of the prompt template for in-context learning. I then expect to be able to run:
./main -ngl 84 -m models/llama-2-7b.Q4_0.gguf -c 4096 -n 400 -s 42 --temp 0.7 --repeat_penalty 1.1 --prompt-cache prompt.cache.bin --prompt-cache-ro -f ./chat/default/new-prompt.txt
where new-prompt.txt contains "What is your first name?". Given the cached prompt, I'd expect the answer to be "Bob", and I'd expect the cache file not to be updated because of `--prompt-cache-ro`. However, it seems that I cannot use the file parameter like this with the prompt cache.
This expectation is based on:
https://github.com/ggerganov/llama.cpp/discussions/2110
https://github.com/ggerganov/llama.cpp/issues/1398
I then came across:
https://github.com/ggerganov/llama.cpp/pull/1169
However, I do not see a `--session` parameter available for ./main. I do see `llama_save_session_file` in main.cpp, but I do not see any examples of how to use it for my purposes.
Does anyone have any insights into how I can save an in-context learning template session, and then load it for future use while appending a text file?
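For reference, here is a minimal sketch of how the session functions declared in llama.h (`llama_load_session_file` / `llama_save_session_file`) could be wired up for this. This is not the author's code or main.cpp verbatim: model/context setup is abbreviated, the tokenize/decode loop is elided, and the exact setup signatures vary between llama.cpp versions.

```cpp
// Hedged sketch: cache an in-context-learning prefix with
// llama_save_session_file, then restore it on later runs with
// llama_load_session_file. Setup calls vary across llama.cpp versions.
#include "llama.h"
#include <vector>

int main() {
    llama_backend_init(false); // newer llama.cpp versions take no argument

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file(
        "models/llama-2-7b.Q4_0.gguf", mparams);

    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx = 4096;
    llama_context * ctx = llama_new_context_with_model(model, cparams);

    const char * path_session = "prompt.cache.bin";

    // Restore a previously saved session, if any. On success the KV cache
    // is repopulated and session_tokens holds the tokens it covers.
    std::vector<llama_token> session_tokens(cparams.n_ctx);
    size_t n_session = 0;
    if (llama_load_session_file(ctx, path_session,
                                session_tokens.data(), session_tokens.size(),
                                &n_session)) {
        session_tokens.resize(n_session);
    } else {
        session_tokens.clear();
    }

    // ... tokenize (cached template + new user prompt), compare against
    // session_tokens to find the longest matching token prefix, set n_past
    // to that length, evaluate only the remaining tokens, then sample ...

    // On the first run (no session yet), persist the evaluated template
    // tokens so later runs can skip prompt processing entirely:
    // llama_save_session_file(ctx, path_session,
    //                         prompt_tokens.data(), prompt_tokens.size());

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

As far as I understand, this mirrors what ./main does internally when `--prompt-cache` is set: reuse only applies to the longest common token prefix between the saved session and the new prompt, so the second run's prompt file has to begin with the cached template text for the cache to pay off.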