Dynamic System Prompt #10006
-
I am using the CLI version of llama.cpp right now in conversation mode. My assistant's system prompt is supposed to change over time (after a while it will have access to additional knowledge, or an entirely different personality). To my understanding so far, I should be able to change the system prompt using the llama3 template. Somehow the model seems to ignore a new system prompt in some cases. I am guessing it has to do with the context and that it still "remembers" the previous instruction.

Model:

Example Input message:
E.g. in my next turn, I want to change the personality to someone else (for example, a funny clown). The input could look as follows:
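(Sketch of the kind of injected turn I mean — the exact wording is just an example, assuming the llama3 chat template tokens:)

```
<|eot_id|><|start_header_id|>system<|end_header_id|>

You are a funny clown. Answer every question with a joke.<|eot_id|><|start_header_id|>user<|end_header_id|>

How is the weather today?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```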
In some cases it works, but in other cases it doesn't and the model keeps relating to the previous conversation snippets.
-
Injecting chat template tokens through the user input is likely not going to work as you expect since the structure of the template will be destroyed. There is no easy way to do what you want with `llama-cli`. My recommendation is to use the `llama-server` instead:

```sh
./llama-server -m models/llama-3.1-8b-instruct/ggml-model-q8_0.gguf -c 2048 -ngl 18 -fa -mg 1 --port 8012
```

```sh
curl -s \
    --request POST --url http://127.0.0.1:8012/v1/chat/completions \
    --header "Content-Type: application/json" \
    --data '{
        "messages": [
            { "role": "system",    "content": "End each sentence with a smiley emoji." },
            { "role": "user",      "content": "Hello, how are you today?" },
            { "role": "assistant", "content": "I am functioning properly and ready to assist you, thanks for asking! 😊" },
            { "role": "system",    "content": "End each sentence with a party emoji." },
            { "role": "user",      "content": "Nice to meet you, my name is Georgi." }
        ],
        "cache_prompt": true,
        "top_k": 1
    }' | jq
```
```json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "It's great to meet you too, Georgi, I'm happy to chat with you 🎉",
        "role": "assistant"
      }
    }
  ],
  ...
}
```

Note you will need to keep appending the assistant and user messages to each new request and be careful to not overrun the context. In the latter situation, you can start evicting old messages and consider using the new …
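For example, a minimal shell sketch of that request loop, assuming the same endpoint as above (the `ask` helper and the use of `jq` to maintain the history are just one possible approach, not part of llama.cpp itself):

```sh
# Sketch only: keep the full conversation in $MSGS and re-send it with every request.
MSGS='[{"role": "system", "content": "End each sentence with a smiley emoji."}]'

ask() {
  # append the next user turn to the history
  MSGS=$(echo "$MSGS" | jq --arg c "$1" '. + [{"role": "user", "content": $c}]')
  # send the full history and extract the assistant reply
  REPLY=$(jq -n --argjson m "$MSGS" '{messages: $m, cache_prompt: true, top_k: 1}' |
    curl -s --request POST --url http://127.0.0.1:8012/v1/chat/completions \
         --header "Content-Type: application/json" --data @- |
    jq -r '.choices[0].message.content')
  # append the assistant reply so the next request sees it
  MSGS=$(echo "$MSGS" | jq --arg c "$REPLY" '. + [{"role": "assistant", "content": $c}]')
  echo "$REPLY"
}

ask "Hello, how are you today?"
# change the behaviour mid-conversation by appending a new system message
MSGS=$(echo "$MSGS" | jq '. + [{"role": "system", "content": "End each sentence with a party emoji."}]')
ask "Nice to meet you, my name is Georgi."
```

Once the history approaches the context limit, you would drop the oldest non-system messages from `$MSGS` before sending the next request.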