Dynamic System Prompt #10006

Answered by ggerganov
deguodedongxi asked this question in Q&A
Oct 22, 2024 · 1 comment · 1 reply

Injecting chat-template tokens through the user input will likely not work as you expect, since doing so breaks the structure of the template. There is no easy way to do what you want with llama-cli. My recommendation is to use llama-server instead:

```shell
./llama-server -m models/llama-3.1-8b-instruct/ggml-model-q8_0.gguf -c 2048 -ngl 18 -fa -mg 1 --port 8012
```

```shell
curl -s \
    --request POST --url http://127.0.0.1:8012/v1/chat/completions \
    --header "Content-Type: application/json" \
    --data '{"messages": [ { "role": "system", "content": "End each sentence with a smiley emoji." }, { "role": "user", "content": "Hello, how are you today?" }, { "role": "assistant", "content": "I am …
```
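The same idea works from a script: because llama-server applies the chat template server-side, the system message can change on every request. Here is a minimal Python sketch of building such a payload; the helper name `build_chat_request` is illustrative, and the endpoint/port come from the example above:

```python
import json

def build_chat_request(system_prompt, user_message, history=None):
    """Build a /v1/chat/completions payload with a per-request system prompt.

    The server renders the chat template itself, so changing the
    "system" message here effectively gives you a dynamic system prompt
    without restarting anything.
    """
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history or [])  # optional prior turns
    messages.append({"role": "user", "content": user_message})
    return {"messages": messages}

# Example: same payload shape as the curl command above.
payload = build_chat_request(
    "End each sentence with a smiley emoji.",
    "Hello, how are you today?",
)
print(json.dumps(payload, indent=2))
```

POST this JSON to `http://127.0.0.1:8012/v1/chat/completions` (e.g. with `requests.post`) to get a completion that honors the per-request system prompt.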

Answer selected by deguodedongxi