improving llama.cpp prompt example... #3216
---
It's the LLaMA 1 models that are mostly trained with a 2,048-token context. LLaMA 2 is usually 4,096.

For non-instruction-tuned models, which this one appears to be, you want to write the prompt so that the model can complete what you wrote. For example:

> The following is a detailed encyclopedia-style article about Elon Musk which consists of 2400 words. The article is written in English and formatted in markdown. Blah blah blah. You can find the entire article below this line:

Obviously you don't write "blah blah"; I just didn't write out the whole prompt for you. The point is: if something is completing the text you wrote, it wouldn't make sense for the article not to be below that point, right? So the prompt is structured so that the natural continuation of the text is exactly what you want the model to write. That's non-instruct prompting. Don't think of it like a conversation or question-and-answer; think of it like a shared text editor where you write some stuff and then the LLM comes along and tries to finish it.

For instruct-tuned models, you need to take a different approach. Different models use different prompting styles, so you should look at the model card (or similar) to see how to prompt them.
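For example, the official Llama-2-chat fine-tunes are trained on a template along these lines (a sketch from memory, so double-check the model card for your specific fine-tune; the system prompt here is only a placeholder):

```
<s>[INST] <<SYS>>
You are a helpful assistant that writes encyclopedia-style articles in markdown.
<</SYS>>

Write a detailed, 2400-word encyclopedia-style article about Elon Musk. [/INST]
```

A LoRA-merged model like the one in this thread may expect a different template entirely, which is exactly why the model card matters.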
---
lora the "do not mention things you are not sure or do not know." really helps with hallucination. it doesnt do the inline markdown references like gpt4 though. if u guys have any ideas how to make it formatted with markdown, pls do mention. chatgpt will show the output i wanted with the prompt below:
with reference to reducing hallucination here: #3209
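For clarity, this is the reference-style link syntax I mean (standard markdown); one idea I might try is embedding a complete worked example like this in the prompt as a few-shot demonstration, rather than only describing the format:

```markdown
This is [an example][1] of a reference-style link in running text.

[1]: https://example.com/ "Optional Title Here"
```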
Does anyone know how to make the model produce the markdown reference-link format shown? Try the prompt on ChatGPT (GPT-3) and it will show what I want, but llama.cpp doesn't. Also, I used -c 3620 (because it fits into the 8 GB of VRAM "perfectly"), but I read somewhere that all Llama 2 models are trained with a 2,048-token context, so 3620 "is not advisable"? Is this true? Is there anything else I can improve in my Llama prompt? I'm using GPT-4 to rephrase my Llama prompt at this stage, so I'm not sure if any of you have "cheatsheet" prompts specifically for Llama 2. Will appreciate all info sharing here. Thanks.

This is the prompt and the t/s I got on my system (HP Victus laptop: Ryzen 5, RTX 4060 with 8 GB VRAM, 16 GB RAM):
```
llama_print_timings: load time = 762.10 ms
llama_print_timings: sample time = 887.69 ms / 2394 runs ( 0.37 ms per token, 2696.88 tokens per second)
llama_print_timings: prompt eval time = 138.33 ms / 101 tokens ( 1.37 ms per token, 730.14 tokens per second)
llama_print_timings: eval time = 60778.50 ms / 2393 runs ( 25.40 ms per token, 39.37 tokens per second)
llama_print_timings: total time = 63100.61 ms
Log end
```
```sh
root@ubuntu:/usr/local/src/llama.cpp# ./main -m models/llama-2-7b-lora-assemble.Q4_K_M.gguf -ngl 35 -c 3620 -n 12288 -p "Detailed encyclopedia-style article titled 'elon musk' with a minimum of 2400 words. The content should be in English and formatted in markdown. Do not mention things you are not sure or do not know. Structured with headings, an intro, and conclusion. Include inline citations, external/internal links (excluding images), and the markdown reference link format
This is [an example][id] reference-style link; [id]: http://example.com/ \"Optional Title Here\"
. Integrate advanced markdown elements and a table of contents where appropriate." -e -t 1
```
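Based on the reply above (Llama 2 is trained with a 4,096-token context, not 2,048), a variant of the same command with -c 4096 is probably the safer starting point. This is an untested sketch: `<same prompt as above>` is a placeholder, and if a 4,096-token context no longer fits in the 8 GB of VRAM, lowering -ngl offloads fewer layers to the GPU at some speed cost.

```sh
# Same model and prompt as before; only the context size changed to match
# Llama 2's 4,096-token training context, and -n chosen so the ~100-token
# prompt plus the generation stays within the 4,096 window.
# Assumption: the KV cache still fits in 8 GB of VRAM at -ngl 35; if not,
# reduce -ngl to offload fewer layers.
./main -m models/llama-2-7b-lora-assemble.Q4_K_M.gguf -ngl 35 -c 4096 -n 3900 \
  -e -t 1 -p "<same prompt as above>"
```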