Llama3 8B Instruct doesn't generate EOS nor EOT tokens consistently. #8176
Unanswered · AymenSekhri asked this question in Q&A
I am trying to run a simple example with Llama 3 8B Instruct (I tried several variants of the model), but it fails to stop talking, i.e. it doesn't consistently generate the EOS or EOT tokens.
According to Meta's documentation, the prompt should follow the Llama 3 chat template.
The <|begin_of_text|> token should be added by the llama_tokenize function when it is called with add_special = true.
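For reference, a minimal sketch of the Llama 3 Instruct chat template as published in Meta's model card; the system and user messages are placeholders, and the helper name is mine:

```python
# Sketch of the Llama 3 Instruct chat template (per Meta's model card).
# The system/user strings are placeholders; verify the exact template
# against the model card for the specific checkpoint you are using.
def build_llama3_prompt(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "Hello!")
```

Note that if llama_tokenize is called with add_special = true, the literal <|begin_of_text|> should be left out of the text, otherwise the BOS token may be duplicated.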
The output starts off well, but the model doesn't know when to stop; e.g. the previous prompt will generate
It only stops when it hits the maximum output length! However, some prompts (in the same format) do generate proper text that ends with <|eot_id|> and then EOS, but this is not consistent. I also tried ending the prompt lines with "\r\n\n" instead of "\n\n", but that doesn't work for all prompts either.
What am I doing wrong?
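One common workaround while debugging this is to treat both published Llama 3 stop ids as end-of-turn in the sampling loop. A minimal sketch, assuming ids 128001 (<|end_of_text|>) and 128009 (<|eot_id|>) from Meta's tokenizer config; verify them against your GGUF's vocabulary, and note that `sample_next` stands in for whatever sampling call your loop uses:

```python
# Minimal sketch of a generation loop that stops on either stop token.
# Ids 128001 (<|end_of_text|>) and 128009 (<|eot_id|>) are the published
# Llama 3 ids; check them against your model's vocab before relying on them.
LLAMA3_STOP_IDS = {128001, 128009}

def collect_until_stop(sample_next, max_tokens=256, stop_ids=LLAMA3_STOP_IDS):
    """Call sample_next() repeatedly; stop on a stop id or the token cap."""
    out = []
    for _ in range(max_tokens):
        tok = sample_next()
        if tok in stop_ids:
            break
        out.append(tok)
    return out

# Usage with a fake token stream standing in for the sampler:
tokens = collect_until_stop(iter([10, 20, 128009, 30]).__next__)
```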