Replies: 1 comment
Oh wow, that's a huge difference. Actually, I don't know why that would happen. The training set is very small (intentionally, so most readers can run the code on small hardware in reasonable time), so it may be quite brittle.
In Chapter 7, Exercise 7.2 (Instruction and input masking), the solution is to mask the targets in `custom_collate_fn`. I accidentally wrote the wrong stop position:

I used this code to train the LLM and found that the generated responses go off the rails. Scoring the model with Ollama, the result is:

Why would leaving only two token IDs have such a big impact on LLM training?
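For reference, here is a minimal sketch of the kind of target masking Exercise 7.2 asks for (not the exact book code): it assumes each batch element is an `(instruction_length, token_ids)` pair and that, in addition to the usual padding-token masking, everything before the response is set to `ignore_index = -100` so the loss is computed only on the response tokens. Because the targets are shifted one position relative to the inputs, the mask's stop position is `instruction_length - 1`, and being off by even a couple of tokens changes which positions contribute to the loss.

```python
import torch

def custom_collate_fn(batch, pad_token_id=50256, ignore_index=-100,
                      allowed_max_length=None, device="cpu"):
    # Each batch element is assumed to be an (instruction_length, token_ids) pair
    batch_max_length = max(len(item) + 1 for _, item in batch)
    inputs_lst, targets_lst = [], []

    for instruction_length, item in batch:
        new_item = item.copy()
        new_item += [pad_token_id]  # append one end-of-text / pad token
        # Pad every sequence in the batch to the same length
        padded = new_item + [pad_token_id] * (batch_max_length - len(new_item))
        inputs = torch.tensor(padded[:-1])   # inputs: all tokens except the last
        targets = torch.tensor(padded[1:])   # targets: shifted right by one position

        # Mask all padding tokens in the targets except the first one
        mask = targets == pad_token_id
        indices = torch.nonzero(mask).squeeze()
        if indices.numel() > 1:
            targets[indices[1:]] = ignore_index

        # Exercise 7.2: also mask the instruction-and-input portion so the loss
        # is computed on the response tokens only; the -1 accounts for the
        # one-token shift between inputs and targets
        targets[:instruction_length - 1] = ignore_index

        # Optionally truncate to the model's supported context length
        if allowed_max_length is not None:
            inputs = inputs[:allowed_max_length]
            targets = targets[:allowed_max_length]

        inputs_lst.append(inputs)
        targets_lst.append(targets)

    inputs_tensor = torch.stack(inputs_lst).to(device)
    targets_tensor = torch.stack(targets_lst).to(device)
    return inputs_tensor, targets_tensor
```

With this kind of masking, shifting the stop position moves the boundary between tokens that are ignored and tokens that are trained on, so a small indexing mistake directly changes what the model learns from each example.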