Replies: 1 comment
Oh wow, that's a huge difference. Actually, I don't know why that would happen. The training set is very small (intentionally, so most readers can run the code on small hardware in reasonable time), so it may be quite brittle.
In Chapter 7, Exercise 7.2 (Instruction and input masking), the solution is to mask the targets in `custom_collate_fn`. I accidentally wrote the wrong stop position:

I used this code to train the LLM and found that the generated responses go off the rails. Scoring the model with Ollama, the result is:

Why would leaving only two token IDs have such a big impact on LLM training?
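For reference, here is a minimal sketch of the kind of target masking Exercise 7.2 asks for (not the exact book code): it assumes each batch element is an `(instruction_length, token_ids)` pair and that, in addition to the usual padding-token masking, everything before the response is set to `ignore_index = -100` so the loss is computed only on the response tokens. Because the targets are shifted one position relative to the inputs, the mask's stop position is `instruction_length - 1`, and being off by even a couple of tokens changes which positions contribute to the loss.

```python
import torch

def custom_collate_fn(batch, pad_token_id=50256, ignore_index=-100,
                      allowed_max_length=None, device="cpu"):
    # Each batch element is assumed to be an (instruction_length, token_ids) pair
    batch_max_length = max(len(item) + 1 for _, item in batch)
    inputs_lst, targets_lst = [], []

    for instruction_length, item in batch:
        new_item = item.copy()
        new_item += [pad_token_id]  # append one end-of-text / pad token
        # Pad every sequence in the batch to the same length
        padded = new_item + [pad_token_id] * (batch_max_length - len(new_item))
        inputs = torch.tensor(padded[:-1])   # inputs: all tokens except the last
        targets = torch.tensor(padded[1:])   # targets: shifted right by one position

        # Mask all padding tokens in the targets except the first one
        mask = targets == pad_token_id
        indices = torch.nonzero(mask).squeeze()
        if indices.numel() > 1:
            targets[indices[1:]] = ignore_index

        # Exercise 7.2: also mask the instruction-and-input portion so the loss
        # is computed on the response tokens only; the -1 accounts for the
        # one-token shift between inputs and targets
        targets[:instruction_length - 1] = ignore_index

        # Optionally truncate to the model's supported context length
        if allowed_max_length is not None:
            inputs = inputs[:allowed_max_length]
            targets = targets[:allowed_max_length]

        inputs_lst.append(inputs)
        targets_lst.append(targets)

    inputs_tensor = torch.stack(inputs_lst).to(device)
    targets_tensor = torch.stack(targets_lst).to(device)
    return inputs_tensor, targets_tensor
```

With this kind of masking, shifting the stop position moves the boundary between tokens that are ignored and tokens that are trained on, so a small indexing mistake directly changes what the model learns from each example.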