Hi all! I find this interesting, and I would like to participate.
However, it's unclear to me what the "goal" is. I.e., when should we stop the clock?
- When we reach a certain training or validation loss?
- When we reach a certain generation quality, according to some fidelity metric?
- Both?
Additionally, when should the clock be running? In the modded-nanogpt speedrun, the clock only runs during the training loop, including data fetching between steps, but not during validation. I propose we do the same as modded-nanogpt, make this explicit, and also log everything to text files.
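To make the timing proposal concrete, here is a rough sketch of what I have in mind. It is not taken from modded-nanogpt or from this repo; `TrainingClock`, `run`, `train_step`, `validate`, and the log line format are all hypothetical names I made up for illustration. The idea is just that the clock is paused around validation and that every validation result is written to a plain text log together with the accumulated training time.

```python
import time


class TrainingClock:
    """Accumulates wall-clock time only while explicitly running (hypothetical helper)."""

    def __init__(self):
        self.elapsed = 0.0
        self._started_at = None

    def start(self):
        if self._started_at is None:
            self._started_at = time.perf_counter()

    def stop(self):
        if self._started_at is not None:
            self.elapsed += time.perf_counter() - self._started_at
            self._started_at = None


def run(num_steps, val_every, train_step, validate, log_path):
    """Run training with the clock paused during validation.

    train_step(step) and validate(step) are placeholders for the real
    training step (including data fetching) and the validation pass.
    """
    clock = TrainingClock()
    with open(log_path, "w") as log:
        clock.start()
        for step in range(1, num_steps + 1):
            train_step(step)  # timed: training plus data fetching
            if step % val_every == 0:
                # On a GPU you would likely need to synchronize the device
                # here before stopping the clock (assumption, not shown).
                clock.stop()
                val_loss = validate(step)  # not timed
                log.write(
                    f"step:{step} val_loss:{val_loss:.4f} "
                    f"train_time_s:{clock.elapsed:.3f}\n"
                )
                clock.start()
        clock.stop()
    return clock.elapsed
```

With something like this, whatever stopping criterion we agree on can be read straight off the log: the reported time is the `train_time_s` value on the first line that meets the target.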
And IMO, it's best to have an initial, downloadable set of benchmark logs we can compare against.