Minimal PyTorch implementations of core deep learning components, built from scratch to understand how they work.
- `llm.ipynb`: A minimal decoder-only Transformer for language modeling, with code for the architecture, training, and generation (a sketch of the core model follows below).
- `adamw.ipynb`: A from-scratch implementation of the AdamW optimizer, comparing it against L2-regularized Adam on a toy regression task (see the update-step sketch below).
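For orientation, here is a minimal sketch of what a decoder-only Transformer language model looks like in PyTorch. It is an illustrative outline under assumed names, not the code from `llm.ipynb`: the classes (`TinyLM`, `DecoderBlock`) and hyperparameters (`d_model`, `n_heads`, etc.) are assumptions, and the notebook may differ (for example, by implementing attention from scratch rather than using `nn.MultiheadAttention`).

```python
# Illustrative sketch of a decoder-only Transformer LM; names and
# hyperparameters are assumptions, not taken from llm.ipynb.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True above the diagonal blocks attention to future tokens.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        # Pre-norm residual sublayers: self-attention, then an MLP.
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        x = x + self.mlp(self.ln2(x))
        return x

class TinyLM(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 128, n_heads: int = 4,
                 n_layers: int = 2, max_len: int = 256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        self.blocks = nn.ModuleList(DecoderBlock(d_model, n_heads) for _ in range(n_layers))
        self.ln_f = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        for block in self.blocks:
            x = block(x)
        return self.head(self.ln_f(x))  # logits over the vocabulary

# Training is next-token prediction: targets are the inputs shifted by one.
model = TinyLM(vocab_size=100)
idx = torch.randint(0, 100, (2, 32))                  # (batch, time) token ids
logits = model(idx[:, :-1])                           # predict token t+1 from tokens <= t
loss = F.cross_entropy(logits.reshape(-1, 100), idx[:, 1:].reshape(-1))
```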
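Likewise, here is a sketch of a single AdamW update step, illustrating the decoupled weight decay that distinguishes AdamW from Adam with L2 regularization: the decay shrinks the weights directly instead of being folded into the gradient (where it would flow into the adaptive moment estimates). The function signature and variable names are illustrative assumptions; `adamw.ipynb` implements and compares the full optimizers.

```python
# Illustrative sketch of one AdamW step; names are assumptions,
# not taken from adamw.ipynb.
import torch

@torch.no_grad()
def adamw_step(p, grad, m, v, t, lr=1e-3, betas=(0.9, 0.999),
               eps=1e-8, weight_decay=1e-2):
    """Update parameter tensor `p` in place at step `t` (1-indexed)."""
    b1, b2 = betas
    # Decoupled weight decay: shrink the weights directly.
    # (Adam + L2 would instead add weight_decay * p to `grad` here,
    # letting the decay term leak into m and v.)
    p.mul_(1 - lr * weight_decay)
    # Standard Adam moment updates with bias correction.
    m.mul_(b1).add_(grad, alpha=1 - b1)             # first moment (mean)
    v.mul_(b2).addcmul_(grad, grad, value=1 - b2)   # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    p.addcdiv_(m_hat, v_hat.sqrt().add_(eps), value=-lr)
    return m, v

# Usage on a single parameter tensor:
p = torch.randn(10)
m, v = torch.zeros_like(p), torch.zeros_like(p)
for t in range(1, 101):
    grad = 2 * p                                    # gradient of a toy loss ||p||^2
    m, v = adamw_step(p, grad, m, v, t)
```

Because Adam rescales gradients per parameter, an L2 penalty added to the gradient is also rescaled, so its effective strength varies across parameters; decoupling the decay, as above, keeps it uniform.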
💡 Tip: Run these notebooks directly in Google Colab; no setup required.