This project is an implementation of a GPT-style language model following Andrej Karpathy’s (iconic, if I may add) "Let’s Build GPT from Scratch" video. It walks through the key components of modern transformers, from a simple bigram model to a fully functional self-attention mechanism and multi-headed transformer blocks.
- Baseline Model: Bigram language modeling, loss calculation, and text generation.
- Self-Attention: Understanding matrix multiplications, softmax, and positional encodings.
- Transformer Architecture: Multi-head self-attention, feedforward layers, residual connections, and layer normalization (a minimal sketch of a single attention head follows this list).
- Scaling Up: Dropout regularization and encoder vs. decoder architectures (only the decoder block is implemented; there is no encoder).
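To make the attention component above concrete, here is a minimal sketch of a single masked (decoder-style) self-attention head in PyTorch, in the spirit of the video. Names such as `n_embd`, `head_size`, and `block_size` are illustrative placeholders and may not match this repo's code exactly.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

class Head(nn.Module):
    """One head of masked (decoder-style) self-attention."""

    def __init__(self, n_embd, head_size, block_size, dropout=0.1):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # lower-triangular mask so each position only attends to earlier positions
        self.register_buffer('tril', torch.tril(torch.ones(block_size, block_size)))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        B, T, C = x.shape                 # batch, time, embedding dim
        k = self.key(x)                   # (B, T, head_size)
        q = self.query(x)                 # (B, T, head_size)
        # scaled dot-product attention scores
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5   # (B, T, T)
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float('-inf'))
        wei = F.softmax(wei, dim=-1)
        wei = self.dropout(wei)
        v = self.value(x)                 # (B, T, head_size)
        return wei @ v                    # (B, T, head_size)
```

In the full model, several of these heads run in parallel (multi-head attention), and their outputs are concatenated before the feedforward layer, residual connection, and layer norm.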
Note
Changes from the original video, and notes
- I've used a different and larger dataset, the 'Harry Potter Novels' collection. I found the raw data on Kaggle (as 7 individual datasets), then merged and cleaned them separately so that the generated output is much cleaner. The notebooks I wrote for that step are in the additional-files directory, so feel free to check them out.
- This model is trained on ~6 million characters, which, with character-level tokenization, means ~6 million tokens (see the tokenizer sketch after these notes).
- The final output can be found in the file generated.txt.
- I ran this model on the NVIDIA GeForce GTX 1650 in my personal laptop (CUDA version 12.6), which has a modest amount of GPU memory, and it took approximately 90 minutes to train and generate the final document.
- I've also added breakdowns of the code based on Andrej's explanations and my own understanding, so feel free to read those as well.
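For the "characters ≈ tokens" note above, here is a rough sketch of the character-level encoding used in the video's approach. The file name and variable names are illustrative and may not match the notebooks exactly.

```python
# Build a character-level vocabulary from the training text (file name is illustrative).
with open('harry_potter.txt', 'r', encoding='utf-8') as f:
    text = f.read()

chars = sorted(set(text))
vocab_size = len(chars)

# Map each character to an integer id and back.
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for i, ch in enumerate(chars)}

encode = lambda s: [stoi[c] for c in s]              # string -> list of token ids
decode = lambda ids: ''.join(itos[i] for i in ids)   # list of token ids -> string

# Because tokenization is per character, ~6 million characters yield ~6 million tokens.
print(vocab_size, len(encode(text)) == len(text))
```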
For a better reading experience and detailed notes, visit my Road to GPT Documentation Site.