This repository contains a minimal transformer-based language model that learns directly from Shakespeare's text. The code is intentionally short and easy to follow, so you can understand how GPT-style models work under the hood.
- Install dependencies:

  ```
  pip install -r requirements.txt
  ```

  (`requirements.txt` currently just lists PyTorch; install the appropriate build for your system.)

- Train the model:

  ```
  python train.py
  ```

  The script saves the model weights to `shakespeare.pt`.

- Generate text from the trained checkpoint:

  ```
  python sample.py
  ```

  To start generation from your own prompt instead, run:

  ```
  python custom_prompt.py
  ```
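The decoding loop inside `sample.py` isn't reproduced here, but the core step of GPT-style generation is sampling the next token from temperature-scaled softmax over the model's output logits. A stdlib-only sketch of that step (the function name and toy logits are illustrative, not taken from this repo):

```python
import math
import random

def sample_next_char(logits, temperature=1.0):
    """Pick the next character index from raw logits using
    temperature-scaled softmax sampling (the usual GPT decode step)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

# Toy demo: a 4-character vocabulary where index 1 has a dominant logit,
# so it is sampled most of the time.
idx = sample_next_char([0.1, 5.0, 0.2, 0.3], temperature=0.8)
```

Lower temperatures sharpen the distribution (more repetitive, "safer" text); higher temperatures flatten it (more surprising, noisier text).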
```
gpt/               # Transformer model implementation
notebooks/         # Jupyter notebook walkthrough
shakespeare.txt    # Training data (tiny-shakespeare)
train.py           # Training script
sample.py          # Text generation script
custom_prompt.py   # Prompt-based generation script
requirements.txt   # Python dependencies
```
Replace `shakespeare.txt` with any plain-text file and run `python train.py` again. The model will learn the style and vocabulary of whatever data you provide.
- Song lyrics – generate new verses in the style of a favorite artist
- Movie or TV scripts – create new scenes or dialogue
- Classic novels – emulate authors like Jane Austen or Herman Melville
- Poetry collections – craft new poems with a unique voice
- Programming code – experiment with autocompletion for your projects
Adjust the hyperparameters at the top of `train.py` if your dataset is larger or smaller.
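The exact variable names in `train.py` may differ, but the knobs at the top of a training script like this typically look something like the following (all names and values here are illustrative, not copied from the repo):

```python
# Hypothetical hyperparameter block -- names and values are illustrative.
# Smaller datasets usually want a smaller model and fewer steps to avoid
# overfitting; larger datasets can support more capacity and longer training.
batch_size = 64        # sequences per optimization step
block_size = 256       # context length, in characters
n_layer = 6            # number of transformer blocks
n_head = 6             # attention heads per block
n_embd = 384           # embedding width (must divide evenly by n_head)
learning_rate = 3e-4   # AdamW step size
max_iters = 5000       # total training steps
```

A quick sanity check when editing values like these: the embedding width must split evenly across the attention heads.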
The notebook `notebooks/built_from_scratch.ipynb` mirrors the code in a step-by-step format. Open it in Jupyter to explore the training loop, model architecture, and generation process interactively.
- Character-level tokenization
- Multi-head self-attention with residual connections
- Positional embeddings
- Minimal PyTorch code—no external training frameworks
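Character-level tokenization, the first feature above, means every distinct character becomes one token ID. A minimal sketch of the idea (the repo's actual encode/decode helpers may be named differently):

```python
# Build a character vocabulary from the training text, then map
# characters to integer IDs and back. This is a sketch of the
# technique, not the repo's exact code.
text = "To be, or not to be"
chars = sorted(set(text))                      # vocabulary: unique characters
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> id
itos = {i: ch for ch, i in stoi.items()}       # id -> char

def encode(s):
    return [stoi[c] for c in s]

def decode(ids):
    return "".join(itos[i] for i in ids)
```

Round-tripping any string built from the vocabulary recovers it exactly: `decode(encode("not"))` gives back `"not"`.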
- Inspired by Andrej Karpathy's GPT from scratch tutorial
- Dataset derived from tiny-shakespeare
Feel free to open issues or PRs to improve this project!