Make GPT training fun and approachable! A visual training platform based on karpathy/nanoGPT.
Mini-NanoGPT is a tool that helps you get started with training GPT models effortlessly. Whether you're:
- 🎓 A beginner in deep learning
- 👨‍🔬 A researcher
- 🛠️ A developer
Or simply curious about large language models and want to experience their magic, you can train a model through an intuitive graphical interface!
For the original version of Mini NanoGPT (no longer updated), please check out the old branch.
- 📱 Visual Interface: Say goodbye to command line; point-and-click to start training
- 🌍 Bilingual UI: Full support for both English and Chinese interfaces
- 🎯 One-click Operations: Data preprocessing, training, and text generation — all in one click
- 🔤 Flexible Tokenization: Supports character-level and GPT-2/Qwen tokenizers, with multilingual support
- 🚄 Efficient Training: Supports multi-process acceleration and distributed training
- 📊 Real-time Feedback: Live display of training progress and performance
- ⚙️ Parameter Visualization: All training parameters can be adjusted directly in the UI
- 🧩 Model Database: Easily manage models and reuse training settings anytime
```bash
# Clone the repository
git clone --depth 1 https://github.com/ystemsrx/mini-nanoGPT.git
cd mini-nanoGPT

# Install dependencies (Python 3.7+)
pip install -r requirements.txt

# Launch the app
python app.py
```
Open the displayed link in your browser (usually http://localhost:7860) to see the training interface!
- Open the "Data Processing" page, paste your training text, and choose a tokenization method. For better results, check the option to use a tokenizer — it will automatically build a vocabulary based on your text.
- If you don't want to use a validation set, check the "Skip validation set" option.
- Click "Start Processing" when you're done.
Here's a small example for demonstration:
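Under the hood, character-level tokenization amounts to mapping each unique character in your text to an integer id. A minimal illustrative sketch (the `stoi`/`itos` names follow nanoGPT's convention, but this is not the project's exact code):

```python
# Build a character-level vocabulary from the training text (toy example).
text = "hello gpt"

chars = sorted(set(text))                     # unique characters, sorted
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> char

def encode(s):
    """Turn a string into a list of token ids."""
    return [stoi[c] for c in s]

def decode(ids):
    """Turn a list of token ids back into a string."""
    return "".join(itos[i] for i in ids)

ids = encode(text)
print(len(chars), ids)  # vocabulary size and the encoded text
```

The GPT-2 and Qwen tokenizers do the same job with subword units instead of single characters, which is why they handle multilingual text better.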
- Switch to the "Training" page, and adjust the parameters as needed (or leave them as default for a quick try).
- The training and validation loss curves are displayed in real time. If you generated a validation set in Step 1, you should see two curves: blue for training loss, orange for validation loss.
- If only one curve is shown, check the terminal output. An error like

  ```
  Error while evaluating val loss: Dataset too small: minimum dataset(val) size is 147, but block size is 512. Either reduce block size or add more data.
  ```

  means your `block_size` is too large for the validation set. Try reducing it, for example to 128.
- You should now see both loss curves updating dynamically.
- Click "Start Training" and wait for training to complete.
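The "Dataset too small" error boils down to comparing a split's token count against `block_size`: if the split cannot hold one full context window, evaluation fails. A hedged sketch of that constraint (the function name is made up; the real check lives in the training code):

```python
# Illustrative check: a data split must contain at least block_size tokens,
# otherwise no full training/evaluation window fits inside it.
def check_split(name, n_tokens, block_size):
    if n_tokens < block_size:
        raise ValueError(
            f"Dataset too small: minimum dataset({name}) size is {n_tokens}, "
            f"but block size is {block_size}. "
            "Either reduce block size or add more data."
        )

check_split("val", 147, 128)  # fine once block size is lowered to 128
try:
    check_split("val", 147, 512)  # reproduces the error above
except ValueError as e:
    print(e)
```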
- This mode lets you evaluate the model's loss on the validation set. Set the "Number of Evaluation Seeds" to any value greater than 0 to activate evaluation-only mode. You'll see how the model performs with different random seeds.
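Conceptually, evaluation-only mode reruns the validation loss estimate once per seed and lets you see the spread. A toy sketch, with `eval_loss` standing in for a real forward pass over the validation split:

```python
import random

def eval_loss(seed):
    # Stand-in for a real model evaluation: simulate a noisy loss estimate
    # whose randomness (batch sampling, dropout, etc.) depends on the seed.
    rng = random.Random(seed)
    return 2.0 + rng.uniform(-0.05, 0.05)

num_eval_seeds = 5  # the "Number of Evaluation Seeds" setting (> 0)
losses = [eval_loss(s) for s in range(num_eval_seeds)]
mean = sum(losses) / len(losses)
print(f"val loss over {num_eval_seeds} seeds: "
      f"mean {mean:.3f}, min {min(losses):.3f}, max {max(losses):.3f}")
```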
- Go to the "Inference" page
- Enter a prompt
- Click "Generate" and see what the model comes up with!
- Go to the "Comparison" page
- Select two models to compare — they can even be the same model with different settings
- Their configurations will be displayed automatically
- You can input the same prompt and see how both models generate text
- Or, apply different inference settings (temperature, top_k, etc.) to compare outputs
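The inference settings mentioned above act on the model's raw logits before a token is drawn: temperature rescales them, and top_k masks everything outside the k most likely tokens. A self-contained sketch of that sampling step (pure Python, illustrative logit values; the real code operates on model outputs):

```python
import math
import random

def sample(logits, temperature=1.0, top_k=None, seed=0):
    """Draw one token id from logits with temperature and top-k filtering."""
    rng = random.Random(seed)
    if top_k is not None:
        # Mask out everything below the k-th largest logit.
        cutoff = sorted(logits, reverse=True)[top_k - 1]
        logits = [l if l >= cutoff else float("-inf") for l in logits]
    # Temperature scaling, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample one index from the resulting distribution.
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return max(range(len(probs)), key=probs.__getitem__)

token = sample([2.0, 1.0, 0.1, -1.0], temperature=0.8, top_k=2)
print(token)  # only the two highest-logit tokens (ids 0 and 1) can appear
```

Lower temperature sharpens the distribution toward the top token; smaller top_k cuts off unlikely tokens entirely, which is why the two knobs give visibly different outputs in a comparison.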
```
mini-nanogpt/
├── app.py      # App entry point
├── src/        # Configuration and core modules
├── data/       # Data storage
├── out/        # Model checkpoints
└── assets/     # Tokenizer files and other resources
```
- 💡 Try reducing batch size or model size
- 💡 Use a GPU to greatly improve speed
- 💡 Increase the evaluation interval
- 💡 Try increasing the training data
- 💡 Tune the model hyperparameters
- 💡 Adjust the temperature during generation
- 💡 On the "Training" page, select "resume" under Initialization
- 💡 Point to the previous output directory
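Conceptually, resuming reads the checkpoint saved in the previous output directory and continues from its stored iteration count. A toy sketch using a JSON file in place of a real weights checkpoint (the `iter_num`/`best_val_loss` keys are assumptions modeled on nanoGPT-style checkpoints):

```python
import json
import os
import tempfile

# Simulate a previous run's output directory containing a checkpoint.
out_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(out_dir, "ckpt.json")
with open(ckpt_path, "w") as f:
    json.dump({"iter_num": 1000, "best_val_loss": 1.83}, f)

# "resume": load the checkpoint and pick up where training left off.
with open(ckpt_path) as f:
    ckpt = json.load(f)
start_iter = ckpt["iter_num"]  # training continues from this step
print(f"resuming from iteration {start_iter}")
```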
Suggestions and improvements are welcome! You can:
- Submit an Issue
- Open a Pull Request
- Share your experience using the tool
This project is open-sourced under the MIT License.
🎉 Start your GPT journey now!