A complete demonstration of "zero-to-hero" knowledge injection using LoRA (Low-Rank Adaptation) fine-tuning to transform general language models into D&D 5e experts.
This project showcases how LoRA fine-tuning can inject domain-specific knowledge into small language models with minimal computational cost. We transform models like DistilGPT2 (82M parameters) from having zero D&D knowledge to expert-level understanding using only 1.42% of the model's parameters.
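As a sanity check on that figure, the trainable fraction under LoRA is just the adapter size over the base-model size. A minimal sketch, using the DistilGPT2 numbers reported later in this README:

```python
def trainable_fraction(adapter_params: float, base_params: float) -> float:
    """Percentage of parameters actually updated during LoRA fine-tuning."""
    return 100.0 * adapter_params / base_params

# Figures reported for the DistilGPT2 run below: 1.18M adapter / 83M base.
print(f"{trainable_fraction(1.18e6, 83e6):.2f}%")  # 1.42%
```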
- Dramatic Knowledge Injection: 200-800% increase in D&D terminology usage
- Efficient Training: Only ~1% of model parameters needed for domain expertise
- Complete Pipeline: End-to-end system from data preparation to deployment
- Real-time Comparison: Interactive API and dashboard for model evaluation
- Comprehensive Evaluation: Automated testing and HTML report generation
# Clone the repository
git clone https://github.com/haveard/dnd-srd-model.git
cd dnd-srd-model
# Install dependencies
pip install -r requirements.txt
# D&D SRD data is already included in ./data/raw/
# Convert D&D SRD to training format from local data
python prepare_dnd_data.py
# This creates data/dnd_srd_qa.jsonl with 2,953 Q&A examples
# Train DistilGPT2 with D&D knowledge (fastest)
python train_dnd_lora.py --model distilgpt2 --epochs 3
# Train Pythia-1.4B (more powerful)
python train_dnd_lora.py --model EleutherAI/pythia-1.4b --epochs 2
# See the dramatic transformation
python demo.py
# Comprehensive comparison
python compare_models.py
# Start interactive API
python api_server.py
dnd-srd-model/
├── Core Library
│   └── dnd_lora_core.py          # Consolidated functionality
├── Main Scripts
│   ├── prepare_dnd_data.py       # Data preparation
│   ├── train_dnd_lora.py         # LoRA training
│   ├── compare_models.py         # Model evaluation
│   ├── demo.py                   # Simple demonstration
│   └── api_server.py             # FastAPI server
├── Web Interface
│   ├── streamlit_dashboard.py    # Comprehensive dashboard
│   └── launch_dashboard.py       # Dashboard launcher
├── Data & Models
│   ├── data/                     # Training datasets
│   ├── models/                   # Trained LoRA adapters
│   └── eval/                     # Evaluation reports
└── Documentation
    └── README.md                 # This file
# Quick training on DistilGPT2 (recommended for testing)
python train_dnd_lora.py --model distilgpt2 --epochs 3 --batch-size 8
# Full training on Pythia (better results)
python train_dnd_lora.py --model EleutherAI/pythia-1.4b --epochs 2 --batch-size 4
# Custom configuration
python train_dnd_lora.py \
--model distilgpt2 \
--data data/dnd_srd_qa.jsonl \
--output models/my-dnd-model \
--epochs 5 \
--learning-rate 3e-4 \
--lora-rank 32
# Basic comparison (5 questions)
python compare_models.py --model distilgpt2
# Comprehensive evaluation
python compare_models.py \
--model distilgpt2 \
--batch-size 20 \
--include-general \
--output eval/detailed_comparison
# Quick demo
python demo.py --questions 3 --pause 2
# Start server (default: localhost:8000)
python api_server.py
# Custom configuration
python api_server.py \
--model distilgpt2 \
--lora-path models/my-dnd-model \
--port 8080 \
--host 0.0.0.0
# Launch comprehensive Streamlit dashboard
python launch_dashboard.py
# Custom host/port
python launch_dashboard.py --host 0.0.0.0 --port 8080
# Direct streamlit command
streamlit run streamlit_dashboard.py
The dashboard provides:
- Overview: Project stats and model information
- Model Comparison: Side-by-side evaluation interface
- Interactive Chat: Real-time model testing
- Training & Evaluation: Progress visualization and reports
- Documentation: Complete project documentation
The dnd_lora_core.py module provides three main classes:
from dnd_lora_core import DnDLoRATrainer
trainer = DnDLoRATrainer(model_name="distilgpt2")
trainer.setup_lora(rank=16, alpha=32)
dataset = trainer.prepare_dataset("data/dnd_srd_qa.jsonl")
trainer.train(dataset, num_epochs=3)
from dnd_lora_core import DnDModelComparator
comparator = DnDModelComparator(
model_name="distilgpt2",
lora_path="models/dnd-lora"
)
result = comparator.compare_responses("What is a Fireball spell?")
from dnd_lora_core import DnDDataProcessor
# Load from default local data/raw directory
srd_data = DnDDataProcessor.load_srd_data()
# Or specify custom path
srd_data = DnDDataProcessor.load_srd_data("path/to/raw/data")
qa_pairs = DnDDataProcessor.create_qa_pairs(srd_data)
DnDDataProcessor.save_dataset(qa_pairs, "data/training.jsonl")
- Parameters Trained: 1.18M / 83M (1.42%)
- Training Loss: 2.04 → 1.64
- D&D Term Usage: 200-800% increase
- Training Time: ~15 minutes (Apple M4)
- Parameters Trained: 6.29M / 1.4B (0.44%)
- Training Loss: 1.92 → 1.47
- D&D Term Usage: 300-1000% increase
- Training Time: ~45 minutes (Apple M4)
Original DistilGPT2:
"I don't know what a fireball spell is, but I think it's something that can be used to create fire."
LoRA Fine-tuned:
"Fireball is a 3rd-level evocation spell. It deals 8d6 fire damage in a 20-foot radius sphere. Creatures in the area make a Dexterity saving throw for half damage."
Analysis: 0 → 7 D&D terms. Zero-to-Hero transformation!
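The damage figure in the fine-tuned answer is also easy to sanity-check with dice arithmetic (the mean of a d6 is 3.5):

```python
def mean_dice_damage(count: int, sides: int) -> float:
    """Expected total of rolling `count` dice with `sides` faces each."""
    return count * (sides + 1) / 2

full = mean_dice_damage(8, 6)  # expected 8d6 fireball damage
half = full // 2               # half damage on a successful save, rounded down
print(full, half)  # 28.0 14.0
```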
- Rank (r): 8-32 (controls adapter size)
- Alpha: 16-64 (scaling factor)
- Target Modules: Attention and feed-forward layers
- Dropout: 0.1
- Task Type: Causal Language Modeling
- Device: Apple Silicon MPS optimization
- Precision: Float32 (MPS compatibility)
- Data Format: Instruction-following Q&A pairs
- Evaluation: 10% holdout set
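The LoRA hyperparameters above can be sketched as a plain settings dict whose keys map one-to-one onto the fields of PEFT's `LoraConfig` (`r`, `lora_alpha`, `target_modules`, `lora_dropout`, `task_type`). The `c_attn` target module name is an assumption based on GPT-2-style architectures; check the model's actual layer names before using it:

```python
# Sketch of the LoRA settings used in this project; keys mirror
# peft.LoraConfig fields. "c_attn" is an assumed GPT-2-style module name.
lora_settings = {
    "r": 16,                       # adapter rank; 8-32 explored here
    "lora_alpha": 32,              # scaling factor; 16-64 explored here
    "target_modules": ["c_attn"],  # attention projection (assumed name)
    "lora_dropout": 0.1,
    "task_type": "CAUSAL_LM",
}

# Effective scale applied to the adapter update is alpha / rank.
scale = lora_settings["lora_alpha"] / lora_settings["r"]
print(scale)  # 2.0
```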
- Minimum: 8GB RAM, Apple Silicon or CUDA GPU
- Recommended: 16GB RAM for Pythia training
- Storage: ~2GB for models and data
Compare responses from both models:
{
"prompt": "What is a Beholder in D&D?",
"max_length": 150,
"temperature": 0.7
}
Check server and model status.
Interactive API documentation (Swagger UI).
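As a sketch, the comparison request above can be sent from Python with the standard library. The `/compare` route is an assumption based on the endpoint's purpose; consult the server's Swagger UI at `/docs` for the actual path:

```python
import json
import urllib.request

def build_compare_payload(prompt: str) -> dict:
    """Request body matching the example shown above."""
    return {"prompt": prompt, "max_length": 150, "temperature": 0.7}

def compare(prompt: str, base_url: str = "http://localhost:8000") -> dict:
    # NOTE: "/compare" is an assumed route; verify it against /docs.
    req = urllib.request.Request(
        f"{base_url}/compare",
        data=json.dumps(build_compare_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(compare("What is a Beholder in D&D?"))
```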
- D&D Term Count: Domain-specific vocabulary usage
- Response Length: Detailed vs. generic responses
- Knowledge Accuracy: Correctness of D&D facts
- General Knowledge: Preservation of non-domain abilities
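The D&D term count metric can be sketched as a simple vocabulary match. The term list here is a small illustrative subset, not the project's actual lexicon:

```python
# Illustrative subset of D&D vocabulary; the real evaluation presumably
# uses a much larger term list.
DND_TERMS = {
    "evocation", "saving throw", "dexterity", "fire damage",
    "spell", "d6", "radius", "hit points", "armor class",
}

def count_dnd_terms(text: str) -> int:
    """Count occurrences of known D&D vocabulary in a response."""
    lowered = text.lower()
    return sum(lowered.count(term) for term in DND_TERMS)

base = "I don't know what a fireball spell is."
tuned = ("Fireball is a 3rd-level evocation spell. It deals 8d6 fire damage "
         "in a 20-foot radius sphere; creatures make a Dexterity saving throw.")
print(count_dnd_terms(base), count_dnd_terms(tuned))  # 1 7
```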
This is the refactored, clean version of the project. The original development created many experimental scripts in the scripts/ directory. The core functionality has been consolidated into:
- Core Library: dnd_lora_core.py
- Main Scripts: Clean, documented, production-ready
- Legacy Scripts: Original development scripts (for reference)
- Multi-domain knowledge injection
- Larger model support (7B+ parameters)
- Advanced evaluation metrics
- LoRA: Low-Rank Adaptation of Large Language Models
- D&D 5e System Reference Document
- PEFT: Parameter-Efficient Fine-Tuning
- Transformers Library
- D&D 5e Data Source
This project is for educational and research purposes. D&D content is used under the Open Gaming License.