D&D SRD LoRA Fine-Tuning Project

A complete demonstration of "zero-to-hero" knowledge injection using LoRA (Low-Rank Adaptation) fine-tuning to transform general language models into D&D 5e experts.

🎯 Project Overview

This project showcases how LoRA fine-tuning can inject domain-specific knowledge into small language models at minimal computational cost. Models such as DistilGPT2 (82M parameters) go from essentially no D&D knowledge to answering rules questions in SRD terminology, while training adapter weights that amount to only 1.42% of the total parameter count.

🏆 Key Achievements

Dramatic Knowledge Injection: 200-800% increase in D&D terminology usage
Efficient Training: Only 0.44-1.42% of parameters are trained to add domain expertise
Complete Pipeline: End-to-end system from data preparation to deployment
Real-time Comparison: Interactive API and dashboard for model evaluation
Comprehensive Evaluation: Automated testing and HTML report generation

🚀 Quick Start

1. Setup Environment

# Clone the repository
git clone https://github.com/haveard/dnd-srd-model.git
cd dnd-srd-model

# Install dependencies
pip install -r requirements.txt

# D&D SRD data is already included in ./data/raw/

2. Prepare Training Data

# Convert D&D SRD to training format from local data
python prepare_dnd_data.py

# This creates data/dnd_srd_qa.jsonl with 2,953 Q&A examples
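
The exact schema of data/dnd_srd_qa.jsonl is not documented in this README. As a rough sketch of what a JSONL Q&A dataset looks like, assuming one JSON object per line with hypothetical `question` and `answer` fields (the real field names may differ), a file can be written and read back like this:

```python
import json
import os
import tempfile

def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Hypothetical record layout; check the real file for the actual keys.
sample = [
    {"question": "What is a Fireball spell in D&D?",
     "answer": "Fireball is a 3rd-level evocation spell."},
]

path = os.path.join(tempfile.mkdtemp(), "sample_qa.jsonl")
write_jsonl(sample, path)
loaded = read_jsonl(path)
print(len(loaded), "example(s) loaded")
```

Pointing `read_jsonl` at the generated dataset and printing the first record is a quick way to confirm the actual keys before training.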

3. Train a LoRA Model

# Train DistilGPT2 with D&D knowledge (fastest)
python train_dnd_lora.py --model distilgpt2 --epochs 3

# Train Pythia-1.4B (more powerful)
python train_dnd_lora.py --model EleutherAI/pythia-1.4b --epochs 2

4. Run Demonstration

# See the dramatic transformation
python demo.py

# Comprehensive comparison
python compare_models.py

# Start interactive API
python api_server.py

📁 Project Structure

dnd-srd-model/
├── 🧠 Core Library
│   └── dnd_lora_core.py          # Consolidated functionality
├── 🛠️ Main Scripts
│   ├── prepare_dnd_data.py       # Data preparation
│   ├── train_dnd_lora.py         # LoRA training
│   ├── compare_models.py         # Model evaluation
│   ├── demo.py                   # Simple demonstration
│   └── api_server.py             # FastAPI server
├── 🖥️ Web Interface
│   ├── streamlit_dashboard.py    # Comprehensive dashboard
│   └── launch_dashboard.py       # Dashboard launcher
├── 📊 Data & Models
│   ├── data/                     # Training datasets
│   ├── models/                   # Trained LoRA adapters
│   └── eval/                     # Evaluation reports
└── 📚 Documentation
    └── README.md                 # This file

🎮 Usage Examples

Training Different Models

# Quick training on DistilGPT2 (recommended for testing)
python train_dnd_lora.py --model distilgpt2 --epochs 3 --batch-size 8

# Full training on Pythia (better results)
python train_dnd_lora.py --model EleutherAI/pythia-1.4b --epochs 2 --batch-size 4

# Custom configuration
python train_dnd_lora.py \
    --model distilgpt2 \
    --data data/dnd_srd_qa.jsonl \
    --output models/my-dnd-model \
    --epochs 5 \
    --learning-rate 3e-4 \
    --lora-rank 32

Model Comparison

# Basic comparison (5 questions)
python compare_models.py --model distilgpt2

# Comprehensive evaluation
python compare_models.py \
    --model distilgpt2 \
    --batch-size 20 \
    --include-general \
    --output eval/detailed_comparison

# Quick demo
python demo.py --questions 3 --pause 2

API Server

# Start server (default: localhost:8000)
python api_server.py

# Custom configuration
python api_server.py \
    --model distilgpt2 \
    --lora-path models/my-dnd-model \
    --port 8080 \
    --host 0.0.0.0

Interactive Dashboard

# Launch comprehensive Streamlit dashboard
python launch_dashboard.py

# Custom host/port
python launch_dashboard.py --host 0.0.0.0 --port 8080

# Direct streamlit command
streamlit run streamlit_dashboard.py

The dashboard provides:

🏠 Overview: Project stats and model information
🤖 Model Comparison: Side-by-side evaluation interface
💬 Interactive Chat: Real-time model testing
📈 Training & Evaluation: Progress visualization and reports
📚 Documentation: Complete project documentation

🔬 Core Library API

The dnd_lora_core.py module provides three main classes:

DnDLoRATrainer

from dnd_lora_core import DnDLoRATrainer

trainer = DnDLoRATrainer(model_name="distilgpt2")
trainer.setup_lora(rank=16, alpha=32)
dataset = trainer.prepare_dataset("data/dnd_srd_qa.jsonl")
trainer.train(dataset, num_epochs=3)

DnDModelComparator

from dnd_lora_core import DnDModelComparator

comparator = DnDModelComparator(
    model_name="distilgpt2",
    lora_path="models/dnd-lora"
)
result = comparator.compare_responses("What is a Fireball spell?")

DnDDataProcessor

from dnd_lora_core import DnDDataProcessor

# Load from default local data/raw directory
srd_data = DnDDataProcessor.load_srd_data()

# Or specify custom path
srd_data = DnDDataProcessor.load_srd_data("path/to/raw/data")

qa_pairs = DnDDataProcessor.create_qa_pairs(srd_data)
DnDDataProcessor.save_dataset(qa_pairs, "data/training.jsonl")

📊 Training Results

DistilGPT2 Results

Parameters Trained: 1.18M / 83M (1.42%)
Training Loss: 2.04 → 1.64
D&D Term Usage: 200-800% increase
Training Time: ~15 minutes (Apple M4)

Pythia-1.4B Results

Parameters Trained: 6.29M / 1.4B (0.44%)
Training Loss: 1.92 → 1.47
D&D Term Usage: 300-1000% increase
Training Time: ~45 minutes (Apple M4)
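
The trainable-parameter figures above can be sanity-checked with a little arithmetic. A rank-r adapter on a linear layer of shape d_in x d_out adds r * (d_in + d_out) weights. The sketch below assumes the adapters wrap the four linear projections in every transformer block (attention input and output, plus both feed-forward layers), with rank 16 for DistilGPT2 and rank 8 for Pythia-1.4B. Those ranks and the choice of wrapped layers are assumptions rather than settings confirmed by this README; they are used because they reproduce the reported 1.18M and 6.29M.

```python
def lora_param_count(layer_shapes, rank, num_layers):
    """Count LoRA weights: each wrapped (d_in, d_out) layer adds rank * (d_in + d_out)."""
    per_block = sum(rank * (d_in + d_out) for d_in, d_out in layer_shapes)
    return per_block * num_layers

# Assumed per-block linear layers as (d_in, d_out) pairs.
# DistilGPT2: hidden size 768, 6 blocks. Pythia-1.4B: hidden size 2048, 24 blocks.
DISTILGPT2_SHAPES = [(768, 2304), (768, 768), (768, 3072), (3072, 768)]
PYTHIA_1_4B_SHAPES = [(2048, 6144), (2048, 2048), (2048, 8192), (8192, 2048)]

distil = lora_param_count(DISTILGPT2_SHAPES, rank=16, num_layers=6)
pythia = lora_param_count(PYTHIA_1_4B_SHAPES, rank=8, num_layers=24)

print(f"DistilGPT2 adapter weights: {distil:,}")         # 1,179,648
print(f"Pythia-1.4B adapter weights: {pythia:,}")        # 6,291,456
print(f"DistilGPT2 share of 83M: {distil / 83e6:.2%}")   # 1.42%
```

Doubling the rank doubles the adapter size, which is why the rank setting is the main lever on adapter cost.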

🎯 Example Transformations

Question: "What is a Fireball spell in D&D?"

Original DistilGPT2:

"I don't know what a fireball spell is, but I think it's something that can be used to create fire."

LoRA Fine-tuned:

"Fireball is a 3rd-level evocation spell. It deals 8d6 fire damage in a 20-foot radius sphere. Creatures in the area make a Dexterity saving throw for half damage."

Analysis: 0 → 7 D&D terms ✨ Zero-to-Hero transformation!

🛡️ Technical Details

LoRA Configuration

Rank (r): 8-32 (controls adapter size)
Alpha: 16-64 (scaling factor)
Target Modules: Attention and feed-forward layers
Dropout: 0.1
Task Type: Causal Language Modeling
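
To make the roles of rank and alpha concrete: LoRA keeps a weight matrix W frozen and adds a low-rank product scaled by alpha / r, so the adapted layer computes W x + (alpha / r) * B (A x). The pure-Python sketch below is illustrative only (tiny matrices, no training loop) and is not taken from dnd_lora_core.py. It follows the standard LoRA initialization in which B starts at zero, so the adapter has no effect until it is trained.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha):
    """Return W x + (alpha / r) * B (A x), where r is the adapter rank."""
    rank = len(A)                      # A has shape (r, d_in)
    scaling = alpha / rank
    base = matvec(W, x)                # frozen path
    update = matvec(B, matvec(A, x))   # low-rank path; B has shape (d_out, r)
    return [b + scaling * u for b, u in zip(base, update)]

random.seed(0)
d_in, d_out, rank, alpha = 4, 3, 2, 4
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]  # zero init: adapter starts as a no-op
x = [1.0, 2.0, 3.0, 4.0]

# With B at zero the adapted layer matches the frozen layer exactly.
assert lora_forward(W, A, B, x, alpha) == matvec(W, x)

B[0][0] = 1.0  # pretend training moved one adapter weight
print(lora_forward(W, A, B, x, alpha)[0] - matvec(W, x)[0])
```

Only A and B would receive gradients during training; W stays untouched, which is what keeps the trainable fraction small.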

Training Setup

Device: Apple Silicon (MPS backend)
Precision: Float32 (MPS compatibility)
Data Format: Instruction-following Q&A pairs
Evaluation: 10% holdout set
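
A 10% holdout can be carved out with a seeded shuffle. This is a minimal sketch of the idea rather than the project's own splitting code; the function name `split_holdout` and the seed value are illustrative assumptions.

```python
import random

def split_holdout(examples, holdout_fraction=0.1, seed=42):
    """Shuffle a copy of the examples and return (train, holdout) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

# Stand-in for the 2,953 Q&A examples produced by data preparation.
examples = list(range(2953))
train, holdout = split_holdout(examples)
print(len(train), len(holdout))  # 2658 295
```

Fixing the seed keeps the holdout set identical across runs, so loss numbers from different training configurations stay comparable.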

Hardware Requirements

Minimum: 8GB RAM, Apple Silicon or CUDA GPU
Recommended: 16GB RAM for Pythia training
Storage: ~2GB for models and data

🌐 API Endpoints

POST /generate

Compare responses from the base model and the LoRA fine-tuned model:

{
  "prompt": "What is a Beholder in D&D?",
  "max_length": 150,
  "temperature": 0.7
}
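
A request like the one above can be sent from Python with the standard library. The sketch assumes the server is running at the default localhost:8000 mentioned earlier. The response schema is not documented here, so the helper simply returns the decoded JSON; the function names are illustrative.

```python
import json
import urllib.request

def build_generate_request(prompt, base_url="http://localhost:8000",
                           max_length=150, temperature=0.7):
    """Build (but do not send) a POST request for the /generate endpoint."""
    payload = {"prompt": prompt, "max_length": max_length, "temperature": temperature}
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt, **kwargs):
    """Send the request and return the decoded JSON response (needs a running server)."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))

request = build_generate_request("What is a Beholder in D&D?")
print(request.get_method(), request.full_url)
# Call generate("What is a Beholder in D&D?") once api_server.py is running.
```

Building the request separately from sending it makes the payload easy to inspect without a live server.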

GET /health

Check server and model status.

GET /docs

Interactive API documentation (Swagger UI).

📈 Evaluation Metrics

D&D Term Count: Domain-specific vocabulary usage
Response Length: Detailed vs. generic responses
Knowledge Accuracy: Correctness of D&D facts
General Knowledge: Preservation of non-domain abilities
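
The D&D term count is the simplest of these metrics to illustrate. The sketch below counts how many distinct vocabulary terms appear in a response. The term list is a hypothetical stand-in, and the project's real vocabulary and counting rules may differ, so the numbers it produces are not expected to match the counts reported in this README.

```python
import re

# Hypothetical vocabulary; the project's real term list may differ.
DND_TERMS = ["evocation", "saving throw", "hit points", "armor class",
             "dexterity", "spell slot", "3rd-level"]

def count_dnd_terms(text, vocabulary=DND_TERMS):
    """Count how many distinct vocabulary terms appear in the text (case-insensitive)."""
    return sum(
        1 for term in vocabulary
        if re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE)
    )

def percent_increase(before, after):
    """Relative increase in percent; returns None when the baseline is zero."""
    if before == 0:
        return None
    return 100.0 * (after - before) / before

base = ("I don't know what a fireball spell is, but I think it's something "
        "that can be used to create fire.")
tuned = ("Fireball is a 3rd-level evocation spell. It deals 8d6 fire damage in a "
         "20-foot radius sphere. Creatures in the area make a Dexterity saving throw "
         "for half damage.")

print(count_dnd_terms(base), count_dnd_terms(tuned))
```

A percentage increase is undefined when the base model uses no D&D terms at all, which is why `percent_increase` returns None for a zero baseline instead of a number.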

🧹 Development Notes

This is the refactored, clean version of the project. The original development created many experimental scripts in the scripts/ directory. The core functionality has been consolidated into:

Core Library: dnd_lora_core.py
Main Scripts: Clean, documented, production-ready
Legacy Scripts: Original development scripts (for reference)

🚀 Future Enhancements

Multi-domain knowledge injection
Larger model support (7B+ parameters)
Advanced evaluation metrics

📄 License

This project is for educational and research purposes. D&D content is used under the Open Gaming License.
