A toolkit for creating datasets and for training and evaluating fine-tuned models for the Model Context Protocol (MCP).
This toolkit provides end-to-end functionality for fine-tuning large language models on MCP-style instructions with PEFT/QLoRA. Because only small adapter weights are trained on top of a quantized base model, fine-tuning fits on consumer hardware (e.g., RTX 3090).
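To make the parameter-efficiency claim concrete, here is a small arithmetic sketch of how many weights a LoRA adapter trains for one linear layer. The layer shape (4096 x 4096) and rank (16) are hypothetical values chosen for illustration; they are not settings taken from this toolkit.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Count the weights in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer shape and rank, for illustration only.
d_out, d_in, rank = 4096, 4096, 16

full = d_out * d_in                               # weights in the frozen base layer
lora = lora_trainable_params(d_out, d_in, rank)   # weights the adapter actually trains
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  fraction: {fraction:.2%}")
```

Under these assumptions the adapter trains 131,072 weights against 16,777,216 in the base layer, about 0.78% of that layer.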
- Dataset Generation: Create supervised instruction-completion pairs for MCP tasks
- Example Templating: Generate templates for various MCP servers (Filesystem, Git, GitHub, Database, etc.)
- Validation Tools: Ensure dataset quality and formatting consistency
- Training Integration: Fine-tune models using HuggingFace PEFT/QLoRA
- Evaluation Framework: Compare fine-tuned models against baselines on MCP tasks
The toolkit uses:
- UV for Python environment management
- PEFT/QLoRA for parameter-efficient fine-tuning
- HuggingFace Transformers for model management
- Justfile for workflow orchestration
- Pytest for component testing
# Set up the environment with UV
just setup
# Generate MCP example templates
just generate-examples
# Generate dataset from examples
just generate-dataset
# Validate the dataset
just validate-dataset
# Run all tests
just test-all
# Run the entire pipeline
just complete-workflow
The dataset consists of instruction-completion pairs following MCP conventions:
{
  "examples": [
    {
      "instruction": "Find commit history of main.py file from the last week",
      "completion": "I'll find the commit history...\n\n<use_mcp_tool>\n<server_name>git</server_name>...",
      "metadata": {
        "server": "git",
        "tool": "log",
        "complexity": "simple"
      }
    }
  ]
}
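A structural check for this format could look like the sketch below. It is not the code behind `just validate-dataset`; the function name `validate_examples` and the rule that every field shown above is required are assumptions made for illustration.

```python
# Assumption: all fields shown in the sample above are mandatory.
REQUIRED_FIELDS = ("instruction", "completion", "metadata")
REQUIRED_METADATA = ("server", "tool", "complexity")


def validate_examples(dataset):
    """Return a list of problems found; an empty list means the dataset passed."""
    problems = []
    examples = dataset.get("examples") if isinstance(dataset, dict) else None
    if not isinstance(examples, list):
        return ["top-level 'examples' must be a list"]
    for i, example in enumerate(examples):
        if not isinstance(example, dict):
            problems.append(f"example {i}: must be an object")
            continue
        for field in REQUIRED_FIELDS:
            if field not in example:
                problems.append(f"example {i}: missing '{field}'")
        # Text fields that are present must be non-empty strings.
        for field in ("instruction", "completion"):
            value = example.get(field)
            if field in example and not (isinstance(value, str) and value.strip()):
                problems.append(f"example {i}: '{field}' must be a non-empty string")
        metadata = example.get("metadata")
        if isinstance(metadata, dict):
            for key in REQUIRED_METADATA:
                if key not in metadata:
                    problems.append(f"example {i}: metadata missing '{key}'")
        elif "metadata" in example:
            problems.append(f"example {i}: 'metadata' must be an object")
    return problems


good_dataset = {
    "examples": [
        {
            "instruction": "Find commit history of main.py file from the last week",
            "completion": "I'll find the commit history...\n\n<use_mcp_tool>\n<server_name>git</server_name>...",
            "metadata": {"server": "git", "tool": "log", "complexity": "simple"},
        }
    ]
}
bad_dataset = {"examples": [{"instruction": "", "metadata": {"server": "git"}}]}

print(validate_examples(good_dataset))  # []
for problem in validate_examples(bad_dataset):
    print(problem)
```

Returning a list of messages instead of raising on the first error lets a single run report every malformed example.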
- Loss Masking: Uses TRL's DataCollatorForCompletionOnlyLM to mask prompt tokens by setting their labels to -100, so the loss is computed only on completion tokens
- Parameter Efficiency: LoRA trains a small set of added low-rank adapter weights while the base model weights stay frozen
- Hardware Requirements: Targets a single consumer GPU (e.g., RTX 3090)
- Integration: Supports Accelerate for multi-GPU or distributed training
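The loss-masking step can be illustrated without any libraries. The sketch below mirrors the idea behind completion-only masking, not the collator's actual implementation: labels start as a copy of the input ids, and everything up to and including a response template is replaced with -100. The token ids and the helper names (`find_subsequence`, `mask_prompt_labels`) are made up for this example.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def find_subsequence(sequence, pattern):
    """Return the start index of the first occurrence of pattern in sequence, or -1."""
    n, m = len(sequence), len(pattern)
    for start in range(n - m + 1):
        if sequence[start:start + m] == pattern:
            return start
    return -1


def mask_prompt_labels(input_ids, response_template_ids):
    """Build labels in which the prompt and the response template are ignored."""
    labels = list(input_ids)
    start = find_subsequence(labels, response_template_ids)
    if start == -1:
        # No template found: ignore the whole sequence rather than train on the prompt.
        return [IGNORE_INDEX] * len(labels)
    completion_start = start + len(response_template_ids)
    labels[:completion_start] = [IGNORE_INDEX] * completion_start
    return labels


# Made-up ids: 11-13 are the instruction, 90-91 the response template, 21-23 the completion.
input_ids = [11, 12, 13, 90, 91, 21, 22, 23]
labels = mask_prompt_labels(input_ids, [90, 91])
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 23]
```

Only the last three positions contribute to the loss, so the model is trained to produce the completion and is never penalized for the instruction text.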
The toolkit supports examples from multiple MCP servers:
- Filesystem: File operations, directory management, path manipulation
- Git: Repository operations, commit history, diffs, branch management
- GitHub: Issues, PRs, repository exploration, code search
- Database: Query execution, schema exploration
- Custom Tools: Weather data, calculator functions, image processing
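One quick way to see how evenly a dataset covers these servers is to count examples by the `server` value in their metadata. This is a minimal sketch: the `server_coverage` helper is hypothetical, and the server identifiers other than `git` are assumed spellings, not names confirmed by the toolkit.

```python
from collections import Counter


def server_coverage(dataset):
    """Count examples per metadata 'server' value; examples without one count as 'unknown'."""
    return Counter(
        example.get("metadata", {}).get("server", "unknown")
        for example in dataset.get("examples", [])
    )


# Hypothetical, trimmed-down dataset: only the metadata matters for this count.
dataset = {
    "examples": [
        {"metadata": {"server": "git"}},
        {"metadata": {"server": "git"}},
        {"metadata": {"server": "filesystem"}},  # assumed identifier
        {"metadata": {}},
    ]
}

coverage = server_coverage(dataset)
for server, count in coverage.most_common():
    print(f"{server}: {count}")
```

A heavily skewed count suggests generating more examples for the underrepresented servers before training.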
See the TODO.md file for upcoming features and improvements.