Self-Evolving Agent Orchestration for Claude Code
Quick Start • Features • Architecture • Documentation • Community
AgentForge is where AI workflows are forged, not configured. Start with minimal recipes, evolve through usage, and forge your perfect workflow.
🌟 Slogan: Forge your workflow, evolve your agents
AgentForge is a self-evolving agent orchestration system for Claude Code that transforms vague requests into successful executions through smart agent dispatch, rigorous workflows, and continuous learning from execution history.
Unlike traditional frameworks that require extensive pre-configuration, AgentForge follows the ALITA principles:
- 📦 Minimal Predefined: Start with basic official Recipes
- 🌱 Maximum Self-Evolution: Learn from every execution and improve over time
Think of it as a blacksmith's forge for AI agents:
- 🔥 Every execution is a hammer strike, refining your workflow
- ⚒️ Recipes are the tools being forged through usage
- ✨ The system becomes smarter with each task you complete
- 🧠 Intelligent Orchestration
- 🌱 Self-Evolution Engine
- 📊 Knowledge Management
- 🔧 Developer Experience
Dimension | AgentForge | Traditional Frameworks |
---|---|---|
Philosophy | 🌱 Evolve through usage | 📝 Configure upfront |
Initial Setup | ⚡ Minimal (basic Recipes) | 🐌 Extensive (complex configs) |
Learning | 🧠 Automatic pattern learning | 🤖 Static rules |
Recipe Generation | ✅ Auto-generated from patterns | ❌ Manual creation only |
Optimization | ✅ Automatic merge & archive | ❌ Manual maintenance |
Agent Discovery | 🔍 4-tier intelligent search | 📁 Fixed agent list |
Knowledge Base | 📊 JSONL (append-only, reliable) | 💾 Various formats |
Startup Time | ⚡ ~10ms (Bash) | 🐌 100-500ms (Python/Node) |
Dependencies | ✅ Zero (pure Bash) |
Claude Code CLI must be installed first.
Option 1: Quick Install (Recommended)
curl -fsSL https://raw.githubusercontent.com/candybox-ai/agentforge/main/scripts/install.sh | bash
Option 2: Manual Installation
git clone https://github.com/candybox-ai/agentforge.git
cd agentforge
chmod +x scripts/install.sh
./scripts/install.sh
Option 3: Direct Download
curl -fsSL -o agentforge https://raw.githubusercontent.com/candybox-ai/agentforge/main/bin/agentforge
chmod +x agentforge
sudo mv agentforge /usr/local/bin/
agentforge --help
agentforge "Build a REST API with user authentication"
The system will:
- 📝 Match Recipe from learned patterns
- ✅ Clarify your requirements with Recipe guidance
- 🎯 Define success criteria
- 🔍 Discover optimal agents (4-tier search)
- ⚠️ Assess risks with Recipe strategies
- 🚀 Execute with agent coordination
- ✨ Verify complete delivery
- 📊 Learn from execution (update knowledge base)
graph TB
A[User Task] --> B[Recipe Matcher]
B --> C{Recipe Found?}
C -->|Yes| D[Load Recipe Workflow]
C -->|No| E[General Workflow]
D --> F[Agent Finder]
E --> F
F --> G[4-Tier Discovery]
G --> H[Local Agents]
G --> I[Official Agents]
G --> J[GitHub Search]
G --> K[Community Sources]
H --> L[Prompt Builder]
I --> L
J --> L
K --> L
L --> M[Claude Code Execution]
M --> N[Data Extractor]
N --> O[Knowledge Recorder]
O --> P{Patterns Found?}
P -->|Yes| Q[Recipe Generator]
P -->|No| R[End]
Q --> S[Optimizer]
S --> R
agentforge/
├── bin/
│ ├── agentforge # Main entry point
│ └── claude-agent-dispatch # Symlink (backward compatibility)
├── core/ # 11 core modules (4,274 lines)
│ ├── recipe-loader.sh # Recipe YAML parsing (230 lines)
│ ├── recipe-matcher.sh # Task-to-Recipe matching (280 lines)
│ ├── agent-finder.sh # 4-tier agent discovery (350 lines)
│ ├── prompt-builder.sh # Structured prompts (320 lines)
│ ├── data-extractor.sh # Execution metadata (280 lines)
│ ├── recipe-generator.sh # Auto-generation (515 lines)
│ ├── optimizer.sh # Merge & archive (435 lines)
│ ├── feedback-collector.sh # User satisfaction (240 lines)
│ ├── knowledge-recorder.sh # JSONL recording (310 lines)
│ ├── config-loader.sh # Configuration (342 lines)
│ └── agent-installer.sh # Interactive install (456 lines)
├── recipes/ # Recipe collection (1,587 lines)
│ ├── official/ # 5 official Recipes
│ │ ├── api-development.yaml
│ │ ├── devops-deployment.yaml
│ │ ├── mobile-development.yaml
│ │ ├── web-development.yaml
│ │ └── data-analysis.yaml
│ ├── generated/ # Auto-generated from patterns
│ └── custom/ # User-created Recipes
├── config/
│ └── agent-sources.yaml # Agent source configuration
└── evolution/
  └── knowledge/ # Knowledge base (JSONL)
    ├── success-patterns.jsonl
    ├── failure-patterns.jsonl
    ├── agent-combinations.jsonl
    └── task-fingerprints.jsonl
sequenceDiagram
participant U as User
participant AF as AgentForge
participant R as Recipe
participant A as Agents
participant CC as Claude Code
participant KB as Knowledge Base
U->>AF: Task Description
AF->>R: Match Recipe
R->>AF: Workflow Enhancements
AF->>U: Step 1: Clarify Requirements
U->>AF: Confirmation
AF->>U: Step 2: Define Success Criteria
AF->>A: Step 3: Discover Agents
A->>AF: Agent Recommendations
AF->>U: Step 4: Risk Assessment
AF->>CC: Step 5: Execute with Monitoring
CC->>AF: Execution Results
AF->>U: Step 6: Verify Delivery
AF->>KB: Record Patterns
KB->>R: Generate/Optimize Recipes
Execute a task:
agentforge "your task description"
List available Recipes:
agentforge --list-recipes
Show Recipe statistics:
agentforge --recipe-stats
Generate new Recipes from patterns:
agentforge --generate-recipes
Optimize Recipe collection:
agentforge --optimize-recipes
View knowledge base statistics:
agentforge --knowledge-stats
Interactive agent installation:
agentforge --install-agents
Force specific language:
agentforge --lang en "your task"
agentforge --lang zh "你的任务"
🔒 Add Authentication to Existing App
agentforge "Add JWT authentication to my Express.js API in /src/api/ with login, register, password reset, and email verification features"
📊 Business Intelligence Dashboard
agentforge "Build executive dashboard using /data/quarterly_sales.xlsx showing revenue trends, regional performance, top products, and growth forecasts with interactive Plotly charts"
🚀 Production Deployment
agentforge "Deploy React app to AWS with S3, CloudFront, auto-scaling, SSL certificates, and CI/CD pipeline using GitHub Actions"
🐛 Debug Performance Issues
agentforge "Investigate and fix slow API responses in /src/services/ - analyze bottlenecks, optimize database queries, implement caching, and achieve <200ms response time"
Recipes are YAML-based workflow patterns that encode successful execution strategies. Think of them as:
- 📜 Battle-tested blueprints for common tasks
- 🎯 Workflow enhancements for the 6-step process
- 🤖 Agent recommendations based on task type
- ⚠️ Risk mitigation strategies from past failures
metadata:
  name: "API Development"
  version: "1.0.0"
  description: "REST/GraphQL API development with best practices"
  category: "backend"
  tags: ["api", "rest", "graphql", "authentication"]
triggers:
  keywords: ["API", "REST", "GraphQL", "endpoint", "authentication"]
  patterns: ["develop.*api", "build.*service", "create.*endpoint"]
workflow:
  step_1_clarification:
    priority_questions:
      - "What API type (REST/GraphQL)?"
      - "Authentication requirements (JWT/OAuth)?"
      - "Database needs (SQL/NoSQL)?"
  step_2_criteria:
    success_indicators:
      - "API responds with <200ms (P95)"
      - "Test coverage >80%"
      - "Security: OWASP compliance"
  step_3_assessment:
    recommended_agents:
      - name: "backend-developer"
        priority: 1
        required: true
      - name: "database-optimizer"
        priority: 2
      - name: "security-auditor"
        priority: 3
    tech_stack:
      languages: ["Python", "Node.js", "Go"]
      frameworks: ["FastAPI", "Express.js", "NestJS"]
      databases: ["PostgreSQL", "MongoDB", "Redis"]
  step_4_risks:
    common_risks:
      - risk: "Authentication vulnerabilities"
        mitigation: "Use security-auditor, implement JWT best practices"
      - risk: "Database performance issues"
        mitigation: "Use database-optimizer, add indexes, implement caching"
  step_5_execution:
    milestones:
      - "API scaffolding complete"
      - "Authentication endpoints working"
      - "Database integration tested"
      - "Security audit passed"
  step_6_verification:
    checklist:
      - "All endpoints return correct status codes"
      - "Authentication flows work (login/register/logout)"
      - "Error handling is comprehensive"
      - "API documentation is generated"
stats:
  usage_count: 47
  success_rate: 0.94
  avg_satisfaction: 4.6
  last_used: "2025-10-14T10:30:00Z"
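To make the `triggers` block more concrete, here is a minimal sketch of how a task description could be checked against a Recipe's keywords and patterns. It is not the code in core/recipe-matcher.sh: the function name `matches_recipe`, the case-insensitive comparison, and the rule that a single keyword or pattern hit is enough are all assumptions made for illustration.

```shell
# Hypothetical sketch; trigger values are copied from the Recipe above.
KEYWORDS="api rest graphql endpoint authentication"
PATTERNS='develop.*api
build.*service
create.*endpoint'

# Succeeds (exit 0) if the task contains any keyword or matches any pattern.
matches_recipe() {
  # Lowercase once so both checks are case-insensitive.
  task=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  for kw in $KEYWORDS; do
    case $task in *"$kw"*) return 0 ;; esac
  done
  # grep -E accepts a newline-separated list and matches if any pattern hits.
  printf '%s\n' "$task" | grep -Eq -e "$PATTERNS"
}

matches_recipe "Build a REST API with user authentication" && echo "Recipe matched"
```

Plain substring matching is crude (for example, "rest" also hits "restaurant"), so a real matcher would probably score matches rather than accept the first hit.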
AgentForge includes 5 battle-tested official Recipes:
Recipe | Category | Coverage | Success Rate |
---|---|---|---|
API Development | Backend | REST, GraphQL, authentication, databases | 94% |
DevOps Deployment | Infrastructure | CI/CD, Docker, Kubernetes, cloud | 91% |
Mobile Development | Mobile | iOS, Android, React Native, Flutter | 89% |
Web Development | Frontend | React, Vue, Next.js, full-stack | 93% |
Data Analysis | Data Science | Pandas, visualization, ML pipelines | 87% |
How Recipes are Born:
Execution 1 → Success pattern recorded
Execution 2 → Similar pattern found
Execution 3 → Pattern strengthened
...
Execution 10 → Quality threshold met
→ Recipe auto-generated! 🎉
Quality Thresholds:
- ✅ Minimum 5 successful executions
- ✅ Success rate ≥ 70%
- ✅ Average satisfaction ≥ 3.5/5
- ✅ Pattern confidence ≥ 0.8
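The four thresholds can be read as one combined check. The sketch below assumes that all four must hold at once and that the success rate and confidence are stored as fractions between 0 and 1; `meets_thresholds` is a hypothetical helper, not a function from core/recipe-generator.sh.

```shell
# Hypothetical helper: succeeds when a pattern clears all four thresholds.
# Args: successful_runs success_rate(0-1) avg_satisfaction(1-5) confidence(0-1)
meets_thresholds() {
  # awk handles the decimal comparisons that shell integer tests cannot.
  awk -v n="$1" -v rate="$2" -v sat="$3" -v conf="$4" 'BEGIN {
    ok = (n >= 5 && rate >= 0.70 && sat >= 3.5 && conf >= 0.8)
    exit(ok ? 0 : 1)
  }'
}

if meets_thresholds 12 0.92 4.6 0.85; then
  echo "ready to generate a Recipe"
fi
```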
AgentForge finds the right agents for your task through intelligent 4-tier search:
Tier 1: Local Agents
- ⚡ Fastest (local filesystem)
- 👤 Your custom agents
- 🔒 Private and secure
Tier 2: Official Agents
- ⭐ Curated by AgentForge
- 🎯 Specialized for common tasks
- ✅ Quality guaranteed
Official agents include:
backend-developer devops-troubleshooter frontend-developer
csharp-pro kubernetes-architect python-pro
security-auditor database-optimizer api-documenter
deployment-engineer cloud-architect golang-pro
rust-pro typescript-pro test-automator
Tier 3: GitHub Search
- 🌍 Public repositories
- 🔍 Search by topics: claude-agent, ai-agent
- 📊 Ranked by stars and relevance
Tier 4: Community Sources
- 🤝 Configured registries
- 🏢 Organization-specific agents
- 🔌 Extensible via config/agent-sources.yaml
Traditional frameworks are like weapon shops - you buy what's available:
- ❌ Fixed agents and workflows
- ❌ One-size-fits-all approach
- ❌ No learning from usage
AgentForge is a blacksmith's forge - you craft what you need:
- ✅ Starts minimal, evolves through usage
- ✅ Every execution is a "hammer strike" refining your workflow
- ✅ Recipes are forged, not configured
graph LR
A[Task Execution] --> B[Extract Metadata]
B --> C[Record to JSONL]
C --> D{Pattern Found?}
D -->|Yes| E[Strengthen Pattern]
D -->|No| F[New Pattern]
E --> G{Quality Threshold?}
F --> G
G -->|Met| H[Generate Recipe]
G -->|Not Yet| I[Continue Learning]
H --> J[Optimize Collection]
J --> K[Merge Similar]
J --> L[Archive Underperforming]
Every execution adds to the knowledge base:
success-patterns.jsonl
{"timestamp": "2025-10-14T10:30:00Z", "task_fingerprint": "api_dev_auth_jwt", "agents": ["backend-developer", "security-auditor"], "tech_stack": ["fastapi", "jwt", "postgresql"], "success": true, "execution_time_ms": 45000, "satisfaction": 5}
agent-combinations.jsonl
{"agents": ["backend-developer", "database-optimizer"], "task_category": "api", "success_rate": 0.94, "usage_count": 47, "avg_execution_time_ms": 42000}
task-fingerprints.jsonl
{"fingerprint": "api_dev_auth_jwt", "keywords": ["api", "jwt", "authentication"], "success_count": 12, "failure_count": 1, "avg_satisfaction": 4.6}
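Because each record is one JSON object per line, appending and querying need nothing beyond standard tools. The following is a rough sketch under a few assumptions: the helper names `record_success` and `count_successes` are hypothetical, only a subset of the fields shown above is written, and fingerprints are assumed to contain only letters, digits, and underscores, so no JSON escaping is attempted.

```shell
# Hypothetical sketch; file location and helper names are assumptions.
KB_FILE="${KNOWLEDGE_BASE_DIR:-$HOME/.claude/knowledge}/success-patterns.jsonl"

# Append one execution record as a single JSON line (append-only, never rewritten).
# Args: fingerprint execution_time_ms satisfaction(1-5)
record_success() {
  mkdir -p "$(dirname "$KB_FILE")"
  printf '{"timestamp": "%s", "task_fingerprint": "%s", "success": true, "execution_time_ms": %s, "satisfaction": %s}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" "$3" >> "$KB_FILE"
}

# Count recorded successes for one fingerprint (fixed-string match, one hit per line).
count_successes() {
  grep -cF "\"task_fingerprint\": \"$1\"" "$KB_FILE"
}
```

For example, calling `record_success api_dev_auth_jwt 45000 5` twice and then `count_successes api_dev_auth_jwt` would print `2`.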
When patterns exceed quality thresholds:
- Pattern Analysis - Groups similar successful executions
- Metadata Extraction - Identifies common agents, tech stack, risks
- YAML Generation - Creates Recipe with workflow enhancements
- Validation - Ensures Recipe quality and completeness
- Activation - Adds to Recipe collection
Merge Similar Recipes:
agentforge --optimize-recipes
Analyzing 47 Recipes...
Found 3 similar Recipes (Jaccard similarity > 0.7):
- api-development-jwt.yaml
- api-auth-backend.yaml
- rest-api-authentication.yaml
Merging into: api-development.yaml
Preserved best workflow enhancements from all 3 sources
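Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below computes it over two space-separated tag lists; which fields AgentForge actually compares (tags, keywords, agents) is not stated here, so the inputs are an assumption and `jaccard` is a hypothetical helper.

```shell
# Hypothetical sketch: Jaccard similarity of two space-separated tag lists.
jaccard() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, xs, " "); nb = split(b, ys, " ")
    for (i = 1; i <= na; i++) in_a[xs[i]] = 1
    for (i = 1; i <= nb; i++) in_b[ys[i]] = 1
    # Every tag in A counts toward the union; shared tags also count as intersection.
    for (t in in_a) { union++; if (t in in_b) inter++ }
    # Tags only in B complete the union.
    for (t in in_b) if (!(t in in_a)) union++
    printf "%.2f\n", (union ? inter / union : 0)
  }'
}

jaccard "api rest jwt" "api rest jwt auth"   # 3 shared / 4 total, prints 0.75
```

With the 0.7 cutoff from the output above, these two tag lists would be merge candidates, while `jaccard "api rest jwt authentication" "api rest authentication backend"` gives 0.60 and would not.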
Archive Underperforming:
Archiving Recipes with:
- Usage count < 3 in last 30 days
- Success rate < 60%
Archived 2 Recipes to recipes/archive/
AgentForge automatically detects language from your task description:
- Chinese characters → Chinese interface
- Otherwise → English interface
Force a specific language:
export AGENTFORGE_LANG=en # Force English
export AGENTFORGE_LANG=zh # Force Chinese
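The detection rule can be sketched without relying on the shell's locale. This is an illustration only, not the project's code: `detect_lang` is a hypothetical name, and the byte check is an approximation that treats any UTF-8 code point in U+4000..U+9FFF as Chinese (so Japanese kanji would also be reported as zh).

```shell
# Hypothetical sketch of the detection rule described above.
detect_lang() {
  # An explicit AGENTFORGE_LANG overrides auto-detection.
  if [ -n "${AGENTFORGE_LANG:-}" ]; then
    printf '%s\n' "$AGENTFORGE_LANG"
    return 0
  fi
  # Dump the task as hex bytes. In UTF-8, a lead byte of e4..e9 starts a
  # code point in U+4000..U+9FFF, which covers the common CJK ideographs.
  hex=$(printf '%s' "$1" | od -An -tx1 | tr '\n' ' ')
  case $hex in
    *" e"[4-9]" "*) echo zh ;;
    *) echo en ;;
  esac
}
```

With `AGENTFORGE_LANG` unset, `detect_lang "你的任务"` prints `zh` and `detect_lang "Build a REST API"` prints `en`.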
Edit config/agent-sources.yaml:
official:
  enabled: true
  agents:
    - name: "backend-developer"
      category: "backend"
      description: "Backend development and API design"
    # ... 15 official agents
community:
  sources:
    - name: "claude-dev-community"
      url: "https://github.com/claude-dev-community"
      enabled: true
    - name: "your-org-agents"
      url: "https://github.com/your-org/agents"
      enabled: true
github:
  enabled: true
  search_topics: ["claude-agent", "ai-agent"]
  max_results: 10
# Language preference
export AGENTFORGE_LANG=en
# Custom directories
export LOCAL_AGENT_DIR=$HOME/.claude/agents
export RECIPE_DIR=$HOME/.claude/recipes
export KNOWLEDGE_BASE_DIR=$HOME/.claude/knowledge
# GitHub token for API rate limits
export GITHUB_TOKEN=your_github_token
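For context on why `GITHUB_TOKEN` matters: unauthenticated calls to the GitHub search API have a much lower rate limit than authenticated ones. The sketch below shows one way a topic search sorted by stars could be issued against GitHub's public repository-search endpoint. The helper names are hypothetical, and the query AgentForge actually sends may differ.

```shell
# Hypothetical sketch; AgentForge's real query may differ.
# Build a GitHub repository-search URL for one topic, sorted by stars.
# Args: topic max_results
github_search_url() {
  printf 'https://api.github.com/search/repositories?q=topic:%s&sort=stars&order=desc&per_page=%s\n' "$1" "$2"
}

# Run the search, sending GITHUB_TOKEN when it is set to raise the rate limit.
github_search() {
  if [ -n "${GITHUB_TOKEN:-}" ]; then
    curl -fsSL -H "Authorization: Bearer $GITHUB_TOKEN" "$(github_search_url "$1" "$2")"
  else
    curl -fsSL "$(github_search_url "$1" "$2")"
  fi
}
```

Running `github_search claude-agent 10` would return the JSON search results for that topic, mirroring the `search_topics` and `max_results` settings in config/agent-sources.yaml.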
Run the integration test suite:
bash tests/integration-test.sh
Test coverage:
- ✅ Configuration loading
- ✅ Recipe loading and matching
- ✅ Agent discovery (4-tier)
- ✅ Prompt generation
- ✅ Data extraction
- ✅ Knowledge recording
- ✅ Recipe generation
- ✅ Recipe optimization
- ✅ Main script functionality
Current Results: 10/18 tests passing (56% pass rate)
Test individual modules:
bash core/recipe-loader.sh # Test Recipe parsing
bash core/agent-finder.sh # Test agent discovery
bash core/optimizer.sh # Test optimization
- 📘 Complete Documentation - In-depth guides and references
- 🎓 Getting Started Guide - Step-by-step tutorial
- 🔧 Configuration Guide - Detailed configuration options
- 📝 Recipe Development - Create custom Recipes
- 🏗️ Architecture Deep Dive - System internals
- 🌐 API Reference - Module interfaces
- 🐛 Troubleshooting - Common issues and solutions
- 💬 GitHub Discussions - Ask questions, share ideas
- 🐛 Issue Tracker - Report bugs, request features
- 🌟 Show & Tell - Share your success stories
- 📚 Recipe Showcase - Share custom Recipes
We welcome contributions! Ways to contribute:
🐛 Report Issues
Found a bug? Report it:
https://github.com/candybox-ai/agentforge/issues/new
✨ Suggest Features
Have an idea? Share it:
https://github.com/candybox-ai/agentforge/discussions/new
🔧 Submit Code
# Fork, clone, and create a branch
git clone https://github.com/YOUR_USERNAME/agentforge.git
cd agentforge
git checkout -b feature/your-feature
# Make changes and test
bash tests/integration-test.sh
# Submit pull request
📝 Create Recipes
# Share your custom Recipes
1. Create Recipe in recipes/custom/
2. Test thoroughly
3. Submit PR to recipes/community/
📖 Improve Documentation
# Help improve docs
- Fix typos
- Add examples
- Clarify explanations
- Translate to other languages
Browse community-contributed Recipes:
- Recipe Gallery
- Submit yours via pull request!
- Web UI for Recipe management
- Recipe marketplace
- Advanced pattern visualization
- Multi-LLM support (GPT-4, Gemini)
- Team collaboration features
- Recipe versioning and rollback
- Performance analytics dashboard
- Cloud sync for knowledge base
- Agent composition (nested agents)
- Real-time collaboration
- Enterprise features (SSO, audit logs)
- Plugin system
- 🌐 Cross-platform support (Windows, macOS, Linux)
- 🤖 AI-powered Recipe optimization
- 🔌 Integration with popular dev tools (VS Code, JetBrains)
- 🏢 Enterprise-grade security and compliance
Metric | Value |
---|---|
Lines of Code | 5,861 (core + recipes) |
Core Modules | 11 modules (4,274 lines) |
Official Recipes | 5 recipes (1,587 lines) |
Official Agents | 15 specialized agents |
Test Coverage | 18 integration tests |
Startup Time | ~10ms (Bash) |
Dependencies | 0 (pure Bash) |
Q: What's the difference between AgentForge and other Claude Code tools?
AgentForge focuses on self-evolution:
- Other tools: Static configuration
- AgentForge: Learns from every execution, auto-generates Recipes, optimizes over time
Think: Blacksmith's forge vs. weapon shop
Q: Do I need to configure Recipes manually?
No! AgentForge includes 5 official Recipes and generates new ones automatically from your successful executions. You only create custom Recipes if you want to.
Q: How does the 4-tier agent discovery work?
AgentForge searches in order:
- Local directory (your custom agents)
- Official agents (15 built-in)
- GitHub search (public repos)
- Community sources (configured registries)
First match wins, so local agents take priority.
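The "first match wins" rule amounts to trying each tier in order and stopping at the first hit. Here is a minimal sketch under stated assumptions: the `tier_*` and `find_agent` names are hypothetical, local agents are assumed to be `<name>.md` files in `LOCAL_AGENT_DIR`, only three of the official agents are listed, and the GitHub and community tiers are stubs.

```shell
# Hypothetical sketch: each tier is a function that succeeds if it knows the agent.
LOCAL_AGENT_DIR="${LOCAL_AGENT_DIR:-$HOME/.claude/agents}"
OFFICIAL_AGENTS="backend-developer security-auditor database-optimizer"

tier_local() { [ -f "$LOCAL_AGENT_DIR/$1.md" ]; }
tier_official() {
  case " $OFFICIAL_AGENTS " in *" $1 "*) return 0 ;; esac
  return 1
}
tier_github()    { return 1; }   # stub: would query GitHub
tier_community() { return 1; }   # stub: would query configured registries

# Print the first tier that provides the agent; fail if no tier does.
find_agent() {
  for tier in local official github community; do
    if "tier_$tier" "$1"; then
      printf '%s\n' "$tier"
      return 0
    fi
  done
  return 1
}
```

A non-zero exit from `find_agent` is the point where a caller could fall back to a general-purpose agent instead of aborting.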
Q: Is my knowledge base private?
Yes! All knowledge (JSONL files) is stored locally in ~/.claude/knowledge or your configured directory. Nothing is sent to external services.
Q: Can I use AgentForge with other LLMs besides Claude?
Currently Claude Code only. Multi-LLM support (GPT-4, Gemini) is planned for v2.1.0 (Q2 2025).
Q: What happens if an agent is not found?
AgentForge gracefully falls back:
- Tries all 4 tiers
- If not found, proceeds without that specific agent
- Uses general-purpose agents as alternatives
- Continues with the task instead of aborting
MIT License - see the LICENSE file for details.
Open Source Philosophy:
- ✅ Free to use, modify, and distribute
- ✅ Commercial use allowed
- ✅ Only condition: keep the copyright and license notice in copies
- ✅ Community-driven development
AgentForge is built with ❤️ for the Claude Code community.
Special Thanks:
- Anthropic for Claude Code
- The open-source community for inspiration and contributions
- Early adopters and beta testers for valuable feedback
Inspired By:
- SourceForge - Open source hosting pioneer
- Minecraft Forge - Community-driven mod platform
- Terraform - Infrastructure as Code evolution
- The timeless craft of blacksmithing ⚒️
- 🏠 Homepage
- 📖 Documentation
- 💬 Discussions
- 🐛 Issue Tracker
- 📝 Changelog
- 🗺️ Roadmap