Multi-agent scaling through intelligent collaboration in Grok Heavy style
MassGen is a cutting-edge multi-agent system that leverages the power of collaborative AI to solve complex tasks. It assigns a task to multiple AI agents that work in parallel, observe each other's progress, and refine their approaches until they converge on a comprehensive, high-quality solution. The power of this "parallel study group" approach is exemplified by advanced systems like xAI's Grok Heavy and Google DeepMind's Gemini Deep Think. This project started with the "threads of thought" and "iterative refinement" ideas presented in The Myth of Reasoning, and extends the classic "multi-agent conversation" idea in AG2. Here is a video recording of the background introduction presented at the Berkeley Agentic AI Summit 2025.
Feature | Description |
---|---|
Cross-Model/Agent Synergy | Harness strengths from diverse frontier model-powered agents |
Parallel Processing | Multiple agents tackle problems simultaneously |
Intelligence Sharing | Agents share and learn from each other's work |
Consensus Building | Natural convergence through collaborative refinement |
Live Visualization | See agents' working processes in real-time |
MassGen operates through an architecture designed for seamless multi-agent collaboration:
graph TB
O[MassGen Orchestrator<br/>Task Distribution & Coordination]
subgraph Collaborative Agents
A1[Agent 1<br/>Anthropic/Claude + Tools]
A2[Agent 2<br/>Google/Gemini + Tools]
A3[Agent 3<br/>OpenAI/GPT/O + Tools]
A4[Agent 4<br/>xAI/Grok + Tools]
end
H[Shared Collaboration Hub<br/>Real-time Notification & Consensus]
O --> A1 & A2 & A3 & A4
A1 & A2 & A3 & A4 <--> H
classDef orchestrator fill:#e1f5fe,stroke:#0288d1,stroke-width:3px
classDef agent fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef hub fill:#e8f5e8,stroke:#388e3c,stroke-width:2px
class O orchestrator
class A1,A2,A3,A4 agent
class H hub
The system's workflow is defined by the following key principles:
Parallel Processing - Multiple agents tackle the same task simultaneously, each leveraging their unique capabilities (different models, tools, and specialized approaches).
Real-time Collaboration - Agents continuously share their working summaries and insights through a notification system, allowing them to learn from each other's approaches and build upon collective knowledge.
Convergence Detection - The system intelligently monitors when agents have reached stability in their solutions and achieved consensus through natural collaboration rather than forced agreement.
Adaptive Coordination - Agents can restart and refine their work when they receive new insights from others, creating a dynamic and responsive problem-solving environment.
This collaborative approach ensures that the final output leverages collective intelligence from multiple AI systems, leading to more robust and well-rounded results than any single agent could achieve alone.
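The principles above can be illustrated with a small, self-contained sketch. This is not MassGen's implementation: `Hub`, `toy_agent`, and `coordinate` are hypothetical names, the "agents" are deterministic stand-ins rather than model calls, and sharing happens once per round instead of continuously. It only shows the shape of the loop: run agents in parallel, share their answers, and stop once the answers are stable and in agreement.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Hypothetical shared hub that stores each agent's latest answer."""
    answers: dict = field(default_factory=dict)

    def publish(self, agent_id: str, answer: str) -> None:
        self.answers[agent_id] = answer

    def snapshot(self) -> dict:
        return dict(self.answers)


async def toy_agent(agent_id: str, task: str, peers: dict) -> str:
    """Stand-in for a model call: keep the 'best' answer seen so far."""
    await asyncio.sleep(0)  # yield control, as a real API call would
    own = f"{agent_id}: draft for {task!r}"
    candidates = list(peers.values()) + [own]
    # Toy quality measure (an assumption): longest answer, ties broken by text.
    return max(candidates, key=lambda s: (len(s), s))


async def coordinate(task: str, agent_ids: list, max_rounds: int = 5):
    hub = Hub()
    for round_no in range(1, max_rounds + 1):
        before = hub.snapshot()
        # Parallel processing: every agent sees the same shared snapshot.
        results = await asyncio.gather(
            *(toy_agent(agent_id, task, before) for agent_id in agent_ids)
        )
        for agent_id, answer in zip(agent_ids, results):
            hub.publish(agent_id, answer)
        after = hub.snapshot()
        stable = after == before                  # nobody changed their answer
        agreed = len(set(after.values())) == 1    # everyone gives the same answer
        if stable and agreed:
            return results[0], round_no
    return None, max_rounds


if __name__ == "__main__":
    print(asyncio.run(coordinate("2+2", ["claude", "gemini", "gpt"])))
```

In this toy setup the agents agree in the second round, and the third round confirms that nothing changed, which is when the loop stops.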
git clone https://github.com/Leezekun/MassGen.git
cd MassGen
pip install uv
uv venv
Create a .env file in the massgen directory with your API keys:
# Copy example configuration
cp .env.example .env
# Edit with your API keys
ANTHROPIC_API_KEY=your-anthropic-key-here
GEMINI_API_KEY=your-gemini-key-here
OPENAI_API_KEY=your-openai-key-here
XAI_API_KEY=your-xai-key-here
Make sure you set up the API key for the model you want to use.
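As a quick sanity check before a run, the sketch below reports which API keys are missing for the backends you plan to use. The variable names come from the .env example above; the mapping from backend name to variable and the helper `missing_keys` are assumptions made for illustration, not part of MassGen. The check reads the process environment only and does not load the .env file itself.

```python
import os

# Assumed mapping from backend name to the environment variable it needs.
PROVIDER_KEYS = {
    "claude": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "grok": "XAI_API_KEY",
}


def missing_keys(backends, env=None):
    """Return the names of unset (or empty) API key variables."""
    env = os.environ if env is None else env
    return [PROVIDER_KEYS[b] for b in backends if not env.get(PROVIDER_KEYS[b])]


if __name__ == "__main__":
    print(missing_keys(["claude", "gemini", "openai", "grok"]))
```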
MassGen now supports GPT-5 series models & GPT-OSS models!
The system currently supports major model providers with advanced reasoning capabilities: Anthropic Claude, Cerebras, Google Gemini, OpenAI, and xAI Grok. GPT-OSS models can be accessed through the Cerebras backend. Contributions that add more providers or local inference of open-weight models (using vLLM or SGLang) are welcome.
MassGen agents can leverage various tools to enhance their problem-solving capabilities. Claude, Gemini, and OpenAI models support built-in web search and code execution. Grok supports web search as well, but it does not currently offer native code execution at the model level.
Supported Built-in Tools by Models:
Backend | Live Search | Code Execution | Example Models |
---|---|---|---|
Claude | ✅ | ✅ | Claude-4-Opus |
Gemini | ✅ | ✅ | Gemini-2.5 |
Grok | ✅ | ❌ | Grok-4 |
OpenAI | ✅ | ✅ | GPT-5 |
Others (Cerebras...) | ❌ | ❌ | GPT-OSS-120B |
uv run python -m massgen.cli --model gemini-2.5-flash "Which AI won IMO in 2025?"
uv run python -m massgen.cli --model gpt-5-mini "Which AI won IMO in 2025?"
uv run python -m massgen.cli --model grok-3-mini "Which AI won IMO in 2025?"
uv run python -m massgen.cli --backend chatcompletion --model gpt-oss-120b --base-url https://api.cerebras.ai/v1/chat/completions "Which AI won IMO in 2025?"
All models that can be directly accessed using the --model parameter can be found here.
Other models can be used with the --backend parameter, the --model parameter, and optionally the --base-url parameter (e.g., GPT-OSS-120B).
# Use configuration file
uv run python -m massgen.cli --config three_agents_default.yaml "Compare different approaches to renewable energy"
All available quick configuration files can be found here.
Parameter | Description |
---|---|
--config | Path to YAML configuration file with agent definitions, model parameters, backend parameters and UI settings. |
--backend | Backend type for quick setup without a config file (chatcompletion, claude, gemini, grok or openai). |
--model | Model name for quick setup (e.g., gemini-2.5-flash, gpt-5-mini). See the list of supported models that do not require a backend to be specified. --config and --model are mutually exclusive: use one or the other. |
--base-url | Base URL for API endpoint (e.g., https://api.cerebras.ai/v1/chat/completions). |
--system-message | System prompt for the agent in quick setup mode. Ignored if --config is provided. |
--no-display | Disable the real-time streaming coordination display (falls back to simple text output). |
--no-logs | Disable real-time logging. |
"<your question>" | Optional single-question input; if omitted, MassGen enters interactive chat mode. |
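To make the relationships between these parameters concrete, here is a minimal argparse sketch that mirrors the table. It is an illustration only, not the actual massgen.cli parser: `build_parser` and the program name are hypothetical, and the `--base-url` spelling follows the example commands above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented parameters."""
    parser = argparse.ArgumentParser(prog="massgen-sketch")

    # --config and --model are documented as mutually exclusive.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("--config", help="Path to YAML configuration file")
    source.add_argument("--model", help="Model name for quick setup")

    parser.add_argument(
        "--backend",
        choices=["chatcompletion", "claude", "gemini", "grok", "openai"],
        help="Backend type for quick setup",
    )
    parser.add_argument("--base-url", help="Base URL for API endpoint")
    parser.add_argument("--system-message", help="System prompt (quick setup)")
    parser.add_argument("--no-display", action="store_true")
    parser.add_argument("--no-logs", action="store_true")

    # Optional question; when omitted, the tool would enter interactive mode.
    parser.add_argument("question", nargs="?")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    mode = "interactive" if args.question is None else "single question"
    print(f"mode: {mode}")
```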
MassGen supports YAML configuration files with the following structure (All available quick configuration files can be found here):
Single Agent Configuration:
Use the agent field to define a single agent with its backend and settings:
agent:
  id: "<agent_name>"
  backend:
    type: "chatcompletion" | "claude" | "gemini" | "grok" | "openai" # Type of backend
    model: "<model_name>" # Model name
    api_key: "<optional_key>" # API key for backend. Uses env vars by default.
  system_message: "..." # System message for the single agent
Multi-Agent Configuration:
Use the agents field to define multiple agents, each with its own backend and config:
agents: # Multiple agents (alternative to 'agent')
  - id: "<agent1 name>"
    backend:
      type: "chatcompletion" | "claude" | "gemini" | "grok" | "openai" # Type of backend
      model: "<model_name>" # Model name
      api_key: "<optional_key>" # API key for backend. Uses env vars by default.
    system_message: "..." # System message for this agent
  - id: "..."
    backend:
      type: "..."
      model: "..."
      ...
    system_message: "..."
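The structure above can be checked with a few lines of Python once the YAML has been loaded into a dictionary. The sketch below is illustrative rather than MassGen's own validation: `collect_agents` is a hypothetical helper, and it assumes that a configuration defines exactly one of agent or agents, which is how the "alternative to 'agent'" comment reads.

```python
ALLOWED_BACKENDS = {"chatcompletion", "claude", "gemini", "grok", "openai"}


def collect_agents(config: dict) -> list:
    """Return the agent entries of a loaded config as a list, validating them."""
    has_single = "agent" in config
    has_multi = "agents" in config
    if has_single == has_multi:
        raise ValueError("config must define exactly one of 'agent' or 'agents'")

    agents = [config["agent"]] if has_single else list(config["agents"])
    for entry in agents:
        backend_type = entry.get("backend", {}).get("type")
        if backend_type not in ALLOWED_BACKENDS:
            raise ValueError(
                f"agent {entry.get('id')!r}: unknown backend type {backend_type!r}"
            )
    return agents


if __name__ == "__main__":
    example = {
        "agents": [
            {"id": "a1", "backend": {"type": "gemini", "model": "gemini-2.5-flash"}},
            {"id": "a2", "backend": {"type": "openai", "model": "gpt-5-mini"}},
        ]
    }
    print([agent["id"] for agent in collect_agents(example)])
```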
Backend Configuration:
Detailed parameters for each agent's backend can be specified using the following configuration formats:
backend:
  type: "chatcompletion"
  model: "gpt-oss-120b" # Model name
  base_url: "https://api.cerebras.ai/v1/chat/completions" # Base URL for API endpoint
  api_key: "<optional_key>" # API key for backend. Uses env vars by default.
  temperature: 0.7 # Creativity vs consistency (0.0-1.0)
  max_tokens: 2500 # Maximum response length

backend:
  type: "claude"
  model: "claude-sonnet-4-20250514" # Model name
  api_key: "<optional_key>" # API key for backend. Uses env vars by default.
  temperature: 0.7 # Creativity vs consistency (0.0-1.0)
  max_tokens: 2500 # Maximum response length
  enable_web_search: true # Web search capability
  enable_code_execution: true # Code execution capability

backend:
  type: "gemini"
  model: "gemini-2.5-flash" # Model name
  api_key: "<optional_key>" # API key for backend. Uses env vars by default.
  temperature: 0.7 # Creativity vs consistency (0.0-1.0)
  max_tokens: 2500 # Maximum response length
  enable_web_search: true # Web search capability
  enable_code_execution: true # Code execution capability

backend:
  type: "grok"
  model: "grok-3-mini" # Model name
  api_key: "<optional_key>" # API key for backend. Uses env vars by default.
  temperature: 0.7 # Creativity vs consistency (0.0-1.0)
  max_tokens: 2500 # Maximum response length
  enable_web_search: true # Web search capability
  return_citations: true # Include search result citations
  max_search_results: 10 # Maximum search results to use
  search_mode: "auto" # Search strategy: "auto", "fast", "thorough"

backend:
  type: "openai"
  model: "gpt-5" # Model name
  api_key: "<optional_key>" # API key for backend. Uses env vars by default.
  temperature: 0.7 # Creativity vs consistency (0.0-1.0; not supported by GPT-5 series and o-series models)
  max_tokens: 2500 # Maximum response length (not supported by GPT-5 series and o-series models)
  text:
    verbosity: "medium" # Response detail level (low/medium/high; GPT-5 series models only)
  reasoning:
    effort: "high" # Reasoning depth (low/medium/high; GPT-5 series and o-series models only)
  enable_web_search: true # Web search capability. Note: reasoning and web search are mutually exclusive and can't be enabled at the same time
  enable_code_interpreter: true # Code interpreter capability
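The OpenAI block carries several constraints in its comments, which are easy to miss. The sketch below turns those comments into a small checker. It is a reading aid, not MassGen code: `openai_backend_warnings` is a hypothetical helper, and recognising model families by name prefix is an assumption.

```python
def openai_backend_warnings(backend: dict) -> list:
    """Return warnings for option combinations the comments describe as unsupported."""
    model = str(backend.get("model", ""))
    # Assumption: model families are recognised by name prefix only.
    is_gpt5 = model.startswith("gpt-5")
    is_o_series = len(model) > 1 and model[0] == "o" and model[1].isdigit()
    reasoning_model = is_gpt5 or is_o_series

    warnings = []
    if reasoning_model:
        for key in ("temperature", "max_tokens"):
            if key in backend:
                warnings.append(f"{key} is not supported by {model}")
    if "text" in backend and not is_gpt5:
        warnings.append("text.verbosity is only supported by GPT-5 series models")
    if "reasoning" in backend and not reasoning_model:
        warnings.append("reasoning is only supported by GPT-5 and o-series models")
    if "reasoning" in backend and backend.get("enable_web_search"):
        warnings.append("reasoning and enable_web_search cannot both be enabled")
    return warnings


if __name__ == "__main__":
    documented = {
        "type": "openai",
        "model": "gpt-5",
        "temperature": 0.7,
        "max_tokens": 2500,
        "text": {"verbosity": "medium"},
        "reasoning": {"effort": "high"},
        "enable_web_search": True,
        "enable_code_interpreter": True,
    }
    for warning in openai_backend_warnings(documented):
        print(warning)
```

Run against the documented block as written, the checker reports three warnings. That is expected: the block lists every available option for reference, so a real configuration should keep only the subset that applies to the chosen model.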
UI Configuration:
Configure how MassGen displays information and handles logging during execution:
ui:
  display_type: "rich_terminal" | "terminal" | "simple" # Display format for agent interactions
  logging_enabled: true | false # Enable/disable real-time logging
- display_type: Controls the visual presentation of agent interactions
  - "rich_terminal": Full-featured display with multi-region layout, live status updates, and colored output
  - "terminal": Standard terminal display with basic formatting and sequential output
  - "simple": Plain text output without any formatting or special display features
- logging_enabled: When true, saves detailed timestamps, agent outputs, and system status
Advanced Parameters:
# Global backend parameters
backend_params:
  temperature: 0.7
  max_tokens: 2000
  enable_web_search: true # Web search capability (all backends)
  enable_code_interpreter: true # OpenAI only
  enable_code_execution: true # Gemini/Claude only
MassGen supports an interactive mode where you can have ongoing conversations with the system:
# Start interactive mode with a single agent
uv run python -m massgen.cli --model gpt-5-mini
# Start interactive mode with configuration file
uv run python -m massgen.cli --config three_agents_default.yaml
Interactive Mode Features:
- Multi-turn conversations: Multiple agents collaborate to chat with you in an ongoing conversation
- Real-time feedback: Displays real-time agent and system status
- Clear conversation history: Type /clear to reset the conversation and start fresh
- Easy exit: Type /quit, /exit, /q, or press Ctrl+C to stop
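The command handling described in this list can be sketched as a small dispatcher. This is an illustration of the documented behavior, not the code MassGen uses: `handle_input` and its return values are hypothetical, and Ctrl+C handling is left out.

```python
EXIT_COMMANDS = {"/quit", "/exit", "/q"}


def handle_input(line: str, history: list) -> str:
    """Classify one line of user input and update the conversation history."""
    text = line.strip()
    if text in EXIT_COMMANDS:
        return "exit"
    if text == "/clear":
        history.clear()  # reset the conversation and start fresh
        return "cleared"
    if not text:
        return "empty"
    history.append(text)  # an ordinary message continues the conversation
    return "message"


if __name__ == "__main__":
    history = []
    for line in ["Hello", "Tell me more", "/clear", "/q"]:
        print(line, "->", handle_input(line, history), history)
```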
The system provides multiple ways to view and analyze results:
- Live Collaboration View: See agents working in parallel through a multi-region terminal display
- Status Updates: Real-time phase transitions, voting progress, and consensus building
- Streaming Output: Watch agents' reasoning and responses as they develop
All sessions are automatically logged with detailed information. The log files can be viewed through the UI.
agent_outputs/
├── agent_1.txt # The full logs by agent 1
├── agent_2.txt # The full logs by agent 2
├── agent_3.txt # The full logs by agent 3
└── system_status.txt # The full logs of system status
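If you prefer to inspect the logs outside the UI, a few lines of Python are enough to load them. The sketch below assumes only the file layout shown above; `collect_logs` is a hypothetical helper, and the location of the agent_outputs directory is passed in rather than assumed.

```python
from pathlib import Path


def collect_logs(log_dir) -> dict:
    """Read agent_*.txt and system_status.txt from log_dir into a dict."""
    root = Path(log_dir)
    logs = {}
    for path in sorted(root.glob("agent_*.txt")):
        logs[path.stem] = path.read_text(encoding="utf-8")
    status = root / "system_status.txt"
    if status.exists():
        logs["system_status"] = status.read_text(encoding="utf-8")
    return logs


if __name__ == "__main__":
    for name, text in collect_logs("agent_outputs").items():
        print(f"{name}: {len(text.splitlines())} lines")
```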
Here are a few examples of how you can use MassGen for different tasks:
To see how MassGen works in practice, check out these detailed case studies based on real session logs:
# Ask a question about a complex topic
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "what's best to do in Stockholm in October 2025"
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "give me all the talks on agent frameworks in Berkeley Agentic AI Summit 2025, note, the sources must include the word Berkeley, don't include talks from any other agentic AI summits"
# Generate a short story
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "Write a short story about a robot who discovers music."
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "How much does it cost to run HLE benchmark with Grok-4"
MassGen is currently in its foundational stage, with a focus on parallel, asynchronous multi-agent collaboration and orchestration. Our roadmap is centered on transforming this foundation into a highly robust, intelligent, and user-friendly system, while enabling frontier research and exploration. An earlier version of MassGen can be found here.
- Advanced Agent Collaboration: Exploring improved communication patterns and consensus-building protocols to improve agent synergy.
- Expanded Model, Tool & Agent Integration: Adding support for more models/tools/agents, including a wider range of tools like MCP Servers, and coding agents.
- Improved Performance & Scalability: Optimizing the streaming and logging mechanisms for better performance and resource management.
- Enhanced Developer Experience: Introducing a more modular agent design and a comprehensive benchmarking framework for easier extension and evaluation.
- Web Interface: Developing a web-based UI for better visualization and interaction with the agent ecosystem.
We welcome community contributions to help us achieve these goals.
Version 0.0.5 focuses primarily on Coding Agent Integration, introducing Claude Code CLI and Gemini CLI as powerful coding agents. Key enhancements include:
- Coding Agent Integration (Required): Seamless integration of Claude Code CLI and Gemini CLI with coding-specific tools and workflows
- Enhanced Backend Features (Optional): Improved error handling, health monitoring, and support for additional model providers
- Advanced CLI Features (Optional): Conversation save/load functionality, templates, export formats, and better multi-turn display
For detailed milestones and technical specifications, see the full v0.0.5 roadmap.
We welcome contributions! Please see our Contributing Guidelines for details.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Star this repo if you find it useful!
Made with ❤️ by the MassGen team