The Multi-Agent Deep Researcher is a modular workflow system where specialized agents collaborate to perform in-depth research on any prompt. Agents are coordinated by an orchestration layer, store context in a vector DB, and iteratively refine results through structured communication.
Key use cases:

- Market Research: competitor analysis, trends, reports
- Academic Research: summarizing and synthesizing papers
- Tech Research: emerging tools, frameworks, patents

Key features:

- Multi-Agent Architecture (search, summarizer, evaluator, synthesizer)
- Orchestration Layer via LangChain / MCP / custom orchestrator
- Memory + RAG with FAISS / Pinecone / Weaviate
- Error Recovery: retries, graceful fallbacks, logging
- Extensible API (FastAPI backend with REST endpoints)
- Dockerized for reproducible environments
- CI/CD with GitHub Actions
*Screenshot: Swagger UI (`/docs`)*

*Example response from the Deep Researcher system*
```text
deep-researcher/
├── README.md
├── docs/
│   ├── architecture.md
│   ├── agents.md
│   └── dev-setup.md
├── src/
│   ├── orchestrator/          # orchestration layer
│   │   └── orchestrator.py
│   ├── agents/                # agent implementations
│   │   ├── searcher.py
│   │   ├── extractor.py
│   │   ├── summarizer.py
│   │   ├── critic.py
│   │   └── synthesizer.py
│   ├── storage/
│   │   ├── vector_store.py
│   │   └── metadata_db.py
│   └── api/
│       └── server.py          # FastAPI app
├── tests/
├── Dockerfile
├── docker-compose.yml
├── .github/workflows/ci.yml
└── examples/
    └── demo_prompt.md
```
**Searcher**
- Inputs: research prompt
- Actions: query the web, scholarly APIs (arXiv, CrossRef), news sources, and internal corpora
- Outputs: raw documents, URLs, metadata, confidence scores

**Extractor**
- Inputs: raw documents (HTML, PDF, text)
- Actions: OCR (if needed), text extraction, chunking, metadata extraction (title, authors, date, section headings)
- Outputs: chunks with embeddings, ready for vector store ingestion

**Summarizer**
- Inputs: retrieved chunks or raw text
- Actions: produce concise summaries at multiple granularities (sentence, paragraph, section)
- Outputs: summaries with provenance (source pointers)

**Critic / Evaluator**
- Inputs: summaries / synthesized content
- Actions: fact-check against source documents, evaluate reasoning quality, flag hallucinations, rank items by novelty/relevance
- Outputs: critiques, suggested revisions, confidence metrics

**Synthesizer**
- Inputs: summaries + critiques
- Actions: produce the final structured deliverable (executive summary, literature review, gaps & opportunities, recommended reading list)
- Outputs: report (Markdown), references (structured), action items

**Planner (optional)**
- Builds multi-step plans and decides whether to re-run the Searcher/Extractor on uncovered gaps.

**Monitor / Supervisor**
- Observes workflows, retries failed tasks, manages rate limits, and raises alerts.
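Because every agent consumes a payload and returns outputs plus provenance and a confidence score, the orchestrator can stay agnostic to any one implementation. A minimal sketch of one way to encode that shared contract in Python (the `Agent` protocol and `AgentResult` names here are illustrative, not the repo's actual classes):

```python
from dataclasses import dataclass, field
from typing import Any, Protocol

@dataclass
class AgentResult:
    """Uniform envelope every agent returns to the orchestrator (illustrative)."""
    outputs: dict[str, Any]                                          # agent-specific payload
    provenance: list[dict[str, Any]] = field(default_factory=list)   # source pointers
    confidence: float = 1.0                                          # 0.0-1.0, read by Critic/Monitor

class Agent(Protocol):
    name: str

    async def run(self, inputs: dict[str, Any]) -> AgentResult:
        """Consume the upstream payload and return outputs plus provenance."""
        ...
```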
- Message format (JSON)

```json
{
  "task_id": "uuid",
  "agent": "searcher",
  "prompt": "Find recent papers on retrieval-augmented generation",
  "inputs": {},
  "outputs": {},
  "status": "queued|running|done|failed",
  "meta": {"timestamp": "2025-08-01T12:00:00Z"}
}
```
- Transport
  - Simple mode: in-process orchestrator with async function calls (asyncio); see the sketch below
  - Distributed mode: RabbitMQ / Redis Streams / Celery / MCP (message passing)
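In simple mode the whole pipeline is just a chain of awaited agent calls in one process. A rough sketch, reusing the illustrative `Agent` contract above (the agent instances and output keys are assumptions, not the repo's actual API):

```python
import asyncio

async def run_pipeline(prompt: str) -> str:
    """Drive one research run end to end; each stage feeds the next."""
    found = await searcher.run({"prompt": prompt})            # raw docs + metadata
    chunks = await extractor.run(found.outputs)               # chunked, embedded text
    summary = await summarizer.run(chunks.outputs)            # summaries + provenance
    critique = await critic.run(summary.outputs)              # critiques + confidence
    final = await synthesizer.run({**summary.outputs, **critique.outputs})
    return final.outputs["report_markdown"]                   # assumed output key

# asyncio.run(run_pipeline("Latest research on quantum machine learning"))
```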
- Provenance
  - Every chunk and summary must include `source_id`, `source_url`, `page`, `bbox` (if from a PDF), and `timestamp`.
- Idempotency & retries
  - Tasks carry a `task_id` and an `attempts` counter. The orchestrator retries transient errors up to `N` attempts.
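A sketch of what that retry policy can look like, with exponential backoff and a capped attempt count (the `MAX_ATTEMPTS` value and task dict shape are assumptions, not the repo's actual code):

```python
import asyncio
import random

MAX_ATTEMPTS = 3  # assumed cap; the orchestrator's N may differ

async def run_with_retries(agent, task: dict) -> "AgentResult":
    """Retry transient failures with exponential backoff; re-raise once exhausted."""
    while True:
        task["attempts"] = task.get("attempts", 0) + 1
        try:
            return await agent.run(task["inputs"])
        except Exception:
            if task["attempts"] >= MAX_ATTEMPTS:
                task["status"] = "failed"   # permanent failure: escalate to Monitor
                raise
            # exponential backoff with jitter: ~1s, ~2s, ~4s, ...
            delay = 2 ** (task["attempts"] - 1) + random.random()
            await asyncio.sleep(delay)
```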
- Vector store
  - Each chunk has an embedding vector plus metadata. Use sentence-transformers embeddings or OpenAI/Azure embeddings.
  - Store in Pinecone / Weaviate / FAISS (local for demo).
- Short-term vs long-term memory
  - Short-term context: session-level cache holding the current prompt and recent messages (kept in memory or in Redis with a TTL)
  - Long-term memory: vector DB plus metadata DB for persistent artifacts (reports, raw documents, citations)
- Context windowing
  - Retrieval: top-k nearest neighbors with metadata filtering (e.g., date range, domain).
  - Build the RAG context by concatenating the highest-quality chunks with provenance.
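For the local demo path, ingestion and top-k retrieval can look roughly like this. A minimal sketch assuming `sentence-transformers` and `faiss`; the model choice and the in-memory metadata sidecar are illustrative:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model

# Ingestion: embed chunks and add them to a FAISS index with sidecar metadata.
chunks = ["RAG combines retrieval with generation...", "FAISS supports ANN search..."]
embeddings = model.encode(chunks, normalize_embeddings=True)

index = faiss.IndexFlatIP(embeddings.shape[1])   # inner product == cosine on normalized vectors
index.add(np.asarray(embeddings, dtype="float32"))
metadata = [{"source_id": i, "text": t} for i, t in enumerate(chunks)]

# Retrieval: top-k nearest neighbors for the current prompt.
query = model.encode(["retrieval-augmented generation"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), 2)
context = [metadata[i] for i in ids[0]]          # feed these chunks into the RAG prompt
```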
- Transient failures: retry policy with exponential backoff, capped attempts
- Permanent failures: mark the task `failed` and escalate to the Monitor, which can notify via Slack/email
- Partial results: allow downstream agents to operate on partial outputs; track a completeness score
- Logging: structured logs (JSON) with request IDs and spans for tracing (OpenTelemetry compatible)
- Metrics: Prometheus metrics for task latencies, failure rates, queue depth
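A sketch of how those metrics can be wired up with `prometheus_client` (the metric names and port are assumptions, not the repo's actual instrumentation):

```python
from prometheus_client import Counter, Histogram, start_http_server

# Illustrative metric names; the real instrumentation may differ.
TASK_LATENCY = Histogram("task_latency_seconds", "Per-task latency", ["agent"])
TASK_FAILURES = Counter("task_failures_total", "Failed tasks", ["agent"])

def record_task(agent_name: str, duration_s: float, failed: bool) -> None:
    """Record one task's latency and, if it failed, bump the failure counter."""
    TASK_LATENCY.labels(agent=agent_name).observe(duration_s)
    if failed:
        TASK_FAILURES.labels(agent=agent_name).inc()

start_http_server(9100)  # expose /metrics for Prometheus to scrape
```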
```mermaid
flowchart LR
    A[User Prompt]
    A --> Orchestrator[Orchestration Layer]
    Orchestrator --> Searcher[Searcher]
    Searcher --> Extractor[Extractor]
    Extractor --> VectorDB[(Vector DB)]
    Orchestrator --> Retriever[Retriever]
    Retriever --> Summarizer[Summarizer]
    Summarizer --> Critic[Critic]
    Critic --> Synthesizer[Synthesizer]
    Synthesizer --> Output[Final Report]
    Output -->|Store| VectorDB
    Synthesizer -->|Feedback| Orchestrator
```
```mermaid
sequenceDiagram
    participant U as User
    participant O as Orchestrator
    participant S as Searcher
    participant X as Extractor
    participant V as VectorDB
    participant SUM as Summarizer
    participant C as Critic
    participant SYN as Synthesizer
    U->>O: submit prompt
    O->>S: search
    S->>X: found docs
    X->>V: store chunks
    O->>V: retrieve context
    V->>SUM: context
    SUM->>C: summary
    C->>O: critique
    O->>SYN: synthesize (with critique)
    SYN->>U: report
    alt Critic requests more search
        O->>S: additional search
    end
```
```bash
# Clone the repo and set up a local environment
git clone https://github.com/PranithChowdary/deep-researcher.git
cd deep-researcher
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Start the API server
uvicorn src.api.server:app --reload
```
```bash
curl -X POST "http://127.0.0.1:8000/research" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Latest research on quantum machine learning"}'
```
```bash
pytest tests/ -v
```
```bash
docker build -t deep-researcher .
docker run -p 8000:8000 deep-researcher
```
- `Dockerfile` for the app
- `docker-compose.yml` for local dev (FastAPI + FAISS + Redis)
- GitHub Actions: `ci.yml` runs lint and unit tests, builds the Docker image, and pushes it to GHCR
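A minimal sketch of what that workflow can look like (job layout, lint tool, and action versions are assumptions; see `.github/workflows/ci.yml` for the actual pipeline):

```yaml
# Illustrative CI workflow; the repo's ci.yml may differ.
name: ci
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: ruff check src/            # lint (tool choice assumed)
      - run: pytest tests/ -v           # unit tests
      - run: docker build -t deep-researcher .
```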
- `architecture.md`: expanded diagrams, component responsibilities, tradeoffs
- `agents.md`: each agent's input/output contract, sample prompts/templates, failure modes
- `dev-setup.md`: environment variables, local vs cloud vector store options
- `runbook.md`: monitoring, incident response, scaling recommendations
- Unit tests for agent contracts
- Integration test: end-to-end flow with a mocked LLM and local FAISS
- Evaluation metrics: ROUGE/BERTScore for summary quality, a human-evaluation checklist, latency and cost per run
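A unit test for an agent contract can be as small as the following sketch (the fake agent and assertions are illustrative; assumes the `AgentResult` contract sketched earlier and the `pytest-asyncio` plugin):

```python
import pytest

class FakeSearcher:
    """Stub agent that honors the illustrative contract without network calls."""
    name = "searcher"

    async def run(self, inputs):
        return AgentResult(outputs={"docs": []}, confidence=0.5)

@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_searcher_honors_contract():
    result = await FakeSearcher().run({"prompt": "quantum ML"})
    assert "docs" in result.outputs
    assert 0.0 <= result.confidence <= 1.0
```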
- Rate-limit external scrapers; respect robots.txt and copyright. Use only public data or licensed databases.
- Sanitize PII before storing it in the vector DB. Provide deletion endpoints for user data.
- Route low-confidence or safety-flagged outputs to human review.