
# πŸŒͺ️ STORM Research Assistant


> **STORM** (Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking): a writing system for generating grounded, organized long-form articles from scratch, with breadth and depth comparable to Wikipedia pages.

## πŸ“– Overview

STORM Research Assistant is a LangGraph-based implementation of the STORM methodology from Stanford, designed to write grounded and organized long-form articles from scratch. The system models the pre-writing stage by (1) discovering diverse perspectives for researching the given topic, (2) simulating conversations where writers with different perspectives pose questions to a topic expert grounded on trusted Internet sources, and (3) curating the collected information to create an outline before generating the final article.

## 🎯 Key Features

- πŸ” **Pre-writing Stage Modeling**: Comprehensive research and outline preparation before article generation
- πŸ€– **Diverse Perspective Discovery**: Automatic generation of multiple expert perspectives for comprehensive topic coverage
- πŸ’¬ **Simulated Expert Conversations**: Multi-perspective question asking with grounded answers from trusted sources
- πŸ“š **Grounded Information**: All content backed by reliable Internet sources (Tavily web search and ArXiv papers)
- πŸ“Š **Structured Outline Creation**: Systematic curation of collected information into organized outlines
- ✏️ **Long-form Article Generation**: Wikipedia-quality articles with introduction, detailed sections, and conclusion
- πŸ”„ **User Feedback Integration**: Human-in-the-loop capability for refining analyst perspectives
- ⚑ **Parallel Processing**: Simultaneous execution of multiple perspective interviews for efficiency (see the sketch after this list)
- 🎨 **LangGraph Studio Support**: Full integration with LangGraph Studio for visual debugging

πŸ—οΈ Architecture

System Structure

πŸ“ src/storm_research/
β”œβ”€β”€ πŸ“„ __init__.py          # Package initialization
β”œβ”€β”€ 🧠 graph.py            # LangGraph graph definition (main logic)
β”œβ”€β”€ πŸ“Š state.py            # State and data model definitions
β”œβ”€β”€ πŸ’¬ prompts.py          # Prompt templates
β”œβ”€β”€ βš™οΈ configuration.py     # System configuration management
β”œβ”€β”€ πŸ”§ tools.py            # Search tool implementations
└── πŸ› οΈ utils.py            # Utility functions
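
As a rough sketch, the state defined in `state.py` might resemble the following. The field names are guesses inferred from the workflow and the usage example below, not the actual definitions:

```python
# Illustrative guess at the graph state; the real definitions live in
# src/storm_research/state.py.
from typing import List, TypedDict


class Analyst(TypedDict):
    name: str         # e.g. "AI Architecture Researcher"
    perspective: str  # the angle this analyst researches the topic from


class ResearchState(TypedDict):
    topic: str                   # topic to research
    max_analysts: int            # how many perspectives to generate
    human_analyst_feedback: str  # optional human-in-the-loop feedback
    analysts: List[Analyst]      # generated expert perspectives
    sections: List[str]          # drafted article sections
    final_report: str            # assembled Wikipedia-style article
```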

### Workflow

```mermaid
graph TD
    A[Start] --> B[Discover Diverse Perspectives]
    B --> C[Generate Expert Analysts]
    C --> D{User Feedback?}
    D -->|Has Feedback| C
    D -->|No Feedback| E[Simulate Expert Conversations]
    E --> F1[Perspective 1: Q&A with Expert]
    E --> F2[Perspective 2: Q&A with Expert]
    E --> F3[Perspective 3: Q&A with Expert]
    F1 --> G1[Ground Answers in Sources]
    F2 --> G2[Ground Answers in Sources]
    F3 --> G3[Ground Answers in Sources]
    G1 --> H[Curate Information]
    G2 --> H
    G3 --> H
    H --> I[Create Structured Outline]
    I --> J[Generate Article Sections]
    J --> K[Write Introduction]
    J --> L[Write Conclusion]
    K --> M[Final Wikipedia-style Article]
    L --> M
    M --> N[End]
```
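
The "User Feedback?" branch in the diagram maps naturally onto a LangGraph conditional edge. A minimal sketch, with node names assumed for illustration (the real ones are defined in `graph.py`):

```python
# Hypothetical routing function for the feedback branch in the diagram above.
# Node names are illustrative; see src/storm_research/graph.py for the real graph.
def route_after_feedback(state: dict) -> str:
    if state.get("human_analyst_feedback"):
        return "create_analysts"   # feedback given: regenerate the analysts
    return "conduct_interviews"    # no feedback: proceed to the interviews
```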

## πŸš€ Installation & Setup

### Prerequisites

- Python 3.11 or higher
- uv package manager
- API keys for your chosen LLM provider(s)

### 1. Clone the Repository

```bash
git clone https://github.com/teddynote-lab/STORM-Research-Assistant.git
cd STORM-Research-Assistant
```

### 2. Environment Setup

```bash
# Create a virtual environment using uv
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
uv pip install -e .

# Install development dependencies
uv pip install -e ".[dev]"
```

### 3. Environment Variables

Create a `.env` file in the root directory and configure the following API keys:

```bash
# LangSmith for tracing
LANGSMITH_PROJECT=STORM-Research-Assistant
LANGSMITH_API_KEY=your_langsmith_api_key

# Required API keys
TAVILY_API_KEY=your_tavily_api_key

# LLM provider API keys (choose one or more)
# OpenAI
OPENAI_API_KEY=your_openai_api_key

# Anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key

# Azure OpenAI
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
```
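
If you drive the graph from your own Python script rather than LangGraph Studio, one way to load these variables is `python-dotenv` (an assumption; install it separately if the project does not already depend on it):

```python
# Optional: load the .env file when running outside LangGraph Studio.
# Assumes python-dotenv is installed (e.g. `uv pip install python-dotenv`).
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
```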

### 4. Running LangGraph Studio

```bash
# Install the LangGraph CLI (one-time setup)
pip install "langgraph-cli[inmem]"

# Run LangGraph Studio
uv run langgraph dev
```

Access the Studio UI at http://localhost:2024.
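
Once the dev server is running, you can also call it programmatically with the LangGraph SDK. A sketch, assuming default settings; the assistant name used here (`storm_research`) is a guess, so check `langgraph.json` for the graph's registered name:

```python
# Hypothetical client call against the local dev server.
# The assistant name is an assumption; check langgraph.json for the real one.
import asyncio

from langgraph_sdk import get_client


async def main():
    client = get_client(url="http://localhost:2024")
    thread = await client.threads.create()
    async for chunk in client.runs.stream(
        thread["thread_id"],
        "storm_research",  # assistant/graph name registered in langgraph.json
        input={"topic": "The Future of Quantum Computing in Cryptography"},
        stream_mode="values",
    ):
        print(chunk.event)


asyncio.run(main())
```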

πŸ“ Usage

### Basic Usage

```python
from storm_research import graph
from langchain_core.runnables import RunnableConfig

# Configuration
config = RunnableConfig(
    configurable={
        "thread_id": "research-001",
        "model": "openai/gpt-4.1",  # provider/model format
        "max_analysts": 3,
        "max_interview_turns": 3,
    }
)

# Input for article generation
inputs = {
    "topic": "The Future of Quantum Computing in Cryptography",
    "max_analysts": 3,
}

# First step: discover perspectives and generate analysts
# (the graph pauses at the human feedback checkpoint)
result = await graph.ainvoke(inputs, config)

# Optionally provide feedback to refine the analyst perspectives
await graph.aupdate_state(
    config,
    {"human_analyst_feedback": "Please add a cybersecurity expert perspective"},
    as_node="human_feedback",
)

# Resume: complete the pre-writing stage and generate the article
final_result = await graph.ainvoke(None, config)
print(final_result["final_report"])
```
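
The snippet above uses top-level `await`, which works in notebooks and async REPLs. In a plain Python script, wrap the calls in an async function:

```python
# Same flow as above, packaged for a plain script.
# Reuses graph, inputs, and config from the snippet above.
import asyncio


async def main():
    # Runs until the human feedback checkpoint
    await graph.ainvoke(inputs, config)
    # Resumes and finishes the article
    final_result = await graph.ainvoke(None, config)
    print(final_result["final_report"])


asyncio.run(main())
```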

### Configuration Options

| Setting | Default | Description |
| --- | --- | --- |
| `model` | `azure/gpt-4.1` | LLM model to use (`provider/model` format) |
| `max_analysts` | `3` | Number of analysts to generate |
| `max_interview_turns` | `3` | Maximum interview turns per analyst |
| `tavily_max_results` | `3` | Number of Tavily search results |
| `arxiv_max_docs` | `3` | Number of ArXiv documents to retrieve |
| `parallel_interviews` | `True` | Whether to run interviews in parallel |
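
All of these settings can presumably be passed the same way `model` and `max_analysts` are in the usage example above, via the `configurable` dict. The values shown here are the defaults from the table:

```python
# Example configuration overriding every documented setting.
# Assumes all settings are read from the configurable dict, as in Basic Usage.
from langchain_core.runnables import RunnableConfig

config = RunnableConfig(
    configurable={
        "thread_id": "research-002",
        "model": "azure/gpt-4.1",  # provider/model format
        "max_analysts": 3,
        "max_interview_turns": 3,
        "tavily_max_results": 3,
        "arxiv_max_docs": 3,
        "parallel_interviews": True,
    }
)
```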

### Supported Models

- Azure OpenAI: `azure/gpt-4.1`, `azure/gpt-4.1-mini`, `azure/gpt-4.1-nano`
- OpenAI: `openai/gpt-4.1`, `openai/gpt-4.1-mini`, `openai/gpt-4.1-nano`
- Anthropic: `anthropic/claude-opus-4-20250514`, `anthropic/claude-3-7-sonnet-latest`, `anthropic/claude-3-5-haiku-latest`

## πŸ“š Examples

### Technology Research

```python
topic = "Next-Generation AI Architectures: Beyond Transformers"
```

Generated analysts might include:

- AI Architecture Researcher
- Hardware Optimization Expert
- Industry Applications Specialist

### Business Analysis

```python
topic = "The Impact of AI on Global Supply Chain Management in 2024"
```

Generated analysts might include:

- Supply Chain Expert
- AI Technology Analyst
- Business Strategy Consultant

### Academic Research

```python
topic = "Quantum Error Correction Methods for Scalable Quantum Computing"
```

Generated analysts might include:

- Quantum Physics Researcher
- Error Correction Specialist
- Hardware Implementation Expert

## πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

## πŸ“ž Support
