MAESTRO Logo

MAESTRO: Your Self-Hosted AI Research Assistant

License: AGPL v3 Version Docker

⚠️ Version 0.1.3 - BREAKING CHANGE (08/15/2025)

Complete migration from SQLite/ChromaDB to PostgreSQL with pgvector.

  • Action Required: If upgrading, you must rebuild from scratch with docker compose down -v
  • New Requirements: PostgreSQL with pgvector extension (included in Docker setup)
  • Security: All credentials now configurable via environment variables

MAESTRO is an AI-powered research platform you can host on your own hardware. It's designed to manage complex research tasks from start to finish in a collaborative, multi-user environment. Plan your research, let AI agents carry it out, and watch as they generate detailed reports based on your documents and sources from the web.

Final Draft

A New Way to Conduct Research

MAESTRO streamlines the research process with a unified, chat-driven workflow. Define your research goals, upload your source materials, and let a team of AI agents handle the heavy lifting. It's a powerful tool for anyone who works with large amounts of information, from academics and analysts to writers and developers.

Core Features

Manage Your Document Library

Upload and manage your PDF documents in a central library. MAESTRO's advanced Retrieval-Augmented Generation (RAG) pipeline is optimized for academic and technical papers, ensuring your AI agents have access to the right information.

Document Library

Create Focused Document Groups

Organize your library by creating document groups for specific projects. This allows you to direct the AI to pull information from a curated set of sources, ensuring relevance and accuracy in its research.

Document Groups

Customize Your Research Mission

Fine-tune the research process by setting specific parameters for the mission. You can define the scope, depth, and focus of the AI's investigation to match your exact needs.

Mission Settings

Chat with Your Documents and the Web

Use the chat interface to ask questions and get answers sourced directly from your documents or the internet. It's a powerful way to get quick insights or inspiration for your work.

Chat with Documents

Get Help from the Writing Assistant

The writing assistant works alongside you, ready to pull information from your library or the web to help you draft notes, summarize findings, or overcome writer's block.

Writing Assistant

Follow the Agent's Research Path

MAESTRO provides full transparency into the AI's process. You can see the research outline it develops and follow along as it explores different avenues of investigation.

Research Transparency

Review AI-Generated Notes

Let the research agent dive into your PDF collection or find new sources online. It will then synthesize the information and generate structured notes based on your research questions.

Automated Notes

Track Mission Progress in Detail

Keep a close eye on every step of the research mission. The system provides detailed, real-time tracking of agent activities and status updates.

Mission Tracking

Understand the Agent's Reasoning

The AI agents provide detailed reflection notes, giving you insight into their thought processes, the decisions they make, and the conclusions they draw from the data.

Agent Reflection

Get a Full Report with References

Based on the research plan and generated notes, a final draft will be generated, including references from your documents and internet sources.

How It Works: The WRITER Agentic Framework


MAESTRO is a sophisticated multi-agent system designed to automate complex research synthesis. Instead of a single AI model, MAESTRO employs a team of specialized AI agents that collaborate to plan, execute, critique, and write research reports.

This methodology ensures a structured, transparent, and rigorous process from the initial question to the final, evidence-based report.

The MAESTRO Research Lifecycle

graph TD
    subgraph User Interaction
        A["User Defines Mission"]
    end

    subgraph "Phase 1: Planning"
        B["Planning Agent<br>Creates Research Plan & Outline"]
    end

    subgraph "Phase 2: Research & Reflection"
        C["Research Agent<br>Gathers Information (RAG/Web)"]
        D["Reflection Agent<br>Critiques Findings & Identifies Gaps"]
        C --> D
        D -- "Revisions Needed? ↪" --> B
        D -- "Evidence Complete? ✔" --> E
    end

    subgraph "Phase 3: Writing & Reflection"
        E["Writing Agent<br>Drafts Report Sections"]
        F["Reflection Agent<br>Reviews Draft for Clarity"]
        E --> F
        F -- "Revisions Needed? ↪" --> E
        F -- "Draft Approved? ✔" --> G
    end

    subgraph "Phase 4: Finalization"
        G["Agent Controller<br>Composes Final Report"]
    end

    A --> B
    B --> C
    G --> H["User Receives Report"]

    style A fill:#e6e6fa,stroke:#333,stroke-width:1px
    style H fill:#e6e6fa,stroke:#333,stroke-width:1px
    style B fill:#f9f0ff,stroke:#333,stroke-width:2px
    style C fill:#e0f7fa,stroke:#333,stroke-width:2px
    style D fill:#fff0f5,stroke:#333,stroke-width:2px
    style E fill:#e8f5e9,stroke:#333,stroke-width:2px
    style F fill:#fff0f5,stroke:#333,stroke-width:2px
    style G fill:#fffde7,stroke:#333,stroke-width:2px

The Core Agent Team

MAESTRO's capabilities are driven by a team of specialized agents, each with a distinct role:

  • Agent Controller (The Orchestrator): Manages the entire mission, delegating tasks to the appropriate agents and ensuring the workflow progresses smoothly from one phase to the next.
  • Planning Agent (The Strategist): Takes the user's initial request and transforms it into a structured, hierarchical research plan and a report outline. This creates a clear roadmap for the mission.
  • Research Agent (The Investigator): Executes the research plan by gathering information. It uses its tools—the local RAG pipeline and web search—to find relevant evidence and organizes it into structured ResearchNote objects.
  • Reflection Agent (The Critical Reviewer): This is the key to MAESTRO's analytical depth. The Reflection Agent constantly reviews the work of other agents, identifying knowledge gaps, inconsistencies, or deviations from the plan. Its feedback drives the iterative loops that refine and improve the quality of the research.
  • Writing Agent (The Synthesizer): Takes the curated research notes and weaves them into a coherent, well-structured narrative that follows the report outline.

The Research Process: Iteration and Refinement

The research process is not linear; it's a series of iterative loops designed to simulate critical thinking and ensure a high-quality outcome.

  1. The Research-Reflection Loop: The Research Agent doesn't just gather information in one pass. After an initial round of research, the Reflection Agent steps in to critique the findings. It asks questions like:

    • Are there gaps in the evidence?
    • Do sources contradict each other?
    • Have new, unexpected themes emerged?

    Based on this critique, the Reflection Agent can recommend new research tasks or even prompt the Planning Agent to revise the entire plan. This loop continues until the evidence is comprehensive and robust. The number of iterations is dynamic and depends on the complexity of the topic.
  2. The Writing-Reflection Loop: Drafting is also an iterative process. Once the Writing Agent produces a section of the report, the Reflection Agent reviews it for:

    • Clarity and Coherence: Is the argument easy to follow?
    • Logical Flow: Are the ideas connected logically?
    • Fidelity to Sources: Does the writing accurately represent the evidence in the ResearchNotes?

    The Writing Agent then revises the draft based on this feedback. This loop repeats until the writing meets the required standard of quality and accuracy.

This structured, reflective, and iterative process allows MAESTRO to move beyond simple information aggregation and produce sophisticated, reliable, and auditable research syntheses.
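The research-reflection loop described above can be sketched in simplified Python. This is illustrative only: the class names, method names, and stopping condition are assumptions for the sketch, not MAESTRO's actual implementation.

```python
# Illustrative sketch of the research-reflection loop.
# All names (ResearchAgent, ReflectionAgent, research_loop) are
# hypothetical; the real logic lives in the MAESTRO backend.

from dataclasses import dataclass, field


@dataclass
class ResearchNote:
    question: str
    evidence: list[str] = field(default_factory=list)


class ResearchAgent:
    def gather(self, question: str, round_: int) -> ResearchNote:
        # A real agent would query the RAG pipeline and web search here.
        return ResearchNote(question, evidence=[f"finding-{round_}"])


class ReflectionAgent:
    def critique(self, notes: list[ResearchNote]) -> list[str]:
        # A real agent would use an LLM to find gaps and contradictions.
        # Here we simply declare the evidence sufficient after three rounds.
        return [] if len(notes) >= 3 else ["need more evidence"]


def research_loop(question: str, max_rounds: int = 10) -> list[ResearchNote]:
    researcher, reviewer = ResearchAgent(), ReflectionAgent()
    notes: list[ResearchNote] = []
    for round_ in range(1, max_rounds + 1):
        notes.append(researcher.gather(question, round_))
        gaps = reviewer.critique(notes)
        if not gaps:  # evidence judged complete -> hand off to writing
            break
        # otherwise the gaps become new research tasks for the next round
    return notes


notes = research_loop("What drives urban heat islands?")
print(len(notes))  # -> 3
```

The writing-reflection loop has the same shape: draft, critique, revise until approved.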

Getting Started

MAESTRO is designed to be run as a containerized application using Docker.

Prerequisites

  • Docker and Docker Compose (v2.0+)
  • Git for cloning the repository
  • PostgreSQL with pgvector extension (automatically provided via Docker)
  • Disk Space: ~5GB for AI models + ~2GB for database

Hardware Requirements

  • Recommended: NVIDIA GPU with CUDA support for optimal performance
  • Minimum: CPU-only operation is supported on all platforms
  • Platform Support:
    • Linux: Full GPU support with nvidia-container-toolkit
    • macOS: CPU mode (optimized for Apple Silicon and Intel)
    • Windows: GPU support via WSL2

Quick Start

The simplest way to get started:

git clone https://github.com/murtaza-nasir/maestro.git
cd maestro
./setup-env.sh    # Linux/macOS
# or setup-env.ps1 # Windows PowerShell
docker compose up -d

⚠️ First Run: Initial startup takes 5-10 minutes to download AI models. Monitor progress with:

docker compose logs -f maestro-backend
# Wait for: "Application startup complete"

Access MAESTRO at http://localhost

Default Credentials (change immediately after first login):

  • Username: admin
  • Password: Generated during setup (check your .env file) or admin123 if not using setup script

Installation

Linux/macOS

  1. Clone the Repository

    git clone https://github.com/murtaza-nasir/maestro.git
    cd maestro
  2. Configure Your Environment

     Run the interactive setup script:

    ./setup-env.sh

    Choose from three simple options:

    • Simple (localhost only) - Recommended for most users
    • Network (access from other devices on your network)
    • Custom domain (for reverse proxy setups like researcher.local)
  3. Start MAESTRO

    # Recommended: Automatic GPU detection
    ./start.sh
    
    # Or manually:
    docker compose up -d

    ⚠️ IMPORTANT - First-Time Startup: On the first run, the backend needs to download AI models (text embedders, etc.), which can take 5-10 minutes. During this time:

    • The frontend will be accessible but login will fail
    • You'll see "Network Error" or login failures
    • This is normal! The backend is still downloading required models

    Monitor the startup progress:

    # Watch the backend logs
    docker compose logs -f maestro-backend
    
    # Wait for this message:
    # "INFO:     Application startup complete."
    # or "Uvicorn running on http://0.0.0.0:8000"

Windows

  1. Clone the Repository

    git clone https://github.com/murtaza-nasir/maestro.git
    cd maestro

    Important for Windows/WSL Users: If you encounter "bad interpreter" errors, run:

    # Fix line endings before setup
    .\fix-line-endings.ps1
  2. Configure Your Environment

     Run the interactive setup script:

    # Using PowerShell (recommended)
    .\setup-env.ps1
    
    # Or using Command Prompt
    setup-env.bat

    Choose from three simple options:

    • Simple (localhost only) - Recommended for most users
    • Network (access from other devices on your network)
    • Custom domain (for reverse proxy setups)
  3. Start MAESTRO

    # For Windows/WSL without GPU:
    docker compose -f docker-compose.cpu.yml up -d
    
    # Or with GPU support (if available):
    docker compose up -d

    ⚠️ IMPORTANT - First-Time Startup: On the first run, the backend needs to download AI models (text embedders, etc.), which can take 5-10 minutes. During this time:

    • The frontend will be accessible but login will fail
    • You'll see "Network Error" or login failures
    • This is normal! The backend is still downloading required models

    Monitor the startup progress:

    # Watch the backend logs
    docker compose logs -f maestro-backend
    
    # Wait for this message:
    # "INFO:     Application startup complete."
    # or "Uvicorn running on http://0.0.0.0:8000"

Access MAESTRO

Once the backend shows "Application startup complete", access the web interface at the address shown by the setup script (default: http://localhost).

Default Login:

  • Username: admin
  • Password: admin123, or the password generated by the setup script (check your .env file)

Important: Change the default password immediately after your first login via Settings → Profile.

For detailed instructions on configuring MAESTRO's settings and using all features, see the USER_GUIDE.md.

Architecture & Networking

MAESTRO now uses a unified reverse proxy architecture to eliminate CORS issues:

  • Single Entry Point: Everything accessible through one port (default: 80)
  • No CORS Problems: Frontend and backend served from the same origin
  • Simple Configuration: One host, one port to configure
  • Efficient Routing: nginx handles static files and API routing efficiently
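Conceptually, the routing looks like the following nginx sketch. This is an illustration of the single-origin idea, not MAESTRO's actual shipped configuration; the paths and upstream name are assumptions.

```nginx
server {
    listen 80;

    # Serve the built React frontend
    location / {
        root /usr/share/nginx/html;
        try_files $uri /index.html;
    }

    # Forward API calls to the FastAPI backend (same origin, so no CORS)
    location /api/ {
        proxy_pass http://maestro-backend:8000;
    }

    # WebSocket upgrade for live mission updates
    location /ws/ {
        proxy_pass http://maestro-backend:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```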

Network Access Options

  1. Localhost Only (Default): Access from the same computer
  2. Network Access: Access from other devices on your network
  3. Custom Domain: Use with reverse proxies (e.g., researcher.local)

Troubleshooting

Windows/WSL: Backend won't start ("bad interpreter" error)?

  • This is a line ending issue. Fix it with:
    .\fix-line-endings.ps1
    docker compose down
    docker compose build --no-cache maestro-backend
    docker compose up -d

Can't log in with admin/admin123?

  • Reset the admin password using the built-in script:
    # Run the reset script (already in the container)
    docker exec -it maestro-backend python reset_admin_password.py
    
    # Or with a custom password:
    docker exec -it maestro-backend python reset_admin_password.py YourNewPassword
    
    # Or using environment variable:
    docker exec -it maestro-backend bash -c "ADMIN_PASSWORD=YourNewPassword python reset_admin_password.py"

Using without a GPU?

  • Use the CPU-only compose file:
    docker compose -f docker-compose.cpu.yml up -d
    Always use -f docker-compose.cpu.yml for all Docker commands on Windows without GPU support.

Can't access from another device?

  • Re-run setup script with "Network" option
  • Check firewall settings
  • Ensure Docker containers are running: docker compose ps

Still seeing CORS errors?

  • Old configurations may conflict. Try: docker compose down && docker compose up --build -d
  • Check that you're accessing through the correct port

Getting 504 Gateway Timeout errors?

  • If running behind a reverse proxy (nginx, Apache, etc.), you need to increase timeout settings
  • Default proxy timeout (60s) is too short for AI operations
  • See Reverse Proxy Configuration for detailed instructions
  • The app handles timeouts gracefully, but proper configuration improves user experience
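If you front MAESTRO with your own nginx, a timeout increase along these lines usually resolves the 504s. The values are suggested starting points, not MAESTRO defaults:

```nginx
location / {
    proxy_pass http://localhost:80;  # your MAESTRO entry point

    # Long-running AI operations can exceed the 60s proxy default
    proxy_read_timeout 600s;
    proxy_send_timeout 600s;
    proxy_connect_timeout 75s;
}
```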

Migration from v0.1.2 or earlier (SQLite/ChromaDB):

  • IMPORTANT: This is a breaking change - no automatic migration available
  • You must start fresh with docker compose down -v to remove old volumes
  • Re-run setup script to generate new .env with secure passwords
  • All data will be lost - backup any important documents first
  • After migration, all data is stored in PostgreSQL with pgvector

Planning/Outline Generation Errors (Local LLMs or Large Research Tasks)?

  • Errors during the outline generation or planning phases typically occur with local LLMs, or when processing extensive research with many notes
  • Solution: Reduce the "Planning Context" parameter. Navigate to Settings → Research tab → Content Processing Limits and decrease "Planning Context" from the default of 200,000 to a lower value (e.g., 100,000 or 50,000)
  • This splits large planning tasks into smaller, more manageable batches, which is particularly important for local LLMs with smaller context windows

GPU Support and Performance Optimization

MAESTRO includes automatic GPU detection and configuration for optimal performance across different platforms.

CPU-Only Mode

For systems without GPU or when GPU support is problematic, you have two options:

Option 1: Use the CPU-only Docker Compose file (Recommended)

docker compose -f docker-compose.cpu.yml up -d

This is the cleanest approach - it completely removes GPU dependencies from your containers, making them smaller and preventing any GPU-related errors. Perfect for dedicated CPU-only setups or servers without GPUs.

Option 2: Set FORCE_CPU_MODE in your .env file

# Add to your .env file
FORCE_CPU_MODE=true
# Then use regular docker compose
docker compose up -d

This approach keeps the GPU libraries in the container but tells the application to ignore them. It's convenient for development when you might want to switch between CPU and GPU modes without changing compose files.

Quick Start with GPU Detection

# Linux/macOS: Automatic platform detection
./start.sh

# Windows: PowerShell
.\start.sh

# Stop services
./stop.sh  # or .\stop.sh on Windows

Platform-Specific GPU Support

Linux with NVIDIA GPU:

  • ✅ Full GPU acceleration with automatic detection
  • ✅ Multi-GPU support with load distribution
  • Requirements: nvidia-container-toolkit installed
  • Setup: GPU support is automatically enabled when detected

macOS (Apple Silicon & Intel):

  • ✅ CPU-optimized performance (no GPU runtime needed)
  • ✅ Optimized for both Apple Silicon and Intel Macs
  • ✅ Full compatibility through Docker Desktop

Windows with WSL2:

  • ✅ GPU support through WSL2 and nvidia-container-toolkit
  • ✅ Compatible with NVIDIA GPUs
  • Requirements: WSL2 with GPU support enabled

Manual GPU Configuration

For advanced users, you can manually configure GPU settings in .env:

# GPU device assignment (0, 1, 2, etc.)
BACKEND_GPU_DEVICE=0
DOC_PROCESSOR_GPU_DEVICE=0
CLI_GPU_DEVICE=0
GPU_AVAILABLE=true

Performance Tips

  • Multi-GPU: Assign different services to different GPUs
  • CPU Mode: Still performant for development and smaller workloads
  • Memory: Monitor GPU memory with nvidia-smi

Troubleshooting GPU Issues

# Check GPU detection
./detect_gpu.sh

# Test GPU in Docker
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

# View service logs
docker compose logs backend

Windows Users: For detailed Windows setup instructions, troubleshooting, and CLI usage, see WINDOWS_SETUP.md.

Database Management Tools

Database Reset and Consistency Tools

MAESTRO uses PostgreSQL with pgvector extension for all data storage including vector embeddings. The system includes powerful CLI tools for database management and consistency checking.

Quick Database Operations

# Check database status and consistency
./maestro-cli.sh reset-db --check

# Get database statistics  
./maestro-cli.sh reset-db --stats

# Reset all databases (with backup)
./maestro-cli.sh reset-db --backup

Document Consistency Management

# Check system-wide document consistency
python maestro_backend/cli_document_consistency.py system-status

# Clean up orphaned documents
python maestro_backend/cli_document_consistency.py cleanup-all

# Check specific user's documents
python maestro_backend/cli_document_consistency.py check-user <user_id>

When to Use These Tools

  • Database Reset: Complete fresh start, removes ALL data
  • Consistency Tools: Targeted cleanup, preserves valid data
  • Automatic Monitoring: Built-in system runs every 60 minutes

For detailed instructions and advanced usage, see README_DATABASE_RESET.md.

Technical Overview

MAESTRO is built on a modern, decoupled architecture:

  • Backend: A robust API built with FastAPI that handles user authentication, mission control, agentic logic, and the RAG pipeline.
  • Frontend: A dynamic and responsive single-page application built with React, Vite, and TypeScript, using Tailwind CSS for styling.
  • Real-time Communication: WebSockets stream live updates, logs, and status changes from the backend to the frontend.
  • Database: PostgreSQL with the pgvector extension for all data storage, managed through the SQLAlchemy ORM.
  • Containerization: Docker Compose orchestrates the multi-service application for reliable deployment.

Fully Self-Hosted Operation

MAESTRO can be configured for a completely self-hosted environment. It supports local, OpenAI-compatible API models, allowing you to run your own LLMs. For web searches, it integrates with SearXNG, a private and hackable metasearch engine, ensuring that your entire research workflow can remain on your own hardware.
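Pointing MAESTRO at a local, OpenAI-compatible server generally comes down to a base URL and an API key in your environment. A hypothetical .env sketch follows; these variable names are illustrative assumptions, so check the .env generated by setup-env.sh for the keys MAESTRO actually reads:

```env
# Hypothetical example — verify variable names against your generated .env
OPENAI_API_BASE=http://localhost:11434/v1   # e.g. a local Ollama/vLLM endpoint
OPENAI_API_KEY=not-needed-for-local
SEARXNG_BASE_URL=http://searxng:8080        # self-hosted metasearch instance
```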

SearXNG Configuration

If you choose to use SearXNG as your search provider, ensure your SearXNG instance is properly configured:

Required Configuration:

  • Your SearXNG instance must support JSON output format
  • Add - json to the formats list in your SearXNG settings, after - html

Example SearXNG settings.yml configuration:

search:
  formats:
    - html
    - json  # <- This line is required for MAESTRO integration

Available Search Categories: MAESTRO supports the following SearXNG categories, which you can configure in the Settings → Search section:

  • General (default)
  • Science
  • IT
  • News
  • Images
  • Videos
  • Music
  • Files
  • Map
  • Social Media

You can select multiple categories to refine your search results based on your research needs.

For advanced users and administrators, a powerful Command Line Interface (CLI) is available for bulk document ingestion, user management, and other administrative tasks. See CLI_GUIDE.md for complete documentation.

Quick CLI Examples

Linux/macOS:

./maestro-cli.sh help
./maestro-cli.sh create-user researcher mypass123
./maestro-cli.sh ingest researcher ./documents

Windows:

# Using PowerShell (recommended)
.\maestro-cli.ps1 help
.\maestro-cli.ps1 create-user researcher mypass123
.\maestro-cli.ps1 ingest researcher .\documents

# Or using Command Prompt
maestro-cli.bat help
maestro-cli.bat create-user researcher mypass123
maestro-cli.bat ingest researcher .\documents

Documentation

  • USER_GUIDE.md - Detailed guide for configuring and using MAESTRO's features
  • CLI_GUIDE.md - Comprehensive command-line interface documentation
  • DOCKER.md - Complete Docker setup and deployment instructions
  • WINDOWS_SETUP.md - Windows-specific installation guide

License

This project is dual-licensed:

  1. GNU Affero General Public License v3.0 (AGPLv3): MAESTRO is offered under the AGPLv3 as its open-source license.
  2. Commercial License: For users or organizations who cannot comply with the AGPLv3, a separate commercial license is available. Please contact the maintainers for more details.

Contributing

Feedback, bug reports, and feature suggestions are highly valuable. Please feel free to open an Issue on the GitHub repository.
