MAESTRO Logo

MAESTRO: Your Self-Hosted AI Research Assistant

License: AGPL v3 Version Docker

⚠️ Version 0.1.3 - BREAKING CHANGE (08/15/2025)

Complete migration from SQLite/ChromaDB to PostgreSQL with pgvector.

  • Action Required: If upgrading, you must rebuild from scratch with docker compose down -v
  • New Requirements: PostgreSQL with pgvector extension (included in Docker setup)
  • Security: All credentials now configurable via environment variables

MAESTRO is an AI-powered research platform you can host on your own hardware. It's designed to manage complex research tasks from start to finish in a collaborative, multi-user environment. Plan your research, let AI agents carry it out, and watch as they generate detailed reports based on your documents and sources from the web.

Final Draft

A New Way to Conduct Research

MAESTRO streamlines the research process with a unified, chat-driven workflow. Define your research goals, upload your source materials, and let a team of AI agents handle the heavy lifting. It's a powerful tool for anyone who works with large amounts of information, from academics and analysts to writers and developers.

Core Features

Manage Your Document Library

Upload and manage your PDF documents in a central library. MAESTRO's advanced Retrieval-Augmented Generation (RAG) pipeline is optimized for academic and technical papers, ensuring your AI agents have access to the right information.

Document Library

Create Focused Document Groups

Organize your library by creating document groups for specific projects. This allows you to direct the AI to pull information from a curated set of sources, ensuring relevance and accuracy in its research.

Document Groups

Customize Your Research Mission

Fine-tune the research process by setting specific parameters for the mission. You can define the scope, depth, and focus of the AI's investigation to match your exact needs.

Mission Settings

Chat with Your Documents and the Web

Use the chat interface to ask questions and get answers sourced directly from your documents or the internet. It's a powerful way to get quick insights or inspiration for your work.

Chat with Documents

Get Help from the Writing Assistant

The writing assistant works alongside you, ready to pull information from your library or the web to help you draft notes, summarize findings, or overcome writer's block.

Writing Assistant

Follow the Agent's Research Path

MAESTRO provides full transparency into the AI's process. You can see the research outline it develops and follow along as it explores different avenues of investigation.

Research Transparency

Review AI-Generated Notes

Let the research agent dive into your PDF collection or find new sources online. It will then synthesize the information and generate structured notes based on your research questions.

Automated Notes

Track Mission Progress in Detail

Keep a close eye on every step of the research mission. The system provides detailed, real-time tracking of agent activities and status updates.

Mission Tracking

Understand the Agent's Reasoning

The AI agents provide detailed reflection notes, giving you insight into their thought processes, the decisions they make, and the conclusions they draw from the data.

Agent Reflection

Get a Full Report with References

Based on the research plan and generated notes, a final draft will be generated, including references from your documents and internet sources.

How It Works: The WRITER Agentic Framework


MAESTRO is a sophisticated multi-agent system designed to automate complex research synthesis. Instead of a single AI model, MAESTRO employs a team of specialized AI agents that collaborate to plan, execute, critique, and write research reports.

This methodology ensures a structured, transparent, and rigorous process from the initial question to the final, evidence-based report.

The MAESTRO Research Lifecycle

graph TD
    subgraph User Interaction
        A["User Defines Mission"]
    end

    subgraph "Phase 1: Planning"
        B["Planning Agent<br>Creates Research Plan & Outline"]
    end

    subgraph "Phase 2: Research & Reflection"
        C["Research Agent<br>Gathers Information (RAG/Web)"]
        D["Reflection Agent<br>Critiques Findings & Identifies Gaps"]
        C --> D
        D -- "Revisions Needed? ↪" --> B
        D -- "Evidence Complete? ✔" --> E
    end

    subgraph "Phase 3: Writing & Reflection"
        E["Writing Agent<br>Drafts Report Sections"]
        F["Reflection Agent<br>Reviews Draft for Clarity"]
        E --> F
        F -- "Revisions Needed? ↪" --> E
        F -- "Draft Approved? ✔" --> G
    end

    subgraph "Phase 4: Finalization"
        G["Agent Controller<br>Composes Final Report"]
    end

    A --> B
    B --> C
    G --> H["User Receives Report"]

    style A fill:#e6e6fa,stroke:#333,stroke-width:1px
    style H fill:#e6e6fa,stroke:#333,stroke-width:1px
    style B fill:#f9f0ff,stroke:#333,stroke-width:2px
    style C fill:#e0f7fa,stroke:#333,stroke-width:2px
    style D fill:#fff0f5,stroke:#333,stroke-width:2px
    style E fill:#e8f5e9,stroke:#333,stroke-width:2px
    style F fill:#fff0f5,stroke:#333,stroke-width:2px
    style G fill:#fffde7,stroke:#333,stroke-width:2px

The Core Agent Team

MAESTRO's capabilities are driven by a team of specialized agents, each with a distinct role:

  • Agent Controller (The Orchestrator): Manages the entire mission, delegating tasks to the appropriate agents and ensuring the workflow progresses smoothly from one phase to the next.
  • Planning Agent (The Strategist): Takes the user's initial request and transforms it into a structured, hierarchical research plan and a report outline. This creates a clear roadmap for the mission.
  • Research Agent (The Investigator): Executes the research plan by gathering information. It uses its tools—the local RAG pipeline and web search—to find relevant evidence and organizes it into structured ResearchNote objects.
  • Reflection Agent (The Critical Reviewer): This is the key to MAESTRO's analytical depth. The Reflection Agent constantly reviews the work of other agents, identifying knowledge gaps, inconsistencies, or deviations from the plan. Its feedback drives the iterative loops that refine and improve the quality of the research.
  • Writing Agent (The Synthesizer): Takes the curated research notes and weaves them into a coherent, well-structured narrative that follows the report outline.

The Research Process: Iteration and Refinement

The research process is not linear; it's a series of iterative loops designed to simulate critical thinking and ensure a high-quality outcome.

  1. The Research-Reflection Loop: The Research Agent doesn't just gather information in one pass. After an initial round of research, the Reflection Agent steps in to critique the findings. It asks questions like:

    • Are there gaps in the evidence?
    • Do sources contradict each other?
    • Have new, unexpected themes emerged?

    Based on this critique, the Reflection Agent can recommend new research tasks or even prompt the Planning Agent to revise the entire plan. This loop continues until the evidence is comprehensive and robust. The number of iterations is dynamic and depends on the complexity of the topic.
  2. The Writing-Reflection Loop: Drafting is also an iterative process. Once the Writing Agent produces a section of the report, the Reflection Agent reviews it for:

    • Clarity and Coherence: Is the argument easy to follow?
    • Logical Flow: Are the ideas connected logically?
    • Fidelity to Sources: Does the writing accurately represent the evidence in the ResearchNotes?

    The Writing Agent then revises the draft based on this feedback. This loop repeats until the writing meets the required standard of quality and accuracy.

This structured, reflective, and iterative process allows MAESTRO to move beyond simple information aggregation and produce sophisticated, reliable, and auditable research syntheses.
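The research-reflection loop described above can be sketched in simplified Python. This is illustrative only: the class names, method names, and stopping condition are assumptions for the sketch, not MAESTRO's actual implementation.

```python
# Illustrative sketch of the research-reflection loop.
# All names (ResearchAgent, ReflectionAgent, research_loop) are
# hypothetical; the real logic lives in the MAESTRO backend.

from dataclasses import dataclass, field


@dataclass
class ResearchNote:
    question: str
    evidence: list[str] = field(default_factory=list)


class ResearchAgent:
    def gather(self, question: str, round_: int) -> ResearchNote:
        # A real agent would query the RAG pipeline and web search here.
        return ResearchNote(question, evidence=[f"finding-{round_}"])


class ReflectionAgent:
    def critique(self, notes: list[ResearchNote]) -> list[str]:
        # A real agent would use an LLM to find gaps and contradictions.
        # Here we simply declare the evidence sufficient after three rounds.
        return [] if len(notes) >= 3 else ["need more evidence"]


def research_loop(question: str, max_rounds: int = 10) -> list[ResearchNote]:
    researcher, reviewer = ResearchAgent(), ReflectionAgent()
    notes: list[ResearchNote] = []
    for round_ in range(1, max_rounds + 1):
        notes.append(researcher.gather(question, round_))
        gaps = reviewer.critique(notes)
        if not gaps:  # evidence judged complete -> hand off to writing
            break
        # otherwise the gaps become new research tasks for the next round
    return notes


notes = research_loop("What drives urban heat islands?")
print(len(notes))  # -> 3
```

The writing-reflection loop has the same shape: draft, critique, revise until approved.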

Getting Started

MAESTRO is designed to be run as a containerized application using Docker.

Prerequisites

  • Docker and Docker Compose (v2.0+)
  • Git for cloning the repository
  • PostgreSQL with pgvector extension (automatically provided via Docker)
  • Disk Space: ~5GB for AI models + ~2GB for database

Hardware Requirements

  • Recommended: NVIDIA GPU with CUDA support for optimal performance
  • Minimum: CPU-only operation is supported on all platforms
  • Platform Support:
    • Linux: Full GPU support with nvidia-container-toolkit
    • macOS: CPU mode (optimized for Apple Silicon and Intel)
    • Windows: GPU support via WSL2

Quick Start

The simplest way to get started:

git clone https://github.com/murtaza-nasir/maestro.git
cd maestro
./setup-env.sh    # Linux/macOS
# or setup-env.ps1 # Windows PowerShell
docker compose up -d

⚠️ First Run: Initial startup takes 5-10 minutes to download AI models. Monitor progress with:

docker compose logs -f maestro-backend
# Wait for: "Application startup complete"

Access MAESTRO at http://localhost

Default Credentials (change immediately after first login):

  • Username: admin
  • Password: Generated during setup (check your .env file) or admin123 if not using setup script

Installation

Linux/macOS

  1. Clone the Repository

    git clone https://github.com/murtaza-nasir/maestro.git
    cd maestro
  2. Configure Your Environment

     Run the interactive setup script:

    ./setup-env.sh

    Choose from three simple options:

    • Simple (localhost only) - Recommended for most users
    • Network (access from other devices on your network)
    • Custom domain (for reverse proxy setups like researcher.local)
  3. Start MAESTRO

    # Recommended: Automatic GPU detection
    ./start.sh
    
    # Or manually:
    docker compose up -d

    ⚠️ IMPORTANT - First-Time Startup: On the first run, the backend needs to download AI models (text embedders, etc.), which can take 5-10 minutes. During this time:

    • The frontend will be accessible but login will fail
    • You'll see "Network Error" or login failures
    • This is normal! The backend is still downloading required models

    Monitor the startup progress:

    # Watch the backend logs
    docker compose logs -f maestro-backend
    
    # Wait for this message:
    # "INFO:     Application startup complete."
    # or "Uvicorn running on http://0.0.0.0:8000"

Windows

  1. Clone the Repository

    git clone https://github.com/murtaza-nasir/maestro.git
    cd maestro

    Important for Windows/WSL Users: If you encounter "bad interpreter" errors, run:

    # Fix line endings before setup
    .\fix-line-endings.ps1
  2. Configure Your Environment

     Run the interactive setup script:

    # Using PowerShell (recommended)
    .\setup-env.ps1
    
    # Or using Command Prompt
    setup-env.bat

    Choose from three simple options:

    • Simple (localhost only) - Recommended for most users
    • Network (access from other devices on your network)
    • Custom domain (for reverse proxy setups)
  3. Start MAESTRO

    # For Windows/WSL without GPU:
    docker compose -f docker-compose.cpu.yml up -d
    
    # Or with GPU support (if available):
    docker compose up -d

    ⚠️ IMPORTANT - First-Time Startup: On the first run, the backend needs to download AI models (text embedders, etc.), which can take 5-10 minutes. During this time:

    • The frontend will be accessible but login will fail
    • You'll see "Network Error" or login failures
    • This is normal! The backend is still downloading required models

    Monitor the startup progress:

    # Watch the backend logs
    docker compose logs -f maestro-backend
    
    # Wait for this message:
    # "INFO:     Application startup complete."
    # or "Uvicorn running on http://0.0.0.0:8000"

Access MAESTRO

Once the backend shows "Application startup complete", access the web interface at the address shown by the setup script (default: http://localhost).

Default Login:

  • Username: admin
  • Password: admin123, or the password generated by the setup script (check your .env file)

Important: Change the default password immediately after your first login via Settings → Profile.

For detailed instructions on configuring MAESTRO's settings and using all features, see the USER_GUIDE.md.

Architecture & Networking

MAESTRO now uses a unified reverse proxy architecture to eliminate CORS issues:

  • Single Entry Point: Everything accessible through one port (default: 80)
  • No CORS Problems: Frontend and backend served from the same origin
  • Simple Configuration: One host, one port to configure
  • Efficient Routing: nginx handles static files and API routing efficiently
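Conceptually, the routing looks like the following nginx sketch. This is an illustration of the single-origin idea, not MAESTRO's actual shipped configuration; the paths and upstream name are assumptions.

```nginx
server {
    listen 80;

    # Serve the built React frontend
    location / {
        root /usr/share/nginx/html;
        try_files $uri /index.html;
    }

    # Forward API calls to the FastAPI backend (same origin, so no CORS)
    location /api/ {
        proxy_pass http://maestro-backend:8000;
    }

    # WebSocket upgrade for live mission updates
    location /ws/ {
        proxy_pass http://maestro-backend:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```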

Network Access Options

  1. Localhost Only (Default): Access from the same computer
  2. Network Access: Access from other devices on your network
  3. Custom Domain: Use with reverse proxies (e.g., researcher.local)

Troubleshooting

Windows/WSL: Backend won't start ("bad interpreter" error)?

  • This is a line ending issue. Fix it with:
    .\fix-line-endings.ps1
    docker compose down
    docker compose build --no-cache maestro-backend
    docker compose up -d

Can't log in with admin/admin123?

  • Reset the admin password using the built-in script:
    # Run the reset script (already in the container)
    docker exec -it maestro-backend python reset_admin_password.py
    
    # Or with a custom password:
    docker exec -it maestro-backend python reset_admin_password.py YourNewPassword
    
    # Or using environment variable:
    docker exec -it maestro-backend bash -c "ADMIN_PASSWORD=YourNewPassword python reset_admin_password.py"

Using without a GPU?

  • Use the CPU-only compose file:
    docker compose -f docker-compose.cpu.yml up -d
    Always use -f docker-compose.cpu.yml for all Docker commands on Windows without GPU support.

Can't access from another device?

  • Re-run setup script with "Network" option
  • Check firewall settings
  • Ensure Docker containers are running: docker compose ps

Still seeing CORS errors?

  • Old configurations may conflict. Try: docker compose down && docker compose up --build -d
  • Check that you're accessing through the correct port

Getting 504 Gateway Timeout errors?

  • If running behind a reverse proxy (nginx, Apache, etc.), you need to increase timeout settings
  • Default proxy timeout (60s) is too short for AI operations
  • See Reverse Proxy Configuration for detailed instructions
  • The app handles timeouts gracefully, but proper configuration improves user experience
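If you front MAESTRO with your own nginx, a timeout increase along these lines usually resolves the 504s. The values are suggested starting points, not MAESTRO defaults:

```nginx
location / {
    proxy_pass http://localhost:80;  # your MAESTRO entry point

    # Long-running AI operations can exceed the 60s proxy default
    proxy_read_timeout 600s;
    proxy_send_timeout 600s;
    proxy_connect_timeout 75s;
}
```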

Migration from v0.1.2 or earlier (SQLite/ChromaDB):

  • IMPORTANT: This is a breaking change - no automatic migration available
  • You must start fresh with docker compose down -v to remove old volumes
  • Re-run setup script to generate new .env with secure passwords
  • All data will be lost - backup any important documents first
  • After migration, all data is stored in PostgreSQL with pgvector

Planning/Outline Generation Errors (Local LLMs or Large Research Tasks)?

  • Errors during the outline generation or planning phases typically occur with local LLMs, or when processing extensive research with many notes
  • Solution: Reduce the "Planning Context" parameter. Navigate to Settings → Research tab → Content Processing Limits and decrease "Planning Context" from the default of 200,000 to a lower value (e.g., 100,000 or 50,000)
  • This splits large planning tasks into smaller, more manageable batches, which is particularly important for local LLMs with smaller context windows

GPU Support and Performance Optimization

MAESTRO includes automatic GPU detection and configuration for optimal performance across different platforms.

CPU-Only Mode

For systems without GPU or when GPU support is problematic, you have two options:

Option 1: Use the CPU-only Docker Compose file (Recommended)

docker compose -f docker-compose.cpu.yml up -d

This is the cleanest approach - it completely removes GPU dependencies from your containers, making them smaller and preventing any GPU-related errors. Perfect for dedicated CPU-only setups or servers without GPUs.

Option 2: Set FORCE_CPU_MODE in your .env file

# Add to your .env file
FORCE_CPU_MODE=true
# Then use regular docker compose
docker compose up -d

This approach keeps the GPU libraries in the container but tells the application to ignore them. It's convenient for development when you might want to switch between CPU and GPU modes without changing compose files.

Quick Start with GPU Detection

# Linux/macOS: Automatic platform detection
./start.sh

# Windows: PowerShell
.\start.sh

# Stop services
./stop.sh  # or .\stop.sh on Windows

Platform-Specific GPU Support

Linux with NVIDIA GPU:

  • ✅ Full GPU acceleration with automatic detection
  • ✅ Multi-GPU support with load distribution
  • Requirements: nvidia-container-toolkit installed
  • Setup: GPU support is automatically enabled when detected

macOS (Apple Silicon & Intel):

  • ✅ CPU-optimized performance (no GPU runtime needed)
  • ✅ Optimized for both Apple Silicon and Intel Macs
  • ✅ Full compatibility through Docker Desktop

Windows with WSL2:

  • ✅ GPU support through WSL2 and nvidia-container-toolkit
  • ✅ Compatible with NVIDIA GPUs
  • Requirements: WSL2 with GPU support enabled

Manual GPU Configuration

For advanced users, you can manually configure GPU settings in .env:

# GPU device assignment (0, 1, 2, etc.)
BACKEND_GPU_DEVICE=0
DOC_PROCESSOR_GPU_DEVICE=0
CLI_GPU_DEVICE=0
GPU_AVAILABLE=true

Performance Tips

  • Multi-GPU: Assign different services to different GPUs
  • CPU Mode: Still performant for development and smaller workloads
  • Memory: Monitor GPU memory with nvidia-smi

Troubleshooting GPU Issues

# Check GPU detection
./detect_gpu.sh

# Test GPU in Docker
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

# View service logs
docker compose logs backend

Windows Users: For detailed Windows setup instructions, troubleshooting, and CLI usage, see WINDOWS_SETUP.md.

Database Management Tools

Database Reset and Consistency Tools

MAESTRO uses PostgreSQL with pgvector extension for all data storage including vector embeddings. The system includes powerful CLI tools for database management and consistency checking.

Quick Database Operations

# Check database status and consistency
./maestro-cli.sh reset-db --check

# Get database statistics  
./maestro-cli.sh reset-db --stats

# Reset all databases (with backup)
./maestro-cli.sh reset-db --backup

Document Consistency Management

# Check system-wide document consistency
python maestro_backend/cli_document_consistency.py system-status

# Clean up orphaned documents
python maestro_backend/cli_document_consistency.py cleanup-all

# Check specific user's documents
python maestro_backend/cli_document_consistency.py check-user <user_id>

When to Use These Tools

  • Database Reset: Complete fresh start, removes ALL data
  • Consistency Tools: Targeted cleanup, preserves valid data
  • Automatic Monitoring: Built-in system runs every 60 minutes

For detailed instructions and advanced usage, see README_DATABASE_RESET.md.

Technical Overview

MAESTRO is built on a modern, decoupled architecture:

  • Backend: A robust API built with FastAPI that handles user authentication, mission control, agentic logic, and the RAG pipeline.
  • Frontend: A dynamic and responsive single-page application built with React, Vite, and TypeScript, using Tailwind CSS for styling.
  • Real-time Communication: WebSockets stream live updates, logs, and status changes from the backend to the frontend.
  • Database: PostgreSQL with the pgvector extension for all data storage, managed through the SQLAlchemy ORM.
  • Containerization: Docker Compose orchestrates the multi-service application for reliable deployment.

Fully Self-Hosted Operation

MAESTRO can be configured for a completely self-hosted environment. It supports local, OpenAI-compatible API models, allowing you to run your own LLMs. For web searches, it integrates with SearXNG, a private and hackable metasearch engine, ensuring that your entire research workflow can remain on your own hardware.
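Pointing MAESTRO at a local, OpenAI-compatible server generally comes down to a base URL and an API key in your environment. A hypothetical .env sketch follows; these variable names are illustrative assumptions, so check the .env generated by setup-env.sh for the keys MAESTRO actually reads:

```env
# Hypothetical example — verify variable names against your generated .env
OPENAI_API_BASE=http://localhost:11434/v1   # e.g. a local Ollama/vLLM endpoint
OPENAI_API_KEY=not-needed-for-local
SEARXNG_BASE_URL=http://searxng:8080        # self-hosted metasearch instance
```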

SearXNG Configuration

If you choose to use SearXNG as your search provider, ensure your SearXNG instance is properly configured:

Required Configuration:

  • Your SearXNG instance must support JSON output format
  • Add - json to the formats list in your SearXNG settings, after - html

Example SearXNG settings.yml configuration:

search:
  formats:
    - html
    - json  # <- This line is required for MAESTRO integration

Available Search Categories: MAESTRO supports the following SearXNG categories, which you can configure in the Settings → Search section:

  • General (default)
  • Science
  • IT
  • News
  • Images
  • Videos
  • Music
  • Files
  • Map
  • Social Media

You can select multiple categories to refine your search results based on your research needs.

For advanced users and administrators, a powerful Command Line Interface (CLI) is available for bulk document ingestion, user management, and other administrative tasks. See CLI_GUIDE.md for complete documentation.

Quick CLI Examples

Linux/macOS:

./maestro-cli.sh help
./maestro-cli.sh create-user researcher mypass123
./maestro-cli.sh ingest researcher ./documents

Windows:

# Using PowerShell (recommended)
.\maestro-cli.ps1 help
.\maestro-cli.ps1 create-user researcher mypass123
.\maestro-cli.ps1 ingest researcher .\documents

# Or using Command Prompt
maestro-cli.bat help
maestro-cli.bat create-user researcher mypass123
maestro-cli.bat ingest researcher .\documents

Documentation

  • USER_GUIDE.md - Detailed guide for configuring and using MAESTRO's features
  • CLI_GUIDE.md - Comprehensive command-line interface documentation
  • DOCKER.md - Complete Docker setup and deployment instructions
  • WINDOWS_SETUP.md - Windows-specific installation guide

License

This project is dual-licensed:

  1. GNU Affero General Public License v3.0 (AGPLv3): MAESTRO is offered under the AGPLv3 as its open-source license.
  2. Commercial License: For users or organizations who cannot comply with the AGPLv3, a separate commercial license is available. Please contact the maintainers for more details.

Contributing

Feedback, bug reports, and feature suggestions are highly valuable. Please feel free to open an Issue on the GitHub repository.
