⚠️ Version 0.1.3 - BREAKING CHANGE (08/15/2025): Complete migration from SQLite/ChromaDB to PostgreSQL with pgvector.
- Action Required: If upgrading, you must rebuild from scratch with `docker compose down -v`
- New Requirements: PostgreSQL with pgvector extension (included in Docker setup)
- Security: All credentials now configurable via environment variables
MAESTRO is an AI-powered research platform you can host on your own hardware. It's designed to manage complex research tasks from start to finish in a collaborative, multi-user environment. Plan your research, let AI agents carry it out, and watch as they generate detailed reports based on your documents and sources from the web.
MAESTRO streamlines the research process with a unified, chat-driven workflow. Define your research goals, upload your source materials, and let a team of AI agents handle the heavy lifting. It's a powerful tool for anyone who works with large amounts of information, from academics and analysts to writers and developers.
Manage Your Document Library
Upload and manage your PDF documents in a central library. MAESTRO's advanced Retrieval-Augmented Generation (RAG) pipeline is optimized for academic and technical papers, ensuring your AI agents have access to the right information.
Create Focused Document Groups
Organize your library by creating document groups for specific projects. This allows you to direct the AI to pull information from a curated set of sources, ensuring relevance and accuracy in its research.
Customize Your Research Mission
Fine-tune the research process by setting specific parameters for the mission. You can define the scope, depth, and focus of the AI's investigation to match your exact needs.
Chat with Your Documents and the Web
Use the chat interface to ask questions and get answers sourced directly from your documents or the internet. It's a powerful way to get quick insights or inspiration for your work.
Get Help from the Writing Assistant
The writing assistant works alongside you, ready to pull information from your library or the web to help you draft notes, summarize findings, or overcome writer's block.
Follow the Agent's Research Path
MAESTRO provides full transparency into the AI's process. You can see the research outline it develops and follow along as it explores different avenues of investigation.
Review AI-Generated Notes
Let the research agent dive into your PDF collection or find new sources online. It will then synthesize the information and generate structured notes based on your research questions.
Track Mission Progress in Detail
Keep a close eye on every step of the research mission. The system provides detailed, real-time tracking of agent activities and status updates.
Understand the Agent's Reasoning
The AI agents provide detailed reflection notes, giving you insight into their thought processes, the decisions they make, and the conclusions they draw from the data.
Get a Full Report with References
Based on the research plan and generated notes, a final draft will be generated, including references from your documents and internet sources.
MAESTRO is a sophisticated multi-agent system designed to automate complex research synthesis. Instead of a single AI model, MAESTRO employs a team of specialized AI agents that collaborate to plan, execute, critique, and write research reports.
This methodology ensures a structured, transparent, and rigorous process from the initial question to the final, evidence-based report.
The MAESTRO Research Lifecycle
```mermaid
graph TD
    subgraph User Interaction
        A["User Defines Mission"]
    end
    subgraph "Phase 1: Planning"
        B["Planning Agent<br>Creates Research Plan & Outline"]
    end
    subgraph "Phase 2: Research & Reflection"
        C["Research Agent<br>Gathers Information (RAG/Web)"]
        D["Reflection Agent<br>Critiques Findings & Identifies Gaps"]
        C --> D
        D -- "Revisions Needed? ↪" --> B
        D -- "Evidence Complete? ✔" --> E
    end
    subgraph "Phase 3: Writing & Reflection"
        E["Writing Agent<br>Drafts Report Sections"]
        F["Reflection Agent<br>Reviews Draft for Clarity"]
        E --> F
        F -- "Revisions Needed? ↪" --> E
        F -- "Draft Approved? ✔" --> G
    end
    subgraph "Phase 4: Finalization"
        G["Agent Controller<br>Composes Final Report"]
    end
    A --> B
    B --> C
    G --> H["User Receives Report"]
    style A fill:#e6e6fa,stroke:#333,stroke-width:1px
    style H fill:#e6e6fa,stroke:#333,stroke-width:1px
    style B fill:#f9f0ff,stroke:#333,stroke-width:2px
    style C fill:#e0f7fa,stroke:#333,stroke-width:2px
    style D fill:#fff0f5,stroke:#333,stroke-width:2px
    style E fill:#e8f5e9,stroke:#333,stroke-width:2px
    style F fill:#fff0f5,stroke:#333,stroke-width:2px
    style G fill:#fffde7,stroke:#333,stroke-width:2px
```
MAESTRO's capabilities are driven by a team of specialized agents, each with a distinct role:
- Agent Controller (The Orchestrator): Manages the entire mission, delegating tasks to the appropriate agents and ensuring the workflow progresses smoothly from one phase to the next.
- Planning Agent (The Strategist): Takes the user's initial request and transforms it into a structured, hierarchical research plan and a report outline. This creates a clear roadmap for the mission.
- Research Agent (The Investigator): Executes the research plan by gathering information. It uses its tools—the local RAG pipeline and web search—to find relevant evidence and organizes it into structured `ResearchNote` objects.
- Reflection Agent (The Critical Reviewer): This is the key to MAESTRO's analytical depth. The Reflection Agent constantly reviews the work of other agents, identifying knowledge gaps, inconsistencies, or deviations from the plan. Its feedback drives the iterative loops that refine and improve the quality of the research.
- Writing Agent (The Synthesizer): Takes the curated research notes and weaves them into a coherent, well-structured narrative that follows the report outline.
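The `ResearchNote` schema is not documented here, but as an illustration, a note produced by the Research Agent might carry fields like the following (all field names in this sketch are hypothetical, not MAESTRO's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class ResearchNote:
    """Hypothetical shape of a note produced by the Research Agent."""
    question: str                   # the research question this note addresses
    content: str                    # the synthesized finding
    source: str                     # document title or URL the evidence came from
    source_type: str = "document"   # "document" (local RAG) or "web"
    tags: list[str] = field(default_factory=list)

# Example: a note grounded in a local PDF
note = ResearchNote(
    question="What are the main drivers of the observed effect?",
    content="Three of the five surveyed papers attribute it to sample bias.",
    source="smith2023.pdf",
)
```

Structuring notes this way lets downstream agents (Reflection, Writing) reason about provenance and coverage rather than working from raw text.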
The research process is not linear; it's a series of iterative loops designed to simulate critical thinking and ensure a high-quality outcome.
1. The Research-Reflection Loop: The `Research Agent` doesn't just gather information in one pass. After an initial round of research, the `Reflection Agent` steps in to critique the findings. It asks questions like:

   - Are there gaps in the evidence?
   - Do sources contradict each other?
   - Have new, unexpected themes emerged?

   Based on this critique, the `Reflection Agent` can recommend new research tasks or even prompt the `Planning Agent` to revise the entire plan. This loop continues until the evidence is comprehensive and robust. The number of iterations is dynamic and depends on the complexity of the topic.

2. The Writing-Reflection Loop: Drafting is also an iterative process. Once the `Writing Agent` produces a section of the report, the `Reflection Agent` reviews it for:

   - Clarity and Coherence: Is the argument easy to follow?
   - Logical Flow: Are the ideas connected logically?
   - Fidelity to Sources: Does the writing accurately represent the evidence in the `ResearchNote`s?

   The `Writing Agent` then revises the draft based on this feedback. This loop repeats until the writing meets the required standard of quality and accuracy.
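Conceptually, the research-reflection cycle is a critique-driven loop. The sketch below is a simplified illustration of that control flow, not MAESTRO's actual implementation; `plan`, `research`, and `critique` are stand-ins for the real agents:

```python
def run_research_loop(question, plan, research, critique, max_rounds=5):
    """Illustrative research-reflection cycle (not MAESTRO's real code).

    plan(question, feedback) -> list of research tasks
    research(tasks)          -> list of notes
    critique(notes)          -> (complete: bool, feedback: str)
    """
    feedback = None
    notes = []
    for _ in range(max_rounds):
        tasks = plan(question, feedback)      # Planning Agent
        notes.extend(research(tasks))         # Research Agent
        complete, feedback = critique(notes)  # Reflection Agent
        if complete:                          # evidence judged sufficient
            break
    return notes

# Stub agents: the critique declares the evidence complete once two notes exist.
notes = run_research_loop(
    "example question",
    plan=lambda q, fb: ["task"],
    research=lambda tasks: [f"note-{t}" for t in tasks],
    critique=lambda ns: (len(ns) >= 2, "need more evidence"),
)
```

The `max_rounds` cap mirrors the idea that the number of iterations is bounded but dynamic: the loop exits early once the critique is satisfied.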
This structured, reflective, and iterative process allows MAESTRO to move beyond simple information aggregation and produce sophisticated, reliable, and auditable research syntheses.
MAESTRO is designed to be run as a containerized application using Docker.
- Docker and Docker Compose (v2.0+)
- Git for cloning the repository
- PostgreSQL with pgvector extension (automatically provided via Docker)
- Disk Space: ~5GB for AI models + ~2GB for database
- Recommended: NVIDIA GPU with CUDA support for optimal performance
- Minimum: CPU-only operation is supported on all platforms
- Platform Support:
- Linux: Full GPU support with nvidia-container-toolkit
- macOS: CPU mode (optimized for Apple Silicon and Intel)
- Windows: GPU support via WSL2
The simplest way to get started:
```bash
git clone https://github.com/murtaza-nasir/maestro.git
cd maestro
./setup-env.sh         # Linux/macOS
# or: .\setup-env.ps1  # Windows PowerShell
docker compose up -d
docker compose logs -f maestro-backend
# Wait for: "Application startup complete"
```
Access MAESTRO at http://localhost
Default Credentials (change immediately after first login):
- Username: `admin`
- Password: Generated during setup (check your `.env` file) or `admin123` if not using the setup script
1. Clone the Repository

   ```bash
   git clone https://github.com/murtaza-nasir/maestro.git
   cd maestro
   ```
2. Configure Your Environment

   Run the interactive setup script:

   ```bash
   ./setup-env.sh
   ```

   Choose from three simple options:

   - Simple (localhost only) - Recommended for most users
   - Network (access from other devices on your network)
   - Custom domain (for reverse proxy setups like researcher.local)
3. Start MAESTRO

   ```bash
   # Recommended: automatic GPU detection
   ./start.sh

   # Or manually:
   docker compose up -d
   ```

   ⚠️ IMPORTANT - First-Time Startup: On the first run, the backend needs to download AI models (text embedders, etc.), which can take 5-10 minutes. During this time:

   - The frontend will be accessible but login will fail
   - You'll see "Network Error" or login failures
   - This is normal! The backend is still downloading the required models

   Monitor the startup progress:

   ```bash
   # Watch the backend logs
   docker compose logs -f maestro-backend

   # Wait for this message:
   # "INFO: Application startup complete."
   # or "Uvicorn running on http://0.0.0.0:8000"
   ```
1. Clone the Repository

   ```bash
   git clone https://github.com/murtaza-nasir/maestro.git
   cd maestro
   ```

   Important for Windows/WSL Users: If you encounter "bad interpreter" errors, run:

   ```powershell
   # Fix line endings before setup
   .\fix-line-endings.ps1
   ```
2. Configure Your Environment

   Run the interactive setup script:

   ```powershell
   # Using PowerShell (recommended)
   .\setup-env.ps1

   # Or using Command Prompt
   setup-env.bat
   ```

   Choose from three simple options:

   - Simple (localhost only) - Recommended for most users
   - Network (access from other devices on your network)
   - Custom domain (for reverse proxy setups)
3. Start MAESTRO

   ```powershell
   # For Windows/WSL without GPU:
   docker compose -f docker-compose.cpu.yml up -d

   # Or with GPU support (if available):
   docker compose up -d
   ```

   ⚠️ IMPORTANT - First-Time Startup: On the first run, the backend needs to download AI models (text embedders, etc.), which can take 5-10 minutes. During this time:

   - The frontend will be accessible but login will fail
   - You'll see "Network Error" or login failures
   - This is normal! The backend is still downloading the required models

   Monitor the startup progress:

   ```powershell
   # Watch the backend logs
   docker compose logs -f maestro-backend

   # Wait for this message:
   # "INFO: Application startup complete."
   # or "Uvicorn running on http://0.0.0.0:8000"
   ```
Once the backend shows "Application startup complete", access the web interface at the address shown by the setup script (default: http://localhost).
Default Login:
- Username: `admin`
- Password: `admin123`
Important: Change the default password immediately after your first login via Settings → Profile.
For detailed instructions on configuring MAESTRO's settings and using all features, see the USER_GUIDE.md.
MAESTRO now uses a unified reverse proxy architecture to eliminate CORS issues:
- Single Entry Point: Everything accessible through one port (default: 80)
- No CORS Problems: Frontend and backend served from the same origin
- Simple Configuration: One host, one port to configure
- Efficient Routing: nginx handles static files and API routing efficiently
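As a rough sketch of the single-origin routing (the actual nginx configuration ships inside the containers; the location paths and upstream names below are illustrative):

```nginx
server {
    listen 80;

    # Static frontend assets
    location / {
        root /usr/share/nginx/html;
        try_files $uri /index.html;
    }

    # API requests proxied to the backend on the same origin
    location /api/ {
        proxy_pass http://maestro-backend:8000;
    }

    # WebSocket upgrade for live mission updates
    location /ws/ {
        proxy_pass http://maestro-backend:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

Because the browser only ever talks to port 80, no CORS headers are needed at all.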
- Localhost Only (Default): Access from the same computer
- Network Access: Access from other devices on your network
- Custom Domain: Use with reverse proxies (e.g., researcher.local)
Windows/WSL: Backend won't start ("bad interpreter" error)?
- This is a line-ending issue. Fix it with:

  ```powershell
  .\fix-line-endings.ps1
  docker compose down
  docker compose build --no-cache maestro-backend
  docker compose up -d
  ```
Can't log in with admin/admin123?
- Reset the admin password using the built-in script:
  ```bash
  # Run the reset script (already in the container)
  docker exec -it maestro-backend python reset_admin_password.py

  # Or with a custom password:
  docker exec -it maestro-backend python reset_admin_password.py YourNewPassword

  # Or using an environment variable:
  docker exec -it maestro-backend bash -c "ADMIN_PASSWORD=YourNewPassword python reset_admin_password.py"
  ```
Using without a GPU?
- Use the CPU-only compose file:

  ```bash
  docker compose -f docker-compose.cpu.yml up -d
  ```

  Always use `-f docker-compose.cpu.yml` for all Docker commands on Windows without GPU support.
Can't access from another device?
- Re-run setup script with "Network" option
- Check firewall settings
- Ensure the Docker containers are running: `docker compose ps`
Still seeing CORS errors?
- Old configurations may conflict. Try:
  ```bash
  docker compose down && docker compose up --build -d
  ```
- Check that you're accessing through the correct port
Getting 504 Gateway Timeout errors?
- If running behind a reverse proxy (nginx, Apache, etc.), you need to increase timeout settings
- Default proxy timeout (60s) is too short for AI operations
- See Reverse Proxy Configuration for detailed instructions
- The app handles timeouts gracefully, but proper configuration improves user experience
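For example, in nginx the relevant directives look like this (the timeout values are illustrative; long-running AI operations may need several minutes):

```nginx
location / {
    proxy_pass http://localhost:80;
    # Raise the 60s defaults so long AI operations don't return 504
    proxy_read_timeout 300s;
    proxy_send_timeout 300s;
    proxy_connect_timeout 75s;
}
```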
Migration from v0.1.2 or earlier (SQLite/ChromaDB):
- IMPORTANT: This is a breaking change; no automatic migration is available
- You must start fresh with `docker compose down -v` to remove the old volumes
- Re-run the setup script to generate a new `.env` with secure passwords
- All data will be lost; back up any important documents first
- After migration, all data is stored in PostgreSQL with pgvector
Planning/Outline Generation Errors (Local LLMs or Large Research Tasks)?
- If you encounter errors during outline generation or planning phases with the Planning Agent
- This often happens with local LLMs or when processing extensive research with many notes
- Solution: Reduce the "Planning Context" parameter in Settings → Research Parameters
- Navigate to Settings → Research tab → Content Processing Limits section
- Decrease "Planning Context" from default 200,000 to a lower value (e.g., 100,000 or 50,000)
- This splits large planning tasks into smaller, more manageable batches
- Particularly important for local LLMs with smaller context windows
MAESTRO includes automatic GPU detection and configuration for optimal performance across different platforms.
For systems without GPU or when GPU support is problematic, you have two options:
Option 1: Use the CPU-only Docker Compose file (Recommended)
```bash
docker compose -f docker-compose.cpu.yml up -d
```
This is the cleanest approach - it completely removes GPU dependencies from your containers, making them smaller and preventing any GPU-related errors. Perfect for dedicated CPU-only setups or servers without GPUs.
Option 2: Set FORCE_CPU_MODE in your .env file
```bash
# Add to your .env file
FORCE_CPU_MODE=true

# Then use the regular docker compose
docker compose up -d
```
This approach keeps the GPU libraries in the container but tells the application to ignore them. It's convenient for development when you might want to switch between CPU and GPU modes without changing compose files.
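Illustratively, an application can honor such a flag by checking the environment before probing for a GPU. This is a sketch of the pattern, not MAESTRO's actual code:

```python
import os

def select_device() -> str:
    """Return "cpu" or "cuda", honoring a FORCE_CPU_MODE override."""
    if os.environ.get("FORCE_CPU_MODE", "").lower() in ("1", "true", "yes"):
        return "cpu"  # user explicitly disabled GPU use
    try:
        import torch  # only consulted when the override is absent
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"

os.environ["FORCE_CPU_MODE"] = "true"
device = select_device()
```

Checking the override first means the GPU libraries are never even imported when CPU mode is forced, which avoids CUDA initialization errors on machines without a GPU.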
```bash
# Linux/macOS: automatic platform detection
./start.sh

# Windows: PowerShell
.\start.sh

# Stop services
./stop.sh   # or .\stop.sh on Windows
```
Linux with NVIDIA GPU:
- ✅ Full GPU acceleration with automatic detection
- ✅ Multi-GPU support with load distribution
- Requirements: `nvidia-container-toolkit` installed
- Setup: GPU support is automatically enabled when detected
macOS (Apple Silicon & Intel):
- ✅ CPU-optimized performance (no GPU runtime needed)
- ✅ Optimized for both Apple Silicon and Intel Macs
- ✅ Full compatibility through Docker Desktop
Windows with WSL2:
- ✅ GPU support through WSL2 and nvidia-container-toolkit
- ✅ Compatible with NVIDIA GPUs
- Requirements: WSL2 with GPU support enabled
For advanced users, GPU settings can be configured manually in `.env`:
```bash
# GPU device assignment (0, 1, 2, etc.)
BACKEND_GPU_DEVICE=0
DOC_PROCESSOR_GPU_DEVICE=0
CLI_GPU_DEVICE=0
GPU_AVAILABLE=true
```
- Multi-GPU: Assign different services to different GPUs
- CPU Mode: Still performant for development and smaller workloads
- Memory: Monitor GPU memory with `nvidia-smi`
```bash
# Check GPU detection
./detect_gpu.sh

# Test GPU in Docker
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

# View service logs
docker compose logs backend
```
Windows Users: For detailed Windows setup instructions, troubleshooting, and CLI usage, see WINDOWS_SETUP.md.
Database Reset and Consistency Tools
MAESTRO uses PostgreSQL with pgvector extension for all data storage including vector embeddings. The system includes powerful CLI tools for database management and consistency checking.
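To illustrate what pgvector provides (the table and column names here are hypothetical, not MAESTRO's actual schema): embeddings live in a `vector` column and are searched with distance operators such as `<->` (L2 distance):

```sql
-- Enable the extension (done automatically in the Docker setup)
CREATE EXTENSION IF NOT EXISTS vector;

-- Hypothetical chunk table with a 384-dimensional embedding
CREATE TABLE chunks (
    id        bigserial PRIMARY KEY,
    content   text,
    embedding vector(384)
);

-- Nearest-neighbor search: the 5 chunks closest to a query embedding
SELECT content
FROM chunks
ORDER BY embedding <-> '[0.11, -0.42, ...]'::vector
LIMIT 5;
```

Keeping the vectors in the same database as the relational data is what allowed the separate ChromaDB store to be retired.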
```bash
# Check database status and consistency
./maestro-cli.sh reset-db --check

# Get database statistics
./maestro-cli.sh reset-db --stats

# Reset all databases (with backup)
./maestro-cli.sh reset-db --backup

# Check system-wide document consistency
python maestro_backend/cli_document_consistency.py system-status

# Clean up orphaned documents
python maestro_backend/cli_document_consistency.py cleanup-all

# Check a specific user's documents
python maestro_backend/cli_document_consistency.py check-user <user_id>
```
- Database Reset: Complete fresh start, removes ALL data
- Consistency Tools: Targeted cleanup, preserves valid data
- Automatic Monitoring: Built-in system runs every 60 minutes
For detailed instructions and advanced usage, see README_DATABASE_RESET.md
MAESTRO is built on a modern, decoupled architecture:
- Backend: A robust API built with FastAPI that handles user authentication, mission control, agentic logic, and the RAG pipeline.
- Frontend: A dynamic and responsive single-page application built with React, Vite, and TypeScript, using Tailwind CSS for styling.
- Real-time Communication: WebSockets stream live updates, logs, and status changes from the backend to the frontend.
- Database: PostgreSQL with pgvector extension for all data storage, SQLAlchemy ORM for database management.
- Containerization: Docker Compose orchestrates the multi-service application for reliable deployment.
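The update messages themselves are JSON payloads pushed over the WebSocket. The exact schema is internal to MAESTRO, but a hypothetical status event might look like this (all field names are illustrative):

```python
import json

# Hypothetical example of a status update streamed to the frontend;
# the real message schema is internal to MAESTRO.
raw = (
    '{"type": "agent_status", "mission_id": "m-42", '
    '"agent": "research", "status": "running", "detail": "searching web"}'
)
msg = json.loads(raw)
```

Pushing events like this, rather than having the frontend poll, is what makes the live mission tracking feel instantaneous.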
MAESTRO can be configured for a completely self-hosted environment. It supports local, OpenAI-compatible API models, allowing you to run your own LLMs. For web searches, it integrates with SearXNG, a private and hackable metasearch engine, ensuring that your entire research workflow can remain on your own hardware.
If you choose to use SearXNG as your search provider, ensure your SearXNG instance is properly configured:
Required Configuration:
- Your SearXNG instance must support JSON output format
- Add `- json` to the `format` section of your SearXNG settings, after `- html`
Example SearXNG settings.yml configuration:
```yaml
search:
  format:
    - html
    - json # <- This line is required for MAESTRO integration
```
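Once JSON output is enabled, a client can query SearXNG's standard `/search` endpoint with `format=json`; the instance URL below is a placeholder:

```python
from urllib.parse import urlencode

def build_searxng_url(base: str, query: str, categories: str = "general") -> str:
    """Build a SearXNG JSON search URL (the base URL is a placeholder)."""
    params = {"q": query, "format": "json", "categories": categories}
    return f"{base}/search?{urlencode(params)}"

url = build_searxng_url("http://localhost:8080", "pgvector indexing", "science")
# Fetching this URL returns JSON with the hits under a "results" key.
```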
Available Search Categories: MAESTRO supports the following SearXNG categories, which you can configure in the Settings > Search section:
- General (default)
- Science
- IT
- News
- Images
- Videos
- Music
- Files
- Map
- Social Media
You can select multiple categories to refine your search results based on your research needs.
For advanced users and administrators, a powerful Command Line Interface (CLI) is available for bulk document ingestion, user management, and other administrative tasks. See CLI_GUIDE.md for complete documentation.
Linux/macOS:
```bash
./maestro-cli.sh help
./maestro-cli.sh create-user researcher mypass123
./maestro-cli.sh ingest researcher ./documents
```
Windows:
```powershell
# Using PowerShell (recommended)
.\maestro-cli.ps1 help
.\maestro-cli.ps1 create-user researcher mypass123
.\maestro-cli.ps1 ingest researcher .\documents

# Or using Command Prompt
maestro-cli.bat help
maestro-cli.bat create-user researcher mypass123
maestro-cli.bat ingest researcher .\documents
```
- USER_GUIDE.md - Detailed guide for configuring and using MAESTRO's features
- CLI_GUIDE.md - Comprehensive command-line interface documentation
- DOCKER.md - Complete Docker setup and deployment instructions
- WINDOWS_SETUP.md - Windows-specific installation guide
This project is dual-licensed:
- GNU Affero General Public License v3.0 (AGPLv3): MAESTRO is offered under the AGPLv3 as its open-source license.
- Commercial License: For users or organizations who cannot comply with the AGPLv3, a separate commercial license is available. Please contact the maintainers for more details.
Feedback, bug reports, and feature suggestions are highly valuable. Please feel free to open an Issue on the GitHub repository.