Enterprise-ready vector database toolkit for building searchable knowledge bases from multiple data sources. Supports multi-project management, automatic ingestion from Confluence/JIRA/Git, intelligent file conversion (PDF/Office/images), and semantic search. Includes MCP server for seamless AI assistant integration in development environments.


QDrant Loader


📋 Release Notes v0.4.7 - Latest improvements and bug fixes (June 9, 2025)

A comprehensive toolkit for loading data into Qdrant vector database with advanced MCP server support for AI-powered development workflows.

🎯 What is QDrant Loader?

QDrant Loader is a powerful data ingestion and retrieval system that bridges the gap between your technical content and AI development tools. It collects, processes, and vectorizes content from multiple sources, then provides intelligent search capabilities through a Model Context Protocol (MCP) server.

Perfect for:

  • πŸ€– AI-powered development with Cursor, Windsurf, and GitHub Copilot
  • πŸ“š Knowledge base creation from scattered documentation
  • πŸ” Intelligent code assistance with contextual documentation
  • 🏒 Enterprise content integration from Confluence, JIRA, and Git repositories

📦 Packages

This monorepo contains two complementary packages:

🔄 QDrant Loader

Data ingestion and processing engine

Collects and vectorizes content from multiple sources into QDrant vector database.

Key Features:

  • Multi-source connectors: Git, Confluence (Cloud & Data Center), JIRA (Cloud & Data Center), Public Docs, Local Files
  • Advanced file conversion: 20+ file types including PDF, Office docs, images with AI-powered processing
  • Intelligent chunking: Smart document processing with metadata extraction
  • Incremental updates: Change detection and efficient synchronization
  • Flexible embeddings: OpenAI, local models, and custom endpoints

🔌 QDrant Loader MCP Server

AI development integration layer

Model Context Protocol server providing RAG capabilities to AI development tools.

Key Features:

  • MCP protocol compliance: Full integration with Cursor, Windsurf, and Claude Desktop
  • Advanced search tools: Semantic, hierarchy-aware, and attachment-focused search
  • Confluence intelligence: Deep understanding of page hierarchies and relationships
  • File attachment support: Comprehensive attachment discovery with parent document context
  • Real-time processing: Streaming responses for large result sets

🚀 Quick Start

Installation

# Install both packages
pip install qdrant-loader qdrant-loader-mcp-server

# Or install individually
pip install qdrant-loader          # Data ingestion only
pip install qdrant-loader-mcp-server  # MCP server only

5-Minute Setup

  1. Create a workspace

    mkdir my-qdrant-workspace && cd my-qdrant-workspace
  2. Download configuration templates

    curl -o config.yaml https://raw.githubusercontent.com/martin-papy/qdrant-loader/main/packages/qdrant-loader/conf/config.template.yaml
    curl -o .env https://raw.githubusercontent.com/martin-papy/qdrant-loader/main/packages/qdrant-loader/conf/.env.template
  3. Configure your environment (edit .env)

    QDRANT_URL=http://localhost:6333
    QDRANT_COLLECTION_NAME=my_docs
    OPENAI_API_KEY=your_openai_key
  4. Configure data sources (edit config.yaml)

    sources:
      git:
        - url: "https://github.com/your-org/your-repo.git"
          branch: "main"
  5. Load your data

    qdrant-loader --workspace . init
    qdrant-loader --workspace . ingest
  6. Start the MCP server

    mcp-qdrant-loader
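
The single Git source in step 4 is the simplest case; config.yaml can declare several connectors side by side. The keys below are illustrative only — the downloaded config.template.yaml is the authoritative reference for each connector's schema:

```yaml
sources:
  git:
    - url: "https://github.com/your-org/your-repo.git"
      branch: "main"
  confluence:
    - base_url: "https://your-org.atlassian.net/wiki"  # illustrative keys
      space_key: "DOCS"
  localfile:
    - base_path: "./internal-docs"
```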

🎉 You're ready! Your content is now searchable through AI development tools.

🔧 Integration Examples

Cursor IDE Integration

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "qdrant-loader": {
      "command": "/path/to/venv/bin/mcp-qdrant-loader",
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "QDRANT_COLLECTION_NAME": "my_docs",
        "OPENAI_API_KEY": "your_key",
        "MCP_DISABLE_CONSOLE_LOGGING": "true"
      }
    }
  }
}

Example Queries in Cursor

  • "Find documentation about authentication in our API"
  • "Show me examples of error handling patterns"
  • "What are the deployment requirements for this service?"
  • "Find all attachments related to database schema"

πŸ“ Project Structure

qdrant-loader/
├── packages/
│   ├── qdrant-loader/            # Core data ingestion package
│   └── qdrant-loader-mcp-server/ # MCP server for AI integration
├── docs/                         # Comprehensive documentation
├── website/                      # Documentation website generator
└── README.md                     # This file

📚 Documentation

🚀 Getting Started

👥 For Users

🛠️ For Developers

📦 Package Documentation

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details on:

  • Setting up the development environment
  • Code style and standards
  • Pull request process
  • Issue reporting guidelines

Quick Development Setup

# Clone the repository
git clone https://github.com/martin-papy/qdrant-loader.git
cd qdrant-loader

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e packages/qdrant-loader[dev]
pip install -e packages/qdrant-loader-mcp-server[dev]

# Run tests
pytest

🆘 Support

📄 License

This project is licensed under the GNU GPLv3 - see the LICENSE file for details.


Ready to supercharge your AI development workflow? Start with our Quick Start Guide or explore the complete documentation.
