Developed by NexuSecurus™
# LangChain-Qdrant Multi-Source API
Langdrant is an open-source API backend for semantic document search, powered by Qdrant and modern embedding models: a high-performance FastAPI server for multi-source data ingestion, semantic search, and LLM-powered query generation. Ingest, embed, and search your documents through a simple REST API, and use it to build AI-powered search, knowledge bases, and more. It integrates Qdrant as the vector database and Ollama for embeddings and LLM completions, and supports asynchronous ingestion, streaming responses, and structured n8n-ready outputs.
## Features

- **Multi-source ingestion**
  - Generic text
  - Files: PDF, DOCX, HTML, TXT
  - Logs
  - Database rows (PostgreSQL)
  - RSS feeds
  - Social media posts
- **Vector search**
  - Semantic search with optional LLM-based context
  - Hybrid queries (vector similarity + keyword filters)
  - Multi-collection search
- **LLM generation**
  - Context-aware completions
  - n8n-ready structured responses (`summary` and `canonical_embedding_text`)
  - Streaming responses via SSE for conversational endpoints
- **Collection management**
  - List collections with vector counts
  - Delete collections safely
- **Debug endpoints**
  - Text chunk preview
  - Embedding inspection
- **Security**
  - API key enforcement for all endpoints
- **Deployment-ready**
  - Dockerized with FastAPI & Uvicorn
  - Configurable via `.env`
  - Multi-worker async ingestion
## Requirements

### Hardware

- CPU: x86_64 or ARM architecture
- RAM: minimum 4 GB (8 GB recommended for optimal performance)
- Disk: minimum 10 GB free space
- Internet connection: required for fetching dependencies, accessing APIs, and pulling container images
- Ollama instance with available models (local or remote)
### Software

- **Git**: required to clone the repository and manage version control

  ```bash
  sudo apt install git    # Linux
  brew install git        # macOS
  ```

- **Python 3.11+**: required for running locally or for building Docker images

  ```bash
  python3 --version
  ```

- **pip**: Python package manager used to install dependencies

- **Docker**: required for containerized deployment

  ```bash
  docker --version
  ```

- **Docker Compose**: for multi-container orchestration

  ```bash
  docker compose version
  ```

- **PostgreSQL client libraries**: required if you intend to use the database ingestion endpoints (`psycopg2-binary` is included in the Python dependencies)

- **Optional system packages** for file ingestion (PDF/DOCX/HTML parsing, OCR); see the install sketch after this list:
  - `poppler-utils` (PDF text extraction)
  - `libreoffice` (DOCX conversion, optional)
  - `tesseract-ocr` (OCR for scanned documents)

- **cURL** or another HTTP client for testing API endpoints

- **jq** (optional): makes JSON output prettier and more readable

- **n8n** (optional): for automation workflows and for connecting ingestion and query endpoints
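On Debian/Ubuntu-based systems, the optional file-ingestion packages above can be installed in one step (macOS users would use Homebrew equivalents where available):

```bash
# Optional packages for PDF/DOCX/HTML parsing and OCR (Debian/Ubuntu)
sudo apt install poppler-utils libreoffice tesseract-ocr
```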
## Installation

Clone the repository:

```bash
git clone https://github.com/nexusecurus/langdrant.git
cd langdrant
```
Copy `.env.example` to `.env` and configure your environment:

```bash
cp .env.example .env
```
Auto-generate an API key (optional):

```bash
python3 langserver/api-generator.py
```

This generates an `API_KEY` and updates it in the `.env` file automatically.
Customize the variables to your preference:

```bash
nano .env
```
| Variable | Description | Default |
|---|---|---|
| `API_KEY` | API key for FastAPI endpoints | `""` |
| `API_PORT` | FastAPI port | `8000` |
| `LOG_LEVEL` | Logging level | `INFO` |
| `QDRANT_URL` | Qdrant server URL | `http://127.0.0.1:6333` |
| `QDRANT_API_KEY` | Optional Qdrant API key | `""` |
| `VECTOR_SIZE` | Embedding vector size | `1536` |
| `OLLAMA_BASE_URL` | Ollama server URL | `http://127.0.0.1:11434` |
| `EMBED_MODEL` | Ollama embedding model | `nomic-embed-text` |
| `LLM_MODEL` | Ollama LLM model | `llama3:8b` |
| `LLM_CTX` | LLM context window | `4096` |
| `LLM_MAX_TOKENS` | Max tokens per generation | `300` |
| `DB_HOST`/`PORT`/`NAME`/`USER`/`PASSWORD` | PostgreSQL connection | - |
| `CHUNK_SIZE` | Text chunk size | `800` |
| `CHUNK_OVERLAP` | Overlap per chunk | `120` |
| `EMBED_BATCH_SIZE` | Batch size for embedding | `64` |
### Docker deployment

```bash
docker compose build --no-cache
docker compose pull
docker compose up -d
```
### Local deployment

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd langserver
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
```
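Once the server is running (via Docker or locally), a quick smoke test confirms it responds. Since API keys are enforced on all endpoints, a key header is included here; the header name `X-API-Key` is an assumption, so check ENDPOINTS.md for the exact header your build expects:

```bash
# Smoke test against the health endpoint; set $API_KEY to the key from
# your .env. The X-API-Key header name is an assumption.
curl -H "X-API-Key: $API_KEY" http://127.0.0.1:8000/health
```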
## API Endpoints

> For a full detailed description of all API endpoints, request/response examples, default values, and n8n integration examples, please check ENDPOINTS.md.
### LLM generation

```
POST /generate
POST /chat
```

- `/generate`: Generates text using an LLM model. Returns structured n8n-ready data (`summary` and `canonical_embedding_text`). Use `model`, `max_tokens`, and `num_ctx` to override defaults.
- `/chat`: Conversational endpoint. Accepts a list of messages and supports streaming mode via SSE. Returns the assistant's response. Flags: `model`, `max_tokens`, `num_ctx`, `stream`.
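As a rough sketch, calls to these endpoints might look like the following. The body field names (`prompt`, `messages`) and the `X-API-Key` header are assumptions for illustration, not the confirmed schema; see ENDPOINTS.md for the real request format:

```bash
# Hypothetical /generate call; the "prompt" field name is illustrative.
curl -X POST http://127.0.0.1:8000/generate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $API_KEY" \
  -d '{"prompt": "Summarize Qdrant in one sentence.", "max_tokens": 150}'

# Hypothetical streaming /chat call; -N disables curl buffering so SSE
# chunks print as they arrive.
curl -N -X POST http://127.0.0.1:8000/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $API_KEY" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}], "stream": true}'
```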
### Ingestion

Async ingestion of multiple data sources:

```
POST /ingest_texts      # Generic text
POST /ingest_file       # File upload
POST /ingest_logs       # Logs
POST /ingest_db         # Database rows
POST /ingest_rss        # RSS feeds
POST /ingest_social     # Social posts
POST /fetch_rss_feeds   # Fetch and ingest RSS feeds
```
- Supports deterministic ID generation for deduplication
- Text is chunked with configurable size and overlap
- Batch embedding and upsert into Qdrant
- Preserves full metadata (source, timestamp, platform, etc.)
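A generic-text ingestion call might look like the sketch below; the `collection` and `texts` field names are assumptions chosen for illustration, so consult ENDPOINTS.md for the actual request body:

```bash
# Hypothetical /ingest_texts call; field names are illustrative.
curl -X POST http://127.0.0.1:8000/ingest_texts \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $API_KEY" \
  -d '{"collection": "docs", "texts": ["First document...", "Second document..."]}'
```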
### Query

```
POST /query         # Single collection
POST /query_hybrid  # Hybrid semantic + keyword filters
POST /query_multi   # Multi-collection search
```
- Returns top-K nearest vectors
- Optional LLM answer generation from retrieved context
- Keyword filters, recency boosts, and hybrid queries supported
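A single-collection query might look like this sketch; `collection`, `query`, and `top_k` are assumed field names for illustration:

```bash
# Hypothetical /query call returning the top-K nearest vectors.
curl -X POST http://127.0.0.1:8000/query \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $API_KEY" \
  -d '{"collection": "docs", "query": "How do I rotate API keys?", "top_k": 5}'
```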
### Collections

```
GET  /collections
POST /collections/delete
```

- List collections with vector counts
- Delete a collection safely
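From the shell, listing and deleting collections might look like the calls below; the delete request field (`collection`) is an assumption:

```bash
# List collections with vector counts (jq pretty-prints the JSON).
curl -H "X-API-Key: $API_KEY" http://127.0.0.1:8000/collections | jq

# Hypothetical delete call; the request field name is illustrative.
curl -X POST http://127.0.0.1:8000/collections/delete \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $API_KEY" \
  -d '{"collection": "docs"}'
```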
### Debug

```
POST /debug/chunk
POST /debug/embeds
```

- Preview how text is chunked
- Inspect embeddings before insertion
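To see how `CHUNK_SIZE` and `CHUNK_OVERLAP` split a document before committing it, a chunk-preview call might look like this (the `text` field name is an assumption):

```bash
# Hypothetical /debug/chunk call to preview chunk boundaries.
curl -X POST http://127.0.0.1:8000/debug/chunk \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $API_KEY" \
  -d '{"text": "A long document to preview chunking for..."}'
```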
### Health

```
GET /health
GET /ping
```

- Returns server status
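For scripted liveness checks, `/ping` can be polled and pretty-printed with `jq`; as elsewhere in these examples, the `X-API-Key` header name is an assumption:

```bash
# Liveness probe, pretty-printed with jq.
curl -s -H "X-API-Key: $API_KEY" http://127.0.0.1:8000/ping | jq
```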
## Contributing

- Fork the repository
- Create your feature branch
- Submit pull requests with clear descriptions
- Ensure all new endpoints have tests and documentation