⚡ All-in-one, fully-local LLM retrieval orchestration: parse, embed, search, deploy, done.
Kolosal AI Retrieval Management System is an integrated, all-in-one solution that combines a powerful LLM inference engine, an embedding engine, a robust document parser, a high-performance vector database, built-in internet search, and a centralized management dashboard that orchestrates all components. This unified platform covers over 90% of typical LLM integration requirements while keeping deployment fully local and preserving privacy and data control.
- MCP (Model Context Protocol)
- RAG (Retrieval-Augmented Generation)
- LLM & Embeddings Inference
- AI Agents
- Internet Search
This system integrates the following components:
- Kolosal AI Inference Engine - LLM inference server (docker branch)
- Kolosal RMS Dashboard - Web-based management interface
- Kolosal RMS MarkItDown - Document parsing service
- Qdrant Vector Database - High-performance vector database
- SearXNG Docker - Privacy-respecting search engine
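Each component runs as its own container, so you can also bring up only the pieces you need. A minimal sketch, assuming the compose service names roughly match the components above (the names `qdrant` and `searxng` here are assumptions; check docker-compose.yml for the actual ones):

```bash
# List the service names actually defined in docker-compose.yml
docker-compose config --services

# Start only the vector database and the search engine
# (service names assumed; substitute the real ones from above)
docker-compose up -d qdrant searxng
```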
The easiest way to get started requires just two commands:

```bash
# 1. Clone this repository
git clone https://github.com/KolosalAI/Retrieval-Management-System.git
cd Retrieval-Management-System

# 2. Start all services
docker-compose up -d
```
All services will be automatically built from their respective GitHub repositories. No setup scripts or submodules needed!
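Once the stack is up, a quick way to confirm that all containers started, using standard Docker Compose commands:

```bash
# Show the status of every service in this compose file
docker-compose ps

# Tail the logs of one service if something looks unhealthy
# (replace `qdrant` with any name from `docker-compose config --services`)
docker-compose logs -f qdrant
```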
Once the containers are running, the services are available at:
- Dashboard: http://localhost:3000
- Kolosal Server (AI Inference): http://localhost:8084
- Qdrant Vector DB: http://localhost:6333
- SearXNG Search: http://localhost:8090
- MarkItDown API: http://localhost:8001
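With the endpoints above you can run a few smoke tests. The Qdrant call below uses its documented REST API; the SearXNG JSON format and the Kolosal chat route are assumptions, so adapt them to your configuration:

```bash
# Qdrant: list collections (documented REST endpoint)
curl http://localhost:6333/collections

# SearXNG: run a search; JSON output assumes `json` is among the
# allowed formats in SearXNG's settings.yml
curl "http://localhost:8090/search?q=kolosal&format=json"

# Kolosal Server: assumed OpenAI-compatible chat endpoint;
# check the server docs for the actual route and payload
curl http://localhost:8084/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "messages": [{"role": "user", "content": "Hello"}]}'
```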
For detailed Docker setup instructions, see DOCKER_SETUP.md.
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.