AI WebHub is a high-performance, fully self-hosted AI platform for managing, running, and interacting with large language models (LLMs). Designed to work entirely offline, it supports Ollama and OpenAI-compatible APIs and provides a built-in inference engine for Retrieval-Augmented Generation (RAG), enabling enterprise-grade AI deployments.
- Seamless Deployment: Docker, Docker Compose, Kubernetes (Helm/Kustomize), and native Python installation for flexible deployment.
- Multi-LLM Support: Integrate Ollama, OpenAI APIs, LMStudio, GroqCloud, Mistral, OpenRouter, and more.
- Offline & Secure: Fully functional offline mode; supports granular RBAC and SCIM 2.0 provisioning.
- Modern UI/UX: Responsive, PWA-ready Web UI with Markdown, LaTeX, and live multimedia support.
- Python Function Integration: Bring Your Own Function (BYOF) for custom Python tools inside the LLM workspace (see the sketch after this list).
- RAG & Web Integration: Local document RAG, web searches (SearXNG, Google, DuckDuckGo, Bing), and live web content injection.
- Model Builder & Management: Create, import, and manage Ollama models via a clean Web UI.
- Image Generation: Integrates AUTOMATIC1111, ComfyUI (local), or DALL-E for rich visual AI content.
- Pipelines & Plugins: Extend AI WebHub with Python plugins and custom pipelines.
- Multilingual Support: Full i18n support for global accessibility.
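As a taste of BYOF, here is a minimal sketch of a custom Python tool. The `Tools` class convention, and the idea that each public method is exposed to the model as a callable tool, are illustrative assumptions rather than AI WebHub's documented API:

```python
# Hypothetical BYOF tool sketch. The `Tools` class convention and automatic
# registration are assumptions for illustration, not a documented API.
import datetime


class Tools:
    def get_current_time(self, timezone_offset_hours: int = 0) -> str:
        """Return the current UTC time shifted by the given offset.

        :param timezone_offset_hours: hours to add to UTC (e.g. -5 for EST).
        """
        now = datetime.datetime.now(datetime.timezone.utc)
        shifted = now + datetime.timedelta(hours=timezone_offset_hours)
        return shifted.strftime("%Y-%m-%d %H:%M:%S")
```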
Requires Python 3.11+:

```bash
pip install ai-webhub
```

Start the server:

```bash
ai-webhub serve
```

Then open the Web UI at http://localhost:8080.
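Once the server is running, you can confirm it is reachable from a script. The snippet below is a plain HTTP check against the default address shown above and assumes nothing about AI WebHub's API surface:

```python
# Quick reachability check for a local AI WebHub instance (default port 8080).
import urllib.request

try:
    with urllib.request.urlopen("http://localhost:8080", timeout=5) as resp:
        print(f"AI WebHub is up (HTTP {resp.status})")
except OSError as exc:
    print(f"Server not reachable: {exc}")
```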
Alternatively, deploy with Docker. To connect to an Ollama instance running on your host machine (the `--add-host` flag lets the container reach it via `host.docker.internal`):

```bash
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v ai-webhub:/app/backend/data \
  --name ai-webhub \
  --restart always \
  ghcr.io/karianne50m/ai-webhub:main
```
For a single-container setup with Ollama bundled in, with GPU support via `--gpus all`:

```bash
docker run -d -p 3000:8080 --gpus all \
  -v ollama:/root/.ollama \
  -v ai-webhub:/app/backend/data \
  --name ai-webhub \
  --restart always \
  ghcr.io/karianne50m/ai-webhub:ollama
```
To use OpenAI API models only, pass your API key:

```bash
docker run -d -p 3000:8080 \
  -e OPENAI_API_KEY=your_secret_key \
  -v ai-webhub:/app/backend/data \
  --name ai-webhub \
  --restart always \
  ghcr.io/karianne50m/ai-webhub:main
```

With any of these commands, the Web UI is served at http://localhost:3000.
To run fully offline, also set `HF_HUB_OFFLINE=1` so the Hugging Face Hub client never attempts network access:

```bash
export HF_HUB_OFFLINE=1
```
- Server Connection Errors: If the container cannot reach Ollama on localhost, run it with `--network=host` (the UI is then served directly on port 8080 rather than a mapped port).
- Docker Updates: Use Watchtower for automatic container updates.
- Join our Discord community for real-time support.
- RAG Document Integration: Load documents and query them via `#doc_name`.
- Web & Image Content Injection: Integrate live web content and images dynamically.
- Multi-Model Conversations: Simultaneously interact with multiple LLMs.
- Role-Based Access Control (RBAC): Restrict model creation and access to specific users.
- Pipelines & Plugins: Automate workflows, rate limiting, monitoring, and real-time translation (a rate-limiting sketch follows below).
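To illustrate the kind of plugin the pipeline system is meant for, here is a hypothetical rate-limiting filter. The class shape and the `inlet` hook name are assumptions for illustration only, not AI WebHub's documented plugin interface:

```python
# Hypothetical rate-limiting pipeline sketch. The class shape and the `inlet`
# hook are illustrative assumptions, not a documented plugin API.
import time
from collections import defaultdict, deque


class RateLimitPipeline:
    """Reject requests from a user exceeding `max_requests` per `window_s`."""

    def __init__(self, max_requests: int = 10, window_s: float = 60.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self.history: dict[str, deque] = defaultdict(deque)

    def inlet(self, body: dict, user_id: str) -> dict:
        now = time.monotonic()
        window = self.history[user_id]
        # Drop timestamps that have aged out of the sliding window.
        while window and now - window[0] > self.window_s:
            window.popleft()
        if len(window) >= self.max_requests:
            raise RuntimeError("Rate limit exceeded; try again later.")
        window.append(now)
        return body  # Pass the request through unchanged.
```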
Check the roadmap to explore upcoming features. Contributions are welcome: report issues or submit PRs on GitHub.
This repository uses a BSD-3-Clause-style license with an additional clause preserving the AI WebHub branding. See LICENSE for full details.
Connect, collaborate, and contribute via Discord.