A context-only Retrieval-Augmented Generation (RAG) chatbot built with FastAPI. It allows users to upload documents, which are ingested into a hybrid retrieval pipeline (BM25 + dense + cross-encoder reranking + HyDE). When a user asks a question, the system retrieves relevant chunks and uses a local LLM to generate grounded responses based solely on those chunks.
## Features

- 📁 Multi-format ingestion: Supports `.pdf`, `.docx`, `.txt`, `.md`
- 📚 Hybrid retrieval: Combines BM25, dense embeddings, HyDE generation, and RRF fusion (see the sketch after this list)
- 🧠 Context-only LLM: Answers strictly from the provided documents, minimizing hallucination
- 🧾 Cross-encoder reranking: Improves retrieval quality
- 🔁 Auto reindexing on document upload/delete
- 🖼️ Frontend support: Serves static HTML/JS/CSS
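
The fusion step merges the separate rankings (from BM25, dense retrieval, and the HyDE-expanded query) into a single list. Below is a minimal sketch of Reciprocal Rank Fusion, assuming each retriever returns chunk IDs ordered best-first; the function name and inputs are illustrative, not this repo's actual API:

```python
# Illustrative RRF sketch -- not the repo's actual fusion code.
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several best-first ranked lists of chunk IDs into one ranking.

    k is the RRF smoothing constant; 60 is the value from the original paper.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # earlier rank => larger share
    return sorted(scores, key=scores.get, reverse=True)

# "c3" tops both lists, so it wins; "c7" edges out "c1" by appearing in both.
print(rrf_fuse([["c3", "c1", "c7"], ["c3", "c7", "c2"]]))
# ['c3', 'c7', 'c1', 'c2']
```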
## Installation

```bash
git clone https://github.com/your-username/rag-chatbot-api.git
cd rag-chatbot-api
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
```

If you prefer Conda:

```bash
conda create --name <env_name> python=<version>
conda activate <env_name>
```

Then install the dependencies:

```bash
pip install -r requirements.txt
```
## Project Structure

```text
.
├── app.py                        # FastAPI app entrypoint
├── bulk_ingest.py                # Ingests all files from ./data
├── data/                         # Uploaded documents
├── static/                       # Frontend HTML/CSS/JS
├── models/
│   └── llama_wrapper.py          # Local LLM (Mistral/OpenHermes wrapper)
├── memory/
│   ├── embedder.py               # Embedding logic (MiniLM or similar)
│   ├── memory_store.py           # Vector store (e.g., ChromaDB)
│   └── logger.py                 # JSON-based chat logger
├── utils/
│   ├── pipeline_runner.py        # HyDE and standard pipeline runners
│   └── prompt_builder.py         # Builds system+context prompts (sketched below)
├── hybrid_retrieval_pipeline.py  # Hybrid RAG retrieval logic
├── document_ingestor.py          # Ingests individual documents
├── requirements.txt
└── README.md
```
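
As an illustration of the context-only behavior, here is a minimal sketch of what a prompt builder like `utils/prompt_builder.py` might do; the template, constant, and function name are assumptions, not the file's actual contents:

```python
# Hypothetical sketch in the spirit of utils/prompt_builder.py;
# the real template and names may differ.
SYSTEM_PROMPT = (
    "Answer using ONLY the context below. If the context does not "
    "contain the answer, say that you don't know."
)

def build_prompt(question: str, chunks: list[str]) -> str:
    # Number the retrieved chunks so answers can be traced back to a source.
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return f"{SYSTEM_PROMPT}\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"
```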
## Running the Server

```bash
uvicorn app:app --reload
```

The frontend is then served at http://127.0.0.1:8000/.
## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Serves the frontend UI |
| POST | `/upload` | Uploads and ingests a document |
| POST | `/chat` | Chat with the LLM using RAG |
| GET | `/documents` | Lists all uploaded documents |
| DELETE | `/documents/{fn}` | Deletes a document and reindexes |
| POST | `/reindex` | Manually reindexes all files |
| GET | `/health` | Health check for readiness |
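
A quick client session against a locally running server might look like the following; the JSON field names (`message` and the response shape) are assumptions, so check `app.py` for the actual request/response schemas:

```python
import requests

BASE = "http://127.0.0.1:8000"

# Upload and ingest a document (hypothetical file path).
with open("data/handbook.pdf", "rb") as f:
    r = requests.post(f"{BASE}/upload", files={"file": f})
r.raise_for_status()

# Ask a question grounded in the uploaded documents.
resp = requests.post(f"{BASE}/chat", json={"message": "What does the handbook say about refunds?"})
print(resp.json())

# List documents and verify readiness.
print(requests.get(f"{BASE}/documents").json())
print(requests.get(f"{BASE}/health").json())
```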
## Models

- LLM: `openhermes-2.5-mistral-7b.Q4_K_M.gguf`, run locally via llama-cpp-python
- Embeddings: `all-MiniLM-L6-v2` via SentenceTransformers
- Cross-encoder: `ms-marco-MiniLM-L-6-v2`
- Vector store: ChromaDB (can be swapped)
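
The sketch below shows one way this stack can be loaded and exercised; the paths, parameters, and prompt are assumptions, not the exact code in `models/llama_wrapper.py` or `memory/embedder.py`:

```python
from llama_cpp import Llama
from sentence_transformers import CrossEncoder, SentenceTransformer

# Model IDs/paths as listed above; n_ctx is an assumed context window.
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
llm = Llama(model_path="openhermes-2.5-mistral-7b.Q4_K_M.gguf", n_ctx=4096)

query = "What is the refund policy?"
query_vec = embedder.encode(query)                    # dense vector for retrieval
chunk = "Refunds are issued within 30 days of purchase."
rerank_score = reranker.predict([(query, chunk)])[0]  # cross-encoder relevance

out = llm(f"Context: {chunk}\n\nQuestion: {query}\nAnswer:", max_tokens=64)
print(rerank_score, out["choices"][0]["text"])
```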