RAG-chatbot is a production-grade chatbot app that combines LLMs with retrieval-augmented generation (RAG) techniques. It leverages semantic text embeddings, vector search via Elasticsearch, and LLM integration (through the OpenAI API or open-source alternatives) to produce context-rich responses for customer service and automated content generation.
This project was built as a proof-of-concept/demo as part of my freelance work.
- LLM integration: communicate with LLMs (e.g., OpenAI GPT) for natural language understanding and generation.
- Retrieval augmentation: retrieve relevant documents from a knowledge base using vector search to augment chatbot responses (see the pipeline sketch after this list).
- Text similarity & reranking: use state-of-the-art text similarity methods to rerank retrieved content.
- Data preprocessing: clean and preprocess text and generate embeddings using modern transformer-based models.
- Modular architecture: clearly separated modules for LLM integration, vector database operations, retrieval augmentation and chatbot logic.
- Real-time chat interface: CLI for immediate testing and demonstration.
- Testing suite: unit tests covering major modules and components.
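To make the flow concrete, here is a minimal sketch of how such a pipeline fits together, assuming the `sentence-transformers`, `elasticsearch`, and `openai` packages. The index name, field names, and models are illustrative assumptions, not the project's actual API, and the reranking step is omitted for brevity:

```python
# Minimal RAG pipeline sketch: embed the query, retrieve the nearest
# documents from Elasticsearch via kNN search, then generate an answer
# grounded in the retrieved context. All names here are illustrative.
from elasticsearch import Elasticsearch
from openai import OpenAI
from sentence_transformers import SentenceTransformer

es = Elasticsearch("http://localhost:9200")
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
llm = OpenAI()  # reads OPENAI_API_KEY from the environment


def answer(query: str, top_k: int = 5) -> str:
    # 1. Embed the query with the same model used to index the documents.
    query_vector = embedder.encode(query).tolist()

    # 2. Vector search: fetch the top_k nearest documents.
    hits = es.search(
        index="rag_documents",
        knn={
            "field": "embedding",
            "query_vector": query_vector,
            "k": top_k,
            "num_candidates": 50,
        },
    )["hits"]["hits"]
    context = "\n\n".join(hit["_source"]["text"] for hit in hits)

    # 3. Generation: let the LLM answer using the retrieved context.
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return response.choices[0].message.content
```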
For a detailed description of the system architecture, please refer to docs/architecture.md.
- Clone the repository:

  ```bash
  git clone git@github.com:avrtt/RAG-chatbot.git
  cd RAG-chatbot
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate
  ```

- Install the dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure the application: update `config.py` with your API keys, Elasticsearch settings and other configuration parameters as needed (a sketch of a possible layout follows this list).
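For orientation, a `config.py` for a setup like this might look as follows; the variable names and defaults are assumptions, not the project's actual settings:

```python
# config.py -- hypothetical layout; variable names and defaults are
# illustrative, not the project's actual settings.
import os

# LLM provider (OpenAI API by default)
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")
LLM_MODEL = "gpt-4o-mini"

# Elasticsearch connection
ELASTICSEARCH_URL = "http://localhost:9200"
ELASTICSEARCH_INDEX = "rag_documents"

# Embedding model used for both documents and queries
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"

# Number of documents to retrieve per query
TOP_K = 5
```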
Start the chatbot by running:

```bash
python main.py
```
You will be presented with a CLI prompt where you can enter queries. The chatbot processes your input, retrieves relevant documents and generates a context-aware response.
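The loop behind this is roughly as follows; this sketch reuses the hypothetical `answer` helper from the pipeline sketch above and is not the actual contents of `main.py`:

```python
# Hypothetical CLI loop; the real main.py may be structured differently.
def main() -> None:
    print("RAG-chatbot -- type 'exit' to quit.")
    while True:
        query = input("> ").strip()
        if query.lower() in {"exit", "quit"}:
            break
        if not query:
            continue
        print(answer(query))  # `answer` as sketched above


if __name__ == "__main__":
    main()
```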
Run the full test suite using pytest:

```bash
pytest
```
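As an illustration of what a unit test in the suite might cover, here is a self-contained sketch; the helper under test (`build_context`) is an assumed name, not the project's API:

```python
# Hypothetical unit test sketch; build_context is an illustrative
# helper that joins retrieved document texts into a prompt context.
def build_context(hits: list[dict]) -> str:
    return "\n\n".join(hit["_source"]["text"] for hit in hits)


def test_build_context_joins_document_texts():
    hits = [
        {"_source": {"text": "first doc"}},
        {"_source": {"text": "second doc"}},
    ]
    assert build_context(hits) == "first doc\n\nsecond doc"
```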
Contributions are welcome.
Licensed under the MIT License.