This is a Retrieval-Augmented Generation (RAG) chatbot built with Pinecone for vector search, local embeddings from Sentence Transformers, and Llama as the LLM. The chatbot retrieves relevant context from a document store and generates responses using a local Llama model. I am currently experimenting with different models to improve CPU inference and accuracy.
- Retrieval-Augmented Generation (RAG): Retrieves relevant documents from Pinecone before generating responses.
- Local Embeddings: Uses sentence-transformers instead of OpenAI embeddings.
- Llama Model for Chat Completion: Runs a local Llama model for response generation.
- Streamlit UI: Interactive chat interface built with Streamlit.
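
The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the code in app.py: the in-memory `STORE` stands in for the Pinecone index, the three-element vectors stand in for real sentence-transformers embeddings, and the names `cosine`, `retrieve`, and `build_prompt` are hypothetical. The resulting prompt is what would be handed to the local Llama model.

```python
import math

# Hypothetical in-memory stand-in for the Pinecone index.
# Real vectors would have 384 or 768 dimensions, not 3.
STORE = [
    {"text": "Document A", "vector": [1.0, 0.0, 0.0]},
    {"text": "Document B", "vector": [0.0, 1.0, 0.0]},
    {"text": "Document C", "vector": [0.9, 0.1, 0.0]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, store, top_k=2):
    """Return the texts of the top_k entries most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:top_k]]


def build_prompt(question, contexts):
    """Combine retrieved context and the user question into one prompt."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Context:\n{context_block}\n\nQuestion: {question}\nAnswer:"


contexts = retrieve([1.0, 0.0, 0.0], STORE, top_k=2)
prompt = build_prompt("What does document A say?", contexts)
```

In the real app, the query vector would come from the embedding model and the similarity search would be done by Pinecone, so only the prompt-building step resembles what runs locally.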
git clone https://github.com/forhadsidhu/bhbfc_gpt
cd bhbfc_gpt

python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate

pip install -r requirements.txt

Create a .env file in the project root and add your Pinecone API key:

PINECONE_API_KEY=your_pinecone_api_key

Ensure your Pinecone index has the correct dimensions matching your embedding model:
- all-MiniLM-L6-v2 → 384 dimensions
- all-mpnet-base-v2 → 768 dimensions
If you need to create a new index:
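from pinecone import Pinecone, ServerlessSpec
pc = Pinecone(api_key="your_pinecone_api_key")
# Recent Pinecone clients require a spec; the cloud and region below are placeholders, use your own.
pc.create_index(name="langchain-demo", dimension=768, metric="cosine", spec=ServerlessSpec(cloud="aws", region="us-east-1"))

By default, the chatbot uses: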
embedding_model = SentenceTransformer("all-mpnet-base-v2") # 768 dimensions

If your Pinecone index uses a different dimension, change the model accordingly.
streamlit run app.py

This will open a web UI where you can chat with the bot.
📂 rag-chatbot
├── 📄 app.py # Main Streamlit app
├── 📄 requirements.txt # Required Python dependencies
├── 📄 .env # Pinecone API key (Not included in the repo)
├── 📂 models/ # Directory for storing the Llama model
├── 📂 data/ # Store any preprocessed documents (if needed)
├── 📄 demo.png # Screenshot of the chatbot UI
├── 📄 README.md # Project documentation
❌ Vector dimension 384 does not match the dimension of the index 768
- Solution: Ensure your embedding model and Pinecone index dimensions match.
- Fix: Use all-mpnet-base-v2 for a 768-dimension index, or recreate the Pinecone index with 384 dimensions.
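
One way to catch this mismatch before any upsert or query is a small startup check. The sketch below is an assumption about how such a check could look, not code from app.py: `MODEL_DIMENSIONS` and `check_dimensions` are hypothetical names, and the table only covers the two models mentioned in this README.

```python
# Output dimensions of the two embedding models named in this README.
MODEL_DIMENSIONS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
}


def check_dimensions(model_name, index_dimension):
    """Raise early if the embedding model and the index disagree on dimension."""
    if model_name not in MODEL_DIMENSIONS:
        raise KeyError(f"Unknown embedding model: {model_name}")
    model_dimension = MODEL_DIMENSIONS[model_name]
    if model_dimension != index_dimension:
        raise ValueError(
            f"Vector dimension {model_dimension} does not match "
            f"the dimension of the index {index_dimension}"
        )
    return model_dimension


# Matching pair: passes and returns the shared dimension.
check_dimensions("all-mpnet-base-v2", 768)
```

In a live setup the two numbers could be read at runtime instead of hard-coded, for example from the loaded model's `get_sentence_embedding_dimension()` and from the index description returned by the Pinecone client; treat that wiring as a suggestion to verify against the library versions you have installed.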
❌ Missing Pinecone API key! Check your .env file.
- Solution: Ensure you have added your Pinecone API key in the .env file.
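
A check like the following would produce this error as soon as the key is missing or empty. It is a sketch under assumptions: it presumes the .env file has already been loaded into the process environment (for instance by a dotenv loader, which may or may not be what app.py uses), and `get_pinecone_api_key` is a hypothetical helper name.

```python
import os


def get_pinecone_api_key(env=None):
    """Return the Pinecone API key, or fail with a clear message if it is absent."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("PINECONE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Missing Pinecone API key! Check your .env file.")
    return key


# Example with an explicit mapping instead of the real environment.
api_key = get_pinecone_api_key({"PINECONE_API_KEY": "your_pinecone_api_key"})
```

Passing the environment as a parameter keeps the check easy to test without touching real credentials.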
This project is licensed under the MIT License.
