GitHub - Anirudh-Unni/RAG: Retrieval-Augmented Generation (RAG) Chatbot: This repository contains a Jupyter Notebook that demonstrates how to build a simple RAG chatbot using LangChain, OpenAI, and ChromaDB.

Retrieval-Augmented Generation (RAG) Chatbot

This repository contains a Jupyter Notebook that demonstrates how to build a simple Retrieval-Augmented Generation (RAG) chatbot using LangChain, OpenAI, and ChromaDB. The notebook guides you through the entire process, from loading a document to generating answers based on its content.

Overview

Retrieval-Augmented Generation (RAG) is a technique that enhances the accuracy and reliability of Large Language Models (LLMs) by grounding them in external knowledge bases. Instead of relying solely on its pre-trained knowledge, the model first retrieves relevant information from a specified document source and then uses that information to generate a response.
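The retrieve-then-generate idea can be illustrated without any external service. The sketch below is a toy, not the notebook's method: word-count vectors and cosine similarity stand in for the OpenAI embeddings and the ChromaDB search, the sample chunks are made-up sentences, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Turn text into a word-count vector (a toy stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(q, bag_of_words(chunk)),
        reverse=True,
    )
    return ranked[:k]


def grounded_prompt(question, chunks, k=2):
    """Build the text an LLM would receive: retrieved context plus the question."""
    context = "\n".join(f"- {chunk}" for chunk in retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up sample chunks, for illustration only.
chunks = [
    "Supervised learning maps inputs to labeled outputs.",
    "Neural networks are built from layers of connected units.",
    "Data is often described as the fuel for AI systems.",
]
print(grounded_prompt("What is supervised learning?", chunks, k=1))
```

In a real pipeline the final prompt would be sent to the LLM; here it is only printed so the grounding step is visible.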

This notebook implements the following workflow:

Load Document: Ingest a PDF document using LangChain's PyPDFLoader.
Split & Chunk: Break the document into smaller chunks using RecursiveCharacterTextSplitter, which splits on natural separators (paragraphs first, then lines, then words) so that each chunk tends to stay coherent.
Embed: Convert the text chunks into numerical vector representations (embeddings) using OpenAI's text-embedding-3-large model.
Store: Store these embeddings in a ChromaDB vector store for efficient retrieval.
Retrieve: Given a user question, perform a similarity search in the vector store to find the most relevant chunks.
Generate: Pass the original question and the retrieved context to an LLM (like GPT-4o) to generate a final, context-aware answer.
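The six steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a copy of the notebook: the import paths assume the langchain-community, langchain-text-splitters, langchain-openai, and langchain-chroma packages are installed, the chunk sizes and `k` are illustrative values, and `build_prompt` and `answer_question` are hypothetical names. It also assumes OPENAI_API_KEY is already set in the environment.

```python
def build_prompt(context: str, question: str) -> str:
    """Combine retrieved context and the user question into one prompt."""
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer_question(file_path: str, question: str, k: int = 4) -> str:
    """Run load -> split -> embed -> store -> retrieve -> generate once."""
    # Imports are local so this file can be loaded without the packages installed.
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_openai import OpenAIEmbeddings, ChatOpenAI
    from langchain_chroma import Chroma

    # 1. Load: one Document per PDF page.
    pages = PyPDFLoader(file_path).load()

    # 2. Split & chunk (sizes are illustrative assumptions).
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_documents(pages)

    # 3-4. Embed the chunks and store them in an in-memory Chroma collection.
    embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
    store = Chroma.from_documents(documents=chunks, embedding=embeddings)

    # 5. Retrieve the k chunks most similar to the question.
    hits = store.similarity_search(question, k=k)
    context = "\n\n".join(doc.page_content for doc in hits)

    # 6. Generate an answer grounded in the retrieved context.
    llm = ChatOpenAI(model="gpt-4o", temperature=0)
    return llm.invoke(build_prompt(context, question)).content
```

A call such as `answer_question("path/to/your/document.pdf", "your question here")` re-embeds the whole PDF on every call; a longer-lived application would build the vector store once and reuse it.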

Technologies Used

Python 3.10+
LangChain: A framework for developing applications powered by language models.
OpenAI: For generating text embeddings and the final chat response.
ChromaDB: An open-source vector database for storing and querying embeddings.
PyPDF: A library to read and extract text from PDF files.
Jupyter Notebook: For interactive development and demonstration.

Getting Started

Follow these steps to set up and run the project on your local machine.

Prerequisites

Python 3.10+ and an OpenAI API key.

Installation

Clone the repository:

git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

Install the required dependencies:
```
pip install -r requirements.txt
```
(If requirements.txt is missing from your copy, create it with the packages listed in the first code cell of the notebook, or run the pip install commands from the notebook directly.)
Set up your environment variables:
- Create a file named .env in the root of the project directory.
- Add your OpenAI API key to this file:
```
OPENAI_API_KEY="your-openai-api-key-here"
```
Add your data source:
- Place the PDF file you want to use as the knowledge base in the project directory (the repository includes ai-for-everyone.pdf).
- Update the file_path variable in the notebook to point to your PDF file. For example:
```
file_path = "path/to/your/document.pdf"
```
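Before running the notebook, it can help to confirm that the key and the PDF path from the steps above are actually in place. The helper below is a hypothetical convenience, not part of the notebook. It assumes the contents of .env have already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`, if the notebook uses it).

```python
import os
from pathlib import Path


def check_setup(file_path, env=None):
    """Return a list of setup problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []

    # The OpenAI client libraries read the key from this variable.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")

    # The notebook's file_path must point at an existing PDF.
    path = Path(file_path)
    if not path.is_file():
        problems.append(f"PDF not found: {path}")
    elif path.suffix.lower() != ".pdf":
        problems.append(f"Not a PDF file: {path}")

    return problems


for problem in check_setup("path/to/your/document.pdf"):
    print("Setup problem:", problem)
```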

Running the Notebook

Start Jupyter Notebook:
```
jupyter notebook
```
Open the LAB_GenAI_RAG_Chatbot.ipynb file.
Run the cells sequentially from top to bottom.

Exercises

Exercise 1: Write a user question that someone might ask about your book's topic or content.
Exercise 2: Write a prompt that is relevant and tailored to the content and style of your book.
Exercise 3: Tune parameters such as temperature and the penalty settings to control how creative, focused, or varied the model's responses are.
Exercise 4: Add your own keywords.
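As a starting point for the prompt and parameter exercises, here is a minimal sketch. The template wording and the `generation_params` helper are hypothetical. The parameter names and ranges follow the OpenAI Chat Completions API (temperature from 0 to 2, frequency and presence penalties from -2 to 2). How the notebook passes them to the model is not shown in this README.

```python
PROMPT_TEMPLATE = """You are an assistant answering questions about a book.
Use only the context below. If the answer is not in the context, say you don't know.

Context:
{context}

Question: {question}
"""


def generation_params(temperature=0.2, frequency_penalty=0.0, presence_penalty=0.0):
    """Validate sampling settings and return them as keyword arguments."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    for name, value in (
        ("frequency_penalty", frequency_penalty),
        ("presence_penalty", presence_penalty),
    ):
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2 and 2")
    return {
        "temperature": temperature,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }


# Lower temperature gives more focused, repeatable answers.
focused = generation_params(temperature=0.0)
# Higher temperature and a presence penalty encourage more varied wording.
creative = generation_params(temperature=1.0, presence_penalty=0.6)

prompt = PROMPT_TEMPLATE.format(
    context="(retrieved chunks go here)",
    question="What is this book about?",
)
print(prompt)
```

For a question-answering chatbot grounded in a document, low temperature values are a common starting point; higher values are worth trying to see how the answers change.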

Folders and files

LAB_GenAI_RAG_Chatbot.ipynb
LICENSE
README.md
ai-for-everyone.pdf
requirements.txt
