This project implements a Retrieval-Augmented Generation (RAG) system that answers questions about two PDF documents stored in a Google Drive folder. It uses Pinecone as a vector database for efficient retrieval of relevant passages and Gemini (or another LLM) to generate natural language responses.
This RAG system operates as follows (each step is sketched in code after the list):
- Data Loading: Two PDF files are loaded from a designated Google Drive folder using PyPDFDirectoryLoader.
- Text Chunking: The extracted text from both PDFs is divided into smaller chunks to optimize retrieval and manage context for the LLM.
- Embedding Generation: Sentence embeddings are created for each text chunk using the "sentence-transformers/all-mpnet-base-v2" Hugging Face model.
- Vector Database Storage: The text chunks and their corresponding embeddings are stored in a Pinecone vector database.
- Retrieval and Question Answering: When a user asks a question, the system generates an embedding for the query, retrieves the most similar text chunks from Pinecone, and uses Gemini (or an alternative LLM) to generate a natural language answer based on the retrieved context.
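The loading and chunking steps might look like the following. This is a minimal sketch: the Drive folder path is a placeholder, and the chunk size and overlap are illustrative choices, not values fixed by the project.

```python
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load every PDF found in the mounted Google Drive folder (placeholder path).
loader = PyPDFDirectoryLoader("/content/drive/MyDrive/pdfs")
documents = loader.load()

# Split the extracted text into overlapping chunks so each one fits
# comfortably in the LLM's context window.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)
print(f"{len(documents)} pages -> {len(chunks)} chunks")
```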
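Embedding generation then encodes each chunk with the model named above. This continues from the previous sketch, where `chunks` holds the split documents.

```python
from sentence_transformers import SentenceTransformer

# The model named in the overview; it produces 768-dimensional vectors.
embedder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

texts = [chunk.page_content for chunk in chunks]
embeddings = embedder.encode(texts)  # ndarray of shape (num_chunks, 768)
```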
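Storing the chunks in Pinecone could look like this. The index name `pdf-rag`, the serverless cloud/region, and the API-key placeholder are assumptions for illustration; the index dimension of 768 matches the all-mpnet-base-v2 output.

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")  # placeholder key

index_name = "pdf-rag"  # hypothetical index name
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=768,  # all-mpnet-base-v2 embedding dimension
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index(index_name)

# Upsert each chunk's embedding, keeping the raw text as metadata so
# it can be returned alongside matches at query time.
index.upsert(vectors=[
    (str(i), emb.tolist(), {"text": text})
    for i, (emb, text) in enumerate(zip(embeddings, texts))
])
```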
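Finally, retrieval and question answering. The Gemini model name and the `answer` helper are illustrative, but the flow (embed the query, fetch the nearest chunks, prompt the LLM with the retrieved context) is the one described above.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder key
llm = genai.GenerativeModel("gemini-1.5-flash")  # model name is an assumption

def answer(question: str, top_k: int = 3) -> str:
    # Embed the question with the same model used for the chunks.
    query_vec = embedder.encode(question).tolist()

    # Retrieve the most similar chunks from Pinecone.
    results = index.query(vector=query_vec, top_k=top_k, include_metadata=True)
    context = "\n\n".join(m["metadata"]["text"] for m in results["matches"])

    # Ask Gemini to answer from the retrieved context only.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm.generate_content(prompt).text

print(answer("What is the main topic of the documents?"))
```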
The system is built with the following components:

- Pinecone: Vector database for storing and retrieving embeddings.
- Hugging Face Transformers: For generating sentence embeddings using "sentence-transformers/all-mpnet-base-v2".
- Gemini (or alternative LLM): Large Language Model for generating natural language responses.
- Python: Programming language for implementation.
- LangChain (optional but highly recommended): For streamlining the RAG pipeline; see the sketch after this list.
- PyPDFDirectoryLoader (LangChain): For loading PDF documents from a directory.
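As referenced above, LangChain can replace the hand-rolled retrieval and answering code with a single chain. This sketch assumes the `langchain-pinecone`, `langchain-huggingface`, and `langchain-google-genai` integration packages, the `pdf-rag` index from the earlier sketch, and API keys supplied via the PINECONE_API_KEY and GOOGLE_API_KEY environment variables.

```python
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains import RetrievalQA

# Wrap the same embedding model and Pinecone index behind LangChain interfaces.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vectorstore = PineconeVectorStore(index_name="pdf-rag", embedding=embeddings)

# A retriever that returns the 3 most similar chunks for a query.
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Chain retrieval and generation into a single question-answering step.
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

print(qa.invoke({"query": "What is the main topic of the documents?"})["result"])
```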