RAG-Based Document Question Answering System 🤖📄

This project implements a Retrieval-Augmented Generation (RAG) chatbot that lets users upload PDF documents, ask questions about their content, and receive answers grounded in those documents. It uses Cohere for embeddings and text generation, Pinecone for vector storage and retrieval, and Streamlit for the user interface.

Features ✨

PDF Processing: Extracts text from uploaded PDF documents and splits it into manageable chunks for embedding and storage.
Embedding and Retrieval: Uses Cohere's embeddings for encoding document chunks and Pinecone for scalable vector similarity search.
Question Answering: Retrieves the document chunks most relevant to a question and passes them to a Cohere language model to generate the answer.
Interactive Interface: Provides a simple and intuitive interface using Streamlit for uploading documents, entering queries, and viewing results.
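
The chunking step mentioned under PDF Processing can be illustrated with a minimal sketch. This is not the project's actual splitter: the function name split_into_chunks, the character-based splitting, and the chunk_size and overlap defaults are all assumptions chosen for illustration.

```python
from __future__ import annotations


def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly.

    Hypothetical helper: the sizes are illustrative, not the project's settings.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip chunks that are only whitespace
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
    return chunks


# Small example with tiny sizes so the overlap is easy to see.
print(split_into_chunks("abcdefghij", chunk_size=4, overlap=1))
# -> ['abcd', 'defg', 'ghij']
```

The overlap keeps a sentence that straddles a chunk boundary at least partly visible in both neighbouring chunks, which tends to help retrieval.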

How to Use 🚀

Follow the steps below to run and interact with the project:

1. Clone the Repository

Clone the repository to your local system using the following command:

git clone https://github.com/VivekChauhan05/RAG_Document_Question_Answering.git
cd RAG_Document_Question_Answering

2. Create and Activate a Virtual Environment 🏗️

Create a virtual environment and activate it to isolate project dependencies:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Dependencies 📦

Install the required Python libraries using the provided requirements.txt file:

pip install -r requirements.txt

4. Obtain API Keys 🔑

Get your API keys for:

Cohere: Sign up at Cohere to obtain an API key.
Pinecone: Sign up at Pinecone to obtain an API key.

These keys will be entered via the Streamlit interface when running the app.
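
Since both keys are needed before the app can do anything useful, an app of this kind might check them up front. The sketch below is one way to do that, not code from this project: missing_api_keys is a hypothetical helper, and it only checks that each key is non-empty, not that the service accepts it.

```python
from __future__ import annotations


def missing_api_keys(cohere_key: str | None, pinecone_key: str | None) -> list[str]:
    """Return the names of the services whose key is empty or missing.

    Hypothetical helper for illustration; it does not contact either service.
    """
    provided = {"Cohere": cohere_key, "Pinecone": pinecone_key}
    return [name for name, value in provided.items() if not value or not value.strip()]


# Example: the Pinecone key has not been entered yet.
missing = missing_api_keys("my-cohere-key", "")
if missing:
    print("Please enter an API key for: " + ", ".join(missing))
```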

5. Run the Application 🏃‍♂️

Launch the Streamlit application:

cd src
streamlit run app.py

6. Access the Application 🌐

Once the application is running, open your browser and navigate to the URL provided by Streamlit, typically http://localhost:8501.

7. Upload a Document 📄

Use the interface to upload a PDF file containing the content you want to query.

8. Ask Questions ❓

Enter your question in the query box. The chatbot will:
Retrieve relevant chunks of text from the uploaded document.
Generate a precise and context-aware response.
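
The two steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: in the app, Cohere would produce the embedding vectors and Pinecone would run the similarity search, whereas here the vectors are hand-written toy values, cosine similarity is assumed as the metric, and the helper names and prompt wording are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the query vector."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]


def build_prompt(question, context_chunks):
    """Assemble a prompt that asks the model to answer from the retrieved context only."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Toy data: three chunks with made-up 3-dimensional "embeddings".
chunks = [
    "Refunds are issued within 30 days.",
    "The office is closed on Sundays.",
    "Shipping takes 5 business days.",
]
chunk_vecs = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.0], [0.7, 0.1, 0.7]]
query_vec = [1.0, 0.0, 0.0]  # stands in for the embedded question

relevant = top_k_chunks(query_vec, chunk_vecs, chunks, k=2)
print(build_prompt("How long do refunds take?", relevant))
```

The resulting prompt string is what would be sent to the language model; restricting the model to the retrieved context is what keeps the answer specific to the uploaded document.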

Project Structure 📁

├── app.py                # Main application file with Streamlit interface
├── vectorstore.py        # Handles PDF processing, embedding, and retrieval
├── chatbot.py            # Handles user interaction and response generation
├── requirements.txt      # Project dependencies
└── README.md             # Project documentation

Future Enhancements 🚧

Add support for multi-language documents.
Enhance the UI with multi-document support and export options for chat history.
Enable deployment to cloud platforms for wider accessibility.
Integrate additional vector databases for broader compatibility.

Contributing 🤝

🚀 We warmly welcome contributions to enhance this project! Whether it's fixing bugs, adding new features, or improving documentation, your efforts will help make this project better for everyone. Let's collaborate and build something amazing together! 🌟✨

License 📜

This project is licensed under the Apache License. See the LICENSE file for more details.

Acknowledgments 🙏

Cohere AI for their powerful embedding and language models. 🧠✨
Pinecone for scalable vector search infrastructure. 🔍⚡
Streamlit for making it easy to build interactive data apps. 📊🎉
