AI Chat with PDF

AI Chat with PDF is a Streamlit application for asking questions about your PDF documents in natural language. It uses LangChain with HuggingFace embeddings and language models to find the relevant passages in your files and answer in a conversational manner.

Features

📄 Upload and process multiple PDF documents simultaneously
💬 Interactive chat interface with conversation history
🔍 Semantic search using HuggingFace embeddings
🧠 Powered by LangChain and HuggingFace models
🚀 Modern Streamlit-based web interface
🌍 Multilingual support (French/English)

Prerequisites

Python 3.8 or higher
Poetry (recommended) or pip
HuggingFace API key (optional, for some models)

Installation

Clone the repository:

```
git clone https://github.com/djili/aichatpdf.git
cd aichatpdf
```

Install dependencies using Poetry:
```
poetry install
```
Or using pip:
```
pip install -r requirements.txt
```

Set up environment variables:

Copy .env.example to .env
Configure your preferred models and API keys:

```
# For OpenAI models (optional)
OPENAI_API_KEY=your_openai_key

# For HuggingFace models (recommended)
HUGGINGFACEHUB_API_TOKEN=your_hf_token
```
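
To confirm the keys are visible to Python before starting the app, a small check like the one below may help. It is only a sketch: the helper name `configured_providers` is hypothetical, and it assumes the variables have already been exported or loaded from `.env` into the process environment (how `app.py` loads them is not described here).

```python
import os


def configured_providers(env=None):
    """Report which optional API credentials are set (hypothetical helper)."""
    # Fall back to the real process environment when no mapping is passed.
    env = os.environ if env is None else env
    names = {
        "openai": "OPENAI_API_KEY",
        "huggingface": "HUGGINGFACEHUB_API_TOKEN",
    }
    # A variable counts as configured only if it is present and non-blank.
    return {
        provider: bool(env.get(var, "").strip())
        for provider, var in names.items()
    }


if __name__ == "__main__":
    for provider, is_set in configured_providers().items():
        print(f"{provider}: {'set' if is_set else 'missing'}")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to try without touching your real configuration.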

Usage

Start the application:

```
poetry run streamlit run app.py
```

Or with pip:

```
streamlit run app.py
```

Open your web browser and navigate to http://localhost:8501
Upload one or more PDF files using the file uploader
Start chatting with your document by typing questions in the chat interface

How It Works

The application processes your PDF document and extracts the text content
The text is split into manageable chunks using recursive text splitting
These chunks are converted into vector embeddings using the instructor-xl embedding model from the HuggingFace Hub
When you ask a question, the system performs a semantic search to find the most relevant text chunks
The conversation history and relevant context are used to generate a coherent response
The chat interface maintains conversation history for context-aware responses
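
The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in `app.py`: a toy bag-of-words vector stands in for the instructor-xl embeddings, a sorted list stands in for the FAISS index, and every function name (`split_recursive`, `embed`, `top_k`, `build_prompt`) is hypothetical.

```python
import math
import re
from collections import Counter

SEPARATORS = ["\n\n", "\n", " "]  # coarse to fine, tried in order


def split_recursive(text, chunk_size, separators=SEPARATORS):
    """Split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: fall back to hard cuts.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece is still too large: retry with a finer separator.
            chunks.extend(split_recursive(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


def embed(text):
    """Toy embedding: word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(history, context_chunks, question):
    """Combine earlier turns and retrieved context into one prompt string."""
    turns = "\n".join(f"{role}: {msg}" for role, msg in history)
    context = "\n---\n".join(context_chunks)
    return (
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{turns}\n\n"
        f"User: {question}\nAssistant:"
    )


document = (
    "Invoices are due within 30 days of receipt.\n\n"
    "Late payments incur a 2 percent monthly fee.\n\n"
    "Support is available by email on weekdays."
)
chunks = split_recursive(document, chunk_size=60)
question = "When are invoices due?"
best = top_k(question, chunks, k=1)
history = [("User", "Hello"), ("Assistant", "Hi, ask me about your PDF.")]
prompt = build_prompt(history, best, question)
print(prompt)
```

In a real deployment the prompt would be sent to a language model; the sketch stops at prompt assembly because the model call depends on which provider is configured.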

Technologies Used

Streamlit - Web application framework
LangChain - Framework for developing applications with LLMs
HuggingFace - For embeddings and language models
FAISS - Efficient similarity search
PyPDF2 - PDF text extraction
Sentence Transformers - For generating embeddings

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with ❤️ using amazing open-source libraries
Inspired by the growing ecosystem of AI-powered document processing tools

Note: Make sure to handle sensitive documents appropriately and be aware of the data you're processing through the application.
