PDF Insight Extractor

Overview

The PDF Insight Extractor is a Streamlit web application that analyzes PDF documents and extracts insights from their text, images, and tables. It passes the document to an OpenAI GPT model that accepts multimodal input and answers user queries about the content.

Key Features

📄 Upload PDF files for processing.
🔍 Extract insights from:
- Text: Understand and analyze textual content.
- Images: Extract context and meaning from embedded visuals.
- Tables: Retrieve structured data from tables.
💬 Ask questions about the document's content and get detailed responses.

Installation

Prerequisites

Python 3.8 or above
Pip for package management
OpenAI API key (stored in a .env file)

Steps

Clone the Repository

```
git clone https://github.com/Harshita1195/pdf-insight-extractor.git
cd pdf-insight-extractor
```

Install Dependencies
```
pip install -r requirements.txt
```
Set OpenAI API Key
- Create a .env file in the project directory.
- Add the following line:
```
OPENAI_API_KEY=your_openai_api_key
```
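As a rough illustration of how the key in the .env file might be read at startup, the sketch below uses `load_dotenv` from python-dotenv and then checks the environment. The helper names `get_openai_api_key` and `load_env_and_get_key` are hypothetical and are not taken from `app.py`.

```python
import os
from typing import Mapping, Optional


def get_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; app.py may handle this differently.
    """
    env = os.environ if env is None else env
    key = (env.get("OPENAI_API_KEY") or "").strip()
    # Treat the placeholder from the example .env line as "not set".
    if not key or key == "your_openai_api_key":
        raise RuntimeError("OPENAI_API_KEY is not set; add it to the .env file.")
    return key


def load_env_and_get_key() -> str:
    """Load variables from a .env file, then return the validated key."""
    # Imported lazily so get_openai_api_key works without python-dotenv installed.
    from dotenv import load_dotenv

    load_dotenv()  # looks for a .env file and loads its entries into os.environ
    return get_openai_api_key()
```

Failing early with a readable message avoids a less obvious authentication error later, when the first query is sent.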
Run the Application
```
streamlit run app.py
```
Open the provided local URL in your web browser.

Usage

Upload a PDF File
- Drag and drop or select a PDF file via the file uploader.
Process the PDF
- The application converts each page into a base64-encoded image for analysis.
Ask a Query
- Enter a query in natural language, such as:
  - "What data is presented in the table on page 2?"
  - "Summarize the text on page 1."
  - "Describe the image on page 3."
- Click "Submit Query" to receive a detailed response.
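The page conversion step described above can be sketched with PyMuPDF as follows. This is a minimal example under assumptions, not the code from `app.py`: the function names, the PNG format, and the 150 DPI default are all illustrative choices.

```python
import base64
from typing import List


def png_to_base64(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as an ASCII base64 string."""
    return base64.b64encode(png_bytes).decode("ascii")


def pdf_to_base64_images(pdf_bytes: bytes, dpi: int = 150) -> List[str]:
    """Render every page of a PDF to PNG and return one base64 string per page.

    Hypothetical helper; the DPI value is an assumption, not taken from app.py.
    """
    # Imported lazily so png_to_base64 can be used without PyMuPDF installed.
    import fitz  # PyMuPDF

    images = []
    # Open the PDF from memory, e.g. from the bytes of an uploaded file.
    with fitz.open(stream=pdf_bytes, filetype="pdf") as doc:
        for page in doc:
            pix = page.get_pixmap(dpi=dpi)  # rasterize the page
            images.append(png_to_base64(pix.tobytes("png")))
    return images
```

A higher DPI makes small text and table cells easier for the model to read, at the cost of larger images in each request.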

Code Details

File: `app.py`

Core Functionalities

PDF to Image Conversion: Converts each PDF page to a base64-encoded image using PyMuPDF (imported as `fitz`) so the pages can be sent to OpenAI's GPT model.
Query Handling: Processes user queries using the LangChain OpenAI integration (ChatOpenAI).
Streamlit Interface:
- Provides an intuitive user interface for uploading PDFs and entering queries.
- Highlights key capabilities for text, image, and table extraction.
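To make the query handling step more concrete, here is one way a query and the page images could be combined into a single multimodal message and sent through `ChatOpenAI`. It is a sketch under assumptions: the function names are hypothetical, the `langchain_openai` and `langchain_core` import paths and the `gpt-4o` model name are guesses about the setup, and `app.py` may differ on all of these.

```python
from typing import Any, Dict, List


def build_message_content(query: str, page_images_b64: List[str]) -> List[Dict[str, Any]]:
    """Combine the user's query and the page images into one message payload."""
    content: List[Dict[str, Any]] = [{"type": "text", "text": query}]
    for image_b64 in page_images_b64:
        # Each page is passed as a data URL holding the base64-encoded PNG.
        content.append(
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{image_b64}"},
            }
        )
    return content


def ask_about_pdf(query: str, page_images_b64: List[str], model: str = "gpt-4o") -> str:
    """Send the query and page images to the model and return its answer.

    The model name is an assumption; any vision-capable chat model should work.
    """
    # Imported lazily so build_message_content works without LangChain installed.
    from langchain_core.messages import HumanMessage
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model=model)  # reads OPENAI_API_KEY from the environment
    message = HumanMessage(content=build_message_content(query, page_images_b64))
    return llm.invoke([message]).content
```

Keeping the payload construction separate from the API call makes it possible to inspect what would be sent without spending tokens.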

Dependencies

Streamlit: For building the web interface.
PyMuPDF (fitz): For PDF processing.
Pillow: For image handling.
LangChain: For OpenAI GPT model integration.
python-dotenv: For loading environment variables from the .env file.
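If the application fails to start, a quick check that these packages are importable can help narrow down the cause. The snippet below is an optional sketch that uses only the standard library; the `missing_modules` helper is hypothetical, and the listed names are the usual import names for the packages above (for example, PyMuPDF is imported as `fitz` and Pillow as `PIL`).

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names corresponding to the dependencies listed above.
REQUIRED_MODULES = ["streamlit", "fitz", "PIL", "langchain", "dotenv"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when everything is installed; any names it does return can be fixed by rerunning `pip install -r requirements.txt`.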

License

This project is licensed under the MIT License. See the LICENSE file for details.
