🔎 Google Gemini PDF to Table Extractor in HTML

Transform PDF tables into HTML with Gemini 2.5, which detects each table's layout and content and writes the result to a viewable HTML file.

This experimental tool uses Google's Gemini 2.5 Flash Preview model to parse complex tables from PDF documents and convert them into clean HTML that aims to preserve the original layout, structure, and data.


✨ Why This Matters

PDF tables are notoriously difficult to extract accurately. Standard conversion tools often produce:

Misaligned columns and rows
Lost formatting and merged cells
Garbled text and numbers
Completely broken layouts

This tool achieves ~80% layout accuracy while maintaining nearly 100% data accuracy for most tables.

🚀 Key Features

Preserves Complex Table Structures - Handles merged cells, nested headers, and multi-line content
Maintains Visual Fidelity - Recreates the visual appearance of tables with proper CSS
Extracts Text with High Accuracy - Particularly effective with numerical data
Direct PDF Processing - Sends PDF data directly to the model without intermediary conversions
Thinking Mode - Uses Gemini's unique thinking capability for improved analysis
Token Usage Reporting - Tracks processing efficiency

📊 Technical Approach

This project explores how AI models understand and parse structured PDF content. Rather than using OCR or traditional table extraction libraries, this tool gives the raw PDF to Gemini and uses specialized prompting techniques to optimize the extraction process.
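
To make "giving the raw PDF to Gemini" concrete, here is a minimal sketch of how such a request body could be assembled. It is not taken from main.py: `build_request` is a hypothetical helper, and the field names follow the public Gemini REST `generateContent` schema as I understand it. The actual tool may use the Python SDK and a different prompt.

```python
import base64
import json


def build_request(pdf_bytes: bytes, prompt: str, thinking_budget: int = 24000) -> dict:
    """Build a JSON-serializable request body (hypothetical helper, not from main.py)."""
    return {
        "contents": [{
            "parts": [
                # The PDF travels as base64 inline data: no OCR, no image conversion.
                {"inline_data": {
                    "mime_type": "application/pdf",
                    "data": base64.b64encode(pdf_bytes).decode("ascii"),
                }},
                # The extraction instructions go in a separate text part.
                {"text": prompt},
            ]
        }],
        # Assumed mapping of the --thinking-budget option onto the request.
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    body = build_request(b"%PDF-1.4 ...", "Convert every table in this PDF to HTML.")
    print(json.dumps(body)[:120])
```

The point of the sketch is that the model receives the document bytes unchanged, so all table detection happens inside the model rather than in a preprocessing step.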

📝 Installation

Clone this repository:

git clone https://github.com/lesteroliver911/google-gemini-pdf-table-extractor
cd google-gemini-pdf-table-extractor

Install the required dependencies:

pip install -r requirements.txt

Set up your Google API key:
- Create a .env file in the project root directory
- Add your Google API key: GOOGLE_API_KEY=your_api_key_here
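
For reference, a script can pick the key up from that file roughly as follows. This is a standard-library sketch only; `read_env_file` and `get_api_key` are hypothetical names, and main.py may well rely on a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Prefer a real environment variable, then fall back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or read_env_file(path).get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env")
    return key
```

Keep the .env file out of version control so the key is never committed.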

💻 Usage

Basic usage:

python main.py path/to/your/document.pdf

This will generate an HTML file with the same name in the same directory.
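
One plausible way to derive that default output path is shown below. It assumes the tool simply swaps the .pdf suffix for .html next to the input file; `default_output_path` is a hypothetical name, not a function from main.py.

```python
from pathlib import Path


def default_output_path(pdf_path: str) -> Path:
    """Return the input path with its final suffix replaced by .html."""
    # with_suffix keeps the directory and base name and changes only the extension.
    return Path(pdf_path).with_suffix(".html")


if __name__ == "__main__":
    print(default_output_path("path/to/your/document.pdf"))
```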

Advanced options:

python main.py path/to/your/document.pdf --output custom_output.html --thinking-budget 24000

Arguments:

--output: Specify a custom output file path
--thinking: Enable thinking mode (default: True)
--thinking-budget: Set the thinking token budget (default: 24000)
--prompt: Provide a custom prompt for conversion
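
The arguments above could be declared with argparse along these lines. This is a sketch under assumptions: the option names and defaults come from the list above, but the positional name `pdf`, the help strings, and the use of `BooleanOptionalAction` (Python 3.9+, which also creates a `--no-thinking` switch) are guesses rather than the actual main.py code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented CLI options (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="Convert PDF tables to HTML with Gemini.")
    parser.add_argument("pdf", help="Path to the input PDF")
    parser.add_argument("--output", default=None, help="Custom output file path")
    # Assumption: thinking is on by default and can be turned off with --no-thinking.
    parser.add_argument("--thinking", action=argparse.BooleanOptionalAction, default=True,
                        help="Enable thinking mode")
    parser.add_argument("--thinking-budget", type=int, default=24000,
                        help="Thinking token budget")
    parser.add_argument("--prompt", default=None, help="Custom prompt for conversion")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this layout, the advanced example above parses to `output="custom_output.html"` and `thinking_budget=24000`, with thinking left at its default.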

🧪 Experimental Status

This project is an exploration of AI-powered PDF parsing capabilities. While it achieves strong results for many tables, complex documents with unusual layouts may still be extracted incorrectly. Extraction accuracy should improve as the underlying models advance.
