PDF Form Filler

A Python desktop application that fills PDF forms automatically from Excel spreadsheet data. It maps PDF form fields to Excel columns and processes forms in bulk, which removes manual data entry and reduces errors.

🚀 Features

Core Functionality

PDF Form Analysis: Automatically detect and extract form fields from PDF documents
Excel Data Import: Import and parse Excel spreadsheets (.xlsx, .xls formats)
Field Mapping: Visual drag-and-drop interface for mapping PDF fields to Excel columns
Bulk Processing: Process multiple PDF forms using mapped Excel data
Template Management: Save and reuse field mapping configurations

Advanced Features

Data Validation: Custom validation rules for form fields
Data Transformation: Apply formatting rules (uppercase, lowercase, date formatting)
Audit Trail: Comprehensive logging of all processing activities
Preview Mode: Preview filled forms before final processing
Error Handling: Graceful handling of missing or invalid data

Installation

Prerequisites

Python 3.8 or higher
Windows 10/11 (primary target platform)

Setup

Clone or download this repository
Install required dependencies:
```
pip install -r requirements.txt
```
Run the application:
```
python main.py
```

Dependencies

PyPDF2: PDF manipulation and form field extraction
PyMuPDF: Enhanced PDF processing capabilities
pandas: Excel file handling and data manipulation
openpyxl: Excel file format support
tkinter: GUI framework (included with Python)
Pillow: Image processing for preview functionality

Usage

Basic Workflow

Load PDF Form: Open a PDF file containing form fields
Load Excel Data: Import an Excel spreadsheet with the data to fill forms
Map Fields: Create mappings between PDF form fields and Excel columns
Process Forms: Generate filled PDF forms for each row in the Excel data

Step-by-Step Guide

1. Open Files

Click "Open PDF" to select your PDF form template
Click "Open Excel" to select your data spreadsheet
The application will analyze both files and display available fields/columns

2. Create Field Mappings

The left panel shows PDF form fields and Excel columns
Use drag-and-drop or double-click to create mappings
Configure data transformations and validation rules as needed
Use "Auto Map" for automatic mapping based on field name similarity
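As an illustration of name-similarity matching, here is a minimal sketch built on the standard library's `difflib`. The `normalize` and `auto_map` helpers, the 0.6 cutoff, and the greedy one-column-per-field strategy are assumptions made for this example; they are not taken from the application's `FieldMapper`.

```python
import difflib
import re


def normalize(name: str) -> str:
    """Lowercase a name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def auto_map(pdf_fields, excel_columns, cutoff=0.6):
    """Pair each PDF field with its most similar unused Excel column.

    Hypothetical helper: the cutoff and greedy strategy are assumptions.
    """
    # Map normalized column names back to the original headers
    remaining = {normalize(col): col for col in excel_columns}
    mapping = {}
    for field in pdf_fields:
        matches = difflib.get_close_matches(
            normalize(field), list(remaining), n=1, cutoff=cutoff
        )
        if matches:
            # Remove the column so it cannot be assigned twice
            mapping[field] = remaining.pop(matches[0])
    return mapping


if __name__ == "__main__":
    fields = ["first_name", "LastName", "date_of_birth", "signature"]
    columns = ["First Name", "Last Name", "Date of Birth", "Email"]
    print(auto_map(fields, columns))
```

Fields with no sufficiently similar column (here, `signature`) are left unmapped so they can be assigned by hand.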

3. Process Forms

Click "Preview" to see how the first form will look when filled
Click "Process All" to generate filled PDFs for all Excel rows
Select an output directory for the generated forms
Monitor progress in the status bar and log panel
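The "Process All" step can be pictured as a loop over the Excel rows that logs failures and keeps going. The sketch below is an assumption about how such a loop might look: `process_all`, the `form_0001.pdf` naming, and the `fill_form` callable (a stand-in for the real PDF-writing code) are all hypothetical.

```python
import logging
from pathlib import Path

logger = logging.getLogger("pdf_form_filler")


def process_all(rows, mapping, fill_form, output_dir):
    """Fill one form per row and return (succeeded, failed) counts.

    `fill_form(values, output_path)` is a hypothetical callable standing in
    for the real PDF-writing code.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    succeeded = failed = 0
    for index, row in enumerate(rows, start=1):
        # Translate the Excel row into {pdf_field: value} using the mapping
        values = {field: row.get(column, "") for field, column in mapping.items()}
        target = out / f"form_{index:04d}.pdf"
        try:
            fill_form(values, target)
            succeeded += 1
            logger.info("Row %d written to %s", index, target)
        except Exception:
            # Log and continue so one bad row does not stop the whole run
            failed += 1
            logger.exception("Row %d failed", index)
    return succeeded, failed
```

Returning both counts lets a caller report a summary once the run finishes.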

4. Save Templates

Save your field mappings as templates for reuse
Load saved templates for similar forms or future processing
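A mapping template can be as simple as a JSON file. The sketch below assumes a layout with `version`, `pdf`, and `mapping` keys; that layout and the `save_template` / `load_template` names are illustrative and may differ from the format the application actually writes.

```python
import json
from pathlib import Path


def save_template(path, pdf_name, mapping):
    """Write a mapping template to disk as JSON (the layout is an assumption)."""
    template = {"version": 1, "pdf": pdf_name, "mapping": mapping}
    Path(path).write_text(json.dumps(template, indent=2), encoding="utf-8")


def load_template(path):
    """Read a template back and return its field-to-column mapping."""
    template = json.loads(Path(path).read_text(encoding="utf-8"))
    if template.get("version") != 1:
        raise ValueError(f"Unsupported template version: {template.get('version')!r}")
    return template["mapping"]
```

A version number in the file makes it possible to reject or migrate older templates if the layout changes later.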

Configuration

Field Mapping Options

Basic Mapping: Simple field-to-column assignments
Data Transformations:
- Text case conversion (uppercase, lowercase, title case)
- Date formatting
- Text prefixes and suffixes
- Custom formatting rules
Validation Rules:
- Required field validation
- Data type validation (numeric, email)
- Length constraints
- Custom validation patterns
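To make the options above concrete, here is a small sketch of how transformations and validation rules could be expressed. The rule names (`uppercase`, `date`, `affix`, and so on), the function signatures, and the deliberately loose email pattern are assumptions for this example, not the application's actual configuration schema.

```python
import re
from datetime import date, datetime

# Text case conversions keyed by a hypothetical rule name
TRANSFORMS = {
    "uppercase": str.upper,
    "lowercase": str.lower,
    "title": str.title,
}


def apply_transform(value, rule, **options):
    """Apply one named transformation to a cell value."""
    if rule in TRANSFORMS:
        return TRANSFORMS[rule](str(value))
    if rule == "date":
        # Accept date/datetime objects or ISO strings such as "2024-03-05"
        if isinstance(value, (date, datetime)):
            parsed = value
        else:
            parsed = datetime.fromisoformat(str(value))
        return parsed.strftime(options.get("format", "%d/%m/%Y"))
    if rule == "affix":
        return f"{options.get('prefix', '')}{value}{options.get('suffix', '')}"
    raise ValueError(f"Unknown transformation: {rule}")


# Loose shape check only; it does not prove the address exists
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate(value, required=False, kind=None, max_length=None, pattern=None):
    """Return a list of problems; an empty list means the value passed."""
    text = "" if value is None else str(value).strip()
    if not text:
        return ["value is required"] if required else []
    errors = []
    if kind == "numeric":
        try:
            float(text)
        except ValueError:
            errors.append("not numeric")
    if kind == "email" and not EMAIL_PATTERN.fullmatch(text):
        errors.append("not an email address")
    if max_length is not None and len(text) > max_length:
        errors.append(f"longer than {max_length} characters")
    if pattern and not re.fullmatch(pattern, text):
        errors.append("does not match pattern")
    return errors
```

Returning a list of messages instead of raising on the first problem lets every issue for a row be reported at once.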

Output Options

File Naming: Customizable naming conventions for output files
Organization: Folder structure based on data attributes
Batch Processing: Configurable batch sizes for large datasets
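As a rough sketch of the naming and folder options, the helpers below build an output path from a row of data. The pattern syntax (`{LastName}_{FirstName}.pdf`), the `group_by` parameter, and both function names are hypothetical; the column names are placeholders for whatever headers the spreadsheet contains.

```python
import re
from pathlib import Path


def safe_name(text):
    """Replace characters that Windows does not allow in file names."""
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", str(text)).strip(" .")
    return cleaned or "unnamed"


def output_path(output_dir, row, name_pattern="{LastName}_{FirstName}.pdf", group_by=None):
    """Build the destination path for one row (pattern keys are hypothetical)."""
    folder = Path(output_dir)
    if group_by:
        # Optional subfolder named after a column value, e.g. a department
        folder = folder / safe_name(row.get(group_by, "unknown"))
    # Sanitize every value before it is substituted into the pattern
    safe_row = {key: safe_name(value) for key, value in row.items()}
    return folder / name_pattern.format_map(safe_row)
```

Sanitizing values before substitution keeps a stray slash or colon in the data from producing an invalid or unintended path.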

File Structure

PDF_Form_Filler/
├── main.py                 # Application entry point
├── requirements.txt        # Python dependencies
├── README.md              # This file
├── docs/
│   └── PRD.md             # Product Requirements Document
├── src/
│   ├── __init__.py
│   ├── pdf_processor.py   # PDF handling and form filling
│   ├── excel_handler.py   # Excel file operations
│   ├── field_mapper.py    # Field mapping logic
│   └── ui/
│       ├── __init__.py
│       └── main_window.py # Main GUI interface
├── .github/
│   └── copilot-instructions.md  # Development guidelines
└── .vscode/
    └── tasks.json         # VS Code build tasks

Development

Architecture

The application follows a modular design with clear separation of concerns:

PDF Processing: Handles PDF form field detection and filling operations
Excel Handling: Manages Excel file operations and data parsing
Field Mapping: Manages the relationship between PDF fields and Excel columns
User Interface: Tkinter-based GUI for user interaction

Key Components

PDFProcessor (`src/pdf_processor.py`)

PDF form field extraction using PyMuPDF
Form filling and output generation
PDF validation and compatibility checking

ExcelHandler (`src/excel_handler.py`)

Excel file loading and parsing with pandas
Data validation and preprocessing
Support for multiple worksheets and data formats

FieldMapper (`src/field_mapper.py`)

Field mapping configuration and management
Data transformation and validation rules
Template save/load functionality

MainWindow (`src/ui/main_window.py`)

Primary user interface implementation
File management and processing controls
Progress tracking and logging display

Running in Development Mode

To run the application in development mode with enhanced logging:

python -c "import logging; logging.basicConfig(level=logging.DEBUG); exec(open('main.py').read())"
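An alternative to the one-liner, if a reusable launcher is preferred, is a small script built on the standard library's `runpy`. This is a sketch only: the `run_with_debug_logging` name and the log format are assumptions, and if `main.py` configures logging itself, its own settings may not take effect because `logging.basicConfig` does nothing once the root logger has handlers.

```python
import logging
import runpy


def run_with_debug_logging(script="main.py"):
    """Enable DEBUG logging, then run the script as if invoked directly."""
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    # run_name="__main__" makes any `if __name__ == "__main__":` guard fire
    return runpy.run_path(script, run_name="__main__")
```

Calling `run_with_debug_logging()` from the repository root would then start the application with debug output enabled.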

Building and Testing

A VS Code task is configured for building and running the application. Press Ctrl+Shift+P and run "Tasks: Run Task" to list the available tasks.

Troubleshooting

Common Issues

PDF Not Loading

Ensure the PDF contains fillable form fields (AcroForms)
Check whether the PDF is password protected; protected PDFs are not supported yet (support is planned for a future release)
Verify the PDF is not corrupted or damaged

Excel Import Errors

Ensure the Excel file is not open in another application
Check that the file format is supported (.xlsx, .xls)
Verify column headers are present and properly formatted

Processing Failures

Check that all required fields have mappings or default values
Verify Excel data types match PDF field requirements
Review the log panel for specific error messages
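Type mismatches often come from how spreadsheet values arrive in Python: whole numbers may load as floats (`12345.0`), and pandas typically loads empty cells as `NaN`. The sketch below shows one way to convert such values to text before writing them to a PDF field. `to_field_text` and its default date format are assumptions for this example, not part of the application.

```python
import math
from datetime import date, datetime


def to_field_text(value, date_format="%Y-%m-%d"):
    """Convert one spreadsheet cell value to the text written into a PDF field."""
    if value is None:
        return ""
    if isinstance(value, float):
        if math.isnan(value):
            return ""  # empty cells typically arrive as NaN
        if value.is_integer():
            return str(int(value))  # 12345.0 becomes "12345"
    if isinstance(value, (date, datetime)):
        return value.strftime(date_format)
    return str(value).strip()
```

Normalizing values in one place keeps ID numbers and postal codes from showing up with a trailing `.0` in the filled form.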

Performance Optimization

For large datasets (>1000 rows), consider processing in smaller batches
Close unnecessary applications to free up memory during processing
Use SSD storage for faster file I/O operations
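Processing in smaller batches can be sketched with a generator that yields fixed-size chunks of rows. The `batches` helper and the default size of 250 are illustrative assumptions; a suitable size depends on the form and the machine.

```python
from itertools import islice


def batches(rows, size=250):
    """Yield successive lists of at most `size` rows from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        # Pull the next chunk lazily so the full dataset is never copied at once
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Each batch can then be processed and released before the next one is read, which keeps peak memory use roughly proportional to the batch size.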

Contributing

This is a private project, but contributions are welcome. Please:

Follow the coding standards outlined in .github/copilot-instructions.md
Add appropriate tests for new functionality
Update documentation for any new features
Ensure all existing tests pass before submitting changes

License

Support

For technical support or feature requests, please refer to the project documentation or contact the development team.

Version History

Version 1.0.0 (Current)

Initial release with core functionality
PDF form field detection and filling
Excel data import and processing
Basic field mapping interface
Template save/load functionality
Comprehensive logging and error handling

Planned Features

Enhanced PDF preview with field highlighting
Advanced data transformation options
Batch processing with pause/resume
Integration with cloud storage services
Mobile companion app for field verification
