PDF Form Filler - Automated Form Processing Tool. PDF Form Filler is a powerful Python desktop application designed to automate the tedious process of filling out repetitive PDF forms.

Mikekay262/PDF_Form_Filler

PDF Form Filler

A powerful Python desktop application that automates the filling of PDF forms using data from Excel spreadsheets. This tool enables bulk processing of forms by mapping PDF form fields to Excel columns, eliminating manual data entry and reducing errors.

🚀 Features

Core Functionality

  • PDF Form Analysis: Automatically detect and extract form fields from PDF documents
  • Excel Data Import: Import and parse Excel spreadsheets (.xlsx, .xls formats)
  • Field Mapping: Visual drag-and-drop interface for mapping PDF fields to Excel columns
  • Bulk Processing: Process multiple PDF forms using mapped Excel data
  • Template Management: Save and reuse field mapping configurations

Advanced Features

  • Data Validation: Custom validation rules for form fields
  • Data Transformation: Apply formatting rules (uppercase, lowercase, date formatting)
  • Audit Trail: Comprehensive logging of all processing activities
  • Preview Mode: Preview filled forms before final processing
  • Error Handling: Graceful handling of missing or invalid data

Installation

Prerequisites

  • Python 3.8 or higher
  • Windows 10/11 (primary target platform)

Setup

  1. Clone or download this repository
  2. Install required dependencies:
    pip install -r requirements.txt
  3. Run the application:
    python main.py

Dependencies

  • PyPDF2: PDF manipulation and form field extraction
  • PyMuPDF: Enhanced PDF processing capabilities
  • pandas: Excel file handling and data manipulation
  • openpyxl: Excel file format support
  • tkinter: GUI framework (bundled with the standard Python installer on Windows)
  • Pillow: Image processing for preview functionality

Usage

Basic Workflow

  1. Load PDF Form: Open a PDF file containing form fields
  2. Load Excel Data: Import an Excel spreadsheet with the data to fill forms
  3. Map Fields: Create mappings between PDF form fields and Excel columns
  4. Process Forms: Generate filled PDF forms for each row in the Excel data
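
The workflow above boils down to turning each spreadsheet row into a set of PDF field values. A minimal sketch of that step, assuming a mapping of the form {pdf_field: excel_column}; the function name `rows_to_field_values` is hypothetical and is not taken from the project's source:

```python
from typing import Any, Dict, Iterable, List


def rows_to_field_values(
    rows: Iterable[Dict[str, Any]],
    mapping: Dict[str, str],
) -> List[Dict[str, Any]]:
    """Build one {pdf_field: value} dict per spreadsheet row.

    `mapping` is {pdf_field_name: excel_column_name}. A column that is
    missing from a row yields an empty string for that field.
    """
    return [
        {field: row.get(column, "") for field, column in mapping.items()}
        for row in rows
    ]


# Example data (made up for illustration).
example_rows = [{"Name": "Ada Lovelace", "City": "London"}]
example_mapping = {"full_name": "Name", "city": "City", "phone": "Phone"}
print(rows_to_field_values(example_rows, example_mapping))
```

Each resulting dict would then be written into one copy of the PDF template.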

Step-by-Step Guide

1. Open Files

  • Click "Open PDF" to select your PDF form template
  • Click "Open Excel" to select your data spreadsheet
  • The application will analyze both files and display available fields/columns

2. Create Field Mappings

  • The left panel shows PDF form fields and Excel columns
  • Use drag-and-drop or double-click to create mappings
  • Configure data transformations and validation rules as needed
  • Use "Auto Map" for automatic mapping based on field name similarity
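
One plausible way to implement name-similarity matching such as "Auto Map" is the standard library's `difflib`. The sketch below is an assumption about how it could work, not the application's actual algorithm; `auto_map`, `normalize`, and the 0.6 cutoff are illustrative choices:

```python
import difflib
import re
from typing import Dict, List


def normalize(name: str) -> str:
    """Lowercase a name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def auto_map(
    pdf_fields: List[str],
    excel_columns: List[str],
    cutoff: float = 0.6,
) -> Dict[str, str]:
    """Map each PDF field to the most similar Excel column, if any.

    Fields with no column scoring at least `cutoff` are left unmapped
    so the user can map them by hand.
    """
    by_normalized = {normalize(column): column for column in excel_columns}
    mapping: Dict[str, str] = {}
    for field in pdf_fields:
        matches = difflib.get_close_matches(
            normalize(field), list(by_normalized), n=1, cutoff=cutoff
        )
        if matches:
            mapping[field] = by_normalized[matches[0]]
    return mapping
```

Abbreviations such as "DOB" will usually fall below the cutoff against "Date of Birth", so a manual mapping step remains necessary for those fields.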

3. Process Forms

  • Click "Preview" to see how the first form will look when filled
  • Click "Process All" to generate filled PDFs for all Excel rows
  • Select an output directory for the generated forms
  • Monitor progress in the status bar and log panel
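
A "Process All" run typically needs to keep going when a single row fails and to log what happened. The following sketch shows that pattern; `process_all` and the `fill_one` callback are hypothetical names, and the logger name is an assumption:

```python
import logging
from typing import Any, Callable, Dict, Iterable, List, Tuple

logger = logging.getLogger("pdf_form_filler")  # assumed logger name


def process_all(
    rows: Iterable[Dict[str, Any]],
    fill_one: Callable[[int, Dict[str, Any]], None],
) -> Tuple[int, List[Tuple[int, str]]]:
    """Call `fill_one(index, row)` for every row and collect failures.

    Returns (number_succeeded, [(row_index, error_message), ...]).
    A failing row is logged and skipped instead of aborting the run.
    """
    succeeded = 0
    failed: List[Tuple[int, str]] = []
    for index, row in enumerate(rows, start=1):
        try:
            fill_one(index, row)
        except Exception as exc:  # keep the batch alive on per-row errors
            failed.append((index, str(exc)))
            logger.error("Row %d failed: %s", index, exc)
        else:
            succeeded += 1
            logger.info("Row %d processed", index)
    return succeeded, failed
```

The returned failure list can be shown in the log panel once the run finishes.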

4. Save Templates

  • Save your field mappings as templates for reuse
  • Load saved templates for similar forms or future processing
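
A mapping template can be as simple as a JSON file. The sketch below assumes a layout with a `version` number and a `mappings` object; the on-disk format the application really uses is not documented here, so treat these names as hypothetical:

```python
import json
from pathlib import Path
from typing import Dict, Union


def save_template(path: Union[str, Path], mapping: Dict[str, str]) -> None:
    """Write a {pdf_field: excel_column} mapping to a JSON template file."""
    payload = {"version": 1, "mappings": mapping}
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_template(path: Union[str, Path]) -> Dict[str, str]:
    """Read a mapping back from a JSON template file."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    return payload["mappings"]
```

Keeping a version number in the file makes it easier to change the template layout later without breaking older templates.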

Configuration

Field Mapping Options

  • Basic Mapping: Simple field-to-column assignments
  • Data Transformations:
    • Text case conversion (uppercase, lowercase, title case)
    • Date formatting
    • Text prefixes and suffixes
    • Custom formatting rules
  • Validation Rules:
    • Required field validation
    • Data type validation (numeric, email)
    • Length constraints
    • Custom validation patterns
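
The transformation and validation options listed above could be expressed as small, composable functions. This is a sketch under assumptions: the rule names (`uppercase`, `numeric`, and so on), the default date formats, and the simplified email pattern are illustrative and are not the application's configuration keys:

```python
import re
from datetime import datetime
from typing import Any, List, Optional

# Text case conversions keyed by an assumed rule name.
TRANSFORMS = {"uppercase": str.upper, "lowercase": str.lower, "title": str.title}

# Deliberately simple email check: something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def apply_transform(value: Any, name: Optional[str] = None,
                    prefix: str = "", suffix: str = "") -> str:
    """Apply an optional case conversion, then add a prefix and suffix."""
    text = str(value)
    if name:
        text = TRANSFORMS[name](text)
    return f"{prefix}{text}{suffix}"


def format_date(value: str, in_fmt: str = "%Y-%m-%d",
                out_fmt: str = "%m/%d/%Y") -> str:
    """Re-format a date string from one strftime pattern to another."""
    return datetime.strptime(value, in_fmt).strftime(out_fmt)


def validate(value: Any, required: bool = False, kind: Optional[str] = None,
             max_length: Optional[int] = None,
             pattern: Optional[str] = None) -> List[str]:
    """Return a list of validation error messages (empty if valid)."""
    text = "" if value is None else str(value).strip()
    if not text:
        return ["value is required"] if required else []
    errors: List[str] = []
    if kind == "numeric":
        try:
            float(text)
        except ValueError:
            errors.append("not numeric")
    if kind == "email" and not EMAIL_RE.match(text):
        errors.append("not a valid email address")
    if max_length is not None and len(text) > max_length:
        errors.append(f"longer than {max_length} characters")
    if pattern and not re.fullmatch(pattern, text):
        errors.append("does not match the required pattern")
    return errors
```

Returning a list of messages rather than raising on the first problem lets the caller report every issue for a row at once.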

Output Options

  • File Naming: Customizable naming conventions for output files
  • Organization: Folder structure based on data attributes
  • Batch Processing: Configurable batch sizes for large datasets
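
File naming and folder organization can both be driven by row data. The sketch below assumes a naming pattern written with Python format fields (for example `{LastName}_{index:03d}.pdf`) and an optional column to group by; `output_path` and its parameters are hypothetical:

```python
import re
from pathlib import Path
from typing import Any, Dict, Optional

# Characters that are not allowed in Windows file names.
_UNSAFE = re.compile(r'[<>:"/\\|?*]')


class _MissingAsUnknown(dict):
    """Dict that substitutes 'unknown' for columns absent from the row."""

    def __missing__(self, key: str) -> str:
        return "unknown"


def _safe(value: Any) -> str:
    """Replace unsafe characters; fall back to 'unknown' for empty text."""
    return _UNSAFE.sub("_", str(value)).strip() or "unknown"


def output_path(out_dir: str, pattern: str, row: Dict[str, Any],
                index: int, group_by: Optional[str] = None) -> Path:
    """Build the output path for one filled form.

    `pattern` may reference any column name plus `{index}`.
    `group_by` optionally names a column used as a subfolder.
    """
    values = _MissingAsUnknown({key: _safe(val) for key, val in row.items()})
    values["index"] = index
    path = Path(out_dir)
    if group_by:
        path = path / values[group_by]
    return path / pattern.format_map(values)
```

Sanitizing values before formatting keeps a stray slash or colon in the spreadsheet from producing an invalid or unexpected path.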

File Structure

PDF_Form_Filler/
├── main.py                 # Application entry point
├── requirements.txt        # Python dependencies
├── README.md              # This file
├── docs/
│   └── PRD.md             # Product Requirements Document
├── src/
│   ├── __init__.py
│   ├── pdf_processor.py   # PDF handling and form filling
│   ├── excel_handler.py   # Excel file operations
│   ├── field_mapper.py    # Field mapping logic
│   └── ui/
│       ├── __init__.py
│       └── main_window.py # Main GUI interface
├── .github/
│   └── copilot-instructions.md  # Development guidelines
└── .vscode/
    └── tasks.json         # VS Code build tasks

Development

Architecture

The application follows a modular design with clear separation of concerns:

  • PDF Processing: Handles PDF form field detection and filling operations
  • Excel Handling: Manages Excel file operations and data parsing
  • Field Mapping: Manages the relationship between PDF fields and Excel columns
  • User Interface: Tkinter-based GUI for user interaction

Key Components

PDFProcessor (src/pdf_processor.py)

  • PDF form field extraction using PyMuPDF
  • Form filling and output generation
  • PDF validation and compatibility checking
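
Field extraction and filling with PyMuPDF generally revolves around the widgets on each page. The sketch below is not the project's `PDFProcessor`; it is a minimal illustration that accepts any document object whose pages expose `widgets()`, as a PyMuPDF `Document` does. The function names are hypothetical, and the fill step treats every value as text, so checkboxes and radio buttons would need type-specific handling:

```python
from typing import Any, Dict


def list_form_fields(doc: Any) -> Dict[str, str]:
    """Return {field_name: field_type} for every widget in the document."""
    fields: Dict[str, str] = {}
    for page in doc:
        for widget in page.widgets():
            fields[widget.field_name] = widget.field_type_string
    return fields


def fill_form_fields(doc: Any, values: Dict[str, Any]) -> int:
    """Write `values` into matching widgets and return how many were set."""
    filled = 0
    for page in doc:
        for widget in page.widgets():
            if widget.field_name in values:
                widget.field_value = str(values[widget.field_name])
                widget.update()  # regenerate the field's appearance
                filled += 1
    return filled


# Usage with PyMuPDF (file names are placeholders):
#   import fitz
#   doc = fitz.open("template.pdf")
#   fill_form_fields(doc, {"full_name": "Ada Lovelace"})
#   doc.save("filled.pdf")
```

Passing the document in, rather than opening it inside the functions, keeps the logic testable without a real PDF.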

ExcelHandler (src/excel_handler.py)

  • Excel file loading and parsing with pandas
  • Data validation and preprocessing
  • Support for multiple worksheets and data formats
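
Preprocessing usually means tidying headers, blanking out missing cells, and dropping empty rows before any form is filled. A minimal sketch, assuming rows arrive as a list of dicts (for example from `pandas.read_excel(path).to_dict(orient="records")`); `clean_rows` is a hypothetical helper, not part of `ExcelHandler`:

```python
from typing import Any, Dict, Iterable, List


def clean_rows(rows: Iterable[Dict[Any, Any]]) -> List[Dict[str, Any]]:
    """Strip header whitespace, blank out missing cells, drop empty rows."""
    cleaned: List[Dict[str, Any]] = []
    for row in rows:
        out: Dict[str, Any] = {}
        for header, value in row.items():
            # NaN is the only value that is not equal to itself.
            if value is None or value != value:
                value = ""
            out[str(header).strip()] = value
        # Keep the row only if at least one cell has visible content.
        if any(str(cell).strip() for cell in out.values()):
            cleaned.append(out)
    return cleaned
```

Replacing NaN with an empty string avoids the literal text "nan" ending up in a filled form.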

FieldMapper (src/field_mapper.py)

  • Field mapping configuration and management
  • Data transformation and validation rules
  • Template save/load functionality

MainWindow (src/ui/main_window.py)

  • Primary user interface implementation
  • File management and processing controls
  • Progress tracking and logging display

Running in Development Mode

To run the application in development mode with enhanced logging:

python -c "import logging, runpy; logging.basicConfig(level=logging.DEBUG); runpy.run_path('main.py', run_name='__main__')"

Building and Testing

A VS Code task is configured for building and running the application. Use Ctrl+Shift+P and search for "Tasks: Run Task" to access available tasks.

Troubleshooting

Common Issues

PDF Not Loading

  • Ensure the PDF contains fillable form fields (AcroForms)
  • Check whether the PDF is password protected; password-protected PDFs are not yet supported (planned for a future release)
  • Verify the PDF is not corrupted or damaged

Excel Import Errors

  • Ensure the Excel file is not open in another application
  • Check that the file format is supported (.xlsx, .xls)
  • Verify column headers are present and properly formatted

Processing Failures

  • Check that all required fields have mappings or default values
  • Verify Excel data types match PDF field requirements
  • Review the log panel for specific error messages
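
Many processing failures can be caught before any PDF is written. The sketch below shows one possible pre-flight check covering the first point above, plus mappings that point at columns missing from the spreadsheet; `preflight_check` and its message wording are hypothetical:

```python
from typing import Any, Dict, Iterable, List, Optional


def preflight_check(
    required_fields: Iterable[str],
    mapping: Dict[str, str],
    excel_columns: Iterable[str],
    defaults: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Return a list of problems found before processing starts."""
    defaults = defaults or {}
    columns = set(excel_columns)
    problems: List[str] = []
    for field in required_fields:
        if field not in mapping and field not in defaults:
            problems.append(
                f"required field '{field}' has no mapping or default"
            )
    for field, column in mapping.items():
        if column not in columns:
            problems.append(
                f"field '{field}' is mapped to missing column '{column}'"
            )
    return problems
```

An empty list means the run can start; otherwise the messages can be shown to the user in one go.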

Performance Optimization

  • For large datasets (>1000 rows), consider processing in smaller batches
  • Close unnecessary applications to free up memory during processing
  • Use SSD storage for faster file I/O operations
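
Processing in smaller batches can be done with a simple generator that never holds more than one batch in memory. A sketch; `iter_batches` is a hypothetical helper name:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

For example, `for batch in iter_batches(rows, 250): ...` would process a 1,000-row spreadsheet in four passes.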

Contributing

This is a private project, but contributions are welcome. Please:

  1. Follow the coding standards outlined in .github/copilot-instructions.md
  2. Add appropriate tests for new functionality
  3. Update documentation for any new features
  4. Ensure all existing tests pass before submitting changes

License

This project is for internal use. All rights reserved.

Support

For technical support or feature requests, please refer to the project documentation or contact the development team.

Version History

Version 1.0.0 (Current)

  • Initial release with core functionality
  • PDF form field detection and filling
  • Excel data import and processing
  • Basic field mapping interface
  • Template save/load functionality
  • Comprehensive logging and error handling

Planned Features

  • Enhanced PDF preview with field highlighting
  • Advanced data transformation options
  • Batch processing with pause/resume
  • Integration with cloud storage services
  • Mobile companion app for field verification
