A Python desktop application that fills PDF forms in bulk using data from Excel spreadsheets. It maps PDF form fields to Excel columns and generates one filled form per spreadsheet row, eliminating manual data entry and the transcription errors that come with it.
- PDF Form Analysis: Automatically detect and extract form fields from PDF documents
- Excel Data Import: Import and parse Excel spreadsheets (.xlsx, .xls formats)
- Field Mapping: Visual drag-and-drop interface for mapping PDF fields to Excel columns
- Bulk Processing: Process multiple PDF forms using mapped Excel data
- Template Management: Save and reuse field mapping configurations
- Data Validation: Custom validation rules for form fields
- Data Transformation: Apply formatting rules (uppercase, lowercase, date formatting)
- Audit Trail: Comprehensive logging of all processing activities
- Preview Mode: Preview filled forms before final processing
- Error Handling: Graceful handling of missing or invalid data
- Python 3.8 or higher
- Windows 10/11 (primary target platform)
- Clone or download this repository
- Install required dependencies:
pip install -r requirements.txt
- Run the application:
python main.py
- PyPDF2: PDF manipulation and form field extraction
- PyMuPDF: Enhanced PDF processing capabilities
- pandas: Excel file handling and data manipulation
- openpyxl: Excel file format support
- tkinter: GUI framework (included with the standard Python installer for Windows)
- Pillow: Image processing for preview functionality
- Load PDF Form: Open a PDF file containing form fields
- Load Excel Data: Import an Excel spreadsheet with the data to fill forms
- Map Fields: Create mappings between PDF form fields and Excel columns
- Process Forms: Generate filled PDF forms for each row in the Excel data
- Click "Open PDF" to select your PDF form template
- Click "Open Excel" to select your data spreadsheet
- The application will analyze both files and display available fields/columns
- The left panel shows PDF form fields and Excel columns
- Use drag-and-drop or double-click to create mappings
- Configure data transformations and validation rules as needed
- Use "Auto Map" for automatic mapping based on field name similarity
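As an illustration of what name-similarity matching can look like, here is a minimal sketch built on the standard library's `difflib`. The `auto_map` and `normalize` helpers, the greedy pairing strategy, and the `0.8` threshold are assumptions made for this example; the application's own "Auto Map" logic may work differently.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def auto_map(pdf_fields, excel_columns, threshold=0.8):
    """Pair each PDF field with its most similar unused Excel column.

    Hypothetical helper: the threshold and scoring method are assumptions.
    """
    mapping = {}
    available = list(excel_columns)
    for field in pdf_fields:
        best_column, best_score = None, 0.0
        for column in available:
            score = SequenceMatcher(None, normalize(field), normalize(column)).ratio()
            if score > best_score:
                best_column, best_score = column, score
        # Only accept matches that clear the threshold; each column is used once.
        if best_column is not None and best_score >= threshold:
            mapping[field] = best_column
            available.remove(best_column)
    return mapping


example = auto_map(
    ["first_name", "LastName", "dob"],
    ["First Name", "Last Name", "Email"],
)
```

Fields with no sufficiently similar column (here `dob`) are left unmapped so they can be assigned by hand.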
- Click "Preview" to see how the first form will look when filled
- Click "Process All" to generate filled PDFs for all Excel rows
- Select an output directory for the generated forms
- Monitor progress in the status bar and log panel
- Save your field mappings as templates for reuse
- Load saved templates for similar forms or future processing
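One plausible way to persist a mapping template is a small JSON file. The sketch below assumes a hypothetical layout (a `version` number plus a `mappings` dictionary of PDF field name to Excel column); the format the application actually writes is not documented here and may differ.

```python
import json
import tempfile
from pathlib import Path


def save_template(path, mappings):
    """Write field mappings to a JSON file (assumed template layout)."""
    payload = {"version": 1, "mappings": mappings}
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_template(path):
    """Read field mappings back from a JSON template file."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    return payload["mappings"]


# Round trip through a temporary directory to show the two calls together.
with tempfile.TemporaryDirectory() as tmp:
    template_path = Path(tmp) / "example_template.json"
    save_template(template_path, {"first_name": "First Name", "dob": "Birth Date"})
    restored = load_template(template_path)
```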
- Basic Mapping: Simple field-to-column assignments
- Data Transformations:
- Text case conversion (uppercase, lowercase, title case)
- Date formatting
- Text prefixes and suffixes
- Custom formatting rules
- Validation Rules:
- Required field validation
- Data type validation (numeric, email)
- Length constraints
- Custom validation patterns
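The transformation and validation options above could be modeled as plain rule dictionaries applied to each cell value. This is a sketch only: the rule keys (`case`, `date_in`, `date_out`, `prefix`, `suffix`, `required`, `type`, `max_length`, `pattern`) and the `transform` and `validate` helpers are hypothetical names chosen for the example, not the application's configuration schema.

```python
import re
from datetime import datetime

# Deliberately loose email check, for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def transform(value, rule):
    """Apply one transformation rule (a dict) to a cell value."""
    text = "" if value is None else str(value).strip()
    case = rule.get("case")
    if case == "upper":
        text = text.upper()
    elif case == "lower":
        text = text.lower()
    elif case == "title":
        text = text.title()
    # Re-format dates from the spreadsheet's format to the form's format.
    if text and rule.get("date_in") and rule.get("date_out"):
        text = datetime.strptime(text, rule["date_in"]).strftime(rule["date_out"])
    # Prefix and suffix are only added to non-empty values.
    if text:
        text = rule.get("prefix", "") + text + rule.get("suffix", "")
    return text


def validate(text, rule):
    """Return a list of problems found; an empty list means the value is valid."""
    errors = []
    if not text:
        if rule.get("required"):
            errors.append("value is required")
        return errors
    if rule.get("type") == "numeric":
        try:
            float(text)
        except ValueError:
            errors.append("not numeric")
    if rule.get("type") == "email" and not EMAIL_PATTERN.match(text):
        errors.append("not an email address")
    if "max_length" in rule and len(text) > rule["max_length"]:
        errors.append("longer than %d characters" % rule["max_length"])
    if "pattern" in rule and not re.fullmatch(rule["pattern"], text):
        errors.append("does not match pattern")
    return errors
```

Running `transform` before `validate` means that length and pattern rules are checked against the value as it will appear in the PDF, not the raw spreadsheet cell.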
- File Naming: Customizable naming conventions for output files
- Organization: Folder structure based on data attributes
- Batch Processing: Configurable batch sizes for large datasets
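To make these output options concrete, the sketch below builds an output path from a naming pattern and an optional folder column, and splits rows into fixed-size batches. The helper names, the `{index}` placeholder, and the `folder_column` parameter are assumptions for illustration rather than the application's actual settings.

```python
import re
from pathlib import Path


def sanitize(text):
    """Replace characters that are not allowed in Windows file names."""
    cleaned = re.sub(r'[<>:"/\\|?*]+', "_", text).strip(" .")
    return cleaned or "unnamed"


def output_path(output_dir, row, index, name_pattern="{index:04d}", folder_column=None):
    """Build the output path for one row (hypothetical naming scheme)."""
    # The pattern may reference any column of the row, plus the row index.
    name = sanitize(name_pattern.format_map({**row, "index": index}))
    folder = Path(output_dir)
    if folder_column:
        # Group output files into a subfolder named after a data attribute.
        folder = folder / sanitize(str(row[folder_column]))
    return folder / (name + ".pdf")


def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

For example, `output_path("out", {"last_name": "Smith", "dept": "Sales"}, 7, name_pattern="{last_name}_{index:03d}", folder_column="dept")` yields the path `out/Sales/Smith_007.pdf`.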
PDF_Form_Filler/
├── main.py                        # Application entry point
├── requirements.txt               # Python dependencies
├── README.md                      # This file
├── docs/
│   └── PRD.md                     # Product Requirements Document
├── src/
│   ├── __init__.py
│   ├── pdf_processor.py           # PDF handling and form filling
│   ├── excel_handler.py           # Excel file operations
│   ├── field_mapper.py            # Field mapping logic
│   └── ui/
│       ├── __init__.py
│       └── main_window.py         # Main GUI interface
├── .github/
│   └── copilot-instructions.md    # Development guidelines
└── .vscode/
    └── tasks.json                 # VS Code build tasks
The application follows a modular design with clear separation of concerns:
- PDF Processing: Handles PDF form field detection and filling operations
- Excel Handling: Manages Excel file operations and data parsing
- Field Mapping: Manages the relationship between PDF fields and Excel columns
- User Interface: Tkinter-based GUI for user interaction
- PDF form field extraction using PyMuPDF
- Form filling and output generation
- PDF validation and compatibility checking
- Excel file loading and parsing with pandas
- Data validation and preprocessing
- Support for multiple worksheets and data formats
- Field mapping configuration and management
- Data transformation and validation rules
- Template save/load functionality
- Primary user interface implementation
- File management and processing controls
- Progress tracking and logging display
To run the application in development mode with enhanced logging:
python -c "import logging; logging.basicConfig(level=logging.DEBUG); exec(open('main.py').read())"
A VS Code task is configured for building and running the application. Press Ctrl+Shift+P and search for "Tasks: Run Task" to see the available tasks.
- Ensure the PDF contains fillable form fields (AcroForms)
- Check whether the PDF is password protected; protected PDFs are not yet supported (support is planned for a future release)
- Verify the PDF is not corrupted or damaged
- Ensure the Excel file is not open in another application
- Check that the file format is supported (.xlsx, .xls)
- Verify column headers are present and properly formatted
- Check that all required fields have mappings or default values
- Verify Excel data types match PDF field requirements
- Review the log panel for specific error messages
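The first of these checks can be expressed as a small pre-flight function that runs before processing starts. This is a sketch under assumptions: `check_mappings` and its parameters are hypothetical names, and the application may report these problems differently.

```python
def check_mappings(required_fields, mappings, excel_columns, defaults=None):
    """Return a list of mapping problems; an empty list means none were found."""
    defaults = defaults or {}
    problems = []
    # Every required PDF field needs either a mapped column or a default value.
    for field in required_fields:
        if field not in mappings and field not in defaults:
            problems.append(f"{field}: no mapping or default value")
    # Every mapped column must actually exist in the loaded spreadsheet.
    for field, column in mappings.items():
        if column not in excel_columns:
            problems.append(f"{field}: mapped column '{column}' not found in spreadsheet")
    return problems


issues = check_mappings(
    required_fields=["name", "date", "signature"],
    mappings={"name": "Full Name", "date": "Signed On"},
    excel_columns=["Full Name", "Email"],
    defaults={"signature": "N/A"},
)
```

In this example `issues` contains a single entry for `date`, whose mapped column `Signed On` is missing from the spreadsheet.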
- For large datasets (>1000 rows), consider processing in smaller batches
- Close unnecessary applications to free up memory during processing
- Use SSD storage for faster file I/O operations
This is a private project, but contributions are welcome. Please:
- Follow the coding standards outlined in .github/copilot-instructions.md
- Add appropriate tests for new functionality
- Update documentation for any new features
- Ensure all existing tests pass before submitting changes
This project is for internal use. All rights reserved.
For technical support or feature requests, please refer to the project documentation or contact the development team.
- Initial release with core functionality
- PDF form field detection and filling
- Excel data import and processing
- Basic field mapping interface
- Template save/load functionality
- Comprehensive logging and error handling
- Enhanced PDF preview with field highlighting
- Advanced data transformation options
- Batch processing with pause/resume
- Integration with cloud storage services
- Mobile companion app for field verification