Automates contract data extraction from PDFs to Excel, reducing manual entry and improving accuracy. The tool captures key details such as Contract IDs, Period of Performance, and other essential descriptions. The built-in regex pattern is easy to read and can be adapted to other data-scraping tasks.

Hamiltonius/PDF-Contract-Parser

PDF Contract Parser

What's New in v2

  • Enhanced parsing logic for greater accuracy and efficiency.
  • Improved metrics generation and structured output.
  • Cleaner code structure for easier maintenance and scalability.

Versions

  • v1: Initial implementation for basic contract parsing.
  • v2: Improved parsing logic, enhanced features, and better metrics generation.

Overview

A Python script for extracting contract details from PDF files and generating comprehensive metrics.

Features

  • Extracts specific contract details
  • Generates performance metrics
  • Exports results to Excel
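The extraction step can be illustrated with a minimal sketch. It assumes the PDF text has already been read into a string, and the patterns, field names, and the `extract_contract_details` helper are hypothetical: this README does not show the repository's actual regex, so treat this as one possible shape rather than the script's implementation.

```python
import re

# Hypothetical patterns for illustration only; the project's real regex may differ.
CONTRACT_ID = re.compile(r"Contract\s+ID[:\s]+([A-Z0-9-]+)", re.IGNORECASE)
PERIOD = re.compile(
    r"Period\s+of\s+Performance[:\s]+"
    r"(\d{1,2}/\d{1,2}/\d{4})\s*(?:-|to|through)\s*(\d{1,2}/\d{1,2}/\d{4})",
    re.IGNORECASE,
)


def extract_contract_details(text: str) -> dict:
    """Return the contract ID and period of performance found in text.

    Fields that are not found are set to None so that missing values
    stay visible in the exported results.
    """
    id_match = CONTRACT_ID.search(text)
    pop_match = PERIOD.search(text)
    return {
        "contract_id": id_match.group(1) if id_match else None,
        "pop_start": pop_match.group(1) if pop_match else None,
        "pop_end": pop_match.group(2) if pop_match else None,
    }


# Made-up sample text standing in for the contents of a parsed PDF.
sample = "Contract ID: ABC-2023-001\nPeriod of Performance: 01/01/2023 - 12/31/2023"
details = extract_contract_details(sample)
```

Returning a plain dictionary per file keeps the result easy to collect into rows for a spreadsheet export, and adapting the tool to another document type would mostly mean editing the two compiled patterns.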

Requirements

  • Python 3.7+
  • Dependencies listed in requirements.txt

Installation

  1. Clone the repository
  2. Create a virtual environment
  3. Install dependencies: pip install -r requirements.txt

Usage

Set the directory path in main() to the folder that contains your contract PDFs, then run the script.
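As a rough sketch of what that edit looks like, the snippet below shows a `main()` with a single directory variable and a helper that collects the PDF files in it. The names `find_pdfs` and `directory` are assumptions made for illustration; the actual script may structure this differently.

```python
from pathlib import Path


def find_pdfs(directory: str) -> list:
    """Return the PDF files directly inside directory, sorted by name."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def main() -> None:
    # Hypothetical setting: point this at the folder holding your contract PDFs.
    directory = "."
    for pdf in find_pdfs(directory):
        # The real script would parse each file here; this sketch only lists them.
        print(pdf.name)


if __name__ == "__main__":
    main()
```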
