InvoiScope is an intelligent GST invoice information extraction system powered by YOLOv9c object detection and OCR technology. Extract key information from GST invoices with high accuracy and confidence through an intuitive web interface.
- Features
- Demo
- Installation
- Usage
- Project Structure
- Supported GST Fields
- Technical Architecture
- Configuration
- Model Performance
- API Reference
- Troubleshooting
- Contributing
- License
- High Accuracy Detection: 74.1% mAP@0.5 with 85.4% precision using optimized YOLOv9c
- Comprehensive Field Extraction: 24 different GST invoice fields across 3 categories
- Visual Annotations: Interactive annotated images showing detected fields with bounding boxes
- Modern Web Interface: Clean, responsive Streamlit interface with real-time processing
- Multiple Export Formats: Download results as CSV or JSON with timestamps
- Fast Processing: 2-5 seconds per invoice with optimized inference pipeline
- Configurable Confidence: Adjustable confidence thresholds for different use cases
- Performance Metrics: Real-time confidence scores and quality assessments
- Categorized Results: Organized display of invoice metadata, business info, and tax details
- Upload GST invoice images (JPG, PNG, JPEG)
- Real-time field detection and text extraction
- Interactive confidence threshold adjustment
- Categorized results display with visual indicators
- Downloadable extraction reports
- Python: 3.8 or higher
- Tesseract OCR: Required for text extraction
- System Memory: At least 4GB RAM recommended
- Storage: ~2GB for dependencies and model files
git clone https://github.com/ayush2635/Invoiscope.git
cd Invoiscope
# Windows
python -m venv venv
venv\Scripts\activate
# macOS/Linux
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Windows:
- Download from: https://github.com/UB-Mannheim/tesseract/wiki
- Install and add to PATH
- Verify:
tesseract --version
macOS:
brew install tesseract
Ubuntu/Debian:
sudo apt-get install tesseract-ocr
- Place your trained YOLOv9c model file in the models/ directory
- Rename it to: gst_invoice_yolov9c_optimal.pt
- Ensure the file path matches: models/gst_invoice_yolov9c_optimal.pt
python -c "import streamlit, cv2, pytesseract; print('All dependencies installed successfully!')"
streamlit run app.py
The application will be available at: http://localhost:8501
- Upload Invoice:
  - Drag and drop or click to select a GST invoice image
  - Supported formats: JPG, PNG, JPEG (max 10MB)
- Configure Settings:
  - Adjust confidence threshold (0.1 - 1.0)
  - Higher values = more precise but fewer detections
  - Lower values = more detections but potentially less accurate
- Extract Information:
  - Click "Extract GST Information" button
  - Processing typically takes 2-5 seconds
- Review Results:
  - View categorized extraction results
  - Check annotated image with bounding boxes
  - Review confidence scores and quality metrics
- Export Data:
  - Download results as CSV for spreadsheet analysis
  - Download as JSON for programmatic use
  - Files include timestamps for organization
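The export step can be sketched roughly as follows. This is a minimal illustration rather than the app's actual code: the helper name `export_results`, the file-name pattern, and the flattened CSV columns are assumptions. Only the result layout (category, then field, then a list of detections, plus a `summary` entry) follows the return structure documented in the API reference.

```python
import csv
import json
from datetime import datetime
from pathlib import Path


def export_results(gst_data, out_dir, stamp=None):
    """Write results to timestamped CSV and JSON files (hypothetical helper)."""
    # Assumed naming scheme: gst_extraction_<YYYYMMDD_HHMMSS>.<ext>
    stamp = stamp or datetime.now().strftime("%Y%m%d_%H%M%S")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # JSON keeps the full nested structure for programmatic use.
    json_path = out_dir / f"gst_extraction_{stamp}.json"
    json_path.write_text(
        json.dumps(gst_data, indent=2, ensure_ascii=False), encoding="utf-8"
    )

    # CSV flattens every detection into one row for spreadsheet analysis.
    csv_path = out_dir / f"gst_extraction_{stamp}.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["category", "field", "text_content", "confidence"])
        for category, fields in gst_data.items():
            if category == "summary":  # summary holds scalars, not detections
                continue
            for field, detections in fields.items():
                for det in detections:
                    writer.writerow(
                        [category, field, det["text_content"], det["confidence"]]
                    )
    return csv_path, json_path
```

Passing an explicit `stamp` makes the output names reproducible, which is convenient in tests; in normal use the current time is used.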
from src.gst_extractor import GSTInvoiceExtractor
extractor = GSTInvoiceExtractor("models/gst_invoice_yolov9c_optimal.pt")
# Process multiple invoices
invoice_paths = ["invoice1.jpg", "invoice2.png", "invoice3.jpeg"]
results = []
for path in invoice_paths:
    result = extractor.extract_gst_information(path, confidence_threshold=0.6)
    results.append(result)
# Modify config.py for custom OCR settings
OCR_CONFIGS = {
'custom_field': '--psm 7 -c tessedit_char_whitelist=ABCDEF0123456789',
# Add your custom configurations
}
Invoiscope/
├── .streamlit/                 # Streamlit configuration
│   └── config.toml             # App settings and themes
├── assets/                     # Static assets
│   └── style.css               # Custom CSS styles
├── data/                       # Data directories
│   ├── uploads/                # Uploaded invoice images
│   ├── results/                # Processing results and exports
│   └── temp/                   # Temporary processing files
├── models/                     # Machine learning models
│   └── gst_invoice_yolov9c_optimal.pt  # YOLOv9c trained model
├── src/                        # Source code modules
│   ├── gst_extractor.py        # Main extraction logic and YOLO integration
│   ├── ocr_processor.py        # OCR processing and text cleaning
│   ├── utils.py                # Utility functions and visualization
│   └── __init__.py             # Package initialization
├── venv/                       # Virtual environment (created after setup)
├── app.py                      # Main Streamlit application
├── config.py                   # Configuration settings and constants
├── requirements.txt            # Python dependencies
├── README.md                   # This documentation
└── .gitignore                  # Git ignore rules
- app.py: Main Streamlit application with UI components and user interactions
- config.py: Central configuration file with paths, settings, and OCR configurations
- src/gst_extractor.py: Core extraction logic using YOLOv9c for field detection
- src/ocr_processor.py: OCR processing with field-specific text cleaning and validation
- src/utils.py: Utility functions for visualization, data formatting, and file operations
InvoiScope can extract 24 different types of GST invoice fields organized into three main categories:
- Invoice Number: Unique invoice identifier
- Invoice Date: Date of invoice generation
- Due Date: Payment due date
- PO Number: Purchase order reference
- Total Amount: Final payable amount
- Taxable Amount: Amount before tax calculation
- Bill Number: Alternative billing reference
- Supplier/Merchant Name: Vendor business name
- Supplier Address: Complete vendor address
- Supplier GST Number: Vendor GST registration number
- Buyer Name: Customer/buyer business name
- Buyer Address: Customer address details
- Buyer GST Number: Customer GST registration
- Contact Information: Phone numbers, emails
- Business Registration Details: Additional business identifiers
- CGST: Central Goods and Services Tax
- SGST: State Goods and Services Tax
- IGST: Integrated Goods and Services Tax
- UTGST: Union Territory GST
- Total GST: Combined GST amount
- TCS: Tax Collected at Source
- TDS: Tax Deducted at Source
- Compensation Cess: Additional cess amount
- Tax Rate: Applicable tax percentage
- High Confidence: ≥ 80% (Highly reliable)
- Medium Confidence: 60-79% (Generally reliable)
- Low Confidence: < 60% (Requires verification)
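The three confidence bands can be expressed as a small lookup. This is a sketch only: the function name `confidence_level` and the returned labels are illustrative, and scores are assumed to be fractions in the range 0 to 1 as in the extraction output.

```python
def confidence_level(confidence):
    """Map a confidence score in [0, 1] to one of the three bands."""
    if confidence >= 0.80:
        return "high"    # highly reliable
    if confidence >= 0.60:
        return "medium"  # generally reliable
    return "low"         # requires verification


for score in (0.92, 0.65, 0.41):
    print(score, confidence_level(score))
```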
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Streamlit UI   │───▶│  Image Upload &  │───▶│    YOLOv9c      │
│   (Frontend)    │    │  Preprocessing   │    │   Detection     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                        ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Results &     │◀───│ Data Processing  │◀───│    OCR Text     │
│ Visualization   │    │  & Formatting    │    │   Extraction    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
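The data flow in the diagram (detect, crop, OCR, format) can be sketched in a few lines. Everything here is an assumption made for illustration: `detect` and `read_text` are hypothetical callables standing in for the YOLOv9c detector and the Tesseract wrapper, bounding boxes are assumed to be `[x1, y1, x2, y2]` in pixels, and the image is a plain nested list so the example runs without OpenCV or NumPy.

```python
def run_pipeline(image, detect, read_text, confidence_threshold=0.5):
    """Detect fields, crop each region, OCR it, and collect structured rows.

    image: 2-D row-major pixel grid (nested lists in this sketch).
    detect: hypothetical callable returning dicts with field_type, confidence, bbox.
    read_text: hypothetical callable (crop, field_type) -> str.
    """
    rows = []
    for det in detect(image):
        if det["confidence"] < confidence_threshold:
            continue  # drop weak detections before spending time on OCR
        x1, y1, x2, y2 = det["bbox"]  # assumed pixel corner format
        crop = [row[x1:x2] for row in image[y1:y2]]
        rows.append({
            "field_type": det["field_type"],
            "text_content": read_text(crop, det["field_type"]),
            "confidence": det["confidence"],
            "bbox": det["bbox"],
        })
    return rows
```

Filtering by confidence before OCR keeps the slower text-recognition step off regions that would be discarded anyway.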
- Model: Custom-trained YOLOv9c optimized for GST invoices
- Input: Invoice images (JPG, PNG, JPEG)
- Output: Bounding boxes with field classifications and confidence scores
- Performance: 74.1% mAP@0.5, 85.4% precision
- Engine: Tesseract OCR with custom configurations
- Preprocessing: Image enhancement, scaling, noise reduction
- Field-Specific Processing: Different OCR settings for numbers, dates, text
- Post-processing: Text cleaning, validation, and formatting
- Structured Output: JSON format with categorized fields
- Confidence Scoring: Per-field confidence assessment
- Data Validation: Format checking and error handling
- Export Generation: CSV and JSON export capabilities
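As one example of the format checking mentioned above, a GST number (GSTIN) has a fixed 15-character shape: a 2-digit state code, a 10-character PAN, an entity code, the letter Z, and a check character. The sketch below tests that shape only. The names `GSTIN_PATTERN` and `looks_like_gstin` are illustrative and not taken from the project's code, and the check character is not verified.

```python
import re

# 2-digit state code, PAN (5 letters, 4 digits, 1 letter),
# entity code, literal "Z", check character.
GSTIN_PATTERN = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def looks_like_gstin(text):
    """Return True if the OCR text has the shape of a GSTIN (checksum not verified)."""
    cleaned = re.sub(r"\s+", "", text).upper()  # OCR output often contains stray spaces
    return GSTIN_PATTERN.fullmatch(cleaned) is not None


print(looks_like_gstin("27AAPFU0939F1ZV"))   # well-formed sample value
print(looks_like_gstin("27AAPFU0939F1Z"))    # one character short
```

A check like this could flag fields for manual review when OCR output does not match the expected pattern.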
Component | Technology | Version | Purpose |
---|---|---|---|
Object Detection | YOLOv9c (Ultralytics) | 8.1.34 | Field detection and localization |
OCR Engine | Tesseract (pytesseract) | 0.3.10 | Text extraction from detected regions |
Web Framework | Streamlit | 1.29.0 | User interface and web application |
Image Processing | OpenCV | 4.8.1.78 | Image preprocessing and manipulation |
Deep Learning | PyTorch | 2.1.0 | Neural network inference |
Data Processing | Pandas | 2.1.3 | Data manipulation and export |
Visualization | Matplotlib | 3.8.2 | Result visualization and annotations |
# Model Settings
MODEL_PATH = "models/gst_invoice_yolov9c_optimal.pt"
DEFAULT_CONFIDENCE = 0.5
# File Processing
MAX_FILE_SIZE = 10 * 1024 * 1024 # 10MB
ALLOWED_EXTENSIONS = ['jpg', 'jpeg', 'png']
# OCR Configurations
OCR_CONFIGS = {
    'numbers': '--psm 7 -c tessedit_char_whitelist=0123456789',
    'amounts': '--psm 7 -c tessedit_char_whitelist=0123456789₹.,- ',
    'dates': '--psm 7 -c tessedit_char_whitelist=0123456789/-.',
    'gst_number': '--psm 7 -c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
}
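# A minimal sketch of how a per-field OCR configuration might be looked up.
# The helper name `ocr_config_for` and the `--psm 6` fallback for free-form
# text are assumptions, not part of the project's config.py.

```python
OCR_CONFIGS = {
    "numbers": "--psm 7 -c tessedit_char_whitelist=0123456789",
    "dates": "--psm 7 -c tessedit_char_whitelist=0123456789/-.",
}

# Assumed fallback: treat the region as a uniform block of text.
DEFAULT_OCR_CONFIG = "--psm 6"


def ocr_config_for(field_type):
    """Return the Tesseract config string for a field type, or the fallback."""
    return OCR_CONFIGS.get(field_type, DEFAULT_OCR_CONFIG)


# With Tesseract installed, the string is passed straight to pytesseract:
#   text = pytesseract.image_to_string(region, config=ocr_config_for("dates"))
print(ocr_config_for("dates"))
print(ocr_config_for("supplier_name"))
```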
[theme]
primaryColor = "#1e88e5"
backgroundColor = "#ffffff"
secondaryBackgroundColor = "#f0f2f6"
textColor = "#262730"
[server]
maxUploadSize = 10
enableCORS = false
# Optional: Set Tesseract path if not in system PATH
export TESSERACT_CMD="/usr/local/bin/tesseract"
# Optional: Set custom model path
export GST_MODEL_PATH="/path/to/your/model.pt"
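How the application consumes these variables is not shown here, so the following is only a sketch of one plausible approach: read each variable with a fallback and apply the Tesseract path if one is set. The function name `resolve_settings` and the returned keys are hypothetical.

```python
import os

DEFAULT_MODEL_PATH = "models/gst_invoice_yolov9c_optimal.pt"


def resolve_settings(env=None):
    """Read optional overrides from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        "model_path": env.get("GST_MODEL_PATH", DEFAULT_MODEL_PATH),
        "tesseract_cmd": env.get("TESSERACT_CMD"),  # None means: rely on PATH
    }


settings = resolve_settings()
# If a custom binary is configured, pytesseract can be pointed at it:
#   if settings["tesseract_cmd"]:
#       pytesseract.pytesseract.tesseract_cmd = settings["tesseract_cmd"]
print(settings["model_path"])
```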
- Dataset: 2,500+ annotated GST invoices
- Training Time: 48 hours on NVIDIA RTX 3080
- Validation Split: 80/20 train/validation
Metric | Value | Description |
---|---|---|
mAP@0.5 | 74.1% | Mean Average Precision at IoU 0.5 |
mAP@0.5:0.95 | 52.3% | Mean Average Precision across IoU thresholds |
Precision | 85.4% | True Positive / (True Positive + False Positive) |
Recall | 67.2% | True Positive / (True Positive + False Negative) |
F1-Score | 75.2% | Harmonic mean of precision and recall |
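
The F1 figure follows directly from the precision and recall rows. As a quick arithmetic check (the function name `f1_score` is just for illustration):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


overall = f1_score(0.854, 0.672)
print(f"{overall:.1%}")  # 75.2%
```
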
- Average Processing Time: 2.8 seconds per invoice
- Memory Usage: ~1.2GB during inference
- Supported Image Sizes: 640x640 to 2048x2048 pixels
- Batch Processing: Up to 10 images simultaneously
Field Category | Precision | Recall | F1-Score |
---|---|---|---|
Invoice Metadata | 88.2% | 71.5% | 79.0% |
Business Information | 82.1% | 65.8% | 73.0% |
Tax Information | 86.7% | 64.3% | 73.8% |
from src.gst_extractor import GSTInvoiceExtractor
# Initialize extractor
extractor = GSTInvoiceExtractor(model_path="models/gst_invoice_yolov9c_optimal.pt")
# Extract information
results = extractor.extract_gst_information(
    image_path="path/to/invoice.jpg",
    confidence_threshold=0.5
)
Extracts GST information from an invoice image.
Parameters:
- image_path (str): Path to the invoice image file
- confidence_threshold (float): Minimum confidence score (0.1-1.0)
Returns:
{
    "invoice_metadata": {
        "invoice_number": [{"field_type": str, "text_content": str, "confidence": float, "bbox": list}],
        # ... other fields
    },
    "business_information": {  # ...
    },
    "tax_information": {  # ...
    },
    "summary": {
        "total_detections": int,
        "average_confidence": float,
        "extraction_timestamp": str
    }
}
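Because each field maps to a list of detections, callers usually need to pick one value per field. A minimal sketch, assuming the structure shown above (the helper name `best_detections` is hypothetical and not part of the package):

```python
def best_detections(gst_data):
    """Return {field_name: text} using the highest-confidence detection per field."""
    best = {}
    for category, fields in gst_data.items():
        if category == "summary":  # summary holds scalars, not detection lists
            continue
        for field, detections in fields.items():
            if detections:  # skip fields with no detections
                top = max(detections, key=lambda d: d["confidence"])
                best[field] = top["text_content"]
    return best


example = {
    "invoice_metadata": {
        "invoice_number": [
            {"field_type": "invoice_number", "text_content": "INV-01", "confidence": 0.62, "bbox": [0, 0, 1, 1]},
            {"field_type": "invoice_number", "text_content": "INV-001", "confidence": 0.91, "bbox": [0, 0, 1, 1]},
        ]
    },
    "summary": {"total_detections": 2, "average_confidence": 0.765, "extraction_timestamp": ""},
}
print(best_detections(example))
```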
from src.ocr_processor import OCRProcessor
# Initialize OCR processor
ocr = OCRProcessor()
# Extract text from image region
text = ocr.extract_text(image_region, field_type="amounts")
from src.utils import create_annotated_image, create_results_dataframe
# Create annotated image
annotated_img = create_annotated_image(image_path, gst_data)
# Create results DataFrame
df = create_results_dataframe(gst_data)
Error: Model file not found at: models/gst_invoice_yolov9c_optimal.pt
Solution:
- Ensure the model file is placed in the models/ directory
- Check that the file name matches exactly: gst_invoice_yolov9c_optimal.pt
- Verify file permissions and accessibility
TesseractNotFoundError: tesseract is not installed or it's not in your PATH
Solution:
- Install Tesseract OCR for your operating system
- Add Tesseract to your system PATH
- On Windows, verify installation path in environment variables
RuntimeError: CUDA out of memory
Solution:
- Reduce image size before processing
- Process images one at a time instead of batches
- Use CPU inference if GPU memory is limited:
# Force CPU usage: hide all GPUs before torch is imported
import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""
import torch
Symptoms: Extracted text is garbled or incorrect
Solutions:
- Ensure invoice images are high resolution (minimum 300 DPI)
- Check image quality - avoid blurry or low-contrast images
- Adjust confidence threshold to filter low-quality detections
- Verify proper lighting and minimal skew in scanned documents
Error: Port 8501 is already in use
Solution:
# Use different port
streamlit run app.py --server.port 8502
# Or kill existing process
# Windows
netstat -ano | findstr :8501
taskkill /PID <PID> /F
# macOS/Linux
lsof -ti:8501 | xargs kill -9
- Image Quality: Use high-resolution, well-lit images
- Confidence Tuning: Start with 0.6-0.7 for better precision
- Image Preprocessing: Ensure minimal skew and good contrast
- Image Resizing: Resize large images to 1024x1024 maximum
- Batch Processing: Process multiple invoices in sequence
- Model Optimization: Use TensorRT or ONNX for production deployment
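For the resizing tip, the target size can be computed so that the aspect ratio is preserved and small images are never enlarged. This is a sketch under those assumptions; the helper name `fit_within` is illustrative.

```python
def fit_within(width, height, max_side=1024):
    """Return (new_width, new_height) scaled down to fit max_side, keeping aspect ratio."""
    scale = min(1.0, max_side / max(width, height))  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


# With OpenCV the result can be used as the target size (width, height):
#   resized = cv2.resize(img, fit_within(w, h), interpolation=cv2.INTER_AREA)
print(fit_within(2048, 1024))  # (1024, 512)
print(fit_within(800, 600))    # (800, 600)
```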
Enable debug logging for troubleshooting:
import logging
logging.basicConfig(level=logging.DEBUG)
# Run extraction with detailed logs
results = extractor.extract_gst_information(image_path, confidence_threshold=0.5)
We welcome contributions to improve InvoiScope! Here's how you can help:
- Fork the repository
- Create a feature branch:
git checkout -b feature/your-feature-name
- Install development dependencies:
pip install -r requirements-dev.txt
- Make your changes
- Run tests:
python -m pytest tests/
- Submit a pull request
- Bug Fixes: Report and fix issues
- New Features: Add support for new invoice formats
- Documentation: Improve documentation and examples
- Testing: Add test cases and improve coverage
- UI/UX: Enhance the user interface
- Performance: Optimize processing speed and accuracy
- Follow PEP 8 guidelines
- Use type hints where appropriate
- Add docstrings for all functions and classes
- Include unit tests for new features
When reporting issues, please include:
- Python version and operating system
- Complete error messages and stack traces
- Sample invoice images (with sensitive data removed)
- Steps to reproduce the issue
This project is licensed under the MIT License.
- Commercial use allowed
- Modification allowed
- Distribution allowed
- Private use allowed
- No warranty provided
- No liability assumed
- Ultralytics: For the excellent YOLOv9 implementation
- Tesseract OCR: For robust text extraction capabilities
- Streamlit: For the intuitive web framework
- OpenCV: For comprehensive image processing tools
- Contributors: Thanks to all contributors who helped improve this project
- Email: naman2634@gmail.com
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Wiki