Driver License OCR API

A .NET API that analyzes driver's license images: it uses AI to detect the issuing state, then extracts the text fields with OCR. State detection runs on an Ollama vision model, and text extraction uses Tesseract OCR.

Features

- State detection using Ollama's LLaVA vision model
- OCR text extraction based on state-specific templates
- Support for all 50 US states
- Simple web UI for testing
- REST API endpoints for integration

Prerequisites

1. Install Ollama

Ollama is used for AI-based state detection:

- Download and install Ollama from the official website: https://ollama.com/download
- After installation, run Ollama from your applications menu or command line
- Pull the LLaVA vision model by running the following in your terminal:

ollama pull llava:7b-v1.6-mistral-q2_K

NOTE: You can substitute a different LLaVA model tag to suit your hardware.
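
To confirm the pull succeeded without reading the output of ollama list by eye, you can query Ollama's local HTTP API. The sketch below is an illustration only: it assumes Ollama is listening on its default address (http://localhost:11434) and that GET /api/tags returns a JSON object with a models list. The helper names are hypothetical and are not part of this project.

```python
import json
import urllib.request

# Assumptions: default Ollama address, and the model tag pulled above.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"
MODEL = "llava:7b-v1.6-mistral-q2_K"


def model_names(tags_payload: dict) -> list[str]:
    """Return the model tags listed in an /api/tags response body."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def is_model_available(tags_payload: dict, model: str = MODEL) -> bool:
    """True if the exact model tag appears in the payload."""
    return model in model_names(tags_payload)


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> dict:
    """Ask a running Ollama instance which models are available locally."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With Ollama running, is_model_available(fetch_tags()) should be True once the pull has finished. A connection error from fetch_tags usually means Ollama is not running.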

2. Install .NET 9 SDK

This project requires .NET 9 SDK:

Download and install from: https://dotnet.microsoft.com/download/dotnet/9.0

3. Tesseract OCR Data

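The application automatically creates a tessdata directory, but you need to download the English language data yourself.

Download the English language data file from GitHub. The command below is PowerShell and writes to a path relative to the repository root, so run it there after cloning: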

Invoke-WebRequest -Uri "https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata" -OutFile "DriverLicenseAPI/tessdata/eng.traineddata"

Installation

Clone the repository:

git clone https://github.com/peymannaderi10/OCR-API.git
cd OCR-API

Make sure the Tesseract data is installed (skip this if you already completed step 3 of Prerequisites). In PowerShell, from the repository root:

# Create the tessdata directory if it doesn't exist
New-Item -ItemType Directory -Force -Path DriverLicenseAPI/tessdata
# Download the English language data
Invoke-WebRequest -Uri "https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata" -OutFile "DriverLicenseAPI/tessdata/eng.traineddata"

Place your state XML templates in the DriverLicenseAPI/Templates/States directory. Each state should have its own XML file (e.g., michigan.xml, california.xml).

Running the Application

Navigate to the API project directory:

cd DriverLicenseAPI

Start the application:

dotnet run

Access the web UI by opening a browser and navigating to:

https://localhost:5001

Or the URL displayed in your console.

API Usage

The API provides two main endpoints:

1. Auto-detect state and extract data

Endpoint: POST /api/DriverLicense/analyze

Request: Form-data with a file field containing the driver's license image

Response:

{
  "analysis": "Successfully processed michigan driver's license.",
  "state": "michigan",
  "fields": {
    "License Number": "S123-456-789-000",
    "Expiration Date": "2025-06-30",
    "Date of Birth": "1990-01-01",
    "Full Name": "JOHN DOE",
    "Address": "123 MAIN ST, ANYTOWN, MI 12345",
    "Sex": "M"
  }
}
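
On the client side, a response like the one above can be read into a small typed structure. This is a minimal Python sketch that assumes the response always has the three top-level keys shown (analysis, state, fields); LicenseResult, parse_license_response, and missing_fields are hypothetical names, not part of the API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class LicenseResult:
    analysis: str
    state: str
    fields: dict[str, str] = field(default_factory=dict)


def parse_license_response(body: str) -> LicenseResult:
    """Parse the JSON body returned by the analyze endpoints."""
    data = json.loads(body)
    return LicenseResult(
        analysis=data.get("analysis", ""),
        state=data.get("state", ""),
        fields=dict(data.get("fields") or {}),
    )


def missing_fields(result: LicenseResult, expected: list[str]) -> list[str]:
    """Names from `expected` that are absent or blank in the OCR output."""
    return [name for name in expected if not result.fields.get(name, "").strip()]
```

Checking for blank fields is a cheap way to spot a template whose bounding boxes no longer line up with the image.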

2. Process with known state

Endpoint: POST /api/DriverLicense/analyzeWithState?state={state}

Request:

- Form-data with a file field containing the driver's license image
- Query parameter state with the name of the state (e.g., michigan)

Response: Same structure as above
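
Both endpoints can be called from any HTTP client. The sketch below builds the request with only the Python standard library. It assumes the form field is named file, the base URL is https://localhost:5001 (use whatever URL dotnet run prints), and the upload is a JPEG. The function names are illustrative.

```python
import urllib.parse
import urllib.request
import uuid
from typing import Optional

BASE_URL = "https://localhost:5001"  # assumption: replace with the URL from your console


def encode_multipart(field_name: str, filename: str, content: bytes,
                     content_type: str) -> tuple[bytes, str]:
    """Encode a single file as multipart/form-data; returns (body, Content-Type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_request(image: bytes, filename: str, state: Optional[str] = None,
                  base_url: str = BASE_URL,
                  content_type: str = "image/jpeg") -> urllib.request.Request:
    """Build a POST for /analyze, or /analyzeWithState when `state` is given."""
    if state is None:
        url = f"{base_url}/api/DriverLicense/analyze"
    else:
        query = urllib.parse.urlencode({"state": state})
        url = f"{base_url}/api/DriverLicense/analyzeWithState?{query}"
    body, header = encode_multipart("file", filename, image, content_type)
    return urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": header}
    )
```

Pass the result to urllib.request.urlopen to send it. If Python does not trust the local development HTTPS certificate, the call may fail with a certificate verification error until that certificate is trusted.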

Template Structure

The system uses XML templates for each state that define the positions of fields on the license:

<annotation>
  <object>
    <name>License Number</name>
    <bndbox>
      <xmin>180</xmin>
      <ymin>74</ymin>
      <xmax>354</xmax>
      <ymax>95</ymax>
    </bndbox>
  </object>
  <!-- More fields -->
</annotation>
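
The layout above is the Pascal VOC style XML that labelImg writes. The API's own template loader is part of the .NET project and is not reproduced here; the following Python sketch only shows one way to read such a file into field-name-to-box pairs, for example to inspect a template before using it. Box and parse_template are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Box(NamedTuple):
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def parse_template(xml_text: str) -> dict[str, Box]:
    """Map each field name in the template to its bounding box in pixels."""
    root = ET.fromstring(xml_text)
    boxes: dict[str, Box] = {}
    for obj in root.iter("object"):
        name = (obj.findtext("name") or "").strip()
        bnd = obj.find("bndbox")
        if not name or bnd is None:
            continue  # skip incomplete entries rather than guessing
        values = [bnd.findtext(tag) for tag in Box._fields]
        if any(v is None for v in values):
            continue
        boxes[name] = Box(*(int(float(v)) for v in values))
    return boxes
```

Each box could then be used to crop the license image before running OCR on that region alone.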

Labeling More License Images:

I use labelImg (from the Label Studio project) to draw bounding boxes around the fields I want to extract on each license variant. The generated XML then guides Tesseract to read only those regions and label them correctly, without picking up noise from the rest of the card.

https://github.com/HumanSignal/labelImg
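
Before copying a freshly labelled file into Templates/States, a quick sanity check can catch labelling slips. The sketch below is not part of the project. It assumes the XML follows the structure shown under Template Structure, that an optional size element records the image width and height, and that duplicate field names are unwanted.

```python
import xml.etree.ElementTree as ET


def _to_int(text):
    """Convert XML text to int, or return None if missing or non-numeric."""
    try:
        return int(float(text))
    except (TypeError, ValueError):
        return None


def validate_template(xml_text: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    root = ET.fromstring(xml_text)
    width = _to_int(root.findtext("size/width"))    # optional
    height = _to_int(root.findtext("size/height"))  # optional
    seen = set()
    for obj in root.iter("object"):
        name = (obj.findtext("name") or "").strip()
        if not name:
            problems.append("object without a name")
            continue
        if name in seen:  # assumption: each field should be labelled once
            problems.append(f"{name}: duplicate field name")
        seen.add(name)
        coords = [_to_int(obj.findtext(f"bndbox/{tag}"))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        if None in coords:
            problems.append(f"{name}: missing or non-numeric bndbox value")
            continue
        xmin, ymin, xmax, ymax = coords
        if xmin >= xmax or ymin >= ymax:
            problems.append(f"{name}: empty or inverted box")
        if min(xmin, ymin) < 0:
            problems.append(f"{name}: negative coordinate")
        if width and height and (xmax > width or ymax > height):
            problems.append(f"{name}: box extends past the image size")
    return problems
```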

Troubleshooting

Common Issues:

Ollama Connection Error
- Ensure Ollama is running
- Check if the model has been downloaded: ollama list
- Try a different model if needed

Template Not Found
- Check that the template file exists in DriverLicenseAPI/Templates/States/{state}.xml
- Ensure the filename is lowercase with no spaces
- Check logs for template path issues

OCR Not Working
- Verify the Tesseract data file is in the correct location
- Check image quality - blurry images may not OCR properly
- Adjust bounding boxes in the XML template if needed

Timeout Errors
- First-time vision model usage can take longer as the model loads
- Try using a smaller model version: llava:7b instead of larger versions
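
The file-related checks above can be scripted. The following Python sketch assumes the directory layout described in this README (DriverLicenseAPI/Templates/States/{state}.xml and DriverLicenseAPI/tessdata/eng.traineddata). The preflight helper is hypothetical and is not shipped with the project.

```python
from pathlib import Path


def template_path(repo_root: Path, state: str) -> Path:
    """Expected location of a state's template, per the layout in this README."""
    return repo_root / "DriverLicenseAPI" / "Templates" / "States" / f"{state}.xml"


def preflight(repo_root: Path, state: str) -> list[str]:
    """Return a list of problems found; an empty list means the files look fine."""
    problems = []
    if state != state.lower() or " " in state:
        problems.append(f"state name {state!r} should be lowercase with no spaces")
    tpl = template_path(repo_root, state)
    if not tpl.is_file():
        problems.append(f"template not found: {tpl}")
    data = repo_root / "DriverLicenseAPI" / "tessdata" / "eng.traineddata"
    if not data.is_file():
        problems.append(f"Tesseract data not found: {data}")
    elif data.stat().st_size == 0:
        problems.append(f"Tesseract data file is empty: {data}")
    return problems
```

For example, preflight(Path("."), "michigan") run from the repository root lists anything missing for a Michigan license.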
