This project implements Optical Character Recognition (OCR) for handwritten text using state-of-the-art TrOCR models. The focus is on fine-tuning pretrained models to achieve high accuracy on real-world handwriting datasets.
The objective is to develop an accurate and efficient handwriting recognition system using deep learning models trained on publicly available datasets. The model must handle diverse handwriting styles, varying noise levels, and irregular layouts.
Three high-quality datasets were used for training and evaluation:
- IAM_TrOCR Dataset (Kaggle): handwritten line images with corresponding ground-truth labels.
- IAM Handwritten Forms Dataset (Kaggle): scanned handwritten forms with printed labels for ground-truth extraction.
- Teklia IAM-Line Dataset (Hugging Face): pre-segmented line images from IAM forms, ready for training.

Preprocessing consisted of the following steps:
- Printed and handwritten regions were extracted from each form by pixel slicing.
- Pytesseract was used to extract the printed text as ground truth.
- CLAHE (Contrast Limited Adaptive Histogram Equalization) and a custom sharpening filter were applied to enhance clarity.
- Images were resized and normalized to fit the TrOCR input size (384×384).

Four model configurations were evaluated:
- TrOCR Large Handwritten: best performer, with CER 0.459 and WER 0.586.
- TrOCR Base Stage 1: CER 0.505, WER 0.600.
- Teklia TrOCR Base Stage 1: fine-tuned on Teklia IAM-Line; CER 0.551, WER 0.633.
- CLAHE + Sharpen + TrOCR Large: despite the enhancement step, this configuration had the highest error rates (CER 0.731, WER 0.930).

| Model | Dataset Used | CER | WER |
|---|---|---|---|
| TrOCR Large Handwritten | IAM_TrOCR | 0.459 | 0.586 |
| TrOCR Base Stage 1 | IAM_TrOCR | 0.505 | 0.600 |
| Teklia TrOCR Base Stage 1 | Teklia IAM-Line | 0.551 | 0.633 |
| CLAHE + Sharpen + TrOCR Large | IAM Handwritten Forms | 0.731 | 0.930 |
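
For readers unfamiliar with the two metrics, the sketch below shows a common definition: CER is the character-level edit distance divided by the reference length, and WER is the same ratio over whitespace-separated words. It is a plain-Python illustration, not the evaluation code used to produce the table; the function names are hypothetical, and corpus-level aggregation (summing edits over all samples before dividing) may give different numbers than averaging per-sample scores.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference token
                            curr[j - 1] + 1,      # insert a hypothesis token
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate for a single reference/hypothesis pair."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate for a single pair, splitting on whitespace."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)
```

For example, with the reference `"hello world"` and the prediction `"hallo world"`, one of eleven characters and one of two words is wrong, so `cer` gives 1/11 and `wer` gives 0.5.
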

Experiments were run in two GPU environments:
- 🧪 Kaggle (Tesla P100 GPU)
- ☁️ Google Colab (T4 GPU)

Training hyperparameters:
- Epochs: 10
- Learning rate: 5e-5
- Batch size: 4 or 8 (depending on model type)
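
One way to keep these settings in a single place is a small configuration object, sketched below. The class and function names are hypothetical, and the mapping of batch size to model type is an assumption (the larger model is given the smaller batch, as is typical when GPU memory is the constraint); the README only states that the batch size was 4 or 8 depending on the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    epochs: int = 10
    learning_rate: float = 5e-5
    batch_size: int = 8


def config_for(model_size: str) -> TrainConfig:
    """Return the training settings for a 'base' or 'large' model."""
    # ASSUMPTION: which model type used which batch size is not stated
    # in the README; this mapping is illustrative only.
    if model_size == "large":
        return TrainConfig(batch_size=4)
    if model_size == "base":
        return TrainConfig(batch_size=8)
    raise ValueError(f"unknown model size: {model_size!r}")
```
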

Challenges encountered:
- Access issues with the official IAM and IMGUR5K datasets
- Broken download links and GitHub script failures

Notebooks:
- TrOCR Large Handwritten on Kaggle
- Teklia IAM-Line Notebook
- Google Colab Notebook 1
- Google Colab Notebook 2

Gandluru Mohammed Yaseen
M.Tech in Artificial Intelligence & Machine Learning
Lovely Professional University
📧 gandlurumohammedyaseen@gmail.com
🔗 LinkedIn
💻 GitHub
Feel free to ⭐ the repo and explore the notebooks!