This project recognizes CAPTCHA text with a Convolutional Recurrent Neural Network (CRNN): a truncated ResNet-18 visual encoder followed by a Bidirectional Gated Recurrent Unit (BiGRU) sequence model, a common approach for OCR in documents and images.


Check out the Colab notebook!
- Backbone: `resnet18(pretrained=True)`, truncated before the last 3 blocks
- Sequence model: Two-layer BiGRU with hidden size 256 and residual summation after the first BiGRU
The model is trained on a CAPTCHA dataset of labeled CAPTCHA text images. The dataset is split into training and testing subsets, and data augmentation is applied for better generalization.
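A train/test split like the one described could look like the sketch below. It assumes the images are PNG files whose file name (without extension) is the label, and it uses a hypothetical 80/20 split and a hypothetical `split_dataset` helper; none of these details are stated here, so adjust them to the actual dataset. Augmentation is left out because the specific transforms are not listed.

```python
import random
from pathlib import Path


def split_dataset(image_dir, test_fraction=0.2, seed=0):
    """Return (train, test) lists of (path, label) pairs."""
    paths = sorted(Path(image_dir).glob("*.png"))
    # Assumption: the label is the file name without its extension.
    samples = [(path, path.stem) for path in paths]
    # A seeded shuffle keeps the split reproducible across runs.
    random.Random(seed).shuffle(samples)
    n_test = int(len(samples) * test_fraction)
    return samples[n_test:], samples[:n_test]
```

Sorting the paths before shuffling keeps the split stable regardless of the order in which the file system lists the files.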
The model achieves high accuracy in recognizing CAPTCHA text, demonstrating its effectiveness for real-world CAPTCHA-solving applications.
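The accuracy metric is not defined here. One common choice for CAPTCHA recognition is exact-match accuracy, where a prediction counts only if the whole string is correct. The sketch below shows that metric under this assumption; `sequence_accuracy` is a hypothetical helper name.

```python
def sequence_accuracy(predictions, targets):
    """Fraction of predicted strings that match their target exactly."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(pred == target for pred, target in zip(predictions, targets))
    return correct / len(targets)
```

With predictions `["ab12", "x9y8"]` and targets `["ab12", "x9yB"]`, this returns `0.5`: one string matches fully, and the other is counted as wrong despite differing in a single character.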
I'm open to any new ideas. Just open a PR!
Made with ❤️ in Pittsburgh