This project is an end-to-end OCR system for recognizing CAPTCHA images, specifically those shown on bank login pages (the bundled scraper targets MB Bank). It includes tools to scrape CAPTCHA images, train a deep learning model on them, and serve that model to decode the distorted characters.
- 🚀 CNN-based CAPTCHA recognition using PyTorch
- 🛠 Custom Dataset loader with multi-label classification
- 📥 CAPTCHA scraper from MB Bank login page
- 🌐 Flask API server for real-time predictions
- 📦 Clean and modular structure for training + inference
ocr-recaptcha/
├── __pycache__/ # Python cache
├── mbbank_dataset/ # Folder for training/test CAPTCHA images
├── CONFIG.py # Configuration (image size, char set, model path)
├── MAIN.py # Main script to train or test the model
├── datasets.py # Custom PyTorch Dataset class
├── mbbank_scrapping.py # Script to scrape new CAPTCHA images
├── model.pth # Trained CNN model
├── requirements.txt # Python dependencies
├── server.py # Flask server for API-based prediction
├── temp.png # Temporary file used for API testing
git clone https://github.com/co-dev0909/ocr-recaptcha.git
cd ocr-recaptcha
pip install -r requirements.txt
Run the scraping script to download new CAPTCHA images:
python mbbank_scrapping.py
Images are saved into mbbank_dataset/ (you can change this inside the script).
Ensure each image filename is the correct CAPTCHA label (e.g., AB12cd.jpg), then start training:
python MAIN.py
This will train the CNN and save the weights to model.pth.
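Because training relies on filenames as labels, it can help to check them before starting a run. The following is a minimal sketch, not part of the project: it assumes the character set and length listed in this README, and the helper names (`label_from_filename`, `label_problems`, `find_bad_files`) are hypothetical.

```python
from pathlib import Path

# Character set and length as listed in this README (from CONFIG.py).
CHARSET = "123456789ABDEFGHJMNRTYabdefghjmnqrty"
CAPTCHA_LENGTH = 6


def label_from_filename(path):
    """Return the CAPTCHA label encoded in a filename such as '4Hd6Rf.jpg'."""
    return Path(path).stem


def label_problems(label):
    """List the reasons a label cannot be used for training (empty if fine)."""
    problems = []
    if len(label) != CAPTCHA_LENGTH:
        problems.append(f"expected {CAPTCHA_LENGTH} characters, got {len(label)}")
    unknown = sorted(set(label) - set(CHARSET))
    if unknown:
        problems.append(f"characters not in CHARSET: {''.join(unknown)}")
    return problems


def find_bad_files(folder):
    """Map each badly named .jpg in `folder` to its list of problems."""
    bad = {}
    for path in sorted(Path(folder).glob("*.jpg")):
        problems = label_problems(label_from_filename(path))
        if problems:
            bad[path.name] = problems
    return bad
```

Calling `find_bad_files("mbbank_dataset")` before training would list any files whose names cannot be turned into a label. Under the character set listed in this README, a name such as AB12cd.jpg would be flagged, since `c` is not among the supported characters.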
To test the model on all CAPTCHA images in a folder:
# Inside MAIN.py, you can call:
CrackLettesInt4().check_all('mbbank_dataset/test')
This will:
- Predict each CAPTCHA
- Print the result
- Move it to ./test/<prediction>.jpg
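The three steps above can be sketched as a plain loop. This is an illustration only, not the project's `check_all` implementation: `predict` is a hypothetical callable standing in for the model, and the numeric suffix for duplicate predictions is an assumption (the real script may simply overwrite).

```python
import shutil
from pathlib import Path


def check_all_sketch(folder, predict, out_dir="test"):
    """Predict every .jpg in `folder`, print the result, and move the file.

    `predict` is a hypothetical callable that takes a file path and returns
    the predicted CAPTCHA string; it stands in for the real model call.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    results = {}
    for image_path in sorted(Path(folder).glob("*.jpg")):
        prediction = predict(image_path)
        print(f"{image_path.name} -> {prediction}")
        # Two images with the same prediction would collide on
        # <prediction>.jpg, so later ones get a numeric suffix.
        target = out / f"{prediction}.jpg"
        counter = 1
        while target.exists():
            target = out / f"{prediction}_{counter}.jpg"
            counter += 1
        shutil.move(str(image_path), str(target))
        results[image_path.name] = target.name
    return results
```

The returned dictionary maps each original filename to its new name, which makes it easy to compare predictions against known labels afterwards.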
Start a REST API server for inference:
python server.py
Send a file:
curl -X POST -F "image=@mbbank_dataset/sample.jpg" http://localhost:8001/predict
Send base64:
POST /predict
{
"image_base64": "data:image/png;base64,iVBORw0KGgoAAAANS..."
}
Response:
{
"prediction": ["4Hd6Rf"]
}
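For callers working from Python rather than curl, the base64 request might be built as follows. This is a sketch using only the standard library. It assumes the server accepts a JSON body with `Content-Type: application/json`, which this README does not state explicitly. The helper names are hypothetical, and `strip_data_url` only illustrates how a receiver could undo the encoding.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes, mime="image/png"):
    """Wrap raw image bytes in the JSON body shown above."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"image_base64": f"data:{mime};base64,{encoded}"}


def strip_data_url(value):
    """Return the raw bytes from a data URL or a bare base64 string."""
    # Everything after the last comma is the base64 payload; a bare
    # base64 string has no comma and is used as-is.
    _, _, b64 = value.rpartition(",")
    return base64.b64decode(b64)


def predict_remote(image_bytes, url="http://localhost:8001/predict"):
    """POST the payload and return the 'prediction' field of the response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(image_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["prediction"]
```

With the server running, `predict_remote(open("mbbank_dataset/sample.jpg", "rb").read())` should return the same list that the curl example prints.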
Supported characters (from CONFIG.py):
123456789ABDEFGHJMNRTYabdefghjmnqrty
- CAPTCHA Length: 6
- Image Size: 200x35
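With 36 supported characters and 6 positions, one common way to frame fixed-length CAPTCHA recognition is as 6 independent 36-way choices laid out in a single vector of 216 scores. The sketch below shows that encoding and its inverse in plain Python. Whether the project uses exactly this layout is an assumption, and `encode` and `decode` are hypothetical names.

```python
CHARSET = "123456789ABDEFGHJMNRTYabdefghjmnqrty"  # 36 symbols, from this README
CAPTCHA_LENGTH = 6


def encode(label):
    """Turn a label into a flat one-hot list of CAPTCHA_LENGTH * len(CHARSET)."""
    if len(label) != CAPTCHA_LENGTH:
        raise ValueError(f"label must have {CAPTCHA_LENGTH} characters")
    vector = [0.0] * (CAPTCHA_LENGTH * len(CHARSET))
    for position, char in enumerate(label):
        # CHARSET.index raises ValueError for unsupported characters
        vector[position * len(CHARSET) + CHARSET.index(char)] = 1.0
    return vector


def decode(scores):
    """Pick the best-scoring character in each of the CAPTCHA_LENGTH slots."""
    n = len(CHARSET)
    chars = []
    for position in range(CAPTCHA_LENGTH):
        slot = scores[position * n:(position + 1) * n]
        chars.append(CHARSET[max(range(n), key=slot.__getitem__)])
    return "".join(chars)
```

In this layout a model would emit 216 scores per image, and `decode` would take the argmax of each 36-wide slice to recover the six characters.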
torch
torchvision
flask
Pillow
numpy
shortuuid
Install via:
pip install -r requirements.txt
- Customize CAPTCHA source URLs in mbbank_scrapping.py.
- Update image paths or save logic in MAIN.py and datasets.py if needed.
- To extend: add beam search, augmentations, or use a CRNN with CTC for harder CAPTCHAs.
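To give a sense of what the CTC extension would involve at inference time, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is not part of this project. It assumes index 0 is the blank symbol and that each frame carries one score per character plus the blank.

```python
CHARSET = "123456789ABDEFGHJMNRTYabdefghjmnqrty"
BLANK = 0  # assumption: index 0 is the CTC blank, characters start at 1


def ctc_greedy_decode(frame_scores):
    """Collapse per-frame scores into a string using best-path decoding.

    `frame_scores` is a list of frames; each frame holds len(CHARSET) + 1
    scores, with the blank score first.
    """
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    chars = []
    previous = BLANK
    for index in best_path:
        # Keep a symbol only if it is not blank and differs from the
        # symbol chosen in the previous frame.
        if index != BLANK and index != previous:
            chars.append(CHARSET[index - 1])
        previous = index
    return "".join(chars)
```

Unlike the fixed 6-slot output, this decoding allows variable-length results, and a blank between two identical symbols is what lets repeated characters survive the collapse.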
Maintained by co-dev0909.
This project is open-source and available under the MIT License.