Zim Docs OCR-to-JSON Extractor

title	emoji	colorFrom	colorTo	sdk	sdk_version	app_file	pinned	license
Zim Docs OCR-to-JSON Extractor	⚡	purple	blue	gradio	5.31.0	app.py	false	mit

Zim Docs OCR-to-JSON Extractor

Overview

Welcome to the Zim Docs OCR-to-JSON Extractor! This is a powerful and user-friendly web application built with Gradio, designed to help you upload scanned documents (PDFs) or images (PNG, JPG, etc.). It then uses a vision AI model to perform Optical Character Recognition (OCR) and extract structured information into a JSON format. This tool aims to streamline your process of digitizing and organizing data from various document types, such as driver's licenses, passports, national ID cards, invoices, receipts, and more.

Requirements

To use this application, you'll need:

Python 3.7+
Gradio
Gradio-PDF (gradio_pdf)
Requests
PyMuPDF (fitz)
An API Key from OpenRouter.ai (or any other service compatible with the OpenAI chat completions API format).
- You should set this key as an environment variable named API_KEY. The Python script uses os.getenv("API_KEY") to retrieve this key. If you're using Hugging Face Spaces, you can set this as a "Secret".

Running the Application

Live Demo: You can try out a live demo of this application at: Demo

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.gitattributes		.gitattributes
README.md		README.md
app.py		app.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Zim Docs OCR-to-JSON Extractor

Overview

Requirements

Running the Application

About

Uh oh!

Releases

Packages

Uh oh!

Languages

ronaldkanyepi/docs-ocr-2-json

Folders and files

Latest commit

History

Repository files navigation

Zim Docs OCR-to-JSON Extractor

Overview

Requirements

Running the Application

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages