klaushajdaraj/OCR

OCR Application using Gemma-3

This project uses the vision capabilities of Gemma-3 together with Streamlit to build a computer vision app that runs entirely on your local machine. It can perform OCR and extract structured text from an image.
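As a rough illustration of the local flow described above, the sketch below sends an image to a locally running Ollama server over its REST API and reads back the model's text. It is not taken from this repository: the function names `build_ocr_payload` and `run_ocr` and the prompt wording are hypothetical, and the app itself may call Ollama differently. It assumes Ollama is listening on its default port, 11434.

```python
import base64
import json
import urllib.request

# Default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ocr_payload(image_bytes: bytes, model: str = "gemma3:12b") -> dict:
    """Build the JSON body for a single, non-streaming vision request."""
    return {
        "model": model,
        # Hypothetical prompt; the app's real prompt may differ.
        "prompt": "Extract all text from this image and return it as Markdown.",
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def run_ocr(image_path: str) -> str:
    """Send the image to the local Ollama server and return the model's text."""
    with open(image_path, "rb") as f:
        payload = build_ocr_payload(f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `run_ocr("receipt.png")` (a placeholder file name) only works once Ollama is installed and the model has been downloaded, as described in the setup steps.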

Installation and setup

Create a virtual environment:

python -m venv ocr-gemma3-env

Activate it on macOS and Linux:

source ocr-gemma3-env/bin/activate

On Windows:

ocr-gemma3-env\Scripts\activate

Install dependencies: make sure the environment uses Python 3.11 or later, then run:

pip install -r requirements.txt
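If you are unsure which interpreter the virtual environment picked up, a quick check like the one below can confirm that it meets the Python 3.11 minimum before you install anything. The helper name `is_supported` is made up for this sketch and is not part of the project.

```python
import sys

# Minimum version stated in this README.
MIN_VERSION = (3, 11)


def is_supported(version=None, minimum=MIN_VERSION) -> bool:
    """Return True if the (major, minor) version is at least the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is supported.")
    else:
        print(f"Python {current} is too old; 3.11 or later is required.")
```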

Set up Ollama:

# install Ollama on Linux
curl -fsSL https://ollama.com/install.sh | sh
# download the Gemma-3 vision model
ollama pull gemma3:12b
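Before starting the app, it can help to confirm that the Ollama server is reachable and that the model is available locally. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally installed models. The helper names `fetch_tags` and `has_model` are illustrative, and the default port 11434 is assumed.

```python
import json
import urllib.request

# Endpoint that lists the models installed on a local Ollama server.
TAGS_URL = "http://localhost:11434/api/tags"


def has_model(tags: dict, name: str = "gemma3:12b") -> bool:
    """Return True if the parsed /api/tags response contains the model."""
    return any(model.get("name") == name for model in tags.get("models", []))


def fetch_tags(url: str = TAGS_URL, timeout: float = 5.0) -> dict:
    """Fetch and parse the model list from the local Ollama server."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    try:
        ready = has_model(fetch_tags())
    except OSError as error:  # covers connection errors and timeouts
        print(f"Could not reach Ollama: {error}")
    else:
        print("gemma3:12b is available." if ready else "gemma3:12b is not installed yet.")
```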

Run the Streamlit app:

streamlit run streamlit_app.py

Made with ❤️
