This repository contains scripts to run the dhchoi/manchu-llama32-11b-vision-merged
model for Manchu script OCR and text generation.
Note: This application targets macOS and supports only Apple Silicon (MPS) or CPU execution.
git clone https://github.com/dhchoi-lazy/manchu_mac.git
cd manchu_mac
# Remove any existing virtual environment for a clean setup
rm -rf .venv
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
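After installation, you can confirm that PyTorch sees the MPS backend (and will otherwise fall back to CPU). This is a quick standalone check, not a script shipped with the repository:

```python
# Quick sanity check for the compute backend (not part of the repository).
import torch

if torch.backends.mps.is_available():
    print("MPS backend available: the model can run on the Apple Silicon GPU")
else:
    print("MPS not available: the model will fall back to CPU")
print("PyTorch version:", torch.__version__)
```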
The easiest way to use the Manchu OCR model is through the web interface:
python run_streamlit.py
The web app will open automatically in your browser at http://localhost:10011.
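For reference, a launcher like run_streamlit.py typically just starts Streamlit on the configured port. The sketch below assumes the Streamlit app lives in a file named app.py, which is a guess; check the repository for the actual entry point:

```python
# Minimal sketch of a Streamlit launcher; the real run_streamlit.py may differ.
# "app.py" is an assumed filename, not confirmed by the repository.
import subprocess
import sys

subprocess.run(
    [sys.executable, "-m", "streamlit", "run", "app.py", "--server.port", "10011"],
    check=True,
)
```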
# Interactive mode
python run_manchu_model.py
# Single image OCR
python run_manchu_model.py ./samples/validation_sample_0000.jpg
# Batch directory processing
python run_manchu_model.py ./samples/
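Under the hood, the scripts presumably load the merged checkpoint with Hugging Face transformers and select MPS when available. The snippet below is a minimal sketch of that flow under those assumptions, not the repository's actual code; the prompt wording and generation settings are illustrative:

```python
# Sketch: single-image Manchu OCR with the merged checkpoint via transformers.
# The prompt text and generation parameters are illustrative assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "dhchoi/manchu-llama32-11b-vision-merged"
device = "mps" if torch.backends.mps.is_available() else "cpu"

processor = AutoProcessor.from_pretrained(model_id)
model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if device == "mps" else torch.float32,
).to(device)

image = Image.open("./samples/validation_sample_0000.jpg")
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Transcribe the Manchu text in this image."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```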
python run_manchu_eval.py
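The evaluation script presumably compares model transcriptions against ground-truth text for the sample images; the exact metric is not documented here. As one common OCR metric, character error rate (CER) can be computed from edit distance, sketched below purely for illustration:

```python
# Illustrative character error rate (CER) helper; the repository's
# run_manchu_eval.py may use a different metric or implementation.
def cer(prediction: str, reference: str) -> float:
    """Levenshtein distance between the strings, normalized by reference length."""
    prev = list(range(len(reference) + 1))
    for i, p in enumerate(prediction, start=1):
        curr = [i]
        for j, r in enumerate(reference, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (p != r)))  # substitution
        prev = curr
    return prev[-1] / max(len(reference), 1)

print(cer("manju gisun", "manju bithe"))  # 4 edits over 11 characters ≈ 0.364
```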
- Model: dhchoi/manchu-llama32-11b-vision-merged
- Base: Llama 3.2 11B Vision
- Purpose: Manchu script OCR and text generation
- Size: ~21.3 GB
- Device Support: Apple Silicon (MPS) or CPU
- OS: macOS (Apple Silicon recommended)
- Python: 3.10+
- Memory: 16GB+ RAM recommended
- Storage: 25GB+ free space for model cache
- Apple Silicon (MPS): ~30-40 seconds per image
- CPU: ~60-120 seconds per image (varies with the processor; see the timing sketch below)
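These figures will vary with hardware and image size. A simple way to measure your own per-image latency is to time a single-image run of the CLI, for example:

```python
# Rough per-image timing of the single-image CLI. This includes model load
# time, so the first run will be much slower than the steady-state figures above.
import subprocess
import time

start = time.perf_counter()
subprocess.run(
    ["python", "run_manchu_model.py", "./samples/validation_sample_0000.jpg"],
    check=True,
)
print(f"Elapsed: {time.perf_counter() - start:.1f} s")
```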