This app automatically crops and straightens photos from photo album pages (or any scanned document!).
- Initially, I tried to vibe-code an algorithmic approach, but it wasn't reliable enough for messy photo album pages lacking a white background and clear borders.
- Then, I found NerdScan, which uses the Grounding DINO model, and adopted it as a backend for this GUI. Thanks to klimentij and IDEA-Research.
- To fix photo orientation, the app uses the Deep-OAD model.
- Streamlit-based GUI
- ML-Powered Detection: Uses Grounding DINO for arbitrary prompt-based detection
- Two-Pass Approach: After running an initial batch detection, you can refine your prompt and settings, and retry any individual pages.
- Auto-Rotation: Automatically corrects photo orientation using Deep-OAD
- Progress Tracking: Lists and tracks completed files in a CSV file
- CPU-Only: No need for specific hardware
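The CSV-based progress tracking listed above could look roughly like the following sketch. The single `filename` column and the helper names `load_completed` and `mark_completed` are assumptions made for illustration; the app's actual file layout may differ.

```python
import csv
from pathlib import Path
from typing import Set


def load_completed(tracking_file: Path) -> Set[str]:
    """Return the set of file names already recorded as processed."""
    if not tracking_file.exists():
        return set()
    with tracking_file.open(newline="") as f:
        return {row["filename"] for row in csv.DictReader(f)}


def mark_completed(tracking_file: Path, filename: str) -> None:
    """Append one processed file name, writing the header on first use."""
    is_new = not tracking_file.exists()
    with tracking_file.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["filename"])  # hypothetical column
        if is_new:
            writer.writeheader()
        writer.writerow({"filename": filename})
```

A batch run could call `load_completed` once, skip any page already in the returned set, and call `mark_completed` after each page finishes, so an interrupted batch resumes where it stopped.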
- To use with Docker, run `docker compose up` in the root directory, then open http://localhost:5000.
- To run on your machine, install uv, then run `uv run streamlit run app.py --server.port 5000`.
- After running your first batch, take note of any mistakes in the detection, and refine the prompt to better match your document (e.g. "an old photo.", "a polaroid.", "a single photograph.").
- If several photos are being detected as one, try disabling overlap removal. You'll get the full set of detections, and then you can remove any duplicates manually.
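To make the overlap-removal tip more concrete, here is a minimal sketch of what such a step commonly does. It assumes detections are `(x1, y1, x2, y2, score)` boxes and that overlap removal behaves like greedy IoU-based suppression. The function names and the default threshold of 0.5 are hypothetical and not taken from the app or from NerdScan.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def remove_overlaps(boxes: List[Box], threshold: float = 0.5) -> List[Box]:
    """Keep the highest-scoring boxes, dropping any that overlap a kept box."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        # Assumed rule: a box survives only if it does not overlap any
        # already-kept box by more than the threshold.
        if all(iou(box, k) <= threshold for k in kept):
            kept.append(box)
    return kept
```

In this sketch, a threshold of 1.0 keeps every box, which is roughly what disabling overlap removal amounts to: all detections come through and duplicates are left for manual cleanup.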
The application automatically downloads the required models (Grounding DINO for detection and Deep-OAD for orientation).
- GPU support
- Additional post-processing (crop out borders, auto-color-balance, etc.)
- Add file tracking to GUI
- Make the Docker image smaller
- Klimentiy Bulygin: NerdScan
- IDEA Research: Grounding DINO
- Subhadip Maji: Deep-OAD, with ONNX weights from Chuckame