Flexible OCR Processing Modes #417

icereed · 2025-05-21T03:46:14Z

icereed
May 21, 2025
Maintainer

Release Highlights – v0.19.0

New: Flexible OCR Processing Modes

paperless-gpt now offers three distinct OCR processing modes, allowing you to optimize document processing based on your OCR provider and performance needs:

Image Mode (default)
- Converts each PDF page into an image before OCR.
- Best for: Maximum compatibility with all OCR providers.
- Configure via: OCR_PROCESS_MODE: "image"

PDF Mode
- Processes individual PDF pages directly, without image conversion.
- Best for: Preserving PDF structure and improving speed or accuracy with native PDF-compatible providers.
- Configure via: OCR_PROCESS_MODE: "pdf"

Whole PDF Mode
- Sends the entire PDF as a single document for OCR.
- Best for: Providers optimized for multi-page processing and reduced API calls.
- Configure via: OCR_PROCESS_MODE: "whole_pdf"
- Note: Large PDFs may exceed your provider's API limits. Switch to pdf mode if issues occur.
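
As a rough illustration of where this setting might live, here is a minimal sketch assuming paperless-gpt is deployed with Docker Compose. The service name below is an assumption made for the example; only the OCR_PROCESS_MODE variable and its three values come from these release notes.

```yaml
# Hypothetical docker-compose fragment (the service name is an assumption).
services:
  paperless-gpt:
    environment:
      # Accepted values per the notes above: "image" (default), "pdf", "whole_pdf".
      # "pdf" is also the suggested fallback when "whole_pdf" hits provider limits.
      OCR_PROCESS_MODE: "pdf"
```

Leaving the variable unset should behave like "image", since that mode is listed as the default.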

Enhancements & Fixes

New Feature: Native PDF processing support for OCR #406
Fix: Azure OpenAI LLM integration #398
Dependency Updates:
- Core libraries: React, TypeScript, Node.js types, ESLint
- Backend: Gin, GORM, Google API modules, ocrchestra digest updates
- Build & test tools: Go 1.24.3, Docker images, Testcontainers

Full Changelog: v0.18.0...v0.19.0

This discussion was created from the release Flexible OCR Processing Modes.
