
Workflow Guide post correction

Konstantin Baierer edited this page Sep 30, 2020 · 6 revisions

In this processing step, the recognized text is corrected by statistical error modelling, language modelling, and word modelling (dictionaries, morphology and orthography).

Note: Most tools benefit strongly from input which includes alternative OCR hypotheses. Currently, models for ocrd-cor-asv-ann-process are optimised for input from single OCR engines, whereas ocrd-cis-postcorrect expects input from multi-OCR alignment.

Available processors

| Processor | Parameter | Remarks | Call |
| --- | --- | --- | --- |
| `ocrd-cor-asv-ann-process` | `{"textequiv_level":"word","model_file":"/path/to/model/model.h5"}` | Models can be found here; you need to **pass your local path to the model on your hard drive** as parameter value for this processor to work! | `ocrd-cor-asv-ann-process -I OCR-D-OCR -O OCR-D-PROCESS -P textequiv_level word -P model_file /path/to/model/model.h5` |
| `ocrd-cis-postcorrect` | `{"profilerPath": "/path/to/profiler.bash","profilerConfig": str,"nOCR": int,"model": "/path/to/model/model.zip"}` | The various parameters should be specified in a JSON file. If you don't want to use a profiler, you can set the value for `"profilerConfig"` to `"ignored"`. In this case, your `profiler.bash` should look like this:<br>`#!/bin/bash`<br>`cat > /dev/null`<br>`echo '{}'`<br>You need to **pass your local path to the model on your hard drive** as parameter value for this processor to work! | `ocrd-cis-postcorrect -I OCR-D-ALIGN -O OCR-D-CORRECT -p postcorrect.json` |
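
The profiler-less setup for `ocrd-cis-postcorrect` described in the table can be sketched as follows. This is only an illustration under a few assumptions: the scratch directory is arbitrary, the value `2` for `"nOCR"` is a placeholder integer (the table only states that an integer is expected), and `/path/to/model/model.zip` must be replaced by the real location of your model.

```shell
# Work in a scratch directory so no existing files are overwritten (arbitrary choice).
workdir="$(mktemp -d)"
cd "$workdir"

# No-op profiler: discard whatever arrives on stdin and print an empty JSON object.
cat > profiler.bash <<'EOF'
#!/bin/bash
cat > /dev/null
echo '{}'
EOF
chmod +x profiler.bash

# Parameter file for ocrd-cis-postcorrect.
# ASSUMPTIONS: nOCR=2 is a placeholder; the model path is not a real file.
cat > postcorrect.json <<EOF
{
  "profilerPath": "$workdir/profiler.bash",
  "profilerConfig": "ignored",
  "nOCR": 2,
  "model": "/path/to/model/model.zip"
}
EOF

# The processor would then be called from the workspace directory as in the table:
# ocrd-cis-postcorrect -I OCR-D-ALIGN -O OCR-D-CORRECT -p "$workdir/postcorrect.json"
```

The last line is left commented out because it needs an OCR-D workspace with an `OCR-D-ALIGN` file group and a real model file.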

Notes on parameter usage

E.g.

  • which parameters do you use with what values?
  • which parameters are insufficiently documented?
  • which aspects of a processor should be parameterizable but are not?

Notes on document-specific usage

E.g. which processors worked best with what material? Feel free to post sample images here, too.
