Improve ABBYY error message reporting #1493

@andrewjbtw

When ocr-create hits an error, we get a generic message reported back to the workflow.

For example, there's an item in stage with this error:

ocr-create : Processing station Default Processing Station. Processing was canceled sg359db9318_0365.tif

This message must be coming from ABBYY but it is fairly cryptic.

It turns out that ABBYY actually wrote a much more helpful message to the result XML in the EXCEPTIONS folder:
/abbyy/EXCEPTIONS/jj017ry4593.xml.result.xml (you can find the message by searching for a word like "corrupted")

The following error ocurred when loading the file 'sg359db9318_0365.tif': Unable to open image file. Page 1 is corrupted. Total number of pages in the image file: 1. (The size of image S:\RS14WF\Images\SDR-OCR-STAGE\79445\j2603026_sg359db9318_0365.tif exceeds the maximum allowed size (32512 x 32512).)

I'm not sure if it's possible to programmatically identify and pull out detailed error messages from the result XML, but if we can, we should display that message instead of the more cryptic ABBYY error message.
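
As a rough starting point, here is a minimal Python sketch of one way to pull likely error text out of a result XML without knowing its schema. It walks every element and keeps any text or attribute value that mentions a keyword. The function name `find_error_messages`, the keyword list, and the sample XML structure are all assumptions made for illustration. I have not checked them against the real ABBYY result format, so the element names in the sample are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword list; it would need tuning against real result files.
KEYWORDS = ("error", "corrupted", "unable to")


def find_error_messages(xml_text, keywords=KEYWORDS):
    """Return text and attribute values in the XML that mention any keyword.

    Schema-agnostic on purpose: every element is visited, because the
    actual layout of the ABBYY result XML is not known here.
    """
    root = ET.fromstring(xml_text)
    matches = []
    for element in root.iter():
        # Check the element's own text plus all of its attribute values.
        candidates = [element.text or ""] + list(element.attrib.values())
        for value in candidates:
            value = value.strip()
            if not value or value in matches:
                continue
            if any(keyword in value.lower() for keyword in keywords):
                matches.append(value)
    return matches


# Invented structure for illustration only; the real schema may differ.
SAMPLE = """<Result>
  <Message>
    <Text>Unable to open image file. Page 1 is corrupted.</Text>
  </Message>
  <Message>
    <Text>Processing started.</Text>
  </Message>
</Result>"""

if __name__ == "__main__":
    for message in find_error_messages(SAMPLE):
        print(message)
```

A keyword scan like this can return noise, for example a short attribute value that merely contains the word "error". If the result XML turns out to have a stable element for exception details, targeting that element directly would be more reliable.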
