Error splitting ALTO XML file during ocrWF #1547

@estanfor

Honeybadger alert: https://app.honeybadger.io/projects/52894/faults/123219313

The XML file looks OK at first glance. The only odd thing I see is that the ABBYY output folder contains two XML files, one named after the druid and one named after the image filename: S:\AbbyyShare\sdr-ocr-prod\OUTPUT\sh099tn7500

The SDR item sh099tn7500 is currently stuck in ocrWF at the split-ocr step. It looks like the step may have started splitting the combined XML into individual per-page OCR XML files and hit an error partway through?
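
For anyone trying to reproduce this outside the workflow, here is a minimal sketch of what splitting a combined ALTO file into per-page documents could look like. It is not the actual split-ocr code, which I have not checked. The function name `split_alto_pages`, the sample document, and the ALTO v3 namespace in the sample are all assumptions for illustration. Elements are matched by local name, so the sketch should not depend on which ALTO version ABBYY emits.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical two-page ALTO document; the namespace version is an assumption.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Description><sourceImageInformation/></Description>
  <Layout>
    <Page ID="p1"><PrintSpace/></Page>
    <Page ID="p2"><PrintSpace/></Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def split_alto_pages(combined_xml):
    """Return one ALTO document string per <Page> in a combined ALTO string.

    Raises ET.ParseError if the input is not well-formed XML, and
    ValueError if it has no <Layout> or no <Page> elements.
    """
    root = ET.fromstring(combined_xml)
    layout = next((el for el in root if local_name(el.tag) == "Layout"), None)
    if layout is None:
        raise ValueError("no <Layout> element found")
    pages = [el for el in layout if local_name(el.tag) == "Page"]
    if not pages:
        raise ValueError("no <Page> elements found")

    documents = []
    for page in pages:
        # Copy the whole document so each output keeps the shared header,
        # then leave only the current page inside its <Layout>.
        doc = copy.deepcopy(root)
        doc_layout = next(el for el in doc if local_name(el.tag) == "Layout")
        for child in list(doc_layout):
            if local_name(child.tag) == "Page":
                doc_layout.remove(child)
        doc_layout.append(copy.deepcopy(page))
        documents.append(ET.tostring(doc, encoding="unicode"))
    return documents


if __name__ == "__main__":
    for number, document in enumerate(split_alto_pages(SAMPLE), start=1):
        print(f"page {number}: {len(document)} characters")
```

Running something like this against both XML files in the output folder would at least show whether either one fails to parse (`ET.ParseError`) or has an unexpected structure (`ValueError`), which might narrow down where the split step stopped.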

Status

In Progress (Not Ordered)
