Open
Description
The `alto page` transform does not set `/PcGts/Page/@imageFilename` if the input had no `/alto/description/sourceImageInformation/@fileName`. It is impossible to fix that with OCR-D means (even `ocrd workspace`).
It would be very helpful if this processor had some fix-up capability for this important case (and probably others).
My suggestion would be to try to find the "correct" image file by looking up the physical pageId for the ALTO file and then, among the image-only fileGrps, taking the first (or the largest, or a parameter-configured) entry for that page.
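
To make the suggested lookup concrete, here is a rough sketch in plain Python. It deliberately does not use the OCR-D API: `FileEntry`, `image_only_groups`, `guess_image`, and the `strategy` / `preferred_grp` parameters are all hypothetical names made up for illustration, and "image-only fileGrp" is assumed to mean a group in which every entry has an `image/*` MIME type.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical stand-in for one METS file entry (not an OCR-D class)."""
    file_grp: str
    page_id: str
    mimetype: str
    path: str
    size: int = 0  # size in bytes, only used by the "largest" strategy


def image_only_groups(files: Iterable[FileEntry]) -> List[str]:
    """Return fileGrp names whose entries are all images, in first-seen order."""
    all_image = {}
    for f in files:
        is_image = f.mimetype.startswith("image/")
        all_image[f.file_grp] = all_image.get(f.file_grp, True) and is_image
    return [grp for grp, ok in all_image.items() if ok]


def guess_image(
    files: Iterable[FileEntry],
    alto_path: str,
    strategy: str = "first",
    preferred_grp: Optional[str] = None,
) -> Optional[str]:
    """Guess the image path for an ALTO file via its physical page ID."""
    files = list(files)
    # Step 1: look up the page ID of the ALTO file.
    alto = next((f for f in files if f.path == alto_path), None)
    if alto is None:
        return None
    # Step 2: restrict to image-only fileGrps (optionally one configured group).
    groups = image_only_groups(files)
    if preferred_grp is not None:
        groups = [g for g in groups if g == preferred_grp]
    candidates = [
        f for f in files if f.page_id == alto.page_id and f.file_grp in groups
    ]
    if not candidates:
        return None
    # Step 3: pick the largest candidate, or fall back to the first one.
    if strategy == "largest":
        return max(candidates, key=lambda f: f.size).path
    return candidates[0].path
```

The returned path would then be the value to write into `/PcGts/Page/@imageFilename`. Returning `None` when nothing matches keeps the fix-up optional, so the transform could still proceed without it.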