Put Together In The Same Factory
This release fixes a major regression in 0.0.7 which broke consuming documents via streams. It also adds new features:
- Document Layout Analysis: Adds the `Docstrum` (Doc Spectrum) algorithm for page segmentation.
- Document segmentation approaches (`Docstrum` and `RecursiveXYCut`) implement the `IPageSegmenter` interface, which now returns a list of `TextBlock`s. `XYLeaf` and `XYNode` are now internal.
- `TextEdgesExtractor` is a new class which can be used to detect shared alignment in sections of text.
- Letters now have a `Color` property. This is one of the types implementing `IColor`. These are `GrayColor`, `RGBColor` and `CMYKColor`; other color spaces are not currently supported and default to `GrayColor.Black`.
- `PdfDocument` now has a `TryGetXmpMetadata(out XmpMetadata metadata)` method which will retrieve the XML XMP Metadata object from the document if one is present.
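
As a rough illustration of the last two items, the sketch below opens a document, checks for XMP metadata and counts the letters on each page by the concrete type of their `Color`. Only `Color` and `TryGetXmpMetadata` come from the notes above. The file path is hypothetical, and `GetXDocument()` on the metadata object is an assumption that may not match this release, as are the other calls (`PdfDocument.Open`, `GetPages`, `Letters`, `Number`), which follow the library's usual usage.

```csharp
using System;
using System.Linq;
using UglyToad.PdfPig;

public static class ReleaseSketch
{
    public static void Main(string[] args)
    {
        // Hypothetical input path; pass a real PDF as the first argument.
        var path = args.Length > 0 ? args[0] : "document.pdf";

        using (var document = PdfDocument.Open(path))
        {
            // XMP metadata is optional, so the Try pattern reports whether it is present.
            if (document.TryGetXmpMetadata(out var xmp))
            {
                // Assumption: the metadata object exposes its XML via GetXDocument().
                Console.WriteLine($"XMP root element: {xmp.GetXDocument().Root?.Name}");
            }
            else
            {
                Console.WriteLine("No XMP metadata present.");
            }

            foreach (var page in document.GetPages())
            {
                // Group letters by the runtime type of their Color, for example
                // GrayColor, RGBColor or CMYKColor. The null check is defensive only.
                var counts = page.Letters
                    .GroupBy(letter => letter.Color?.GetType().Name ?? "none")
                    .Select(group => $"{group.Key}: {group.Count()}");

                Console.WriteLine($"Page {page.Number}: {string.Join(", ", counts)}");
            }
        }
    }
}
```

Grouping by runtime type avoids depending on the namespace of the color classes, which is not stated in these notes.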