Put Together In The Same Factory

@EliotJones EliotJones released this 14 Aug 08:25

This release fixes a major regression in 0.0.7 that broke consuming documents via streams. It also adds the following new features:

  • Document Layout Analysis: Adds the Docstrum (Doc Spectrum) algorithm for page segmentation.
  • Document segmentation approaches (Docstrum and RecursiveXYCut) implement the IPageSegmenter interface which now returns a list of TextBlocks. XYLeaf and XYNode are now internal.
  • TextEdgesExtractor is a new class which can be used to detect shared alignment in sections of text.
  • Letters now have a Color property, whose value is one of the types implementing IColor: GrayColor, RGBColor or CMYKColor. Other color spaces are not currently supported and default to GrayColor.Black.
  • PdfDocument now has a TryGetXmpMetadata(out XmpMetadata metadata) method which will retrieve the XML XMP Metadata object from the document if one is present.
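
As a rough illustration of the items above, the following sketch opens a document from a stream, checks for XMP metadata and tallies letter colors per page. It is not taken from the release itself: the file path and the ReleaseNotesSketch class are hypothetical, and apart from Letter.Color and TryGetXmpMetadata the calls used (PdfDocument.Open(Stream), GetPages(), Page.Letters, Page.Number) are assumed to match the library's public API at this version.

```csharp
using System;
using System.IO;
using System.Linq;
using UglyToad.PdfPig;

public static class ReleaseNotesSketch
{
    public static void Main(string[] args)
    {
        // Hypothetical input path; pass a real PDF file as the first argument.
        var path = args.Length > 0 ? args[0] : "example.pdf";

        // Open the document from a stream, the scenario the regression fix covers.
        using (var stream = File.OpenRead(path))
        using (var document = PdfDocument.Open(stream))
        {
            // TryGetXmpMetadata returns false when the document has no XMP metadata.
            if (document.TryGetXmpMetadata(out var metadata))
            {
                Console.WriteLine("XMP metadata is present.");
            }
            else
            {
                Console.WriteLine("No XMP metadata found.");
            }

            foreach (var page in document.GetPages())
            {
                // Group letters by the concrete type of their Color
                // (GrayColor, RGBColor or CMYKColor per the notes above).
                var counts = page.Letters
                    .GroupBy(letter => letter.Color?.GetType().Name ?? "none")
                    .Select(group => $"{group.Key}: {group.Count()}");

                Console.WriteLine($"Page {page.Number}: {string.Join(", ", counts)}");
            }
        }
    }
}
```

Since unsupported color spaces fall back to GrayColor.Black, a GrayColor count in this output does not necessarily mean the source text was drawn in a gray color space.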