PLAYA-PDF v0.7.0: Vastly improved structure and marked content
PLAYA 0.7.0: 2025-08-04
- Remove long-deprecated functions
- Add and document finalize method on ContentObjects
- Make PageList work more or less like a Sequence
- Support iteration over playa.structure.ContentItem
- Greatly increase test coverage
- Greatly optimize marked content section access
- Add find and find_all methods to page.structure
- Extract CMYK images (except JPEG/JPEG2000) as TIFF
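
For readers unsure what it means for PageList to behave "more or less like a Sequence", here is a minimal sketch using only the standard library. ToyPageList and its page labels are hypothetical stand-ins made up for illustration; this is not PLAYA's implementation and it does not call PLAYA at all. It only shows what the Python Sequence protocol gives you once `__len__` and `__getitem__` are defined.

```python
from collections.abc import Sequence


class ToyPageList(Sequence):
    """Hypothetical stand-in for a page list; NOT PLAYA's actual class."""

    def __init__(self, labels):
        self._labels = list(labels)

    def __len__(self):
        return len(self._labels)

    def __getitem__(self, key):
        # Slicing returns another ToyPageList rather than a plain list.
        if isinstance(key, slice):
            return ToyPageList(self._labels[key])
        return self._labels[key]


# Made-up page labels, purely for illustration.
pages = ToyPageList(["i", "ii", "1", "2"])

first = pages[0]                   # integer indexing
last = pages[-1]                   # negative indexing
tail = list(pages[2:])             # slicing, then iteration
backwards = list(reversed(pages))  # provided by the Sequence mixin
has_two = "2" in pages             # membership test, also from the mixin
```

Iteration, `reversed()`, `in`, `index()` and `count()` all come from the `collections.abc.Sequence` mixin for free. Which of these the real PageList supports, and what its items look like, should be checked against the PLAYA documentation.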
What's Changed
- fix: correct and test rotation behaviour which was very broken by @dhdaines in #167
- fix: use decode_text almost everywhere for utf-16 by @dhdaines in #168
- Improve coverage and fix bugs by @dhdaines in #161
- refactor: image extraction part 2 by @lambdalemon in #163
- feat: extract cmyk images as tiff by @lambdalemon in #170
- Greatly accelerate and improve logical structure tasks by @dhdaines in #165
Full Changelog: v0.6.6...v0.7.0