Releases: Unstructured-IO/unstructured-api

0.0.56

03 Nov 21:24
5e04b1c
  • Add max_characters param for chunking. This param gives users additional control to "chunk" elements into larger or smaller CompositeElements.
  • Bump unstructured to 0.10.28
  • Make sure chipperv2 is called when hi_res_model_name==chipper
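
To give a feel for what a character limit does when chunking, here is a toy sketch. It is not the library's algorithm: `chunk_texts` is a hypothetical helper that greedily merges consecutive strings and starts a new chunk whenever adding the next string would exceed `max_characters`.

```python
def chunk_texts(texts, max_characters):
    """Greedily merge consecutive texts into chunks of at most max_characters.

    Hypothetical illustration only; a single text longer than the limit
    is kept whole rather than split.
    """
    chunks = []
    current = ""
    for text in texts:
        candidate = f"{current} {text}".strip()
        if current and len(candidate) > max_characters:
            # The next text does not fit, so close the current chunk.
            chunks.append(current)
            current = text
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A smaller limit yields more, shorter chunks; a larger limit yields fewer, longer ones.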

0.0.55

26 Oct 20:32
c91d1b9
  • Bump unstructured to 0.10.26
  • Bring parent_id metadata field back after fixing a backwards compatibility bug
  • Restrict Chipper usage to one request at a time. The model is very resource-intensive, and this will prevent issues while we improve it.
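
One common way to allow only a single caller into a resource-heavy code path is a semaphore with one slot. The sketch below is an assumption about the general technique, not the project's actual code: `run_exclusive` and `_model_slot` are hypothetical names, and the release note does not say whether extra requests wait or are rejected (this version makes them wait).

```python
import threading

# A single slot: only one caller may hold it at any moment.
_model_slot = threading.Semaphore(1)


def run_exclusive(fn, *args, **kwargs):
    """Run fn while holding the single slot; other callers block until it is free."""
    with _model_slot:
        return fn(*args, **kwargs)
```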

0.0.54

23 Oct 16:58
13c1760
  • Bump unstructured to 0.10.25
  • Use a generator when splitting PDFs in parallel mode
  • Add a default memory minimum for 503 check
  • Fix an UnboundLocalError when an invalid docx file is caught
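
The point of a generator when splitting is that pieces are produced one at a time instead of being built up front. A minimal sketch of the idea, using page ranges rather than real PDF objects; `iter_page_ranges` and its parameters are hypothetical and not taken from the project:

```python
def iter_page_ranges(total_pages, pages_per_split):
    """Lazily yield (first_page, last_page) pairs, 1-indexed and inclusive."""
    for start in range(1, total_pages + 1, pages_per_split):
        yield start, min(start + pages_per_split - 1, total_pages)


# Nothing is computed until the caller asks for the next range.
ranges = iter_page_ranges(total_pages=5, pages_per_split=2)
first = next(ranges)  # (1, 2)
```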

0.0.53

16 Oct 18:31
c0b945e
  • Bump unstructured to 0.10.23
  • Simplify the error message for BadZipFile errors

0.0.52

12 Oct 21:14
1121f12
  • Bump unstructured to 0.10.21
  • Fix an unhandled error when a non-PDF file is sent with content-type pdf
  • Fix an unhandled error when a non-docx file is sent with content-type docx
  • Fix an unhandled error when a non-Unstructured JSON schema is sent

0.0.51

05 Oct 18:14
  • Bump unstructured to 0.10.19

0.0.50

03 Oct 21:35
db264d8
  • Bump unstructured to 0.10.18

0.0.49

29 Sep 15:13
2e655c6
  • Remove spurious whitespace in app-start.sh. This fixes deployments in some envs such as Google Cloud Run.

0.0.48

26 Sep 21:42
29be0e8
  • Adds languages kwarg. ocr_languages will eventually be deprecated and replaced by languages to specify what languages to use for OCR.
  • Adds a startup log and other minor cleanups
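
A caller that must support both the old and new parameter during the transition might resolve them as sketched below. This is a hypothetical client-side helper, not the API's own logic: it assumes `languages` is a list of language codes, that `ocr_languages` is a `+`-separated string such as `eng+deu`, and that `eng` is a sensible fallback.

```python
import warnings


def resolve_ocr_languages(languages=None, ocr_languages=None):
    """Prefer the newer languages list; fall back to the older ocr_languages string."""
    if languages:
        return list(languages)
    if ocr_languages:
        warnings.warn(
            "ocr_languages is expected to be deprecated; pass languages instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumed format: codes joined with "+", e.g. "eng+deu".
        return ocr_languages.split("+")
    return ["eng"]  # assumed default, for illustration only
```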

0.0.47

26 Sep 00:21
a20e01c
  • Adds chunking_strategy kwarg and associated params. These params allow users to "chunk" elements into larger or smaller CompositeElements.
  • Remove parent_id from the element metadata. New metadata fields are causing errors with existing installs. We'll re-add this once a fix is widely available.
  • Fix some PDFs incorrectly returning a "file is encrypted" error. The pypdf.is_encrypted check caused us to return this error even if the file is readable.
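
As a rough sketch of how a client might pass the chunking options, the helper below assembles form fields for a request. `build_form_fields` is hypothetical, and the value `by_title` and the parameter name `new_after_n_chars` are assumptions about the API rather than something stated in these notes; check the API documentation for the supported names.

```python
def build_form_fields(chunking_strategy=None, **chunking_params):
    """Return form fields for a partition request.

    Chunking params are only included when a strategy is set, on the
    assumption that they have no effect otherwise.
    """
    fields = {}
    if chunking_strategy is not None:
        fields["chunking_strategy"] = chunking_strategy
        # Form values travel as strings.
        fields.update({name: str(value) for name, value in chunking_params.items()})
    return fields


fields = build_form_fields("by_title", new_after_n_chars=1500)
```

The resulting dictionary could then be sent as the form data of a multipart POST alongside the uploaded file.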