Why Docling is SO slow when converting PDF with images #1651
Unanswered
MingLin-home asked this question in Q&A
We benchmarked the conversion speed of Docling with PDF files containing images. The benchmarking code is provided below:
All unnecessary features, such as OCR, were disabled during the tests.
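For readers who want to reproduce a comparison of this kind, here is a minimal sketch of a timing harness. It is not the benchmark code from this post. The helper names (`time_converter`, `make_docling_converter`, `pymupdf_extract`, `slowdown`) are hypothetical, and turning off table structure recognition in addition to OCR is an assumption about what "unnecessary features" covers.

```python
import time
from statistics import median
from typing import Callable, Dict, Iterable


def time_converter(convert: Callable[[str], object],
                   paths: Iterable[str],
                   repeats: int = 3) -> Dict[str, float]:
    """Return the median wall-clock seconds per file for one converter."""
    results: Dict[str, float] = {}
    for path in paths:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            convert(path)
            samples.append(time.perf_counter() - start)
        results[path] = median(samples)
    return results


def make_docling_converter() -> Callable[[str], str]:
    """Build a Docling converter with OCR switched off (hypothetical helper)."""
    # Imported lazily so the harness itself runs without Docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions()
    options.do_ocr = False
    # Assumption: table structure recognition also counts as "unnecessary".
    options.do_table_structure = False
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=options)
        }
    )
    # Reusing one converter object keeps setup cost out of the per-file loop.
    return lambda path: converter.convert(path).document.export_to_markdown()


def pymupdf_extract(path: str) -> str:
    """Plain text extraction with PyMuPDF, used as the baseline."""
    # Older PyMuPDF releases expose the same module under the name `fitz`.
    import pymupdf

    with pymupdf.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def slowdown(slow: Dict[str, float], fast: Dict[str, float]) -> Dict[str, float]:
    """Per-file ratio slow/fast for files present in both result sets."""
    return {p: slow[p] / fast[p] for p in slow if p in fast and fast[p] > 0}
```

A run would pass the same list of PDF paths to `time_converter(make_docling_converter(), paths)` and `time_converter(pymupdf_extract, paths)`, then compare the two result sets with `slowdown`. The first Docling conversion may include one-time model loading, so a warm-up call before timing is worth considering. Note also that the two tools do different work: Docling runs layout analysis on each page, while the PyMuPDF baseline here only extracts text.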
We compared the processing times (in seconds) of Docling and PyMuPDF using a set of PDF files. The results are shown in the following table:
In this test, Docling was 30 to 80 times slower than PyMuPDF.
Is there a known reason for the slowness? Are there any plans to improve the speed?
Benchmark environment:
MacBook Pro (M2 Pro, 32 GB memory), Docling 2.34.0, Python 3.11.