GPU-Accelerated Batching for pages of a PDF during Inference #1669

Open
@parin1995

Question

Is there a way we can improve inference latency of Docling on a GPU by creating a batch of page images as an input to the different models - EasyOCR, Layout Detection and TableFormer?

I am using a single A10 GPU for inference, and it is significantly underutilized (~15%). It would be ideal if we could batch the page images.

Looking into the Docling documentation, I tried increasing num_threads, but that seems to affect only CPU execution, not the GPU.
Digging into the code, I saw that Docling iterates over the pages in a page_batch and passes only a single page at a time to these models, like so:

def __call__(
    self, conv_res: ConversionResult, page_batch: Iterable[Page]
) -> Iterable[Page]:
    for page in page_batch:
        assert page._backend is not None
        if not page._backend.is_valid():
            yield page
        else:
            with TimeRecorder(conv_res, "layout"):
                assert page.size is not None
                page_image = page.get_image(scale=1.0)
                assert page_image is not None

                clusters = []
                for ix, pred_item in enumerate(
                    self.layout_predictor.predict(page_image)
                ):
                    label = DocItemLabel(
                        pred_item["label"]
                        .lower()
                        .replace(" ", "_")
                        .replace("-", "_")
                    )
                    ........
                    ........

It would be great if the page images could be batched so that the models make fuller use of the GPU.
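
To make the request concrete, here is a rough sketch of the pattern I have in mind. It is not Docling code: `chunked`, `predict_in_batches`, and the `predict_batch` callable are hypothetical names, and I am assuming each model could expose a call that takes a list of page images and returns one result per image, in order.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def predict_in_batches(
    images: Iterable[T],
    predict_batch: Callable[[List[T]], Sequence[R]],  # hypothetical batched model call
    batch_size: int = 8,
) -> Iterator[R]:
    """Send images to the model in groups instead of one page at a time."""
    for chunk in chunked(images, batch_size):
        results = predict_batch(chunk)
        # Guard against a model that drops or merges inputs.
        if len(results) != len(chunk):
            raise ValueError("predict_batch must return one result per input")
        yield from results


if __name__ == "__main__":
    # Stand-in for a model: records each batch size and doubles every input.
    seen_batch_sizes: List[int] = []

    def fake_predict_batch(batch: List[int]) -> List[int]:
        seen_batch_sizes.append(len(batch))
        return [x * 2 for x in batch]

    outputs = list(predict_in_batches(range(5), fake_predict_batch, batch_size=2))
    print(outputs, seen_batch_sizes)
```

Results come back in page order, so the per-page post-processing in the loop above could stay as it is; only the model call would move from one image to a list of images.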
Looking forward to hearing back!
Thank you!

Labels: question
