pydantic validation error of JobResult calling LlamaParse.parser.parse() #688

brianb08 · 2025-04-18T21:48:38Z

Describe the bug
Today I started getting pydantic validation errors for some calls to LlamaParse.parser.parse().

...
pydantic_core._pydantic_core.ValidationError: 2 validation errors for JobResult
pages.1.images.0.original_width
  Field required [type=missing, input_value={'name': 'img_p1_1.png', ...9, 'y': 684.69894064375}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/missing
pages.1.images.0.original_height
  Field required [type=missing, input_value={'name': 'img_p1_1.png', ...9, 'y': 684.69894064375}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/missing
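To narrow down which pages trigger this, it may help to inspect the raw JSON result before it is validated into JobResult. The following is a minimal sketch, assuming the raw result is a dict shaped like the error paths above (a "pages" list whose items each carry an "images" list). The helper name find_incomplete_images and the sample payload are hypothetical and not part of llama-parse.

```python
from typing import Any

# Keys the traceback above reports as missing on image entries.
REQUIRED_IMAGE_KEYS = ("original_width", "original_height")


def find_incomplete_images(result: dict[str, Any]) -> list[str]:
    """Return pydantic-style paths for image entries that lack a required key.

    Assumes `result` looks like {"pages": [{"images": [{...}, ...]}, ...]}.
    """
    problems: list[str] = []
    for page_idx, page in enumerate(result.get("pages", [])):
        for img_idx, image in enumerate(page.get("images", [])):
            for key in REQUIRED_IMAGE_KEYS:
                if key not in image:
                    problems.append(f"pages.{page_idx}.images.{img_idx}.{key}")
    return problems


# Hypothetical payload that mimics the failing shape; values are illustrative.
sample = {
    "pages": [
        {"images": []},
        {"images": [{"name": "img_p1_1.png", "x": 0.0, "y": 684.69894064375}]},
    ]
}
print(find_incomplete_images(sample))
```

Note that the index in a path such as pages.1 is a zero-based list position, so it points at the second entry of the pages list.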

Files
Some example PDFs can be provided if necessary.

Job ID
Two example Job IDs showing this error:
ddab5846-6a36-48d3-916b-bacd47b96e42
8e1bb36d-385d-4023-904f-f3572b424e03

Client:
Using Python llama-parse 0.6.12

Additional context
Excerpts of client code:

...
        self.parser = LlamaParse(
            result_type="markdown",
            verbose=True,
            api_key=settings.LLAMA_CLOUD_API_KEY,
            show_progress=False,
            parsing_instruction=parsing_instruction,
            language="en",
            take_screenshot=True,
            auto_mode=True,
        )
...
        job_result: JobResult = self.parser.parse(
            pdf_file,
            { "file_name": db_document.name })
...


ngallo1 · 2025-04-23T16:50:18Z

I am getting the same validation errors, also for the original_width and original_height fields on images. Have you tried setting target_pages in your parse options to skip the problematic page (page 1 in your case) to see how the rest of the document parses? When I did this, it failed to parse even the rest of the document: the resulting markdown just said NO CONTENT HERE for most of the pages. This suggests there is something about the format of the specific PDF that the parser doesn't like, beyond the pydantic validation errors. Not sure whether there is a good workaround for this.
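For anyone who wants to repeat the target_pages experiment, here is a minimal sketch of building the option value. It assumes target_pages takes a comma-separated string of zero-based page numbers, and that the total page count is known beforehand. The helper pages_excluding is hypothetical and not part of llama-parse.

```python
def pages_excluding(total_pages: int, skip: set[int]) -> str:
    """Build a comma-separated list of zero-based page numbers, omitting `skip`."""
    return ",".join(str(p) for p in range(total_pages) if p not in skip)


# Skip index 1, the page named in the pages.1.images.0 error path.
target_pages = pages_excluding(5, {1})
print(target_pages)  # 0,2,3,4
```

The resulting string would then be passed as target_pages=... when constructing LlamaParse. As described above, skipping the page avoided nothing useful in my case, so this is a diagnostic step only.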

brianb08 added the bug label Apr 18, 2025
