Question about the Concatenate completion strategy for multi-page document #349
Replies: 1 comment 5 replies
-
Hello! Thank you for your message! So, CompletionStrategy.CONCATENATE is used when the content is too big for a single response. What it does is ask the model to continue its truncated output over several calls and then concatenate the pieces back into one valid result.
Now, you are using Claude Haiku, which is a weak model, and it is going to mess up the concat parsing. I would strongly advise you to use a stronger model or increase the output token count. Take a look at this similar example I was just working on:
You see? Maybe you can change the token_limit in this case. https://enoch3712.github.io/ExtractThinker/core-concepts/completion-strategies/ Take a look at the completion strategies; I think this one will fail most of the time because of the weaker model. Try with a bigger model, and if that doesn't work, we can get together and see whether I need to fix a bug. Hope this is helpful. I'm here now and will answer fast.
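To make the strategy concrete, here is a plain-Python sketch of a CONCATENATE-style continuation loop. This is a hypothetical illustration, not ExtractThinker's actual code: `call_llm`, the prompt wording, and the toy two-part model are all made up. It also shows where a weak model breaks things: if a continuation is not valid JSON when glued on, the loop eventually gives up with the same kind of error reported below.

```python
import json

def extract_with_concatenation(call_llm, prompt, max_rounds=5):
    """Sketch of a CONCATENATE-style loop (hypothetical helper):
    keep asking the model to continue its truncated JSON output
    and glue the pieces together until the result parses."""
    pieces = []
    for _ in range(max_rounds):
        if not pieces:
            chunk = call_llm(prompt)
        else:
            # Ask the model to continue exactly where the output was cut off.
            chunk = call_llm(prompt + "\nContinue this JSON exactly:\n" + "".join(pieces))
        pieces.append(chunk)
        combined = "".join(pieces)
        try:
            return json.loads(combined)  # valid JSON -> done
        except json.JSONDecodeError:
            continue  # still truncated; request another continuation
    raise ValueError("Maximum retries reached with invalid JSON continuations")

# Toy "model" that emits the answer in two truncated halves:
_halves = iter(['{"invoice_number": "INV-1", ', '"total": 42}'])
result = extract_with_concatenation(lambda p: next(_halves), "Extract the invoice")
print(result)  # {'invoice_number': 'INV-1', 'total': 42}
```

A weaker model tends to restart the JSON object or repeat itself instead of continuing verbatim, so the concatenated string never parses, which is why a stronger model or a larger token limit is the first thing to try.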
-
Hello, I have a question about the usage of the concatenate completion strategy. I have a multi-page document (7 pages) loaded with DocumentLoaderPyPdf() with vision enabled, resulting in a list of pages with the image data of each page. With so much image data I hit the token limits of my LLM (Claude 3 Haiku through Bedrock). For classification I can work around this by providing a single page at a time, but for extraction I want to provide the entire document and return a single extraction result based on my provided Contract. This is where, from what I understood, the Concatenate completion strategy would come in. However, when I try to extract the content like this,
it throws the following error:
ValueError: Maximum retries reached: Maximum retries reached with invalid JSON continuations
What am I doing wrong?
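For a rough sense of why seven vision pages hit the token limit so quickly: Anthropic documents an estimate of about (width × height) / 750 tokens per image for Claude vision input. A back-of-envelope check (the 1000×1400 px page size here is an assumption about how the PDF pages were rendered) might look like:

```python
def image_tokens(width_px: int, height_px: int) -> int:
    # Anthropic's documented rule of thumb for Claude vision input:
    # roughly (width * height) / 750 tokens per image.
    return round(width_px * height_px / 750)

per_page = image_tokens(1000, 1400)  # hypothetical page render size
total = 7 * per_page                 # all seven pages in one request
print(per_page, total)  # 1867 13069
```

That is input-side pressure on top of whatever output budget the model has, which is why sending all pages at once fails where page-by-page classification succeeds.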