Question about the Concatenate completion strategy for multi-page document #349
Replies: 1 comment 5 replies
-
Hello! Thank you for your message! So, CompletionStrategy.CONCATENATE is used when the content is too big for a single response. What it does is ask the model to continue its truncated output over several calls and then concatenate the pieces back into one valid result.
Now, you are using Claude Haiku, which is a weak model, and it is going to mess up the concat parsing. I would strongly advise you to use a stronger model or increase the output token count. Take a look at this similar example I was just working on:
You see? Maybe you can change the token_limit in this case. https://enoch3712.github.io/ExtractThinker/core-concepts/completion-strategies/ Take a look at the completion strategies; I think this one will fail most of the time because of the weaker model. Try with a bigger model, and if that doesn't work, we can get together and see whether I need to fix a bug. Hope this is helpful. I'm here now and will answer fast.
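To make the strategy concrete, here is a plain-Python sketch of a CONCATENATE-style continuation loop. This is a hypothetical illustration, not ExtractThinker's actual code: `call_llm`, the prompt wording, and the toy two-part model are all made up. It also shows where a weak model breaks things: if a continuation is not valid JSON when glued on, the loop eventually gives up with the same kind of error reported below.

```python
import json

def extract_with_concatenation(call_llm, prompt, max_rounds=5):
    """Sketch of a CONCATENATE-style loop (hypothetical helper):
    keep asking the model to continue its truncated JSON output
    and glue the pieces together until the result parses."""
    pieces = []
    for _ in range(max_rounds):
        if not pieces:
            chunk = call_llm(prompt)
        else:
            # Ask the model to continue exactly where the output was cut off.
            chunk = call_llm(prompt + "\nContinue this JSON exactly:\n" + "".join(pieces))
        pieces.append(chunk)
        combined = "".join(pieces)
        try:
            return json.loads(combined)  # valid JSON -> done
        except json.JSONDecodeError:
            continue  # still truncated; request another continuation
    raise ValueError("Maximum retries reached with invalid JSON continuations")

# Toy "model" that emits the answer in two truncated halves:
_halves = iter(['{"invoice_number": "INV-1", ', '"total": 42}'])
result = extract_with_concatenation(lambda p: next(_halves), "Extract the invoice")
print(result)  # {'invoice_number': 'INV-1', 'total': 42}
```

A weaker model tends to restart the JSON object or repeat itself instead of continuing verbatim, so the concatenated string never parses, which is why a stronger model or a larger token limit is the first thing to try.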
-
Hello, I have a question about the usage of the concatenate completion strategy. I have a multi-page document (7 pages) loaded with DocumentLoaderPyPdf() with vision enabled, resulting in a list of pages with the image data of each page. With so much image data I hit the token limits of my LLM (Claude 3 Haiku through Bedrock). For classification I can work around this by providing a single page at a time, but for extraction I want to provide the entire document and return a single extraction result based on my provided Contract. This is where, from what I understood, the Concatenate completion strategy would come in. However, when I try to extract the content like this,
it throws the following error:
ValueError: Maximum retries reached: Maximum retries reached with invalid JSON continuations
What am I doing wrong?
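For a rough sense of why seven vision pages hit the token limit so quickly: Anthropic documents an estimate of about (width × height) / 750 tokens per image for Claude vision input. A back-of-envelope check (the 1000×1400 px page size here is an assumption about how the PDF pages were rendered) might look like:

```python
def image_tokens(width_px: int, height_px: int) -> int:
    # Anthropic's documented rule of thumb for Claude vision input:
    # roughly (width * height) / 750 tokens per image.
    return round(width_px * height_px / 750)

per_page = image_tokens(1000, 1400)  # hypothetical page render size
total = 7 * per_page                 # all seven pages in one request
print(per_page, total)  # 1867 13069
```

That is input-side pressure on top of whatever output budget the model has, which is why sending all pages at once fails where page-by-page classification succeeds.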