Hello @mophilly! Classification always comes first, then the split, then extraction. The flow is always this one.
Inside the split, the classification is done, then the pages are aggregated and the extraction runs.
I think you already know all of this, but everything is done inside a Process. Take a look at test_process. Tomorrow I should publish the update with Issue #237, which will make everything clearer to use.
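To make the ordering concrete, here is a minimal, library-independent sketch of that flow. It is not the project's API: `classify_page`, `split_by_classification`, `extract`, and `run_flow` are hypothetical names, the keyword classifier is a stand-in for a real model call, and it assumes classification runs page by page with consecutive pages of the same label aggregated into one group.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Callable, Dict, List


@dataclass
class PageGroup:
    """A run of consecutive pages that share one classification label."""
    label: str
    pages: List[str]


def classify_page(page_text: str) -> str:
    """Hypothetical stand-in classifier: a simple keyword match."""
    text = page_text.lower()
    if "invoice" in text:
        return "invoice"
    if "contract" in text:
        return "contract"
    return "unknown"


def split_by_classification(
    pages: List[str],
    classify: Callable[[str], str] = classify_page,
) -> List[PageGroup]:
    """Classify every page, then aggregate consecutive pages with the same label."""
    labeled = [(classify(page), page) for page in pages]
    return [
        PageGroup(label, [page for _, page in run])
        for label, run in groupby(labeled, key=lambda item: item[0])
    ]


def extract(group: PageGroup) -> Dict[str, object]:
    """Hypothetical extraction step: returns a summary instead of real fields."""
    return {"type": group.label, "page_count": len(group.pages)}


def run_flow(pages: List[str]) -> List[Dict[str, object]]:
    """Classification happens inside the split; extraction runs per group."""
    return [extract(group) for group in split_by_classification(pages)]


if __name__ == "__main__":
    sample_pages = [
        "Invoice 0001, page 1",
        "invoice 0001, page 2",
        "Contract terms and conditions",
        "Invoice 0002",
    ]
    for result in run_flow(sample_pages):
        print(result)
```

In this sketch the split step owns the classification, so the caller only sees groups of pages ready for extraction, which matches the order described above.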
In adding splitting to handle large PDF files, I hit upon a classic question: which comes first?
classification ->> split pages ->> extraction ->> validation
or
split pages ->> classification ->> extraction ->> validation
I have been crafting a test script for each task. That is good for learning, but I wonder if I am too deep in the weeds. Issue #237 is a great addition. It does make the distinction between classification, splitting, and extraction less visible.
Is there a task flow more in line with the current project capabilities?