Images within PDFs, DOCX, or PowerPoint files could be passed through a vision model! #92
Andresshamis started this conversation in Ideas
Some files contain images with relevant information that I, and many others, need to extract, not just plain text and tables. If the parser could detect which images are actually relevant, pass them through a vision LLM with a good extraction prompt, and insert each image's description at the same position the image occupied in the file relative to the surrounding content, it would be the absolute best file parser out there for LLMs!
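To make the idea concrete, here is a rough sketch of the flow I have in mind. Everything in it is hypothetical and only for illustration: `TextBlock`, `ImageBlock`, `is_relevant`, `render`, and `fake_describe` are names I made up, not part of this project's API. It assumes the parser already yields text and images in document order, uses a crude size threshold as the "is this image relevant" check, and takes the vision model as a plain callable so that any provider could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class TextBlock:
    """A run of already-extracted text (hypothetical type)."""
    text: str


@dataclass
class ImageBlock:
    """An embedded image with its pixel dimensions (hypothetical type)."""
    data: bytes
    width: int
    height: int


Block = Union[TextBlock, ImageBlock]


def is_relevant(image: ImageBlock, min_side: int = 64) -> bool:
    # Crude assumption: very small images are icons, bullets, or rules.
    # A real relevance check would likely need more than size alone.
    return min(image.width, image.height) >= min_side


def render(
    blocks: List[Block],
    describe: Callable[[bytes], str],
    min_side: int = 64,
) -> str:
    """Walk blocks in document order, replacing relevant images with text."""
    parts: List[str] = []
    for block in blocks:
        if isinstance(block, TextBlock):
            parts.append(block.text)
        elif is_relevant(block, min_side):
            # The description lands exactly where the image was.
            parts.append(f"[Image: {describe(block.data)}]")
        # Irrelevant images are dropped silently.
    return "\n\n".join(parts)


def fake_describe(data: bytes) -> str:
    # Stand-in for a vision LLM call; swap in a real client here.
    return f"placeholder description of a {len(data)}-byte image"


if __name__ == "__main__":
    document = [
        TextBlock("Quarterly results"),
        ImageBlock(data=b"x" * 10, width=800, height=600),  # chart, kept
        ImageBlock(data=b"y", width=16, height=16),         # icon, skipped
        TextBlock("See the appendix for details."),
    ]
    print(render(document, fake_describe))
```

Passing the vision model in as a callable keeps the ordering logic testable without network access, and lets the extraction prompt live entirely inside whatever function wraps the real model.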