[draft] new prompt templates - nearly 100% completion: ollama + qwen2.5-instruct:32b (14b/7b feasible too) #327
Replies: 3 comments 2 replies
-
Hi, Best Regards
-
I'm still sticking with it - haven't added that many more documents lately, so I didn't get around to implementing any changes yet ... I have set up an automation for adding the OCR tags for new documents, however ... that's also how I got around re-doing everything again and again for testing purposes. Feel free to try and report back! I guess we might need some code changes for better results. Yesterday I've been looking into letta ... not sure if that's overkill, but I'll see.
-
Short update, since I'm currently developing an LLM-based search engine that makes use of these models too (https://github.com/thiscantbeserious/llm-search-agent).
Although it operates at a much simpler language level, this should be fine for document titles and such. I'll keep you updated with my observations, since I'm planning to implement a testing pipeline there to benchmark accuracy and reliability against my prompts.
-
From my testing I found that deepseek-r1 wasn't able to properly follow the instructions, especially in regards to parsing the date. That's why I tried a few different models and landed at qwen2.5:32b-instruct (7b and 14b work too, just with slightly more errors). This produces the most reliable results for me - including documents that weren't properly OCR'ed - given my instructions.
I ran multiple models against the full 178 documents multiple times, across different document types including handwriting.
Key take-away is that we might need a few improvements on the code layer to improve results for other models. For example, the first thing I'd add is a simple date parser that extracts the last date matching YYYY-MM-DD, to solve the issue of slightly-off responses with additional noise. Models like deepseek-r1 would also improve considerably if we were able to use multiple prompts to fine-tune results, so that's another possible improvement I see on the code layer. And most importantly, we should allow different models for each field - say you know that an instruct model is far better at following instructions like date parsing, but you want a more refined model like deepseek-r1 to work on the title.
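The date-parser idea above could be sketched like this (a minimal standalone example, not part of paperless-gpt - the function name is made up for illustration): take the model's raw output and keep only the last substring that matches YYYY-MM-DD, so trailing reasoning noise no longer breaks the field.

```python
import re

def extract_last_iso_date(text):
    """Return the last YYYY-MM-DD match in a model's output, or None.

    Taking the *last* match helps with models like deepseek-r1 that
    emit reasoning noise containing earlier, irrelevant dates.
    """
    matches = re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)
    return matches[-1] if matches else None

# Example: noisy output with an intermediate date before the final answer
print(extract_last_iso_date("The letterhead says 2023-11-02, so the date is: 2024-03-12"))
# -> 2024-03-12
```

One could extend this with a plausibility check (e.g. reject dates in the future) before writing the field back to paperless.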
That's the reasoning behind my choice and some outlook on possible PRs / improvements I might see ...
I noticed that the most important change was moving the Content to the top of the template, so that the instructions are not overwritten under any circumstances.
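That ordering looks roughly like this (a sketch only, assuming paperless-gpt's Go-template syntax - the placeholder name and instruction wording here are illustrative, not the actual templates from the repo linked below):

```
{{.Content}}

---

The document is shown above. Follow these instructions exactly:
1. Respond with a single concise title in the document's language.
2. Output nothing but the title.
```

Putting the document content first means a long or badly OCR'ed document can no longer push the instructions out of the model's effective attention at the end of the prompt.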
- paperless-gpt `TOKEN_LIMIT` set at: 3000
- Ollama default context length via `OLLAMA_CONTEXT_LENGTH`: 8096 (4048 should be fine too ...)
- Language of documents (and system): German
- Documents tested: 179 so far and counting
- Success rate: nearly 100% - quality I would rate at 70-80% of the results being OK, even on really sub-par OCRs ...
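For reference, those settings map to environment variables along these lines (a sketch assuming a docker-compose setup; only `TOKEN_LIMIT` and `OLLAMA_CONTEXT_LENGTH` are taken from the values above - the other variable names are assumptions and may differ in your paperless-gpt version):

```yaml
services:
  paperless-gpt:
    environment:
      LLM_PROVIDER: ollama          # assumed variable name
      LLM_MODEL: qwen2.5:32b-instruct
      TOKEN_LIMIT: "3000"           # value from this report
  ollama:
    environment:
      OLLAMA_CONTEXT_LENGTH: "8096" # value from this report
```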
I created and refined the prompts with the models themselves, to make sure they're in a format the models work best with - last but not least, I manually fine-tuned and re-tested them until I was somewhat satisfied ...
You can find the most recent version of the templates in my Git-Repo:
https://github.com/thiscantbeserious/paperless-gpt-prompts