Replies: 1 comment
That would be really cool! I do have a private eval set that I use, but I'm reluctant to release it publicly because it contains a lot of personal info that I don't have time to redact. Please let me know if you see any qualitative improvements. I should probably move to recommending Qwen3VL-4B as the base. For anyone who's curious, I've been investigating why Qwen2.5VL performs so poorly on LMStudio and Ollama. The base model is very powerful and accurate, but it hallucinates a lot when run with LMStudio/Ollama.
Is there an eval set of inputs and expected outputs that can be used to evaluate how good or bad a given model is?
I'm running local models and I'm using qwen3-vl-8b, which was released recently. I'd love to see how it compares with the recommended qwen 2.5, or with gemini.
If there were an eval suite available, it could be run whenever a new local vision model is released to decide whether to recommend it.
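To make the idea concrete, here is a rough sketch of the kind of harness I have in mind. Everything in it is an assumption for illustration: the file names, the expected strings, the `run_eval` and `normalize` helpers, and the stand-in `fake_model` are all hypothetical, and a real suite would call the actual model backend and probably use a looser scoring rule than normalized exact match.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical eval cases: (input image path, expected output text).
# A real set would point at actual images with hand-checked answers.
CASES: Sequence[Tuple[str, str]] = [
    ("receipt_01.png", "total: 12.50"),
    ("receipt_02.png", "total: 8.00"),
]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count as misses."""
    return " ".join(text.lower().split())


def run_eval(predict: Callable[[str], str],
             cases: Sequence[Tuple[str, str]] = CASES) -> float:
    """Return the fraction of cases where the prediction matches the expected text."""
    if not cases:
        return 0.0
    hits = sum(normalize(predict(inp)) == normalize(exp) for inp, exp in cases)
    return hits / len(cases)


def fake_model(path: str) -> str:
    """Stand-in for a real model call; answers only the first case."""
    return {"receipt_01.png": "Total:  12.50"}.get(path, "")


if __name__ == "__main__":
    print(f"accuracy: {run_eval(fake_model):.2f}")
```

Swapping `fake_model` for a function that sends the image to a given local model would let the same cases be rerun for each new release and the scores compared side by side.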