Input truncation is automatic on Ollama; you need to add a single flag to fix it #1403
-
Just a quick update: on a hunch I tried the "OpenAI compatible" provider, but it's the same story...
-
Can you please point me to a specific model so I can download it and try to repro it? Also, can you provide a screenshot of your model config settings and a screenshot of the chat session where it fails? EDIT: Sorry, I had a swear word in there I did not mean to include. I should read what I type! Sorry @devlux76
-
I know what you mean @devlux76, and I think it would help a lot with the Ollama integration, since you wouldn't have to manually create a new model every time (i.e. remove step 3 from https://docs.roocode.com/providers/ollama#setting-up-ollama). Let me convert this to a feature request and we can keep discussing.
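For reference, the manual step being removed looks roughly like this (a sketch based on Ollama's Modelfile docs; the model names are placeholders):

```
# Modelfile: bake a larger context window into a derived model
FROM llama3.1
PARAMETER num_ctx 32768
```

followed by `ollama create llama3.1-32k -f Modelfile`. This is exactly the per-model boilerplate the feature request would make unnecessary, since the client could set `num_ctx` per request instead.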
-
Which version of the app are you using?
latest head
Which API Provider are you using?
Ollama
Which Model are you using?
All of them
What happened?
Thanks for an excellent product. I love that it integrates with so many providers, but the Ollama integration is broken due to an incredibly short-sighted design decision by the Ollama team that I don't think you're aware of.
Ollama by default truncates input to 2048 tokens, which is really quite tiny.
To override this, all you need to do is set the context limit (`num_ctx`) in the request options to something reasonable; there is no reason not to use the model's full context length for this.
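A minimal sketch of what that looks like against Ollama's chat endpoint (this is not Roo Code's actual implementation; the model name is a placeholder, and the port is Ollama's default). The `options.num_ctx` field itself is documented in the Ollama FAQ linked below:

```typescript
// Raise the context window per request via options.num_ctx,
// overriding Ollama's 2048-token default for this call only.
async function chatWithLargerContext(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.1", // placeholder model name
      messages: [{ role: "user", content: prompt }],
      stream: false, // return a single JSON object instead of a stream
      options: { num_ctx: 8192 }, // override the 2048-token default
    }),
  });
  const data = await res.json();
  return data.message.content;
}
```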
This can of course cause resource issues, so best practice is to measure what you currently have, add the size of the expected generation, and cap the total at the model's max context size. That way you don't waste a bunch of resources when you only need a much smaller context at the moment.
Source: https://github.com/ollama/ollama/blob/main/docs/faq.md
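That sizing rule is simple enough to sketch (a hypothetical helper, not anything from Roo Code or Ollama; the token counts are assumed to come from whatever tokenizer your setup provides):

```typescript
// Ask for just enough room for the prompt plus the expected reply,
// but never more than the model actually supports.
function pickNumCtx(
  promptTokens: number,
  expectedGenerationTokens: number,
  modelMaxContext: number,
): number {
  return Math.min(promptTokens + expectedGenerationTokens, modelMaxContext);
}

// Example: a 3000-token prompt expecting ~1000 tokens back on a 32k model
// yields num_ctx = 4000 rather than a wasteful 32768.
```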
Steps to reproduce
Relevant API REQUEST output
Additional context
No response