Vertex AI Context Caching for Gemini 2.5 Pro #2581
11 comments · 8 replies
-
This would be immensely useful.
-
Please!
-
Great idea.
-
Yes, that would be very helpful! Without caching, the costs are just ridiculous.
-
This must be the top priority task for the Roo developers, as all the good models are pretty costly these days.
-
Gemini 2.5 Pro is great, but very expensive when you are a heavy user. Caching is very much needed.
-
Yes, this is a super high priority, as it directly impacts cost in a major way, and it would dramatically increase my usage of Roo if supported. They also just announced updates to the cache that lower the minimum to 4k tokens, making the savings even larger than they were. I know that you have to self-manage the cache with Google, and that's complicated and probably why it's not yet implemented, but if Roo nailed this it would cook.
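To illustrate the self-management burden mentioned above: with Google's explicit caching, the client has to create the cache, keep refreshing its TTL while the chat is alive, and delete it afterwards. A minimal sketch of that lifecycle, assuming the google-genai Python SDK pointed at Vertex AI (the project ID, model name, TTLs, and cache contents are all placeholders):

```python
from google import genai
from google.genai import types

# Placeholder project and location; requires Vertex AI credentials.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Create an explicit cache for a large, stable prompt prefix. Note that
# Vertex enforces a minimum token count for cached content, so real
# contents would need to be much larger than this placeholder.
cache = client.caches.create(
    model="gemini-2.5-pro",
    config=types.CreateCachedContentConfig(
        contents=["<large shared prompt prefix>"],
        ttl="300s",  # short TTL, since cache storage is billed per token-hour
    ),
)

# The client is responsible for keeping the cache alive while it is in use...
client.caches.update(
    name=cache.name,
    config=types.UpdateCachedContentConfig(ttl="300s"),
)

# ...and for deleting it afterwards so storage charges stop.
client.caches.delete(name=cache.name)
```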
-
https://openrouter.ai/docs/features/prompt-caching#google-gemini |
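For context, the linked OpenRouter docs cover both implicit caching (applied automatically for Gemini 2.5 models) and explicit caching via Anthropic-style cache_control breakpoints. A rough sketch of the explicit form as a plain chat-completions request; the API key, model slug, and contents are placeholders, and the exact breakpoint semantics should be checked against the docs above:

```python
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer <OPENROUTER_API_KEY>"},
    json={
        "model": "google/gemini-2.5-pro",
        "messages": [
            {
                "role": "system",
                "content": [
                    {
                        "type": "text",
                        "text": "<large reusable context>",
                        # Breakpoint marking the prefix OpenRouter should cache.
                        "cache_control": {"type": "ephemeral"},
                    }
                ],
            },
            {"role": "user", "content": "Answer using the cached context."},
        ],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```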
-
This was a feature announced in this week's update, but only for 2.5 preview, not for 2.5 exp.
-
Hi! I am currently participating in Google Summer of Code at DeepMind, and I will be focusing on exactly this feature over the upcoming weeks. Happy to help :)
-
This has been implemented. If you would like any adjustments to the current implementation, please submit a detailed feature proposal at https://github.com/RooCodeInc/Roo-Code/issues.
-
The Vertex AI API seems to now support context caching for Gemini 2.5 Pro: https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview
Given the cost of high-token-count chats, this seems pretty important. Would love to see it implemented in Roo-Code. Thanks to everyone who has contributed to this repo; it really is fantastic :)
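For reference, the create-and-reuse flow from that overview looks roughly like the following, assuming the google-genai Python SDK in Vertex AI mode (project, location, and contents are placeholders). The cached prefix is billed at a discounted rate whenever subsequent requests reference it:

```python
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Cache the large, reusable part of the prompt (system instructions plus
# reference material). Vertex enforces a minimum cached token count, so the
# placeholder strings below are illustrative only.
cache = client.caches.create(
    model="gemini-2.5-pro",
    config=types.CreateCachedContentConfig(
        system_instruction="You are a coding assistant for a large repo.",
        contents=["<large reference documents>"],
        ttl="3600s",  # keep the cache alive for one hour
    ),
)

# Later requests pass the cache by name instead of resending the prefix;
# only the new tokens are charged at the full input rate.
response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Summarize the cached reference documents.",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)
```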