feat: add prompt caching support for LiteLLM (#5791) #6074
+201 −6
Related GitHub Issue
Closes: #5791
Roo Code Task Context (Optional)
No Roo Code task context for this PR
Description
This PR implements prompt caching support for LiteLLM, allowing users to benefit from reduced costs and improved response times when using models that support prompt caching (like Claude 3.7).
Key implementation details:
- Added a `litellmUsePromptCache` boolean option to the provider settings schema (a minimal schema sketch follows the design choices below)
- Reused the existing translation keys (`enablePromptCaching` and `enablePromptCachingTitle`) to maintain consistency across all supported languages

Design choices:
- Apply cache control to the system prompt and the last two user messages, following the referenced Cline implementation
- Reuse existing translation keys instead of creating new ones, so every supported language works without additional translation work
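As a concrete reference for the first point, here is a minimal sketch of the schema addition, assuming the Zod-based provider settings schema; every field except `litellmUsePromptCache` is an illustrative assumption rather than the repository's actual layout.

```ts
// Minimal sketch; only litellmUsePromptCache is the field added by this PR,
// the surrounding fields are assumed for illustration.
import { z } from "zod"

const liteLlmSettingsSchema = z.object({
	litellmBaseUrl: z.string().optional(),
	litellmApiKey: z.string().optional(),
	litellmModelId: z.string().optional(),
	// New: opt-in flag for prompt caching on models that support it.
	litellmUsePromptCache: z.boolean().optional(),
})

type LiteLlmSettings = z.infer<typeof liteLlmSettingsSchema>
```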
Test Procedure
Automated Testing:
- Added a new test suite in `src/api/providers/__tests__/lite-llm.spec.ts` that verifies cache control headers are applied when `litellmUsePromptCache` is enabled (a self-contained sketch of the test shape follows this section)

Manual Testing Steps:
Test Command:
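The repository's actual test command was not captured above. Below is a minimal, self-contained sketch of the kind of assertion the new spec makes; `applyCacheControl` is a hypothetical stand-in for the handler's internal marking logic, not Roo Code's actual API, and the vitest runner is assumed.

```ts
// Hypothetical stand-in test; applyCacheControl is not the real handler API.
// Assumed invocation: npx vitest run src/api/providers/__tests__/lite-llm.spec.ts
import { describe, expect, it } from "vitest"

type Msg = {
	role: "system" | "user" | "assistant"
	content: string
	cache_control?: { type: "ephemeral" }
}

// Mark the system prompt and the last two user messages, as this PR describes.
function applyCacheControl(messages: Msg[]): Msg[] {
	const userIdxs = messages
		.map((m, i) => (m.role === "user" ? i : -1))
		.filter((i) => i !== -1)
		.slice(-2)
	return messages.map((m, i) =>
		m.role === "system" || userIdxs.includes(i)
			? { ...m, cache_control: { type: "ephemeral" } }
			: m,
	)
}

describe("LiteLLM prompt caching", () => {
	it("marks the system prompt and the last two user messages", () => {
		const out = applyCacheControl([
			{ role: "system", content: "You are Roo." },
			{ role: "user", content: "Hi" },
			{ role: "assistant", content: "Hello!" },
			{ role: "user", content: "Summarize this file." },
		])
		expect(out[0].cache_control).toEqual({ type: "ephemeral" })
		expect(out[1].cache_control).toEqual({ type: "ephemeral" })
		expect(out[3].cache_control).toEqual({ type: "ephemeral" })
		expect(out[2].cache_control).toBeUndefined()
	})
})
```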
Pre-Submission Checklist
Screenshots / Videos
Before: The LiteLLM settings page shows only Base URL, API Key, and Model selection.
After: When a model that supports prompt caching is selected, an additional "Enable prompt caching" checkbox appears with a description.
Note: The checkbox only appears for models that have `supportsPromptCache: true` in their model info.
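As a rough illustration of that conditional rendering, here is a hypothetical sketch: the component shape, prop names, and react-i18next usage are assumptions, while the `supportsPromptCache` flag and the two translation keys come from this PR (which key supplies the label versus the description is also assumed).

```tsx
// Hypothetical sketch of the conditional checkbox; not the actual LiteLLM.tsx.
import { useTranslation } from "react-i18next"

interface PromptCacheOptionProps {
	modelInfo?: { supportsPromptCache?: boolean }
	checked: boolean
	onChange: (value: boolean) => void
}

export function PromptCacheOption({ modelInfo, checked, onChange }: PromptCacheOptionProps) {
	const { t } = useTranslation()
	// Render nothing unless the selected model declares supportsPromptCache.
	if (!modelInfo?.supportsPromptCache) return null
	return (
		<div>
			<label>
				<input
					type="checkbox"
					checked={checked}
					onChange={(e) => onChange(e.target.checked)}
				/>
				{t("settings:enablePromptCachingTitle")}
			</label>
			<p>{t("settings:enablePromptCaching")}</p>
		</div>
	)
}
```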
Documentation Updates
The feature is self-explanatory through the UI and uses existing translation keys that are already documented, so no documentation updates are needed.
Additional Notes
This implementation follows the same approach as the referenced Cline commit but adapts it to RooCode's architecture. The main difference is that we reuse existing translation keys instead of creating new ones, which ensures all languages are supported without additional translation work.
Get in Touch
@MuriloFP
Important

Adds prompt caching support for LiteLLM, including schema updates, handler modifications, UI changes, and tests.
- Adds a `litellmUsePromptCache` boolean to the provider settings schema in `provider-settings.ts`.
- Modifies `LiteLLMHandler` in `lite-llm.ts` to add cache control headers to the system message and the last two user messages when caching is enabled (a sketch follows this list).
- Tracks cache read/write tokens in `LiteLLMHandler`.
- Adds an "Enable prompt caching" checkbox in `LiteLLM.tsx`, visible only for models that support caching.
- Updates `lite-llm.spec.ts` to verify cache control headers and token tracking when caching is enabled.
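Below is a sketch of the message-marking approach the summary describes, assuming OpenAI-compatible message objects and an Anthropic-style `cache_control` hint that LiteLLM forwards to the provider; it illustrates the technique, not the actual `LiteLLMHandler` code.

```ts
// Illustrative sketch, not Roo Code's actual implementation.
type ChatMessage = {
	role: "system" | "user" | "assistant"
	content: string | Array<Record<string, unknown>>
}

function addCacheBreakpoints(messages: ChatMessage[]): ChatMessage[] {
	// Wrap a message's text in a content block carrying an ephemeral
	// cache_control hint, which LiteLLM passes through to the provider.
	const mark = (m: ChatMessage): ChatMessage => ({
		...m,
		content: [
			{
				type: "text",
				text: typeof m.content === "string" ? m.content : "",
				cache_control: { type: "ephemeral" },
			},
		],
	})

	const out = [...messages]

	// Cache the system prompt.
	const sysIdx = out.findIndex((m) => m.role === "system")
	if (sysIdx !== -1) out[sysIdx] = mark(out[sysIdx])

	// The last two user messages act as moving breakpoints, so the previous
	// turn's prefix is read from cache while the newest turn is written.
	out
		.map((m, i) => (m.role === "user" ? i : -1))
		.filter((i) => i !== -1)
		.slice(-2)
		.forEach((i) => (out[i] = mark(out[i])))

	return out
}
```

On the response side, the handler would surface cache read/write token counts from the usage object; the exact field names vary by provider behind LiteLLM and are not shown here.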
This summary was generated automatically for d460f43 and will update as commits are pushed.