feat: add prompt caching support for LiteLLM (#5791) #6074


Open
wants to merge 1 commit into base: main

Conversation

MuriloFP
Collaborator

@MuriloFP commented Jul 22, 2025

Related GitHub Issue

Closes: #5791

Roo Code Task Context (Optional)

No Roo Code task context for this PR

Description

This PR implements prompt caching support for LiteLLM, allowing users to benefit from reduced costs and improved response times when using models that support prompt caching (like Claude 3.7).

Key implementation details:

  • Added litellmUsePromptCache boolean option to provider settings schema
  • Modified the LiteLLM handler to apply cache control markers (cache_control) to the system prompt and the last two user messages when the feature is enabled (a rough sketch follows this list)
  • Enhanced usage tracking to properly capture cache read/write tokens from LiteLLM responses, including alternative field names
  • Added UI checkbox that only appears when the selected model supports prompt caching
  • Reused existing translation keys (enablePromptCaching and enablePromptCachingTitle) to maintain consistency across all supported languages
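
As a rough illustration of the cache-control placement described above (a minimal sketch, not the actual handler code in lite-llm.ts; the message shape and helper name are assumptions):

type CacheControl = { type: "ephemeral" }

interface ChatMessage {
	role: "system" | "user" | "assistant"
	content: string
	cache_control?: CacheControl
}

// Hypothetical helper: mark the system prompt and the last two user messages
// as cacheable when the user has enabled litellmUsePromptCache.
function applyPromptCaching(messages: ChatMessage[], usePromptCache: boolean): ChatMessage[] {
	if (!usePromptCache) return messages

	const result = messages.map((m) => ({ ...m }))

	// Cache the system prompt.
	const system = result.find((m) => m.role === "system")
	if (system) system.cache_control = { type: "ephemeral" }

	// Cache the last two user messages so the shared conversation prefix is reused.
	const userIndexes = result
		.map((m, i) => (m.role === "user" ? i : -1))
		.filter((i) => i !== -1)
		.slice(-2)
	for (const i of userIndexes) {
		result[i].cache_control = { type: "ephemeral" }
	}

	return result
}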

Design choices:

  • Followed the same caching pattern as the Anthropic provider (caching system prompt + last 2 user messages)
  • Made the feature opt-in via a checkbox so users control when caching is used (a sketch of the conditional checkbox follows this list)
  • Kept the test focused on the critical functionality, in line with project patterns
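
A rough sketch of that conditional, opt-in checkbox (component and prop names other than litellmUsePromptCache and supportsPromptCache are illustrative, not copied from LiteLLM.tsx):

import { VSCodeCheckbox } from "@vscode/webview-ui-toolkit/react"

interface PromptCachingCheckboxProps {
	// Derived from the selected model's supportsPromptCache flag.
	modelSupportsPromptCache: boolean
	litellmUsePromptCache?: boolean
	onChange: (enabled: boolean) => void
}

// Render the opt-in control only when the selected model supports prompt caching.
export function PromptCachingCheckbox({
	modelSupportsPromptCache,
	litellmUsePromptCache,
	onChange,
}: PromptCachingCheckboxProps) {
	if (!modelSupportsPromptCache) {
		return null
	}

	return (
		<VSCodeCheckbox
			checked={!!litellmUsePromptCache}
			onChange={(e: any) => onChange(!!e.target?.checked)}>
			Enable prompt caching
		</VSCodeCheckbox>
	)
}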

Test Procedure

Automated Testing:

  • Added a unit test in src/api/providers/__tests__/lite-llm.spec.ts (a simplified sketch follows this list) that verifies:
    • Cache control headers are properly added when litellmUsePromptCache is enabled
    • Cache tokens are correctly tracked in usage data
    • The feature respects the model's prompt caching support capability
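
A simplified vitest-style sketch of that check, reusing the hypothetical applyPromptCaching helper sketched in the description above rather than the real handler and mocks:

import { describe, expect, it } from "vitest"

// applyPromptCaching here refers to the illustrative helper from the sketch above.
describe("LiteLLM prompt caching (sketch)", () => {
	it("marks the system prompt and the last two user messages when enabled", () => {
		const messages = [
			{ role: "system" as const, content: "You are Roo." },
			{ role: "user" as const, content: "first question" },
			{ role: "assistant" as const, content: "first answer" },
			{ role: "user" as const, content: "second question" },
			{ role: "user" as const, content: "follow-up" },
		]

		const result = applyPromptCaching(messages, true)

		// System prompt and the last two user messages carry the ephemeral marker...
		expect(result[0]).toMatchObject({ cache_control: { type: "ephemeral" } })
		expect(result[3]).toMatchObject({ cache_control: { type: "ephemeral" } })
		expect(result[4]).toMatchObject({ cache_control: { type: "ephemeral" } })

		// ...while earlier user messages are left untouched.
		expect(result[1].cache_control).toBeUndefined()
	})
})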

Manual Testing Steps:

  1. Configure LiteLLM as your provider with a model that supports prompt caching (e.g., Claude 3.7)
  2. Navigate to Settings > Providers > LiteLLM
  3. Verify the "Enable prompt caching" checkbox appears
  4. Enable the checkbox and save settings
  5. Start a conversation and monitor the LiteLLM logs/dashboard
  6. Verify that cache hits/misses are being recorded
  7. Check that the usage tracking in Roo Code shows cache read/write tokens

Test Command:

cd src && npx vitest run api/providers/__tests__/lite-llm.spec.ts

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

Before: The LiteLLM settings page shows only Base URL, API Key, and Model selection.

After: When a model that supports prompt caching is selected, an additional "Enable prompt caching" checkbox appears with a description.

Note: The checkbox only appears for models that have supportsPromptCache: true in their model info.

Documentation Updates

  • No documentation updates are required.

The feature is self-explanatory through the UI, using existing translation keys that are already documented.

Additional Notes

This implementation follows the same approach as the referenced Cline commit but adapts it to Roo Code's architecture. The main difference is that we reuse existing translation keys instead of creating new ones, which ensures all languages are supported without additional translation work.

Get in Touch

@MuriloFP


Important

Adds prompt caching support for LiteLLM, including schema updates, handler modifications, UI changes, and tests.

  • Behavior:
    • Adds a litellmUsePromptCache boolean to the provider settings schema in provider-settings.ts (a minimal schema sketch follows this list).
    • Modifies LiteLLMHandler in lite-llm.ts to add cache control markers to the system prompt and the last two user messages when caching is enabled.
    • Tracks cache read/write tokens in LiteLLMHandler.
  • UI:
    • Adds a checkbox for enabling prompt caching in LiteLLM.tsx, visible only for models supporting caching.
  • Testing:
    • Adds unit test in lite-llm.spec.ts to verify cache control headers and token tracking when caching is enabled.
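
A minimal sketch of the schema addition, assuming a zod-based settings schema (the surrounding fields are illustrative; only litellmUsePromptCache comes from this PR):

import { z } from "zod"

// Illustrative fragment only: the real provider-settings.ts schema is much larger.
const liteLLMSettingsSchema = z.object({
	litellmBaseUrl: z.string().optional(),
	litellmApiKey: z.string().optional(),
	litellmModelId: z.string().optional(),
	// New opt-in flag: add cache control markers when the selected model supports prompt caching.
	litellmUsePromptCache: z.boolean().optional(),
})

type LiteLLMSettings = z.infer<typeof liteLLMSettingsSchema>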

This description was created by Ellipsis for d460f43.

- Add litellmUsePromptCache configuration option to provider settings
- Implement cache control headers in LiteLLM handler when enabled
- Add UI checkbox for enabling prompt caching (only shown for supported models)
- Track cache read/write tokens in usage data (a usage-normalization sketch follows below)
- Add comprehensive test for prompt caching functionality
- Reuse existing translation keys for consistency across languages

This allows LiteLLM users to benefit from prompt caching with supported models
like Claude 3.7, reducing costs and improving response times.
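
As a hedged illustration of the usage tracking mentioned above, the handler might normalize cache token counts along these lines (the response field names shown are assumptions about what LiteLLM backends return, not taken from the diff):

// Sketch: pull cache read/write token counts out of a LiteLLM usage payload,
// checking both Anthropic-style top-level fields and an OpenAI-style
// prompt_tokens_details fallback.
interface LiteLLMUsage {
	prompt_tokens?: number
	completion_tokens?: number
	cache_creation_input_tokens?: number
	cache_read_input_tokens?: number
	prompt_tokens_details?: { cached_tokens?: number }
}

function extractCacheTokens(usage: LiteLLMUsage): { cacheWriteTokens: number; cacheReadTokens: number } {
	const cacheWriteTokens = usage.cache_creation_input_tokens ?? 0
	const cacheReadTokens = usage.cache_read_input_tokens ?? usage.prompt_tokens_details?.cached_tokens ?? 0
	return { cacheWriteTokens, cacheReadTokens }
}
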
@MuriloFP MuriloFP requested review from mrubens, cte and jr as code owners July 22, 2025 18:51
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Jul 22, 2025

expect(createCall.messages[lastUserIdx]).toMatchObject({
	cache_control: { type: "ephemeral" },
})

Consider adding an assertion for the second last user message as well, to fully verify that cache control headers are applied to both the last two user messages.
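
A sketch of the suggested extra assertion, assuming the test also tracks the index of the second-to-last user message in a hypothetical secondLastUserIdx variable:

// Hypothetical companion assertion for the second-to-last user message.
expect(createCall.messages[secondLastUserIdx]).toMatchObject({
	cache_control: { type: "ephemeral" },
})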

@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Jul 22, 2025