Add support for prompt caching through litellm #2429
mboret started this conversation in Feature Requests
Replies: 1 comment 5 replies
-
Hi, did you try checking the prompt caching checkbox? We added that recently.
-
Hi,
I'm using Roo Code with litellm as an OpenAI-compatible endpoint, and prompt caching is enabled.
If I use an OpenAI model like o3-mini (Azure), prompt caching works, but it does not when I switch to Claude 3.7 Sonnet (AWS Bedrock).
When I inspect the request that Roo Code sends to litellm, no caching parameter is set. That makes sense for OpenAI models, which don't need an extra parameter for prompt caching, but Claude 3.7 Sonnet does need one.
Example of a request to litellm with the prompt caching parameter: litellm doc
It would be nice to be able to add this extra parameter in order to use prompt caching with these models.
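For illustration, a rough sketch of what such a request could look like (my own, not copied from the linked litellm doc), assuming litellm's OpenAI-compatible /chat/completions endpoint and an Anthropic-style cache_control content block; the proxy URL, API key, and model alias below are placeholders:

```python
# Sketch of a chat completion request to a litellm proxy that marks a content
# block for prompt caching. URL, key, and model alias are placeholders.
import requests

payload = {
    "model": "claude-3-7-sonnet",  # Bedrock model alias configured in litellm (placeholder)
    "messages": [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Long, reusable system prompt...",
                    # Anthropic/Bedrock models only cache content blocks that are
                    # explicitly marked; OpenAI models cache automatically.
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {"role": "user", "content": "Hello!"},
    ],
}

resp = requests.post(
    "http://localhost:4000/v1/chat/completions",  # litellm proxy endpoint (placeholder)
    headers={"Authorization": "Bearer sk-litellm-placeholder"},
    json=payload,
    timeout=60,
)
print(resp.json().get("usage"))
```

If caching is applied, the usage block in the response should report cached prompt tokens; without the cache_control block, it stays at zero for the Bedrock Claude models.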