fix: Anthropic prompt caching on GCP Vertex AI #9605


Merged
merged 3 commits into from
Mar 30, 2025

Conversation

sammcj (Contributor) commented Mar 28, 2025

Title

Fix (hopefully) for prompt caching not working with Anthropic models on GCP Vertex AI

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement - see details
  • I have added a screenshot of my new test passing locally
  • [-] My PR passes all unit tests on [make test-unit](https://docs.litellm.ai/docs/extras/contributing_code) - the same tests fail on my branch as fail on main
  • My PR's scope is as isolated as possible; it only solves 1 specific problem

New test:

(screenshot of the new test passing)

Other tests are no more broken than they are on main at present:

(screenshots)

And the same linting errors as on main (none introduced by the changes in this PR):

(screenshots)

Type

🐛 Bug Fix

Changes

Adds the missing header for Anthropic models on GCP Vertex AI, which allows prompt caching to work.

vercel bot commented Mar 28, 2025

The latest updates on your projects:

litellm: ✅ Ready - Mar 28, 2025 9:06am (UTC)

@krrishdholakia krrishdholakia merged commit a867324 into BerriAI:main Mar 30, 2025
3 checks passed
@sammcj sammcj deleted the gcp_prompt_caching branch March 31, 2025 11:29
krrishdholakia added a commit that referenced this pull request Apr 1, 2025
…x_parallel_requests = 0 (#9671)

* fix(proxy_server.py): remove non-functional parent backoff/retry on /chat/completion

Causes circular reference error

* fix(http_parsing_utils.py): safely return parsed body - don't allow mutation of cached request body by client functions

Root cause fix for circular reference error

* Revert "fix: Anthropic prompt caching on GCP Vertex AI (#9605)" (#9670)

This reverts commit a867324.

* add type hints for AnthropicMessagesResponse

* define types for response from AnthropicMessagesResponse

* fix response typing

* allow using litellm.messages.acreate and litellm.messages.create

* fix anthropic_messages implementation

* add clear type hints to litellm.messages.create functions

* fix anthropic_messages

* working anthropic API tests

* fixes - anthropic messages interface

* use new anthropic interface

* fix code quality check

* docs anthropic messages endpoint

* add namespace_packages = True to mypy

* fix mypy lint errors

* docs anthropic messages interface

* test: fix unit test

* test(test_http_parsing_utils.py): update tests

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
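The http_parsing_utils.py fix described in the commit messages above (safely returning the parsed body so client functions cannot mutate the cached copy) can be sketched as follows. The function and cache names here are assumptions for illustration, not LiteLLM's actual implementation.

```python
import copy

# Sketch of the "don't allow mutation of cached request body" fix.
# Names are illustrative; LiteLLM's real implementation differs.
_parsed_body_cache: dict[int, dict] = {}

def get_parsed_body(request_id: int, raw_parse) -> dict:
    """Parse the request body once, cache it, and hand callers a copy.

    raw_parse is a zero-argument callable that does the actual parsing.
    """
    if request_id not in _parsed_body_cache:
        _parsed_body_cache[request_id] = raw_parse()
    # Returning a deep copy means client functions can mutate their copy
    # freely without corrupting the cached body (mutation of the shared
    # object is what produced the circular reference error).
    return copy.deepcopy(_parsed_body_cache[request_id])
```

The key design choice is that the cache stays private: every caller gets an independent copy, so the parse cost is paid once while mutations never leak back into shared state.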