Return a copy from strict key removal to not break cache keys #9693


Open · wants to merge 1 commit into main

Conversation

@adrianlyjak (Contributor) commented Apr 2, 2025:

Caching stability improvements

A) Return a copy from strict key removal to not break cache keys. This change seems nice and simple. All of the callers I saw using it were updating their reference, e.g. thing = _remove_strict_from_schema(thing), so this seems like it should work just fine. After digging deeper, this part may be unnecessary?
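For illustration, a copy-based version of the helper might look something like the sketch below. The function name matches the helper discussed here, but the body is illustrative rather than litellm's actual implementation:

import copy
from typing import Any

def _remove_strict_from_schema(schema: Any) -> Any:
    """Return a copy of the schema with every nested 'strict' key removed,
    leaving the caller's original object (and any cache key derived from it) intact."""
    schema = copy.deepcopy(schema)  # work on a copy instead of mutating the caller's dict

    def _strip(node: Any) -> None:
        if isinstance(node, dict):
            node.pop("strict", None)
            for value in node.values():
                _strip(value)
        elif isinstance(node, list):
            for item in node:
                _strip(item)

    _strip(schema)
    return schema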

B) I also saw there's a property that can be set to re-use the same cache key later, in case mutation occurs, but that only happens if a litellm_params object has been set previously. Looks like there's a good opportunity to initialize it in wrapper_async, so it can be used to cache the key between request and response when it wasn't otherwise set.
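Very roughly, the memoization pattern I'm describing looks like the sketch below. The preset_cache_key name follows what the existing cache-key code appears to use, but the wrapper plumbing here is simplified and illustrative, not litellm's actual source:

def get_or_compute_cache_key(kwargs: dict, compute_key) -> str:
    # Make sure litellm_params exists, so the key can be memoized even when the
    # caller never passed it (the situation that triggered this PR).
    litellm_params = kwargs.setdefault("litellm_params", {})

    preset = litellm_params.get("preset_cache_key")
    if preset is not None:
        # Reuse the key computed at request time, even if nested kwargs
        # (e.g. a response schema) were mutated in the meantime.
        return preset

    key = compute_key(kwargs)
    litellm_params["preset_cache_key"] = key
    return key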

The origin of this is perhaps a bit of an edge case, but I think these improvements will help:

  1. My request didn't have litellm_params, although this is probably normal
  2. I had erroneously set strict: true on my json schema

Relevant issues

fix #9692

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests (make test-unit; see https://docs.litellm.ai/docs/extras/contributing_code)
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

☝️ this last one is debatable. I could remove part A) to make this simpler

Tests passing screenshots
[screenshots of the new tests passing locally]

Type

🐛 Bug Fix
✅ Test

Changes

Return a copy from strict key removal to not break cache keys

@vercel bot commented Apr 2, 2025:

The latest updates on your projects (Vercel for Git):

Name: litellm · Status: ✅ Ready · Updated (UTC): Apr 13, 2025 7:42pm

@adrianlyjak adrianlyjak marked this pull request as ready for review April 2, 2025 02:13
@adrianlyjak (Contributor, Author) commented:

oh, I'm not really sure this bug is going to be common. I was sending a custom JSON schema (rather than a Pydantic model) in order to decorate the JSON object with propertyOrdering, because Vertex (at least Gemini) annoyingly returns fields in alphabetical order rather than in the order given in the JSON.

Maybe my fault here for including strict in the first place. The bug didn't occur when I tried passing in the Pydantic model directly.
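For context, the schema I was sending looked roughly like this (propertyOrdering is the Vertex/Gemini ordering hint; the actual fields here are illustrative):

response_schema = {
    "type": "object",
    "propertyOrdering": ["name", "age", "bio"],  # ask Gemini to emit fields in this order
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "bio": {"type": "string"},
    },
    "required": ["name", "age"],
    "strict": True,  # the mistaken key that gets stripped, mutating the dict in place
}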

@adrianlyjak (Contributor, Author) commented:

Another idea would be to attach some sort of persistent attribute to a request to re-use the same cache key later, in case mutation occurs, but I didn't spend the time looking into how difficult that would be.

Maybe this would be a better idea (at least if it's not too complex), because it seems like it could solve the whole class of caching problems where a request changes between request time and response-caching time. I can't think of a case where that kind of mutation would ever be intentional.

@adrianlyjak adrianlyjak marked this pull request as draft April 2, 2025 02:28
@adrianlyjak (Contributor, Author) commented:

Just rubber ducking here now:

What's odd is that this does seem like it should already be happening: get_cache_key stores the computed key in litellm_params.

@adrianlyjak (Contributor, Author) commented:

ok, fixed

@adrianlyjak (Contributor, Author) commented:

welp, something's screwed up with the cost tracking tests now

@adrianlyjak (Contributor, Author) commented:

should be fixed

# mutate kwargs
kwargs["temperature"] = 0.8
cache_key_2 = litellm.cache.get_cache_key(**kwargs)
assert cache_key == cache_key_2  # key was memoized at request time, so it stays stable
@krrishdholakia (Contributor) commented:

hey @adrianlyjak this is not desired behaviour - if the user changes an optional param, we do not want to return a cached response

@adrianlyjak (Contributor, Author) replied Apr 2, 2025:

@krrishdholakia I didn't change anything to make this particular test pass; this is actually the current functionality. It appears to be the existing intended behavior of the code to memoize the cache key within a single request.

This test scenario is perhaps a little unrealistic, since the temperature itself can't actually get changed: the kwargs are spread, which copies the dict. However, if a nested parameter such as the response schema is mutated between the start of the request and the response, the same cache key is still used. The related fix I implemented was just to ensure litellm_params is initialized, so the cache key is actually memoized.
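In other words, spreading kwargs only makes a shallow copy: top-level values like temperature are insulated, but nested objects are shared, so mutating them reaches back into the original dict. A quick illustration:

kwargs = {"temperature": 0.7, "response_format": {"schema": {"strict": True}}}
copied = {**kwargs}  # what spreading kwargs effectively does: a shallow copy

copied["temperature"] = 0.8                        # top-level change: original untouched
copied["response_format"]["schema"].pop("strict")  # nested change: shared dict is mutated

assert kwargs["temperature"] == 0.7
assert "strict" not in kwargs["response_format"]["schema"]  # the original changed too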

@adrianlyjak (Contributor, Author) replied:

Re: “if the user changes an optional param”

To be clear, as I understand it, this would only happen internally within integrations. I don't know of a way for the user to modify the request parameters after calling completion (or another function).

@adrianlyjak (Contributor, Author) replied:

Adjusted the test for clarity: modifying temperature isn't really an expected use case, so instead I normalize the system -> developer role on a message (see the sketch below).
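For reference, the adjusted test mutates the messages rather than the temperature, roughly along these lines (a sketch, assuming litellm.cache has already been configured; the real test lives under tests/litellm/):

kwargs = {
    "model": "gpt-4o",
    "messages": [{"role": "system", "content": "You are a helpful assistant."}],
    "litellm_params": {},  # present so the key can be memoized on it
}

cache_key = litellm.cache.get_cache_key(**kwargs)

# simulate an integration normalizing the role between request time and response time
kwargs["messages"][0]["role"] = "developer"

cache_key_2 = litellm.cache.get_cache_key(**kwargs)
assert cache_key == cache_key_2  # same memoized key despite the nested mutation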

@adrianlyjak (Contributor, Author) commented:

Accidentally closed

A) Return a copy from strict key removal to not break cache keys
B) Fix issue in existing cache key stabilizer that was not storing a stable
   key across request/response if no litellm_params existed in the request
@CLAassistant commented Apr 22, 2025:

CLA assistant check
All committers have signed the CLA.

Development

Successfully merging this pull request may close these issues.

[Bug]: Vertex AI Gemini Structured JSON caching not working