Return a copy from strict key removal to not break cache keys #9693
base: main
Conversation
oh, I'm not really sure this bug is going to be common. I was sending a custom JSON schema (rather than a Pydantic model). This was to decorate the JSON object with extra keys. Maybe my fault here for including them.
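For context, roughly the shape of the call in question (a sketch only; the model name, schema contents, and decorated keys here are placeholders rather than my exact request):

```python
import litellm

# A hand-written JSON schema (not a Pydantic model), decorated with extra
# keys such as "strict". Placeholder values throughout.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "my_object",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {"answer": {"type": "string"}},
            "required": ["answer"],
            "additionalProperties": False,
        },
    },
}

# Requires credentials for whichever provider the model maps to.
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Answer in JSON"}],
    response_format=response_format,
)
# Because response_format is passed by reference, an in-place strict-key
# removal inside the library also mutates this dict, which is how the cache
# key can drift between request time and response-caching time.
```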
Maybe this would be a better idea (at least if it's not too complex), because it seems like it could solve the whole class of caching problems where a request changes between request time and response-caching time. I don't think I'd ever see any value in that happening?
Just rubber-ducking here now: what's odd is that this does seem like it should already be happening. With `get_cache_key`, it stores the value in the `litellm_params`.
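A quick sketch of the stash I'm referring to (the `preset_cache_key` name is my reading of the caching code and may differ):

```python
import litellm

litellm.cache = litellm.Cache()  # default in-memory cache

kwargs = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "hi"}],
    "litellm_params": {},  # the key is only memoized if this dict exists
}

key = litellm.cache.get_cache_key(**kwargs)
# If the memoization works as I understand it, the computed key has been
# written back into litellm_params (e.g. {"preset_cache_key": key}).
print(kwargs["litellm_params"])
```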
ok, fixed
welp, something's screwed up with the cost tracking tests now
Force-pushed from 837818d to 5e75601
should be fixed
# mutate kwargs
kwargs["temperature"] = 0.8
cache_key_2 = litellm.cache.get_cache_key(**kwargs)
assert cache_key == cache_key_2
hey @adrianlyjak this is not desired behaviour - if the user changes an optional param, we do not want to return a cached response
@krrishdholakia I didn't change anything to make this particular test pass; this is the current functionality. It appears to be the existing intended behavior of the code: memoize the cache key within a single request.
This test scenario is perhaps a little unrealistic, since the temperature itself can't actually change (the kwargs are spread, which copies the top-level dict). However, if a nested parameter such as the response schema is mutated between the start of the request and the response, the same cache key is used. The related fix I implemented was just to ensure litellm_params is initialized, so the cache key is actually memoized.
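To make the nested-mutation case concrete, a sketch (not the actual test; model and schema are placeholders, and the key reuse assumes the memoization described above):

```python
import litellm

litellm.cache = litellm.Cache()

kwargs = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "hi"}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "x", "strict": True, "schema": {"type": "object"}},
    },
    "litellm_params": {},  # the fix ensures this is initialized
}

request_time_key = litellm.cache.get_cache_key(**kwargs)

# An integration mutates a *nested* parameter in place between request time
# and response-caching time (the kwargs spread only copies the top level).
kwargs["response_format"]["json_schema"].pop("strict", None)

response_time_key = litellm.cache.get_cache_key(**kwargs)
print(request_time_key == response_time_key)  # expected True: the memoized key is reused
```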
> if the user changes an optional param
To be clear, as I understand it, this would only be happening internally within integrations. I don't know of a way for the user to modify the request parameters after calling `completion` (or any of the other entry points).
Adjusted the test for clarity: modifying temperature isn't really an expected use case, so instead I normalize the system -> developer role on a message.
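Roughly what the adjusted test exercises now (paraphrased, not the exact test code; model name and messages are placeholders):

```python
import litellm

litellm.cache = litellm.Cache()

kwargs = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "hi"},
    ],
    "litellm_params": {},
}

cache_key = litellm.cache.get_cache_key(**kwargs)

# Simulate an integration normalizing the role in place between request
# time and response-caching time (system -> developer).
kwargs["messages"][0]["role"] = "developer"

cache_key_2 = litellm.cache.get_cache_key(**kwargs)
assert cache_key == cache_key_2  # the memoized key keeps the cache entry stable
```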
Accidentally closed
A) Return a copy from strict key removal to not break cache keys
B) Fix an issue in the existing cache-key stabilizer that was not storing a stable key across request/response if no litellm_params existed in the request
Caching stability improvements
A) Return a copy from strict key removal to not break cache keys. This change seems nice and simple. All of the callers I saw using it were updating their reference, e.g. `thing = _remove_strict_from_schema(thing)`, so this seems like it should work just fine. After digging deeper, this part may be unnecessary?
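For illustration, a minimal sketch of the copy-returning shape (not the actual diff; the real helper in litellm handles more nesting and edge cases):

```python
import copy
from typing import Any, Dict


def _remove_strict_from_schema(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a cleaned copy instead of mutating the caller's schema in place."""
    schema = copy.deepcopy(schema)  # leave the original (and its cache key) intact

    def _strip(node: Any) -> None:
        if isinstance(node, dict):
            node.pop("strict", None)
            for value in node.values():
                _strip(value)
        elif isinstance(node, list):
            for item in node:
                _strip(item)

    _strip(schema)
    return schema


# Callers keep the same convention they already use:
# thing = _remove_strict_from_schema(thing)
```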
B) I also saw there's a property that can be set to re-use the same cache key later, in case mutation occurs, but I guess that only happens if the `litellm_params` object has been set previously. Looks like there's a good opportunity to initialize it in `wrapper_async`, such that it can be used to cache the key between request and response if it was not otherwise set.

Origination of this is perhaps a bit of an edge case, but I think these improvements will help.

`litellm_params`, although this is probably normal
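And a rough sketch of the idea behind B) (names and placement are approximate, not the actual wrapper code):

```python
def _ensure_litellm_params(kwargs: dict) -> dict:
    # Run early in the (async) client wrapper, before the request-time cache
    # lookup, so get_cache_key has somewhere to memoize the computed key
    # (e.g. a preset cache key) for reuse when the response is written back.
    kwargs.setdefault("litellm_params", {})
    return kwargs
```

With that in place, the request-time key and the response-time key come from the same memoized value even if a nested parameter was mutated in between.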
Relevant issues
fix #9692
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- I have added testing in the `tests/litellm/` directory. Adding at least 1 test is a hard requirement - see details
- My PR passes all unit tests ([`make test-unit`](https://docs.litellm.ai/docs/extras/contributing_code))

☝️ this last one is debatable. I could remove part A) to make this simpler
Tests passing screenshots


Type
🐛 Bug Fix
✅ Test
Changes
Return a copy from strict key removal to not break cache keys