Releases: BerriAI/litellm
v1.67.3.dev1
What's Changed
- [Feat] Add gpt-image-1 cost tracking by @ishaan-jaff in #10241 (see the sketch after this list)
- [Bug Fix] Add Cost Tracking for gpt-image-1 when quality is unspecified by @ishaan-jaff in #10247
- [Feat] Add support for GET Responses Endpoint - OpenAI, Azure OpenAI by @ishaan-jaff in #10235
- fix(user_dashboard.tsx): add token expiry logic to user dashboard by @krrishdholakia in #10250
- [Helm] fix for serviceAccountName on migration job by @ishaan-jaff in #10258
- Fix typos by @DimitriPapadopoulos in #10232
- Reset key alias value when resetting filters by @crisshaker in #10099
- Support all compatible bedrock params when model="arn:.." by @krrishdholakia in #10256
- UI - fix edit azure public model name + support changing model names post create by @krrishdholakia in #10249
- Litellm fix UI login by @krrishdholakia in #10260
- Multi-admin + Users page fixes: show all models, show user personal models, allow editing user role, available models by @krrishdholakia in #10259
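For the gpt-image-1 cost-tracking items above, a minimal sketch of what tracked usage looks like from the Python SDK, assuming OPENAI_API_KEY is exported; the completion_cost helper and its call_type argument are LiteLLM's usual cost utilities, but treat the exact accessor as an assumption rather than this release's documented interface.
# Minimal sketch: generate an image with gpt-image-1 and read the tracked cost.
# Assumes OPENAI_API_KEY is exported; the cost accessor shown is an assumption.
import litellm

response = litellm.image_generation(
    model="gpt-image-1",
    prompt="a watercolor sketch of a lighthouse",
)
# Per #10247, cost is tracked even when no explicit quality is passed.
cost = litellm.completion_cost(
    completion_response=response,
    call_type="image_generation",
)
print(f"tracked cost (USD): {cost}")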
Full Changelog: v1.67.2-nightly...v1.67.3.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.67.3.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
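Once the container above is running, the proxy exposes an OpenAI-compatible API on port 4000 (the same /chat/completions route the load tests below exercise). A minimal sketch using the openai Python client; the base URL, model name, and the sk-1234 virtual key are placeholders for your own deployment.
# Minimal sketch: point the standard OpenAI client at the LiteLLM proxy
# started with the docker command above. "sk-1234" is a placeholder key.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

resp = client.chat.completions.create(
    model="gpt-4o",  # any model name configured on the proxy
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy"}],
)
print(resp.choices[0].message.content)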
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 235.18092614107948 | 6.181088327781123 | 0.0 | 1850 | 0 | 192.45027600004505 | 4892.269687999942 |
Aggregated | Passed ✅ | 210.0 | 235.18092614107948 | 6.181088327781123 | 0.0 | 1850 | 0 | 192.45027600004505 | 4892.269687999942 |
v1.67.2-nightly
What's Changed
- Add AgentOps Integration to LiteLLM by @Dwij1704 in #9685
- Add global filtering to Users tab by @krrishdholakia in #10195
- [Feat] Add Support for DELETE /v1/responses/{response_id} on OpenAI, Azure OpenAI by @ishaan-jaff in #10205 (see the sketch after this list)
- Bug Fix - Address deprecation of open_text by @ishaan-jaff in #10208
- UI - Users page - Enable global sorting (allows finding users with highest spend) by @krrishdholakia in #10211
- feat: Added Missing Attributes For Arize & Phoenix Integration (#10043) by @ishaan-jaff in #10215
- Users page - new user info pane by @krrishdholakia in #10213
- Fix datadog llm observability logging + (Responses API) Ensures handling for undocumented event types by @krrishdholakia in #10206
- Discard duplicate sentence by @DimitriPapadopoulos in #10231
- Require auth for all dashboard pages by @crisshaker in #10229
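For the new DELETE /v1/responses/{response_id} route (and the GET counterpart added in v1.67.3.dev1 above), a minimal sketch of retrieving and then deleting a stored Responses API object through the proxy; the base URL, virtual key, and response id are placeholders, not values from this release.
# Minimal sketch (placeholders throughout): fetch, then delete, a stored
# Responses API object via the LiteLLM proxy.
import requests

BASE_URL = "http://localhost:4000"             # proxy address (placeholder)
HEADERS = {"Authorization": "Bearer sk-1234"}  # virtual key (placeholder)
response_id = "resp_abc123"                    # id from POST /v1/responses

# GET a previously created response (added in v1.67.3.dev1)
r = requests.get(f"{BASE_URL}/v1/responses/{response_id}", headers=HEADERS)
print(r.status_code, r.json())

# DELETE it (added in this release, #10205)
r = requests.delete(f"{BASE_URL}/v1/responses/{response_id}", headers=HEADERS)
print(r.status_code)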
New Contributors
Full Changelog: v1.67.1-nightly...v1.67.2-nightly
v1.67.1-nightly
What's Changed
- [UI] Bug Fix, team model selector by @ishaan-jaff in #10171
- [Bug Fix] Auth Check, Fix typing to ensure case where model is None is handled by @ishaan-jaff in #10170
- [Docs] Responses API by @ishaan-jaff in #10172
- Litellm release notes 04 19 2025 by @krrishdholakia in #10169
- fix(transformation.py): pass back in gemini thinking content to api by @krrishdholakia in #10173
- Litellm docs SCIM by @ishaan-jaff in #10174
- fix(common_daily_activity.py): support empty entity id field by @krrishdholakia in #10175
- fix(proxy_server.py): pass llm router to get complete model list by @krrishdholakia in #10176
- Model pricing updates for Azure & VertexAI by @marty-sullivan in #10178
- fix(bedrock): wrong system prompt transformation by @hewliyang in #10120
- Fix: Potential SQLi in spend_management_endpoints.py by @n1lanjan in #9878
- Handle edge case where user sets model_group inside model_info + Return hashed_token in `token` field on `/key/generate` by @krrishdholakia in #10191
- Remove user_id from url by @krrishdholakia in #10192
- [Feat] Pass through endpoints - ensure `PassthroughStandardLoggingPayload` is logged and contains method, url, request/response body by @ishaan-jaff in #10194
- [Feat] Add Responses API - Routing Affinity logic for sessions by @ishaan-jaff in #10193
- [Feat] Add infinity embedding support (contributor pr) by @ishaan-jaff in #10196
- [Bug Fix] caching does not account for thinking or reasoning_effort config by @ishaan-jaff in #10140
- Gemini-2.5-flash improvements by @krrishdholakia in #10198
Full Changelog: v1.67.0-nightly...v1.67.1-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.67.1-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 263.64700999835935 | 6.1132795166960605 | 0.0 | 1829 | 0 | 199.11094299999377 | 4358.182531000011 |
Aggregated | Passed ✅ | 220.0 | 263.64700999835935 | 6.1132795166960605 | 0.0 | 1829 | 0 | 199.11094299999377 | 4358.182531000011 |
v1.67.0-stable
What's Changed
- build(model_prices_and_context_window.json): add gpt-4.1 pricing by @krrishdholakia in #9990
- [Fixes/QA] For gpt-4.1 costs by @ishaan-jaff in #9991
- Fix cost for Phi-4-multimodal output tokens by @emerzon in #9880
- chore(docs): update ordering of logging & observability docs by @marcklingen in #9994
- Updated cohere v2 passthrough by @krrishdholakia in #9997
- [Feat] Add support for `cache_control_injection_points` for Anthropic API, Bedrock API by @ishaan-jaff in #9996
- [UI] Allow setting prompt `cache_control_injection_points` by @ishaan-jaff in #10000
- Fix azure tenant id check from env var + response_format check on api_version 2025+ by @krrishdholakia in #9993
- Add `/vllm` and `/mistral` passthrough endpoints by @krrishdholakia in #10002
- CI/CD fix mock tests by @ishaan-jaff in #10003
- Setting `litellm.modify_params` via environment variables by @Eoous in #9964
- Support checking provider `/models` endpoints on proxy `/v1/models` endpoint by @krrishdholakia in #9958
- Update AWS bedrock regions by @Schnitzel in #9430
- Fix case where only system messages are passed to Gemini by @NolanTrem in #9992
- Revert "Fix case where only system messages are passed to Gemini" by @krrishdholakia in #10027
- chore(docs): Update logging.md by @mrlorentx in #10006
- build(deps): bump @babel/runtime from 7.23.9 to 7.27.0 in /ui/litellm-dashboard by @dependabot in #10001
- Fix typo: Entrata -> Entra in code by @msabramo in #9922
- Retain schema field ordering for google gemini and vertex by @adrianlyjak in #9828
- Revert "Retain schema field ordering for google gemini and vertex" by @krrishdholakia in #10038
- Add aggregate team based usage logging by @krrishdholakia in #10039
- [UI Polish] UI fixes for cache control injection settings by @ishaan-jaff in #10031
- [UI] Bug Fix - Show created_at and updated_at for Users Page by @ishaan-jaff in #10033
- [Feat - Cost Tracking improvement] Track prompt caching metrics in DailyUserSpendTransactions by @ishaan-jaff in #10029
- Fix gcs pub sub logging with env var GCS_PROJECT_ID by @krrishdholakia in #10042
- Add property ordering for vertex ai schema (#9828) + Fix combining multiple tool calls by @krrishdholakia in #10040
- [Docs] Auto prompt caching by @ishaan-jaff in #10044
- Add litellm call id passing to Aim guardrails on pre and post-hooks calls by @hxmichael in #10021
- /utils/token_counter: get model_info from deployment directly by @chaofuyang in #10047
- [Bug Fix] Azure Blob Storage fixes by @ishaan-jaff in #10059
- build(deps): bump http-proxy-middleware from 2.0.7 to 2.0.9 in /docs/my-website by @dependabot in #10064
- fix(stream_chunk_builder_utils.py): don't set index on modelresponse by @krrishdholakia in #10063
- fix(llm_http_handler.py): fix fake streaming by @krrishdholakia in #10061
- Add aggregate spend by tag by @krrishdholakia in #10071
- Add OpenAI o3 & o4-mini by @PeterDaveHello in #10065
- Add new `/tag/daily/activity` endpoint + Add tag dashboard to UI by @krrishdholakia in #10073
- Add team based usage dashboard at 1m+ spend logs (+ new `/team/daily/activity` API) by @krrishdholakia in #10081
- [Feat SSO] Add LiteLLM SCIM Integration for Team and User management by @ishaan-jaff in #10072
- Virtual Keys: Filter by key alias (#10035) by @ishaan-jaff in #10085
- Add new `/vertex_ai/discovery` route - enables calling AgentBuilder API routes by @krrishdholakia in #10084
- fix(o_series_transformation.py): correctly map o4 to openai o_series … by @krrishdholakia in #10079
- [Feat] Unified Responses API - Add Azure Responses API support by @ishaan-jaff in #10116
- UI: Make columns resizable/hideable in Models table by @msabramo in #10119
- Remove unnecessary `package*.json` files by @msabramo in #10075
- Add Gemini Flash 2.5 Preview Model Price and Context Window by @drmingler in #10125
- test: update tests to new deployment model by @krrishdholakia in #10142
- [Feat] Support for all litellm providers on Responses API (works with Codex) - Anthropic, Bedrock API, VertexAI, Ollama by @ishaan-jaff in #10132
- fix(litellm-proxy-extras/utils.py): prisma migrate improvements: hand… by @krrishdholakia in #10138
- Litellm dev 04 18 2025 p2 by @krrishdholakia in #10157
- Gemini-2.5-flash - support reasoning cost calc + return reasoning content by @krrishdholakia in #10141
- Handle fireworks ai tool calling response by @krrishdholakia in #10130
- Support 'file' message type for VLLM video url's + Anthropic redacted message thinking support by @krrishdholakia in #10129
- fix(triton/completion/transformation.py): remove bad_words / stop wor… by @krrishdholakia in #10163
- Update model_prices_and_context_window_backup.json by @Classic298 in #10122
- Get API key from environment variable WATSONX_APIKEY by @ongkhaiwei in #10131
- test(utils.py): handle scenario where text tokens + reasoning tokens … by @krrishdholakia in #10165
New Contributors
- @Eoous made their first contribution in #9964
- @mrlorentx made their first contribution in #10006
- @hxmichael made their first contribution in #10021
- @chaofuyang made their first contribution in #10047
- @drmingler made their first contribution in #10125
- @Classic298 made their first contribution in #10122
- @ongkhaiwei made their first contribution in #10131
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.67.0-stable
Full Changelog: v1.66.0-stable...v1.67.0-stable
v1.67.0-nightly
What's Changed
- [Feat] Expose Responses API on LiteLLM UI Test Key Page by @ishaan-jaff in #10166
- [Bug Fix] Spend Tracking Bug Fix, don't modify in memory default litellm params by @ishaan-jaff in #10167
- Bug Fix - Responses API, Loosen restrictions on allowed environments for computer use tool by @ishaan-jaff in #10168
Full Changelog: v1.67.0-stable...v1.67.0-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.67.0-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 262.85419851041036 | 6.266552109647687 | 0.0 | 1873 | 0 | 202.24337799993464 | 5393.98836700002 |
Aggregated | Passed ✅ | 230.0 | 262.85419851041036 | 6.266552109647687 | 0.0 | 1873 | 0 | 202.24337799993464 | 5393.98836700002 |
v1.66.3.dev5
What's Changed
- [Feat] Unified Responses API - Add Azure Responses API support by @ishaan-jaff in #10116
- UI: Make columns resizable/hideable in Models table by @msabramo in #10119
- Remove unnecessary `package*.json` files by @msabramo in #10075
- Add Gemini Flash 2.5 Preview Model Price and Context Window by @drmingler in #10125
- test: update tests to new deployment model by @krrishdholakia in #10142
- [Feat] Support for all litellm providers on Responses API (works with Codex) - Anthropic, Bedrock API, VertexAI, Ollama by @ishaan-jaff in #10132
New Contributors
- @drmingler made their first contribution in #10125
Full Changelog: v1.66.2.dev1...v1.66.3.dev5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.3.dev5
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 241.46378394371686 | 6.1149592690003 | 0.0 | 1830 | 0 | 197.6759699999775 | 1416.5823339999974 |
Aggregated | Passed ✅ | 230.0 | 241.46378394371686 | 6.1149592690003 | 0.0 | 1830 | 0 | 197.6759699999775 | 1416.5823339999974 |
v1.66.3.dev1
What's Changed
- [Feat] Unified Responses API - Add Azure Responses API support by @ishaan-jaff in #10116
- UI: Make columns resizable/hideable in Models table by @msabramo in #10119
Full Changelog: v1.66.2.dev1...v1.66.3.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.3.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 210.74489628810068 | 6.401988824678471 | 0.003341330284278951 | 1916 | 1 | 38.52582800004711 | 5506.760536000002 |
Aggregated | Passed ✅ | 180.0 | 210.74489628810068 | 6.401988824678471 | 0.003341330284278951 | 1916 | 1 | 38.52582800004711 | 5506.760536000002 |
v1.66.3-nightly
What's Changed
- Add aggregate spend by tag by @krrishdholakia in #10071
- Add OpenAI o3 & o4-mini by @PeterDaveHello in #10065
- Add new `/tag/daily/activity` endpoint + Add tag dashboard to UI by @krrishdholakia in #10073 (see the sketch after this list)
- Add team based usage dashboard at 1m+ spend logs (+ new `/team/daily/activity` API) by @krrishdholakia in #10081
- [Feat SSO] Add LiteLLM SCIM Integration for Team and User management by @ishaan-jaff in #10072
- Virtual Keys: Filter by key alias (#10035) by @ishaan-jaff in #10085
- Add new `/vertex_ai/discovery` route - enables calling AgentBuilder API routes by @krrishdholakia in #10084
- fix(o_series_transformation.py): correctly map o4 to openai o_series … by @krrishdholakia in #10079
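The /tag/daily/activity and /team/daily/activity routes above back the new tag and team usage dashboards. A minimal sketch of querying them directly through the proxy; the base URL and admin virtual key are placeholders, and query parameters (date ranges, pagination) are omitted since they aren't documented here.
# Minimal sketch (placeholders throughout): pull daily usage aggregates from
# the new admin endpoints using an admin-level virtual key.
import requests

BASE_URL = "http://localhost:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # admin virtual key (placeholder)

tag_usage = requests.get(f"{BASE_URL}/tag/daily/activity", headers=HEADERS)
team_usage = requests.get(f"{BASE_URL}/team/daily/activity", headers=HEADERS)
print(tag_usage.json())
print(team_usage.json())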
Full Changelog: v1.66.2-nightly...v1.66.3-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.3-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 250.0 | 302.3290337319068 | 6.097097387542003 | 0.04679789661490572 | 1824 | 14 | 218.4401190000358 | 5459.562037000012 |
Aggregated | Failed ❌ | 250.0 | 302.3290337319068 | 6.097097387542003 | 0.04679789661490572 | 1824 | 14 | 218.4401190000358 | 5459.562037000012 |
v1.66.2.dev1
What's Changed
- Add aggregate spend by tag by @krrishdholakia in #10071
- Add OpenAI o3 & o4-mini by @PeterDaveHello in #10065
- Add new `/tag/daily/activity` endpoint + Add tag dashboard to UI by @krrishdholakia in #10073
- Add team based usage dashboard at 1m+ spend logs (+ new `/team/daily/activity` API) by @krrishdholakia in #10081
- [Feat SSO] Add LiteLLM SCIM Integration for Team and User management by @ishaan-jaff in #10072
- Virtual Keys: Filter by key alias (#10035) by @ishaan-jaff in #10085
- Add new `/vertex_ai/discovery` route - enables calling AgentBuilder API routes by @krrishdholakia in #10084
- fix(o_series_transformation.py): correctly map o4 to openai o_series … by @krrishdholakia in #10079
Full Changelog: v1.66.2-nightly...v1.66.2.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.2.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 242.7078390639904 | 6.1689738182726535 | 0.0 | 1844 | 0 | 181.44264199997906 | 6553.659710999966 |
Aggregated | Passed ✅ | 200.0 | 242.7078390639904 | 6.1689738182726535 | 0.0 | 1844 | 0 | 181.44264199997906 | 6553.659710999966 |
v1.66.2-nightly
What's Changed
- Fix azure tenant id check from env var + response_format check on api_version 2025+ by @krrishdholakia in #9993
- Add `/vllm` and `/mistral` passthrough endpoints by @krrishdholakia in #10002
- CI/CD fix mock tests by @ishaan-jaff in #10003
- Setting `litellm.modify_params` via environment variables by @Eoous in #9964 (see the sketch after this list)
- Support checking provider `/models` endpoints on proxy `/v1/models` endpoint by @krrishdholakia in #9958
- Update AWS bedrock regions by @Schnitzel in #9430
- Fix case where only system messages are passed to Gemini by @NolanTrem in #9992
- Revert "Fix case where only system messages are passed to Gemini" by @krrishdholakia in #10027
- chore(docs): Update logging.md by @mrlorentx in #10006
- build(deps): bump @babel/runtime from 7.23.9 to 7.27.0 in /ui/litellm-dashboard by @dependabot in #10001
- Fix typo: Entrata -> Entra in code by @msabramo in #9922
- Retain schema field ordering for google gemini and vertex by @adrianlyjak in #9828
- Revert "Retain schema field ordering for google gemini and vertex" by @krrishdholakia in #10038
- Add aggregate team based usage logging by @krrishdholakia in #10039
- [UI Polish] UI fixes for cache control injection settings by @ishaan-jaff in #10031
- [UI] Bug Fix - Show created_at and updated_at for Users Page by @ishaan-jaff in #10033
- [Feat - Cost Tracking improvement] Track prompt caching metrics in DailyUserSpendTransactions by @ishaan-jaff in #10029
- Fix gcs pub sub logging with env var GCS_PROJECT_ID by @krrishdholakia in #10042
- Add property ordering for vertex ai schema (#9828) + Fix combining multiple tool calls by @krrishdholakia in #10040
- [Docs] Auto prompt caching by @ishaan-jaff in #10044
- Add litellm call id passing to Aim guardrails on pre and post-hooks calls by @hxmichael in #10021
- /utils/token_counter: get model_info from deployment directly by @chaofuyang in #10047
- [Bug Fix] Azure Blob Storage fixes by @ishaan-jaff in #10059
- build(deps): bump http-proxy-middleware from 2.0.7 to 2.0.9 in /docs/my-website by @dependabot in #10064
- fix(stream_chunk_builder_utils.py): don't set index on modelresponse by @krrishdholakia in #10063
- fix(llm_http_handler.py): fix fake streaming by @krrishdholakia in #10061
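For the litellm.modify_params item above: the flag itself is a long-standing module-level setting, and #9964 adds an environment-variable way to turn it on. A minimal sketch; the exact variable name shown is an assumption, not taken from the PR.
# Minimal sketch: modify_params lets LiteLLM auto-adjust provider-specific
# request params (e.g. reshaping message order for providers that reject
# assistant-first conversations).
import litellm

litellm.modify_params = True  # in-code setting, available before this release

# #9964 adds an env-var path for the same flag, set before the process starts.
# The variable name below is an assumption; check the PR for the final name.
#   export LITELLM_MODIFY_PARAMS="True"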
New Contributors
- @Eoous made their first contribution in #9964
- @mrlorentx made their first contribution in #10006
- @hxmichael made their first contribution in #10021
- @chaofuyang made their first contribution in #10047
Full Changelog: v1.66.1-nightly...v1.66.2-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.2-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 244.45508035967268 | 6.136194497326665 | 0.0 | 1835 | 0 | 169.77143499997283 | 8723.871383000016 |
Aggregated | Passed ✅ | 190.0 | 244.45508035967268 | 6.136194497326665 | 0.0 | 1835 | 0 | 169.77143499997283 | 8723.871383000016 |