Releases: BerriAI/litellm
v1.69.2-nightly
What's Changed
- Fixed Ollama Structured Response not working #10616 by @imdigitalashish in #10617
- fix(factory.py): Add reasoning content handling for missing assistant… by @LouisShark in #10688
- [Feat] Add tools support for Nvidia NIM by @ishaan-jaff in #10763
- [Fix]: /messages - allow using dynamic AWS params by @ishaan-jaff in #10769
- fix: pass application/json for GenericAPILogger by @ishaan-jaff in #10772
- [Docs] Using litellm with Google ADK by @ishaan-jaff in #10777
- Update Nscale model providers to point to website by @OscarSavNS in #10764
- [Fix] Allow using dynamic aws_region with /messages on Bedrock by @ishaan-jaff in #10779
- [Feat] Option to force/always use the litellm proxy (#10559) (#10633) by @ishaan-jaff in #10773
- feat: Added EU Anthropic Inference profile for Claude 3.7 by @wagnerjt in #10767
- Add new model provider Novita AI (#7582) by @krrishdholakia in #9527
- Support Anthropic web search tool + Add more google finish reason mapping by @krrishdholakia in #10785
- Fix Azure DALL-E 3 call with custom model name + Handle `Bearer $LITELLM_API_KEY` in `x-litellm-api-key` custom header by @krrishdholakia in #10776 (see the example request below)
- [Refactor] Move LLM Guard, Secret Detection to Enterprise pip package by @ishaan-jaff in #10782
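For reference, a minimal sketch of the custom-header behaviour fixed above: the proxy accepts a Bearer-prefixed key passed via the `x-litellm-api-key` header instead of `Authorization`. The model name and endpoint host below are placeholders; use a model configured on your proxy.

curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer $LITELLM_API_KEY" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'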
New Contributors
- @imdigitalashish made their first contribution in #10617
- @LouisShark made their first contribution in #10688
- @OscarSavNS made their first contribution in #10764
Full Changelog: v1.69.1-nightly...v1.69.2-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.69.2-nightly
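Once the container is running, the proxy serves an OpenAI-compatible API on port 4000. A minimal test request is sketched below; the model name and `sk-1234` key are placeholders, so substitute a model you have configured and whatever master key you set (if any).

curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'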
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 261.87647271929023 | 6.201675294060423 | 0.0 | 1856 | 0 | 196.7134870000109 | 3269.9465260000125 |
Aggregated | Passed ✅ | 230.0 | 261.87647271929023 | 6.201675294060423 | 0.0 | 1856 | 0 | 196.7134870000109 | 3269.9465260000125 |
v1.69.0.dev1
What's Changed
- Handle gemini audio input by @krrishdholakia in #10739
- Fixed Ollama Structured Response not working #10616 by @imdigitalashish in #10617
- fix(factory.py): Add reasoning content handling for missing assistant… by @LouisShark in #10688
- [Feat] Add tools support for Nvidia NIM by @ishaan-jaff in #10763
- [Fix]: /messages - allow using dynamic AWS params by @ishaan-jaff in #10769
New Contributors
- @imdigitalashish made their first contribution in #10617
- @LouisShark made their first contribution in #10688
Full Changelog: v1.69.0-stable...v1.69.0.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.69.0.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 225.99198177358326 | 6.202078923901813 | 0.0 | 1855 | 0 | 185.55775300001187 | 1628.1567700000323 |
Aggregated | Passed ✅ | 210.0 | 225.99198177358326 | 6.202078923901813 | 0.0 | 1855 | 0 | 185.55775300001187 | 1628.1567700000323 |
v1.69.1-nightly
What's Changed
- Handle gemini audio input by @krrishdholakia in #10739
Full Changelog: v1.69.0-stable...v1.69.1-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.69.1-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 266.2933357954668 | 6.193235506165493 | 0.0 | 1853 | 0 | 214.53170099999852 | 1247.624820999988 |
Aggregated | Passed ✅ | 250.0 | 266.2933357954668 | 6.193235506165493 | 0.0 | 1853 | 0 | 214.53170099999852 | 1247.624820999988 |
v1.69.0-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.69.0-stable
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 264.33108534405653 | 6.12787888551344 | 0.0 | 1834 | 0 | 216.09041499999648 | 1326.1799069999824 |
Aggregated | Passed ✅ | 240.0 | 264.33108534405653 | 6.12787888551344 | 0.0 | 1834 | 0 | 216.09041499999648 | 1326.1799069999824 |
What's Changed
- [Docs] Using LiteLLM with vector stores / knowledge bases by @ishaan-jaff in #10534
- [Docs] Document StandardLoggingVectorStoreRequest by @ishaan-jaff in #10535
- Litellm stable release notes 05 03 2025 by @krrishdholakia in #10536
- [New model pricing] Add perplexity/sonar-deep-research by @ishaan-jaff in #10537
- [Contributor PR] Support Llama-api as an LLM provider (#10451) by @ishaan-jaff in #10538
- UI - fix(model_management_endpoints.py): allow team admin to update model info + fix request logs - handle expanding other rows when existing row selected + fix(organization_endpoints.py): enable proxy admin with 'all-proxy-model' access to create new org with specific models by @krrishdholakia in #10539
- [Bug Fix] UnicodeDecodeError: 'charmap' on Windows during litellm import by @ishaan-jaff in #10542
- fix(converse_transformation.py): handle meta llama tool call response by @krrishdholakia in #10541
- Github: Increase timeout of litellm tests by @zoltan-ongithub in #10568
- [Docs] Change llama-api link for litellm by @seyeong-han in #10556
- [Feat] v2 Custom Logger API Endpoints by @ishaan-jaff in #10575
- [Bug fix] JSON logs - Ensure only 1 log is emitted (previously duplicate json logs were getting emitted) by @ishaan-jaff in #10580
- Update gemini-2.5-pro-exp-03-25 max_tokens to 65,535 by @mkavinkumar1 in #10548
- Update instructor.md by @thomelane in #10549
- fix issue when Databricks uses an external model, the delta could be empty… by @frankzye in #10540
- Add `litellm-proxy` CLI (#10478) by @ishaan-jaff in #10578
- Add bedrock llama4 pricing + handle llama4 templating on bedrock invoke route by @krrishdholakia in #10582
- Contributor PR - Return 404 when `delete_verification_tokens` (POST `/key/delete`) fai… by @ishaan-jaff in #10605
- Fix otel - follow genai semantic conventions + support 'instructions' param for tts by @krrishdholakia in #10608
- make openai model O series conditional accept provider/model by @aholmberg in #10591
- add gemini-2.5-pro-preview-05-06 model prices and context window by @marty-sullivan in #10597
- Fix: Ollama integration KeyError when using JSON response format by @aravindkarnam in #10611
- [Feat] V2 Emails - Fixes for sending emails when creating keys + Resend API support by @ishaan-jaff in #10602
- [Feat] Add User invitation emails when inviting users to litellm by @ishaan-jaff in #10615
- [Fix] SCIM - Creating SCIM tokens on Admin UI by @ishaan-jaff in #10628
- Filter on logs table by @NANDINI-star in #10644
- [Feat] Bedrock Guardrails - Add support for PII Masking with bedrock guardrails by @ishaan-jaff in #10642
- [Feat] Add endpoints to manage email settings by @ishaan-jaff in #10646
- Contributor PR - MCP Server DB Schema (#10634) by @ishaan-jaff in #10641
- Ollama - fix custom price cost tracking + add 'max_completion_token' support by @krrishdholakia in #10636
- fix cerebras llama-3.1-70b model_prices_and_context_window, not llama3.1-70b by @xsg22 in #10648
- Fix cache miss for gemini models with response_format by @casparhsws in #10635
- Add user management functionality to Python client library & CLI by @msabramo in #10627
- [BETA] Support unified file id (managed files) for batches by @krrishdholakia in #10650
- Fix Slack alerting not working if using a DB by @hypermoose in #10370
- Add support for Nscale (EU-Sovereign) Provider by @tomukmatthews in #10638
- Add New Perplexity Models by @keyute in #10652
- [Refactor - Filtering Spend Logs] Add `status` to root of SpendLogs table by @ishaan-jaff in #10661
- Filter logs on status and model by @NANDINI-star in #10670
- [Refactor] Anthropic /v1/messages endpoint - Refactor to use base llm http handler and transformations by @ishaan-jaff in #10677
- [Feat] Add support for using Bedrock Invoke models in /v1/messages format by @ishaan-jaff in #10681
- fix(factory.py): Handle system only message to anthropic by @krrishdholakia in #10678
- Realtime API - Set 'headers' in scope for websocket auth requests + reliability fix infinite loop when model_name not found for realtime models by @krrishdholakia in #10679
- Extract 'thinking' from nova response + Add 'drop_params' support for gpt-image-1 by @krrishdholakia in #10680
- New azure models by @emerzon in #9956
- Add GPTLocalhost to "docs/my-website/docs/projects" by @GPTLocalhost in #10687
- Add nscale support for streaming by @tomukmatthews in #10698
- build: update model in test by @krrishdholakia in #10706
- fix: support for python 3.11- (re datetime UTC) (#10471) by @ishaan-jaff in #10701
- [FIX] Update token fields in schema.prisma to use BigInt for improved… by @husnain7766 in #10697
- [Refactor] Use pip package for enterprise/ folder by @ishaan-jaff in #10709
- [Feat] Add streaming support for using bedrock invoke models with /v1/messages by @ishaan-jaff in #10710
- Add `--version` flag to `litellm-proxy` CLI by @msabramo in #10704
- Add management client docs by @msabramo in #10703
- fix(caching_handler.py): fix embedding str caching result by @krrishdholakia in #10700
- Azure LLM: fix passing through of azure_ad_token_provider parameter by @claralp in #10694
- set correct context window length for all gemini 2.5 variants by @mollux in #10690
- Fix log table bugs (after filtering logic was added) by @NANDINI-star in #10712
- fix(router.py): write file to all deployments by @krrishdholakia in #10708
- Litellm Unified File ID output file id support by @krrishdholakia in #10713
- complete unified batch id support - replace model in jsonl to be deployment model name by @krrishdholakia in #10719
- [UI] Bug Fix - Allow Copying Request / Response on Logs Page by @ishaan-jaff in #10720
- [UI] QA Logs page - Fix bug where log did not remain in focus + text overflow on error logs by @ishaan-jaff in #10725
- Add target model name validation by @krrishdholakia in #10722
- [Bug fix] - allow using credentials for /moderations by @ishaan-jaff in #10723
- [DB] Add index for session_id on LiteLLM_SpendLogs by @ishaan-jaff in #10727
- [QA Bug fix] fix: ensure model info does not get overwritten when editing a model on UI by @ishaan-jaff in #10726
- Mutable default arguments on embeddings/completion headers parameters breaks watsonx by @terylt in #10728
- [Docs] v1.69.0-stable by @ishaan-jaff in #10731
- Litellm emails smtp fixes by @ishaan-jaff in https://github.c...
v1.69.0-nightly
What's Changed
- build: update model in test by @krrishdholakia in #10706
- fix: support for python 3.11- (re datetime UTC) (#10471) by @ishaan-jaff in #10701
- [FIX] Update token fields in schema.prisma to use BigInt for improved… by @husnain7766 in #10697
- [Refactor] Use pip package for enterprise/ folder by @ishaan-jaff in #10709
- [Feat] Add streaming support for using bedrock invoke models with /v1/messages by @ishaan-jaff in #10710
- Add `--version` flag to `litellm-proxy` CLI by @msabramo in #10704
- Add management client docs by @msabramo in #10703
- fix(caching_handler.py): fix embedding str caching result by @krrishdholakia in #10700
- Azure LLM: fix passing through of azure_ad_token_provider parameter by @claralp in #10694
- set correct context window length for all gemini 2.5 variants by @mollux in #10690
- Fix log table bugs (after filtering logic was added) by @NANDINI-star in #10712
- fix(router.py): write file to all deployments by @krrishdholakia in #10708
- Litellm Unified File ID output file id support by @krrishdholakia in #10713
- complete unified batch id support - replace model in jsonl to be deployment model name by @krrishdholakia in #10719
- [UI] Bug Fix - Allow Copying Request / Response on Logs Page by @ishaan-jaff in #10720
- [UI] QA Logs page - Fix bug where log did not remain in focus + text overflow on error logs by @ishaan-jaff in #10725
- Add target model name validation by @krrishdholakia in #10722
- [Bug fix] - allow using credentials for /moderations by @ishaan-jaff in #10723
- [DB] Add index for session_id on LiteLLM_SpendLogs by @ishaan-jaff in #10727
- [QA Bug fix] fix: ensure model info does not get overwritten when editing a model on UI by @ishaan-jaff in #10726
- Mutable default arguments on embeddings/completion headers parameters breaks watsonx by @terylt in #10728
New Contributors
- @husnain7766 made their first contribution in #10697
- @claralp made their first contribution in #10694
- @mollux made their first contribution in #10690
- @terylt made their first contribution in #10728
Full Changelog: v1.68.2-nightly...v1.69.0-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.69.0-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 292.69430995024163 | 6.184694862389694 | 0.0 | 1849 | 0 | 216.9113210000262 | 60025.948276999996 |
Aggregated | Passed ✅ | 250.0 | 292.69430995024163 | 6.184694862389694 | 0.0 | 1849 | 0 | 216.9113210000262 | 60025.948276999996 |
v1.68.2.dev6
What's Changed
- build: update model in test by @krrishdholakia in #10706
- fix: support for python 3.11- (re datetime UTC) (#10471) by @ishaan-jaff in #10701
- [FIX] Update token fields in schema.prisma to use BigInt for improved… by @husnain7766 in #10697
New Contributors
- @husnain7766 made their first contribution in #10697
Full Changelog: v1.68.2-nightly...v1.68.2.dev6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.2.dev6
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 210.63173736431506 | 6.257034907859717 | 0.0 | 1872 | 0 | 166.34112399992773 | 1685.74146200001 |
Aggregated | Passed ✅ | 190.0 | 210.63173736431506 | 6.257034907859717 | 0.0 | 1872 | 0 | 166.34112399992773 | 1685.74146200001 |
v1.68.2-nightly
What's Changed
- [Refactor - Filtering Spend Logs] Add `status` to root of SpendLogs table by @ishaan-jaff in #10661
- Filter logs on status and model by @NANDINI-star in #10670
- [Refactor] Anthropic /v1/messages endpoint - Refactor to use base llm http handler and transformations by @ishaan-jaff in #10677
- [Feat] Add support for using Bedrock Invoke models in /v1/messages format by @ishaan-jaff in #10681
- fix(factory.py): Handle system only message to anthropic by @krrishdholakia in #10678
- Realtime API - Set 'headers' in scope for websocket auth requests + reliability fix infinite loop when model_name not found for realtime models by @krrishdholakia in #10679
- Extract 'thinking' from nova response + Add 'drop_params' support for gpt-image-1 by @krrishdholakia in #10680
- New azure models by @emerzon in #9956
- Add GPTLocalhost to "docs/my-website/docs/projects" by @GPTLocalhost in #10687
- Add nscale support for streaming by @tomukmatthews in #10698
New Contributors
- @GPTLocalhost made their first contribution in #10687
Full Changelog: v1.68.1.dev4...v1.68.2-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.2-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 223.07673508503882 | 6.209370359620187 | 0.0033419646714855688 | 1858 | 1 | 75.31227999999146 | 4978.849046000022 |
Aggregated | Passed ✅ | 190.0 | 223.07673508503882 | 6.209370359620187 | 0.0033419646714855688 | 1858 | 1 | 75.31227999999146 | 4978.849046000022 |
v1.68.1.dev4
What's Changed
- Contributor PR - Return 404 when `delete_verification_tokens` (POST `/key/delete`) fai… by @ishaan-jaff in #10605
- Fix otel - follow genai semantic conventions + support 'instructions' param for tts by @krrishdholakia in #10608
- make openai model O series conditional accept provider/model by @aholmberg in #10591
- add gemini-2.5-pro-preview-05-06 model prices and context window by @marty-sullivan in #10597
- Fix: Ollama integration KeyError when using JSON response format by @aravindkarnam in #10611
- [Feat] V2 Emails - Fixes for sending emails when creating keys + Resend API support by @ishaan-jaff in #10602
- [Feat] Add User invitation emails when inviting users to litellm by @ishaan-jaff in #10615
- [Fix] SCIM - Creating SCIM tokens on Admin UI by @ishaan-jaff in #10628
- Filter on logs table by @NANDINI-star in #10644
- [Feat] Bedrock Guardrails - Add support for PII Masking with bedrock guardrails by @ishaan-jaff in #10642
- [Feat] Add endpoints to manage email settings by @ishaan-jaff in #10646
- Contributor PR - MCP Server DB Schema (#10634) by @ishaan-jaff in #10641
- Ollama - fix custom price cost tracking + add 'max_completion_token' support by @krrishdholakia in #10636
- fix cerebras llama-3.1-70b model_prices_and_context_window, not llama3.1-70b by @xsg22 in #10648
- Fix cache miss for gemini models with response_format by @casparhsws in #10635
- Add user management functionality to Python client library & CLI by @msabramo in #10627
- [BETA] Support unified file id (managed files) for batches by @krrishdholakia in #10650
- Fix Slack alerting not working if using a DB by @hypermoose in #10370
- Add support for Nscale (EU-Sovereign) Provider by @tomukmatthews in #10638
- Add New Perplexity Models by @keyute in #10652
New Contributors
- @aholmberg made their first contribution in #10591
- @aravindkarnam made their first contribution in #10611
- @xsg22 made their first contribution in #10648
- @casparhsws made their first contribution in #10635
- @hypermoose made their first contribution in #10370
- @tomukmatthews made their first contribution in #10638
- @keyute made their first contribution in #10652
Full Changelog: v1.68.1-nightly...v1.68.1.dev4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.1.dev4
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 233.10816080888745 | 6.241336822394705 | 0.0 | 1868 | 0 | 166.93079599997418 | 5406.457653000075 |
Aggregated | Passed ✅ | 190.0 | 233.10816080888745 | 6.241336822394705 | 0.0 | 1868 | 0 | 166.93079599997418 | 5406.457653000075 |
v1.68.1.dev2
Full Changelog: v1.68.1.dev1...v1.68.1.dev2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.1.dev2
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 271.34034604220733 | 6.1752223755996924 | 0.0 | 1848 | 0 | 206.34432800000013 | 5012.736279000023 |
Aggregated | Passed ✅ | 240.0 | 271.34034604220733 | 6.1752223755996924 | 0.0 | 1848 | 0 | 206.34432800000013 | 5012.736279000023 |
v1.68.1.dev1
What's Changed
- Github: Increase timeout of litellm tests by @zoltan-ongithub in #10568
- [Docs] Change llama-api link for litellm by @seyeong-han in #10556
- [Feat] v2 Custom Logger API Endpoints by @ishaan-jaff in #10575
- [Bug fix] JSON logs - Ensure only 1 log is emitted (previously duplicate json logs were getting emitted) by @ishaan-jaff in #10580
- Update gemini-2.5-pro-exp-03-25 max_tokens to 65,535 by @mkavinkumar1 in #10548
- Update instructor.md by @thomelane in #10549
- fix issue when Databricks uses an external model, the delta could be empty… by @frankzye in #10540
- Add `litellm-proxy` CLI (#10478) by @ishaan-jaff in #10578
New Contributors
- @zoltan-ongithub made their first contribution in #10568
- @mkavinkumar1 made their first contribution in #10548
- @thomelane made their first contribution in #10549
- @frankzye made their first contribution in #10540
Full Changelog: v1.68.0-nightly...v1.68.1.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.68.1.dev1
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 244.34719839029643 | 6.203411663807808 | 0.0 | 1855 | 0 | 183.31073700005618 | 5362.244745999988 |
Aggregated | Passed ✅ | 210.0 | 244.34719839029643 | 6.203411663807808 | 0.0 | 1855 | 0 | 183.31073700005618 | 5362.244745999988 |