
Add gemini audio input support + handle special tokens in sagemaker response #9640


Merged — 9 commits merged into main, Mar 30, 2025

Conversation

@krrishdholakia (Contributor) commented Mar 29, 2025

Title

  • Add gemini audio input support
  • handle special tokens in sagemaker response
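For context on the second bullet: some Sagemaker/Huggingface models emit raw special-token markers (e.g. `<s>`, `<|endoftext|>`) in their text output. A minimal sketch of the idea — not LiteLLM's actual implementation; the helper name and token list below are illustrative — is to strip known markers from the decoded response:

```python
# Hypothetical helper: strip common special-token markers that some
# Huggingface/Sagemaker models leave in their raw text output.
# The token list here is an assumption, not LiteLLM's actual set.
SPECIAL_TOKENS = ["<s>", "</s>", "<|endoftext|>", "<pad>", "<unk>"]

def strip_special_tokens(text: str) -> str:
    for token in SPECIAL_TOKENS:
        text = text.replace(token, "")
    return text.strip()
```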

Relevant issues

Feature - enables calling gemini with audio input (b64 + file)
Fixes #9574 (comment)
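For the audio feature, OpenAI-style chat messages carry base64 audio as an `input_audio` content part. A sketch of building such a payload (pure dict construction, independent of LiteLLM itself; the helper name is illustrative) could look like:

```python
import base64

def build_audio_message(audio_bytes: bytes, audio_format: str = "wav") -> dict:
    # Encode raw audio bytes as base64 for an OpenAI-style "input_audio" part.
    b64_audio = base64.b64encode(audio_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Please transcribe this clip."},
            {
                "type": "input_audio",
                "input_audio": {"data": b64_audio, "format": audio_format},
            },
        ],
    }
```

A message built this way would be passed in the `messages` list of a completion call against a Gemini model.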

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory — adding at least 1 test is a hard requirement (see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on [make test-unit](https://docs.litellm.ai/docs/extras/contributing_code)
  • My PR's scope is as isolated as possible; it only solves 1 specific problem

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes


vercel bot commented Mar 29, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name: litellm — Status: ✅ Ready (Inspect) — Preview: Visit Preview — Comments: 💬 Add feedback — Updated (UTC): Mar 30, 2025 0:57am

]
}
)
print("raw_request: ", raw_request)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High test

This expression logs sensitive data (secret) as clear text. (×2)
This expression logs sensitive data (password) as clear text. (×2)

Copilot Autofix

AI 3 months ago

To fix the problem, we need to ensure that sensitive information is not logged. This can be achieved by scrubbing the raw_request object of any sensitive data before logging it. We can use a utility function to remove or mask sensitive information from the raw_request object before printing it.

  • Add a utility function to scrub sensitive information from the raw_request object.
  • Use this utility function to clean the raw_request object before logging it.
  • Ensure that the changes are made in the tests/llm_translation/base_llm_unit_tests.py file.
Suggested changeset 1
tests/llm_translation/base_llm_unit_tests.py

Autofix patch — run the following command in your local git repository to apply it:
cat << 'EOF' | git apply
diff --git a/tests/llm_translation/base_llm_unit_tests.py b/tests/llm_translation/base_llm_unit_tests.py
--- a/tests/llm_translation/base_llm_unit_tests.py
+++ b/tests/llm_translation/base_llm_unit_tests.py
@@ -11,2 +11,9 @@
 
+def scrub_sensitive_data(data):
+    if isinstance(data, dict):
+        for key in ["client_secret", "api_key", "azure_ad_token", "azure_username", "azure_password"]:
+            if key in data:
+                data[key] = "REDACTED"
+    return data
+
 sys.path.insert(
@@ -954,2 +961,3 @@
         )
+        raw_request = scrub_sensitive_data(raw_request)
         print("raw_request: ", raw_request)
EOF
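As a quick sanity check, the suggested scrubber can be exercised standalone (function body and key list copied from the patch above):

```python
def scrub_sensitive_data(data):
    # Redact well-known credential keys before logging (mirrors the autofix patch).
    if isinstance(data, dict):
        for key in ["client_secret", "api_key", "azure_ad_token", "azure_username", "azure_password"]:
            if key in data:
                data[key] = "REDACTED"
    return data

raw_request = {"api_key": "sk-123", "model": "gemini-2.0-flash"}
scrubbed = scrub_sensitive_data(raw_request)
print("raw_request: ", scrubbed)
```

Note the scrub is shallow: only top-level keys are redacted, so nested dicts (e.g. headers inside the request body) would pass through unchanged.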
@krrishdholakia changed the title from "Add gemini audio input support" to "Add gemini audio input support + handle special tokens in sagemaker response" Mar 30, 2025
@krrishdholakia merged commit 5c107c6 into main Mar 30, 2025
36 of 41 checks passed
@krrishdholakia deleted the litellm_dev_03_29_2025_p1 branch March 30, 2025 07:10
@jbellis commented Mar 31, 2025

I'm getting `litellm.exceptions.UnsupportedParamsError: litellm.UnsupportedParamsError: gemini does not support parameters: {'modalities': ['text', 'audio']}, for model=gemini-2.0-flash` — is this just not in the docker images yet?

Development

Successfully merging this pull request may close these issues.

[Bug]: Sagemaker / Huggingface - LiteLLM doesn't handle model output which it classifies as special tokens
2 participants