
Add gemini audio input support + handle special tokens in sagemaker response #9640


Merged — 9 commits merged into main, Mar 30, 2025

Conversation

@krrishdholakia (Contributor) commented Mar 29, 2025

Title

  • Add gemini audio input support
  • handle special tokens in sagemaker response
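For context on the second bullet: some Sagemaker/Huggingface models emit raw special-token markers (e.g. `<s>`, `<|endoftext|>`) in their text output. A minimal sketch of the idea — not LiteLLM's actual implementation; the helper name and token list below are illustrative — is to strip known markers from the decoded response:

```python
# Hypothetical helper: strip common special-token markers that some
# Huggingface/Sagemaker models leave in their raw text output.
# The token list here is an assumption, not LiteLLM's actual set.
SPECIAL_TOKENS = ["<s>", "</s>", "<|endoftext|>", "<pad>", "<unk>"]

def strip_special_tokens(text: str) -> str:
    for token in SPECIAL_TOKENS:
        text = text.replace(token, "")
    return text.strip()
```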

Relevant issues

Feature - enables calling gemini with audio input (b64 + file)
Fixes #9574 (comment)
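For the audio feature, OpenAI-style chat messages carry base64 audio as an `input_audio` content part. A sketch of building such a payload (pure dict construction, independent of LiteLLM itself; the helper name is illustrative) could look like:

```python
import base64

def build_audio_message(audio_bytes: bytes, audio_format: str = "wav") -> dict:
    # Encode raw audio bytes as base64 for an OpenAI-style "input_audio" part.
    b64_audio = base64.b64encode(audio_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Please transcribe this clip."},
            {
                "type": "input_audio",
                "input_audio": {"data": b64_audio, "format": audio_format},
            },
        ],
    }
```

A message built this way would be passed in the `messages` list of a completion call against a Gemini model.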

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory — adding at least 1 test is a hard requirement (see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on [make test-unit](https://docs.litellm.ai/docs/extras/contributing_code)
  • My PR's scope is as isolated as possible; it only solves 1 specific problem

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes


vercel bot commented Mar 29, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name: litellm — Status: ✅ Ready (Inspect) — Preview: Visit Preview — Comments: 💬 Add feedback — Updated (UTC): Mar 30, 2025 0:57am

]
}
)
print("raw_request: ", raw_request)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High test

This expression logs sensitive data (secret) as clear text. (×2)
This expression logs sensitive data (password) as clear text. (×2)

Copilot Autofix

AI 3 months ago

To fix the problem, we need to ensure that sensitive information is not logged. This can be achieved by scrubbing the raw_request object of any sensitive data before logging it. We can use a utility function to remove or mask sensitive information from the raw_request object before printing it.

  • Add a utility function to scrub sensitive information from the raw_request object.
  • Use this utility function to clean the raw_request object before logging it.
  • Ensure that the changes are made in the tests/llm_translation/base_llm_unit_tests.py file.
Suggested changeset 1
tests/llm_translation/base_llm_unit_tests.py

Autofix patch — run the following command in your local git repository to apply it:
cat << 'EOF' | git apply
diff --git a/tests/llm_translation/base_llm_unit_tests.py b/tests/llm_translation/base_llm_unit_tests.py
--- a/tests/llm_translation/base_llm_unit_tests.py
+++ b/tests/llm_translation/base_llm_unit_tests.py
@@ -11,2 +11,9 @@
 
+def scrub_sensitive_data(data):
+    if isinstance(data, dict):
+        for key in ["client_secret", "api_key", "azure_ad_token", "azure_username", "azure_password"]:
+            if key in data:
+                data[key] = "REDACTED"
+    return data
+
 sys.path.insert(
@@ -954,2 +961,3 @@
         )
+        raw_request = scrub_sensitive_data(raw_request)
         print("raw_request: ", raw_request)
EOF
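As a quick sanity check, the suggested scrubber can be exercised standalone (function body and key list copied from the patch above):

```python
def scrub_sensitive_data(data):
    # Redact well-known credential keys before logging (mirrors the autofix patch).
    if isinstance(data, dict):
        for key in ["client_secret", "api_key", "azure_ad_token", "azure_username", "azure_password"]:
            if key in data:
                data[key] = "REDACTED"
    return data

raw_request = {"api_key": "sk-123", "model": "gemini-2.0-flash"}
scrubbed = scrub_sensitive_data(raw_request)
print("raw_request: ", scrubbed)
```

Note the scrub is shallow: only top-level keys are redacted, so nested dicts (e.g. headers inside the request body) would pass through unchanged.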
@krrishdholakia changed the title from "Add gemini audio input support" to "Add gemini audio input support + handle special tokens in sagemaker response" Mar 30, 2025
@krrishdholakia merged commit 5c107c6 into main Mar 30, 2025
36 of 41 checks passed
@krrishdholakia deleted the litellm_dev_03_29_2025_p1 branch March 30, 2025 07:10
@jbellis commented Mar 31, 2025

I'm getting `litellm.exceptions.UnsupportedParamsError: litellm.UnsupportedParamsError: gemini does not support parameters: {'modalities': ['text', 'audio']}, for model=gemini-2.0-flash` — is this just not in the docker images yet?

Development

Successfully merging this pull request may close these issues.

[Bug]: Sagemaker / Huggingface - LiteLLM doesn't handle model output which it classifies as special tokens
2 participants