Batches V2 Implementation / Managed Files #9632
Replies: 11 comments 3 replies
-
-
Gemini has a nice implementation of something like this at the SDK level. Noting down for reference:

```python
from google import genai

client = genai.Client()

myfile = client.files.upload(file='media/sample.mp3')

response = client.models.generate_content(
    model='gemini-2.0-flash',
    contents=['Describe this audio clip', myfile]
)

print(response.text)
```
-
Thanks for brainstorming on this @krrishdholakia. I have 2 questions:
-
Thanks for clarifying Krrish!
Follow-up question on the first part: does that mean you would enforce a requirement that users provide an S3/GCS/Azure Blob location in order to use batch processing with LiteLLM?
This would be very helpful!
-
Updating based on feedback from another user.

Updated file creation flow:

```python
file = client.files.create(
    file=wav_data,
    purpose="user_data",
    extra_body={"target_model_names": ["gemini-2.0-flash", "gpt-3.5-turbo"]}  # copy files in this moment, store 'file_id' reference in db
)
```

I think this will remove the need for an S3/GCS bucket for LiteLLM @jaivep2323
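To make the "store 'file_id' reference in db" step concrete, here's a minimal sketch of the bookkeeping the proxy could do when `target_model_names` is provided: upload the file to each target provider, then store one unified id mapped to the per-provider file ids. The function name, id format, and file ids below are all illustrative, not LiteLLM's actual implementation.

```python
import uuid

def register_managed_file(db: dict, provider_file_ids: dict) -> str:
    """Store one unified file id -> {model_name: provider_file_id} mapping.

    Hypothetical sketch: 'db' stands in for the Proxy DB table.
    """
    unified_id = f"litellm_managed_file:{uuid.uuid4()}"
    db[unified_id] = provider_file_ids
    return unified_id

db = {}
unified = register_managed_file(db, {
    "gemini-2.0-flash": "files/abc123",  # hypothetical id from the Gemini upload
    "gpt-3.5-turbo": "file-xyz789",      # hypothetical id from the OpenAI upload
})

# The user only ever sees 'unified'; the proxy resolves it per deployment.
print(db[unified]["gpt-3.5-turbo"])
```

The upside of this shape is that routing stays late-bound: the provider-specific file id is only looked up when a request actually lands on that deployment.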
-
Noticed:
-
Initial PR here
-
DAY 2 Work: Files API
-
More improvements:
-
v1.69.0 will be live tomorrow. This release brings LiteLLM Managed File support to Batches, making it easier to call + control access to Azure batch endpoints. LiteLLM Managed Files lets you:
Here's a guide on how to use LiteLLM with LiteLLM Managed Files - https://docs.litellm.ai/docs/proxy/managed_batches
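For context on what a batch submission carries: the input file for an OpenAI-compatible `/v1/batches` endpoint is a `.jsonl` where each line is one self-contained request. A minimal sketch of building such a line (the `custom_id` and model name are illustrative):

```python
import json

def build_batch_line(custom_id: str, model: str, prompt: str) -> str:
    # One line of a /v1/batches input .jsonl: an OpenAI-format
    # chat-completion request with a caller-chosen custom_id.
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

line = build_batch_line("req-1", "gemini-2.0-flash", "Describe this audio clip")
print(line)
```

Because each line names its own `model`, a proxy holding the raw `.jsonl` can rewrite or route lines per deployment before submitting upstream.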
-
Closing as this is now live.
-
Writing down an idea for an improvement on the current batches implementation. Key challenges:
tldr; if we store the `.jsonl` in the Proxy DB, we could easily go across providers by doing the create/batch operation when required on that deployment.
1. Setup config.yaml
2. User sends file to proxy
3. User can create batch request
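A hedged sketch of what step 1's config.yaml could look like, listing the deployments the proxy can route batch work to. Model names and env var references are illustrative; check the LiteLLM proxy docs for the exact schema:

```yaml
model_list:
  - model_name: gemini-2.0-flash
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GEMINI_API_KEY
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY
```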