Description
I'm opening this issue to request a couple of features that would improve integration with Vertex AI and help reduce GCP usage costs.
- Support for Vertex AI Service Accounts
It appears the library currently requires an API key for authentication. It would be very helpful if it could also support authentication via a Vertex AI service account JSON key file (see the sketch below).
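
As a rough illustration of what I have in mind, the Vertex AI SDK can already be initialized with service account credentials instead of an API key. This is only a sketch; the project ID, location, model name, and key file path are placeholders, not values from this library:

```python
# Sketch: authenticating the Vertex AI SDK with a service account key file.
# All identifiers below (project, location, model, key path) are placeholders.
from google.oauth2 import service_account
import vertexai
from vertexai.generative_models import GenerativeModel

credentials = service_account.Credentials.from_service_account_file(
    "my-service-account.json",  # path to the downloaded JSON key
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# vertexai.init accepts a Credentials object, so no API key is needed.
vertexai.init(
    project="my-gcp-project",
    location="us-central1",
    credentials=credentials,
)

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Summarize this document ...")
print(response.text)
```

If the library exposed an optional `credentials` or key-file path parameter and passed it through to `vertexai.init`, existing API-key users would be unaffected.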
- Support for Batch Inference in Vertex AI
When processing a large number of documents, making a separate API call for each one becomes expensive. A more cost-effective approach is to use batching.
It would be great if the library could be updated to collect multiple inputs internally and submit them to the Gemini API as a single batched request (a rough sketch is below). This would significantly cut down on the total number of API calls, leading to substantial cost savings for anyone doing large-scale processing.
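
For reference, the Vertex AI SDK already offers a batch prediction interface for Gemini: requests are written as a JSONL file in Cloud Storage and submitted as one job. The sketch below assumes that interface and uses placeholder bucket, project, and model names; exact attribute names may differ by SDK version:

```python
# Sketch: Vertex AI batch prediction for Gemini. Many requests are bundled
# into one JSONL file and processed as a single batch job instead of
# one online API call per document. All resource names are placeholders.
import time

import vertexai
from vertexai.batch_prediction import BatchPredictionJob

vertexai.init(project="my-gcp-project", location="us-central1")

# Each line of the input JSONL is one request, e.g.:
# {"request": {"contents": [{"role": "user", "parts": [{"text": "Summarize ..."}]}]}}
job = BatchPredictionJob.submit(
    source_model="gemini-1.5-flash-002",
    input_dataset="gs://my-bucket/batch_input.jsonl",
    output_uri_prefix="gs://my-bucket/batch_output/",
)

# Poll until the job finishes, then read the results from the output prefix.
while not job.has_ended:
    time.sleep(30)
    job.refresh()

if job.has_succeeded:
    print("Results written to:", job.output_location)
else:
    print("Job failed:", job.error)
```

The library could build the JSONL from the documents it is asked to process and then parse the results from the output location, so users get batching without managing the job lifecycle themselves.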