🎅 I WISH genai gateway HAD... #16
A framework to compare outputs/latency from different LLMs in an "experiment" format (feedback from trusek@).
Since the solution is CDK-based, it'd be great to package and publish the construct(s) in a construct library (on PyPI, NPM, etc.) so we can just use it as a component within our own deployments!
Hi! Now, a question/request.
My specific use case needs support for the older Bedrock invoke endpoints (InvokeModel and InvokeModelWithResponseStream).
Unfortunately, simply replacing them with converse won't do the trick, so maybe with your current knowledge you could add support for them, or indicate what would be involved if I wanted to implement them on my side?
Right now, we only support the Converse API. I'm wondering: why do you need the older invoke APIs? I'd like to better understand your needs there. I'm open to adding them in the future if it would bring some value. I would have to add logic to translate from the invoke API format to the OpenAI format before passing the request to LiteLLM, just as I have done for the Converse APIs. If you're looking into doing this yourself in your own project (or opening a PR in this one), look at the middleware/app.py file: https://github.com/aws-samples/genai-gateway/blob/main/middleware/app.py That's where I'm currently doing the translation logic for the converse endpoints.
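To give a rough idea, here is a minimal sketch of what that invoke-to-OpenAI translation could look like for the Claude Messages format. The helper name and exact field mapping are illustrative assumptions, not what's actually in middleware/app.py:

```python
# Hypothetical sketch: translate a Bedrock InvokeModel request body in the
# Anthropic Claude Messages format into an OpenAI-style chat request that
# could be handed to LiteLLM. Illustrative only; not the gateway's actual code.
import json


def claude_invoke_to_openai(invoke_body: bytes, target_model: str) -> dict:
    body = json.loads(invoke_body)

    messages = []
    # Claude's top-level "system" field becomes an OpenAI system message.
    if body.get("system"):
        messages.append({"role": "system", "content": body["system"]})

    # Claude message content may be a plain string or a list of content blocks.
    for msg in body.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):
            content = "".join(
                block.get("text", "")
                for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})

    request = {"model": target_model, "messages": messages}
    # Carry over common generation parameters when present.
    for param in ("max_tokens", "temperature", "top_p", "stop_sequences"):
        if body.get(param) is not None:
            request["stop" if param == "stop_sequences" else param] = body[param]
    return request
```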
Here the answer is quite trivial: I'm trying to create a gateway for an application whose code I can't change, and it uses these old endpoints. The only thing I can influence is the URL. I can add that I tried simply changing the path in the gateway, and the conversion from Bedrock -> OpenAI looked good at first glance. Things started to break down when converting back to the Bedrock format (especially for streaming).
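For context on where that return trip gets tricky, here is a rough non-streaming sketch of mapping an OpenAI-style response back into the Bedrock Claude Messages response shape. Field names are assumed from the Anthropic Messages format, not taken from the gateway code:

```python
# Hypothetical sketch: wrap an OpenAI-style chat completion back into the
# body shape a Bedrock InvokeModel caller expects for Anthropic Claude
# (Messages API). Illustrative only; not the gateway's actual code.
import json


def openai_to_claude_invoke_response(openai_resp: dict) -> bytes:
    choice = openai_resp["choices"][0]
    usage = openai_resp.get("usage", {})

    claude_body = {
        "id": openai_resp.get("id", ""),
        "type": "message",
        "role": "assistant",
        "content": [{"type": "text", "text": choice["message"]["content"] or ""}],
        # Map OpenAI finish reasons onto Claude stop reasons (approximate).
        "stop_reason": "max_tokens" if choice.get("finish_reason") == "length" else "end_turn",
        "usage": {
            "input_tokens": usage.get("prompt_tokens", 0),
            "output_tokens": usage.get("completion_tokens", 0),
        },
    }
    return json.dumps(claude_body).encode("utf-8")
```

The streaming case is harder still, since InvokeModelWithResponseStream callers expect the response re-encoded as an AWS event stream of Claude delta chunks rather than plain JSON.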
Okay, I will look into how difficult it is to support InvokeModel. My main concern is that, unlike Converse, which has a consistent format across all models, I think InvokeModel differs a lot per model, so it may not be simple to implement.
If you tell me what model you're using, perhaps I can initially support just that one to get you unblocked. The way I see it, I will basically need to support all Bedrock InvokeModel formats, detect which format is being used, and then do the conversion based on that? Wondering if you have any thoughts/suggestions here.
@mirodrr2 thanks for the instant response! What I care about most is integration with LiteLLM; I don't have too many requirements regarding models yet, mainly gpt-4o and Claude (3.5, 3.7).
Okay, as a 1.0 version of support for this feature, I can focus on non-streaming conversion of InvokeModel requests in the Claude format to LiteLLM (which would allow you to call gpt-4o and any other model you want as well). I can't give you a timeline, but would that unblock you? Or do you also need streaming?
The mentioned app hits both endpoints, so that would only be a partial success, but I appreciate any help.
Also, are you not able to change the code at all? Because you will still need to make some adjustments to your client instantiation code to inject the API key, as detailed in the readme. There might be some way to disable LiteLLM auth, though. If you're in an isolated network, that could be a possible solution.
Yes, I am aware of that, but it is a separate issue that I will have to somehow get around. For now, for initial testing purposes, I simply hardcoded it on the gateway side, but you are absolutely right. Isolating the network seems like a sensible approach in this case.
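One possible pattern for the client-side key injection mentioned above, assuming the gateway accepts a bearer token in the Authorization header (check the project readme for the actual requirement), is to point boto3 at the gateway URL and attach the header with a botocore event hook:

```python
# Hypothetical sketch: point a Bedrock client at the gateway and inject an
# API key as a bearer token. The header name/scheme is an assumption;
# consult the genai-gateway readme for the real mechanism.
import boto3

GATEWAY_URL = "https://my-genai-gateway.example.com"  # placeholder
API_KEY = "sk-..."  # placeholder

client = boto3.client("bedrock-runtime", endpoint_url=GATEWAY_URL)


def _add_api_key(request, **kwargs):
    # Runs just before the HTTP request is sent. This replaces the SigV4
    # Authorization header, which is fine if the gateway does its own auth.
    request.headers["Authorization"] = f"Bearer {API_KEY}"


# Prefix registration fires for every bedrock-runtime operation.
client.meta.events.register("before-send.bedrock-runtime", _add_api_key)
```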
@lsawaniewski, can you send me a code sample of the exact format you're using to call invoke_model? It would help me with testing.
@mirodrr2 I'll check what I can do, but I have limited access to it myself and I only rely on logs/requests on the gateway side. 🙄 |
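For reference (a generic illustration, not the asker's actual client code), a typical InvokeModel call in the Anthropic Claude Messages format looks roughly like this:

```python
# Generic example of the Anthropic Claude Messages format sent to the
# bedrock-runtime InvokeModel API; model ID and prompt are placeholders.
import json
import boto3

client = boto3.client("bedrock-runtime")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [
        {"role": "user", "content": "Hello, how are you?"},
    ],
}

response = client.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body),
)
print(json.loads(response["body"].read()))
```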
Please create an issue if you can. Good idea.
Made an issue: |
This is a ticket to track a wishlist of items you wish genai gateway had.
COMMENT BELOW 👇 with your request 🔥 - if we have any questions, we'll follow up in comments / via DMs
Respond with ❤️ to any request you would also like to see