OpenAI realtime support on LiteLLM? #7423
-
Now that you've implemented cost tracking on /v1/realtime, we would use WebRTC for the realtime models through the proxy if it were available. Implementing speech-to-speech is doable with WebSockets, but a lot more challenging. I also don't think your current implementation works with the browser WebSocket API at all, since the browser constructor doesn't let you set custom headers. Consider the example provided by OpenAI, where subprotocols are used instead of headers to supply auth info:

```js
/*
  Note that in client-side environments like web browsers, we recommend
  using WebRTC instead. It is possible, however, to use the standard
  WebSocket interface in browser-like environments like Deno and
  Cloudflare Workers.
*/
const ws = new WebSocket(
  "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17",
  [
    "realtime",
    // Auth
    "openai-insecure-api-key." + OPENAI_API_KEY,
    // Optional
    "openai-organization." + OPENAI_ORG_ID,
    "openai-project." + OPENAI_PROJECT_ID,
    // Beta protocol, required
    "openai-beta.realtime-v1",
  ]
);

// Standard WebSocket (browser/Deno/Workers) uses addEventListener,
// not the Node ws package's .on():
ws.addEventListener("open", () => {
  console.log("Connected to server.");
});

ws.addEventListener("message", (message) => {
  console.log(message.data);
});
```
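On the proxy side, accepting this wouldn't take much code. Here's a minimal sketch of the idea, assuming the Node `ws` package; the port, upstream URL, and everything else here are illustrative assumptions, not LiteLLM's actual implementation:

```js
// Sketch only: accept browser-style subprotocol auth, re-emit it
// upstream as a normal Authorization header.
import { WebSocketServer, WebSocket } from "ws";

const wss = new WebSocketServer({
  port: 8080,
  // Browsers abort the handshake unless the server echoes back one of
  // the offered subprotocols, so select the "realtime" entry.
  handleProtocols: (protocols) => (protocols.has("realtime") ? "realtime" : false),
});

wss.on("connection", (client, request) => {
  // All subprotocols the client offered arrive in one comma-separated header.
  const offered = (request.headers["sec-websocket-protocol"] || "")
    .split(",")
    .map((p) => p.trim());

  // Recover the key smuggled in as an "openai-insecure-api-key.<KEY>" entry.
  const keyEntry = offered.find((p) => p.startsWith("openai-insecure-api-key."));
  if (!keyEntry) {
    client.close(1008, "missing api key subprotocol");
    return;
  }
  const apiKey = keyEntry.slice("openai-insecure-api-key.".length);

  // Dial upstream with ordinary headers, which server-side clients can set.
  const upstream = new WebSocket(
    "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17",
    { headers: { Authorization: `Bearer ${apiKey}`, "OpenAI-Beta": "realtime-v1" } }
  );

  // Relay frames in both directions once the upstream socket is open.
  upstream.on("open", () => {
    client.on("message", (data) => upstream.send(data));
    upstream.on("message", (data) => client.send(data));
  });
  client.on("close", () => upstream.close());
  upstream.on("close", () => client.close());
});
```

The point is that the key rides in on the Sec-WebSocket-Protocol header, which every browser can set, and the proxy translates it into the Authorization header the upstream expects.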
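For comparison, the WebRTC path avoids the header problem entirely: the only authenticated step is an SDP offer/answer exchange over plain HTTPS, so a proxy would just need to broker one POST. A rough browser-side sketch following OpenAI's documented handshake; `EPHEMERAL_KEY` is assumed to be minted by your own backend, and nothing below is a LiteLLM route today:

```js
// Sketch of the browser-side WebRTC handshake per OpenAI's docs.
const pc = new RTCPeerConnection();

// Events (transcripts, tool calls, etc.) flow over a data channel;
// audio is attached separately as ordinary media tracks.
const dc = pc.createDataChannel("oai-events");
dc.addEventListener("message", (e) => console.log(e.data));

const offer = await pc.createOffer();
await pc.setLocalDescription(offer);

// The SDP exchange is a plain HTTPS POST, so a normal
// Authorization header works here.
const resp = await fetch(
  "https://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-12-17",
  {
    method: "POST",
    body: offer.sdp,
    headers: {
      Authorization: `Bearer ${EPHEMERAL_KEY}`,
      "Content-Type": "application/sdp",
    },
  }
);

await pc.setRemoteDescription({ type: "answer", sdp: await resp.text() });
```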
-
It would also be interesting to see AWS's realtime implementation supported. Certainly not a priority for us, but I could see competitive realtime models being added to Bedrock in the future.
-
Starting a discussion for LiteLLM's implementation of OpenAI realtime endpoint support.
If you have a request or feedback, please leave a comment.
Open Questions: