Description
Scope check
- This is core LLM communication (not application logic)
- This benefits most users (not just my use case)
- This can't be solved in application code with current RubyLLM
- I read the Contributing Guide
Due diligence
- I searched existing issues
- I checked the documentation
What problem does this solve?
Both Bedrock (Nova Sonic) and OpenAI (Realtime conversations) offer APIs for their speech-to-speech models, which are impressive and useful. I would like to be able to interact with these models through this library.
Proposed solution
Implement something like `RubyLLM.chat.with_speech`.
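To make the idea concrete, here is a rough sketch of the shape such an interface could take. Everything in it is hypothetical: `SpeechSketch`, `EchoTransport`, the `voice:` option, and `say` are names made up for illustration, and none of them exist in RubyLLM today. The transport just echoes audio back so the example runs offline; a real one would presumably wrap a provider's streaming connection.

```ruby
# Hypothetical sketch only: none of these names exist in RubyLLM today.
module SpeechSketch
  # Stand-in for a provider's speech-to-speech connection.
  # It reverses the input bytes so the example runs without network access.
  class EchoTransport
    def exchange(audio_chunk)
      audio_chunk.reverse
    end
  end

  class Chat
    attr_reader :voice, :speech_enabled

    def initialize(transport: EchoTransport.new)
      @transport = transport
      @speech_enabled = false
      @voice = nil
    end

    # Chainable, in the with_* style suggested above.
    # The voice: option is an assumption made for this sketch.
    def with_speech(voice: "default")
      @speech_enabled = true
      @voice = voice
      self
    end

    # Sends one chunk of input audio, yields the reply audio to the
    # block if one is given, and returns the reply.
    def say(audio_chunk)
      raise ArgumentError, "call with_speech first" unless @speech_enabled

      reply = @transport.exchange(audio_chunk)
      yield reply if block_given?
      reply
    end
  end
end

chat = SpeechSketch::Chat.new.with_speech(voice: "example")
chat.say("\x01\x02\x03".b) { |audio| puts "received #{audio.bytesize} bytes" }
```

Keeping the transport behind a small object like this is one way a single `with_speech` call could cover both Bedrock and OpenAI, but that is only a suggestion.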
Why this belongs in RubyLLM
These are LLMs that we would like to use in our Ruby applications, but they are a bit difficult to work with directly. Having a beautiful, supported wrapper would be very nice.