Description
Expected Behavior
It should be possible to send an audio sample with speech (in supported audio formats) to the OpenAI transcription endpoint and retrieve the extracted speech as text.
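For illustration, here is a minimal sketch of what such a call could look like against the raw HTTP API, using only the JDK's `java.net.http` client. The class and method names (`TranscriptionSketch`, `multipartBody`, `buildRequest`) are hypothetical and not part of any existing integration. The `whisper-1` model name and the `OPENAI_API_KEY` environment variable are assumptions for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not the integration's API.
public class TranscriptionSketch {

    static final String ENDPOINT = "https://api.openai.com/v1/audio/transcriptions";

    // Builds a multipart/form-data body with a "model" text part and a "file" binary part.
    static byte[] multipartBody(String boundary, String model, String filename, byte[] audio)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"model\"\r\n\r\n"
                + model + "\r\n"
                + "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + filename + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n";
        out.write(head.getBytes(StandardCharsets.UTF_8));
        out.write(audio);
        // Closing delimiter terminates the multipart body.
        out.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    // Assembles the POST request; "whisper-1" is an assumed model name.
    static HttpRequest buildRequest(String apiKey, String filename, byte[] audio)
            throws IOException {
        String boundary = "sketch-boundary-" + System.nanoTime();
        byte[] body = multipartBody(boundary, "whisper-1", filename, audio);
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: TranscriptionSketch <audio-file>");
            return;
        }
        Path audioFile = Path.of(args[0]);
        HttpRequest request = buildRequest(
                System.getenv("OPENAI_API_KEY"), // assumed to hold the API key
                audioFile.getFileName().toString(),
                Files.readAllBytes(audioFile));
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // By default the response body is JSON with the transcript in its "text" field.
        System.out.println(response.body());
    }
}
```

A first-class integration would presumably hide the multipart handling behind a client abstraction and map the response to a typed result; the sketch only shows the request shape.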
Current Behavior
Speech-to-text is currently not supported by the OpenAI integration, so users must fall back to custom implementations.
See: https://github.com/thomasdarimont/quadropole-welcome-hero/blob/main/src/main/java/com/welcomehero/app/openai/OpenAiFacade.java#L34
Context
The motivating use case was building an AI-augmented information kiosk for hospitals that lets visitors, patients, and non-native-speaking staff ask spoken questions about the hospital environment, with answers based on a controlled knowledge base.
See: https://github.com/thomasdarimont/quadropole-welcome-hero
Greetings to the team :)