Adding support for transcriptions ( and OpenAI Whisper model) #300

michaellavelle · 2024-02-06T12:37:49Z

No description provided.

michaellavelle · 2024-02-06T12:39:30Z

.../spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiTranscriptionClient.java

+
+			Transcript transcript = new Transcript(transcription.text());
+
+			RateLimit rateLimits = OpenAiResponseHeaderExtractor.extractAiResponseHeaders(transcriptionEntity);


I'm reusing the RateLimit and Usage classes from the chat packages here, but given that they are now used outside of chat, should they be moved into a more generic package?

I am not sure yet; we need to review this in the 0.9.0 timeline. I suspect there are cross-cutting concerns. Right now we are wrapping up the other direction, input options, to support a tiered structure of portable and model-specific options. The same analysis, and likely the same approach, needs to apply to the metadata coming out of the response.
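
For readers following the thread, the "tiered structure" idea applied to response metadata could look roughly like the sketch below. This is only an illustration under stated assumptions: every type, field, and method name in it is hypothetical, and none of it is taken from the PR or from the Spring AI API.

```java
// Hypothetical sketch of tiered response metadata; not the actual Spring AI types.
public class TieredMetadataSketch {

    /** Portable tier: metadata that any model response could expose. */
    interface ResponseMetadata {
        long requestsRemaining();
    }

    /** Model-specific tier: adds a field only a transcription response would carry. */
    interface TranscriptionResponseMetadata extends ResponseMetadata {
        String detectedLanguage();
    }

    /** Simple immutable implementation used for the demonstration. */
    record SimpleTranscriptionMetadata(long requestsRemaining, String detectedLanguage)
            implements TranscriptionResponseMetadata {
    }

    /** Portable code depends only on the generic tier. */
    static String describe(ResponseMetadata metadata) {
        return "requests remaining: " + metadata.requestsRemaining();
    }

    public static void main(String[] args) {
        TranscriptionResponseMetadata metadata = new SimpleTranscriptionMetadata(42, "en");
        // Generic callers see only the portable view.
        System.out.println(describe(metadata));
        // Model-aware callers can still reach the specific fields.
        System.out.println("language: " + metadata.detectedLanguage());
    }
}
```

The point of the split is that code written against the portable interface keeps working across models, while callers that know they are dealing with a transcription can still read the extra fields.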

michaellavelle · 2024-02-06T15:23:31Z

Please see #149

markpollack · 2024-02-06T17:13:38Z

Thanks for the contribution, this looks like a very strong PR at first glance!

hemeda3 · 2024-02-12T08:34:51Z

Thank you.
Your PR inspired me to do the Text To Audio (Speech API).

Added support for OpenAI Text to Audio (Speech API ) #317

tzolov · 2024-03-05T12:49:43Z

Thank you @michaellavelle.
I've replaced the lower-level client and removed the code from spring-ai-core. We will need at least a few audio and transcription client implementations before we can generalise.
Reworked, rebased, squashed and merged at 7d04167.

Adding support for transcriptions

8f6f9b0

michaellavelle force-pushed the transcriptions branch from bf4a5c1 to 8f6f9b0 Compare February 6, 2024 13:46

michaellavelle mentioned this pull request Feb 6, 2024

Support for Speech-to-Text via OpenAI audio transcriptions #149

Closed

markpollack added this to the 0.9.0 milestone Feb 12, 2024

markpollack modified the milestones: 0.9.0, 0.8.1 Feb 29, 2024

markpollack requested a review from tzolov February 29, 2024 15:54

tzolov added the enhancement (New feature or request) and model client labels Mar 1, 2024

tzolov self-assigned this Mar 1, 2024

tzolov closed this Mar 5, 2024

tzolov mentioned this pull request Mar 6, 2024

Added support for OpenAI Text to Audio (Speech API ) #317

Closed
