Audio transcript in Gemini Live API not really working #951

@tjasmin111

Description of the bug:

I'm trying to add transcription to the Gemini Live demo code, following Google's documented audio transcription capability: https://ai.google.dev/gemini-api/docs/live-guide#audio-transcription

But the transcription arrives in small fragments, as shown below. Am I missing anything? Are there any extra flags to set?

[Model Transcript]:  Ca
[Model Transcript]: n I
[Model Transcript]:  pl
[Model Transcript]: eas
[Model Transcript]: e h
[Model Transcript]: ave
[Model Transcript]:  yo
[Model Transcript]: ur
[Model Transcript]: acc
[Model Transcript]: oun
[Model Transcript]: t n
[Model Transcript]: umb
[Model Transcript]: er

Actual vs expected behavior:

I expect it to write the full sentence clearly: "Can I please have your account number". This is how Amazon's Nova Sonic works.
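
For reference, a minimal sketch of the behavior I'm after, assuming the fragments above are incremental chunks that are meant to be concatenated and printed once per turn. `TranscriptBuffer` and `join_fragments` are hypothetical helper names, not part of any SDK. The sample input in the usage note includes a trailing space after "ur" that the log above does not show.

```python
class TranscriptBuffer:
    """Collects streamed transcript fragments and emits one string per turn."""

    def __init__(self):
        self._parts = []

    def add(self, fragment):
        # Keep fragments verbatim: leading/trailing spaces mark word boundaries.
        if fragment:
            self._parts.append(fragment)

    def flush(self):
        # Join with no separator, trim outer whitespace, and reset the buffer.
        text = "".join(self._parts).strip()
        self._parts.clear()
        return text


def join_fragments(fragments):
    """Convenience wrapper: feed every fragment, then flush once."""
    buf = TranscriptBuffer()
    for fragment in fragments:
        buf.add(fragment)
    return buf.flush()
```

In a receive loop this would mean calling `add()` for every transcript chunk and printing `flush()` only when the server signals the end of the turn. With the google-genai Python SDK those would presumably be `server_content.output_transcription.text` and `server_content.turn_complete`, but that mapping is an assumption here. For example, `join_fragments([" Ca", "n I", " pl", "eas", "e h", "ave", " yo", "ur ", "acc", "oun", "t n", "umb", "er"])` gives the single sentence instead of thirteen separate lines.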

Any other information you'd like to share?

No response
