Add ElevenLabs speech synthesis to TTS engine #1995

@martinbedouret

Description

Implement the actual text-to-speech synthesis using ElevenLabs API with audio streaming and playback, following the existing Azure pattern in tts.js.

Acceptance Criteria

  • Extend speak() method in tts.js to handle ElevenLabs voices
  • Implement ElevenLabs TTS API integration with voice settings
  • Add audio blob handling and playback similar to Azure implementation
  • Implement proper error handling for API limits and failures
  • Add fallback to local voices when ElevenLabs fails
  • Implement request queuing for multiple speech requests
  • Add timeout handling and retry logic
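
The last two criteria (request queuing, and timeout handling with retry) could be sketched roughly as below. This is a minimal illustration, not the project's implementation: `withTimeout`, `withRetry`, and `SpeechQueue` are hypothetical names that do not exist in tts.js, and the default retry count, timeout, and backoff delay are placeholder values.

```javascript
// Hypothetical helpers for illustration; none of these names exist in tts.js.

// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Always clear the timer so it cannot fire after the race is decided.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Call `fn` up to `retries + 1` times, waiting longer between attempts.
async function withRetry(fn, { retries = 2, timeoutMs = 10000, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await withTimeout(fn(), timeoutMs);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
        await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// Run queued tasks one at a time, in the order they were added.
class SpeechQueue {
  constructor() {
    this.tail = Promise.resolve();
  }

  enqueue(task) {
    const result = this.tail.then(() => task());
    // Keep the chain usable even if this task rejects.
    this.tail = result.catch(() => {});
    return result;
  }
}
```

Under these assumptions, a caller might wrap each synthesis request as `queue.enqueue(() => withRetry(() => API.synthesizeSpeechElevenLabs(text, voice.voice_id, voice.settings)))`. Note that `withTimeout` only stops waiting; it does not cancel the underlying request, so a real implementation may also want an `AbortController`.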

Technical Implementation Notes

// Extend tts.js speak method
async speak(text, { voiceURI, pitch, rate, volume, onend }, setCloudSpeakAlertTimeout) {
  const voice = this.getVoiceByVoiceURI(voiceURI);

  if (voice && voice.voiceSource === 'elevenlabs') {
    const speakAlertTimeoutId = setCloudSpeakAlertTimeout();

    try {
      const audioBlob = await API.synthesizeSpeechElevenLabs(
        text,
        voice.voice_id,
        voice.settings
      );

      clearTimeout(speakAlertTimeoutId);
      this.playAudioBlob(audioBlob, onend);
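    } catch (err) {
      // Clear the pending alert on failure too, and guard against a missing callback
      clearTimeout(speakAlertTimeoutId);
      console.error('ElevenLabs synthesis error:', err);
      if (typeof onend === 'function') onend({ error: true });
    }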
  } else {
    // Existing logic for local/Azure voices
  }
}
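
The snippet above calls `API.synthesizeSpeechElevenLabs`, which is not defined in this issue. One possible shape for it is sketched below, assuming a direct call to the public ElevenLabs text-to-speech endpoint with an `xi-api-key` header; the project may instead route the request through its own backend. The fourth `options` argument (`apiKey`, `fetchImpl`) and the `rateLimited` flag are hypothetical additions for illustration and testing.

```javascript
// Assumption: direct call to the public ElevenLabs endpoint. The real
// integration may proxy this through the project's own API instead.
const ELEVENLABS_TTS_URL = 'https://api.elevenlabs.io/v1/text-to-speech';

// `apiKey` and `fetchImpl` are hypothetical parameters added for illustration;
// injecting `fetchImpl` lets the function be exercised without network access.
async function synthesizeSpeechElevenLabs(text, voiceId, settings, { apiKey, fetchImpl = fetch } = {}) {
  const response = await fetchImpl(`${ELEVENLABS_TTS_URL}/${encodeURIComponent(voiceId)}`, {
    method: 'POST',
    headers: {
      'xi-api-key': apiKey,
      'Content-Type': 'application/json',
      Accept: 'audio/mpeg'
    },
    // `voice_settings` is omitted from the JSON when `settings` is undefined.
    body: JSON.stringify({ text, voice_settings: settings })
  });

  if (!response.ok) {
    const error = new Error(`ElevenLabs request failed with status ${response.status}`);
    error.status = response.status;
    // HTTP 429 is the standard "Too Many Requests" status.
    error.rateLimited = response.status === 429;
    throw error;
  }

  // The audio comes back as binary data suitable for playAudioBlob.
  return response.blob();
}
```

Attaching `status` and `rateLimited` to the thrown error would let the `catch` branch in `speak()` decide between retrying, surfacing an API-limit message, or falling back to a local voice.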

Files to Modify

  • src/providers/SpeechProvider/tts.js
  • src/providers/SpeechProvider/SpeechProvider.actions.js
