feat: Add support for word-level progress tracking in TextToSpeech #131

@Dhruv-1105

Is your feature request related to a problem? Please describe:
Many applications that use text-to-speech (TTS) need to track which word is being spoken, for example to highlight text in sync with the audio. The @capacitor-community/text-to-speech package currently offers no way to get real-time updates on the word being spoken, which limits its usefulness in these scenarios.

Describe the solution you'd like:
I propose adding support for an onRangeStart event that emits the start and end indices of the currently spoken word, along with the spoken word itself. This feature would allow developers to track which word is being spoken in real-time and implement functionalities such as synchronized text highlighting.
The implementation involves the following changes:

  • TextToSpeech.java:
    Added an UtteranceProgressListener that listens for onRangeStart events and passes the start and end indices of the spoken word, along with the word itself, to the result callback.
    @Override
    public void onRangeStart(String utteranceId, int start, int end, int frame) {
        // Available on Android API level 26+. start/end are offsets into the text passed to speak().
        String spokenWord = text.substring(start, end);
        Log.d("TTS", "Spoken word: " + spokenWord);
        resultCallback.onRangeStart(start, end, spokenWord);
    }
  • TextToSpeechPlugin.java:
    Added a callback method that handles onRangeStart and emits the event to JavaScript listeners.
    @PluginMethod
    public void speak(PluginCall call) {
        // existing code...
        SpeakResultCallback resultCallback = new SpeakResultCallback() {
            @Override
            public void onRangeStart(int start, int end, String spokenWord) {
                JSObject ret = new JSObject();
                ret.put("start", start);
                ret.put("end", end);
                ret.put("spokenWord", spokenWord);
                // Emit an event instead of calling call.resolve(ret):
                // resolving here would complete speak() on the first word.
                notifyListeners("onRangeStart", ret);
            }
        };
        // existing code...
    }
  • definitions.ts:
    Added an addListener method to listen for onRangeStart events.
    addListener(eventName: 'onRangeStart', listenerFunc: (info: { start: number; end: number; spokenWord: string }) => void): Promise<PluginListenerHandle>;
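
To illustrate how an app might consume this event for synchronized highlighting, here is a minimal sketch. It assumes the payload shape proposed above (start, end, spokenWord). The names RangeStartInfo, splitAtRange, and renderHighlight are hypothetical helpers made up for this example and are not part of the plugin.

```typescript
// Hypothetical payload type mirroring the proposed listener signature.
interface RangeStartInfo {
  start: number;
  end: number;
  spokenWord: string;
}

interface HighlightParts {
  before: string;
  word: string;
  after: string;
}

// Split the text around the half-open range [start, end).
// Indices are clamped so a stale or out-of-range event cannot throw.
function splitAtRange(text: string, start: number, end: number): HighlightParts {
  const s = Math.max(0, Math.min(start, text.length));
  const e = Math.max(s, Math.min(end, text.length));
  return {
    before: text.slice(0, s),
    word: text.slice(s, e),
    after: text.slice(e),
  };
}

// Render the text with the current word wrapped in brackets.
// A real UI would wrap the word in a styled element instead.
function renderHighlight(text: string, info: RangeStartInfo): string {
  const { before, word, after } = splitAtRange(text, info.start, info.end);
  return `${before}[${word}]${after}`;
}

// Assumed wiring, if the proposed event were available:
//   const handle = await TextToSpeech.addListener('onRangeStart', (info) => {
//     label.textContent = renderHighlight(text, info);
//   });
//   await TextToSpeech.speak({ text });
//   await handle.remove();
```

Deriving the highlighted word from start and end, rather than relying only on spokenWord, keeps the highlight correct when the same word appears more than once in the text.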

Describe alternatives you've considered:
An alternative approach could be to periodically poll the TTS engine for its current progress, but this would be less efficient and more complex to implement. Integrating directly with the UtteranceProgressListener provides a more reliable and accurate solution.

Additional context:
This feature is critical for applications that need to provide synchronized text highlighting, karaoke-style text displays, or any other feature that requires real-time tracking of spoken words. Adding this capability to the @capacitor-community/text-to-speech package will significantly enhance its usability for a broader range of applications.
