Setting the synthesizer sentence segmenter for a given language

Hi! Great project.

Currently the pySBD segmenter is [hardcoded to English](https://github.com/idiap/coqui-ai-TTS/blob/4c593c620854d9cd2e177382abf48082f7c9f2ae/TTS/utils/synthesizer.py#L92):

```python
class Synthesizer(nn.Module):
    def __init__(
        self,
        *,
        tts_checkpoint: str | os.PathLike[Any] | None = None,
        tts_config_path: str | os.PathLike[Any] | None = None,
        tts_speakers_file: str | os.PathLike[Any] | None = None,
        tts_languages_file: str | os.PathLike[Any] | None = None,
        vocoder_checkpoint: str | os.PathLike[Any] | None = None,
        vocoder_config: str | os.PathLike[Any] | None = None,
        encoder_checkpoint: str | os.PathLike[Any] | None = None,
        encoder_config: str | os.PathLike[Any] | None = None,
        vc_checkpoint: str | os.PathLike[Any] | None = None,
        vc_config: str | os.PathLike[Any] | None = None,
        model_dir: str | os.PathLike[Any] | None = None,
        voice_dir: str | os.PathLike[Any] | None = None,
        use_cuda: bool = False,
    ) -> None:
        # etc.
        self.seg = self._get_segmenter("en")
```

It would be great if the segmenter language could be configured instead of always being `"en"`. I don't fully understand the API, so I can't confidently suggest the right approach, but maybe the segmenter could follow the `language` argument passed to `tts_to_file`?
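To make the idea concrete, here is a rough sketch of what I have in mind, not a proposal for the actual implementation. `SegmenterCache` and `pysbd_factory` are hypothetical names that do not exist in the codebase. I'm also assuming that pySBD raises `ValueError` for a language it doesn't support, and that region-tagged codes such as `zh-cn` can be reduced to their two-letter prefix.

```python
from __future__ import annotations

from typing import Callable, Protocol


class SentenceSegmenter(Protocol):
    def segment(self, text: str) -> list[str]: ...


def pysbd_factory(lang: str) -> SentenceSegmenter:
    """Build a pySBD segmenter (imported lazily so the sketch loads without pysbd)."""
    import pysbd

    return pysbd.Segmenter(language=lang, clean=True)


class SegmenterCache:
    """Hypothetical helper: one segmenter per language, created on first use."""

    def __init__(
        self,
        factory: Callable[[str], SentenceSegmenter] = pysbd_factory,
        default_lang: str = "en",
    ) -> None:
        self._factory = factory
        self._default_lang = default_lang
        self._cache: dict[str, SentenceSegmenter] = {}

    def get(self, language: str | None = None) -> SentenceSegmenter:
        # Assumption: "zh-cn" style codes map to their two-letter prefix.
        lang = (language or self._default_lang).split("-")[0].lower()
        if lang not in self._cache:
            try:
                self._cache[lang] = self._factory(lang)
            except ValueError:
                # Assumption: the factory raises ValueError for unsupported languages.
                if lang == self._default_lang:
                    raise
                # Fall back to the default segmenter, i.e. today's behaviour.
                self._cache[lang] = self.get(self._default_lang)
        return self._cache[lang]
```

The synthesizer could then look up something like `segmenters.get(language)` right before splitting the text, instead of using the fixed `self.seg`. Where exactly that call belongs is a guess on my part.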

Thanks.
