Replies: 1 comment
-
It would be better to segment your input text at punctuation boundaries and keep each segment to around 15 seconds of audio.
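A minimal sketch of that kind of punctuation-based segmentation, in plain Python. The helper name `split_text` and the `max_chars` parameter are hypothetical and not part of NeMo. The 200-character default is only an assumed stand-in for "around 15 seconds", since the real duration depends on the voice and speaking rate, so tune it for your model.

```python
import re


def split_text(text, max_chars=200):
    """Split text into segments that end at punctuation where possible.

    max_chars is an assumed proxy for segment duration; adjust it until
    the synthesized segments come out at roughly the length you want.
    """
    # Break after sentence-ending or clause-level punctuation that is
    # followed by whitespace, keeping the punctuation with its clause.
    pieces = re.split(r"(?<=[.!?;:,])\s+", text.strip())

    segments = []
    current = ""
    for piece in pieces:
        if not piece:
            continue
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= max_chars:
            # Still within budget: keep growing the current segment.
            current = candidate
        else:
            # Budget exceeded: close the current segment and start a new one.
            if current:
                segments.append(current)
            # A single clause longer than max_chars is kept whole rather
            # than being cut in the middle of a word.
            current = piece
    if current:
        segments.append(current)
    return segments
```

Each returned segment can then be synthesized on its own and the audio concatenated in order. Because the cuts fall on punctuation, the joins land where a speaker would naturally pause, which may make the seams less audible.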
-
I am running into an out-of-memory error when running the TTS model on long sentences. As a workaround, I split the text, process it in chunks, and then combine the resulting audio files. However, this produces small glitches in the final audio at the points where the chunks were merged. Is there any other way to convert long text to audio in one go, without running out of memory?
Models I am using:
spectrogram_model="tts_en_fastpitch"
vocoder_model="tts_hifigan"