Hi,
You would leave them as is most of the time. Acoustically, these are common speech patterns, and the model should learn to represent them in the text as they are spoken. Pronunciation, of course, depends on the dataset: optimizing for American English alone and expecting the model to work on, say, British or Scottish English is unrealistic. The model will only be robust to the accents it was trained on.

We train on some 12,000 hours of speech, much of it informal. We normalize numerics and remove punctuation and capitalization, but keep the rest, and it seems to work quite well.
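To make the normalization described above concrete, here is a minimal sketch in plain Python. It is not our actual preprocessing pipeline: the function names `number_to_words` and `normalize_transcript` are hypothetical, the number speller only handles 0 to 999 (longer digit runs are read digit by digit), and it assumes ASCII English text with apostrophes kept for contractions.

```python
import re

# Hypothetical helper tables for spelling out small integers.
_ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty",
         "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out 0..999; assumption: larger values are read digit by digit."""
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] if ones == 0 else f"{_TENS[tens]} {_ONES[ones]}"
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        head = f"{_ONES[hundreds]} hundred"
        return head if rest == 0 else f"{head} {number_to_words(rest)}"
    return " ".join(_ONES[int(d)] for d in str(n))


def normalize_transcript(text: str) -> str:
    """Spell out numerics, lowercase, and strip punctuation; keep the rest."""
    # 1. Replace each run of ASCII digits with its spoken form.
    text = re.sub(
        r"[0-9]+",
        lambda m: f" {number_to_words(int(m.group()))} ",
        text,
    )
    # 2. Remove capitalization.
    text = text.lower()
    # 3. Drop everything except letters, apostrophes, and spaces.
    text = re.sub(r"[^a-z' ]+", " ", text)
    # 4. Collapse repeated whitespace.
    return " ".join(text.split())


print(normalize_transcript("Um, I'm gonna buy 42 apples, y'know?"))
```

Note that informal wording such as "um" and "gonna" passes through untouched; only numerics, punctuation, and case are changed.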