I’ve noticed that the training data for the phonemizer seems to consist entirely of isolated words. This may cause the model to consistently produce canonical (dictionary-style) pronunciations and fail to handle context-sensitive phonetic variation. For example, it always predicts “the” as /ðə/, regardless of the following word, even though /ðiː/ is expected when the next word begins with a vowel sound (e.g. words starting with a, e, i, o, u). It would be great to consider incorporating sentence-level data to account for connected-speech phenomena such as weak forms, assimilation, and linking.
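To make the “the” case concrete, here is a minimal rule-based sketch of the kind of context sensitivity that isolated-word training data cannot teach. All names and phoneme strings below are illustrative only and are not from this phonemizer’s API; a learned model would ideally pick this up from sentence-level data rather than a hand-written rule.

```python
# Hypothetical post-pass over word-level phoneme predictions.
# Simplified set of IPA vowel symbols; a real system would check vowel
# *sounds* (e.g. "hour" /aʊɚ/ qualifies, "university" /juː.../ does not).
VOWEL_PHONEMES = set("aeiouɑɛɪɔʊæʌəː")

def adjust_the(words, phonemes):
    """Rewrite /ðə/ -> /ðiː/ when the next word's phonemes start with a vowel."""
    out = list(phonemes)
    for i, (word, phon) in enumerate(zip(words, phonemes)):
        if word.lower() == "the" and phon == "ðə" and i + 1 < len(phonemes):
            nxt = phonemes[i + 1]
            if nxt and nxt[0] in VOWEL_PHONEMES:
                out[i] = "ðiː"
    return out

words = ["the", "apple", "and", "the", "pear"]
phons = ["ðə", "æpəl", "ænd", "ðə", "pɛr"]
print(adjust_the(words, phons))  # ['ðiː', 'æpəl', 'ænd', 'ðə', 'pɛr']
```

Note that the rule keys off the *phonemes* of the following word, not its spelling, which is exactly the context a word-in-isolation model never sees.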