Open
Labels
enhancement (New feature or request)
Description
Let’s face it. KenLM has served us well…
…but it has its limitations. It hasn’t aged well as a language model architecture.
The first order of business is to compute a bidirectional vector representation of words, so that we can go from an audio representation to a character representation.
For example, word2vec lets you take any word and get a vector that places it relative to all other words.
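
To make the word2vec idea concrete, here is a minimal sketch in plain Python. The vectors are hand-picked toy values (an assumption for illustration, not real word2vec output), and `TOY_VECTORS`, `cosine_similarity` and `most_similar` are hypothetical names, not part of any library.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# Real word2vec vectors are learned from a corpus and are much larger.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors=TOY_VECTORS):
    """Rank every other word by similarity to `word`, closest first."""
    query = vectors[word]
    scores = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

With these toy values, `most_similar("cat")` ranks "dog" ahead of "car", which is the "relative to all other words" property described above.
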
Nowadays we can use a small transformer to achieve this.
Let’s train a transformer on the raw output of our acoustic models and teach it to produce an accurate character representation of our spoken words.
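
As a rough sketch of what the training data for such a transformer could look like: assuming the acoustic model emits per-frame probabilities over an alphabet plus a CTC blank (an assumption, as are the tiny `ALPHABET` and the helper names below), the raw output can be collapsed greedily and paired with the reference transcript.

```python
# Hypothetical alphabet; the last index is reserved for the CTC blank symbol.
ALPHABET = ["a", "b", "c", " "]
BLANK = len(ALPHABET)


def greedy_ctc_decode(frame_probs):
    """Take the most likely symbol per frame, merge repeats, drop blanks."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    chars = []
    prev = None
    for idx in best:
        # Emit a character only when the symbol changes and is not the blank.
        if idx != prev and idx != BLANK:
            chars.append(ALPHABET[idx])
        prev = idx
    return "".join(chars)


def make_training_pair(frame_probs, reference_transcript):
    """Build one (noisy input, clean target) example for the transformer."""
    return greedy_ctc_decode(frame_probs), reference_transcript
```

The transformer would then be trained to map the first element of each pair to the second. Feeding it the full per-frame probabilities instead of the collapsed string is another option this sketch does not cover.
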
This is much smarter than using KenLM, and it doesn’t need to be more computationally expensive if we scale our transformer accordingly.
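
To get a feel for what "scaling accordingly" means, here is a back-of-the-envelope parameter estimate for a standard transformer stack. It is a rough approximation that ignores biases, layer norms and positional embeddings, and `transformer_param_count` is a hypothetical helper, not an existing API.

```python
def transformer_param_count(num_layers, d_model, vocab_size, d_ff=None):
    """Approximate weight count of a standard transformer stack."""
    if d_ff is None:
        d_ff = 4 * d_model  # common default ratio, assumed here
    attention = 4 * d_model * d_model  # Q, K, V and output projections
    feed_forward = 2 * d_model * d_ff  # two linear layers per block
    embeddings = vocab_size * d_model  # one vector per output symbol
    return num_layers * (attention + feed_forward) + embeddings
```

For example, `transformer_param_count(2, 64, 32)` comes to 100352 weights for a character-level vocabulary of 32 symbols, so the model size can be tuned with `num_layers` and `d_model` to fit whatever compute budget we set.
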
RobinE89 and carlfm01