Feature request: Replace Scorer.KenLM with Scorer.Transform

Let’s face it. KenLM has served us well…
…but it has its limitations. It hasn’t aged well as a language model architecture.

The first order of business is to compute a bidirectional vector representation of words, which we need in order to go from an audio representation to a character representation.

For example, word2vec lets you take any word and get a vector that positions it relative to all other words.
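To make the "relative to all other words" idea concrete, here is a minimal sketch using cosine similarity. The three vectors are made up for illustration and are not real word2vec output (real embeddings are learned from a corpus and have far more dimensions); `TOY_VECTORS` and `rank_neighbours` are hypothetical names.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real word2vec embeddings are learned and typically have 100+ dimensions.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_neighbours(word, vectors):
    """Return (other_word, similarity) pairs, most similar first."""
    target = vectors[word]
    pairs = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


print(rank_neighbours("king", TOY_VECTORS))
```

With these toy numbers, "queen" ranks above "apple" for "king", which is the kind of relationship a learned embedding is meant to capture.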

Nowadays we can use a small transformer to achieve this.

Let’s train a transformer on the raw output of our acoustic models and teach it to produce an accurate character representation of our spoken words.
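One way this could plug in is as a swappable scorer behind a small interface, so a transformer-based scorer could stand where the KenLM one does today. The sketch below is only an assumption about the shape of such an interface: `Scorer`, `ToyScorer` and `rescore` are hypothetical names, not the project's actual API, and `ToyScorer` is a trivial stand-in rather than a transformer.

```python
import math
from typing import List, Protocol, Tuple


class Scorer(Protocol):
    """Hypothetical interface: higher score means more plausible text."""

    def score(self, text: str) -> float: ...


class ToyScorer:
    """Stand-in for a trained model: rewards candidates made of known words."""

    def __init__(self, vocabulary):
        self.vocabulary = set(vocabulary)

    def score(self, text: str) -> float:
        words = text.split()
        if not words:
            return float("-inf")
        known = sum(1 for w in words if w in self.vocabulary)
        # Smoothed log-fraction of known words; 0.0 when every word is known.
        return math.log((known + 1) / (len(words) + 1))


def rescore(candidates: List[Tuple[str, float]], scorer: Scorer, alpha: float) -> str:
    """Pick the candidate maximising acoustic log-prob + alpha * scorer score."""
    best_text, _ = max(candidates, key=lambda c: c[1] + alpha * scorer.score(c[0]))
    return best_text


# (text, acoustic log-probability) pairs; the numbers are made up.
candidates = [("i sea the cat", -1.0), ("i see the cat", -1.2)]
scorer = ToyScorer(["i", "see", "the", "cat"])
print(rescore(candidates, scorer, alpha=2.0))
```

Anything that implements `score` could be dropped in, which would let the two approaches be compared side by side on the same acoustic output.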


This is much smarter than using KenLM and doesn’t need to be more computationally expensive if we scale our transformer accordingly.
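To get a feel for what "scale accordingly" could mean, here is a rough parameter-count estimate for the transformer layers. It counts only the attention projections (4·d²) and the feed-forward weights (2·d·d_ff) per layer, ignoring biases, layer norm and embeddings. The function name and the example sizes are my own assumptions, and parameter count is only a proxy for runtime cost.

```python
def transformer_layer_params(d_model, n_layers, d_ff=None):
    """Approximate weight count of the transformer layers.

    Per layer: 4 * d_model^2 for the Q, K, V and output projections,
    plus 2 * d_model * d_ff for the two feed-forward matrices.
    Biases, layer norm and embeddings are ignored.
    """
    if d_ff is None:
        d_ff = 4 * d_model  # common default ratio, assumed here
    attention = 4 * d_model * d_model
    feed_forward = 2 * d_model * d_ff
    return n_layers * (attention + feed_forward)


# Example sizes chosen for illustration only.
print(transformer_layer_params(d_model=256, n_layers=4))
```

With `d_ff = 4 * d_model` this reduces to 12·d² per layer, so halving `d_model` cuts the layer weights by roughly a factor of four.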

Feature request: Replace Scorer.KenLM with Scorer.Transform #2348
