Word Level or Char Level language model? #5

@MagedSaeed

Thanks @patrickvonplaten for this repo, it really helped a lot!

I have a question: which kind of language model works best for CTC decoding, a character-level or a word-level one? I would assume a character-level model is the natural choice, since wav2vec outputs characters. However, the common practice, which I have seen in many repos and posts, seems to be to use a word-level model. Please correct me if I am wrong. If that is indeed the practice, could you elaborate on why word-level language models are preferred over character-level ones?
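To make the premise concrete, here is a minimal sketch of what I mean by the acoustic model working at the character level. It is plain greedy CTC decoding over a toy vocabulary, not code from this repo: the vocabulary, the `greedy_ctc_decode` helper, and the frame IDs are all made up for illustration, and I am assuming the blank is index 0 and `|` is the word delimiter.

```python
from itertools import groupby

# Hypothetical toy vocabulary (an assumption, not the repo's actual vocab):
# index 0 is the CTC blank, "|" marks word boundaries.
VOCAB = ["<pad>", "|", "a", "c", "t", "s"]
BLANK_ID = 0
WORD_DELIMITER = "|"


def greedy_ctc_decode(frame_ids):
    """Collapse per-frame character predictions into a string.

    Steps: merge consecutive repeats, drop blanks, then turn the
    word delimiter into a space.
    """
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    chars = [VOCAB[i] for i in collapsed if i != BLANK_ID]
    return "".join(chars).replace(WORD_DELIMITER, " ").strip()


# Made-up per-frame argmax IDs, one per audio frame.
frame_ids = [3, 3, 0, 2, 4, 4, 1, 1, 0, 5, 2, 4]
text = greedy_ctc_decode(frame_ids)
words = text.split()  # the units a word-level LM would score
print(text, words)
```

The model only ever emits characters, and words appear only after collapsing and splitting on the delimiter, which is why a character-level language model seemed like the natural fit to me.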
