amoghatr/quartznet-pytorch

quartznet-pytorch

Automatic Speech Recognition (ASR) in PyTorch: a re-implementation of NVIDIA's QuartzNet.

Features:

  • YouTokenToMe tokenization with BPE dropout
  • Augmentations: custom and audiomentations
  • Support for 3 datasets: CommonVoice, LibriSpeech and LJSpeech
  • Weights & Biases logging
  • CTC beam search integration
  • GPU-based MelSpectrogram
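
To illustrate what CTC beam search does with the per-frame outputs of an acoustic model, here is a minimal pure-Python sketch of CTC prefix beam search without a language model. It is not this repository's decoder, which may rely on an external library; the names `ctc_beam_search`, `beam_size` and `blank` are hypothetical and chosen only for this example.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_beam_search(log_probs, beam_size=4, blank=0):
    """Illustrative CTC prefix beam search (no language model).

    log_probs: list of T frames, each a list of V log-probabilities.
    Returns (best token-id sequence, its total log-probability).
    """
    # prefix -> (log-prob of paths ending in blank, ... ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for token, lp in enumerate(frame):
                if token == blank:
                    # A blank keeps the prefix unchanged.
                    nb, nnb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(nb, p_b + lp, p_nb + lp), nnb)
                    continue
                new_prefix = prefix + (token,)
                nb, nnb = next_beams[new_prefix]
                if prefix and prefix[-1] == token:
                    # A repeated token extends the prefix only after a blank...
                    next_beams[new_prefix] = (nb, logsumexp(nnb, p_b + lp))
                    # ...otherwise it collapses into the unchanged prefix.
                    sb, snb = next_beams[prefix]
                    next_beams[prefix] = (sb, logsumexp(snb, p_nb + lp))
                else:
                    next_beams[new_prefix] = (nb, logsumexp(nnb, p_b + lp, p_nb + lp))
        # Keep only the beam_size most probable prefixes.
        ranked = sorted(next_beams.items(),
                        key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    best_prefix, scores = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return list(best_prefix), logsumexp(*scores)


# Two frames over the vocabulary [blank, "a"]. Picking the arg-max per frame
# yields the empty string (0.6 * 0.6 = 0.36), while the paths that collapse
# to "a" sum to 0.64, so the beam search prefers "a".
frames = [[math.log(0.6), math.log(0.4)]] * 2
tokens, score = ctc_beam_search(frames)
```

Because the search sums the probabilities of all alignments that collapse to the same prefix, it can pick a transcript that frame-wise arg-max decoding misses, which is one plausible reason for a lower WER with beam search.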

Trained models:

| dataset | WER, dummy decoder | WER, CTC beam search | WER, fine-tuned, dummy decoder | WER, fine-tuned, CTC beam search |
|---|---|---|---|---|
| LJSpeech | 36.66 | 34.45 | 28.41 | 27.19 |
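
For reference, word error rate (WER) is the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The sketch below is a generic implementation and not necessarily the metric code used in this repository; `word_error_rate` is a hypothetical helper name. It returns a fraction, so multiply by 100 to compare with the table, assuming the table reports percentages.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed reference words and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```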

W&B Logs:

About

QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]
