itwastony/ai-talent-hub-itmo-speech-course

Speech Recognition and Generation course for ITMO AI Talent Hub

  • Course materials in this repo are in English
  • All lectures are given in Russian

Course structure

Each coursework deadline is given in the description of that coursework.

Syllabus

This course is being taught for the first time, so the syllabus may change slightly as it progresses.

  1. Basics of Digital Signal Processing
  2. Classic ASR and metrics
  3. End-to-End ASR with CTC and audio augmentations
  4. Encoder-Decoder End-to-End ASR and decoding with LM
  5. Self-supervised speech representations
  6. SSL-finetuned ASR and Whisper
  7. Text-to-Speech systems
  8. Neural vocoders
  9. Modern TTS with normalizing flows and diffusion
  10. Neural Codec Language Models: VALL-E
  11. Extra: Speaker recognition and speech inpainting

Resources
