A curated roadmap, based on my 5 years of experience, for going from zero to a skilled AI Speech Engineer. This roadmap covers everything from fundamentals to cutting-edge research trends in the speech domain.

leminhnguyen/ai-speech-engineer-roadmap

🥑 ROADMAP: AI Speech Engineer

A curated roadmap, based on my 5 years of experience, for going from zero to a skilled AI Speech Engineer. 🚀👨‍💻
This roadmap covers everything from fundamentals to cutting-edge research trends in the speech domain.


📅 Overview Timeline

| Phase | Duration | Focus Areas |
| --- | --- | --- |
| 🧠 Foundations | 3 months | Math, Python, Machine Learning, Deep Learning, Signal Processing |
| 💼 Tools & Frameworks | 3 months | Libraries, Audio Tools, Hugging Face |
| 🌱 Core Technologies | 12 months | ASR, TTS, Speaker Verification & Diarization |
| 🔬 Research Trends | Continuous | Audio-Language Models |

🧠 #1 Foundations (3 months)

🔹Python Basics

🔹Machine Learning Basics

🔹Deep Learning Basics

🔹Audio Signal Processing for ML


💼 #2 Tools & Frameworks (3 months)

🧰 Frameworks & Libraries

  • PyTorch - Framework for building and training models
  • librosa - Audio preprocessing (STFT, MFCCs, etc.)
  • torchaudio - Audio loading, transforms, and model wrappers
  • ffmpeg, sox, pydub - Audio conversion, slicing, format handling
  • noisereduce - Simple noise reduction from raw audio
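
To make the STFT mentioned above less abstract, here is a minimal sketch of a magnitude spectrogram written with plain NumPy. The function name `stft_magnitude`, the parameter values, and the synthetic 1 kHz test tone are all assumptions chosen for illustration; libraries such as librosa and torchaudio provide tuned implementations with extra options (for example padding and centering) that this sketch omits.

```python
import numpy as np


def stft_magnitude(signal, n_fft=512, hop_length=128):
    """Return a magnitude spectrogram of shape (1 + n_fft // 2, n_frames).

    Hypothetical helper for illustration only: no padding, no centering.
    """
    window = np.hanning(n_fft)  # symmetric Hann window
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop_length: i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequency bins of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# One second of a synthetic 1 kHz sine wave sampled at 16 kHz (assumed values).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / 512  # bin index -> frequency in Hz
print(spec.shape, peak_hz)
```

With these settings each frequency bin is 31.25 Hz wide, so the energy of the test tone should concentrate in the bin centered on 1000 Hz. In practice you would reach for the library routines rather than hand-rolling this.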

🖥️ Tools

🤗 Hugging Face Course

  • Hugging Face Audio - Learn to tackle a range of audio-related tasks and gain experience with speech datasets.

🌱 #3 Dive Into Speech Core Technologies (12 months)

🤖 Transformers (Attention is all you need)
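
As a starting point for this topic, the core operation of the Transformer, scaled dot-product attention, can be sketched in a few lines of NumPy. This is a single-head, unmasked, unbatched version; the function name, the shapes, and the random inputs are assumptions made for illustration, not code from any particular library.

```python
import numpy as np


def scaled_dot_product_attention(q, k, v):
    """Compute softmax(q @ k.T / sqrt(d_k)) @ v for 2-D inputs.

    q: (n_queries, d_k), k: (n_keys, d_k), v: (n_keys, d_v).
    Returns the attended values and the attention weights.
    """
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    # Subtract the row-wise max before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Hypothetical sizes: 4 query positions attending over 6 key/value positions.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(6, 8))
v = rng.normal(size=(6, 16))

out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape, weights.shape)
```

Each row of `weights` sums to 1, so every output row is a weighted average of the rows of `v`. Real models add multiple heads, masking, and learned projections on top of this.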

🎙️ Automatic Speech Recognition (ASR)

🗣️ Text-to-Speech (TTS)

🇻🇳 Vietnamese Resources

🔐 Speaker Verification (SV)

👥 Speaker Diarization (SD)


🔬 #4 Research Trends

🤯 Audio Language Models
