A simple yet effective bidirectional LSTM model trained to recognize spoken digits using spectrograms. This project serves as an end-to-end learning exercise in audio preprocessing, feature extraction, and sequence modeling.
- 🔊 Audio Classification: Predicts digits (0–9) from short spoken audio clips.
- 📈 High Accuracy: Achieves strong validation performance (96% F1) with minimal preprocessing.
- 🧠 Deep Learning: Utilizes a Bidirectional LSTM model trained on spectrogram features.
- 🔁 Augmentation: Applies time-stretching and noise injection to improve generalization (see the sketch after this list).
- 📊 Visualization: Includes confusion matrix, spectrogram plot, and training curves.
- 🎓 Educational Purpose: Built as a foundational step into speech and audio modeling.
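A minimal augmentation sketch using librosa and NumPy; the function name `augment`, the stretch range, and the noise level are illustrative assumptions rather than the project's exact settings:

```python
import numpy as np
import librosa

def augment(y, stretch_range=(0.9, 1.1), noise_level=0.005):
    """Randomly time-stretch a waveform and inject Gaussian noise."""
    # Time-stretch by a random rate (rate is keyword-only in librosa >= 0.10)
    rate = np.random.uniform(*stretch_range)
    y = librosa.effects.time_stretch(y, rate=rate)
    # Add low-amplitude Gaussian noise
    y = y + noise_level * np.random.randn(len(y))
    return y.astype(np.float32)
```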
This project uses the Free Spoken Digit Dataset (FSDD), which contains:
- Recordings of digits (0–9)
- Multiple speakers
- Clean and well-labeled audio, ideal for quick experimentation
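As a quick illustration, here is one way to load an FSDD clip and compute log-mel spectrogram features with librosa; the file path, `n_mels=40`, `n_fft=512`, and `hop_length=128` are assumed values, not necessarily the project's configuration:

```python
import numpy as np
import librosa

# FSDD clips are recorded at 8 kHz; the file path below is illustrative.
y, sr = librosa.load("recordings/7_jackson_0.wav", sr=8000)

# Log-mel spectrogram: (n_mels, time), transposed to (time, n_mels)
# so each time step becomes one LSTM input frame.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40,
                                     n_fft=512, hop_length=128)
log_mel = librosa.power_to_db(mel, ref=np.max).T  # shape: (time_steps, 40)
```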
- The model performs well on validation data with minimal overfitting.
- The confusion matrix (plotted as sketched below) shows strong classification accuracy, especially for clearly spoken digits.
- Achieved a 96% F1 score.
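A sketch of how such a confusion matrix can be plotted using only the packages listed below; `y_true` and `y_pred` are random placeholders standing in for the real validation labels and model predictions:

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf

# In practice these come from the validation set; random placeholders here.
y_true = np.random.randint(0, 10, size=300)
y_pred = np.random.randint(0, 10, size=300)

cm = tf.math.confusion_matrix(y_true, y_pred, num_classes=10).numpy()
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=range(10), yticklabels=range(10))
plt.xlabel("Predicted digit")
plt.ylabel("True digit")
plt.title("Validation confusion matrix")
plt.show()
```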
This project shows that even simple models can be powerful when combined with clean datasets and good preprocessing. Bidirectional LSTMs capture temporal features well, and augmentation helps further boost performance. The approach provides a solid foundation for more complex speech-based applications.
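To make the architecture concrete, here is a minimal sketch of a Bidirectional LSTM classifier of the kind described above, assuming 40-dimensional log-mel input frames; the layer sizes and dropout rate are illustrative, not the project's exact configuration:

```python
import tensorflow as tf

# Input: variable-length sequences of 40-dim log-mel frames (assumed size).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 40)),
    tf.keras.layers.Masking(mask_value=0.0),          # ignore zero-padded frames
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(10, activation="softmax"),  # one class per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```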
- Key Packages:
  - Python 3.10
  - tensorflow==2.19.0
  - tf-keras==2.19.0
  - keras==3.9.2
  - pandas==1.4.2
  - numpy==1.26.4
  - matplotlib==3.10.0
  - seaborn==0.13.2
  - librosa==0.11.0