This project focuses on voice signal processing and analysis, with the primary emphasis on MFCC (Mel-Frequency Cepstral Coefficients) for voice recognition.
MFCC features, derived from the Mel Spectrogram, are widely used to represent the spectral characteristics of audio signals in speech and audio analysis tasks.
Our work involves:
- Recording and analyzing voice signals
- Signal pre-processing (denoising, filtering, reconstruction)
- Feature extraction using MFCC
- Building a reference voice database for recognition & comparison
- Implementing a GUI (Graphical User Interface) for dynamic analysis
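
As a rough illustration of the feature-extraction step, the sketch below turns one signal into a single MFCC feature vector. It assumes the Audio Toolbox `mfcc` function is available; the project's own extraction code in Obj1.m may differ. The synthetic tone and the variable names (`x`, `coeffs`, `featureVec`) are placeholders chosen for this example, not names taken from the project.

```matlab
% Sketch only: one signal -> one MFCC feature vector.
% ASSUMPTION: Audio Toolbox is installed (provides mfcc).
fs = 44100;                        % 44.1 kHz, as used for the recordings
t  = (0:fs-1)' / fs;               % one second of sample times (column)
x  = 0.5 * sin(2*pi*220*t);        % placeholder tone; for a real file use
                                   % [x, fs] = audioread('<your file>.wav');
x  = mean(x, 2);                   % collapse to mono if the input is stereo
x  = x / (max(abs(x)) + eps);      % peak-normalise, guarding against silence

coeffs     = mfcc(x, fs);          % rows = frames, columns = coefficients
featureVec = mean(coeffs, 1);      % average over frames -> 1-by-N row vector
```

Averaging over frames is only one simple way to get a fixed-length vector per recording; it discards timing information, which may or may not suit a given recognition task.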
✨ Detailed methodology, analysis, and results are included in the CEP_REPORT.pdf file.
- ✅ MFCC-based feature extraction from voice signals
- ✅ Voice recognition via Euclidean Distance
- ✅ Reference voice database support
- ✅ GUI for adjusting test signals dynamically
- ✅ Automatic folder generation during runtime
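
To make the Euclidean-distance matching concrete, here is a minimal sketch in base MATLAB. The reference matrix, speaker names, and test vector are made-up illustrative values, and the variable names (`refFeatures`, `refNames`, `testFeature`) are hypothetical; the project's own data layout is not shown here.

```matlab
% Sketch only: nearest-reference matching by Euclidean distance.
% All values below are invented for illustration.
refFeatures = [1.0 2.0 3.0;      % one feature vector per row,
               4.0 0.0 1.0;      % one row per reference recording
               0.5 2.5 2.5];
refNames    = {'speaker_a', 'speaker_b', 'speaker_c'};
testFeature = [0.9 2.1 3.2];     % feature vector of the signal under test

% Distance from the test vector to every reference row
% (uses implicit expansion, MATLAB R2016b or later).
dists = sqrt(sum((refFeatures - testFeature).^2, 2));

[bestDist, bestIdx] = min(dists);
fprintf('Closest match: %s (distance %.3f)\n', refNames{bestIdx}, bestDist);
% prints: Closest match: speaker_a (distance 0.245)
```

The smallest distance identifies the closest reference; in practice a threshold on `bestDist` would probably be needed to reject voices that are not in the database at all.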
- Main Model
  - Open and run Obj1.m, the main file for in-depth analysis and recognition.
- Prepare Reference Database
  - Record 15 audio files in .wav format:
    - Sampling Rate: 44.1 kHz
    - Encoding: 16-bit PCM
  - Name files as your_name_<no>.wav, for example haider_1.wav, haider_2.wav.
  - Place them in the sound_recordings/ folder.
  - 🔧 If using more than 15 files, update parameters in the code (fileCount, etc.).
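
Before running the model, it can help to confirm that the recordings match the naming and format described above. The following is a small sketch using base MATLAB only. The `prefix` value is the README's example name, and `fileCount` simply borrows the parameter name mentioned above; how Obj1.m actually uses it is an assumption.

```matlab
% Sketch only: check the reference recordings for name and format.
fileCount = 15;                       % number of expected recordings
prefix    = 'haider';                 % replace with your own name
folder    = 'sound_recordings';
nMissing  = 0;

for k = 1:fileCount
    f = fullfile(folder, sprintf('%s_%d.wav', prefix, k));
    if ~isfile(f)                     % isfile needs R2017b or later
        fprintf('Missing: %s\n', f);
        nMissing = nMissing + 1;
        continue
    end
    info = audioinfo(f);              % reads header only, not the samples
    if info.SampleRate ~= 44100 || info.BitsPerSample ~= 16
        fprintf('Check format: %s (%d Hz, %d-bit)\n', ...
                f, info.SampleRate, info.BitsPerSample);
    end
end
fprintf('%d of %d files missing\n', nMissing, fileCount);
```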
- Run the GUI
  - Launch the graphical interface by running VoiceApp.m.