Detect human emotions from raw .wav files using deep learning and pretrained speech models.
Built using Hugging Face Transformers, PyTorch, and Torchaudio.
This project classifies human emotions such as Happy, Sad, Angry, and Neutral directly from raw audio (`.wav`) files. It uses a Hugging Face pretrained model fine-tuned on the RAVDESS emotional speech dataset. The entire process, from preprocessing to training and inference, can be run with simple Python scripts.
- **Raw Audio Input**: You provide a mono `.wav` file recorded at any sample rate.
- **Preprocessing**: The audio is normalized and resampled to 16 kHz using `torchaudio`.
- **Feature Extraction**: The pretrained HuBERT model (from Hugging Face) extracts deep audio embeddings.
- **Classifier Head**: A dense neural network is trained on top of these embeddings using labeled emotion data (RAVDESS).
- **Prediction**: The model outputs the most probable emotion class for the given voice input (see the sketch below).
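A minimal end-to-end sketch of this pipeline using the pretrained checkpoint listed below (`superb/hubert-large-superb-er`); the project's actual scripts may differ in the details:

```python
import torch
import torchaudio
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

MODEL_ID = "superb/hubert-large-superb-er"  # pretrained HuBERT emotion-recognition checkpoint

# Load the feature extractor and the classification model
extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
model = AutoModelForAudioClassification.from_pretrained(MODEL_ID)

# 1. Raw audio input: a .wav file recorded at any sample rate
waveform, sample_rate = torchaudio.load("sample_output/sample.wav")

# 2. Preprocessing: downmix to mono and resample to 16 kHz
waveform = waveform.mean(dim=0)
if sample_rate != 16000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

# 3-5. Feature extraction, classifier head, and prediction
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print("Predicted emotion:", model.config.id2label[int(logits.argmax(dim=-1))])
```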
```
Emotion_detection_by_wave_formate/
├── audio_waveform_viewer.py   # Displays waveform of audio files
├── preprocess_ravdess.py      # Prepares RAVDESS dataset
├── train_emotion_model.py     # Trains the classifier on HuBERT embeddings
├── predict_emotion.py         # Predicts emotion for new audio input
├── model/
│   └── final_emotion_model/   # Trained model is saved here
├── dataset/                   # Raw and processed audio files
├── sample_output/             # Stores screenshots and predictions
├── requirements.txt           # All required dependencies
└── README.md                  # You are here!
```
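For a rough idea of what `audio_waveform_viewer.py` (listed above) does, a waveform can be displayed with `torchaudio` and `matplotlib`; this is a sketch, not the script's exact code:

```python
import torch
import torchaudio
import matplotlib.pyplot as plt

# Load the audio and build a time axis in seconds
waveform, sample_rate = torchaudio.load("sample_output/sample.wav")
time = torch.arange(waveform.shape[1]) / sample_rate

# Plot the first channel's amplitude against time
plt.plot(time, waveform[0])
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")
plt.title("Audio waveform")
plt.show()
```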
```bash
git clone https://github.com/dhanesh-j/Emotion_detection_by_wave_formate.git
cd Emotion_detection_by_wave_formate
pip install -r requirements.txt
```
Ensure the RAVDESS dataset is available under `dataset/RAVDESS/`, then preprocess it:
```bash
python preprocess_ravdess.py --input_dir dataset/RAVDESS/ --output_dir dataset/processed/ --sample_rate 16000
```
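RAVDESS encodes the emotion in the third dash-separated field of each filename (e.g. `03-01-05-01-02-01-12.wav` is an angry utterance). A sketch of how preprocessing might map files to the four labels used here; the actual `preprocess_ravdess.py` logic may differ:

```python
from pathlib import Path
from typing import Optional

# RAVDESS emotion codes (third dash-separated field of the filename)
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
TARGET_CLASSES = {"happy", "sad", "angry", "neutral"}  # classes this project keeps

def label_for(wav_path: Path) -> Optional[str]:
    """Return the emotion label for a RAVDESS file, or None if it is not a target class."""
    code = wav_path.stem.split("-")[2]
    label = RAVDESS_EMOTIONS.get(code)
    return label if label in TARGET_CLASSES else None

for wav in sorted(Path("dataset/RAVDESS").rglob("*.wav")):
    print(wav.name, label_for(wav))
```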
Train the classifier on the processed data:

```bash
python train_emotion_model.py --data_dir dataset/processed/ --pretrained_model superb/hubert-large-superb-er --output_dir model/final_emotion_model/
```
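Conceptually, training keeps the pretrained HuBERT backbone (frozen or lightly fine-tuned) and fits a dense classifier head on its pooled embeddings. A minimal sketch of such a head, assuming a frozen backbone; class and variable names are illustrative, not the project's actual code:

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class EmotionHead(nn.Module):
    """Dense classifier on top of frozen HuBERT embeddings (illustrative)."""
    def __init__(self, backbone_id="superb/hubert-large-superb-er", num_classes=4):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone_id)
        self.backbone.requires_grad_(False)  # freeze the pretrained weights
        hidden = self.backbone.config.hidden_size
        self.head = nn.Sequential(nn.Linear(hidden, 256), nn.ReLU(), nn.Linear(256, num_classes))

    def forward(self, input_values):
        frames = self.backbone(input_values).last_hidden_state  # (batch, time, hidden)
        return self.head(frames.mean(dim=1))                    # pool over time, then classify

model = EmotionHead()
logits = model(torch.randn(2, 16000))                  # two dummy one-second clips at 16 kHz
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 3]))
loss.backward()                                        # only the head receives gradients
```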
Predict the emotion of a new audio file:

```bash
python predict_emotion.py --model_dir model/final_emotion_model/ --input_audio sample_output/sample.wav
```
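For a quick check without the project's scripts, the same kind of prediction can be run through the Hugging Face `pipeline` API (shown with the base checkpoint; point `model=` at `model/final_emotion_model/` once training is done):

```python
from transformers import pipeline

# Audio-classification pipeline; swap the model path for model/final_emotion_model/ after training
classifier = pipeline("audio-classification", model="superb/hubert-large-superb-er")
print(classifier("sample_output/sample.wav"))  # e.g. [{'label': 'hap', 'score': 0.87}, ...]
```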
To record a clip and predict its emotion in one step, run `record_and_predict.py`.
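A sketch of what such a record-and-predict step could look like, assuming the `sounddevice` and `soundfile` packages for microphone capture (neither is listed in `requirements.txt`):

```python
import sounddevice as sd
import soundfile as sf
from transformers import pipeline

DURATION, SAMPLE_RATE = 4, 16000  # record four seconds of mono audio at 16 kHz

# Capture from the default microphone and save the clip
audio = sd.rec(int(DURATION * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=1)
sd.wait()
sf.write("sample_output/live_recording.wav", audio, SAMPLE_RATE)

# Classify the recording (swap in model/final_emotion_model/ after training)
classifier = pipeline("audio-classification", model="superb/hubert-large-superb-er")
print(classifier("sample_output/live_recording.wav"))
```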
Here’s a visual summary of the model's performance across different emotions:
The graph shows high precision and recall for emotions such as *Sad* and *Neutral*, with consistent accuracy across all categories. The screenshots in `sample_output/` show the emotion detected from live-recorded audio, the model's confusion matrix, and the real-time waveform of the recorded audio plotted against time in seconds.

- Backbone: `superb/hubert-large-superb-er`
- Dataset: RAVDESS
- Input: Raw waveform
- Output: Emotion class (`happy`, `sad`, `angry`, `neutral`)
- ✅ No speech-to-text required
- ✅ Lightweight training using pretrained embeddings
- ✅ Easily extendable with other datasets
- ✅ Compatible with Gradio or Streamlit for UI (see the Gradio sketch below)
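For example, a minimal Gradio wrapper could look like this (Gradio is not in `requirements.txt` and must be installed separately; swap the model path for the trained model directory):

```python
import gradio as gr
from transformers import pipeline

# Load the classifier once at startup; use model/final_emotion_model/ after training
classifier = pipeline("audio-classification", model="superb/hubert-large-superb-er")

def predict(audio_path):
    # Return a {label: score} dict so Gradio renders it as a label widget
    return {r["label"]: r["score"] for r in classifier(audio_path)}

demo = gr.Interface(fn=predict, inputs=gr.Audio(type="filepath"), outputs=gr.Label())
demo.launch()
```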
- Python 3.8+
- torch
- torchaudio
- transformers
- librosa
- matplotlib
Install them using:
```bash
pip install -r requirements.txt
```
- 🎤 Real-time microphone support
- 🌐 Multilingual emotion detection
- 📊 Interactive dashboard with waveform + prediction
- 🧪 Confusion matrix and training metrics visualization
Dhanesh J
Third-year Computer Science student passionate about AI, voice recognition, and applied ML. Built as a third-year mini project.