# Face Expression Recognition

This repository contains a deep learning model built with PyTorch for facial expression recognition. The model classifies images of human faces into one of seven emotion categories: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral.
## Table of Contents

- [Project Overview](#project-overview)
- [Features](#features)
- [Dataset](#dataset)
- [Installation](#installation)
- [Usage](#usage)
  - [Training](#training)
  - [Evaluation](#evaluation)
  - [Prediction](#prediction)
- [Model Architecture](#model-architecture)
- [Results](#results)
- [License](#license)
## Project Overview

Facial expression recognition is a fundamental task in human-computer interaction and affective computing. This project implements a Convolutional Neural Network (CNN) trained on a labeled dataset of facial images to classify emotions accurately.
## Features

- Custom CNN architecture optimized for grayscale 48x48 pixel images.
- Data augmentation during training for improved generalization.
- Training and validation pipelines with accuracy and loss tracking.
- Model checkpointing for best validation accuracy.
- Inference script to predict emotion from an image URL.
- Modular and configurable codebase.
## Dataset

The dataset consists of labeled facial images categorized into seven emotions:
Angry, Disgust, Fear, Happy, Sad, Surprise, Neutral.
The dataset should be organized with accompanying CSV files specifying image paths and labels. Images are expected to be grayscale and resized to 48x48 pixels during training and inference.
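This README does not pin down the exact CSV schema. Purely as a hedged sketch, assuming columns named `image_path` and `label`, a PyTorch `Dataset` wrapper for such files might look like:

```python
# Hypothetical CSV-backed dataset; the column names "image_path" and
# "label" are assumptions, not taken from this repository's code.
import pandas as pd
from PIL import Image
from torch.utils.data import Dataset

EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]
LABEL_TO_INDEX = {name: i for i, name in enumerate(EMOTIONS)}

class ExpressionDataset(Dataset):
    def __init__(self, csv_path, transform=None):
        self.frame = pd.read_csv(csv_path)
        self.transform = transform

    def __len__(self):
        return len(self.frame)

    def __getitem__(self, idx):
        row = self.frame.iloc[idx]
        # Force single-channel grayscale, matching the model's expected input.
        image = Image.open(row["image_path"]).convert("L")
        if self.transform is not None:
            image = self.transform(image)
        return image, LABEL_TO_INDEX[row["label"]]
```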
## Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/yourusername/face-expression-recognition.git
   cd face-expression-recognition
   ```

2. Create and activate a Python virtual environment (optional but recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate   # Linux/macOS
   venv\Scripts\activate      # Windows
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
## Usage

### Training

Run the training script:

```bash
python train.py
```

The script loads the dataset, trains the CNN model, and saves the best-performing model to the `checkpoints/` directory.

- Training uses `CrossEntropyLoss` and the Adam optimizer.
- The training data loader applies data augmentation; both training and validation images are normalized.
- Loss and accuracy metrics are reported for each epoch.
- The model with the best validation accuracy is saved automatically (a minimal sketch of this loop follows the list).
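The loop itself lives in `train.py`; the snippet below is only a sketch of the per-epoch training and checkpointing logic, assuming `model`, `train_loader`, `val_loader`, and `num_epochs` are defined elsewhere, so details will differ from the actual script.

```python
# Minimal sketch of per-epoch training, validation, and checkpointing.
# `model`, `train_loader`, `val_loader`, and `num_epochs` are assumed
# to be defined elsewhere; the real train.py may differ.
import os

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption

os.makedirs("checkpoints", exist_ok=True)
best_val_acc = 0.0

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # Measure validation accuracy for this epoch.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    val_acc = correct / total
    print(f"epoch {epoch}: loss={running_loss / len(train_loader):.3f} "
          f"val_acc={val_acc:.3f}")

    # Keep only the weights with the best validation accuracy so far.
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        torch.save(model.state_dict(), "checkpoints/best_model.pth")
```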
### Evaluation

Validation accuracy is printed after each epoch. A typical run reaches roughly 50% validation accuracy, which can be improved with further tuning and additional data.

### Prediction

Use the prediction script to classify the emotion in an image fetched from a URL:

```bash
python predict.py --url "https://example.com/image.jpg"
```

The script outputs the predicted emotion label.
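`predict.py` handles the download and preprocessing internally; the snippet below is only an illustrative sketch of what that path could look like, assuming a torchvision preprocessing pipeline, assumed normalization statistics, and a `model` loaded elsewhere:

```python
# Hedged sketch of the inference path: fetch an image from a URL,
# preprocess it to a 1x1x48x48 grayscale tensor, and pick the most
# likely emotion. `model` and the Normalize stats are assumptions.
import io

import requests
import torch
from PIL import Image
from torchvision import transforms

EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.Resize((48, 48)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5]),  # stats are an assumption
])

def predict_from_url(model: torch.nn.Module, url: str) -> str:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    image = Image.open(io.BytesIO(response.content))
    batch = preprocess(image).unsqueeze(0)  # add a batch dimension
    model.eval()
    with torch.no_grad():
        logits = model(batch)
    return EMOTIONS[logits.argmax(dim=1).item()]
```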
## Model Architecture

The CNN model consists of:
- 3 convolutional layers with batch normalization and max pooling.
- Dropout layer to reduce overfitting.
- Fully connected layers mapping to 7 output emotion classes.
Input images are grayscale of size 48x48 pixels.
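The README does not specify exact layer sizes; the sketch below is one plausible instantiation of the description above, where the channel counts, hidden width, and dropout rate are assumptions:

```python
# Illustrative 3-conv-block CNN matching the description above; the
# exact channel counts, hidden width, and dropout rate are assumptions.
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),  # 48 -> 24
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),  # 24 -> 12
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),  # 12 -> 6
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),
            nn.Linear(128 * 6 * 6, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```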
## Results

Example metrics after training:
- Train Loss: 1.38
- Train Accuracy: 46.6%
- Validation Accuracy: 51.3%
## License

This project is licensed under the MIT License; see the LICENSE file for details.