This machine learning project implements several deep learning approaches for estimating human age from facial images, and compares different CNN architectures and training techniques for age prediction.
- Overview
- Dataset
- Approaches Implemented
- Requirements
- Installation & Setup
- Usage
- Project Structure
- Results
- Technologies Used
- Contributing
This project focuses on the challenging computer vision task of age estimation from facial images. Age estimation has various applications including demographic analysis, security systems, and personalized content delivery. The project implements and compares multiple deep learning approaches to achieve accurate age prediction.
Key Objectives:
- Develop robust age estimation models using different CNN architectures
- Compare performance between custom CNN and transfer learning approaches
- Explore ensemble methods for improved accuracy
- Provide comprehensive analysis of model performance
The project uses the Age Prediction Dataset from Kaggle by mariafrenti.
Dataset Details:
- Size: ~273,640 facial images
- Age Range: 20-50 years
- Format: Color images (RGB)
- Organization: Images organized by age groups
- Total Size: ~2.03 GB
The dataset contains diverse facial images across different age groups, providing a robust foundation for training age estimation models.
- Architecture: Custom Convolutional Neural Network
- Input Size: 128x128x3
- Key Features:
  - Multiple Conv2D layers with batch normalization
  - Dropout layers for regularization
  - Data augmentation for improved generalization
  - Adam optimizer with learning rate of 0.001
- Layers: 5 convolutional layers with increasing filter complexity
- Output: Softmax classification for 51 age classes
- Architecture: Pre-trained ResNet50 + Custom Dense layers
- Approach: Transfer learning with fine-tuning
- Key Features:
  - Leverages ImageNet pre-trained weights
  - Custom regression head for age prediction
  - Continuous age prediction (regression)
  - Advanced loss functions and metrics
- Architecture: Ensemble and hybrid methods
- Key Features:
  - Combines multiple model architectures
  - Advanced preprocessing techniques
  - Multiple optimization strategies
  - Comprehensive model evaluation
- Comprehensive Analysis: Complete pipeline from data loading to evaluation
- Data Preprocessing: Advanced image preprocessing and augmentation
- Model Comparison: Side-by-side comparison of different approaches
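The CNN above outputs a 51-way softmax while the ResNet50 model outputs a single continuous age, so comparing or combining them requires turning the class probabilities into an age first. The following is a minimal sketch of one way to do that, not the notebooks' actual method. It assumes that class index `i` stands for age `i`, and the helper names `age_from_softmax` and `combine_ages`, along with the weighted-average rule, are hypothetical illustrations.

```python
def age_from_softmax(probs):
    """Return (most likely age, expected age) for one softmax vector.

    Assumption: class index i stands for age i, so 51 classes cover ages 0-50.
    """
    most_likely = max(range(len(probs)), key=lambda i: probs[i])
    expected = sum(i * p for i, p in enumerate(probs))
    return most_likely, expected


def combine_ages(cnn_age, resnet_age, cnn_weight=0.5):
    """Weighted average of the classifier's and the regressor's estimates.

    The weighting scheme is an illustrative assumption, not taken from the notebooks.
    """
    if not 0.0 <= cnn_weight <= 1.0:
        raise ValueError("cnn_weight must lie in [0, 1]")
    return cnn_weight * cnn_age + (1.0 - cnn_weight) * resnet_age


# Toy softmax output with all probability mass on ages 24, 25 and 26.
probs = [0.0] * 51
probs[24], probs[25], probs[26] = 0.2, 0.5, 0.3

most_likely, expected = age_from_softmax(probs)
combined = combine_ages(expected, resnet_age=27.1, cnn_weight=0.5)
print(most_likely, round(expected, 2), round(combined, 2))
```

The expected age uses the whole distribution rather than only its peak, which tends to pair more naturally with a regression output than the arg-max class does.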
numpy >= 1.21.0
pandas >= 1.3.0
tensorflow >= 2.8.0
keras >= 2.8.0
opencv-python >= 4.5.0
matplotlib >= 3.5.0
seaborn >= 0.11.0
scikit-learn >= 1.0.0
kaggle >= 1.5.0
- RAM: Minimum 8GB (16GB recommended)
- Storage: 5GB free space for dataset and models
- GPU: CUDA-compatible GPU recommended for faster training
- Python: 3.7 or higher
- Clone the Repository
    git clone https://github.com/yourusername/Human-Age-Estimation.git
    cd Human-Age-Estimation
- Install Dependencies
    pip install -r requirements.txt
- Kaggle API Setup
  - Create a Kaggle account and generate an API token
  - Download kaggle.json from your Kaggle account settings
  - Place the file in your Kaggle directory (typically ~/.kaggle/) or upload it when prompted
- Dataset Download: the notebooks download the dataset automatically through the Kaggle API. The snippet below needs the kagglehub package, which is separate from the kaggle package listed in the requirements.
    import kagglehub
    mariafrenti_age_prediction_path = kagglehub.dataset_download('mariafrenti/age-prediction')
- CNN from Scratch
    jupyter notebook agepredictionwithkeras.ipynb
- ResNet50 Transfer Learning
    jupyter notebook resnet50.ipynb
- Combination Approach
    jupyter notebook Combinaison.ipynb
- Complete Analysis
    jupyter notebook Human_Age_Estimation_from_Face_Images.ipynb
CNN Model:
- Epochs: 15 (configurable)
- Batch Size: 250
- Optimizer: Adam (lr=0.001)
- Loss: Categorical Crossentropy
- Data Augmentation: Rotation, zoom, width/height shifts
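To put the CNN settings in perspective, the arithmetic below estimates how many batches one epoch takes. It is a rough sketch that assumes the full ~273,640 images are divided 90%/10% into training and validation sets with no separate test set. The helper name `batches_needed` is hypothetical.

```python
import math

TOTAL_IMAGES = 273_640   # approximate dataset size quoted in this README
BATCH_SIZE = 250         # CNN batch size
EPOCHS = 15              # CNN epoch count


def batches_needed(num_images, batch_size):
    """Number of batches that cover num_images, counting a final partial batch."""
    return math.ceil(num_images / batch_size)


val_images = TOTAL_IMAGES // 10            # 10% held out for validation
train_images = TOTAL_IMAGES - val_images   # remaining 90% used for training

train_steps = batches_needed(train_images, BATCH_SIZE)
val_steps = batches_needed(val_images, BATCH_SIZE)
total_updates = train_steps * EPOCHS

print(train_images, val_images)   # 246276 27364
print(train_steps, val_steps)     # 986 110
print(total_updates)              # 14790
```

With these numbers, a full 15-epoch run performs roughly 14,800 weight updates.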
ResNet50 Model:
- Epochs: 100 (with early stopping)
- Optimization: Advanced learning rate scheduling
- Loss: Mean Squared Error (regression)
- Metrics: RMSE, R² Score
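For reference, the two regression metrics can be computed by hand as in the sketch below. It uses plain Python for clarity, and the function names `rmse` and `r2_score` are illustrative. In practice the notebooks would more likely rely on scikit-learn or Keras metrics.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target (years)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when every true value is identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy example: three faces with true ages 20, 30, 40.
y_true = [20.0, 30.0, 40.0]
y_pred = [22.0, 30.0, 38.0]
print(round(rmse(y_true, y_pred), 3))      # 1.633
print(round(r2_score(y_true, y_pred), 3))  # 0.96
```

RMSE reads directly as an error in years, while R² reports how much of the age variance the model explains compared with always predicting the mean age.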
Human-Age-Estimation/
├── agepredictionwithkeras.ipynb                  # CNN from scratch implementation
├── resnet50.ipynb                                # ResNet50 transfer learning
├── Combinaison.ipynb                             # Ensemble/combination methods
├── Human_Age_Estimation_from_Face_Images.ipynb   # Main analysis
├── CNN_from_scratch.pdf                          # CNN approach documentation
├── Resnet.pdf                                    # ResNet approach documentation
├── Combinaison.pdf                               # Combination approach documentation
├── README.md                                     # Project documentation
└── requirements.txt                              # Python dependencies
| Approach | Architecture | Training Time | Evaluation Metric | Key Advantages |
|---|---|---|---|---|
| CNN from Scratch | Custom CNN | ~2-3 hours | Classification accuracy | Full control, interpretable |
| ResNet50 | Transfer Learning | ~4-6 hours | Regression RMSE | Pre-trained features, robust |
| Combination | Ensemble | Variable | Combined metrics | Best of both approaches |
- Transfer Learning generally provides better initial performance due to pre-trained ImageNet features
- Custom CNN offers more flexibility and interpretability
- Data Augmentation significantly improves model generalization
- Age range 20-50 provides sufficient diversity for robust model training
- Deep Learning: TensorFlow 2.x, Keras
- Computer Vision: OpenCV, PIL
- Data Science: NumPy, Pandas, Scikit-learn
- Visualization: Matplotlib, Seaborn
- Development: Jupyter Notebooks, Python 3.x
- Dataset: Kaggle API
Conv2D(8, 5x5) → Conv2D(140, 3x3) → Conv2D(130, 3x3) →
BatchNorm → MaxPool2D → Conv2D(120, 3x3) → BatchNorm →
MaxPool2D → Conv2D(120, 3x3) → BatchNorm → MaxPool2D →
Flatten → Dropout(0.25) → Dense(51) → Softmax
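To see what the custom CNN listing implies for tensor sizes, the sketch below traces a 128x128x3 input through the layers. It assumes the Keras defaults: 'valid' padding and stride 1 for Conv2D, and 2x2 pooling for MaxPool2D. If the notebook uses 'same' padding or other strides, the numbers will differ. The function name `trace_cnn_shapes` is hypothetical.

```python
def trace_cnn_shapes(size=128, channels=3):
    """Follow spatial size and channel count through the custom CNN listing.

    BatchNorm and Dropout are omitted because they do not change tensor shapes.
    """
    layers = [
        ("conv", 8, 5), ("conv", 140, 3), ("conv", 130, 3), ("pool", None, 2),
        ("conv", 120, 3), ("pool", None, 2),
        ("conv", 120, 3), ("pool", None, 2),
    ]
    for kind, filters, k in layers:
        if kind == "conv":
            size = size - k + 1   # 'valid' padding, stride 1 (assumed)
            channels = filters
        else:
            size = size // k      # non-overlapping k x k max pooling (assumed)
    return size, channels


size, channels = trace_cnn_shapes()
flat_features = size * size * channels
dense_params = flat_features * 51 + 51   # weights plus biases of Dense(51)

print(size, channels)   # 13 120
print(flat_features)    # 20280
print(dense_params)     # 1034331
```

Under these assumptions, the final Dense(51) layer alone holds about one million parameters, since Flatten feeds it 20,280 features.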
ResNet50(ImageNet weights) → GlobalAveragePooling2D →
Dense(512, ReLU) → Dropout(0.5) → Dense(256, ReLU) →
Dropout(0.3) → Dense(1, Linear) # Regression output
- Image Resizing: 128x128 or 224x224 pixels
- Normalization: Pixel values scaled to [0,1]
- Augmentation: Rotation, zoom, shifts, brightness adjustment
- Train/Validation Split: 90%/10%
- Early Stopping: Prevent overfitting
- Learning Rate Scheduling: Adaptive learning rate reduction
- Cross-Validation: K-fold validation for robust evaluation
- Regularization: Dropout, batch normalization, L2 regularization
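The early stopping and learning rate reduction listed above can be illustrated by replaying a validation-loss history. This is a simplified sketch of the idea, not the exact behavior of Keras's `EarlyStopping` or `ReduceLROnPlateau` callbacks. The function name `simulate_training` and the values for `patience`, `lr_patience`, and `lr_factor` are assumptions chosen for the example.

```python
def simulate_training(val_losses, patience=3, lr=1e-3, lr_patience=2, lr_factor=0.5):
    """Replay a validation-loss history and report where training would stop.

    Returns (stop_epoch, best_epoch, final_lr); epochs are 1-indexed.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch   # new best: reset the wait
            continue
        stale = epoch - best_epoch                # epochs since the last improvement
        if stale % lr_patience == 0:
            lr *= lr_factor                       # plateau: reduce the learning rate
        if stale >= patience:
            return epoch, best_epoch, lr          # stop early
    return len(val_losses), best_epoch, lr


# Validation loss improves until epoch 3, then plateaus.
history = [9.0, 7.5, 7.0, 7.2, 7.1, 7.3, 6.9]
stop_epoch, best_epoch, final_lr = simulate_training(history)
print(stop_epoch, best_epoch, final_lr)
```

Here training stops at epoch 6 after three epochs without improvement, so the lower loss at epoch 7 is never reached. That is the usual trade-off when choosing a patience value: small values save time but can stop before a late improvement.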
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Dataset: Thanks to mariafrenti for providing the age prediction dataset
- Kaggle: For hosting the dataset and providing the platform
- TensorFlow/Keras Team: For the excellent deep learning framework
- Research Community: For the foundational work in age estimation and computer vision
Note: This project is for educational and research purposes. The models should be further validated before any production use.