Nouran252/Obesity_Prediction

Obesity Prediction using Machine Learning

This project predicts the obesity level of individuals based on their physical and lifestyle attributes using multiple machine learning models. It involves end-to-end data processing, exploratory data analysis (EDA), feature engineering, model evaluation, and hyperparameter tuning.

📁 Dataset

  • Files: the dataset is split into a training file and a test file:
    • train_dataset.csv
    • test_dataset.csv
  • Target Column: NObeyesdad, a multi-class label representing obesity levels.
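
A minimal sketch of separating the features from the target column when loading one of the files. The helper name load_split and the tiny stand-in CSV (including its placeholder class labels) are assumptions for illustration only; this README does not state whether test_dataset.csv also contains the target column.

```python
import tempfile
from pathlib import Path

import pandas as pd

TARGET = "NObeyesdad"  # target column named in this README


def load_split(csv_path):
    """Read one CSV and return (features, target). Hypothetical helper."""
    df = pd.read_csv(csv_path)
    return df.drop(columns=[TARGET]), df[TARGET]


# Tiny stand-in file so the sketch runs without the real data.
# The class labels below are placeholders, not the dataset's real labels.
demo_path = Path(tempfile.mkdtemp()) / "train_dataset.csv"
pd.DataFrame({
    "Age": [21, 35, 42],
    "Height": [1.62, 1.80, 1.75],
    "Weight": [64.0, 95.0, 110.0],
    TARGET: ["level_a", "level_b", "level_c"],
}).to_csv(demo_path, index=False)

X_train, y_train = load_split(demo_path)
print(X_train.shape, y_train.nunique())
```

With the real files, the same call would presumably be pointed at data/train_dataset.csv.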

🚀 Features Used

  • Age
  • Height
  • Weight
  • BMI (calculated)
  • FCVC: Frequency of consuming vegetables
  • NCP: Number of main meals
  • CH2O: Daily water consumption
  • FAF: Physical activity frequency
  • TUE: Time using technology devices
  • Gender, Family history, Food habits, Transportation methods, etc.

🧹 Data Preprocessing

  • Calculated BMI from weight and height
  • Imputed missing values with the mean (numeric columns) or the mode (categorical columns)
  • Removed duplicates
  • One-Hot and Label Encoding for categorical data
  • Outlier handling using IQR method
  • Feature scaling using StandardScaler
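
The steps above can be sketched on a toy frame as follows. This is an illustration under stated assumptions, not the notebook's actual code: it assumes Height is in metres and Weight in kilograms, treats "outlier handling" as clipping to the IQR fences (dropping the rows is an equally plausible reading), and picks its own step order. The helper names add_bmi and clip_iqr are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def add_bmi(df):
    """BMI = weight / height^2, assuming kilograms and metres."""
    out = df.copy()
    out["BMI"] = out["Weight"] / out["Height"] ** 2
    return out


def clip_iqr(series, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Toy data with one missing value and one extreme weight.
toy = pd.DataFrame({
    "Height": [1.60, 1.75, 1.80, 1.70, 1.65],
    "Weight": [55.0, 70.0, 81.0, np.nan, 300.0],
})

toy["Weight"] = toy["Weight"].fillna(toy["Weight"].mean())  # mean imputation
toy = toy.drop_duplicates()                                 # remove duplicate rows
toy = add_bmi(toy)                                          # derived feature
toy["Weight"] = clip_iqr(toy["Weight"])                     # IQR-based clipping

# Standardize each column to zero mean and unit variance.
scaled = StandardScaler().fit_transform(toy[["Height", "Weight", "BMI"]])
print(scaled.round(2))
```

In a real train/test workflow the scaler would normally be fitted on the training split only and then applied to the test split.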

📊 Exploratory Data Analysis (EDA)

  • Boxplots, Distribution plots, Countplots
  • Correlation heatmap
  • Feature importance visualization using Random Forest
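
As a rough sketch of the Random Forest feature-importance step, the snippet below fits a forest on synthetic data and ranks the impurity-based importances. The synthetic data and the feature_0 ... feature_5 names are placeholders; with the real data the index would be the dataset's column names.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic 3-class data standing in for the obesity dataset.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, n_classes=3, random_state=0
)
names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based importances; they are non-negative and sum to 1.
importances = pd.Series(forest.feature_importances_, index=names)
importances = importances.sort_values(ascending=False)
print(importances)
```

If matplotlib is installed, importances.plot.barh() would turn the ranking into a bar chart.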

🧠 Models Implemented

  • Logistic Regression
  • Decision Tree
  • Random Forest
  • K-Nearest Neighbors (KNN)
  • Neural Network (MLPClassifier)
  • Support Vector Machine (SVM)
  • Gradient Boosting Classifier
  • Stacking Classifier (Ensemble)

🛠️ Model Evaluation Metrics

  • Accuracy
  • Precision (weighted)
  • Recall (weighted)
  • F1-Score (weighted)
  • Confusion Matrix
  • Classification Report
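
The metrics listed above can be computed with scikit-learn roughly as follows. The label vectors are made-up toy values, not results from this project. "Weighted" averaging means each class's score is weighted by its number of true samples, which is why weighted recall always equals accuracy.

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    precision_recall_fscore_support,
)

# Toy labels for three classes (illustrative only).
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)

# Per-class scores averaged with weights proportional to class support.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
report = classification_report(y_true, y_pred, zero_division=0)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
print(cm)
print(report)
```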

🔍 Hyperparameter Tuning

  • GridSearchCV for:
    • Logistic Regression (C, penalty, solver)
    • KNN (n_neighbors, p, metric, algorithm)
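
A hedged sketch of the two grid searches, run on synthetic data. The parameter names match the ones listed above, but the candidate values, the cv=3 setting, and the f1_weighted scoring are assumptions, since the README does not give the actual grids. The saga solver is chosen here because it supports both L1 and L2 penalties for multi-class problems; recent scikit-learn releases may emit deprecation warnings for the penalty parameter.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic 3-class data standing in for the scaled training set.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, n_classes=3, random_state=0
)

# Candidate values are illustrative assumptions.
searches = {
    "logreg": GridSearchCV(
        LogisticRegression(max_iter=2000),
        {"C": [0.1, 1.0, 10.0], "penalty": ["l1", "l2"], "solver": ["saga"]},
        scoring="f1_weighted",
        cv=3,
    ),
    "knn": GridSearchCV(
        KNeighborsClassifier(),
        {
            "n_neighbors": [3, 5, 7],
            "p": [1, 2],  # 1 = Manhattan, 2 = Euclidean (used by minkowski)
            "metric": ["minkowski"],
            "algorithm": ["auto"],
        },
        scoring="f1_weighted",
        cv=3,
    ),
}

best_params = {}
for name, search in searches.items():
    search.fit(X, y)
    best_params[name] = search.best_params_
    print(name, search.best_params_, round(search.best_score_, 3))
```

After fitting, search.best_estimator_ holds the model refitted on all of X with the winning parameters.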

🏆 Best Model (Stacking Ensemble)

Combines:

  • Logistic Regression
  • Tuned KNN
  • Random Forest
  • Decision Tree
  • Neural Network
  • SVM
  • Gradient Boosting

Meta-model: Logistic Regression
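
The ensemble described above could be assembled with scikit-learn's StackingClassifier along these lines. This is a sketch on synthetic data: every hyperparameter shown (tree counts, hidden layer size, n_neighbors, cv=3) is a placeholder rather than the project's tuned value, and the inputs are assumed to be already scaled.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class data standing in for the preprocessed features.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, n_classes=3, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Seven base learners, matching the list above (placeholder settings).
base_learners = [
    ("logreg", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("rf", RandomForestClassifier(n_estimators=30, random_state=0)),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)),
    ("svm", SVC()),
    ("gb", GradientBoostingClassifier(n_estimators=30, random_state=0)),
]

# The meta-model is trained on out-of-fold predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)
stack.fit(X_tr, y_tr)
test_accuracy = stack.score(X_te, y_te)
print(f"stacked test accuracy: {test_accuracy:.3f}")
```

Training the meta-model on out-of-fold predictions keeps it from simply learning which base model memorized the training set.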

💾 Model Saving

  • StandardScaler saved as scaler.pkl
  • Final SVM model saved as svm_model.pkl
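
One way to write and reload the two .pkl files is shown below. The README does not say whether pickle or joblib was used, so the use of the standard-library pickle module is an assumption (joblib.dump and joblib.load are a common alternative). The data is random and the files go to a temporary directory rather than models/.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Random stand-in data: 60 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = rng.integers(0, 3, size=60)

scaler = StandardScaler().fit(X)
model = SVC().fit(scaler.transform(X), y)

# Save both objects; the scaler must travel with the model.
out_dir = Path(tempfile.mkdtemp())
for filename, obj in {"scaler.pkl": scaler, "svm_model.pkl": model}.items():
    with open(out_dir / filename, "wb") as fh:
        pickle.dump(obj, fh)

# Reload and predict, applying the same scaling as at training time.
with open(out_dir / "scaler.pkl", "rb") as fh:
    loaded_scaler = pickle.load(fh)
with open(out_dir / "svm_model.pkl", "rb") as fh:
    loaded_model = pickle.load(fh)

restored_pred = loaded_model.predict(loaded_scaler.transform(X))
print(restored_pred[:10])
```

Pickled estimators should only be loaded from trusted sources and with a scikit-learn version compatible with the one that saved them.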

📉 Visualizations

  • Feature distributions
  • Boxplots: Features vs Obesity Level
  • Countplot of Obesity Classes
  • Feature importance charts
  • Correlation heatmap

📂 Project Structure

📦 Obesity-Prediction-ML
├── data/
│   ├── train_dataset.csv
│   └── test_dataset.csv
├── models/
│   ├── svm_model.pkl
│   └── scaler.pkl
├── obesity_prediction.ipynb / .py
└── README.md
