fergarcat/multiclass_prediction_obesity_risk

🧬 Keep In Shape: Multiclass Prediction of Obesity Risk 🩺

ObesityRiskML Logo

Health · Predictive AI · Multiclass Classification

Behind every piece of data, a type of obesity. Behind the model, a real solution.

Welcome to Keep In Shape, an application that uses multiclass ML techniques to predict people's obesity risk based on lifestyle and eating habits data. This tool is designed to help healthcare professionals and users interested in monitoring and managing obesity-related risks.

📚 Table of Contents

  1. 🧠 Intelligence for Health
  2. 🔍 Project Background
  3. 💻 Key Features
  4. ⚙️ Tech Stack
  5. 📁 Project Structure
  6. 🚀 Installation & Usage
  7. 📊 About our model
  8. 👥 Development Team
  9. 🤝 Contributing

🧠 Intelligence for Health

Check out the demo of our app


🔍 Project Background

Our technical report gives a detailed overview of the project, covering:

  • Exploratory data analysis
  • Algorithm experimentation (XGBoost)
  • Model evaluation metrics
  • Development of the web application

💻 Key Features

  • Interactive dashboard
  • User scenario simulator
  • Personalized health recommendations
  • Persistent prediction history (Supabase)
  • Intuitive user interface
  • Dockerized deployment

⚙️ Tech Stack

Python Dash Pandas NumPy scikit-learn XGBoost SQLAlchemy Matplotlib Seaborn Docker


📁 Project Structure

multiclass_prediction_obesity_risk/
├── client/                      # User interface
│   └── media/                   # Static assets (images, etc.)
├── data/                        # Project data
│   ├── raw/                     # Original dataset
│   └── processed/               # Processed dataset
├── docs/                        # Project documentation
├── server/                      # Backend and model logic
│   ├── model/                   # Trained models and utilities
│   └── utils/                   # Helper functions
├── tests/                       # Automated tests
├── .gitignore                   # Files ignored by Git
├── pyproject.toml               # Project configuration
├── requirements.txt             # Project dependencies
├── README.md                    # Main documentation
└── Dockerfile                   # Containerization setup


🚀 Installation & Usage

1️⃣ Clone the Repository

git clone https://github.com/fergarcat/multiclass_prediction_obesity_risk.git
cd multiclass_prediction_obesity_risk

2️⃣ Create and Activate a Virtual Environment

python3 -m venv .venv

macOS/Linux:

source .venv/bin/activate

Windows:

.venv\Scripts\activate

3️⃣ Install Dependencies

pip install -r requirements.txt

💡 TIP:
Use pip list to see all installed dependencies.

4️⃣ Set Up Environment Variables

Duplicate the env_example file and rename the copy to .env.
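
The copy step can also be scripted so that it behaves the same on macOS, Linux, and Windows. The following is a minimal sketch using only the Python standard library. The helper name `create_env_file` is hypothetical and not part of this repository, and the sketch assumes `env_example` sits in the directory it is called from.

```python
import shutil
from pathlib import Path


def create_env_file(template="env_example", target=".env"):
    """Copy the template to the target unless the target already exists.

    Hypothetical helper, not part of the repository.
    Returns True if a new file was written, False if an existing one was kept.
    """
    template_path = Path(template)
    target_path = Path(target)
    if target_path.exists():
        # Never overwrite a .env that may already hold real values.
        return False
    if not template_path.is_file():
        raise FileNotFoundError(f"Template not found: {template_path}")
    shutil.copyfile(template_path, target_path)
    return True
```

Calling `create_env_file()` from the repository root would create `.env` on the first run and leave it untouched on later runs.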

5️⃣ Run the Dashboard

python run_client.py

6️⃣ 🚀 Deploy with Docker

Run the following command to build and start the containers:

docker-compose up --build

7️⃣ Run the Tests

python -m unittest discover tests
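
By default, `unittest` discovery collects files whose names match `test*.py`. As a sketch of what a discoverable test could look like, here is a hypothetical `tests/test_labels.py`. The helper `to_label` and the class names are illustrative assumptions, not code or labels taken from this repository.

```python
import unittest

# Hypothetical class names, used only to make the example concrete.
CLASS_NAMES = ["Normal_Weight", "Overweight", "Obesity"]


def to_label(class_index):
    """Map a predicted class index to its name (illustrative helper)."""
    if not 0 <= class_index < len(CLASS_NAMES):
        raise ValueError(f"Unknown class index: {class_index}")
    return CLASS_NAMES[class_index]


class TestToLabel(unittest.TestCase):
    def test_known_index(self):
        # A valid index maps to the matching name.
        self.assertEqual(to_label(0), "Normal_Weight")

    def test_out_of_range_index(self):
        # An index past the end of the list is rejected.
        with self.assertRaises(ValueError):
            to_label(len(CLASS_NAMES))
```

In a real test module the helper would be imported from the application code rather than defined next to the test.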

📊 About our model

We compared several models to determine which one performed best. You can see the results in this notebook.

| Model | Accuracy | Precision | Recall | F1-Score | Train Time (s) | Overfitting |
|---|---|---|---|---|---|---|
| XGBoost | 0.9073 | 0.9070 | 0.9073 | 0.9071 | 2.4346 | 0.0769 |
| CatBoost | 0.9061 | 0.9056 | 0.9061 | 0.9058 | 35.4323 | 0.0483 |
| LightGBM | 0.9051 | 0.9048 | 0.9051 | 0.9049 | 4.7564 | 0.0711 |
| Random Forest | 0.9037 | 0.9030 | 0.9037 | 0.9031 | 3.8831 | 0.0962 |
| SVM (RBF) | 0.8832 | 0.8822 | 0.8832 | 0.8826 | 5.1034 | 0.0137 |
| SVM (Linear) | 0.8697 | 0.8684 | 0.8697 | 0.8688 | 5.1625 | -0.0029 |
| Logistic Regression | 0.8639 | 0.8624 | 0.8639 | 0.8629 | 0.7557 | -0.0013 |
| Decision Tree | 0.8408 | 0.8404 | 0.8408 | 0.8404 | 0.1659 | 0.1591 |
| KNN | 0.7936 | 0.7906 | 0.7936 | 0.7908 | 0.2451 | 0.0574 |
| Naive Bayes | 0.7755 | 0.7710 | 0.7755 | 0.7691 | 0.0260 | -0.0024 |
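
In every row of the comparison table, Accuracy and Recall are identical. That is what support-weighted averaging of per-class scores produces (for example, `average="weighted"` in scikit-learn), so the sketch below assumes the metrics were averaged that way; the README does not state this. It uses only the standard library, and the function name `weighted_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total         # class share of the evaluated samples
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / total
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only; not taken from the project's dataset.
y_true = ["Normal", "Normal", "Obesity_I", "Obesity_I", "Obesity_II", "Overweight"]
y_pred = ["Normal", "Obesity_I", "Obesity_I", "Obesity_I", "Obesity_II", "Normal"]
scores = weighted_scores(y_true, y_pred)
```

Weighted recall sums each class's correct predictions divided by the total sample count, which is the definition of accuracy, so the two values always coincide under this averaging.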

We identified the following two models as optimal and adjusted them:

| Model | Accuracy | Precision | Recall | F1-Score | Train Time (s) | Overfitting |
|---|---|---|---|---|---|---|
| XGBoost (Adjusted) | 0.9053 | 0.9048 | 0.9053 | 0.9050 | 5.4682 | 0.0097 |
| Logistic Regression (Adjusted) | 0.8639 | 0.8639 | 0.8639 | 0.8633 | 0.7708 | 0.0002 |

Finally, we retrained XGBoost, the fastest of the four top-scoring models, with Optuna to reduce overfitting (0.0769 in the baseline run, 0.0185 after tuning).

| Model | Accuracy | Precision | Recall | F1-Score | Train Time (s) | Overfitting |
|---|---|---|---|---|---|---|
| XGBoost + Optuna | 0.9118 | 0.9117 | 0.9118 | 0.9116 | 4.1328 | 0.0185 |

👥 Development Team

| Name | GitHub |
|---|---|
| Fernando García Catalán | fergarcat |
| Anca Bacria | a-bac-0 |
| Omar Lengua | Omarlsant |
| Abigail Masapanta | abbyenredes |

🀝 Contributing

Contributions are welcome! To contribute:

  1. Fork this repository.

  2. Create a new branch:

    git checkout -b feature/new-feature

  3. Make your changes and commit them:

    git commit -m "Add new feature"

  4. Push the branch to your fork and submit a pull request 🚀


🚀 Thank You for Using Keep In Shape!

If you have any questions, feel free to open an issue in the repository or contact us.
