Welcome to the PPG Blood Glucose Diabetes Classification project!
This repository provides a pipeline for estimating blood glucose levels using Photoplethysmography (PPG) signals.
By combining advanced signal processing, machine learning, and visualizations, we enable non-invasive diabetes screening for modern preventive healthcare.
This project uses PPG, an optical technique for capturing blood volume changes, to predict blood glucose levels non-invasively.
The modular framework integrates signal engineering and machine learning, making it suitable for experimentation and real-world use.
- Data Processing: Preprocess raw PPG segments and extract physiological features
- Machine Learning Models: Random Forest, Gradient Boosting, SVM, LightGBM, Logistic Regression, and ensemble methods (Stacking and Voting classifiers)
- Visualizations: ROC curves, confusion matrices, and feature importance plots
- Evaluation: Subject-wise StratifiedGroupKFold cross-validation, which keeps each subject's segments in a single fold to prevent data leakage, so scores reflect performance on unseen subjects
PPG_Blood_Glucose_JB_Implementation/
├── datasets/
│   ├── ppg_bagging_tree_features.csv
│   ├── ppg_specific_features.csv
│   ├── processed_metadata.csv
│   ├── PPG-BP.xlsx
│   └── 0_subject/
├── models/
│   ├── random_forest.pkl
│   ├── svm.pkl
│   └── ...
├── outputs/
│   ├── confusion_matrix_randomforest.png
│   ├── roc_randomforest.png
│   └── ...
├── src/
│   ├── excel_handling.py
│   ├── data_preprocessing.py
│   ├── train_models.py
│   └── evaluate.py
├── requirements.txt
└── README.md
Let's get you up and running in no time!
- Python 3.8+
- pip (Python package manager)
- Git
git clone https://github.com/Spidy104/PPG_DIABETES_CLASSIFICATION
cd PPG_DIABETES_CLASSIFICATION
python -m venv venv
- On macOS/Linux:
source venv/bin/activate
- On Windows:
venv\Scripts\activate
pip install -r requirements.txt
Ensure the datasets/ folder contains:
- ppg_bagging_tree_features.csv: Extracted features from PPG signals using bagging tree methods for model training.
- ppg_specific_features.csv: Domain-specific physiological features derived from PPG signals.
- processed_metadata.csv: Metadata for each sample, such as subject IDs, timestamps, and labels (e.g., glucose levels).
- PPG-BP.xlsx: Raw and reference data, including PPG signals and corresponding blood pressure/glucose measurements.
- 0_subject/ (raw PPG signals by subject): Directory with raw PPG signal files, organized per subject for preprocessing.
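Before running the pipeline, it can help to confirm that the entries listed above are in place. The following is a small sketch using only the standard library. The helper name `missing_dataset_entries` is hypothetical and not part of the repository, and it checks only that each entry exists, not that its contents are valid.

```python
from pathlib import Path

# Entries the README expects inside datasets/.
REQUIRED = [
    "ppg_bagging_tree_features.csv",
    "ppg_specific_features.csv",
    "processed_metadata.csv",
    "PPG-BP.xlsx",
    "0_subject",
]


def missing_dataset_entries(root="datasets"):
    """Return the expected files or directories that are absent under root."""
    root = Path(root)
    return [name for name in REQUIRED if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_dataset_entries()
    if missing:
        print("Missing from datasets/:", ", ".join(missing))
    else:
        print("All expected dataset entries are present.")
```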
python src/excel_handling.py
python src/data_preprocessing.py
python src/feature_extraction.py
Run the following command to train all machine learning models:
python src/train_models.py
This will generate model files in the models/ directory:
models/
├── random_forest.pkl
├── svm.pkl
├── gradient_boosting.pkl
├── lightgbm.pkl
├── logistic_regression.pkl
├── stacking_classifier.pkl
└── voting_classifier.pkl
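To reuse the trained models outside the training script, the saved files can be loaded back into memory. The sketch below assumes the `.pkl` files were written with Python's standard `pickle` module. If the training script used `joblib` instead, `joblib.load` would be the matching call. The helper name `load_models` is hypothetical.

```python
import pickle
from pathlib import Path


def load_models(model_dir="models"):
    """Load every .pkl file in model_dir into a dict keyed by file stem.

    Only unpickle files you trust: pickle can execute arbitrary code on load.
    """
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return {}
    models = {}
    for path in sorted(model_dir.glob("*.pkl")):
        with path.open("rb") as fh:
            models[path.stem] = pickle.load(fh)
    return models
```

Calling `load_models()` from the repository root would return a dictionary with keys such as `random_forest` and `svm`, one per file in models/. Unpickling scikit-learn or LightGBM estimators also requires those libraries to be installed, ideally in the versions used for training.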
Evaluate the trained models and generate visualizations:
jupyter nbconvert --to notebook --execute src/second_model.ipynb
Results and plots will be saved in the outputs/ directory:
outputs/
├── gradient_boosting.jpg
├── LightGBM.jpg
├── Stacking_Classifier.jpg
├── Voting_Classifier.jpg
├── Model_performance.jpg
└── ROC_curves.jpg
Here's a sneak peek at the insights you'll get:
Below are the classification reports for each model (Model 2):
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.830 | 0.990 | 0.900 | 181 |
1 | 0.000 | 0.000 | 0.000 | 38 |
Accuracy | | | 0.820 | 219 |
Macro Avg | 0.410 | 0.500 | 0.450 | 219 |
Weighted Avg | 0.680 | 0.820 | 0.750 | 219 |

Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.830 | 1.000 | 0.910 | 181 |
1 | 0.000 | 0.000 | 0.000 | 38 |
Accuracy | | | 0.830 | 219 |
Macro Avg | 0.410 | 0.500 | 0.450 | 219 |
Weighted Avg | 0.680 | 0.830 | 0.750 | 219 |

Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.830 | 0.980 | 0.900 | 181 |
1 | 0.330 | 0.050 | 0.090 | 38 |
Accuracy | | | 0.820 | 219 |
Macro Avg | 0.580 | 0.520 | 0.490 | 219 |
Weighted Avg | 0.740 | 0.820 | 0.760 | 219 |

Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.830 | 1.000 | 0.910 | 181 |
1 | 0.000 | 0.000 | 0.000 | 38 |
Accuracy | | | 0.830 | 219 |
Macro Avg | 0.410 | 0.500 | 0.450 | 219 |
Weighted Avg | 0.680 | 0.830 | 0.750 | 219 |

Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.840 | 0.970 | 0.900 | 181 |
1 | 0.440 | 0.110 | 0.170 | 38 |
Accuracy | | | 0.820 | 219 |
Macro Avg | 0.640 | 0.540 | 0.540 | 219 |
Weighted Avg | 0.770 | 0.820 | 0.770 | 219 |

Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.830 | 0.990 | 0.900 | 181 |
1 | 0.000 | 0.000 | 0.000 | 38 |
Accuracy | | | 0.820 | 219 |
Macro Avg | 0.410 | 0.500 | 0.450 | 219 |
Weighted Avg | 0.680 | 0.820 | 0.750 | 219 |
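
To show how the per-class, macro, and weighted figures in the reports above relate to each other, here is a small worked sketch in plain Python. The confusion-matrix counts are inferred from the reports that list recall 1.000 for class 0 (support 181) and 0.000 for class 1 (support 38). They are a reconstruction for illustration, not output from the repository, and the function name `classification_summary` is hypothetical.

```python
def classification_summary(confusion):
    """Per-class and averaged metrics from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = []
    for k in range(n_classes):
        tp = confusion[k][k]
        support = sum(confusion[k])                   # true members of class k
        predicted = sum(row[k] for row in confusion)  # samples predicted as k
        # Precision is reported as 0 when the class is never predicted.
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append(
            {"precision": precision, "recall": recall, "f1": f1, "support": support}
        )

    metrics = ("precision", "recall", "f1")
    accuracy = sum(confusion[k][k] for k in range(n_classes)) / total
    # Macro: unweighted mean over classes. Weighted: mean weighted by support.
    macro = {m: sum(c[m] for c in per_class) / n_classes for m in metrics}
    weighted = {m: sum(c[m] * c["support"] for c in per_class) / total for m in metrics}
    return per_class, accuracy, macro, weighted


# Inferred counts (assumption): every sample is predicted as class 0.
confusion = [[181, 0],
             [38, 0]]
per_class, accuracy, macro, weighted = classification_summary(confusion)
print(f"accuracy     {accuracy:.2f}")
print("macro avg   ", {m: round(v, 2) for m, v in macro.items()})
print("weighted avg", {m: round(v, 2) for m, v in weighted.items()})
```

With these counts the accuracy is 181/219, about 0.83, while the macro-averaged recall is 0.50. A model that assigns every sample to class 0 reaches that accuracy, so the class 1 row and the macro averages are the figures to read when comparing models on this imbalanced split.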