Pima Diabetes Prediction

Overview

This project implements a complete machine learning pipeline to predict diabetes risk using the Pima Indians dataset. It covers data preprocessing, model building, validation, optimization, and deployment via a FastAPI-based API.

Contents:

Scripts
Notebooks
Deployment

Online Project
For more details and updates about this project, visit this website.

Scripts

data_preprocessing.py

Loads the raw data and replaces zero values with NaN for key features.
Generates 8 preprocessing combinations (with or without outlier removal, class balancing, and scaling).
Saves the processed CSV files with descriptive names.
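
As a rough illustration of the steps above, the sketch below marks zeros as missing and enumerates the 2 x 2 x 2 = 8 on/off combinations of the three optional steps. The column list, the step names, and the file-name pattern are assumptions made for this example; the actual choices live in data_preprocessing.py.

```python
from itertools import product

import numpy as np
import pandas as pd

# Assumed "key features": columns where a value of 0 is not a plausible measurement.
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def zeros_to_nan(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which zeros in the key columns are marked as missing."""
    out = df.copy()
    out[ZERO_AS_MISSING] = out[ZERO_AS_MISSING].replace(0, np.nan)
    return out


def preprocessing_variants():
    """Yield (flags, filename) for every on/off combination of the three steps."""
    steps = ("outliers_removed", "balanced", "scaled")  # hypothetical step names
    for flags in product((False, True), repeat=len(steps)):
        enabled = [name for name, on in zip(steps, flags) if on]
        suffix = "_".join(enabled) if enabled else "base"
        # Hypothetical naming pattern, e.g. "diabetes_balanced_scaled.csv".
        yield dict(zip(steps, flags)), f"diabetes_{suffix}.csv"


# Two made-up rows, only to exercise the functions.
df = pd.DataFrame({
    "Glucose": [148, 0], "BloodPressure": [72, 66], "SkinThickness": [35, 0],
    "Insulin": [0, 94], "BMI": [33.6, 0.0], "Age": [50, 31],
})
clean = zeros_to_nan(df)
variants = list(preprocessing_variants())
```

Each entry in variants pairs a flag dictionary with a file name, so a loop over it could apply the enabled steps and write one CSV per combination.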

pipeline.py

Configures preprocessing with simple and iterative imputation via a ColumnTransformer.
Integrates a pre-trained classifier into a unified pipeline.
Trains the model using processed data and saves the pipeline.
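
The structure described above might look roughly like the following scikit-learn sketch. Which columns receive which imputer, the median strategy, and the LogisticRegression stand-in for the classifier are all assumptions for illustration; pipeline.py defines the real configuration.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical column positions in the feature matrix (not taken from the repo).
SIMPLE_COLS = [0, 1]     # e.g. Glucose, BMI
ITERATIVE_COLS = [2, 3]  # e.g. SkinThickness, Insulin


def build_pipeline(classifier=None) -> Pipeline:
    """Combine per-column imputation and a classifier into one estimator."""
    preprocess = ColumnTransformer(
        transformers=[
            ("simple", SimpleImputer(strategy="median"), SIMPLE_COLS),
            ("iterative", IterativeImputer(random_state=0), ITERATIVE_COLS),
        ],
        remainder="passthrough",  # remaining columns pass through unchanged
    )
    # Stand-in classifier; the project selects its own model.
    clf = classifier if classifier is not None else LogisticRegression(max_iter=1000)
    return Pipeline([("preprocess", preprocess), ("classifier", clf)])


# Synthetic data with missing values in the first four columns only.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = (X[:, 0] + X[:, 4] > 0).astype(int)
mask = rng.random((40, 4)) < 0.1
X[:, :4][mask] = np.nan

pipeline = build_pipeline()
pipeline.fit(X, y)
# joblib.dump(pipeline, "pipeline.joblib") would persist the fitted pipeline.
```

Keeping imputation inside the pipeline means the same fitted transformations are applied at prediction time, so the API can pass raw feature values with missing entries directly to the saved object.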

Notebooks

01_data_preprocessing.ipynb: Data import, cleaning, exploration, and normalization.
02_data_model_validation.ipynb: Splitting data, model evaluation with metrics, and comparison.
03_model_optimization.ipynb: Hyperparameter tuning and robust model validation.
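
For readers who want a feel for the validation and tuning steps before opening the notebooks, here is a minimal sketch using scikit-learn. The synthetic data, the LogisticRegression model, the parameter grid, and the chosen metrics are assumptions for this example and are not taken from the notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for the processed dataset (8 features, binary target).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hold out a stratified test set for the final comparison.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Cross-validated grid search over a small, illustrative grid.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out data.
best = search.best_estimator_
scores = {
    "f1": f1_score(y_test, best.predict(X_test)),
    "roc_auc": roc_auc_score(y_test, best.predict_proba(X_test)[:, 1]),
}
```

Tuning only on the training folds and scoring once on the held-out split keeps the reported metrics independent of the hyperparameter search.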

Diagrams illustrate the ML pipeline architecture and best model performance.

Deployment

API Overview:
A FastAPI-based service for diabetes prediction.
Key Endpoints:
- GET /: Health check.
- POST /predict: Returns a binary prediction.
- POST /predict_proba: Returns class probability scores.
Deployment Methods:
Includes instructions for local Docker deployment and scalable deployment on AWS Elastic Beanstalk.
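
A client for the two POST endpoints could be sketched as follows, using only the Python standard library. The JSON field names mirror the public Pima dataset columns and the base URL is a local placeholder; both are assumptions, since the real request schema is defined by the API code.

```python
import json
from urllib import request

# Hypothetical payload: field names follow the public dataset's columns,
# values are made up. The API's actual schema may differ.
EXAMPLE_PATIENT = {
    "Pregnancies": 2,
    "Glucose": 130,
    "BloodPressure": 70,
    "SkinThickness": 25,
    "Insulin": 100,
    "BMI": 31.2,
    "DiabetesPedigreeFunction": 0.45,
    "Age": 42,
}


def build_request(base_url: str, endpoint: str, features: dict) -> request.Request:
    """Build a JSON POST request for /predict or /predict_proba."""
    return request.Request(
        url=f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}",
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(base_url: str, endpoint: str, features: dict, timeout: float = 10.0):
    """Send the request and decode the JSON response body."""
    req = build_request(base_url, endpoint, features)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, assuming the service is running locally on port 8000:
#   call_api("http://localhost:8000", "/predict", EXAMPLE_PATIENT)
#   call_api("http://localhost:8000", "/predict_proba", EXAMPLE_PATIENT)
```

Separating request construction from sending makes the payload easy to inspect without a running server.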

Contributing

We welcome contributions! If you have ideas, improvements, or fixes, please feel free to fork the repository and submit pull requests. Your involvement helps make this project even better.

haroldeustaquio/Pima-Diabetes-Prediction
