awmirma/DS4031_project

Diabetes Prediction Project

Welcome to the Diabetes Prediction Project, a system that uses machine learning and deep learning to predict the likelihood of diabetes. This repository contains the code, data, and documentation for the project, which was completed as coursework at IUST (semester 4031).


📜 Project Description

The goal of this project is to leverage machine learning and deep learning models to predict diabetes outcomes using real-world clinical data. Various models are trained, evaluated, and optimized to identify the most accurate and efficient predictor.

Key Steps in the Project:

  • Data cleaning and preprocessing
  • Training multiple machine learning models
  • Hyperparameter tuning for performance optimization
  • Comparison of classical models with deep learning approaches
  • Ensemble techniques for improved accuracy
  • Visualizations and detailed analysis of model performance
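The preprocessing and multi-model training steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the synthetic dataset from `make_classification`, the feature count, the model settings, and the helper name `compare_models` are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical data (assumption: 8 numeric features,
# binary outcome). Replace with the project's real dataset.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=0
)

# Hypothetical candidate set mirroring the classical models named in this README.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "svm": SVC(),
    "knn": KNeighborsClassifier(n_neighbors=5),
}


def compare_models(models, X, y, cv=5):
    """Return the mean cross-validated accuracy for each model name."""
    results = {}
    for name, model in models.items():
        # Scaling lives inside the pipeline so it is refit on every training
        # fold, which avoids leaking validation data into the scaler.
        pipeline = make_pipeline(StandardScaler(), model)
        fold_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        results[name] = fold_scores.mean()
    return results


scores = compare_models(candidates, X, y)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Keeping the scaler inside the pipeline is a design choice worth preserving when the real data is swapped in, since tree-based models ignore scaling while SVM and KNN depend on it.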

🚀 Features

  • Data Preprocessing: Cleaned, scaled, and encoded data for efficient training.
  • Model Comparison: Logistic Regression, Random Forest, SVM, KNN, and Neural Networks.
  • Optimization: Hyperparameter tuning using GridSearchCV.
  • Ensemble Techniques: Gradient Boosting, AdaBoost, and Random Forest Ensembles.
  • Evaluation Metrics: Accuracy, AUC-ROC, Precision, Recall, F1-score, Confusion Matrix.
  • Visualizations: ROC curves, heatmaps, and decision boundaries.
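As a rough sketch of how the optimization and evaluation features above might fit together, the example below tunes a Gradient Boosting classifier with `GridSearchCV` and then computes the listed metrics on a held-out split. The synthetic data, the parameter grid, the split ratio, and the choice of `roc_auc` as the search criterion are assumptions for illustration, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the clinical data (assumption).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical, deliberately small grid so the search finishes quickly.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="roc_auc",
)
search.fit(X_train, y_train)

# Evaluate the refit best model on data the search never saw.
best_model = search.best_estimator_
y_pred = best_model.predict(X_test)
y_prob = best_model.predict_proba(X_test)[:, 1]

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "auc_roc": roc_auc_score(y_test, y_prob),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
cm = confusion_matrix(y_test, y_pred)

print("best params:", search.best_params_)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print("confusion matrix:")
print(cm)
```

Note that AUC-ROC is computed from predicted probabilities, while the other metrics use the thresholded class predictions.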

📊 Results

The Gradient Boosting model achieved the best performance:

  • Accuracy: 84%
  • AUC-ROC: 89%

Deep Learning models also performed well:

  • Accuracy: 82%
  • AUC-ROC: 87%
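The two metrics reported above measure different things: accuracy depends on the decision threshold applied to the predicted probabilities, while AUC-ROC measures how well the scores rank positive cases above negative ones. The toy example below illustrates the difference; the labels and scores are made up for illustration and are unrelated to the project's results.

```python
from sklearn.metrics import accuracy_score, roc_auc_score

# Made-up labels and predicted probabilities (illustrative only).
y_true = [0, 0, 1, 1, 1]
y_score = [0.2, 0.6, 0.4, 0.7, 0.9]

# Accuracy needs hard predictions, so a threshold (0.5 here) must be chosen.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]
accuracy = accuracy_score(y_true, y_pred)

# AUC-ROC uses the raw scores: it is the fraction of (positive, negative)
# pairs in which the positive case receives the higher score.
auc = roc_auc_score(y_true, y_score)

print(f"accuracy at threshold 0.5: {accuracy:.3f}")  # 3 of 5 correct -> 0.600
print(f"AUC-ROC: {auc:.3f}")  # 5 of 6 pairs ranked correctly -> 0.833
```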

📦 Installation

Install the dependencies, then run the end-to-end pipeline:

pip install -r requirements.txt
python main.py

🖥️ Usage

  • Use the notebooks for detailed exploratory analysis and model training.
  • The main.py script provides an end-to-end pipeline for training and evaluation.
  • Access visual outputs in the outputs directory.

🛠️ Tools & Technologies

  • Programming Language: Python
  • Libraries: Scikit-learn, Pandas, NumPy, Matplotlib, Seaborn, TensorFlow/Keras
  • Development Environment: Jupyter Notebooks, VS Code

📅 Academic Context

This project was developed as part of the Machine Learning course at IUST (semester 4031). It aimed to provide practical exposure to predictive modeling, machine learning algorithms, and optimization techniques.


🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to fork this repository and submit pull requests.


📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
