A healthcare-focused machine learning project that predicts the likelihood of diseases such as diabetes or heart disease from medical records. It covers data exploration, training of several model families, and evaluation with F1 score and AUC-ROC to measure how well each model separates positive from negative cases.
📄 Includes a complete Report.pdf and explanatory ProjectVideo.mp4 for academic or professional reference.
📦 Repository: https://github.com/ahsankhizar5/disease-diagnosis-prediction.git
- 📊 Exploratory Data Analysis (EDA) to understand feature-disease relationships
- 🔍 Feature selection and preprocessing (scaling, encoding)
- 🤖 Models: Gradient Boosting, SVM, and Neural Networks
- 📈 Performance evaluation using F1 Score and AUC-ROC
- 🩺 Clinical insights for early detection and prevention
- 📄 Report and Presentation Video included
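
To make the preprocessing and evaluation steps listed above concrete, here is a minimal sketch using scikit-learn. It is not taken from Code.ipynb: the data is synthetic, the column names (`age`, `glucose`, `bmi`, `sex`, `disease`) are hypothetical, and scikit-learn's `GradientBoostingClassifier` stands in for whichever gradient boosting implementation the notebook actually uses.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for medical records; all column names are hypothetical.
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "age": rng.integers(20, 80, n),
    "glucose": rng.normal(110, 25, n),
    "bmi": rng.normal(27, 5, n),
    "sex": rng.choice(["F", "M"], n),
})
# Label loosely tied to glucose and BMI so the model has some signal to learn.
risk = 0.04 * (df["glucose"] - 110) + 0.1 * (df["bmi"] - 27) + rng.normal(0, 1, n)
df["disease"] = (risk > 0).astype(int)

X = df.drop(columns="disease")
y = df["disease"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Scale numeric features and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "glucose", "bmi"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", GradientBoostingClassifier(random_state=42)),
])
model.fit(X_train, y_train)

# F1 uses hard class predictions; AUC-ROC uses the positive-class probability.
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]
f1 = f1_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_prob)
print(f"F1: {f1:.3f}  AUC-ROC: {auc:.3f}")
```

Keeping the scaler and encoder inside the `Pipeline` means they are fitted on the training split only, which avoids leaking test-set statistics into the model.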
git clone https://github.com/ahsankhizar5/disease-diagnosis-prediction.git
cd disease-diagnosis-prediction
pip install pandas numpy matplotlib seaborn scikit-learn xgboost keras
✅ Python 3.7+ recommended. Ensure you have TensorFlow installed for Keras support.
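
If you want to confirm the environment before opening the notebook, a small check along these lines may help. It is a sketch, not part of the repository: the `REQUIRED` mapping and the `missing_packages` helper are illustrative names, and the check only verifies that each package can be found, not that its version is compatible.

```python
from importlib.util import find_spec

# Maps the pip package name to the module name used in `import` statements.
# tensorflow is listed because Keras relies on it, as noted above.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "xgboost": "xgboost",
    "keras": "keras",
    "tensorflow": "tensorflow",
}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be located."""
    return [
        pip_name
        for pip_name, import_name in required.items()
        if find_spec(import_name) is None  # find_spec does not import the module
    ]

missing = missing_packages()
if missing:
    print("Missing packages: pip install " + " ".join(missing))
else:
    print("All required packages were found.")
```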
Launch Code.ipynb in Jupyter Notebook or a compatible IDE and run the cells in order to execute the training and evaluation pipeline.
- Python
- Pandas, NumPy – Data manipulation
- Seaborn, Matplotlib – Visualization
- scikit-learn, XGBoost – ML models and evaluation
- Keras – Neural Network implementation
├── Code.ipynb
├── Report.pdf
└── ProjectVideo.mp4
- Fork the repo
- Create a branch
  git checkout -b feature/your-feature
- Commit your changes
  git add .
  git commit -m "Add your feature"
- Push and submit a PR
  git push origin feature/your-feature
MIT License — free to use, modify, and distribute.
If this project helped you, inspired you, or saved you time — consider giving it a ⭐ on GitHub!
🧠 "Predicting disease early is not just prevention — it’s empowerment."