Course: IS252.O22.HTCL - Data Mining
Lecturer: PhD. Cao Thi Nhan
Instructor: MSc. Vu Minh Sang
Semester: 2, 2023-2024
No. | Student ID | Full Name |
---|---|---|
1 | 21520596 | Tran Thi Kim Anh |
2 | 21521049 | Ho Quang Lam |
3 | 21521586 | Le Thi Le Truc (Leader) |
4 | 21521882 | Le Minh Chanh |
Project Title: DIABETES PREDICTION
Dataset: Diabetes Prediction | Kaggle
The project preprocesses an existing dataset and trains diabetes prediction models with five algorithms: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree, Random Forest, and XGBoost. Each trained model is saved as a .pkl file. A web application then takes patient data as input and uses the saved models to predict the likelihood of diabetes.
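The train-then-save step described above could look roughly like the sketch below. It is illustrative only, not the project's actual code: the data is synthetic (`make_classification` stands in for the preprocessed Kaggle dataset), the hyperparameters and file names are assumptions, and XGBoost is left out because it needs the separate `xgboost` package.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed dataset
# (assumption: 8 numeric features and a binary diabetes label).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Distance- and margin-based models get a scaler in front of them;
# tree-based models do not need feature scaling.
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "svm": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# A temporary directory keeps the sketch side-effect free;
# a real project would use a fixed folder of its own choosing.
model_dir = Path(tempfile.mkdtemp())
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out split
    with open(model_dir / f"{name}.pkl", "wb") as f:
        pickle.dump(model, f)

# Reload one model to confirm the saved file is usable for prediction.
with open(model_dir / "random_forest.pkl", "rb") as f:
    reloaded = pickle.load(f)
```

Saving the scaler and the classifier together as one pipeline means the web application can pass raw input values to the loaded object without repeating the preprocessing by hand.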
- Model Building: Python with scikit-learn for K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree, and Random Forest. XGBoost is provided by the separate xgboost library, which offers a scikit-learn-compatible interface.
- Web Application: Python with Streamlit for an interactive web interface that predicts diabetes from the patient data entered.
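A minimal sketch of how a Streamlit page might load a saved model and show a prediction is given below. The input fields (`Age`, `BMI`, `Blood glucose level`), the default path `models/random_forest.pkl`, and the helper names are hypothetical. The real feature set and order depend on the dataset, and the loaded model is assumed to expose `predict_proba`.

```python
import pickle


def load_model(path):
    """Load a pickled estimator. Only unpickle files you created yourself."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_probability(model, features):
    """Return the predicted probability of the positive class for one patient."""
    # predict_proba returns one row per sample; column 1 is assumed
    # to correspond to the "diabetic" class.
    return float(model.predict_proba([features])[0][1])


def render_app(model_path="models/random_forest.pkl"):
    """Draw the input form and show a prediction (path and fields are hypothetical)."""
    import streamlit as st  # imported lazily so the helpers above work without Streamlit

    st.title("Diabetes Prediction")
    age = st.number_input("Age", min_value=0, max_value=120, value=40)
    bmi = st.number_input("BMI", min_value=0.0, max_value=100.0, value=25.0)
    glucose = st.number_input("Blood glucose level", min_value=0, max_value=500, value=100)

    if st.button("Predict"):
        model = load_model(model_path)
        # The feature order must match the order used during training.
        prob = predict_probability(model, [age, bmi, glucose])
        st.write(f"Estimated probability of diabetes: {prob:.1%}")
```

Streamlit executes a script from top to bottom, so to serve this page one would add a final `render_app()` call to the file and start it with `streamlit run app.py`, where `app.py` is a placeholder file name.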