This project uses machine learning to analyze customer financial details and predict whether a loan application is likely to be approved or rejected. The data comes from the Loan Approval Prediction Dataset on Kaggle.
The goal is to build classification models that predict loan approval outcomes using supervised machine learning. The project covers data preprocessing, model training, evaluation, and model selection.
- Logistic Regression
- Decision Tree Classifier
- Hyperparameter tuning using `GridSearchCV`
- Cross-validation
- Evaluation metrics: Accuracy, Precision, Recall, F1-score, ROC AUC
- Data visualization (EDA & Feature Importance)
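As a rough illustration of how the tuning and cross-validation items above fit together, here is a minimal sketch using scikit-learn. It runs on synthetic data from `make_classification`, not the Kaggle dataset, and the parameter grid, fold count, and `f1` scoring are assumptions chosen for the example rather than the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (NOT the Kaggle dataset).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hypothetical search grid; the notebook's actual grid may differ.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}

# 5-fold stratified cross-validation keeps the class ratio in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=cv,
    scoring="f1",
)
search.fit(X, y)

print(search.best_params_)           # best combination found on the grid
print(round(search.best_score_, 3))  # mean cross-validated F1 for that combination
```

The same pattern should work for Logistic Regression by swapping the estimator and using a grid over a parameter such as `C`.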
The dataset includes details such as:
- Applicant income
- Loan amount
- Loan term
- CIBIL score
- Asset values
- Education level
- Employment status
- Target variable: `loan_status` (Approved / Rejected)
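To make the target variable concrete, the sketch below separates features from `loan_status` and encodes the label as 1/0. The frame is a tiny made-up example: every column name other than `loan_status`, all values, and the stray whitespace in the labels are assumptions for illustration and may not match the CSV.

```python
import pandas as pd

# Toy rows for illustration only; names and values are assumptions,
# not taken from loan_approval_dataset.csv.
df = pd.DataFrame(
    {
        "income": [60_000, 25_000, 80_000],
        "loan_amount": [150_000, 200_000, 100_000],
        "cibil_score": [780, 410, 690],
        "loan_status": [" Approved", " Rejected", "Approved"],
    }
)

# Normalize the label text, then map it to a binary target.
y = df["loan_status"].str.strip().map({"Approved": 1, "Rejected": 0})

# Everything except the target becomes the feature matrix.
X = df.drop(columns="loan_status")

print(y.tolist())
print(list(X.columns))
```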
- Preprocessing
  - Handling missing values
  - Encoding categorical variables
  - Feature scaling
- Modeling
  - Trained Logistic Regression and Decision Tree models
  - Performed cross-validation and hyperparameter tuning
- Evaluation
  - Compared models using key metrics and visualizations
  - Analyzed feature importance
- Conclusion
  - Decision Tree selected as the final model based on performance
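The workflow above can be sketched end to end with a scikit-learn `Pipeline`. This is one possible arrangement under stated assumptions, not the notebook's code: the data is synthetic, the column names are hypothetical, the toy labeling rule exists only so the models have something to learn, and median imputation and one-hot encoding are example choices for the preprocessing steps.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; column names are assumptions.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "loan_amount": rng.normal(200_000, 60_000, n),
    "cibil_score": rng.integers(300, 900, n).astype(float),
    "education": rng.choice(["Graduate", "Not Graduate"], n),
})
y = (df["cibil_score"] > 550).astype(int)  # toy rule, not a real approval policy
df.loc[rng.choice(n, 20, replace=False), "income"] = np.nan  # inject missing values

numeric = ["income", "loan_amount", "cibil_score"]
categorical = ["education"]

# Impute + scale numeric columns, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}

results = {}
for name, model in models.items():
    # Each model gets its own copy of the preprocessing steps.
    clf = Pipeline([("preprocess", clone(preprocess)), ("model", model)])
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    proba = clf.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
    }

print(pd.DataFrame(results).T.round(3))
```

Keeping the preprocessing inside the pipeline means the imputer and scaler are fit on the training split only, which avoids leaking test-set statistics into the model.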
- Loan status distribution
- Boxplot: Annual income vs Loan status
- Correlation heatmap
- Feature importance (Decision Tree)
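The quantities behind these four visualizations can be computed with pandas and scikit-learn before any plotting. The sketch below leaves the actual charts out and uses synthetic data with assumed column names, so the numbers it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; column names are assumptions.
rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "loan_amount": rng.normal(200_000, 60_000, n),
    "cibil_score": rng.integers(300, 900, n),
})
# Toy labeling rule so the example has two classes.
df["loan_status"] = np.where(df["cibil_score"] > 550, "Approved", "Rejected")

features = ["income", "loan_amount", "cibil_score"]

# Loan status distribution: share of each class.
status_share = df["loan_status"].value_counts(normalize=True)

# Boxplot input: income summary statistics per loan status.
income_by_status = df.groupby("loan_status")["income"].describe()

# Correlation heatmap input: pairwise correlations of numeric features.
corr = df[features].corr()

# Feature importance from a fitted Decision Tree.
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(df[features], df["loan_status"])
importance = pd.Series(tree.feature_importances_, index=features)
importance = importance.sort_values(ascending=False)

print(status_share.round(3))
print(importance.round(3))
```

Each result can then be handed to a plotting library of choice, for example a bar chart for `status_share` and `importance` and a heatmap for `corr`.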
See the full report in the `loan_approval_prediction.ipynb` notebook or the exported PDF.
Supervised_ML/
- Dataset
  - loan_approval_dataset.csv
- Loan_approval_prediction
  - loan_approval_prediction.ipynb
- Report.pdf
- README.md