loan-approval-predictor

Predicting loan approvals with probabilistic modeling to support financial risk mitigation and smarter lending decisions

Project Duration: Mar 1, 2025 - Apr 1, 2025

🎯 Problem Statement

Financial institutions need accurate tools to assess the creditworthiness of applicants. The goal of this project was to build a binary classification model that predicts whether a loan application will be approved. The challenge was hosted on Kaggle as Playground Series - Season 4, Episode 10. Participants submitted predicted probabilities of loan approval, and submissions were evaluated on Area Under the ROC Curve (AUC-ROC), where a higher score indicates better model performance.
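As a small illustration of the metric, the sketch below computes AUC-ROC with scikit-learn's `roc_auc_score` on made-up labels and probabilities (the values are illustrative, not competition data). It also shows that AUC depends only on how the predictions are ranked, so a monotonic rescaling of the probabilities leaves the score unchanged.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy labels and predicted approval probabilities (illustrative values only).
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.10, 0.40, 0.35, 0.80])

# 3 of the 4 (positive, negative) pairs are ranked correctly -> 0.75
auc = roc_auc_score(y_true, y_prob)
print(f"AUC-ROC: {auc:.2f}")

# AUC is rank-based: squaring the probabilities preserves their order,
# so the score does not change.
auc_rescaled = roc_auc_score(y_true, y_prob ** 2)
print(f"AUC-ROC after rescaling: {auc_rescaled:.2f}")
```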

🧩 Top Approach

You can explore the complete methodology in my notebook:
🔗 PS4E10 - Solution 2

Key steps followed:

📊 Data Augmentation & Preprocessing:
- Merged the official competition training data with the original open-source credit risk dataset to increase training volume.
- Aligned the columns of the two datasets and handled null values to keep features consistent.
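A minimal sketch of this kind of merge with pandas is shown below. The column names (`person_age`, `loan_amnt`, `loan_status`), the tiny stand-in frames, and the choice to drop rows with nulls are assumptions for illustration; the notebook may align and clean the data differently.

```python
import pandas as pd

TARGET = "loan_status"  # assumed target column name

# Tiny stand-in frames; in practice these would be read from CSV files.
train = pd.DataFrame({
    "id": [0, 1],
    "person_age": [25, 40],
    "loan_amnt": [5000, 12000],
    "loan_status": [0, 1],
})
original = pd.DataFrame({
    "person_age": [31, 22, 22],
    "loan_amnt": [8000, None, None],
    "loan_status": [0, 1, 1],
})

# Keep only the feature columns both sources share, plus the target.
features = [c for c in train.columns if c in original.columns and c != TARGET]
cols = features + [TARGET]

# Stack the two sources row-wise on the aligned columns.
combined = pd.concat([train[cols], original[cols]], ignore_index=True)

# Simple null handling for this sketch: drop rows with missing values.
combined = combined.dropna().reset_index(drop=True)
print(combined.shape)
```

Selecting the shared columns before concatenating avoids pandas silently filling the `id` column with NaN for rows that come from the original dataset.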
🧠 Model Stacking with OOFs & Ensembling:
- Leveraged pre-computed out-of-fold (OOF) predictions and test set predictions from diverse base models, including LightGBM, CatBoost, TabNet, and others.
- Combined OOFs for robust out-of-sample validation.
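For readers unfamiliar with OOF predictions, the sketch below shows one way they can be produced and combined, using scikit-learn's `cross_val_predict`. The synthetic data and the two stand-in models (logistic regression and a random forest) are assumptions chosen to keep the example light; the project itself used pre-computed OOFs from models such as lgbm, catboost, and tabnet.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in data (not the competition data).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Stand-in base models for the sketch.
models = {
    "logreg": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Each row's prediction comes from a model fitted without that row.
oof = {
    name: cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    for name, model in models.items()
}

# One column per base model: the input a blending step would work from.
oof_matrix = np.column_stack([oof[name] for name in models])

for name in models:
    print(name, round(roc_auc_score(y, oof[name]), 4))
```

Because every OOF value is an out-of-sample prediction, scoring or blending on `oof_matrix` gives a less optimistic estimate than scoring on in-sample predictions would.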
🧮 Hill Climbing Ensembling:
- Applied a hill climbing algorithm to greedily search for a high-scoring blend of base models by maximizing the AUC on validation data.
- Iteratively selected models that improved validation AUC until no further gain was observed.
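The sketch below shows one common variant of hill climbing for ensembling: greedy selection with replacement, where the blend is the average of the selected models and each model's weight is its share of the selections. This is an assumption about the general technique, not a copy of the notebook's code; the function name `hill_climb`, its parameters, and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def hill_climb(oof, y, max_iter=50, tol=1e-6):
    """Greedy ensemble selection with replacement.

    oof: array of shape (n_samples, n_models) with out-of-fold predictions.
    Returns (weights, best_auc); weights are non-negative and sum to 1.
    """
    n_models = oof.shape[1]
    scores = [roc_auc_score(y, oof[:, j]) for j in range(n_models)]

    # Start from the single best model.
    best = int(np.argmax(scores))
    counts = np.zeros(n_models)
    counts[best] = 1
    blend_sum = oof[:, best].copy()
    best_auc = scores[best]

    for _ in range(max_iter):
        total = counts.sum()
        # Score the blend obtained by adding one more copy of each model.
        trial = [
            roc_auc_score(y, (blend_sum + oof[:, j]) / (total + 1))
            for j in range(n_models)
        ]
        j_best = int(np.argmax(trial))
        if trial[j_best] <= best_auc + tol:
            break  # no candidate improves the validation AUC
        counts[j_best] += 1
        blend_sum += oof[:, j_best]
        best_auc = trial[j_best]

    return counts / counts.sum(), best_auc


# Synthetic demo: three noisy views of the same signal (not competition data).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)
oof = np.column_stack([y + rng.normal(0, s, size=1000) for s in (1.0, 1.5, 2.0)])

weights, blend_auc = hill_climb(oof, y)
single_best = max(roc_auc_score(y, oof[:, j]) for j in range(oof.shape[1]))
print("weights:", weights.round(3))
print("blend AUC:", round(blend_auc, 4), "best single AUC:", round(single_best, 4))
```

Because the search is greedy, the returned blend is not guaranteed to be globally optimal, and tuning weights on the same OOF predictions used for scoring can overfit when many base models are available.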
🧪 Final Prediction & Submission:
- Used the ensemble weights found via hill climbing to combine model predictions on the test set.
- Generated submission probabilities accordingly.
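The final blending step amounts to a weighted average of the base models' test predictions. The sketch below uses made-up predictions and weights; the `id` and `loan_status` column names and the output filename are assumptions about the submission format.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: test-set probabilities from three base models
# (one column per model) and blend weights chosen on validation data.
test_ids = np.array([0, 1, 2, 3])
test_preds = np.array([
    [0.90, 0.80, 0.85],
    [0.10, 0.20, 0.05],
    [0.40, 0.60, 0.50],
    [0.02, 0.03, 0.01],
])
weights = np.array([0.5, 0.3, 0.2])  # non-negative, sums to 1

# Weighted average of the base-model probabilities, one value per test row.
blended = test_preds @ weights

# Assumed submission layout: an id column and the predicted probability.
submission = pd.DataFrame({"id": test_ids, "loan_status": blended})
# submission.to_csv("submission.csv", index=False)  # uncomment to write the file
print(submission)
```

With non-negative weights that sum to 1, the blended values stay within the range of the individual predictions, so they remain valid probabilities.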

📈 Results

✅ Public Leaderboard Scores:
- Achieved scores of 0.97324, 0.97347, 0.97369, and 0.97285.
🏁 Private Leaderboard Scores:
- Final scores: 0.96855, 0.96886 (best), 0.96877, and 0.96712 for subsequent submissions.
🥇 Rank Achieved:
- Ranked 218th as a solo participant, out of 4080 participants across 3858 teams.

🧰 References

📂 Kaggle Competition: Loan Approval Prediction
📁 Dataset: Competition Data
📁 Source Dataset: Open Credit Risk Dataset

🛠️ Tech Stack

Language: Python 🐍
Libraries:
- pandas, numpy for data handling
- matplotlib, seaborn for basic EDA
- lightgbm, catboost, tabnet for modeling
- sklearn for evaluation and metrics
Tools:
- Jupyter Notebook 📓 for modeling
- Kaggle Kernels / Colab for experimentation

📌 This project demonstrates the power of ensembling and model optimization in achieving high-performance predictive modeling under AUC evaluation criteria.

