Loan Default Risk Analysis Project

🔍 Data-Driven Risk Profiling | 💡 Predictive Modeling | 📈 Visual Storytelling

🚀 Overview

This project explores the key factors contributing to loan defaults by applying a full data analysis pipeline, from data cleaning and exploratory data analysis (EDA) to feature engineering and machine learning modeling. The objective is to give financial institutions actionable insights for identifying high-risk borrowers and optimizing lending strategies.

🎯 Objectives

Clean and preprocess loan application data
Identify patterns and relationships using grouped visualizations
Apply feature engineering techniques for model readiness
Train and evaluate Logistic Regression and Random Forest models
Predict loan status (Default / Non-Default) accurately
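
As a rough illustration of the modeling objective, the sketch below trains and compares the two model types on synthetic data. It is not the notebook's actual code: the column names (`age`, `income`, `loan_amount`), the toy target rule, and the hyperparameters are all assumptions made for the example, so the printed scores say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data. Column names are hypothetical, not the project's schema.
rng = np.random.default_rng(42)
n = 1000
X = pd.DataFrame({
    "age": rng.integers(21, 65, size=n),
    "income": rng.lognormal(mean=10.5, sigma=0.5, size=n),
    "loan_amount": rng.lognormal(mean=9.0, sigma=0.6, size=n),
})

# Toy target (assumption): default (1) becomes more likely as the
# loan-to-income ratio grows, with some noise added.
ratio = X["loan_amount"] / X["income"]
y = ((ratio + rng.normal(0.0, 0.1, size=n)) > 0.3).astype(int)

# Stratify so both classes appear in train and test in similar proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

models = {
    # Logistic regression benefits from scaled inputs, so wrap it in a pipeline.
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }
    print(name, {k: v for k, v in results[name].items() if k != "confusion_matrix"})
```

Reporting precision and recall next to accuracy matters here because defaults are usually the minority class, so a model can score high accuracy while still missing most defaulters.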

📌 Project Highlights

✅ Cleaned missing values, handled outliers, and transformed skewed distributions
📊 Created grouped bar charts for Age, Income, Loan Amount, etc., vs Loan Status
🧹 Applied binning, label encoding, and log transformation techniques
🧠 Built two predictive models: Logistic Regression and Random Forest Classifier
🧪 Evaluated model performance using accuracy, confusion matrix, precision, and recall
🧾 Delivered insights in a clear, visual, and business-focused manner
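
The preprocessing steps listed above can be sketched with pandas as follows. This is a minimal example on a made-up six-row table, not the project's code: the column names, the fill strategy (median and mode), and the bin edges are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Tiny made-up table; column names are hypothetical.
df = pd.DataFrame({
    "age": [22, 35, 47, 58, 29, 41],
    "income": [18000.0, 52000.0, np.nan, 250000.0, 31000.0, 76000.0],
    "home_ownership": ["RENT", "OWN", "MORTGAGE", "RENT", None, "OWN"],
})

# Missing values: median for the numeric column, most frequent value
# for the categorical one (assumed strategy).
df["income"] = df["income"].fillna(df["income"].median())
df["home_ownership"] = df["home_ownership"].fillna(df["home_ownership"].mode()[0])

# Log transformation: log1p compresses the long right tail of income,
# which also softens the effect of very large values.
df["income_log"] = np.log1p(df["income"])

# Binning: turn a continuous age into ordered groups.
# Intervals are right-inclusive: (20, 30], (30, 45], (45, 60].
df["age_group"] = pd.cut(
    df["age"], bins=[20, 30, 45, 60], labels=["21-30", "31-45", "46-60"]
)

# Label encoding: map each category to an integer code.
# Categories are sorted alphabetically, so MORTGAGE=0, OWN=1, RENT=2.
df["home_ownership_code"] = df["home_ownership"].astype("category").cat.codes

print(df)
```

Integer codes from label encoding imply an ordering, which tree-based models such as Random Forest tolerate well; for Logistic Regression, one-hot encoding (for example `pd.get_dummies`) is often the safer choice.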

📊 Key Insights

Young and low-income applicants showed higher default risk
Loans taken for purposes such as education and medical expenses had elevated default rates
Home ownership status significantly affected risk levels
Higher loan amounts were associated with a higher chance of default
Random Forest outperformed Logistic Regression in predicting default cases
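
Group-level findings like these typically come from comparing default rates across categories. The sketch below shows one way to compute them with pandas. The eight rows are toy values invented for the example and are not results from this project; the column names `age_group` and `loan_status` are likewise assumptions.

```python
import pandas as pd

# Toy data invented for illustration only; not the project's results.
loans = pd.DataFrame({
    "age_group": ["21-30", "21-30", "21-30", "31-45",
                  "31-45", "46-60", "46-60", "46-60"],
    "loan_status": ["Default", "Default", "Non-Default", "Default",
                    "Non-Default", "Non-Default", "Non-Default", "Non-Default"],
})

# Counts of each loan status per group: the table behind a grouped bar chart.
counts = pd.crosstab(loans["age_group"], loans["loan_status"])

# Default rate per group: share of rows in each group marked "Default".
default_rate = (
    loans["loan_status"].eq("Default").groupby(loans["age_group"]).mean()
)

print(counts)
print(default_rate)
```

With Matplotlib installed, `counts.plot(kind="bar")` draws the counts as a grouped bar chart, one bar per loan status within each group.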

🛠️ Tools & Technologies

Category	Tools / Libraries
🐍 Programming	Python
📊 Data Analysis	Pandas, NumPy
📈 Visualization	Matplotlib, Seaborn
🤖 Machine Learning	Scikit-learn (LogisticRegression, RandomForestClassifier)
🧪 Environment	Jupyter Notebook

📁 Deliverables

Cleaned and preprocessed dataset
Grouped bar chart visualizations
Trained ML models (Logistic Regression and Random Forest)
Insightful visual storytelling
Jupyter Notebook with complete analysis pipeline

🙋‍♂️ Author

I’m Syed Hur Abbas Naqvi, a Certified Data Analyst skilled in Python, SQL, Microsoft Power BI, Excel, and Machine Learning.
I specialize in turning raw data into business intelligence that drives growth — from data cleaning & EDA to visualization & strategic insights.

🌐 Portfolio: https://hurabbas05.github.io/
🔗 LinkedIn: https://www.linkedin.com/in/hurabbas05/
📧 Email: syedhur572@gmail.com
📞 Phone: +923036098700

🌟 Star This Repo

If you found this project helpful, feel free to ⭐ star this repository to support and bookmark it!
