This repository contains an end-to-end data science project analyzing customer churn for a telecommunications company. Using real-world data, we built a machine learning model that predicts which customers are likely to cancel their service. Those predictions can help a business reduce churn and protect recurring revenue.
Churn (a customer leaving a service) is one of the most important metrics for subscription-based businesses. This project uses a dataset from a Telco provider to model churn risk based on customer demographics, billing data, and service usage.
- Understand which features are most predictive of churn
- Build and evaluate classification models
- Generate actionable business insights for retention strategies
| File | Description |
|---|---|
| data_wrangling.ipynb | Data cleaning and early exploration |
| eda.ipynb | Exploratory Data Analysis, feature distributions, statistical testing |
| Pre-processing Work and Model.ipynb | Data prep, feature selection, model building & evaluation |
| Cleansed_Telco_Customer_Churn.csv | Final modeling dataset (7% of rows removed for quality) |
| README.md | You're reading it :) |
- Data Cleaning: Removed noisy or inconsistent records
- Statistical Testing: Used t-tests to select significant features (p < 0.05)
- Feature Engineering: Dropped multicollinear features using VIF
- Modeling: Compared Logistic Regression, Random Forest, and SVC
- Evaluation: ROC AUC, precision/recall, confusion matrix (percent-based)
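
The statistical-testing and VIF steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the data is synthetic, the column names (`tenure`, `monthly`, `combo`, `noise`, `churn`) and the helper functions are hypothetical, Welch's t-test is an assumed variant, and the VIF cutoff of 5 is a common rule of thumb rather than a value stated in this project. VIF is computed here as the diagonal of the inverse correlation matrix, which is equivalent to 1 / (1 - R²) for each feature regressed on the others.

```python
import numpy as np
import pandas as pd
from scipy import stats


def significant_features(df, target, alpha=0.05):
    """Return numeric columns whose means differ between the two target classes."""
    churned = df[df[target] == 1]
    stayed = df[df[target] == 0]
    keep = []
    for col in df.select_dtypes("number").columns.drop(target):
        # Welch's t-test (assumed variant): no equal-variance assumption
        _, p_value = stats.ttest_ind(churned[col], stayed[col], equal_var=False)
        if p_value < alpha:
            keep.append(col)
    return keep


def vif_table(X):
    """VIF per column, taken from the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X.to_numpy(dtype=float), rowvar=False)
    return pd.Series(np.diag(np.linalg.inv(corr)), index=X.columns)


def drop_high_vif(X, threshold=5.0):
    """Drop the column with the largest VIF until every VIF is below threshold."""
    X = X.copy()
    while X.shape[1] > 1:
        vif = vif_table(X)
        if vif.max() < threshold:
            break
        X = X.drop(columns=vif.idxmax())
    return X


# Synthetic stand-in data (NOT the Telco dataset); all columns are hypothetical.
rng = np.random.default_rng(0)
n = 500
churn = rng.integers(0, 2, n)
tenure = rng.normal(40, 10, n) - 15 * churn      # churners have shorter tenure
monthly = rng.normal(60, 10, n) + 10 * churn     # churners have higher bills
combo = tenure + monthly + rng.normal(0, 1, n)   # deliberately collinear feature
noise = rng.normal(0, 1, n)                      # unrelated to churn by construction

df = pd.DataFrame({"tenure": tenure, "monthly": monthly,
                   "combo": combo, "noise": noise, "churn": churn})

selected = significant_features(df, target="churn")
reduced = drop_high_vif(df.drop(columns="churn"))
print("Significant at p < 0.05:", selected)
print("Remaining after VIF filter:", list(reduced.columns))
```

The two filters are independent, so they could be applied in either order; which order the notebook uses is not specified here.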
- Top Predictors: Tenure, MonthlyCharges, Contract type, and Security services
- Best Model: Random Forest with ROC AUC of 0.83
- Business Insight: Customers with high bills and short tenure are most likely to churn
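
For readers who want to see how such a three-model comparison might be set up, here is a hedged sketch using scikit-learn. It runs on synthetic data from `make_classification`, so its scores will not match the 0.83 reported above. The hyperparameters, the 75/25 split, the class balance, and the row-wise normalization of the confusion matrix are all assumptions made for illustration, not settings taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced binary data standing in for the churn dataset.
X, y = make_classification(n_samples=1000, n_features=8, n_informative=4,
                           weights=[0.73, 0.27], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Scale inputs for the two models that are sensitive to feature scale.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVC": make_pipeline(StandardScaler(), SVC(random_state=0)),
}


def churn_scores(model, X):
    """Continuous scores for ROC AUC: decision function if present, else P(class 1)."""
    if hasattr(model, "decision_function"):
        return model.decision_function(X)
    return model.predict_proba(X)[:, 1]


results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, churn_scores(model, X_test))
    # Percent-based confusion matrix: each row (true class) sums to 100.
    cm_pct = 100 * confusion_matrix(y_test, model.predict(X_test), normalize="true")
    results[name] = {"roc_auc": auc, "confusion_pct": cm_pct}
    print(f"{name}: ROC AUC = {auc:.3f}")
    print(cm_pct.round(1))
```

Row-wise normalization makes the bottom-right cell the recall on the positive class, which is usually the number of interest when the goal is to catch likely churners.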
- Python 3.9+
- pandas, numpy, seaborn, matplotlib
- scikit-learn
- Jupyter Notebook
- Deploy model as an API or live dashboard
- Collect time-series service usage data
- Apply retention strategies to high-risk flagged customers
Made with ❤️ by Arnav Nambiar
Feel free to connect or reach out if you're interested in collaborating or discussing the project.