This project focuses on predicting customer churn for a telecom company using the Telco Customer Churn Dataset. By leveraging machine learning models, it identifies key factors contributing to churn and provides actionable insights to help businesses improve customer retention.
The project includes:
- End-to-end workflows for data preprocessing, model training, and optimization.
- An interactive Streamlit app with features like single prediction, batch processing, and dashboard visualizations.
- Overview
- Live Demo
- Objectives
- Project Workflow
- Key Insights and Results
- How to Run Locally
- Repository Structure
- Future Work
- License
Explore the deployed app here: Customer Churn Prediction App
This interactive app provides:
- Single Customer Prediction: Predict churn for a single customer based on key features.
- Batch Prediction: Upload a CSV file containing multiple customer records to generate predictions for the entire batch.
- Dashboards: Visualize key insights, such as churn rates, feature correlations, and demographic trends.
- Understand the factors influencing customer churn using data analysis and visualization.
- Develop a machine learning model to predict churn with high accuracy and interpretability.
- Provide business insights through interactive dashboards and actionable KPIs.
- Handled missing values and inconsistent data (e.g., the TotalCharges column).
- Encoded categorical features using LabelEncoder.
- Scaled numerical features like tenure, MonthlyCharges, and TotalCharges using StandardScaler.
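The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the `preprocess` helper, the choice to fill blank TotalCharges values with 0, and the tiny sample frame are all assumptions made for the example.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype
from sklearn.preprocessing import LabelEncoder, StandardScaler

NUMERIC_COLS = ["tenure", "MonthlyCharges", "TotalCharges"]


def preprocess(df):
    """Clean, encode, and scale a Telco-style frame; returns (frame, encoders, scaler)."""
    df = df.copy()

    # TotalCharges may contain blank strings; coerce them to NaN, then fill.
    # Filling with 0 is an assumption (the source does not say how gaps were handled).
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce").fillna(0.0)

    # Fit one LabelEncoder per non-numeric column so each can be reused at inference.
    encoders = {}
    for col in df.columns:
        if not is_numeric_dtype(df[col]):
            encoders[col] = LabelEncoder()
            df[col] = encoders[col].fit_transform(df[col])

    # Standardize the numeric columns to zero mean and unit variance.
    scaler = StandardScaler()
    df[NUMERIC_COLS] = scaler.fit_transform(df[NUMERIC_COLS].astype(float))
    return df, encoders, scaler


# Tiny synthetic sample, not rows from the real dataset.
sample = pd.DataFrame({
    "tenure": [1, 24, 60],
    "MonthlyCharges": [29.85, 56.95, 99.65],
    "TotalCharges": [" ", "1366.8", "5979.0"],
    "Contract": ["Month-to-month", "One year", "Two year"],
})
clean, encoders, scaler = preprocess(sample)
```

Returning the fitted encoders and scaler alongside the frame makes it possible to persist them and apply the same transformations to new customers at prediction time.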
- Trained multiple models (Logistic Regression, Random Forest) and optimized them using GridSearchCV.
- Selected the best-performing model based on metrics like accuracy, F1-score, and AUC-ROC.
- Tuned the decision threshold, replacing the default cutoff with a custom optimal threshold for better business interpretability.
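As a rough sketch of the tuning and threshold-selection steps, the snippet below grid-searches a Logistic Regression and then picks a cutoff on held-out data. The synthetic data, the parameter grid, the `roc_auc` scoring, and the use of F1 to choose the threshold are assumptions for illustration; the project's actual grid and threshold criterion are not stated here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the preprocessed churn data.
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.7, 0.3], random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Grid-search the regularization strength with 5-fold cross-validation.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train)
model = search.best_estimator_

# Sweep candidate thresholds on held-out data and keep the one with the best F1.
probs = model.predict_proba(X_val)[:, 1]
thresholds = np.linspace(0.05, 0.95, 19)
scores = [
    f1_score(y_val, (probs >= t).astype(int), zero_division=0) for t in thresholds
]
optimal_threshold = float(thresholds[int(np.argmax(scores))])
```

Choosing the threshold on a validation split rather than the training data keeps the cutoff from being tuned to examples the model has already seen.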
- Built an interactive Streamlit app with the following features:
  - Single Customer Prediction: Allows users to predict churn for a single customer.
  - Batch Prediction: Enables predictions for multiple customers through CSV upload.
  - Dashboard: Visualizes key insights like churn rates, feature correlations, and demographic trends.
- Total Customers: 7043
- Churned Customers: 2648
- Churn Rate: 37.60%
The churn rate is 37.60% (2,648 of 7,043 customers); the remaining 62.40% are predicted not to churn.
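For reference, the churn rate follows directly from the two counts reported above. The variable names in this short check are illustrative only.

```python
total_customers = 7043
churned_customers = 2648

# Churn rate is the share of all customers flagged as churned.
churn_rate = churned_customers / total_customers
retained_customers = total_customers - churned_customers

print(f"Churn rate: {churn_rate:.2%}")       # Churn rate: 37.60%
print(f"Retained:   {retained_customers}")   # Retained:   4395
```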
Month-to-month contracts have the highest churn rate, indicating potential issues with short-term customer retention.
Female customers have a slightly higher churn rate compared to male customers.
Customers with longer tenures are less likely to churn, emphasizing the importance of customer retention strategies.
Churned customers tend to have higher monthly charges, suggesting pricing strategy adjustments for high-value customers.
Key insights:
- Tenure has a negative correlation with churn.
- MonthlyCharges and TotalCharges show a moderate positive correlation with churn.
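Correlations like these can be computed with pandas once churn is encoded as 0/1. The frame below is a small synthetic illustration, not data from the Telco dataset, so its values only demonstrate the mechanics and say nothing about the real correlation strengths.

```python
import pandas as pd

# Synthetic illustration only; these are not rows from the Telco dataset.
df = pd.DataFrame({
    "tenure":         [1, 3, 8, 20, 35, 48, 60, 72],
    "MonthlyCharges": [95.0, 89.5, 70.2, 80.0, 55.3, 45.0, 60.1, 25.0],
    "Churn":          [1, 1, 1, 0, 0, 0, 0, 0],
})

# Pearson correlation of each numeric feature with the binary churn flag.
correlations = df[["tenure", "MonthlyCharges"]].corrwith(df["Churn"])
print(correlations.sort_values())
```

A negative value means the feature tends to be lower among churned customers, and a positive value means it tends to be higher.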
- Python 3.8 or higher
- Install dependencies listed in requirements.txt
- Clone the repository:
  git clone https://github.com/DataStatsMohith/customer-churn-prediction.git
- Navigate to the directory:
  cd customer-churn-prediction
- Install dependencies:
  pip install -r requirements.txt
- Run the Streamlit app:
  streamlit run app/app.py
customer-churn-prediction/
│
├── app/                      # Streamlit app
│   └── app.py
│
├── assets/                   # Visualizations and animations
│   ├── *.png                 # Images for dashboards
│   └── animations/           # JSON animations for Streamlit
│
├── data/                     # Dataset and preprocessed files
│   ├── WA_Fn-UseC_-Telco-Customer-Churn.csv
│   ├── feature_columns.pkl
│   ├── final_churn_model.pkl
│   ├── label_encoders.pkl
│   ├── optimal_threshold.pkl
│   └── scaler.pkl
│
├── notebooks/                # Jupyter notebook(s)
│   └── FinalCustomer_Churn_Prediction.ipynb
│
├── README.md                 # Project documentation
├── requirements.txt          # Python dependencies
└── LICENSE                   # Project license (optional)
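The .pkl files under data/ could be loaded with a helper like the one below. This is a sketch under assumptions: it presumes the artifacts were written with Python's standard `pickle` module (they may instead have been saved with joblib), and the `load_artifacts` function and dictionary keys are hypothetical names chosen for the example.

```python
import pickle
from pathlib import Path

# File names come from the data/ directory listed above; the keys are illustrative.
ARTIFACT_FILES = {
    "model": "final_churn_model.pkl",
    "scaler": "scaler.pkl",
    "label_encoders": "label_encoders.pkl",
    "feature_columns": "feature_columns.pkl",
    "optimal_threshold": "optimal_threshold.pkl",
}


def load_artifacts(data_dir="data"):
    """Load every pickled artifact from data_dir into a dict keyed by role."""
    artifacts = {}
    for name, filename in ARTIFACT_FILES.items():
        # Only unpickle files from a trusted source; pickle can execute code on load.
        with open(Path(data_dir) / filename, "rb") as fh:
            artifacts[name] = pickle.load(fh)
    return artifacts
```

Calling `load_artifacts()` from the repository root would return the model, scaler, encoders, feature order, and threshold in one dictionary, which keeps the app's startup code short.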
- Model Improvements: Experiment with deep learning models for better performance.
- Additional Features: Incorporate customer satisfaction scores and support interactions.
- Deployment: Extend the app deployment to cloud platforms like AWS or Heroku.
This project is licensed under the MIT License.