Customer churn prediction aims to identify which customers are likely to leave a business. This project uses the Telco Customer Data from Kaggle to build a deep learning model that predicts churn, and uses precision, recall, and F1-score to measure the model's performance.
- Data collection: use the Telco Customer Data from Kaggle.
- Preprocess the data.
- Convert the nominal and ordinal categorical values into numbers.
- Create dummy variables for categorical features that have more than two categories.
- Split the data into train and test sets. Further preprocessing is done after the split to avoid data leakage.
- Scale the features with min-max scaling, separately for the train and test data.
- Oversample with SMOTE to deal with class imbalance, because the Churn=0 class is much larger than the Churn=1 class.
- Create a dense artificial neural network with Dropout for regularization.
- Predict on the test data and plot the output. Check the performance metrics.
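The scaling step after the split can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual code: the helper names `fit_min_max` and `apply_min_max` and the toy rows are hypothetical, and the sketch follows one common leakage-avoiding convention (learn the minimum and maximum from the training rows only, then reuse them for the test rows), which may differ from what the notebook does.

```python
def fit_min_max(rows):
    """Learn a (min, max) pair for each column from the given rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, params):
    """Scale each value with previously learned (min, max) pairs."""
    scaled = []
    for row in rows:
        scaled.append([
            # Constant columns (hi == lo) are mapped to 0.0 to avoid dividing by zero.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, (lo, hi) in zip(row, params)
        ])
    return scaled


# Hypothetical toy rows: [tenure_months, monthly_charges]
train_rows = [[1, 20.0], [12, 50.0], [72, 120.0]]
test_rows = [[24, 70.0]]

# Parameters come from the training rows only (an assumption of this sketch).
params = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, params)
test_scaled = apply_min_max(test_rows, params)
```

With this convention the test rows never influence the scaling parameters, so a test value outside the training range can fall outside [0, 1].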
For customer churn prediction, we need to reduce false negatives (i.e. customers who are wrongly predicted not to churn). Therefore, we need to improve recall. Our model achieved a recall of 74%. You can see the whole pipeline here
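
To make the link between false negatives and recall concrete, here is a minimal sketch that computes precision, recall, and F1-score from true and predicted labels. The function name `classification_metrics` and the toy labels are hypothetical and chosen only for illustration; they are not the model's actual predictions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (churn) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: customers who churned but were predicted to stay.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = churn, 0 = no churn
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
```

Recall is TP / (TP + FN), so every churner the model misses lowers it directly, which is why it is the metric to watch here.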