
Predicts customer churn for a telecom operator using personal data, service usage, and billing information. Helps proactively retain at-risk customers by identifying those likely to leave, enabling targeted offers like promos and special plans. Focuses on high recall to minimize missed churners, supporting effective retention strategies.


gorop51-2/Telecom-Customer-Churn-Prediction


📱 Telecom Customer Churn Prediction

📌 Project Overview

The telecom operator "TeleDom" aims to reduce customer churn by offering personalized promo codes and special conditions to clients who are planning to leave. This project develops a machine learning model that predicts the probability of contract termination.

🎯 Objective

Create a model that accurately predicts customer churn with:

  • ROC-AUC ≥ 0.85

  • High recall (prioritizing identifying potential churners over false positives)

📊 Dataset Description

Four CSV files containing customer information as of February 1, 2020:

  1. contract_new.csv - contract details:

     - customerID, BeginDate, EndDate (target feature)

     - Billing and payment information (Type, PaperlessBilling, PaymentMethod)

     - Charges (MonthlyCharges, TotalCharges)

  2. personal_new.csv - customer demographics:

     - Dependents, SeniorCitizen, Partner

  3. internet_new.csv - internet services:

     - InternetService (DSL/Fiber optic)

     - Additional services (OnlineSecurity, OnlineBackup, etc.)

  4. phone_new.csv - phone services:

     - MultipleLines

🔍 Methodology

Data Preparation

  • Merged datasets using customerID

  • Converted EndDate to binary target (1 = churned, 0 = active)

  • Created new feature Months (contract duration)

  • Handled missing values and transformed categorical features

  • Removed the TotalCharges feature as redundant: it is closely reproduced by MonthlyCharges * Months (MSE = 103.45 between the two)
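The preparation steps above can be sketched roughly as follows, using only the standard library and a few made-up rows. Several details are assumptions rather than facts from the project: that active contracts carry the placeholder "No" in EndDate, that Months counts whole calendar months up to the February 1, 2020 snapshot for active customers, and the helper names `months_between`, `build_rows`, and `mean_squared_error`.

```python
from datetime import date

SNAPSHOT = date(2020, 2, 1)  # snapshot date given in the dataset description


def months_between(start, end):
    """Whole calendar months from start to end (assumed definition of Months)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def build_rows(contracts, personals):
    """Join records on customerID and derive the Churn target and Months."""
    personal_by_id = {p["customerID"]: p for p in personals}
    rows = []
    for c in contracts:
        # Assumption: "No" in EndDate marks a customer who is still active.
        churned = c["EndDate"] != "No"
        end = date.fromisoformat(c["EndDate"]) if churned else SNAPSHOT
        row = {**c, **personal_by_id.get(c["customerID"], {})}
        row["Churn"] = int(churned)  # 1 = churned, 0 = active
        row["Months"] = months_between(date.fromisoformat(c["BeginDate"]), end)
        rows.append(row)
    return rows


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Illustrative rows only, not taken from the real CSV files.
contracts = [
    {"customerID": "A1", "BeginDate": "2019-02-01", "EndDate": "No",
     "MonthlyCharges": 50.0, "TotalCharges": 600.0},
    {"customerID": "B2", "BeginDate": "2019-06-01", "EndDate": "2019-10-01",
     "MonthlyCharges": 80.0, "TotalCharges": 325.0},
]
personals = [
    {"customerID": "A1", "Partner": "Yes"},
    {"customerID": "B2", "Partner": "No"},
]

rows = build_rows(contracts, personals)

# Redundancy check: how closely does MonthlyCharges * Months match TotalCharges?
approx_total = [r["MonthlyCharges"] * r["Months"] for r in rows]
actual_total = [r["TotalCharges"] for r in rows]
redundancy_mse = mean_squared_error(actual_total, approx_total)
```

A small MSE relative to the scale of TotalCharges suggests the column adds little beyond the two features it is derived from, which is the reasoning behind dropping it.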

Modeling Approach

Tested multiple models with cross-validation, using focal loss to address class imbalance (churn rate of only ~26%):

  • DecisionTreeClassifier (ROC-AUC: 0.808)

  • RandomForestClassifier (ROC-AUC: 0.834)

  • LGBMClassifier with focal loss (ROC-AUC: 0.899)

  • LogisticRegression (ROC-AUC: 0.757)

  • Neural Network (ROC-AUC: 0.871)
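For readers unfamiliar with focal loss, here is a minimal per-example sketch of the binary form. It is not the project's implementation: the function name `focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration, and plugging such a loss into LightGBM would additionally require supplying its gradient and Hessian as a custom objective.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for a single example.

    p: predicted probability of class 1 (churn); y: true label, 0 or 1.
    gamma down-weights easy examples; alpha balances the two classes.
    """
    p = min(max(p, eps), 1.0 - eps)        # clip to keep log() finite
    p_t = p if y == 1 else 1.0 - p         # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction contributes far less than a confident miss.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, so `gamma` controls how strongly training focuses on hard, misclassified customers.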

📈 Results

Best Model: LGBMClassifier with tuned hyperparameters.

Test Performance:

  • ROC-AUC: 0.913

  • Recall: 90% (correctly identifies 90% of customers planning to leave)

The model successfully identifies high-risk customers, enabling targeted retention strategies while minimizing customer loss.
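To make the two reported metrics concrete, the sketch below computes recall at a decision threshold and ROC-AUC from scratch on a toy set of labels and scores. The data and the helper names `recall_at_threshold` and `roc_auc` are illustrative assumptions; the project's actual figures were presumably produced with a library implementation on the real test set.

```python
def recall_at_threshold(y_true, scores, threshold=0.5):
    """Share of actual churners whose score reaches the threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """Probability that a random churner outscores a random non-churner.

    Ties count as half a win. Quadratic in the sample size, so this is
    meant for illustration on small inputs only.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = churned) and predicted churn probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.6, 0.35, 0.4, 0.2, 0.1, 0.7, 0.8]

auc = roc_auc(y_true, scores)
recall_default = recall_at_threshold(y_true, scores, threshold=0.5)
recall_lowered = recall_at_threshold(y_true, scores, threshold=0.3)
```

ROC-AUC does not depend on the threshold, while recall does: lowering the threshold catches more churners at the cost of more false positives, which matches the stated priority of not missing customers who are about to leave.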
