
Telco_customer_churn_prediction

A machine learning project to predict customer churn, built with Python and scikit-learn, designed for telecom companies aiming to reduce churn and, as a result, improve customer retention.

Overview

This repository contains data analysis, insights, and machine learning modelling for customer churn prediction.

Key Objectives

The primary objective of this project is to develop a classification model for churn analysis to aid in customer retention efforts. Churn analysis focuses on predicting whether customers are likely to leave or continue their relationship with the company. By identifying customers at risk of churning, the company can take proactive measures to retain them, thus increasing revenue and profit margins.
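
For illustration, the minimal sketch below shows the kind of classification pipeline this objective calls for. It is a toy example built on assumptions: the column names (tenure, MonthlyCharges, Contract, Churn) and the logistic regression classifier are stand-ins for demonstration, not code taken from the project's notebook.

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy churn data; the real project works with the Telco customer dataset instead.
df = pd.DataFrame({
    'tenure': [1, 34, 2, 45, 8, 22],
    'MonthlyCharges': [29.85, 56.95, 53.85, 42.30, 99.65, 70.35],
    'Contract': ['Month-to-month', 'One year', 'Month-to-month',
                 'One year', 'Month-to-month', 'Two year'],
    'Churn': [1, 0, 1, 0, 1, 0],
})

X, y = df.drop(columns='Churn'), df['Churn']
X_train, X_eval, y_train, y_eval = train_test_split(
    X, y, test_size=0.33, random_state=42, stratify=y)

# Scale numeric columns, one-hot encode categorical columns, then fit a classifier.
preprocess = ColumnTransformer([
    ('num', StandardScaler(), ['tenure', 'MonthlyCharges']),
    ('cat', OneHotEncoder(handle_unknown='ignore'), ['Contract']),
])

churn_model = Pipeline([
    ('preprocess', preprocess),
    ('classifier', LogisticRegression()),
])

churn_model.fit(X_train, y_train)
print(churn_model.score(X_eval, y_eval))  # accuracy on the hold-out split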

Framework

The Cross-Industry Standard Process for Data Mining (CRISP-DM).

Features

  • Jupyter Notebook containing data analysis, visualizations, and interpretation.
  • Detailed documentation outlining methodology, data sources, and analysis results.
  • Interactive visualizations in Power BI showcasing churn trends and key insights.

Notebook preview on Anaconda Cloud

Power BI Dashboard

Dashboard

Technologies Used

  • Anaconda
  • Power BI
  • Python
  • Pandas
  • NumPy
  • Plotly
  • Jupyter Notebooks
  • Git
  • SciPy
  • scikit-learn
  • XGBoost
  • CatBoost
  • LightGBM
  • imbalanced-learn (imblearn)
  • pyodbc
  • re
  • typing

Installation

Quick install

 pip install -r requirements.txt

Recommended install

conda env create -f churn_environment.yml

Sample code used to generate a performance metric for a list of models

import inspect
from typing import Any, Callable, Dict, List, Union, ValuesView

from sklearn.pipeline import Pipeline


def info(models: Union[ValuesView[Pipeline], List[Pipeline]], metric: Callable[..., float], **kwargs) -> List[Dict[str, Any]]:
    """
    Generates a list of dictionaries, each containing a model's name and a specified performance metric.

    Parameters:
    - models (Union[ValuesView[Pipeline], List[Pipeline]]): A collection of model pipeline instances.
    - metric (Callable[..., float]): A function used to evaluate the model's performance. Expected to accept
      parameters like `y_true`, `y_pred`, and `average`, and return a float.
    - **kwargs: Additional keyword arguments passed to the metric function or used inside `info`,
      e.g. `X_train`, `y_train_encoded`, `X_eval`, `y_eval_encoded`, or `average`.

    Returns:
    - List[Dict[str, Any]]: A list of dictionaries with model names and their evaluated metrics.
    """
    def get_metric(model, kwargs):

        # Use training data from kwargs if present; otherwise fall back to the
        # X_train / y_train_encoded variables in the enclosing (notebook) scope.
        if 'X_train' in kwargs and 'y_train_encoded' in kwargs:
            model.fit(kwargs['X_train'], kwargs['y_train_encoded'])
        else:
            # Fit final pipeline to training data
            model.fit(X_train, y_train_encoded)

        if 'y_eval_encoded' in kwargs:
            kwargs['y_true'] = kwargs['y_eval_encoded']
        else:
            kwargs['y_true'] = y_eval_encoded

        if 'X_eval' in kwargs:
            kwargs['y_pred'] = model.predict(kwargs['X_eval'])
        else:
            kwargs['y_pred'] = model.predict(X_eval)

        # Sanitize the metric arguments: keep only parameters the metric actually accepts
        kwargs = {k: value for k, value in kwargs.items() if k in inspect.signature(metric).parameters}

        return metric(**kwargs)

    info_metric = [
        {
            # Each pipeline is expected to contain a step named 'classifier'
            'model_name': model['classifier'].__class__.__name__,
            f"Metric ({metric.__name__}_{kwargs.get('average', '')})": get_metric(model, kwargs),
        } for model in models
    ]

    return info_metric
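
A hypothetical call might look like the snippet below, assuming `models` is a dictionary mapping names to scikit-learn pipelines (each with a step named 'classifier') and that `X_train`, `y_train_encoded`, `X_eval`, and `y_eval_encoded` already exist in scope; the variable names are illustrative rather than the notebook's exact code.

from sklearn.metrics import f1_score

# models is assumed to map model names to Pipeline objects, each with a 'classifier' step.
results = info(models.values(), metric=f1_score, average='weighted')
for row in results:
    print(row)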

Contributions

How to Contribute

  1. Fork the repository and clone it to your local machine.
  2. Explore the Jupyter Notebooks and documentation.
  3. Implement enhancements, fix bugs, or propose new features.
  4. Submit a pull request with your changes, ensuring clear descriptions and documentation.
  5. Participate in discussions, provide feedback, and collaborate with the community.

Feedback and Support

Feedback, suggestions, and contributions are welcome! Feel free to open an issue for bug reports, feature requests, or general inquiries. For additional support or questions, you can connect with me on LinkedIn.

Link to article on Medium: Telco Customer Churn Prediction: Unveiling Insights with Data Analysis and Machine Learning

Author

Gabriel Okundaye.

License

MIT
