HR analytics capstone using Python to forecast employee churn and inform retention strategy at Salifort Motors.

johbry17/Salifort-Employee-Churn-ml

Classified: At Risk — Salifort Motors Employee Attrition Prediction

Google Advanced Data Analytics Capstone

Can machine learning help predict and reduce employee turnover? Real-world HR modeling techniques uncover attrition patterns at a fictional automaker.

🔗 Live Report

Project Overview

Salifort Motors is a fictional car company facing high employee turnover. This project models and explains the drivers of attrition using a structured ML workflow:

  • Data wrangling & preprocessing
  • Exploratory data analysis (EDA)
  • Predictive modeling with multiple classifiers
  • Model evaluation and interpretability (SHAP, feature importance)
  • Final recommendations for HR strategy and retention

The project is hosted online as an interactive web report written for both technical and general audiences.

Features

  • 📊 Interactive visual EDA (Seaborn, Matplotlib)
  • 🤖 Four predictive models: Logistic Regression, Decision Tree, Random Forest, XGBoost
  • 🔍 Model evaluation: confusion matrices, recall scores, misclassification analysis
  • 🧠 SHAP and feature importances for explainability
  • 💬 Executive summary with actionable business takeaways
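
As a rough illustration of the modeling and evaluation steps listed above, the sketch below fits three of the four model types and reports recall and a confusion matrix for each. It is a minimal sketch under stated assumptions, not the code from the notebooks: the data is synthetic (generated with scikit-learn's `make_classification`, not the Salifort dataset), the hyperparameters are arbitrary, and XGBoost is left out to keep the dependencies to scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the HR data (assumption: NOT the Salifort dataset).
# About 20% of rows are in the positive class, mimicking an imbalanced
# "employee left" label.
X, y = make_classification(
    n_samples=2000,
    n_features=8,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=42,
)

# Stratify so the train and test sets keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hyperparameters here are illustrative placeholders, not tuned values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        # Recall = TP / (TP + FN): the share of actual leavers the model catches.
        "recall": recall_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, res in results.items():
    # For binary labels, ravel() yields counts in the order TN, FP, FN, TP.
    tn, fp, fn, tp = res["confusion_matrix"].ravel()
    print(f"{name}: recall={res['recall']:.3f}  TN={tn} FP={fp} FN={fn} TP={tp}")
```

Recall is a natural headline metric in a churn setting because a false negative (an at-risk employee the model misses) is usually costlier than a false positive, which is why the confusion matrix is printed alongside it.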

Tools & Technologies

  • Language: Python
  • Libraries: pandas, seaborn, matplotlib, scikit-learn, xgboost, statsmodels, shap
  • Environment: Jupyter Notebook
  • Deployment: GitHub Pages (HTML report)

Usage

The full analysis is available online at the project site. To run it locally:

  1. Clone the repository.
  2. Install required dependencies from requirements.txt.
  3. Open the notebooks in the notebooks/ directory to explore the analysis:
    • eda.ipynb for exploratory data analysis
    • models.ipynb for model development and evaluation
    • executive_summary.ipynb for a project overview and key findings

Gallery

EDA Insights:

Satisfaction Level vs Average Monthly Hours Plot

Promotion

Model Results:

Confusion Matrix Results

XGBoost SHAP Summary

Misclassified Predicted Probability

Decision Tree

Feature Importances

Certificate

Final capstone project for the Google Advanced Data Analytics Professional Certificate:

Google Data Analytics Certificate

References

License

MIT License © 2025 Bryan Johns. See LICENSE for details.

Acknowledgements

Author

Bryan Johns, June 2025
bryan.johns.official@gmail.com | LinkedIn | GitHub | Portfolio
