This repository presents a comparative evaluation of statistical and machine learning models for estimating Expected Goals (xG) in football, using a cleaned and enriched open-source dataset.
📘 Notebook: notebooks/xg-model-selection.ipynb
📊 Dataset: Kaggle - Shots Dataset for Football
📄 Published Article: Zenodo DOI
Models and methods compared:
- Logistic Regression
- Linear Discriminant Analysis (LDA)
- Bagging Classifier
- Random Forest (with GridSearchCV tuning)
- Feedforward Neural Network (Keras + TensorFlow)
- SHAP Explainability for neural models
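As an illustration of the simplest model in the list above, here is a minimal, dependency-free sketch of a logistic-regression xG model. It is not the notebook's implementation (which presumably relies on a library rather than hand-written gradient descent). The two features (shot distance and shot angle), their scaling, the synthetic data, and the helper names `fit_logistic` and `predict_xg` are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_xg(w, b, x):
    """Return the estimated goal probability (xG) for one shot."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic shots (NOT the Kaggle dataset): goals become less likely with
# distance and more likely with a wider angle. Coefficients are made up.
rng = random.Random(0)
X, y = [], []
for _ in range(500):
    dist = rng.uniform(0.0, 3.0)   # hypothetical: distance in tens of metres
    angle = rng.uniform(0.0, 1.5)  # hypothetical: visible goal angle in radians
    p_goal = sigmoid(0.5 - 1.5 * dist + 1.0 * angle)
    X.append([dist, angle])
    y.append(1 if rng.random() < p_goal else 0)

w, b = fit_logistic(X, y)
close_shot = predict_xg(w, b, [0.5, 1.2])  # near the goal, wide angle
far_shot = predict_xg(w, b, [2.5, 0.3])    # far out, narrow angle
print(f"xG close: {close_shot:.2f}, xG far: {far_shot:.2f}")
```

The output of such a model is a probability per shot, which is why the calibration-oriented metrics matter alongside classification metrics.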
Evaluation metrics:
- Confusion Matrix (threshold = 0.3)
- ROC Curve & AUC
- Calibration Curve
- Precision, Recall, F1-score
- Brier Score
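To make the metrics above concrete, the following sketch computes the threshold-based and scalar ones in plain Python on a tiny hand-made example. The labels and probabilities are invented for illustration, the function names are hypothetical, and treating a probability equal to the threshold as a predicted goal (`>=`) is an assumption. scikit-learn provides equivalents such as `confusion_matrix`, `roc_auc_score`, and `brier_score_loss`. The calibration curve is omitted here for brevity.

```python
def confusion_counts(y_true, y_prob, threshold=0.3):
    """Count TP, FP, TN, FN after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0  # assumption: ties count as goals
        if pred == 1 and t == 1:
            tp += 1
        elif pred == 1 and t == 0:
            fp += 1
        elif pred == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision_recall_f1(tp, fp, fn):
    """Return precision, recall, and F1, using 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random goal outranks a random non-goal (ties = 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Invented example: 1 = goal, 0 = no goal, with model probabilities.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.05, 0.35, 0.40, 0.10, 0.80, 0.25, 0.02, 0.20]

tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold=0.3)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(tp, fp, tn, fn, precision, recall, f1, auc, brier)
```

A threshold below 0.5 is a natural choice for xG because goals are the rare class. Lowering the threshold trades precision for recall, while AUC and the Brier score do not depend on the threshold at all.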
To reproduce the environment:
pip install -r requirements.txt
This project is distributed under the MIT License.