This repository contains the implementation part of my master's thesis "Predicting Shapley values in routing problems with machine learning".
In routing problems, companies and logistics service providers are not only interested in identifying the optimal route with the lowest cost, but also in a fair cost allocation among the served customers. A highly regarded method for distributing costs is the Shapley value with its unique fairness properties. As the computationally intensive calculation of exact Shapley values limits the method's applicability to small routing instances, this study presents a novel technique for approximating Shapley values using machine learning. The approach is based on the generation of problem-specific features that capture the underlying structure of the setting. To this end, a comprehensive numerical study using synthetic instances of the traveling salesman problem and the capacitated vehicle routing problem is conducted, including a detailed description of the data generation process as well as the implementation of the machine learning algorithms. A diverse set of models is considered in the experiments in order to determine the most suitable ones for each context. The analyses show excellent results: the models outperform straightforward proxies and state-of-the-art approximation methods from the literature in both problem settings, yielding accurate Shapley value predictions for all customers within seconds. Additionally, the thesis provides economic findings on the primary cost factors based on a feature analysis, as well as a study on the computational efficiency of the applied models. Importantly, the generalizability of the methodology to further operations research contexts is evaluated by applying the technique to a variant of the bin packing problem (BPP). The promising results for the BPP show that the approach can be effectively applied to other contexts.
Keywords: Shapley value · Cost allocation · Traveling salesman problem · Capacitated vehicle routing problem · Machine learning
In the `main.ipynb` script, you can define one of the routing problems, `"TSP"` or `"CVRP"`, as the `optimization_problem` variable. After choosing a routing problem, run the script. It will execute the following notebooks and scripts in this order: `b1_feature_selection.ipynb`, `a1_linear.ipynb`, `a2_tree.ipynb` (only for TSP), `a3_ensembles.ipynb`, `a4_SVM.ipynb`, `a5_NN.ipynb`, `a6_all_models.ipynb`, `03_tuning_results/a_tuning_summary.py`, and `03_tuning_results/a_tuning_time.ipynb` (only for TSP).
This procedure automatically trains, tunes, and evaluates all applied machine learning models on the dataset of the chosen routing problem and generates all relevant results and output files. Alternatively, you can run each model individually for both routing problems in its respective notebook.
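A minimal illustration of the `optimization_problem` setting (the surrounding cells of the notebook may look different) is:

```python
# In main.ipynb: select the routing problem whose models should be trained,
# tuned, and evaluated.
optimization_problem = "TSP"  # alternatively: "CVRP"

# Guard against typos before the downstream notebooks are executed.
assert optimization_problem in ("TSP", "CVRP"), "Unknown optimization problem"
```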
In the notebooks `a1_linear.ipynb`, `a2_tree.ipynb`, `a3_ensembles.ipynb`, `a4_SVM.ipynb`, and `a5_NN.ipynb`, all applied machine learning models are implemented and tuned. The best parameter configurations are stored as a PDF file in the `03_tuning_results` folder.
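As an illustration of the tuning step, a minimal sketch using scikit-learn's `GridSearchCV` on a Ridge regressor is shown below; the actual notebooks may use different search strategies, parameter grids, and data loading.

```python
# Illustrative hyperparameter tuning sketch (placeholder data and grid; not the
# exact procedure used in the notebooks).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X_train = rng.random((200, 10))   # placeholder feature matrix
y_train = rng.random(200)         # placeholder Shapley value targets

param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(Ridge(), param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best CV score:", search.best_score_)
```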
The notebook `a6_all_models.ipynb` trains the best configuration of each model on the training set and evaluates its performance on the test set. The resulting test scores are stored as an Excel file in the `04_test_results` folder.
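A rough sketch of this evaluation step (model choice, file name, and column names are made up for illustration) could look like this:

```python
# Illustrative sketch: refit one model with its "best" configuration, score it on
# the test set, and export the scores to an Excel file (requires openpyxl).
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score

rng = np.random.default_rng(0)
X_train, y_train = rng.random((200, 10)), rng.random(200)
X_test, y_test = rng.random((50, 10)), rng.random(50)

model = RandomForestRegressor(n_estimators=300, random_state=0)  # example configuration
model.fit(X_train, y_train)
pred = model.predict(X_test)

scores = pd.DataFrame([{
    "model": "Random Forest",
    "MAE": mean_absolute_error(y_test, pred),
    "R2": r2_score(y_test, pred),
}])
scores.to_excel("04_test_results/test_scores_example.xlsx", index=False)
```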
All user-defined functions used in the notebooks are stored in the `ML_functions.py` script.
⚠️ Note: Due to storage limitations, the datasets are not included in the repository.
However, the code to generate them can be found in the `01_data_generation` folder.
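For context, each dataset pairs instance features with exact Shapley values, which requires evaluating the routing cost of every customer coalition. The following is a self-contained toy sketch of that computation for a tiny TSP instance (coordinates and helper names are invented; the repository's generation code is more elaborate):

```python
# Toy example: exact Shapley values for a tiny TSP instance by full coalition
# enumeration. The cost of a coalition is the length of the optimal round trip
# from the depot through its customers.
from itertools import combinations, permutations
from math import dist, factorial

depot = (0.0, 0.0)
customers = {1: (1.0, 2.0), 2: (3.0, 1.0), 3: (2.0, 4.0)}

def tour_cost(subset):
    """Optimal round-trip length from the depot through the customers in `subset`."""
    if not subset:
        return 0.0
    best = float("inf")
    for order in permutations(subset):
        pts = [depot] + [customers[i] for i in order] + [depot]
        best = min(best, sum(dist(a, b) for a, b in zip(pts, pts[1:])))
    return best

n = len(customers)
shapley = {}
for i in customers:
    others = [j for j in customers if j != i]
    value = 0.0
    for k in range(len(others) + 1):
        for coalition in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            value += weight * (tour_cost(coalition + (i,)) - tour_cost(coalition))
    shapley[i] = value

print(shapley)  # the allocations sum to the cost of the full tour
```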
Models applied to the TSP:

| Algorithm | Implemented in |
|---|---|
| K-Nearest Neighbor (KNN) | `a1_linear.ipynb` |
| Linear Regression | `a1_linear.ipynb` |
| Ridge Regression | `a1_linear.ipynb` |
| Polynomial Regression | `a1_linear.ipynb` |
| Decision Tree | `a2_tree.ipynb` |
| Random Forest | `a3_ensembles.ipynb` |
| Gradient Boosting Regression Trees (GBRT) | `a3_ensembles.ipynb` |
| XGBoost | `a3_ensembles.ipynb` |
| Linear Support Vector Machine (Linear SVM) | `a4_SVM.ipynb` |
| Kernel Machine | `a4_SVM.ipynb` |
| Multilayer Perceptron Neural Network (NN/MLP) | `a5_NN.ipynb` |
Models applied to the CVRP:

| Algorithm | Implemented in |
|---|---|
| K-Nearest Neighbor (KNN) | `a1_linear.ipynb` |
| Polynomial Regression | `a1_linear.ipynb` |
| Random Forest | `a3_ensembles.ipynb` |
| XGBoost | `a3_ensembles.ipynb` |
| Kernel Machine | `a4_SVM.ipynb` |
| Multilayer Perceptron Neural Network (NN/MLP) | `a5_NN.ipynb` |
Benchmark methods used for comparison include:

- SHapley APproximation based on a fixed Order (SHAPO)
- Depot distance (see the illustrative sketch below)
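As an illustration of how simple such a proxy can be, here is a rough sketch of a depot-distance allocation, assuming the total tour cost is split proportionally to each customer's Euclidean distance from the depot (all values are made up):

```python
# Rough sketch of a depot-distance proxy (an assumption of how such a baseline
# could look, not the repository's implementation).
from math import dist

depot = (0.0, 0.0)
customers = {1: (1.0, 2.0), 2: (3.0, 1.0), 3: (2.0, 4.0)}
total_cost = 12.5  # cost of the optimal tour serving all customers (example value)

depot_distances = {i: dist(depot, xy) for i, xy in customers.items()}
total_distance = sum(depot_distances.values())

# Allocate the total cost proportionally to each customer's distance from the depot.
allocation = {i: total_cost * d / total_distance for i, d in depot_distances.items()}
print(allocation)
```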