README.md (2 additions, 0 deletions)
@@ -623,6 +623,7 @@ We also build on top of many great packages. Please check them out!
# Papers that use or compare EBMs

-[Challenging the Performance-Interpretability Trade-off: An Evaluation of Interpretable Machine Learning Models](https://arxiv.org/pdf/2409.14429)
+ -[GAMFORMER: In-context Learning for Generalized Additive Models](https://arxiv.org/pdf/2410.04560v1)
-[Data Science with LLMs and Interpretable Models](https://arxiv.org/pdf/2402.14474v1.pdf)
-[DimVis: Interpreting Visual Clusters in Dimensionality Reduction With Explainable Boosting Machine](https://arxiv.org/pdf/2402.06885.pdf)
-[Distill knowledge of additive tree models into generalized linear models](https://detralytics.com/wp-content/uploads/2023/10/Detra-Note_Additive-tree-ensembles.pdf)
@@ -688,6 +689,7 @@ We also build on top of many great packages. Please check them out!
-[Explainable Boosting Machines for Slope Failure Spatial Predictive Modeling](https://www.mdpi.com/2072-4292/13/24/4991/htm)
-[Micromodels for Efficient, Explainable, and Reusable Systems: A Case Study on Mental Health](https://arxiv.org/pdf/2109.13770.pdf)
-[Identifying main and interaction effects of risk factors to predict intensive care admission in patients hospitalized with COVID-19](https://www.medrxiv.org/content/10.1101/2020.06.30.20143651v1.full.pdf)
+ -[Leveraging interpretable machine learning in intensive care](https://link.springer.com/article/10.1007/s10479-024-06226-8#Tab10)
-[Development of prediction models for one-year brain tumour survival using machine learning: a comparison of accuracy and interpretability](https://www.pure.ed.ac.uk/ws/portalfiles/portal/343114800/1_s2.0_S0169260723001487_main.pdf)
-[Using Interpretable Machine Learning to Predict Maternal and Fetal Outcomes](https://arxiv.org/pdf/2207.05322.pdf)
-[Calibrate: Interactive Analysis of Probabilistic Model Output](https://arxiv.org/pdf/2207.13770.pdf)
guidance: This is an important hyperparameter to tune. The optimal smoothing_rounds value will vary depending on the dataset's characteristics. Adjust based on the prevalence of smooth feature response curves.
guidance: This is an important hyperparameter to tune. The conventional wisdom is that a lower learning rate is generally better, but we have found the relationship to be more complex. In general, regression seems to prefer a higher learning rate, binary classification seems to prefer a lower learning rate, and multiclass is in-between.
guidance: For max_interaction_bins, more is not necessarily better, unlike with max_bins. A good value on many datasets seems to be 32, but it's worth trying higher and lower values.
guidance: interaction_smoothing_rounds appears to have only a minor impact on model accuracy. 0 is often the most accurate choice, but the interaction shape plots will be smoother and easier to interpret with more interaction_smoothing_rounds.
guidance: A smaller learning_rate promotes finer model adjustments during fitting, but may require more iterations. Generally, we believe a smaller learning_rate should improve the model, but sometimes hyperparameter tuning seems to be needed to select the best value.
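As a rough illustration of how this guidance maps onto code, here is a minimal sketch using interpret's ExplainableBoostingClassifier (assuming a recent interpret release that exposes these constructor arguments; the specific values are illustrative starting points for tuning, not recommendations):

```python
from sklearn.datasets import make_classification
from interpret.glassbox import ExplainableBoostingClassifier

# Synthetic data purely so the snippet runs end to end.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

ebm = ExplainableBoostingClassifier(
    learning_rate=0.005,             # binary classification tends to prefer lower learning rates
    smoothing_rounds=200,            # tune based on how smooth the feature response curves are
    max_interaction_bins=32,         # a commonly good value; also try higher and lower settings
    interaction_smoothing_rounds=0,  # often most accurate; increase for smoother interaction plots
)
ebm.fit(X, y)
```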
## max_leaves

-default: 3
+default: 2

hyperparameters: [2, 3, 4]

-guidance: Generally, the default setting is effective, but it's worth checking if changing to either 2 or 4 can offer better accuracy on your specific data. The max_leaves parameter only applies to main effects.
+guidance: Generally, the default setting is effective, but it's worth checking if changing to either 3 or 4 can offer better accuracy on your specific data. The max_leaves parameter only applies to main effects.
## min_samples_leaf

-default: 2
+default: 4

-hyperparameters: [2, 3, 4]
+hyperparameters: [2, 3, 4, 5, 6]

guidance: The default value usually works well; however, experimenting with slightly higher values can sometimes enhance generalization on certain datasets.

guidance: The default min_hessian is a solid starting point.
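For tuning the tree-structure settings above, here is a minimal sketch of a grid search over the max_leaves and min_samples_leaf ranges just listed. It assumes EBMs' scikit-learn-compatible estimator interface; the synthetic data and scoring choice are only for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from interpret.glassbox import ExplainableBoostingClassifier

# Synthetic data purely so the snippet runs end to end.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Sweep the candidate ranges from the max_leaves and min_samples_leaf sections.
param_grid = {
    "max_leaves": [2, 3, 4],
    "min_samples_leaf": [2, 3, 4, 5, 6],
}
search = GridSearchCV(
    ExplainableBoostingClassifier(),
    param_grid,
    cv=3,
    scoring="roc_auc",
    n_jobs=1,  # EBM training already parallelizes internally across outer bags
)
search.fit(X, y)
print(search.best_params_)
```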
@@ -112,7 +112,7 @@ hyperparameters: [1000000000]

guidance: The max_rounds parameter serves as a limit to prevent excessive training on datasets where improvements taper off. Set it high enough that training is not cut off prematurely. Consider increasing it if small yet consistent gains are observed in longer training runs.

## early_stopping_rounds

-default: 50
+default: 100

guidance: We typically do not advise changing early_stopping_rounds. The default is appropriate for most cases, adequately capturing the optimal model without incurring unnecessary computational costs.
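To illustrate how these two settings interact, a small sketch that leaves early stopping at its default behaviour and raises max_rounds high enough that validation-based early stopping, rather than the round cap, decides when boosting ends (parameter names as documented for interpret's EBMs; values are illustrative):

```python
from interpret.glassbox import ExplainableBoostingClassifier

# Make the round cap effectively unlimited so early stopping ends training.
ebm = ExplainableBoostingClassifier(
    max_rounds=1_000_000_000,   # the "sufficiently high" ceiling suggested above
    early_stopping_rounds=100,  # the default; rarely worth changing
)
```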