Commit b75486e

Merge pull request #42 from mayer79/release_candidate
Release candidate
2 parents 81814e6 + 499dd8e commit b75486e

9 files changed (+62 -26 lines changed)

CRAN-SUBMISSION

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Version: 0.3.0
+Date: 2022-09-29 15:39:13 UTC
+SHA: 0652b701dc44c6446c9ed4c92e4da3d76089f7b6

DESCRIPTION

Lines changed: 10 additions & 10 deletions
@@ -1,22 +1,22 @@
 Package: kernelshap
 Title: Kernel SHAP
-Version: 0.2.0.900
+Version: 0.3.0
 Authors@R: c(
 person("Michael", "Mayer", , "mayermichael79@gmail.com", role = c("aut", "cre")),
 person("David", "Watson", , "david.s.watson11@gmail.com", role = "ctb")
 )
 Description: Multidimensional refinement of the Kernel SHAP algorithm
 described in Ian Covert and Su-In Lee (2021)
-<http://proceedings.mlr.press/v130/covert21a>. Depending on the
-number of features, Kernel SHAP values can be calculated exactly, by
-sampling, or by a combination of the two. As soon as sampling is
-involved, the algorithm iterates until convergence, and standard
-errors are provided. The package allows to work with any model that
+<http://proceedings.mlr.press/v130/covert21a>. The package allows to
+calculate Kernel SHAP values in an exact way, by iterative sampling
+(as in the reference above), or by a hybrid of the two. As soon as
+sampling is involved, the algorithm iterates until convergence, and
+standard errors are provided. The package works with any model that
 provides numeric predictions of dimension one or higher. Examples
-include linear regression, logistic regression (logit or probability
-scale), other generalized linear models, generalized additive models,
-and neural networks. The package plays well together with
-meta-learning packages like 'tidymodels', 'caret' or 'mlr3'.
+include linear regression, logistic regression (on logit or
+probability scale), other generalized linear models, generalized
+additive models, and neural networks. The package plays well together
+with meta-learning packages like 'tidymodels', 'caret' or 'mlr3'.
 Visualizations can be done using the R package 'shapviz'.
 License: GPL (>= 2)
 Depends:

NEWS.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# kernelshap 0.2.0.900 DEVEL
+# kernelshap 0.3.0
 
 ## Major improvements

R/kernelshap.R

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@
 #' The function allows to calculate Kernel SHAP values in an exact way, by iterative sampling
 #' as in CL21, or by a hybrid of these two options. As soon as sampling is involved,
 #' the algorithm iterates until convergence, and standard errors are provided.
-#' The default behaviour depends on the number of features p:
+#' The default behaviour depends on the number of features p, see also Details below:
 #' \itemize{
 #' \item 2 <= p <= 8: Exact Kernel SHAP values are returned (for the given background data).
 #' \item p > 8: Hybrid (partly exact) iterative version of Kernel SHAP
@@ -191,7 +191,7 @@ kernelshap.default <- function(object, X, bg_X, pred_fun = stats::predict, bg_w
 all(nms %in% colnames(bg_X)),
 is.function(pred_fun),
 exact %in% c(TRUE, FALSE),
-p == 1L || hybrid_degree %in% 0:(p / 2),
+p == 1L || exact || hybrid_degree %in% 0:(p / 2),
 paired_sampling %in% c(TRUE, FALSE),
 "m must be even" = trunc(m / 2) == m / 2
 )
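
Review note: the changed condition in the hunk above can be exercised in isolation. The sketch below copies only that one expression into a hypothetical helper, `check_hybrid_degree()`, which is not part of the package. It shows that with `exact = TRUE` the range check on `hybrid_degree` is skipped, presumably because the hybrid degree plays no role in exact mode.

```r
# Hypothetical helper (not in kernelshap): wraps only the changed condition
# from the input checks of kernelshap.default().
check_hybrid_degree <- function(p, exact, hybrid_degree) {
  p == 1L || exact || hybrid_degree %in% 0:(p / 2)
}

# Exact mode: an out-of-range hybrid_degree no longer fails the check
check_hybrid_degree(p = 4L, exact = TRUE, hybrid_degree = 3L)

# Non-exact mode: hybrid_degree must lie in 0:(p / 2), here 0, 1, 2
check_hybrid_degree(p = 4L, exact = FALSE, hybrid_degree = 3L)
check_hybrid_degree(p = 4L, exact = FALSE, hybrid_degree = 2L)
```

Under the old condition (without `exact ||`), the first call would have been rejected just like the second.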

README.md

Lines changed: 4 additions & 4 deletions
@@ -4,7 +4,7 @@
 
 SHAP values (Lundberg and Lee, 2017) decompose model predictions into additive contributions of the features in a fair way. A model agnostic approach is called Kernel SHAP, introduced in Lundberg and Lee (2017), and investigated in detail in Covert and Lee (2021).
 
-The "kernelshap" package implements a multidimensional refinement of the Kernel SHAP Algorithm described in Covert and Lee (2021). The package allows to calculate Kernel SHAP values in an exact way, by iterative sampling (as in Covert and Lee, 2021), or a hybrid of the two. As soon as sampling is involved, the algorithm iterates until convergence, and standard errors are provided.
+The "kernelshap" package implements a multidimensional refinement of the Kernel SHAP Algorithm described in Covert and Lee (2021). The package allows to calculate Kernel SHAP values in an exact way, by iterative sampling (as in Covert and Lee, 2021), or by a hybrid of the two. As soon as sampling is involved, the algorithm iterates until convergence, and standard errors are provided.
 
 The default behaviour depends on the number of features $p$:

@@ -283,7 +283,7 @@ fit <- gam(Sepal.Length ~ s(Sepal.Width) + Species, data = iris)
 
 system.time(
 s <- kernelshap(
-fit,
+fit,
 iris[c(2, 5)],
 bg_X = iris,
 parallel = TRUE,
@@ -300,7 +300,7 @@ SHAP values of first 2 observations:
 
 ## Exact/sampling/hybrid
 
-In above examples, since $p$ was small, exact Kernel SHAP values were calculated. Here, we want to show how to use the different strategies (exact, hybrid, and pure sampling) in a situation with ten features.
+In above examples, since $p$ was small, exact Kernel SHAP values were calculated. Here, we want to show how to use the different strategies (exact, hybrid, and pure sampling) in a situation with ten features, see `?kernelshap` for details about those strategies.
 
 With ten features, a degree 2 hybrid is being used by default:

@@ -343,7 +343,7 @@ s$S[1:5]
 
 The results are identical. While more on-off vectors $z$ were required (1022), only a single call to `predict()` was necessary.
 
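
Review note: the figure 1022 quoted in the README context line is consistent with counting all on-off vectors for ten features except the all-zeros and all-ones vectors. The arithmetic below is a sketch under that assumption; the variable names are illustrative and do not come from the package.

```r
# Assumption: the README example uses p = 10 features, and the two trivial
# on-off vectors (all 0 and all 1) are not counted.
p <- 10L

# Direct count: all binary vectors of length p minus the two trivial ones
n_z <- 2^p - 2

# Same count, grouped by the number of "on" features (1 to p - 1)
n_z_by_size <- sum(choose(p, 1:(p - 1)))

c(n_z, n_z_by_size)  # both equal 1022
```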
-Pure sampling can be enforced by setting the hybrid degree to 0:
+Pure sampling (not recommended!) can be enforced by setting the hybrid degree to 0:
 
 ```r
 s <- kernelshap(fit, X[1L, ], bg_X = X, hybrid_degree = 0)

compare_with_python.R

Lines changed: 2 additions & 2 deletions
@@ -65,8 +65,8 @@ fit <- lm(
 X_small <- diamonds[seq(1, nrow(diamonds), 53), setdiff(names(diamonds), "price")]
 
 # Exact KernelSHAP on X_small, using X_small as background data
-# (71/59 seconds for exact, 27/17 for hybrid deg 2, 17/9 for hybrid deg 1,
-# 26/15 for pure sampling; second number with 2 parallel sessions on Windows)
+# (58/67(?) seconds for exact, 25/18 for hybrid deg 2, 16/9 for hybrid deg 1,
+# 26/17 for pure sampling; second number with 2 parallel sessions on Windows)
 system.time(
 ks <- kernelshap(fit, X_small, bg_X = bg_X)
 )

cran-comments.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+Hello CRAN
+
+This is an update with
+
+- a much better way to calculate *exact* KernelSHAP values,
+- and a very fast and accurate hybrid between exact and sampling.
+
+Furthermore, some defaults have been improved. As the package is maturing, the next
+update will hopefully be version 1.0.0.
+
+## Checks
+
+### `check(manual = TRUE, cran = TRUE)`
+
+0 errors ✔ | 0 warnings ✔ | 0 notes ✔
+
+### `check_win_devel()`
+
+* checking for detritus in the temp directory ... NOTE
+Found the following files/directories:
+'lastMiKTeXException'
+
+0 errors ✔ | 0 warnings ✔ | 1 note ✖
+
+### `check_rhub()`
+
+- Ubuntu Linux 20.04.1 LTS, R-release, GCC: Okay
+- Platform: Fedora Linux, R-devel, clang, gfortran: Note
+
+* checking HTML version of manual ... NOTE
+Skipping checking HTML validation: no command 'tidy' found
+

man/kernelshap.Rd

Lines changed: 1 addition & 1 deletion
Generated file; diff not rendered.

packaging.R

Lines changed: 7 additions & 6 deletions
@@ -15,14 +15,15 @@ library(usethis)
 use_description(
 fields = list(
 Title = "Kernel SHAP",
-Version = "0.2.0.900",
+Version = "0.3.0",
 Description = "Multidimensional refinement of the Kernel SHAP algorithm described in
 Ian Covert and Su-In Lee (2021) <http://proceedings.mlr.press/v130/covert21a>.
-Depending on the number of features, Kernel SHAP values can be calculated exactly,
-by sampling, or by a combination of the two. As soon as sampling is involved,
-the algorithm iterates until convergence, and standard errors are provided.
-The package allows to work with any model that provides numeric predictions of dimension one or higher.
-Examples include linear regression, logistic regression (logit or probability scale),
+The package allows to calculate Kernel SHAP values in an exact way, by iterative
+sampling (as in the reference above), or by a hybrid of the two.
+As soon as sampling is involved, the algorithm iterates until convergence,
+and standard errors are provided.
+The package works with any model that provides numeric predictions of dimension one or higher.
+Examples include linear regression, logistic regression (on logit or probability scale),
 other generalized linear models, generalized additive models, and neural networks.
 The package plays well together with meta-learning packages like 'tidymodels', 'caret' or 'mlr3'.
 Visualizations can be done using the R package 'shapviz'.",
