Commit e4870fb

Merge pull request #21 from mayer79/cran_v2
prepare CRAN version 0.2.0
2 parents 89701cc + fe37624 commit e4870fb

11 files changed, +187 -144 lines changed

.Rbuildignore

Lines changed: 1 addition & 0 deletions
@@ -8,3 +8,4 @@
 ^\.Rproj\.user$
 ^compare_with_python.R$
 ^Z_exact.R$
+^CRAN-SUBMISSION$

CRAN-SUBMISSION

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Version: 0.2.0
+Date: 2022-09-05 12:31:15 UTC
+SHA: e28ed70f22cbeb7f234bc554c8eda7153d538e41

DESCRIPTION

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
 Package: kernelshap
 Title: Kernel SHAP
-Version: 0.1.900
+Version: 0.2.0
 Authors@R: c(
     person("Michael", "Mayer", , "mayermichael79@gmail.com", role = c("aut", "cre")),
     person("David", "Watson", , "david.s.watson11@gmail.com", role = "ctb")
@@ -13,9 +13,9 @@ Description: Multidimensional version of the iterative Kernel SHAP
     provides numeric predictions of dimension one or higher. Examples
     include linear regression, logistic regression (logit or probability
     scale), other generalized linear models, generalized additive models,
-    and neural networks. The package plays well together with
-    meta-learning packages like 'caret' or 'mlr3'. Visualizations can be
-    done using the R package 'shapviz'.
+    and neural networks. The package plays well together with
+    meta-learning packages like 'tidymodels', 'caret' or 'mlr3'.
+    Visualizations can be done using the R package 'shapviz'.
 License: GPL (>= 2)
 Depends:
     R (>= 3.2.0)

NEWS.md

Lines changed: 10 additions & 22 deletions
@@ -1,39 +1,27 @@
-# kernelshap 0.1.900 DEVEL
+# kernelshap 0.2.0

 ## Breaking change

-The interface of `kernelshap()` has been revised. Instead of specifying a prediction function, it suffices now to pass the fitted model object. The default `pred_fun` is now `stats::predict`, which works in most cases. Some other cases are catched via model class ("ranger" and mlr3 "Learner"). The `pred_fun` can be overwritten by a function of the form `function(object, X, ...)`.
+The interface of `kernelshap()` has been revised. Instead of specifying a prediction function, it suffices now to pass the fitted model object. The default `pred_fun` is now `stats::predict`, which works in most cases. Some other cases are catched via model class ("ranger" and mlr3 "Learner"). The `pred_fun` can be overwritten by a function of the form `function(object, X, ...)`. Additional arguments to the prediction function are passed via `...` of `kernelshap()`.

-Example: Logistic regression with predictions on logit scale
+Some examples:

-```
-kernelshap(fit, X, bg_X)
-```
-
-Example: Logistic regression with predictions on probability scale
-
-```
-kernelshap(fit, X, bg_X, type = "response")
-```
-
-Example: Log-linear regression to be evaluated on original scale.
-Here, the default predict function needs to be overwritten:
-
-```
-kernelshap(fit, X, bg_X, pred_fun = function(m, X) exp(predict(m, X)))
-```
+- Logistic regression (logit scale): `kernelshap(fit, X, bg_X)`
+- Logistic regression (probabilities): `kernelshap(fit, X, bg_X, type = "response")`
+- Linear regression with logarithmic response, but evaluated on original scale: Here, the default predict function needs to be overwritten: `kernelshap(fit, X, bg_X, pred_fun = function(m, X) exp(predict(m, X)))`

 ## Major improvements

 - `kernelshap()` has received a more intuitive interface, see breaking change above.
 - The package now supports multidimensional predictions. Hurray!
-- Thanks to David Watson, parallel computing is now supported. The user needs to set up the parallel backend before calling `kernelshap()`, i.e., using the "doFuture" package, and then set `parallel = TRUE`. Especially on Windows, sometimes not all global variables or packages are loaded in the parallel instances. These can be specified by `parallel_args`, a list of arguments passed to `foreach()`.
+- Thanks to David Watson, parallel computing is now supported. The user needs to set up the parallel backend before calling `kernelshap()`, e.g., using the "doFuture" package, and then set `parallel = TRUE`. Especially on Windows, sometimes not all global variables or packages are loaded in the parallel instances. These can be specified by `parallel_args`, a list of arguments passed to `foreach()`.
 - Even without parallel computing, `kernelshap()` has become much faster.
 - For $2 \le p \le 5$ features, the algorithm now returns exact Kernel SHAP values with respect to the given background data. (For $p = 1$, exact *Shapley values* are returned.)
-- Besides `matrix`, `data.frame`s, and `tibble`s, the package now also accepts `data.table`s (if the prediction function can deal with them).
+- Direct handling of "tidymodels" models.

 ## User visible changes

+- Besides `matrix`, `data.frame`s, and `tibble`s, the package now also accepts `data.table`s (if the prediction function can deal with them).
 - `kernelshap()` is less picky regarding the output structure of `pred_fun()`.
 - `kernelshap()` is less picky about the column structure of the background data `bg_X`. It should simply contain the columns of `X` (but can have more or in different order). The old behaviour was to launch an error if `colnames(X) != colnames(bg_X)`.
 - The default `m = "auto"` has been changed from `trunc(20 * sqrt(p))` to `max(trunc(20 * sqrt(p)), 5 * p`. This will have an effect for cases where the number of features $p > 16$. The change will imply more robust results for large p.
@@ -46,7 +34,7 @@ kernelshap(fit, X, bg_X, pred_fun = function(m, X) exp(predict(m, X)))

 ## New contributor

-- David Watson is now contributor of the package.
+- David Watson

 # kernelshap 0.1.0

5240

R/kernelshap.R

Lines changed: 6 additions & 6 deletions
@@ -52,10 +52,10 @@
 #' on the same scale.
 #' @param max_iter If the stopping criterion (see \code{tol}) is not reached after
 #' \code{max_iter} iterations, the algorithm stops.
-#' @param parallel If \code{TRUE}, use parallel \code{foreach} to loop over rows
-#' to be explained. Must register backend beforehand, e.g. via \code{doFuture},
+#' @param parallel If \code{TRUE}, use parallel \code{foreach::foreach()} to loop over rows
+#' to be explained. Must register backend beforehand, e.g. via "doFuture" package,
 #' see Readme for an example. Parallelization automatically disables the progress bar.
-#' @param parallel_args A named list of arguments passed to \code{foreach()}, see
+#' @param parallel_args A named list of arguments passed to \code{foreach::foreach()}, see
 #' \code{?foreach::foreach}. Ideally, this is \code{NULL} (default). Only relevant
 #' if \code{parallel = TRUE}. Example on Windows: if \code{object} is a generalized
 #' additive model fitted with package "mgcv", then one might need to set
@@ -81,7 +81,7 @@
 #' @examples
 #' # Linear regression
 #' fit <- stats::lm(Sepal.Length ~ ., data = iris)
-#' s <- kernelshap(fit, iris[1:2, -1], bg_X = iris[, -1])
+#' s <- kernelshap(fit, iris[1:2, -1], bg_X = iris)
 #' s
 #'
 #' # Multivariate model
@@ -106,11 +106,11 @@
 #' )
 #'
 #' # On scale of linear predictor
-#' s <- kernelshap(fit, iris[1:2], bg_X = iris[1:2])
+#' s <- kernelshap(fit, iris[1:2], bg_X = iris)
 #' s
 #'
 #' # On scale of response (probability)
-#' s <- kernelshap(fit, iris[1:2], bg_X = iris[1:2], type = "response")
+#' s <- kernelshap(fit, iris[1:2], bg_X = iris, type = "response")
 #' s
 #'
 kernelshap <- function(object, ...){
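
Note on the changed examples: they now pass the full `iris` data as `bg_X`, which relies on the NEWS.md statement that `bg_X` only needs to contain the columns of `X`. How the package aligns the columns internally is not visible in this diff. A minimal base R sketch of the idea, using a hypothetical helper `align_bg()` that is not part of the package, could look like this:

```r
# Hypothetical helper, not the package's implementation: keep only the
# columns of X in bg_X, in the order in which they appear in X.
align_bg <- function(X, bg_X) {
  missing_cols <- setdiff(colnames(X), colnames(bg_X))
  if (length(missing_cols) > 0L) {
    stop("bg_X lacks columns: ", paste(missing_cols, collapse = ", "))
  }
  bg_X[, colnames(X), drop = FALSE]
}

X <- iris[1:2, -1]       # rows to explain, without the response column
bg <- align_bg(X, iris)  # full data as background; the extra column is dropped
colnames(bg)
```

Under this assumption, extra columns or a different column order in `bg_X` are harmless, while a missing column is still an error.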

R/utils.R

Lines changed: 7 additions & 1 deletion
@@ -157,6 +157,12 @@ reorganize_list <- function(alist, nms) {

 # Checks and reshapes predictions to (n x K) matrix
 check_pred <- function(x, n) {
+  if (!is.vector(x) && !is.matrix(x) && !is.data.frame(x)) {
+    stop("Predictions must be a vector, matrix, or data.frame")
+  }
+  if (is.data.frame(x)) {
+    x <- as.matrix(x)
+  }
   if (!is.numeric(x)) {
     stop("Predictions must be numeric")
   }
@@ -166,7 +172,7 @@ check_pred <- function(x, n) {
   if (length(x) == n) {
     return(matrix(x, nrow = n))
   }
-  stop("Predictions must be a length n vector or a matrix with n rows.")
+  stop("Predictions must be a length n vector or a matrix/data.frame with n rows.")
 }

 # Informative warning if background data is small or large
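
Note on the `check_pred()` change: the new lines let a prediction function return a data.frame, which is converted to a matrix before the numeric check. The sketch below reassembles the visible lines into a standalone function to show the behaviour. The name `check_pred_sketch()` is hypothetical, and the matrix branch in the middle is an assumption, because those lines lie between the two hunks and are not shown in the diff.

```r
# Sketch only: mirrors the lines visible in the diff. The matrix branch in the
# middle is hidden between the two hunks and is an assumption here.
check_pred_sketch <- function(x, n) {
  if (!is.vector(x) && !is.matrix(x) && !is.data.frame(x)) {
    stop("Predictions must be a vector, matrix, or data.frame")
  }
  if (is.data.frame(x)) {
    x <- as.matrix(x)
  }
  if (!is.numeric(x)) {
    stop("Predictions must be numeric")
  }
  if (is.matrix(x) && nrow(x) == n) {  # assumed, not shown in the diff
    return(x)
  }
  if (length(x) == n) {
    return(matrix(x, nrow = n))
  }
  stop("Predictions must be a length n vector or a matrix/data.frame with n rows.")
}

# A two-column data.frame of predictions becomes a 2 x 2 matrix
dim(check_pred_sketch(data.frame(a = c(0.1, 0.9), b = c(0.9, 0.1)), n = 2))

# A plain numeric vector becomes a 3 x 1 matrix
dim(check_pred_sketch(c(1.5, 2.5, 3.5), n = 3))
```

A factor, for example, is neither a vector, a matrix, nor a data.frame in the sense of these checks, so it is rejected by the first condition.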
