@mayer79 (Collaborator) commented Jul 19, 2025

With input from Mario Wuethrich and from Ian Covert and his repo, we have fixed a bug in how kernelshap() calculates the kernel weights.

  • The resulting differences are typically very small; see the first example below.
  • Models with interactions of order up to two were unaffected.
  • The PR comes with unit tests that compare a model with high-order interactions against Python's "shap".
  • Exact Kernel SHAP now provides identical results to exact permutation SHAP.
  • The sampling versions of Kernel SHAP converge to the exact versions when a small tolerance is set.
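
To illustrate why exact Kernel SHAP and exact permutation SHAP should coincide, here is a minimal Python sketch. It is not the code of this PR or of the package. It assumes the textbook formulation: Shapley values by full subset enumeration on one side, and on the other side the constrained weighted least-squares problem with Shapley kernel weights (M - 1) / (C(M, |S|) |S| (M - |S|)). The helper names `value_function`, `exact_shapley`, and `exact_kernel_shap` are made up for this illustration; the data and the prediction function are the ones used in the Python comparison of this PR.

```python
from itertools import combinations
from math import comb, factorial

import numpy as np

idx = np.arange(1, 101)
X = np.column_stack(
    [idx / 100, np.log(idx), np.sqrt(idx), np.sin(idx), (idx / 100) ** 2, np.cos(idx)]
)


def predict(X):
    return X[:, 0] * X[:, 1] * X[:, 2] * X[:, 3] + X[:, 4] + X[:, 5]


def value_function(x, bg):
    """v(S): mean prediction over bg with the features in S fixed at x (hypothetical helper)."""
    m = bg.shape[1]
    v = {}
    for size in range(m + 1):
        for S in combinations(range(m), size):
            Z = bg.copy()
            cols = list(S)
            if cols:
                Z[:, cols] = x[cols]
            v[S] = predict(Z).mean()
    return v


def exact_shapley(v, m):
    """Shapley values by enumerating all subsets (what exact permutation SHAP targets)."""
    phi = np.zeros(m)
    for j in range(m):
        others = [k for k in range(m) if k != j]
        for size in range(m):
            w = factorial(size) * factorial(m - size - 1) / factorial(m)
            for S in combinations(others, size):
                phi[j] += w * (v[tuple(sorted(S + (j,)))] - v[S])
    return phi


def exact_kernel_shap(v, m):
    """Constrained weighted least squares with Shapley kernel weights (assumed formulation)."""
    full, empty = tuple(range(m)), ()
    A = np.zeros((m, m))
    b = np.zeros(m)
    for size in range(1, m):
        w = (m - 1) / (comb(m, size) * size * (m - size))
        for S in combinations(range(m), size):
            z = np.zeros(m)
            z[list(S)] = 1.0
            A += w * np.outer(z, z)
            b += w * z * (v[S] - v[empty])
    # The Lagrange multiplier enforces sum(beta) = v(full) - v(empty)
    Ainv_b = np.linalg.solve(A, b)
    Ainv_1 = np.linalg.solve(A, np.ones(m))
    lam = (Ainv_b.sum() - (v[full] - v[empty])) / Ainv_1.sum()
    return Ainv_b - lam * Ainv_1


results = []
for x in X[:2]:
    v = value_function(x, X)
    results.append((exact_shapley(v, 6), exact_kernel_shap(v, 6)))
    print(np.round(results[-1][0], 6), np.round(results[-1][1], 6))
```

With correct weights, both functions should return the same vector for each row up to floating-point error, even though the model contains a four-way interaction.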

Ping @dswatson : Note that the calculation of the exact A matrix in Covert-Lee is actually correct. We were misled by wrong weights back then.

Ping @wueth

The resulting SHAP values were only slightly off

First example in the readme (random forest):

# Before the fix

#       log_carat     clarity       color        cut
# [1,]  1.1911791  0.0900462 -0.13531648 0.001845958
# [2,] -0.4927482 -0.1168517  0.09815062 0.028255442

# After the fix

(ks <- kernelshap(fit, X, bg_X = bg_X))

      log_carat     clarity       color         cut
[1,]  1.1913247  0.09005467 -0.13430720 0.000682593
[2,] -0.4931989 -0.11724773  0.09868921 0.028563613

# Now equal to permutation SHAP
(ps <- permshap(fit, X, bg_X = bg_X))

      log_carat     clarity       color         cut
[1,]  1.1913247  0.09005467 -0.13430720 0.000682593
[2,] -0.4931989 -0.11724773  0.09868921 0.028563613
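
To put a number on "only slightly off", the following small Python sketch compares the two printed matrices above. The values are copied from this comment; nothing else is assumed.

```python
import numpy as np

# SHAP values printed above (columns: log_carat, clarity, color, cut)
before = np.array(
    [
        [1.1911791, 0.0900462, -0.13531648, 0.001845958],
        [-0.4927482, -0.1168517, 0.09815062, 0.028255442],
    ]
)
after = np.array(
    [
        [1.1913247, 0.09005467, -0.13430720, 0.000682593],
        [-0.4931989, -0.11724773, 0.09868921, 0.028563613],
    ]
)

abs_diff = np.abs(after - before)
max_diff = abs_diff.max()
row, col = np.unravel_index(abs_diff.argmax(), abs_diff.shape)
print(abs_diff)
print(max_diff, row, col)
```

The largest absolute change is about 0.0012, for `cut` in the first row.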

Comparison with Python's "shap"

Now part of the unit tests.

Python

import numpy as np
import shap  # 0.47.2

X = np.array(
    [
        np.arange(1, 101) / 100,
        np.log(np.arange(1, 101)),
        np.sqrt(np.arange(1, 101)),
        np.sin(np.arange(1, 101)),
        (np.arange(1, 101) / 100) ** 2,
        np.cos(np.arange(1, 101)),
    ]
).T


def predict(X):
    return X[:, 0] * X[:, 1] * X[:, 2] * X[:, 3] + X[:, 4] + X[:, 5]


ks = shap.explainers.Kernel(predict, X, nsamples=10000)
es = shap.explainers.Exact(predict, X)

print("Exact Kernel SHAP:\n", ks(X[0:2]).values)
print("Exact (Permutation) SHAP:\n", es(X[0:2]).values)

# Exact Kernel SHAP:
#  [[-1.19621609 -1.24184808 -0.9567848   3.87942037 -0.33825     0.54562519]
#  [-1.64922699 -1.20770105 -1.18388581  4.54321217 -0.33795    -0.41082395]]
# Exact (Permutation) SHAP:
#  [[-1.19621609 -1.24184808 -0.9567848   3.87942037 -0.33825     0.54562519]
#  [-1.64922699 -1.20770105 -1.18388581  4.54321217 -0.33795    -0.41082395]]
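
As an additional plausibility check (not part of the unit tests in this PR), the printed values can be tested against two properties that exact SHAP values should satisfy here: each row sums to the prediction minus the average prediction over the background data, and the purely additive features (columns 5 and 6) get their value minus the column mean. The sketch below assumes the background data is the full `X`, as in the calls above.

```python
import numpy as np

idx = np.arange(1, 101)
X = np.array(
    [idx / 100, np.log(idx), np.sqrt(idx), np.sin(idx), (idx / 100) ** 2, np.cos(idx)]
).T


def predict(X):
    return X[:, 0] * X[:, 1] * X[:, 2] * X[:, 3] + X[:, 4] + X[:, 5]


# Values copied from the output printed above
shap_values = np.array(
    [
        [-1.19621609, -1.24184808, -0.9567848, 3.87942037, -0.33825, 0.54562519],
        [-1.64922699, -1.20770105, -1.18388581, 4.54321217, -0.33795, -0.41082395],
    ]
)

baseline = predict(X).mean()

# Efficiency: SHAP values of a row sum to prediction minus baseline
efficiency_gap = shap_values.sum(axis=1) - (predict(X[:2]) - baseline)

# Additive features: SHAP value equals feature value minus its background mean
additive_gap = shap_values[:, 4:] - (X[:2, 4:] - X[:, 4:].mean(axis=0))

print(np.abs(efficiency_gap).max(), np.abs(additive_gap).max())
```

Both gaps should be zero up to the rounding of the printed values.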

R

n <- 100

X <- data.frame(
  x1 = seq(1:n) / 100,
  x2 = log(1:n),
  x3 = sqrt(1:n),
  x4 = sin(1:n),
  x5 = (seq(1:n) / 100)^2,
  x6 = cos(1:n)
)

pf <- function(model, newdata) {
  x <- newdata
  x[, 1] * x[, 2] * x[, 3] * x[, 4] + x[, 5] + x[, 6]
}
(ks <- kernelshap(pf, head(X, 2), bg_X = X, pred_fun = pf))
#             x1        x2         x3       x4       x5         x6
# [1,] -1.196216 -1.241848 -0.9567848 3.879420 -0.33825  0.5456252
# [2,] -1.649227 -1.207701 -1.1838858 4.543212 -0.33795 -0.4108240

(ps <- permshap(pf, head(X, 2), bg_X = X, pred_fun = pf))
#             x1        x2         x3       x4       x5         x6
# [1,] -1.196216 -1.241848 -0.9567848 3.879420 -0.33825  0.5456252
# [2,] -1.649227 -1.207701 -1.1838858 4.543212 -0.33795 -0.4108240

@mayer79 mayer79 requested review from Copilot and pbiecek July 19, 2025 19:41

@codecov-commenter

codecov-commenter commented Jul 19, 2025

Codecov Report

Attention: Patch coverage is 95.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 96.67%. Comparing base (abed201) to head (e37913a).

Files with missing lines    Patch %    Lines
R/utils_kernelshap.R        95.00%     1 Missing

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #168      +/-   ##
==========================================
- Coverage   96.83%   96.67%   -0.16%     
==========================================
  Files           7        7              
  Lines         663      662       -1     
==========================================
- Hits          642      640       -2     
- Misses         21       22       +1     

@mayer79 mayer79 self-assigned this Jul 19, 2025
@mayer79 mayer79 added the bug Something isn't working label Jul 19, 2025
@mayer79 mayer79 changed the title from FIX-major-bug-in-weighting-logic to FIX-bug-in-weighting-logic Jul 20, 2025
Member

@pbiecek pbiecek left a comment

Awesome, thank you for this fix.

After this cross-implementation verification, I think that the package is ready for version 1.0.

@mayer79 (Collaborator, Author) commented Jul 21, 2025

@pbiecek : Thanks a lot! Good idea to increment the version to a stable 1.0.0.

@mayer79 mayer79 merged commit b9ff089 into main Jul 21, 2025
7 checks passed
@mayer79 mayer79 deleted the fix-weighting-logic branch July 21, 2025 08:05
@dswatson

Cool! Thanks for the heads up, this looks great.
