Model interpretability and understanding for PyTorch (a usage sketch follows the list below)
Shapley Interactions and Shapley Values for Machine Learning (a Shapley-value sketch appears at the end of this page)
Zennit is a high-level Python framework built on PyTorch for explaining and exploring neural networks with attribution methods such as LRP.
Collection of NLP model explanations and accompanying analysis tools
An open-source library for the interpretability of time series classifiers
Explainable AI in Julia.
Counterfactual SHAP: a framework for counterfactual feature importance
Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 🐑 🐑
Materials for the Lab "Explaining Neural Language Models from Internal Representations to Model Predictions" at AILC LCL 2023 🔍
The official repo for the EACL 2023 paper "Quantifying Context Mixing in Transformers"
Code and data for the ACL 2023 NLReasoning Workshop paper "Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods" (Feldhus et al., 2023)
⛈️ Code for the paper "End-to-End Prediction of Lightning Events from Geostationary Satellite Images"
Efficient and accurate explanation estimation with distribution compression (ICLR 2025 Spotlight)
Implementation of the Integrated Directional Gradients method for deep neural network model explanations.
Reproducible code for our paper "Explainable Learning with Gaussian Processes"
Robustness of Global Feature Effect Explanations (ECML PKDD 2024)
Code for the paper "On marginal feature attributions of tree-based models"
Feature attribution methods for neurons and evolution experiments
NO₂ prediction: performance and robustness comparison between a random forest and a graph neural network
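Most of the attribution libraries listed above share a common workflow: wrap a trained model in an explainer object, then ask it to score input features against a chosen target output. As an illustration, here is a minimal sketch using Integrated Gradients from Captum (the PyTorch interpretability library whose tagline opens this list); the toy model and inputs are hypothetical stand-ins, not code from any repository listed here.

```python
# A minimal Integrated Gradients sketch with Captum; the model and inputs are
# hypothetical stand-ins for a real trained classifier and dataset.
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
model.eval()
inputs = torch.randn(2, 4, requires_grad=True)

ig = IntegratedGradients(model)
# Attribute the class-0 logit to each input feature; the baseline defaults to a
# zero tensor. `delta` reports the approximation error of the path integral.
attributions, delta = ig.attribute(inputs, target=0, return_convergence_delta=True)
print(attributions.shape)  # torch.Size([2, 4]): one score per input feature
```

The same explainer-object pattern recurs across the gradient-based tools on this page (Zennit's LRP composites, Integrated Directional Gradients, and so on); they differ mainly in how relevance is propagated through the network.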
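The Shapley-value entries above follow the same pattern, with a game-theoretic estimator in place of gradients. The sketch below uses the widely known `shap` package for concreteness; the Shapley libraries listed on this page (e.g., for Shapley interactions) expose analogous but not identical APIs, and the model and data here are again hypothetical.

```python
# A minimal Shapley-value sketch with the `shap` package; synthetic data and a
# scikit-learn model stand in for a real task.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((100, 4))
y = 2.0 * X[:, 0] + X[:, 1]  # synthetic target driven by the first two features

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# TreeExplainer computes exact Shapley values for tree ensembles in polynomial time.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])
print(shap_values.shape)  # (5, 4): one attribution per sample and feature
```

For tree-based models, this exact computation is what makes marginal (Shapley-style) feature attributions tractable, which is also the setting studied in the tree-model paper listed above.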