This repository contains the code and experiments accompanying the paper "Back to the Baseline: Examining Baseline Effects on Explainability Metrics." The paper investigates the impact of baseline choices on fidelity metrics used to evaluate attribution methods in Explainable Artificial Intelligence (XAI).
- Baseline Sensitivity: The Deletion metric is highly sensitive to the choice of baseline, making it unreliable. The Insertion metric appears stable but operates primarily in a high out-of-distribution (OOD) regime.
- Trade-offs in Baselines: Existing baselines cannot both remove information and stay in-distribution; those that remove information effectively tend to produce OOD images.
- Model-Dependent Baseline: We propose a solution using feature visualization to create a model-dependent baseline that removes information without producing OOD images.
Our experiments show that the Deletion metric is heavily influenced by the chosen baseline, as illustrated in the figure: the same attribution methods are ranked inconsistently across different baselines, undermining the metric's reliability.
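To make the dependence on the baseline concrete, here is a minimal sketch of a Deletion-style evaluation with a pluggable baseline, assuming a PyTorch classifier. The function `deletion_auc` and the baselines named in the comments are illustrative, not this repository's API: pixels ranked most important by an attribution map are progressively replaced by baseline values, and the curve of class scores is summarized by its area.

```python
# Minimal sketch (not the repository's API): Deletion with a pluggable baseline.
import torch
import torch.nn.functional as F

def deletion_auc(model, image, attribution, baseline, steps=50):
    """image, baseline: (C, H, W); attribution: (H, W). Returns the mean class score."""
    _, h, w = image.shape
    order = attribution.flatten().argsort(descending=True)   # most important pixels first
    scores = []
    with torch.no_grad():
        target = model(image.unsqueeze(0)).argmax(dim=1)
        for step in range(steps + 1):
            k = int(round(step / steps * order.numel()))
            mask = torch.ones(h * w, device=image.device)
            mask[order[:k]] = 0.0                             # pixels deleted so far
            mask = mask.view(1, h, w)
            perturbed = image * mask + baseline * (1 - mask)  # deleted pixels take baseline values
            prob = F.softmax(model(perturbed.unsqueeze(0)), dim=1)[0, target]
            scores.append(prob.item())
    return sum(scores) / len(scores)                          # crude AUC of the deletion curve

# Swapping the baseline can change the resulting ranking of attribution methods, e.g.:
# zero_baseline  = torch.zeros_like(image)
# blur_baseline  = torchvision.transforms.functional.gaussian_blur(image, kernel_size=21)
# noise_baseline = torch.rand_like(image)
```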
Our experiments also reveal a significant trade-off between removing information and producing OOD images, as shown in Figure 2. The removal score measures how effectively a baseline removes information, while the OOD score measures how far the baseline shifts the image away from the data distribution. The scatter plot on the left of Figure 2 highlights this trade-off: baselines with high removal scores also tend to have high OOD scores. The grid of images on the right shows how different baselines transform an input image, further illustrating the tension between removing information effectively and keeping the image in-distribution. Our proposed model-dependent baseline, denoted by a star, strikes a better balance: it removes information effectively without pushing the image far out of distribution.
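For intuition only, the sketch below uses simple stand-ins for the two scores; the definitions used in the paper differ and should be taken from the paper itself. Here the removal score is assumed to be the drop in class probability when the image is replaced by its baseline, and the OOD score the distance of the baseline's features from an average natural-image feature vector (`natural_feature_mean` is a hypothetical precomputed statistic).

```python
# Illustrative stand-ins only; the paper defines the actual removal and OOD scores.
import torch
import torch.nn.functional as F

@torch.no_grad()
def removal_score(model, image, baseline_image, target):
    # Assumed proxy: drop in class probability when the baseline replaces the image.
    p_img = F.softmax(model(image.unsqueeze(0)), dim=1)[0, target]
    p_base = F.softmax(model(baseline_image.unsqueeze(0)), dim=1)[0, target]
    return (p_img - p_base).item()            # larger = more information removed

@torch.no_grad()
def ood_score(feature_extractor, baseline_image, natural_feature_mean):
    # Assumed proxy: distance of the baseline's features from average natural-image features.
    feats = feature_extractor(baseline_image.unsqueeze(0)).flatten(1)
    return torch.norm(feats - natural_feature_mean, dim=1).item()  # larger = further OOD
```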
To address the limitations of current baselines, we propose a model-dependent baseline generated with feature visualization. This baseline aims to remove information without pushing the image out of distribution. The figure shows examples of such baselines: each image is generated by minimizing the activation of specific features in a deep neural network. The resulting images vary substantially with the model and the features being minimized, demonstrating the adaptability of the approach and providing a more reliable baseline for evaluating attribution methods.
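The sketch below illustrates the underlying feature-minimization idea under stated assumptions: the layer choice, optimizer, step count, and pixel clamping are illustrative, not the paper's exact recipe. An image is optimized by gradient descent so that the activations of a chosen layer are driven towards zero.

```python
# Hedged sketch of a feature-minimization baseline, not the paper's exact procedure.
import torch

def minimize_activations(model, layer, image, steps=256, lr=0.05):
    activations = {}
    # Capture the chosen layer's output on each forward pass.
    handle = layer.register_forward_hook(
        lambda module, inputs, output: activations.update(value=output)
    )
    baseline = image.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([baseline], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        model(baseline.unsqueeze(0))
        loss = activations["value"].pow(2).mean()   # push the layer's activations towards zero
        loss.backward()
        optimizer.step()
        baseline.data.clamp_(0, 1)                  # keep a valid image range (assumed [0, 1])
    handle.remove()
    return baseline.detach()
```

Initializing from the original image rather than from noise is an assumption made here for simplicity; the key point is that the resulting baseline depends on the model whose explanations are being evaluated.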
For any questions or issues, please open an issue on GitHub or contact the authors via email.