Awesome Concept Bottleneck Models

Work in progress: we have compiled and summarized relevant papers in this field by year and are continuing to improve the categorization and organization of the collection to help newcomers quickly understand the area. Feel free to suggest improvements or add new papers via a pull request.

Introduction

The Concept Bottleneck Model (CBM) is an emerging self-explainable architecture that first maps inputs to a set of human-interpretable concepts and then makes predictions from those concepts with an interpretable classifier, typically a single linear layer. Beyond this inherent interpretability, CBMs provide an intervention interface through the concept bottleneck layer: users can directly modify concept activations to refine model predictions. This is the most significant difference between CBMs and other explainable models such as Capsule Networks and ProtoPNet.

(images from IntCEMs, highlighting the interpretability and intervention ability of CBM architectures)
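
The two-stage structure and the intervention interface described above can be sketched in a few lines. This is a minimal illustration, not any paper's implementation: the weights, the input, and the helper names (`predict_concepts`, `predict_label`, `intervene`) are hypothetical and chosen by hand so the effect of an intervention is easy to see.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_concepts(x, concept_weights, concept_biases):
    """Map input features to concept probabilities (the bottleneck layer)."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(concept_weights, concept_biases)
    ]

def predict_label(concepts, class_weights, class_biases):
    """Linear classifier on top of the concepts; returns the argmax class index."""
    logits = [
        sum(w * c for w, c in zip(row, concepts)) + b
        for row, b in zip(class_weights, class_biases)
    ]
    return max(range(len(logits)), key=logits.__getitem__)

def intervene(concepts, corrections):
    """Overwrite selected concept activations with user-provided values."""
    fixed = list(concepts)
    for index, value in corrections.items():
        fixed[index] = value
    return fixed

# Toy, hand-picked parameters (hypothetical): 2 input features, 2 concepts, 2 classes.
CONCEPT_W = [[4.0, 0.0], [0.0, 4.0]]
CONCEPT_B = [-2.0, -2.0]
CLASS_W = [[2.0, -2.0], [-2.0, 2.0]]
CLASS_B = [0.0, 0.0]

x = [0.9, 0.6]
concepts = predict_concepts(x, CONCEPT_W, CONCEPT_B)
before = predict_label(concepts, CLASS_W, CLASS_B)

# A user judges concept 0 absent and concept 1 present, then re-runs the classifier.
after = predict_label(intervene(concepts, {0: 0.0, 1: 1.0}), CLASS_W, CLASS_B)
```

Because the label depends on the input only through the concept vector, correcting concepts is enough to change the prediction; no retraining or gradient step is involved in this sketch.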

Papers Sorted by Research Focus

Architecture Improvements

Improving Concept Representations

The original Concept Bottleneck Model maps each concept to a single (probabilistic) value to construct the concept bottleneck layer, followed by a linear layer that predicts image-level class labels based on these concept values. However, the semantics of individual concepts, the relationships and hierarchies among different concepts, and the dependencies between concepts and class labels are inherently complex. Therefore, to address the need for richer, more expressive concept representations and to model the intricate concept–concept and concept–class relationships, many studies have proposed improvements to the representation methods used in the concept bottleneck layer.

| Method | Publication | Concept Representation | Highlight | Code/Project |
| --- | --- | --- | --- | --- |
| Concept Embedding Models (CEMs) | NeurIPS 2022 | high-dimensional embeddings | represents each concept as a supervised high-dimensional embedding to preserve high performance and interpretability under incomplete concept annotations | Code |
| Probabilistic Concept Bottleneck Models (PCBMs) | ICML 2023 | probabilistic embeddings | leverages probabilistic concept embeddings to model uncertainty in concept predictions and provide more reliable explanations with uncertainty | Code |
| Energy-based Concept Bottleneck Models (ECBMs) | ICLR 2024 | high-dimensional embeddings + energy networks | uses a set of networks to define the joint energy of the (input, concept, class) triplet, thereby providing a unified way to perform prediction, concept intervention, and probabilistic explanation by minimizing energy | Code |
| Logic-enhanced CBMs | ICML W 2024 | augmented with propositional logic rules | uses propositional logic derived from the concepts to model the relationships between concepts | - |
| EQ-CBM | ACCV 2024 | quantized probabilistic embeddings | enhances CBMs through probabilistic concept encoding, using energy-based models with quantized concept activation vectors to capture uncertainties | - |
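
To make the contrast between a scalar bottleneck and an embedding-based one concrete, here is a simplified sketch in the spirit of concept embeddings: each concept keeps one vector for "present" and one for "absent", and the predicted probability blends them. All values and function names are hypothetical, and in an actual model the embeddings would be produced from the input by learned networks rather than fixed by hand.

```python
def mix_concept_embedding(p, pos_embedding, neg_embedding):
    """Blend the 'concept present' and 'concept absent' embeddings by probability p."""
    return [p * a + (1.0 - p) * b for a, b in zip(pos_embedding, neg_embedding)]

def build_bottleneck(probs, pos_embeddings, neg_embeddings):
    """Concatenate one mixed embedding per concept into the bottleneck vector."""
    bottleneck = []
    for p, pos, neg in zip(probs, pos_embeddings, neg_embeddings):
        bottleneck.extend(mix_concept_embedding(p, pos, neg))
    return bottleneck

# Hypothetical values: 2 concepts, each with 3-dimensional embeddings.
probs = [0.9, 0.2]
pos = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
neg = [[-1.0, 0.0, 0.0], [0.0, -1.0, 1.0]]

# Original CBM: the bottleneck holds one value per concept.
scalar_bottleneck = probs

# Embedding-based bottleneck: 2 concepts * 3 dimensions = 6 values.
embedding_bottleneck = build_bottleneck(probs, pos, neg)

# Intervening sets a probability to 0 or 1, which selects one embedding outright.
intervened = build_bottleneck([1.0, 0.2], pos, neg)
```

The probabilities stay available for explanation and intervention, while the wider vectors give the downstream classifier more capacity than a single number per concept.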

Improving Intervention Ability / Interactivity

| Method | Publication | Highlight | Code/Project |
| --- | --- | --- | --- |

Finding Concepts (concept discovery, language-guided CBMs, etc.)

| Method | Publication | Concept Source | Code/Project |
| --- | --- | --- | --- |
| LF-CBMs | ICLR 2023 | LLM | Code |
| Post-hoc CBMs | ICLR 2023 | LLM / TCAV | Code |
| LaBo | CVPR 2023 | LLM | Code |
| BotCL | CVPR 2023 | Concept Prototypes (images + heatmap) | Code |
| LM4CV | ICCV 2023 | LLM | Code |
| CDMs | ICCV 2023 Workshop | LLM + VLMs | Code |
| Res-CBM | CVPR 2024 | LLM + Visual Genome | [Code](https://github.com/HelloSCM/Res-CBM) |
| DN-CBMs | ECCV 2024 | Sparse Autoencoder + Words | Code |
| CF-CBMs | NeurIPS 2024 | LLM + VLMs | Code |
| VLG-CBM | NeurIPS 2024 | LLM + Object Detectors | Code |
| BC-LLM | NeurIPS 2024 Workshop | LLM + Bayesian search framework | Code |
| CCBM | Arxiv 2024 | Heatmaps | - |
| CCPM | IEEE TMM | LLM, learnable | - |
| XBMs | AAAI 2025 | MLLM (LLaVA) | Code |
| V2C-CBM | AAAI 2025 | VLM (CLIP) + Common words | Code |
| UBMs | TMLR 2025 | Concept discovery (image patch) | Code |

CBMs for Non-visual Data

Text

Table

Scientific Data

CBM Applications

Datasets

Concept Annotated Datasets

| Name | Task | No. of concepts | No. of classes |
| --- | --- | --- | --- |
| CUB | bird classification | 312 | 200 |
| AwA2 | animal classification | 85 | 50 |
| CelebA | identity classification | 6 | 1,000 |
| OAI | X-ray grading | 10 | 4 |
| WBCAtt | white blood cell classification | 31 | 5 |
| Fitzpatrick 17k (subset) | skin disease classification | 48 | 2 |
| Diverse Dermatology Images (DDI) | skin disease classification | 48 | 2 |
| Skincon (Fitz sub + DDI annotated) | skin disease classification | 48 | 2 |
| DermaCon-IN | skin disease classification | 47 | 8 |
| Substitutions on CUB (SUB) | synthetic bird classification | 312 | 200 |

Papers Sorted by Publication Year

2025

| Publication | Paper Title | Code/Project |
| --- | --- | --- |
| AAAI | Explanation Bottleneck Models | Code |
| AAAI | V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer | Code |
| ACL | Enhancing Interpretable Image Classification Through LLM Agents and Conditional Concept Bottleneck Models | - |
| ACM CHI W | Supporting Data-Frame Dynamics in AI-assisted Decision Making | - |
| ACM MM BNI | Learning New Concepts, Remembering the Old: A Novel Continual Learning for Multimodal Concept Bottleneck Models | Code |
| CVPR | Interpretable Generative Models through Post-hoc Concept Bottlenecks | Code |
| CVPR | Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability | Code |
| CVPR | Language Guided Concept Bottleneck Models for Interpretable Continual Learning | Code |
| CVPR | Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models | - |
| CVPR W | PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition | - |
| ECML-PKDD | Stable Vision Concept Transformers for Medical Diagnosis | - |
| EICS | CBM-RAG: Demonstrating Enhanced Interpretability in Radiology Report Generation with Multi-Agent RAG and Concept Bottleneck Models | - |
| ICCV | Intervening in Black Box: Concept Bottleneck Model for Enhancing Human Neural Network Mutual Understanding | Code |
| ICCV | Semi-supervised Concept Bottleneck Models | Code |
| ICCV | SUB: Benchmarking CBM Generalization via Synthetic Attribute Substitutions | Code |
| ICLR | Counterfactual Concept Bottleneck Models | Code |
| ICLR | Concept Bottleneck Large Language Models | Code |
| ICLR | CONDA: Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts | Code |
| ICLR | Concept Bottleneck Language Models For Protein Design | - |
| ICLR W | Causally Reliable Concept Bottleneck Models | Code |
| ICLR W | Adaptive Test-Time Intervention for Concept Bottleneck Models | Code |
| ICML | Editable Concept Bottleneck Models | - |
| ICML | DCBM: Data-Efficient Visual Concept Bottleneck Models | Code |
| ICML | Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization | Code |
| ICML | Concept-Based Unsupervised Domain Adaptation | Code |
| ICML | Avoiding Leakage Poisoning: Concept Interventions Under Distribution Shifts | Code |
| ICML W | Interpretable Reward Modeling with Active Concept Bottlenecks | Code |
| ICML W | Neural Concept Verifier: Scaling Prover-Verifier Games via Concept Encodings | - |
| IEEE TMI | Concept-Based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis | Code |
| IEEE TMM | Leveraging Concise Concepts with Probabilistic Modeling for Interpretable Visual Recognition | - |
| IEEE CCSSTA | Concept Learning for Cooperative Multi-Agent Reinforcement Learning | - |
| IJCAI | MVP-CBM: Multi-layer Visual Preference-enhanced Concept Bottleneck Model for Explainable Medical Image Classification | Code |
| Information Processing & Management | Distilling Knowledge from Large Language Models: A Concept Bottleneck Model for Hate and Counter Speech Recognition | Code |
| MICCAI | Learning Concept-Driven Logical Rules for Interpretable and Generalizable Medical Image Classification | Code |
| MICCAI | Training-free Test-time Improvement for Explainable Medical Image Classification | Code |
| Nature Communications | A concept-based interpretable model for the diagnosis of choroid neoplasias using multimodal data | Code |
| TMLR | Selective Concept Bottleneck Models Without Predefined Concepts | Code |
| xAI | V-CEM: Bridging Performance and Intervenability in Concept-based Models | Code |
| Arxiv | ConceptCLIP: Towards Trustworthy Medical AI Via Concept-Enhanced Contrastive Language-Image Pre-training | Code |
| Arxiv | Object Centric Concept Bottlenecks | - |
| Arxiv | Towards Reasonable Concept Bottleneck Models | - |
| Arxiv | Zero-shot Concept Bottleneck Models | Code |
| Arxiv | CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification | Code |
| Arxiv | Towards Achieving Concept Completeness for Textual Concept Bottleneck Models | - |
| Arxiv | Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts | - |
| Arxiv | If Concept Bottlenecks are the Question, are Foundation Models the Answer? | Code |
| Arxiv | DeCoDe: Defer-and-Complement Decision-Making via Decoupled Concept Bottleneck Models | - |
| Arxiv | CoCo-Bot: Energy-based Composable Concept Bottlenecks for Interpretable Generative Models | - |
| Arxiv | FHSTP@EXIST 2025 Benchmark: Sexism Detection with Transparent Speech Concept Bottleneck Models | - |
| Arxiv | A Concept-based approach to Voice Disorder Detection | - |
| Arxiv | Transferring Expert Cognitive Models to Social Robots via Agentic Concept Bottleneck Models | - |
| Arxiv | Graph Concept Bottleneck Models | - |
| Arxiv | Locality-aware Concept Bottleneck Model | - |

2024

| Publication | Paper Title | Code/Project |
| --- | --- | --- |
| AAAI | On the Concept Trustworthiness in Concept Bottleneck Models | Code |
| AAAI | Sparsity-guided holistic explanation for LLMs with interpretable inference-time intervention | Code |
| ACCV | EQ-CBM: A Probabilistic Concept Bottleneck with Energy-based Models and Quantized Vectors | - |
| CVPR | Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion | - |
| CVPR | LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models | Code |
| CVPR | Incremental Residual Concept Bottleneck Models | Code |
| ECCV | Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery | Code |
| ECCV | Explain Via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts | - |
| ICLR | Concept Bottleneck Generative Models | Code |
| ICLR | Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations | Code |
| ICLR | Faithful Vision-Language Interpretation Via Concept Bottleneck Models | Code |
| ICML | Post-hoc Part-prototype Networks | - |
| ICML W | XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization | - |
| ICML W | Enhancing concept-based learning with logic | - |
| IEEE TPAMI | The Decoupling Concept Bottleneck Model | Code |
| JBHI | Guest Editorial: Trustworthy Machine Learning for Health Informatics | - |
| MedIA | Interpretable and Intervenable Ultrasonography-Based Machine Learning Models for Pediatric Appendicitis | Code |
| MICCAI | Concept-Attention Whitening for Interpretable Skin Lesion Diagnosis | Code |
| MICCAI | Aligning human knowledge with visual concepts towards explainable medical image classification | Code |
| MICCAI | Evidential concept embedding models: Towards reliable concept explanations for skin disease diagnosis | Code |
| MICCAI | Learning a Clinically-Relevant Concept Bottleneck for Lesion Detection in Breast Ultrasound | Code |
| MICCAI | Mask-Free Neuron Concept Annotation for Interpreting Neural Networks in Medical Domain | Code |
| MICCAI | AdaCBM: an Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis | Code |
| MICCAI | Integrating Clinical Knowledge into Concept Bottleneck Models | Code |
| MLHC | Improving ARDS Diagnosis Through Context-Aware Concept Bottleneck Models | Code |
| NeurIPS | Stochastic Concept Bottleneck Models | Code |
| NeurIPS | Coarse-to-Fine Concept Bottleneck Models | Code |
| NeurIPS | VLG-CBM: Training Concept Bottleneck Models with Vision-Language Guidance | Code |
| NeurIPS | A Theoretical Design of Concept Sets: Improving the Predictability of Concept Bottleneck Models | - |
| NeurIPS | Towards Multi-dimensional Explanation Alignment for Medical Classification | - |
| NeurIPS | A Concept-Based Explainability Framework for Large Multimodal Models | Code |
| NeurIPS | Classifier Clustering and Feature Alignment for Federated Learning under Distributed Concept Drift | Code |
| NeurIPS | ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty | Code |
| NeurIPS | Do LLMs Dream of Elephants (when Told Not To)? Latent Concept Association and Associative Memory in Transformers | - |
| NeurIPS | FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making | Code |
| NeurIPS | Free Lunch in Pathology Foundation Model: Task-specific Model Adaptation with Concept-Guided Feature Enhancement | Code |
| NeurIPS | From Causal to Concept-Based Representation Learning | - |
| NeurIPS | Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents | Code |
| NeurIPS | Interpretable Concept-Based Memory Reasoning | Code |
| NeurIPS | Interpreting CLIP with Sparse Linear Concept Embeddings (Splice) | Code |
| NeurIPS | Learning Discrete Concepts in Latent Hierarchical Models | - |
| NeurIPS | LG-CAV: Train Any Concept Activation Vector with Language Guidance | - |
| NeurIPS | Neural Concept Binder | Code |
| NeurIPS | No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance | Code |
| NeurIPS | PaCE: Parsimonious Concept Engineering for Large Language Models | Code |
| NeurIPS | Relational Concept Bottleneck Models | Code |
| NeurIPS | Uncovering Safety Risks of Large Language Models Through Concept Activation Vector | Code |
| NeurIPS | Beyond Concept Bottleneck Models: How to Make Black Boxes Intervenable? | Code |
| NeurIPS W | Bayesian concept bottleneck models with LLM priors | Code |
| PAKDD | Interpreting Pretrained Language Models Via Concept Bottlenecks | Code |
| Sci. Rep | Pseudo-class Part Prototype Networks for Interpretable Breast Cancer Classification | Code |
| TMLR | Reproducibility Study of "LICO: Explainable Models with Language-Image Consistency" | Code |
| TMLR | [Re] On the Reproducibility of Post-Hoc Concept Bottleneck Models | Code |
| TMLR | CLIP-QDA: an Explainable Concept Bottleneck Model | |
| Arxiv | Explainable and interpretable multimodal large language models: A comprehensive survey | - |
| Arxiv | Self-eXplainable AI for Medical Image Analysis: A Survey and New Outlooks | - |
| Arxiv | Concept Complement Bottleneck Model for Interpretable Medical Image Diagnosis | - |
| Arxiv | Improving Concept Alignment in Vision-Language Concept Bottleneck Models | Code |
| Arxiv | CAT: Concept-level backdoor ATtacks for Concept Bottleneck Models | Code |
| Arxiv | Tree-Based Leakage Inspection and Control in Concept Bottleneck Models | Code |

2023

| Publication | Paper Title | Code/Project |
| --- | --- | --- |
| AAAI | Interactive Concept Bottleneck Models | - |
| CVPR | Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification | Code |
| CVPR | Learning bottleneck concepts in image classification | Code |
| CVPR | Towards Trustable Skin Cancer Diagnosis via Rewriting Model's Decision | - |
| EMNLP | STAIR: Learning Sparse Text and Image Representation in Grounded Tokens | - |
| EMNLP | Cross-Modal Conceptualization in Bottleneck Models | Code |
| ICCV | Learning Concise and Descriptive Attributes for Visual Recognition | Code |
| ICLR | Label-free Concept Bottleneck Models | Code |
| ICLR | Post-hoc Concept Bottleneck Models | Code |
| ICML | A Closer Look at the Intervention Procedure of Concept Bottleneck Models | Code |
| ICML | Probabilistic Concept Bottleneck Models | Code |
| ICML W | A ChatGPT Aided Explainable Framework for Zero-Shot Medical Image Diagnosis | - |
| MICCAI | Concept Bottleneck with Visual Concept Filtering for Explainable Medical Image Classification | - |
| NeurIPS | Do Concept Bottleneck Models Respect Localities | Code |
| NeurIPS | Learning to Receive Help: Intervention-Aware Concept Embedding Models | Code |
| NMI | From attribution maps to human-understandable explanations through Concept Relevance Propagation | Code |
| Arxiv | Robust and interpretable medical image classifiers via concept bottleneck models | - |

2022

| Publication | Paper Title | Code/Project |
| --- | --- | --- |
| ICCV | Explaining in Style: Training a GAN to Explain a Classifier in StyleSpace | - |
| ICLR | CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks | Code |
| IEEE Access | Concept Bottleneck Model With Additional Unsupervised Concepts | - |
| NeurIPS | Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off | Code |
| NeurIPS | Addressing Leakage in Concept Bottleneck Models | Code |

2021

| Publication | Paper Title | Code/Project |
| --- | --- | --- |
| ICLR W | Do Concept Bottleneck Models Learn as Intended? | - |
| ICML | Meaningfully Debugging Model Mistakes Using Conceptual Counterfactual Explanations | Code |
| NMI | A case-based interpretable deep learning model for classification of mass lesions in digital mammography | Code |

2020

| Publication | Paper Title | Code/Project |
| --- | --- | --- |
| ICML | Concept bottleneck models | Code |
| NMI | Concept whitening for interpretable image recognition | Code |

Acknowledgement

This project was originally inspired by https://github.com/kkzhang95/Awesome_Concept_Bottleneck_Models. We thank the authors for their contributions. Our main motivation is to provide an additional organization by research focus, supplement the list with more recent papers, and sort entries by venue name for easier navigation.
