Awesome Concept Bottleneck Models

Work in progress: we have compiled and summarized relevant papers in this field by year and are continuing to improve the categorization and organization of the collection to help newcomers quickly understand the area. Feel free to suggest improvements or add new papers via a pull request.

Introduction

The Concept Bottleneck Model (CBM) is an emerging self-explainable architecture that first maps inputs to a set of human-interpretable concepts and then makes predictions from those concepts with an interpretable classifier, typically a single linear layer. Beyond this inherent interpretability, CBMs provide an intervention interface through the concept bottleneck layer: users can directly modify concept activations to refine the model's predictions. This intervention ability is the most significant difference between CBMs and other explainable models such as Capsule Networks and ProtoPNet.

(images from IntCEMs, highlighting the interpretability and intervention ability of CBM architectures)
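The two-stage pipeline and the intervention interface described above can be sketched in a few lines. This is a minimal illustration in plain Python, not code from any listed paper: the function names (`predict_concepts`, `predict_label`, `intervene`) and the hand-picked toy weights are assumptions made for the example, and a real CBM would learn both stages from data.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_concepts(x, concept_weights):
    """Stage 1: input features -> concept probabilities (one sigmoid unit per concept)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in concept_weights]


def predict_label(concepts, class_weights):
    """Stage 2: concept activations -> class scores via a single linear layer."""
    scores = [sum(w * c for w, c in zip(row, concepts)) for row in class_weights]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


def intervene(concepts, corrections):
    """Overwrite selected concept activations with user-provided values."""
    return [corrections.get(i, c) for i, c in enumerate(concepts)]


# Hypothetical toy weights: 3 input features, 2 concepts, 2 classes.
CONCEPT_WEIGHTS = [[2.0, 0.0, 0.0],
                   [0.0, -2.0, 0.0]]
CLASS_WEIGHTS = [[1.0, 0.0],   # class 0 depends only on concept 0
                 [0.0, 1.0]]   # class 1 depends only on concept 1

x = [1.0, 1.0, 0.0]
concepts = predict_concepts(x, CONCEPT_WEIGHTS)
label_before, _ = predict_label(concepts, CLASS_WEIGHTS)

# A user decides concept 0 is absent and concept 1 is present, and corrects both.
fixed = intervene(concepts, {0: 0.0, 1: 1.0})
label_after, _ = predict_label(fixed, CLASS_WEIGHTS)
```

Because the label predictor sees only the concept layer, editing `concepts` is enough to change the prediction. Here the predicted class moves from 0 to 1 after the correction.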

Papers Sorted by Research Focus

Architecture Improvements

Improving Concept Representations

The original Concept Bottleneck Model maps each concept to a single (probabilistic) value to construct the concept bottleneck layer, followed by a linear layer that predicts image-level class labels based on these concept values. However, the semantics of individual concepts, the relationships and hierarchies among different concepts, and the dependencies between concepts and class labels are inherently complex. Therefore, to address the need for richer, more expressive concept representations and to model the intricate concept–concept and concept–class relationships, many studies have proposed improvements to the representation methods used in the concept bottleneck layer.

Method	Publication	Concept Representation	Highlight	Code/Project
Concept Embedding Models (CEMs)	NeurIPS 2022	high-dimensional embeddings	representing each concept as a supervised high-dimensional embedding to preserve high performance and interpretability under incomplete concept annotations	Code
Probabilistic Concept Bottleneck Models (PCBMs)	ICML 2023	probabilistic embeddings	leveraging probabilistic concept embeddings to model uncertainty in concept predictions and provide more reliable explanations with uncertainty	Code
Energy-based Concept Bottleneck Models (ECBMs)	ICLR 2024	high-dimensional embeddings + energy networks	using a set of networks to define the joint energy of the (input, concept, class) triplet, thereby providing a unified way to perform prediction, concept intervention, and probabilistic explanation via energy minimization	Code
Logic-enhanced CBMs	ICML W 2024	augmented with propositional logic rules	using the propositional logic derived from the concepts to model the relationships between concepts	-
EQ-CBM	ACCV 2024	quantized probabilistic embeddings	enhancing CBMs through probabilistic concept encoding, using energy-based models with quantized concept activation vectors to capture uncertainties	-
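To make the difference between a scalar bottleneck and an embedding bottleneck concrete, here is a simplified sketch in the spirit of CEMs. It assumes each concept has an "active" and an "inactive" embedding that are blended by the predicted concept probability. The helper names and the fixed toy embeddings are hypothetical; in an actual model the embeddings are produced per input by learned networks.

```python
def mix_embedding(p, emb_pos, emb_neg):
    """Blend the 'concept active' and 'concept inactive' embeddings by probability p."""
    return [p * a + (1.0 - p) * b for a, b in zip(emb_pos, emb_neg)]


def build_bottleneck(probs, pos_embs, neg_embs):
    """Concatenate the per-concept mixed embeddings into one bottleneck vector."""
    bottleneck = []
    for p, e_pos, e_neg in zip(probs, pos_embs, neg_embs):
        bottleneck.extend(mix_embedding(p, e_pos, e_neg))
    return bottleneck


# Hypothetical toy values: 2 concepts, embedding size 2.
probs = [0.75, 0.25]
pos_embs = [[1.0, 0.0], [0.0, 4.0]]
neg_embs = [[0.0, 1.0], [4.0, 0.0]]

# Original CBM: the bottleneck is just one value per concept.
scalar_bottleneck = probs

# Embedding-style bottleneck: one vector per concept, concatenated.
embedding_bottleneck = build_bottleneck(probs, pos_embs, neg_embs)

# Intervening on concept 0 sets its probability to 1, which selects its "active" embedding.
intervened = build_bottleneck([1.0, probs[1]], pos_embs, neg_embs)
```

The embedding variant passes more information per concept to the label predictor, and it still supports intervention through the concept probabilities.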

Improving Intervention Ability / Interactivity

Method	Publication	Highlight	Code/Project

Finding Concepts (concept discovery, language-guided CBMs, etc.)

Method	Publication	Concept Source	Code/Project
LF-CBMs	ICLR 2023	LLM	Code
Post-hoc CBMs	ICLR 2023	LLM / TCAV	Code
LaBo	CVPR 2023	LLM	Code
BotCL	CVPR 2023	Concept Prototypes (images + heatmap)	Code
LM4CV	ICCV 2023	LLM	Code
CDMs	ICCV 2023 Workshop	LLM + VLMs	Code
Res-CBM	CVPR 2024	LLM + Visual Genome	[Code](https://github.com/HelloSCM/Res-CBM)
DN-CBMs	ECCV 2024	Sparse Auto Encoder + Words	Code
CF-CBMs	NeurIPS 2024	LLM + VLMs	Code
VLG-CBM	NeurIPS 2024	LLM + Object Detectors	Code
BC-LLM	NeurIPS 2024 Workshop	LLM + Bayesian search framework	Code
CCBM	Arxiv 2024	Heatmaps	-
CCPM	IEEE TMM	LLM, learnable	-
XBMs	AAAI 2025	MLLM (LLaVA)	Code
V2C-CBM	AAAI 2025	VLM (CLIP) + Common words	Code
UBMs	TMLR 2025	Concept discovery (image patch)	Code

CBMs for Non-visual Data

Text

Table

Scientific Data

CBM Applications

Datasets

Concept Annotated Datasets

Name	Task	No. of concepts	No. of classes
CUB	bird classification	312	200
AwA2	animal classification	85	50
CelebA	identity classification	6	1,000
OAI	X-ray grading	10	4
WBCAtt	white blood cell classification	31	5
Fitzpatrick 17k (subset)	skin disease classification	48	2
Diverse Dermatology Images (DDI)	skin disease classification	48	2
Skincon (Fitz sub + DDI annotated)	skin disease classification	48	2
DermaCon-IN	skin disease classification	47	8
Substitutions on CUB (SUB)	synthetic bird classification	312	200

Papers Sorted by Publication Year

2025

Publication	Paper Title	Code/Project
AAAI	Explanation Bottleneck Models	Code
AAAI	V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer	Code
ACL	Enhancing Interpretable Image Classification Through LLM Agents and Conditional Concept Bottleneck Models	-
ACM CHI W	Supporting Data-Frame Dynamics in AI-assisted Decision Making	-
ACM MM BNI	Learning New Concepts, Remembering the Old: A Novel Continual Learning for Multimodal Concept Bottleneck Models	Code
CVPR	Interpretable Generative Models through Post-hoc Concept Bottlenecks	Code
CVPR	Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability	Code
CVPR	Language Guided Concept Bottleneck Models for Interpretable Continual Learning	Code
CVPR	Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models	-
CVPR W	PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition	-
ECML-PKDD	Stable Vision Concept Transformers for Medical Diagnosis	-
EICS	CBM-RAG: Demonstrating Enhanced Interpretability in Radiology Report Generation with Multi-Agent RAG and Concept Bottleneck Models	-
ICCV	Intervening in Black Box: Concept Bottleneck Model for Enhancing Human Neural Network Mutual Understanding	Code
ICCV	Semi-supervised Concept Bottleneck Models	Code
ICCV	SUB: Benchmarking CBM Generalization via Synthetic Attribute Substitutions	Code
ICLR	Counterfactual Concept Bottleneck Models	Code
ICLR	Concept Bottleneck Large Language Models	Code
ICLR	CONDA: Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts	Code
ICLR	Concept Bottleneck Language Models For Protein Design	-
ICLR W	Causally Reliable Concept Bottleneck Models	Code
ICLR W	Adaptive Test-Time Intervention for Concept Bottleneck Models	Code
ICML	Editable Concept Bottleneck Models	-
ICML	DCBM: Data-Efficient Visual Concept Bottleneck Models	Code
ICML	Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization	Code
ICML	Concept-Based Unsupervised Domain Adaptation	Code
ICML	Avoiding Leakage Poisoning: Concept Interventions Under Distribution Shifts	Code
ICML W	Interpretable Reward Modeling with Active Concept Bottlenecks	Code
ICML W	Neural Concept Verifier: Scaling Prover-Verifier Games via Concept Encodings	-
IEEE TMI	Concept-Based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis	Code
IEEE TMM	Leveraging Concise Concepts with Probabilistic Modeling for Interpretable Visual Recognition	-
IEEE CCSSTA	Concept Learning for Cooperative Multi-Agent Reinforcement Learning	-
IJCAI	MVP-CBM: Multi-layer Visual Preference-enhanced Concept Bottleneck Model for Explainable Medical Image Classification	Code
Information Processing & Management	Distilling Knowledge from Large Language Models: A Concept Bottleneck Model for Hate and Counter Speech Recognition	Code
MICCAI	Learning Concept-Driven Logical Rules for Interpretable and Generalizable Medical Image Classification	Code
MICCAI	Training-free Test-time Improvement for Explainable Medical Image Classification	Code
Nature Communications	A concept-based interpretable model for the diagnosis of choroid neoplasias using multimodal data	Code
TMLR	Selective Concept Bottleneck Models Without Predefined Concepts	Code
xAI	V-CEM: Bridging Performance and Intervenability in Concept-based Models	Code
Arxiv	ConceptCLIP: Towards Trustworthy Medical AI Via Concept-Enhanced Contrastive Langauge-Image Pre-training	Code
Arxiv	Object Centric Concept Bottlenecks	-
Arxiv	Towards Reasonable Concept Bottleneck Models	-
Arxiv	Zero-shot Concept Bottleneck Models	Code
Arxiv	CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification	Code
Arxiv	Towards Achieving Concept Completeness for Textual Concept Bottleneck Models	-
Arxiv	Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts	-
Arxiv	If Concept Bottlenecks are the Question, are Foundation Models the Answer?	Code
Arxiv	DeCoDe: Defer-and-Complement Decision-Making via Decoupled Concept Bottleneck Models	-
Arxiv	CoCo-Bot: Energy-based Composable Concept Bottlenecks for Interpretable Generative Models	-
Arxiv	FHSTP@ EXIST 2025 Benchmark: Sexism Detection with Transparent Speech Concept Bottleneck Models	-
Arxiv	A Concept-based approach to Voice Disorder Detection	-
Arxiv	Transferring Expert Cognitive Models to Social Robots via Agentic Concept Bottleneck Models	-
Arxiv	Graph Concept Bottleneck Models	-
Arxiv	Locality-aware Concept Bottleneck Model	-

2024

Publication	Paper Title	Code/Project
AAAI	On the Concept Trustworthiness in Concept Bottleneck Models	Code
AAAI	Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention	Code
ACCV	EQ-CBM: A Probabilistic Concept Bottleneck with Energy-based Models and Quantized Vectors	-
CVPR	Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion	-
CVPR	LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models	Code
CVPR	Incremental Residual Concept Bottleneck Models	Code
ECCV	Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery	Code
ECCV	Explain Via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts	-
ICLR	Concept Bottleneck Generative Models	Code
ICLR	Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations	Code
ICLR	Faithful Vision-Language Interpretation Via Concept Bottleneck Models	Code
ICML	Post-hoc Part-prototype Networks	-
ICML W	XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization	-
ICML W	Enhancing concept-based learning with logic	-
IEEE TPAMI	The Decoupling Concept Bottleneck Model	Code
JBHI	Guest Editorial: Trustworthy Machine Learning for Health Informatics	-
MedIA	Interpretable and Intervenable Ultrasonography-Based Machine Learning Models for Pediatric Appendicitis	Code
MICCAI	Concept-Attention Whitening for Interpretable Skin Lesion Diagnosis	Code
MICCAI	Aligning human knowledge with visual concepts towards explainable medical image classification	Code
MICCAI	Evidential concept embedding models: Towards reliable concept explanations for skin disease diagnosis	Code
MICCAI	Learning a Clinically-Relevant Concept Bottleneck for Lesion Detection in Breast Ultrasound	Code
MICCAI	Mask-Free Neuron Concept Annotation for Interpreting Neural Networks in Medical Domain	Code
MICCAI	AdaCBM: an Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis	Code
MICCAI	Integrating Clinical Knowledge into Concept Bottleneck Models	Code
MLHC	Improving ARDS Diagnosis Through Context-Aware Concept Bottleneck Models	Code
NeurIPS	Stochastic Concept Bottleneck Models	Code
NeurIPS	Coarse-to-Fine Concept Bottleneck Models	Code
NeurIPS	VLG-CBM: Training Concept Bottleneck Models with Vision-Language Guidance	Code
NeurIPS	A Theoretical Design of Concept Sets: Improving the Predictability of Concept Bottleneck Models	-
NeurIPS	Towards Multi-dimensional Explanation Alignment for Medical Classification	-
NeurIPS	A Concept-Based Explainability Framework for Large Multimodal Models	Code
NeurIPS	Classifier Clustering and Feature Alignment for Federated Learning under Distributed Concept Drift	Code
NeurIPS	ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty	Code
NeurIPS	Do LLMs Dream of Elephants (when Told Not To)? Latent Concept Association and Associative Memory in Transformers	-
NeurIPS	FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making	Code
NeurIPS	Free Lunch in Pathology Foundation Model: Task-specific Model Adaptation with Concept-Guided Feature Enhancement	Code
NeurIPS	From Causal to Concept-Based Representation Learning	-
NeurIPS	Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents	Code
NeurIPS	Interpretable Concept-Based Memory Reasoning	Code
NeurIPS	Interpreting CLIP with Sparse Linear Concept Embeddings (Splice)	Code
NeurIPS	Learning Discrete Concepts in Latent Hierarchical Models	-
NeurIPS	LG-CAV: Train Any Concept Activation Vector with Language Guidance	-
NeurIPS	Neural Concept Binder	Code
NeurIPS	No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance	Code
NeurIPS	PaCE: Parsimonious Concept Engineering for Large Language Models	Code
NeurIPS	Relational Concept Bottleneck Models	Code
NeurIPS	Uncovering Safety Risks of Large Language Models Through Concept Activation Vector	Code
NeurIPS	Beyond Concept Bottleneck Models: How to Make Black Boxes Intervenable?	Code
NeurIPS W	Bayesian Concept Bottleneck Models with LLM Priors	Code
PAKDD	Interpreting Pretrained Language Models Via Concept Bottlenecks	Code
Sci. Rep	Pseudo-class Part Prototype Networks for Interpretable Breast Cancer Classification	Code
TMLR	Reproducibility Study of "LICO: Explainable Models with Language-Image Consistency"	Code
TMLR	[Re] On the Reproducibility of Post-Hoc Concept Bottleneck Models	Code
TMLR	CLIP-QDA: an Explainable Concept Bottleneck Model
Arxiv	Explainable and interpretable multimodal large language models: A comprehensive survey	-
Arxiv	Self-eXplainable AI for Medical Image Analysis: A Survey and New Outlooks	-
Arxiv	Concept Complement Bottleneck Model for Interpretable Medical Image Diagnosis	-
Arxiv	Improving Concept Alignment in Vision-Language Concept Bottleneck Models	Code
Arxiv	CAT: Concept-level backdoor ATtacks for Concept Bottleneck Models	Code
Arxiv	Tree-Based Leakage Inspection and Control in Concept Bottleneck Models	Code

2023

Publication	Paper Title	Code/Project
AAAI	Interactive Concept Bottleneck Models	-
CVPR	Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification	Code
CVPR	Learning bottleneck concepts in image classification	Code
CVPR	Towards Trustable Skin Cancer Diagnosis via Rewriting Model's Decision	-
EMNLP	STAIR: Learning Sparse Text and Image Representation in Grounded Tokens	-
EMNLP	Cross-Modal Conceptualization in Bottleneck Models	Code
ICCV	Learning Concise and Descriptive Attributes for Visual Recognition	Code
ICLR	Label-free Concept Bottleneck Models	Code
ICLR	Post-hoc Concept Bottleneck Models	Code
ICML	A Closer Look at the Intervention Procedure of Concept Bottleneck Models	Code
ICML	Probabilistic Concept Bottleneck Models	Code
ICML W	A ChatGPT Aided Explainable Framework for Zero-Shot Medical Image Diagnosis	-
MICCAI	Concept Bottleneck with Visual Concept Filtering for Explainable Medical Image Classification	-
NeurIPS	Do Concept Bottleneck Models Respect Localities	Code
NeurIPS	Learning to Receive Help: Intervention-Aware Concept Embedding Models	Code
NMI	From attribution maps to human-understandable explanations through Concept Relevance Propagation	Code
Arxiv	Robust and interpretable medical image classifiers via concept bottleneck models	-

2022

Publication	Paper Title	Code/Project
ICCV	Explaining in Style: Training a GAN to Explain a Classifier in StyleSpace	-
ICLR	CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks	Code
IEEE Access	Concept Bottleneck Model With Additional Unsupervised Concepts	-
NeurIPS	Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off	Code
NeurIPS	Addressing Leakage in Concept Bottleneck Models	Code

2021

Publication	Paper Title	Code/Project
ICLR W	Do Concept Bottleneck Models Learn as Intended?	-
ICML	Meaningfully Debugging Model Mistakes Using Conceptual Counterfactual Explanations	Code
NMI	A case-based interpretable deep learning model for classification of mass lesions in digital mammography	Code

2020

Publication	Paper Title	Code/Project
ICML	Concept bottleneck models	Code
NMI	Concept whitening for interpretable image recognition	Code

Acknowledgement

This project was originally inspired by https://github.com/kkzhang95/Awesome_Concept_Bottleneck_Models. We thank the authors for their contributions. Our main motivation is to provide an additional structure organized by research focus, supplement the list with more recent papers, and sort the papers by conference name for easier navigation.
