cleverhans-lab/confidential-guardian

We show that a model owner can artificially introduce uncertainty into their model and provide a corresponding detection mechanism.

📄 Paper • 📊 Slides (coming soon) • 🖼️ Poster (coming soon) • 🎬 Video (coming soon)

🧠 Abstract

Cautious predictions—where a machine learning model abstains when uncertain—are crucial for limiting harmful errors in safety-critical applications. In this work, we identify a novel threat: a dishonest institution can exploit these mechanisms to discriminate or unjustly deny services under the guise of uncertainty. We demonstrate the practicality of this threat by introducing an uncertainty-inducing attack called Mirage, which deliberately reduces confidence in targeted input regions, thereby covertly disadvantaging specific individuals. At the same time, Mirage maintains high predictive performance across all data points. To counter this threat, we propose Confidential Guardian, a framework that analyzes calibration metrics on a reference dataset to detect artificially suppressed confidence. Additionally, it employs zero-knowledge proofs of verified inference to ensure that reported confidence scores genuinely originate from the deployed model. This prevents the provider from fabricating arbitrary model confidence values while protecting the model’s proprietary details. Our results confirm that Confidential Guardian effectively prevents the misuse of cautious predictions, providing verifiable assurances that abstention reflects genuine model uncertainty rather than malicious intent.
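To make the detection idea concrete, here is a minimal, self-contained sketch of the calibration check described above: compute a reliability diagram and the expected calibration error (ECE) of the model's reported confidences on a reference dataset, and flag confidence bins whose accuracy sits far above the reported confidence, the signature of artificially suppressed confidence. This is an illustration of the concept only, not the code or API in this repository; the function names, bin count, flagging threshold, and toy data are assumptions.

import numpy as np

def reliability_bins(confidences, correct, n_bins=10):
    """Bin predictions by reported confidence and compute per-bin accuracy and
    average confidence (the ingredients of a reliability diagram)."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bins = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.sum() == 0:
            continue
        bins.append({
            "range": (float(lo), float(hi)),
            "count": int(mask.sum()),
            "avg_confidence": float(confidences[mask].mean()),
            "accuracy": float(correct[mask].mean()),
        })
    return bins

def expected_calibration_error(bins, n_total):
    """Standard ECE: bin-size-weighted average of |accuracy - confidence|."""
    return sum(b["count"] / n_total * abs(b["accuracy"] - b["avg_confidence"])
               for b in bins)

def flag_suppressed_confidence(bins, gap=0.2):
    """Flag bins where the model is far more accurate than it claims to be,
    i.e. candidate regions of artificially suppressed confidence. The 'gap'
    threshold is an assumed value for illustration only."""
    return [b for b in bins if b["accuracy"] - b["avg_confidence"] > gap]

# Toy reference dataset: predictions stay mostly correct, but reported
# confidence on part of the data is pushed down, so low-confidence bins end
# up with accuracy well above their average confidence.
rng = np.random.default_rng(0)
conf = rng.uniform(0.3, 1.0, 2000)
correct = (rng.uniform(size=2000) < np.maximum(conf, 0.85)).astype(float)

bins = reliability_bins(conf, correct)
print("ECE:", round(expected_calibration_error(bins, len(conf)), 3))
for b in flag_suppressed_confidence(bins):
    print("suspicious bin", b["range"],
          "accuracy", round(b["accuracy"], 2),
          "avg confidence", round(b["avg_confidence"], 2))

In an honest, well-calibrated model the per-bin accuracy tracks the average confidence, so no bin is flagged; large one-sided gaps in the low-confidence bins are what the calibration analysis is meant to surface.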

⚙️ Installation with uv

We use uv as our package manager (and we think you should, too)! It is a fast Python dependency-management tool and a drop-in replacement for pip.

Step 1: Install uv (if not already installed)

pip install uv

Step 2: Create a virtual environment and install dependencies

uv venv
uv pip install -e .

Step 3: Activate environment

source .venv/bin/activate

Step 4: Launch jupyter

jupyter notebook

🗂️ Codebase overview

  • mirage.py: Contains code for the Mirage attack discussed in the paper (a conceptual sketch of such an uncertainty-inducing objective follows this list).
  • conf_guard.py: Contains code for computing calibration metrics and reliability diagrams.
  • gaussian_experiments.ipynb: Notebook for the synthetic Gaussian experiments.
  • image_experiments.ipynb: Notebook for the image experiments on CIFAR-100 and UTKFace.
  • tabular_experiments.ipynb: Notebook for the tabular experiments on Adult and Credit.
  • regression_experiments.ipynb: Notebook for the regression experiments.
  • zkp: Code for running the zero-knowledge proofs. See the README.md in that subfolder for details.
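As referenced above, here is a hypothetical sketch of an uncertainty-inducing training objective in the spirit of Mirage: standard cross-entropy keeps predictions accurate everywhere, while an extra term pushes the predictive distribution toward uniform (i.e. low reported confidence) on a targeted subset of inputs. The targeted mask and the weight lam are illustrative assumptions; this is not the implementation in mirage.py.

import torch
import torch.nn.functional as F

def uncertainty_inducing_loss(logits, labels, targeted_mask, lam=1.0):
    """Cross-entropy on every example plus a penalty that raises predictive
    entropy only on the targeted region, so the argmax prediction can stay
    correct while the reported confidence there is suppressed.

    logits:        (B, C) model outputs
    labels:        (B,)   ground-truth classes
    targeted_mask: (B,)   bool tensor, True for inputs to make 'uncertain'
    lam:           weight of the uncertainty term (assumed hyperparameter)
    """
    ce = F.cross_entropy(logits, labels)

    if targeted_mask.any():
        log_probs = F.log_softmax(logits[targeted_mask], dim=-1)
        uniform = torch.full_like(log_probs, 1.0 / logits.shape[-1])
        # KL(uniform || p) is minimized when the predicted distribution is
        # uniform, i.e. when the model reports maximal uncertainty.
        push_to_uniform = F.kl_div(log_probs, uniform, reduction="batchmean")
    else:
        push_to_uniform = logits.new_zeros(())

    return ce + lam * push_to_uniform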

🎓 BibTeX citation

@inproceedings{rabanser2025confidential,
  title = {Confidential Guardian: Cryptographically Prohibiting the Abuse of Model Abstention},
  author = {Stephan Rabanser and Ali Shahin Shamsabadi and Olive Franzese and Xiao Wang and Adrian Weller and Nicolas Papernot},
  year = {2025},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
}
