
MAPIE and Conformal Predictions with an LLM

Open In Colab

Overview

This notebook demonstrates how to use MAPIE for conformal prediction with a Large Language Model (LLM). The goal is to build and evaluate prediction sets for a multiple-choice question-answering task using conformal prediction techniques. The notebook is based on the work presented in Benchmarking LLMs via Uncertainty Quantification, and parts of the code come from this GitHub repo.

Key Components

  • Dataset: The CosmosQA dataset, a benchmark for commonsense reasoning.
  • LLM: The Mistral-7B-Instruct-v0.3 model is used to answer the multiple-choice questions (a minimal loading sketch follows this list).
  • MAPIE for Conformal Prediction: The SplitConformalClassifier from MAPIE is used to generate prediction sets at a given confidence level.
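
As a rough orientation, the components above might be loaded along the following lines. This is a minimal sketch only: the exact dataset split, preprocessing, and model-loading options used in the notebook may differ.

```python
# Sketch only: load CosmosQA and the Mistral model (options may differ
# from the notebook). Access to the Mistral weights requires logging in
# to the Hugging Face Hub first, e.g. with huggingface_hub.login().
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# CosmosQA: commonsense multiple-choice QA (context, question, 4 answer choices).
cosmos = load_dataset("cosmos_qa", split="validation")

model_name = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
```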

Steps Covered

  1. Setup & Installation

    • Clone the repository and install required dependencies.
    • Authenticate with Hugging Face Hub to access the LLM.
  2. Data Preprocessing

    • Load and transform CosmosQA data into a format suitable for the model.
  3. Model Loading

    • Load the Mistral-7B model and its tokenizer.
    • Define an LLMClassifier wrapper to make predictions in a structured format.
  4. Conformal Prediction with MAPIE

    • Use SplitConformalClassifier to conformalize the model on a held-out subset of the data (a minimal sketch follows this list).
    • Generate prediction sets at a 95% confidence level.
  5. Evaluation & Visualization

    • Compute accuracy scores and coverage metrics.
    • Visualize the size distribution of prediction sets.
    • Plot accuracy per prediction set size.
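
The conformalization step could look roughly like the sketch below. Here `llm_clf` stands for the notebook's LLMClassifier wrapper (assumed to expose scikit-learn-style predictions over the four answer choices), and `X_conf`, `y_conf`, `X_test` are illustrative names for the conformalization and test splits; the exact argument and method names should be checked against the installed MAPIE version.

```python
# Minimal sketch of the MAPIE step (split names and the wrapper are
# illustrative; verify argument/method names against your MAPIE version).
from mapie.classification import SplitConformalClassifier

mapie_clf = SplitConformalClassifier(
    estimator=llm_clf,      # prefit LLMClassifier wrapper around the LLM
    confidence_level=0.95,  # target coverage of the prediction sets
    prefit=True,            # the LLM itself is not retrained by MAPIE
)

# Calibrate the conformal threshold on a held-out subset of the data...
mapie_clf.conformalize(X_conf, y_conf)

# ...then produce point predictions and prediction sets on the test set.
y_pred, y_pred_set = mapie_clf.predict_set(X_test)
```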

Results

  • The LLM achieves an accuracy of approximately 86% on the test set.
  • Prediction sets provide calibrated uncertainty estimates, enhancing reliability in decision-making.
  • The more uncertain the model is (i.e., the larger its prediction sets), the lower its accuracy (see the evaluation sketch below).
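
These coverage and set-size statistics can be computed directly from MAPIE's output. A minimal sketch, assuming `y_pred` and `y_pred_set` come from the conformalized classifier above and `y_test` holds integer labels for the four answer choices:

```python
import numpy as np

# Illustrative names: y_pred (point predictions), y_pred_set (boolean set
# membership per answer choice), y_test (integer labels 0..3).
y_pred_set = np.asarray(y_pred_set, dtype=bool)
if y_pred_set.ndim == 3:  # MAPIE may add a trailing confidence-level axis
    y_pred_set = y_pred_set[:, :, 0]

# Empirical coverage: fraction of examples whose true answer is in the set.
coverage = y_pred_set[np.arange(len(y_test)), y_test].mean()
print(f"coverage: {coverage:.2%}")

# Distribution of prediction-set sizes, and point-prediction accuracy per size.
set_sizes = y_pred_set.sum(axis=1)
correct = np.asarray(y_pred) == np.asarray(y_test)
for size in np.unique(set_sizes):
    mask = set_sizes == size
    print(f"set size {size}: n={mask.sum()}, accuracy={correct[mask].mean():.2%}")
```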

Conclusion

This notebook illustrates how conformal prediction techniques can be applied to LLMs to build more trustworthy AI systems. The approach can be extended to other question-answering datasets and models to assess confidence in model predictions.
