Authors: Quentin Bouniot, Pavlo Mozharovskyi, Florence d'Alché-Buc
This is the official code for Similarity Kernel Mixup on classification tasks (see the `classification` folder) and regression tasks (see the `regression` folder). Each folder contains a separate README with instructions for setting up and running the experiments. Code to reproduce the experiments on toy datasets is included in the `toy_datasets` folder.
Among the many data augmentation techniques proposed so far, linear interpolation of training samples, also called Mixup, has been found to be effective for a wide range of applications. Along with improved predictive performance, Mixup is also a good technique for improving calibration. However, mixing data carelessly can lead to manifold mismatch, i.e., synthetic data lying outside the original class manifolds, which can deteriorate calibration. In this work, we show that the likelihood of assigning a wrong label with Mixup increases with the distance between the data points to mix. To address this, we propose to dynamically change the underlying distributions of the interpolation coefficients depending on the similarity between the samples to mix, and define a flexible framework to do so without losing diversity. We provide extensive experiments on classification and regression tasks, showing that our proposed method improves the predictive performance and calibration of models, while being much more efficient.
Introducing similarity into the interpolation is more efficient and provides more diversity than explicitly selecting the points to mix.
Batch-normalized and centered Gaussian kernel:
- Amplitude $\tau_{max}$ governs the strength of the interpolation
- Standard deviation $\tau_{std}$ governs the extent of mixing
- Stronger interpolation between similar points, reduced interpolation otherwise
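As a rough illustration of how these parameters interact, below is a minimal PyTorch-style sketch (not the repository's exact implementation). It assumes a kernel of the form $\tau = \tau_{max} \exp\left(-(\bar{d} - 1) / (2\tau_{std}^2)\right)$, where $\bar{d}$ is the pairwise squared distance normalized by its batch mean, and draws each interpolation coefficient from $\mathrm{Beta}(\tau, \tau)$. The function name, the use of input-space distances, and the exact kernel form are assumptions for this sketch; refer to the `classification` and `regression` folders for the actual code.

```python
import torch


def similarity_kernel_mixup(x, y, tau_max=1.0, tau_std=0.5):
    """Sketch of similarity-driven mixup: the Beta concentration of each
    pair's interpolation coefficient is set by a Gaussian-shaped kernel
    over batch-normalized pairwise distances (assumed form)."""
    # Pair each sample with a random partner from the same batch.
    index = torch.randperm(x.size(0))
    x2, y2 = x[index], y[index]

    # Squared Euclidean distance for each pair (input space here; distances
    # may instead be computed between learned feature embeddings).
    dist = (x - x2).flatten(1).pow(2).sum(dim=1)

    # Normalize distances by their batch mean, so the average pair sits at 1.
    d_bar = dist / dist.mean().clamp_min(1e-12)

    # Gaussian-shaped similarity kernel (assumed form), centered at the mean
    # distance: close pairs get a large tau, distant pairs a small one;
    # tau_max sets the amplitude and tau_std the decay rate.
    tau = tau_max * torch.exp(-(d_bar - 1.0) / (2 * tau_std**2))

    # Per-pair interpolation coefficient from Beta(tau, tau): large tau
    # concentrates lambda near 0.5 (strong mixing), small tau pushes it
    # toward 0 or 1 (little mixing).
    lam = torch.distributions.Beta(tau, tau).sample()

    # Broadcast lambda over the non-batch dimensions and mix the inputs.
    lam_x = lam.view(-1, *([1] * (x.dim() - 1)))
    x_mix = lam_x * x + (1.0 - lam_x) * x2
    return x_mix, y, y2, lam
```

With per-pair coefficients, the usual mixup objective becomes a weighted sum of per-sample losses, e.g. `(lam * criterion(out, y) + (1 - lam) * criterion(out, y2)).mean()` with `criterion` built with `reduction='none'`.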
If you find our work useful, please star this repo and cite:
```bibtex
@inproceedings{bouniot2025tailoring,
  title={Tailoring Mixup to Data for Calibration},
  author={Bouniot, Quentin and Mozharovskyi, Pavlo and d'Alch{\'e}-Buc, Florence},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}
```