[CVPR 2025 Highlight] Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency

Yikai-Wang/asuka-misato


Towards Enhanced Image Inpainting:
Mitigating Unwanted Object Insertion and Preserving Color Consistency

Yikai Wang*, Chenjie Cao*, Junqiu Yu*, Ke Fan, Xiangyang Xue, Yanwei Fu†.
Fudan University
CVPR 2025 (Highlight)

arXiv page


Overview

This repo contains the proposed ASUKA algorithm and the evaluation dataset MISATO in our paper "Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency".

ASUKA addresses two issues in current diffusion and rectified flow inpainting models: unwanted object insertion, where the model generates random elements that are not aligned with the unmasked region; and color inconsistency, where a color shift in the generated masked region causes smear-like traces. ASUKA is a post-training procedure for these models that significantly mitigates object hallucination and improves the color consistency of inpainted results.

While unwanted object insertion is a specific problem in general image inpainting, color inconsistency affects all text-to-image editing models. Our proposed decoder can consistently improve performance by addressing this issue.

We have released ASUKA for FLUX.1-Fill-dev, denoted as ASUKA (FLUX.1-Fill), along with the MISATO dataset at resolutions 512 and 1024. We are actively working to improve both our model and the evaluation dataset. If you encounter failure cases with ASUKA (FLUX.1-Fill) or have challenging image inpainting examples, we would love to hear from you. Please email them to yi-kai.wang@outlook.com. We truly appreciate your contributions!

Disclaimer

This repo, the ASUKA algorithm, and the MISATO dataset are intended for research purposes only, and we respect the licenses of all models and code used. Users are free to create images using this tool, but they are expected to comply with local laws and to use it responsibly. The developers do not assume any responsibility for potential misuse by users.

The authors do not own the image copyrights. Please follow the original dataset's license. We appreciate the contributions of Matterport3D, FlickrLandscape, MegaDepth, and COCO 2014.

To use Matterport3D, you must indicate that you agree to the terms of use by signing the Terms of Use agreement form, using your institutional email addresses, and sending it to: matterport3d@googlegroups.com.

Updates

  • Fixed a bug in the interpolation step that was causing batch inference to fail in our decoder in some cases.
  • Released the training mask used in ASUKA.
  • Released the ASUKA (FLUX.1-Fill) model and inference code.
  • Released the MISATO@1K dataset.
  • Released the MISATO@512 dataset.

ASUKA model

We've released a version of the ASUKA model compatible with the FLUX.1-Fill-dev inpainting model. Please refer to the sub-folder for more information.

Training mask

We've released the training mask on Hugging Face.

MISATO dataset

teaser

To validate inpainting performance across different domains and mask styles, we construct an evaluation dataset, dubbed MISATO, from Matterport3D, FlickrLandscape, MegaDepth, and COCO 2014 for indoor, outdoor landscape, building, and background inpainting, respectively.

Download

The MISATO dataset is available on Hugging Face.

Structure

After unzipping the file, you will find two folders: one for 512 resolution and one for 1024. Each folder has the following structure:

|-image
  |- 0000.png
  ...
  |- 1999.png
|-mask
  |- 0000.png
  ...
  |- 1999.png

The numbers 0000-0499 represent outdoor landscapes, 0500-0999 represent indoor scenes, 1000-1499 represent buildings, and 1500-1999 represent backgrounds. The MISATO@1K version includes only 1500 image-mask pairs, as the COCO dataset lacks enough high-resolution images.
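As a rough illustration of the layout above, the sketch below pairs each image with its mask and labels it by domain using the index ranges just described. It is not part of the released code: the function names `category_of` and `list_pairs`, the category label strings, and the example root path are all assumptions made for this sketch. It uses only the Python standard library and returns file paths, leaving image decoding to whichever library you prefer.

```python
from pathlib import Path

# Index ranges taken from the description above.
# The label strings are our own shorthand, not names used by the dataset.
CATEGORY_RANGES = [
    (0, 499, "outdoor_landscape"),
    (500, 999, "indoor"),
    (1000, 1499, "building"),
    (1500, 1999, "background"),
]


def category_of(index: int) -> str:
    """Map a MISATO file index (e.g. 712 for 0712.png) to its domain label."""
    for low, high, name in CATEGORY_RANGES:
        if low <= index <= high:
            return name
    raise ValueError(f"index {index} is outside the documented range 0000-1999")


def list_pairs(root):
    """Return (image_path, mask_path, category) for every image with a mask.

    `root` is one resolution folder containing `image/` and `mask/`.
    Images without a same-named mask are skipped rather than raising.
    """
    root = Path(root)
    pairs = []
    for image_path in sorted((root / "image").glob("*.png")):
        mask_path = root / "mask" / image_path.name
        if not mask_path.exists():
            continue
        pairs.append((image_path, mask_path, category_of(int(image_path.stem))))
    return pairs
```

For example, `list_pairs("path/to/512-resolution-folder")` (a placeholder path; the actual folder names inside the archive are not specified here) would return one tuple per image-mask pair. Because the function only lists the files that are actually present, it also works on the smaller MISATO@1K folder without modification.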

License

Code related to the ASUKA algorithm is released under the Apache 2.0 license.

BibTeX

If you find our repo helpful, please consider citing our paper :)

@inproceedings{wang2025towards,
  title={Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency},
  author={Wang, Yikai and Cao, Chenjie and Yu, Junqiu and Fan, Ke and Xue, Xiangyang and Fu, Yanwei},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  year={2025}
}
