
WinKawaks/CaricatureBooth


CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth

Paper

Paper link

Abstract

Caricature generation hinges on two things: ID preservation and artistic distortion. An interactive caricature booth additionally requires creative, conditional generation. However, no existing ID dataset (face or caricature) fulfils all of these requirements.

  • Modern caricature datasets (e.g. WebCaricature) provide source IDs, caricatures and landmarks. However, they depend heavily on manual labelling and cannot be scaled up for large-scale training.
  • Free online caricature data can be collected into a new dataset of source ID and caricature pairs. However, landmark detection on caricatures is very difficult, so such data cannot be used for conditional generation.
  • Real face datasets (e.g. FFHQ) make landmark prediction easy. However, real faces lack exaggerated artistic distortion.

In this work, we design an effective data synthesis method to overcome the above problems.

[Figure: method overview]

Citation

@inproceedings{qu2025caricaturebooth,
  title={CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth},
  author={Qu, Zhiyu and Miao, Yunqi and Zhang, Zhensong and Song, Jifei and Deng, Jiankang and Song, Yi-Zhe},
  booktitle={CVPR},
  year={2025}
}

Code

This work is based on InstantID. Please install the dependencies required by InstantID and put the "antelopev2" folder under InstantID-train/models/.

Preprocess the data:

Download the FFHQ dataset and put the 1024×1024 images under Data/source_images/. Then run the commands below for full data preparation. Note that the demo TPS (thin-plate spline) code generates only one deformed face-landmark pair per ID; generating more deformed pairs will improve the generation results and robustness.

# Generate the deformed face images and landmarks.
$ python Data_preparation/tps_deform.py

# Prepare face embeddings.
$ python InstantID-train/prepare_emb.py

# Precompute text embeddings to save GPU memory.
$ python Data_preparation/create_text_embs.py

# Generate the JSON file and build the dataset.
$ python Data_preparation/write_json.py
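
To make the first step above more concrete, here is a minimal sketch of a thin-plate spline (TPS) deformation applied to landmarks. It is not the code in Data_preparation/tps_deform.py: the function names (`fit_tps`, `warp`, `random_deformation`), the 3×3 control grid, the `max_shift` value and the use of normalized [0, 1] coordinates are all assumptions made for illustration. It only shows the general idea of jittering control points, fitting a TPS, and moving landmarks with it.

```python
import math
import random


def _U(r2):
    # TPS radial kernel U(r) = r^2 * log(r), written in terms of r^2.
    return 0.0 if r2 == 0.0 else 0.5 * r2 * math.log(r2)


def _solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ValueError("singular system: control points may be collinear")
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [[M[i][n + k] / M[i][i] for k in range(len(B[0]))] for i in range(n)]


def fit_tps(src, dst):
    """Fit a 2D TPS that maps each control point src[i] onto dst[i]."""
    n = len(src)
    A = [[0.0] * (n + 3) for _ in range(n + 3)]
    B = [[0.0, 0.0] for _ in range(n + 3)]
    for i, (xi, yi) in enumerate(src):
        for j, (xj, yj) in enumerate(src):
            A[i][j] = _U((xi - xj) ** 2 + (yi - yj) ** 2)
        A[i][n:n + 3] = [1.0, xi, yi]                      # affine part
        A[n][i], A[n + 1][i], A[n + 2][i] = 1.0, xi, yi    # side conditions
        B[i] = [dst[i][0], dst[i][1]]
    return list(src), _solve(A, B)


def warp(points, tps):
    """Apply a fitted TPS to a list of (x, y) points."""
    ctrl, coef = tps
    n = len(ctrl)
    out = []
    for x, y in points:
        res = []
        for k in range(2):
            v = coef[n][k] + coef[n + 1][k] * x + coef[n + 2][k] * y
            v += sum(coef[i][k] * _U((x - cx) ** 2 + (y - cy) ** 2)
                     for i, (cx, cy) in enumerate(ctrl))
            res.append(v)
        out.append(tuple(res))
    return out


def random_deformation(grid=3, max_shift=0.05, seed=None):
    """Hypothetical helper: jitter a regular control grid and fit a TPS to it."""
    rng = random.Random(seed)
    step = 1.0 / (grid - 1)
    src = [(i * step, j * step) for j in range(grid) for i in range(grid)]
    dst = [(x + rng.uniform(-max_shift, max_shift),
            y + rng.uniform(-max_shift, max_shift)) for x, y in src]
    return fit_tps(src, dst)


if __name__ == "__main__":
    # Made-up landmark positions in normalized coordinates (multiply by 1024 for pixels).
    landmarks = [(0.35, 0.40), (0.65, 0.40), (0.50, 0.55), (0.40, 0.70), (0.60, 0.70)]
    print(warp(landmarks, random_deformation(seed=0)))
```

In a full pipeline the same transform would also be used to resample the face image so that image and landmarks stay aligned; that part is omitted here. Calling `random_deformation` with several different seeds per ID is one way to obtain more than one deformed pair, in line with the note above.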

Run InstantID-train/download.sh to download the model weights.
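
Before moving on to training, it may help to confirm that the folders mentioned above are in place. The following is a small convenience sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and it only checks the two directories named in these instructions, assuming it is run from the repository root.

```python
from pathlib import Path

# Directories named in the setup instructions, relative to the repository root.
EXPECTED_DIRS = [
    "InstantID-train/models/antelopev2",
    "Data/source_images",
]


def missing_dirs(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("Expected directories found.")
```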

Training:

Note that our data synthesis strategy and inference design can be combined with any existing ID-preserving generative model. Here we take InstantID as an example.

$ sh InstantID-train/train_instantId_sdxl.sh

About

[CVPR 2025] CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth
