Diffusion Curriculum (DisCL): Synthetic-to-Real Data Curriculum via Image-Guided Diffusion [ICCV 2025]
Our approach consists of two phases: (Phase 1) Interpolated Synthetic Generation and (Phase 2) Training with curriculum learning (CL). In
Phase 1, we use a model pretrained on the original data to identify the "hard" samples, and generate data spanning the full
spectrum from synthetic to real images by varying the image guidance.
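To make the "hard sample" step in Phase 1 concrete, here is a minimal sketch in plain Python. It assumes a sample counts as hard when the pretrained model assigns its ground-truth class a probability below a threshold. Both that rule and the names `find_hard_samples` and `threshold` are illustrative assumptions, not the criterion used in this repository.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def find_hard_samples(logits_per_sample, labels, threshold=0.5):
    """Return indices whose ground-truth-class probability is below threshold.

    The threshold rule is an assumption for illustration only.
    """
    hard = []
    for idx, (logits, label) in enumerate(zip(logits_per_sample, labels)):
        if softmax(logits)[label] < threshold:
            hard.append(idx)
    return hard


# Toy logits from a hypothetical pretrained classifier over three classes.
logits = [[4.0, 0.5, 0.1], [0.2, 0.3, 0.1], [0.1, 3.0, 0.2]]
labels = [0, 2, 0]
hard_indices = find_hard_samples(logits, labels)
print(hard_indices)
```

The resulting indices would then be written to the CSV file that the generation scripts consume.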
A demo script is prepared for synthetic data generation and filtering.
Synthetic images generated based on ImageNet-LT dataset with DisCL:
Synthetic images generated based on iWildCam dataset with DisCL:
Leveraging the full spectrum of synthetic data generated by DisCL, the model performs better on more challenging samples. Below are improved samples from the iWildCam test set compared to the SOTA method (FLYP).
Top-1 Accuracy on ImageNet-LT:
| Method | Many | Medium | Few | Overall |
|---|---|---|---|---|
| CE | 57.70 | 26.60 | 4.40 | 35.80 |
| CE + CUDA | 57.49 | 28.16 | 6.58 | 36.30 |
| CE + DisCL | 56.78 | 30.73 | 23.64 | 39.82 |
| BS | 51.14 | 37.02 | 19.29 | 39.80 |
| BS + CUDA | 51.16 | 37.35 | 19.28 | 40.03 |
| BS + DisCL | 52.68 | 37.68 | 21.36 | 41.33 |
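To read the table by split, the short script below computes the per-split change of each DisCL row over its baseline. The numbers are copied from the table above; the helper name `gain` is only for illustration.

```python
# Top-1 accuracy (%) copied from the ImageNet-LT table above.
results = {
    "CE": {"Many": 57.70, "Medium": 26.60, "Few": 4.40, "Overall": 35.80},
    "CE + DisCL": {"Many": 56.78, "Medium": 30.73, "Few": 23.64, "Overall": 39.82},
    "BS": {"Many": 51.14, "Medium": 37.02, "Few": 19.29, "Overall": 39.80},
    "BS + DisCL": {"Many": 52.68, "Medium": 37.68, "Few": 21.36, "Overall": 41.33},
}


def gain(baseline, method):
    """Return the per-split accuracy difference (method minus baseline)."""
    return {
        split: round(results[method][split] - results[baseline][split], 2)
        for split in results[baseline]
    }


ce_gain = gain("CE", "CE + DisCL")
bs_gain = gain("BS", "BS + DisCL")
print("CE -> CE + DisCL:", ce_gain)
print("BS -> BS + DisCL:", bs_gain)
```

For CE, the largest change is on the Few split (+19.24 points), with a 0.92-point drop on Many.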
F1 score on iWildCam (OOD: Out-of-domain, ID: In-domain):
| Method | OOD F1 Score | ID F1 Score |
|---|---|---|
| FLYP | 35.5 (1.1) | 52.2 (0.6) |
| FLYP + ALIA | 36.9 (0.3) | 52.6 (0.4) |
| FLYP + DisCL | 38.2 (0.5) | 54.3 (1.4) |
conda create -n DisCL python=3.10
conda activate DisCL
pip3 install open_clip_torch
pip3 install wilds
pip3 install -r requirements.txt
We use two public datasets for training: ImageNet-LT and iWildCam.
- ImageNet-LT is a long-tailed subset of ImageNet. Its long-tailed meta information can be downloaded from Google Drive.
- iWildCam is an image classification dataset captured by wildlife camera traps. It is released by WILDS and can be downloaded with its official package.
- Prepare a CSV file listing the hard iWildCam samples
- A template for the CSV file is provided in sample.csv
- Use this CSV to generate synthetic data with different guidance scales and random seeds
python3 data_generation/iWildCam/gene_img.py --part=1 --total_parts=1 --data_csv="${PATH_TO_CSV}" --output_path="${OUTPUT_FOLDER}"
- Compute CLIPScore to filter out poor-quality images
python3 data_generation/iWildCam/comp_clip_scores.py --syn_path="${OUTPUT_FOLDER}" --real_path="${PATH_TO_WILDS}"
- Result 1 (clip_score.pkl): contains the image-image and image-text similarity scores
- Result 2 (filtered_results.pkl): contains the image-image and image-text similarity scores of only the images that pass filtering
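The filtering step above can be pictured with the sketch below. The layout of clip_score.pkl is not documented here, so the record format (a list of dicts with `path`, `image_image`, and `image_text` keys), the two thresholds, and the rule that both scores must pass are all assumptions made for illustration. comp_clip_scores.py may use different criteria.

```python
def filter_by_clip_score(records, min_image_image=0.6, min_image_text=0.25):
    """Keep records whose two similarity scores both reach their minimum.

    Thresholds and record keys are hypothetical placeholders.
    """
    return [
        r for r in records
        if r["image_image"] >= min_image_image
        and r["image_text"] >= min_image_text
    ]


# Toy scores standing in for the contents of a score file.
records = [
    {"path": "syn/a.png", "image_image": 0.82, "image_text": 0.31},
    {"path": "syn/b.png", "image_image": 0.41, "image_text": 0.33},
    {"path": "syn/c.png", "image_image": 0.77, "image_text": 0.12},
]
kept = filter_by_clip_score(records)
print([r["path"] for r in kept])
```

Requiring both scores to pass drops images that drift too far from the real image as well as images that no longer match the text prompt.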
- Prepare a CSV file listing the hard ImageNet-LT samples
- A template for the CSV file is provided in sample.csv
- Use this CSV to generate diversified text prompts for the hard classes
python3 data_generation/ImageNet_LT/get_text_prompt.py --data_csv="${PATH_TO_CSV}" --prompt_json="${PATH_TO_PROMPT}"
- Use this CSV to generate synthetic data with different guidance scales and random seeds
python3 data_generation/ImageNet_LT/gene_img.py --part=1 --total_parts=1 --data_csv="${PATH_TO_CSV}" --output_path="${OUTPUT_FOLDER}" --prompt_json="${PATH_TO_PROMPT}"
- Compute CLIPScore to filter out poor-quality images. This script produces clip_score.pkl, which contains the image-image and image-text similarity scores
python3 data_generation/ImageNet_LT/comp_clip_scores.py --syn_path="${OUTPUT_FOLDER}" --real_path="${PATH_TO_INLT}"
- Result 1 (clip_score.pkl): contains the image-image and image-text similarity scores
- Result 2 (filtered_results.pkl): contains the image-image and image-text similarity scores of only the images that pass filtering
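The generation commands above pass `--part=1 --total_parts=1`, which suggests the CSV can be split across several runs. How gene_img.py actually partitions the rows is not stated here. The sketch below shows one common scheme, assuming 1-indexed parts and contiguous chunks; `rows_for_part` is a hypothetical helper, not a function from this repository.

```python
def rows_for_part(rows, part, total_parts):
    """Return the contiguous slice of rows assigned to a 1-indexed part."""
    if not 1 <= part <= total_parts:
        raise ValueError("part must be between 1 and total_parts")
    chunk = -(-len(rows) // total_parts)  # ceiling division
    start = (part - 1) * chunk
    return rows[start:start + chunk]


# Ten placeholder rows split across three hypothetical workers.
rows = [f"sample_{i}" for i in range(10)]
shards = [rows_for_part(rows, p, 3) for p in (1, 2, 3)]
print([len(s) for s in shards])
```

Under this scheme, `part=1` with `total_parts=1` processes the whole CSV, which matches how the commands above are written.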
- Run the ImageNet-LT training script run_training.sh
cd curriculum_training/ImageNet
bash myshells/run_training.sh
- Run the iWildCam training script run_training.sh
cd curriculum_training/iWildCam
bash myshells/run_training.sh
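As a rough picture of what a synthetic-to-real curriculum does during training, the sketch below maps each epoch to an image-guidance level. It assumes that higher guidance yields images closer to the real ones and uses a simple staged schedule that only ever increases. The function name, the guidance values, and the schedule itself are illustrative assumptions, not the strategy implemented in run_training.sh.

```python
def guidance_for_epoch(epoch, total_epochs, guidance_levels):
    """Pick an image-guidance level for an epoch, moving from low to high.

    Illustrative staged schedule only: epochs are split into equal stages,
    one per guidance level.
    """
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch must be in [0, total_epochs)")
    levels = sorted(guidance_levels)
    stage = epoch * len(levels) // total_epochs
    return levels[stage]


# Hypothetical guidance levels; 1.0 stands for real images in this sketch.
levels = [0.5, 0.7, 0.9, 1.0]
schedule = [guidance_for_epoch(e, 8, levels) for e in range(8)]
print(schedule)
```

With eight epochs and four levels, each level is used for two consecutive epochs, so training ends on data closest to the real distribution.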
Our code is heavily based on FLYP, LDMLR, and Open CLIP. We greatly thank the authors for open-sourcing their code!
Please consider citing our paper if you find our code, data, or models useful. Thank you!
@inproceedings{liang-bhardwaj-zhou-2024-discl,
title = "Diffusion Curriculum: Synthetic-to-Real Data Curriculum via Image-Guided Diffusion",
author = "Liang, Yijun and Bhardwaj, Shweta and Zhou, Tianyi",
booktitle = "International Conference on Computer Vision (ICCV)",
year = "2025",
}







