This repository contains the official implementation of "SyncSDE: A Probabilistic Framework for Diffusion Synchronization" (CVPR 2025).
It was tested with Python 3.9 and CUDA 11.8.
pip install -r requirements.txt
pip install -e .
For 3D mesh texturing, install PyTorch3D:
conda install -c conda-forge cupy
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py39_cu118_pyt270/download.html
If an error related to the NumPy version occurs, reinstall NumPy via
conda install -c conda-forge numpy
and upgrade to version 1.22.4.
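Before running the scripts, it can help to confirm what the active environment actually contains. The following is a minimal sketch, not part of this repository: it only reports the Python, NumPy, and PyTorch/CUDA versions so they can be compared with the tested setup (Python 3.9, CUDA 11.8) and the NumPy version named above (1.22.4). The helper names `installed_version`, `version_tuple`, and `describe_environment` are hypothetical.

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.22.4' into (1, 22, 4); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def describe_environment():
    """Collect the versions relevant to the setup notes above."""
    info = {
        "python": platform.python_version(),
        "numpy": installed_version("numpy"),
        "torch": installed_version("torch"),
        "cuda": None,
    }
    if info["torch"] is not None:
        try:
            import torch  # imported lazily so the check also runs without PyTorch
            info["cuda"] = torch.version.cuda  # None on CPU-only builds
        except Exception:
            pass  # a broken PyTorch install should not crash the report
    return info


env = describe_environment()
for name, value in env.items():
    print(f"{name}: {value}")

# 1.22.4 is the NumPy version named in this README.
if env["numpy"] is not None and version_tuple(env["numpy"]) < (1, 22, 4):
    print("NumPy is older than 1.22.4; consider reinstalling it.")
```

The check only reads package metadata and prints a report; it does not install or change anything.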
python src/mask_based_T2I.py --prompt_path "data/prompt_mask_T2I.txt" \
--mask_path "data/mask_sample.png" \
--results_folder "./output/mask_T2I" \
--random_seed 0 --use_float_16 --inv_lambda 5.0
First, invert the real image using DDIM inversion.
python src/inversion.py --input_image "data/image.png" \
--results_folder "output/real_image_editing" \
--use_float_16
Then edit the real image.
python src/real_image_editing.py --inversion "./output/real_image_editing/inversion/image.pt" \
--prompt "./output/real_image_editing/prompt/image.txt" \
--results_folder "./output/real_image_editing" --task_name "cat2dog" \
--random_seed 0 --use_float_16 --inv_lambda 5.0
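The two steps above can be chained from a small driver script. The following is a minimal sketch under stated assumptions, not part of this repository: it assumes that `src/inversion.py` writes `<results_folder>/inversion/<image name>.pt` and `<results_folder>/prompt/<image name>.txt`, which is only inferred from the example paths above. The helpers `build_editing_commands` and `run_all` are hypothetical; the flags are copied from the commands above.

```python
import shlex
import subprocess
from pathlib import Path


def build_editing_commands(input_image, results_folder, task_name,
                           seed=0, inv_lambda=5.0):
    """Build the inversion and editing commands for one real image."""
    stem = Path(input_image).stem
    out = Path(results_folder)
    # Assumption: inversion.py stores its outputs under these two subfolders,
    # as the example paths in this README suggest.
    inversion_file = (out / "inversion" / f"{stem}.pt").as_posix()
    prompt_file = (out / "prompt" / f"{stem}.txt").as_posix()

    invert = ["python", "src/inversion.py",
              "--input_image", str(input_image),
              "--results_folder", out.as_posix(),
              "--use_float_16"]
    edit = ["python", "src/real_image_editing.py",
            "--inversion", inversion_file,
            "--prompt", prompt_file,
            "--results_folder", out.as_posix(),
            "--task_name", task_name,
            "--random_seed", str(seed),
            "--use_float_16",
            "--inv_lambda", str(inv_lambda)]
    return [invert, edit]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True stops the chain if inversion fails before editing.
            subprocess.run(command, check=True)


commands = build_editing_commands("data/image.png",
                                  "output/real_image_editing", "cat2dog")
run_all(commands)  # dry run: prints the two commands without executing them
```

Calling `run_all(commands, dry_run=False)` from the repository root would execute both steps in order.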
python src/wide_image.py --prompt_path "./data/prompt_wide_image.txt" \
--results_folder "output/wide_image" \
--n_patches 13 --init_xt_from_zt \
--random_seed 0 --use_float_16 --inv_lambda 5.0
[NOTE] To use the fp16 variant of the pretrained DeepFloyd-IF model, you may need to upgrade the transformers version.
To use the pretrained DeepFloyd-IF model, please log in to Hugging Face by following the instructions.
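One way to log in from Python is through the `huggingface_hub` package. The following is a minimal sketch, not the procedure this repository prescribes: it assumes `huggingface_hub` is installed and that an access token is stored in the `HF_TOKEN` environment variable. The helper names `read_hf_token` and `login_to_hugging_face` are hypothetical.

```python
import os


def read_hf_token(env=None):
    """Return the access token stored in HF_TOKEN, or raise if it is missing."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set the HF_TOKEN environment variable first.")
    return token


def login_to_hugging_face():
    """Log in to the Hugging Face Hub with the token from the environment."""
    # Imported lazily so this file can be loaded without huggingface_hub installed.
    from huggingface_hub import login
    login(token=read_hf_token())
```

Calling `login_to_hugging_face()` once before the first run should be enough on a given machine, since the token is cached locally after a successful login.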
Then, run the script:
python src/ambiguous_image.py --prompt_path "data/prompt_ambiguous.txt" \
--transform "rotate_cw" \
--results_folder "./output/ambiguous_image" \
--random_seed 0 --use_float_16 --inv_lambda 5.0
You may use 5 types of transforms: 'rotate_cw', 'rotate_ccw', 'rotate_180', 'skew', and 'flip'.
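To try every transform with the same prompt file, the command above can be generated once per transform. The following is a minimal sketch, not part of this repository: the helper `build_ambiguous_command` is hypothetical, and writing each run to its own subfolder under the results folder is an illustrative choice rather than something the script requires. The flags and transform names are copied from above.

```python
import shlex

# The five transform names listed in this README.
TRANSFORMS = ["rotate_cw", "rotate_ccw", "rotate_180", "skew", "flip"]


def build_ambiguous_command(transform,
                            prompt_path="data/prompt_ambiguous.txt",
                            results_root="./output/ambiguous_image",
                            seed=0, inv_lambda=5.0):
    """Build the ambiguous-image command for a single transform."""
    if transform not in TRANSFORMS:
        raise ValueError(f"Unknown transform: {transform!r}")
    return ["python", "src/ambiguous_image.py",
            "--prompt_path", prompt_path,
            "--transform", transform,
            # Hypothetical layout: one results subfolder per transform.
            "--results_folder", f"{results_root}/{transform}",
            "--random_seed", str(seed),
            "--use_float_16",
            "--inv_lambda", str(inv_lambda)]


# Print the commands so they can be reviewed or pasted into a shell.
for transform in TRANSFORMS:
    print(shlex.join(build_ambiguous_command(transform)))
```

Rejecting unknown names up front avoids launching a run with a misspelled transform.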
python src/mesh_texturing.py --prompt_path "./data/prompt_mesh_texturing.txt" \
--mesh_path "./data/clutch_bag.obj" \
--results_folder "./output/mesh_texturing" \
--random_seed 0 --inv_lambda 5.0
This repository builds on Conditional Score Guidance and DeepFloyd-IF. The source image for text-driven real image editing is taken from the LAION-5B dataset.