The official implementation code for (ACMMM 2025) AnchorSync: Global Consistency Optimization for Long Video Editing

AnchorSync: Global Consistency Optimization for Long Video Editing

Introduction

Official implementation of AnchorSync (ACMMM 2025). This repo contains a reproduced version of AnchorSync: Global Consistency Optimization for Long Video Editing.

Get Started

Suppose the AnchorSync codebase path is ${AnchorSync_HOME}. Then follow the steps below.

Step 1: Prepare Environment

cd ${AnchorSync_HOME}
conda create -n anchorsync python=3.10
conda activate anchorsync
python3 -m pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
python3 -m pip install -r requirements.txt --no-deps
python3 -m pip install xformers==0.0.25 --index-url https://download.pytorch.org/whl/cu118
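Before moving on, it can help to verify the environment. The following is a minimal sketch (not part of the repo) that checks the key packages are importable; it does not verify versions or CUDA support:

```python
# Quick sanity check that the key packages from the setup steps above are
# importable in the activated environment. This helper is an illustration,
# not part of the AnchorSync codebase.
import importlib.util

def check_packages(packages):
    """Return a mapping of package name -> whether it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

print(check_packages(["torch", "torchvision", "xformers"]))
```

If any entry prints False, re-run the corresponding pip command above inside the anchorsync conda environment.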

Step 2: Prepare Panda-70M Dataset (For Training)

Download videos from the Panda-70M dataset into ${AnchorSync_HOME}/data.

The ${AnchorSync_HOME}/data/Panda-70M folder should be organized as follows:

└── data
    └── Panda-70M
        ├── train
        ├── test
        ├── video_files.json

video_files.json records the storage path of each video; you can generate it with the provided script:

python get_pandas.py
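For intuition, here is a sketch of what such an index script plausibly does: scan the train/ and test/ splits for videos and write their paths to video_files.json. The function names and the JSON schema here are assumptions for illustration; the actual output of get_pandas.py may differ.

```python
# Illustrative sketch only: builds a {split: [video paths]} index and writes
# it as video_files.json. The real schema produced by get_pandas.py may differ.
import json
from pathlib import Path

def collect_video_files(root):
    """Scan the train/ and test/ splits under `root` for .mp4 files."""
    root = Path(root)
    return {
        split: sorted(str(p) for p in (root / split).glob("*.mp4"))
        for split in ("train", "test")
        if (root / split).is_dir()
    }

def write_index(root, out_name="video_files.json"):
    """Write the index next to the split folders, mirroring the layout above."""
    index = collect_video_files(root)
    (Path(root) / out_name).write_text(json.dumps(index, indent=2))
    return index
```

For the layout above you would call write_index("data/Panda-70M").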

Step 3: Prepare Pretrained Models

Download stable-diffusion-v1-5, the Canny ControlNet for SD 1.5, and stable-video-diffusion-img2vid-xt, then update the corresponding checkpoint paths.

Alternatively, the models can be downloaded automatically at runtime (the default).

Train

First, train the joint frame diffusion (LoRA) for the first stage:

bash train_models/train_scripts/train_joint_frame_lora.sh

Second, train the multimodal ControlNet for SVD:

bash train_models/train_scripts/train_controlnet_canny+flow.sh

Usage

If you skip training, you can download the pretrained joint frame LoRA to {joint_lora_path} and the multimodal ControlNet to {multimodal_controlnet_path}.

For example, the output directory should be organized as follows:

└── output_dir
    ├── joint_frame_lora
    ├── multimodal-controlnet

Put your videos in data/, named "{case_name}.mp4", then run the three stages below:

  1. DDIM Inversion of first process (jointly edit anchor frames)
python run_models/run_inference_joint_frame_video_fusion_guidance_inversion.py --case_name "mountain-new" --invert_prompt "Vast Mountain Landscape under Clear Blue Sky" --joint_lora_dir "output_dir/joint_frame_lora"
  2. Forward editing of first process (jointly edit anchor frames)
python run_models/run_inference_joint_frame_video_fusion_guidance_forward.py --case_name "mountain-new" --invert_prompt "Vast Mountain Landscape under Clear Blue Sky" --prompt "Chinese Ink Wash Painting of Mountain Landscape under Clear Sky" --joint_lora_dir "output_dir/joint_frame_lora"
  3. Second process (Multimodal-Guided Interpolation)
python run_models/run_inference_trans_controlnet_canny_flow_video_fusion_guidance_pnp.py --case_name "mountain-new" --prompt "Chinese Ink Wash Painting of Mountain Landscape under Clear Sky" --multimodal_controlnet_path "output_dir/multimodal-controlnet"
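The three stages can also be chained with a small wrapper. This helper is hypothetical and not part of the repo; it only assembles the command lines verbatim from the invocations above, so you can run each with subprocess.run(cmd, check=True):

```python
# Hypothetical convenience wrapper: builds the three stage commands shown in
# this README for one case. Script names and flags are copied from the README;
# the wrapper itself is not part of the AnchorSync codebase.
def build_pipeline_commands(case_name, invert_prompt, prompt,
                            joint_lora_dir="output_dir/joint_frame_lora",
                            controlnet_path="output_dir/multimodal-controlnet"):
    inversion = [
        "python", "run_models/run_inference_joint_frame_video_fusion_guidance_inversion.py",
        "--case_name", case_name, "--invert_prompt", invert_prompt,
        "--joint_lora_dir", joint_lora_dir,
    ]
    forward = [
        "python", "run_models/run_inference_joint_frame_video_fusion_guidance_forward.py",
        "--case_name", case_name, "--invert_prompt", invert_prompt,
        "--prompt", prompt, "--joint_lora_dir", joint_lora_dir,
    ]
    interpolation = [
        "python", "run_models/run_inference_trans_controlnet_canny_flow_video_fusion_guidance_pnp.py",
        "--case_name", case_name, "--prompt", prompt,
        "--multimodal_controlnet_path", controlnet_path,
    ]
    return [inversion, forward, interpolation]
```

For example, build_pipeline_commands("mountain-new", "Vast Mountain Landscape under Clear Blue Sky", "Chinese Ink Wash Painting of Mountain Landscape under Clear Sky") reproduces the three commands above in order.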

We recommend trying longer videos, for example:

python run_models/run_inference_joint_frame_video_fusion_guidance_inversion.py --case_name "forest-2" --invert_prompt "A forest path in morning sunlight with green trees and long shadows" --joint_lora_dir "output_dir/joint_frame_lora"

python run_models/run_inference_joint_frame_video_fusion_guidance_forward.py --case_name "forest-2" --invert_prompt "A forest path in morning sunlight with green trees and long shadows" --prompt "A forest path covered in snow during a winter sunrise" --joint_lora_dir "output_dir/joint_frame_lora"

python run_models/run_inference_trans_controlnet_canny_flow_video_fusion_guidance_pnp.py --case_name "forest-2" --prompt "A forest path covered in snow during a winter sunrise" --multimodal_controlnet_path "output_dir/multimodal-controlnet"

Acknowledgement

This repository builds on multiple great open-source codebases. Thanks for their contributions to the community.

Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entry.
