The official implementation code for (ACMMM 2025) AnchorSync: Global Consistency Optimization for Long Video Editing

xiaobeichi/AnchorSync

AnchorSync: Global Consistency Optimization for Long Video Editing

Official implementation of AnchorSync, a diffusion-based framework for long video editing that tackles long-term consistency and short-term continuity in a unified architecture. The editing process is decoupled into two stages: (1) anchor frame editing, where a sparse set of representative frames is jointly edited through a progressive denoising process. To ensure global coherence, we inject a trainable Bidirectional Attention module into a diffusion model to capture pairwise structural dependencies between distant frames, and perform Plug-and-Play (PnP) inversion and feature injection for controllable editing; and (2) intermediate frame interpolation, where a video diffusion model equipped with a newly trained multimodal ControlNet guides generation using both optical flow and edge maps, enabling temporally smooth, structure-aware transitions between anchor frames.
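The core idea of the Bidirectional Attention in stage (1) is that tokens from all anchor frames attend to each other in one shared sequence, so distant frames exchange structural information during denoising. A minimal NumPy sketch of that attention pattern (toy shapes and weights, not the trained module from this repo):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bidirectional_anchor_attention(frames, Wq, Wk, Wv):
    """Toy sketch: tokens from all anchor frames attend to each other.

    frames: (F, N, D) — F anchor frames, N tokens per frame, D channels.
    """
    F, N, D = frames.shape
    tokens = frames.reshape(F * N, D)            # flatten frames into one joint sequence
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(D))         # every token sees every frame, both directions
    return (attn @ v).reshape(F, N, D)

rng = np.random.default_rng(0)
F, N, D = 4, 8, 16
frames = rng.standard_normal((F, N, D))
Wq, Wk, Wv = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))
out = bidirectional_anchor_attention(frames, Wq, Wk, Wv)
print(out.shape)  # (4, 8, 16)
```

In the actual framework this module is trainable and sits inside a pretrained diffusion U-Net; the sketch only shows the all-pairs attention pattern across anchor frames.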

Installation

conda create -n anchorsync python=3.10
conda activate anchorsync
python3 -m pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
python3 -m pip install -r requirements.txt --no-deps
python3 -m pip install xformers==0.0.25 --index-url https://download.pytorch.org/whl/cu118

Train

Download more than 10,000 videos from the Panda-70M dataset to a local directory and set the video_folder parameter accordingly.

Download stable-diffusion-v1-5, the Canny ControlNet for SD 1.5, and stable-video-diffusion-img2vid-xt, then update the corresponding checkpoint paths.
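One way to fetch the three base models is `huggingface_hub.snapshot_download`. The README names the models but not exact Hub ids, so the repo ids below are my assumptions; verify them before downloading:

```python
# Assumed Hugging Face repo ids for the three base checkpoints named in the
# README — double-check these against the Hub before use.
CHECKPOINTS = {
    "sd15": "runwayml/stable-diffusion-v1-5",
    "controlnet_canny": "lllyasviel/control_v11p_sd15_canny",
    "svd_xt": "stabilityai/stable-video-diffusion-img2vid-xt",
}

def download_all(root="checkpoints"):
    """Download every base model into root/<name> and return the local paths."""
    from huggingface_hub import snapshot_download  # pip install huggingface_hub
    return {
        name: snapshot_download(repo_id, local_dir=f"{root}/{name}")
        for name, repo_id in CHECKPOINTS.items()
    }
```

Call `download_all()` once, then point the training scripts' checkpoint paths at the returned directories.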

First, train the joint-frame diffusion LoRA for stage one:

bash train_models/train_scripts/train_joint_frame_lora.sh

Second, train the multimodal ControlNet for SVD:

bash train_models/train_scripts/train_controlnet_canny+flow.sh
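The multimodal ControlNet conditions SVD on both an edge map and optical flow per frame. As a shape-level illustration (not the repo's preprocessing: it uses a gradient-magnitude edge proxy instead of Canny, and a dummy flow field), the two signals can be stacked into one conditioning tensor:

```python
import numpy as np

def build_condition(frame, flow):
    """Stack an edge map and a 2-channel flow field into one conditioning tensor.

    frame: (H, W) grayscale image; flow: (H, W, 2) optical flow.
    Returns (H, W, 3) with channels [edge, flow_x, flow_y].
    """
    gy, gx = np.gradient(frame.astype(np.float32))  # toy stand-in for Canny edges
    edges = np.hypot(gx, gy)
    edges /= edges.max() + 1e-8                     # normalise edges to [0, 1]
    return np.dstack([edges, flow])

frame = np.random.default_rng(1).random((64, 64))
flow = np.zeros((64, 64, 2), np.float32)
cond = build_condition(frame, flow)
print(cond.shape)  # (64, 64, 3)
```

In training, a tensor like this (per frame) is what the ControlNet branch consumes alongside the noisy latents; the actual channel layout and preprocessing are defined by the training script above.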

Usage

If you do not train the models yourself, you can download the pretrained checkpoints from the link: put the joint frame LoRA in {joint_lora_path} and the multimodal ControlNet in {multimodal_controlnet_path}.

Put your videos in data/, named "{case_name}.mp4". Inference runs in three steps per case: anchor-frame inversion, anchor-frame editing (forward pass), and intermediate-frame interpolation with the multimodal ControlNet. For example:

python run_models/run_inference_joint_frame_video_fusion_guidance_inversion.py --case_name "mountain-new" --invert_prompt "Vast Mountain Landscape under Clear Blue Sky" --joint_lora_dir "output_dir/joint_frame_lora"

python run_models/run_inference_joint_frame_video_fusion_guidance_forward.py --case_name "mountain-new" --invert_prompt "Vast Mountain Landscape under Clear Blue Sky" --prompt "Chinese Ink Wash Painting of Mountain Landscape under Clear Sky" --joint_lora_dir "output_dir/joint_frame_lora"

python run_models/run_inference_trans_controlnet_canny_flow_video_fusion_guidance_pnp.py --case_name "mountain-new" --prompt "Chinese Ink Wash Painting of Mountain Landscape under Clear Sky" --multimodal_controlnet_path "output_dir/multimodal-controlnet"

python run_models/run_inference_joint_frame_video_fusion_guidance_inversion.py --case_name "forest-2" --invert_prompt "A forest path in morning sunlight with green trees and long shadows" --joint_lora_dir "output_dir/joint_frame_lora"

python run_models/run_inference_joint_frame_video_fusion_guidance_forward.py --case_name "forest-2" --invert_prompt "A forest path in morning sunlight with green trees and long shadows" --prompt "A forest path covered in snow during a winter sunrise" --joint_lora_dir "output_dir/joint_frame_lora"

python run_models/run_inference_trans_controlnet_canny_flow_video_fusion_guidance_pnp.py --case_name "forest-2" --prompt "A forest path covered in snow during a winter sunrise" --multimodal_controlnet_path "output_dir/multimodal-controlnet"
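The three commands can be chained per case with a small wrapper. This is a convenience sketch, not part of the repo; the script paths and flags are copied verbatim from the commands above, and `dry_run=True` only prints the commands:

```python
import subprocess

def run_pipeline(case_name, invert_prompt, edit_prompt,
                 joint_lora_dir="output_dir/joint_frame_lora",
                 controlnet_path="output_dir/multimodal-controlnet",
                 dry_run=False):
    """Run inversion -> forward editing -> ControlNet interpolation for one case."""
    stages = [
        ["python", "run_models/run_inference_joint_frame_video_fusion_guidance_inversion.py",
         "--case_name", case_name, "--invert_prompt", invert_prompt,
         "--joint_lora_dir", joint_lora_dir],
        ["python", "run_models/run_inference_joint_frame_video_fusion_guidance_forward.py",
         "--case_name", case_name, "--invert_prompt", invert_prompt,
         "--prompt", edit_prompt, "--joint_lora_dir", joint_lora_dir],
        ["python", "run_models/run_inference_trans_controlnet_canny_flow_video_fusion_guidance_pnp.py",
         "--case_name", case_name, "--prompt", edit_prompt,
         "--multimodal_controlnet_path", controlnet_path],
    ]
    for cmd in stages:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # abort on the first failing stage
    return stages
```

For instance, `run_pipeline("mountain-new", "Vast Mountain Landscape under Clear Blue Sky", "Chinese Ink Wash Painting of Mountain Landscape under Clear Sky")` reproduces the first example above.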

Acknowledgement

This codebase is built upon Stable Diffusion, ControlNet, and Stable Video Diffusion. We thank the authors for their great work and repositories!
