📖 TL;DR: Any-to-Bokeh is a novel one-step video bokeh framework that renders arbitrary input videos with temporally coherent, depth-aware bokeh effects.
- [2025-07-11] 🎉 We have officially released the model weights for public use! You can now download the pretrained weights via Google Drive.
- Release the demo inference files
- Release the inference pipeline
- Release the model weights
- Release the training files
conda create -n any2bokeh python=3.10 -y
conda activate any2bokeh
# The default CUDA version is 12.4; please adjust it to match your configuration.
# Install PyTorch.
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
# Clone repo
git clone https://github.com/vivoCameraResearch/any-to-bokeh.git
cd any-to-bokeh
pip install -r requirements.txt
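
As a quick sanity check (a minimal sketch, not part of the official setup), you can verify that the installed PyTorch build sees your GPU before proceeding:

```python
# Quick environment check: prints the installed PyTorch version and CUDA availability.
import torch

print(f"PyTorch version: {torch.__version__}")        # expected: 2.4.1
print(f"CUDA available:  {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Device:         {torch.cuda.get_device_name(0)}")
```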
We provide 8 demo videos obtained from the DAVIS dataset.
- Download the pre-trained weights from Google Drive and place them in the `./checkpoints` folder.
- Run the demo script `python test/inference_demo.py`. The results will be saved in the `./output` folder.
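
If the results are written as per-frame images, a small helper like the one below can stitch them into a video for quick viewing. This is an illustrative sketch only; the actual layout under `./output` may differ.

```python
# Illustrative helper: stitch rendered frames (assumed PNGs under ./output/<video_name>/)
# back into an mp4 for quick inspection. The folder layout is an assumption.
import glob
import cv2

frames = sorted(glob.glob("output/demo_video/*.png"))  # hypothetical output layout
assert frames, "no rendered frames found"
h, w = cv2.imread(frames[0]).shape[:2]
writer = cv2.VideoWriter("bokeh_preview.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 24, (w, h))
for path in frames:
    writer.write(cv2.imread(path))
writer.release()
```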
Before bokeh rendering, two data preprocessing steps are required.
We recommend using SAM2 to obtain the mask of the focus target.
First, split the video into frames and place them in a folder using `utils/split_mp4.py`:
python utils/split_mp4.py input.mp4
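
For reference, a frame-splitting step of this kind typically looks like the sketch below. This is an illustrative re-implementation, not the contents of `utils/split_mp4.py`; the frame naming convention is an assumption.

```python
# Illustrative sketch: split an mp4 into numbered frames with OpenCV.
# Frame naming (00000.png, 00001.png, ...) is an assumption; check utils/split_mp4.py
# for the naming convention the pipeline actually expects.
import os
import sys
import cv2

video_path = sys.argv[1]                    # e.g. input.mp4
out_dir = os.path.splitext(video_path)[0]   # frames go into a folder named after the video
os.makedirs(out_dir, exist_ok=True)

cap = cv2.VideoCapture(video_path)
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(os.path.join(out_dir, f"{idx:05d}.png"), frame)
    idx += 1
cap.release()
print(f"Wrote {idx} frames to {out_dir}")
```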
Then, install Video-Depth-Anything and run our script to obtain depth (disparity) information for each frame:
# --mask_folder is the path to the masks obtained via SAM2.
python utils/pre_process.py \
    --img_folder path/to/images \
    --mask_folder path/to/masks \
    --disp_dir output/directory
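
Before running the preprocessing, it can help to confirm that the frame and mask folders line up. The check below is a small sketch under the assumption that frames and masks share identical filenames.

```python
# Sanity check (illustrative): verify that every frame has a matching SAM2 mask
# before running utils/pre_process.py. Assumes frames and masks share filenames.
import os

img_folder = "path/to/images"
mask_folder = "path/to/masks"

imgs = sorted(os.listdir(img_folder))
masks = set(os.listdir(mask_folder))
missing = [f for f in imgs if f not in masks]
if missing:
    print(f"{len(missing)} frames have no mask, e.g. {missing[:3]}")
else:
    print(f"All {len(imgs)} frames have matching masks.")
```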
Write the folder `aif_folder` that stores the video frames, the corresponding preprocessed disparity folder `disp_folder`, and the value `k` representing the bokeh intensity into a CSV file in the following format (like `demo.csv`); a small helper sketch for generating such a CSV is shown after the table:
aif_folder | disp_folder | k |
---|---|---|
demo_dataset/videos/xxx | demo_dataset/disp/xxx | 16 |
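
A CSV in this format can be generated with a few lines of Python. The snippet below is a sketch: the column names follow the table above, while the output filename and paths are placeholders.

```python
# Illustrative sketch: write the per-video configuration CSV (columns: aif_folder, disp_folder, k).
# The row values and output filename are placeholders; adjust them to your own data layout.
import csv

rows = [
    {"aif_folder": "demo_dataset/videos/xxx", "disp_folder": "demo_dataset/disp/xxx", "k": 16},
]

with open("csv_file/my_demo.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["aif_folder", "disp_folder", "k"])
    writer.writeheader()
    writer.writerows(rows)
```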
Then, run the script
python test/inference_demo.py --val_csv_path csv_file/demo.csv
First, define the blur strength `k` for each frame. Specifically, the filename of each frame's depth file needs to be modified accordingly. We provide a simple modification script for this purpose; an illustrative sketch of such a renaming pass is shown below.
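
As a rough illustration of such a renaming pass: the `_k_<value>` suffix used here is purely an assumption for illustration, so refer to the provided modification script for the naming convention the pipeline actually parses.

```python
# Illustrative only: tag each frame's disparity file with a per-frame blur strength.
# The "_k_<value>" suffix is a hypothetical naming convention; use the provided
# modification script to see the format the pipeline actually expects.
import os

disp_dir = "demo_dataset/disp_change_k/xxx"        # placeholder path
k_per_frame = {"00000.png": 8, "00001.png": 16}    # hypothetical per-frame blur strengths

for name, k in k_per_frame.items():
    stem, ext = os.path.splitext(name)
    src = os.path.join(disp_dir, name)
    dst = os.path.join(disp_dir, f"{stem}_k_{k}{ext}")
    if os.path.exists(src):
        os.rename(src, dst)
```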
Next, the CSV configuration from case 1 should be updated to the following template (e.g., `demo_change_k.csv`):
aif_folder | disp_folder | k |
---|---|---|
demo_dataset/videos/xxx | demo_dataset/disp_change_k/xxx | change |
Then, run the script
python test/inference_demo.py --val_csv_path csv_file/demo_change_k.csv
We use the number following the `_zf_` tag in the disparity filename to represent the disparity value of the focus plane. You can customize this value for each frame to adjust the focus plane. We provide a simple modification script for this purpose; an illustrative sketch of such a renaming pass is shown below.
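
A minimal sketch of such a renaming pass, assuming the disparity filenames contain a literal `_zf_<value>` tag; the exact filename pattern and the example disparity values are assumptions, so check the provided modification script for the real format.

```python
# Illustrative only: rewrite the _zf_<value> tag in disparity filenames to move the
# focus plane per frame. The filename pattern is an assumption; consult the provided
# modification script for the format used by the pipeline.
import os
import re

disp_dir = "demo_dataset/disp_change_f/xxx"        # placeholder path
new_zf_per_frame = {"00000": 0.35, "00001": 0.40}  # hypothetical focus-plane disparities

for name in os.listdir(disp_dir):
    stem, ext = os.path.splitext(name)
    frame_id = stem.split("_")[0]                  # assumes names like 00000_zf_0.50.png
    if frame_id in new_zf_per_frame:
        new_stem = re.sub(r"_zf_[0-9.]+", f"_zf_{new_zf_per_frame[frame_id]}", stem)
        os.rename(os.path.join(disp_dir, name), os.path.join(disp_dir, new_stem + ext))
```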
Next, the CSV configuration is the same as in case 1 (e.g., `demo_change_f.csv`):
aif_folder | disp_folder | k |
---|---|---|
demo_dataset/videos/xxx | demo_dataset/disp_change_f/xxx | 16 |
Then, run the script
python test/inference_demo.py --val_csv_path csv_file/demo_change_f.csv
This codebase builds on SVD_Xtend. Thanks for open-sourcing! We also acknowledge the following great open-source projects:
- SAM2 (https://github.com/facebookresearch/sam2).
- Video-Depth-Anything (https://github.com/DepthAnything/Video-Depth-Anything).
@article{yang2025any,
title={Any-to-Bokeh: One-Step Video Bokeh via Multi-Plane Image Guided Diffusion},
author={Yang, Yang and Zheng, Siming and Chen, Jinwei and Wu, Boxi and He, Xiaofei and Cai, Deng and Li, Bo and Jiang, Peng-Tao},
journal={arXiv preprint arXiv:2505.21593},
year={2025}
}
If you have any questions or suggestions for improvement, please email Yang Yang (yangyang98@zju.edu.cn) or open an issue.