Chenxuan Miao, Yutong Feng, Jianshu Zeng, Zixiang Gao, Hantang Liu, Yunfeng Yan, Donglian Qi, Xi Chen, Bin Wang, Hengshuang Zhao
⭐ If ROSE is helpful to your projects, please help star this repo. Thanks! 🤗
📖 For more visual results, check out our project page.
- Release checkpoints.
- Release inference code.
- Release gradio demo.
[Demo videos: paired grids of Masked Input and Output; see the project page for the visual results.]
- Clone Repo

  ```shell
  git clone https://github.com/Kunbyte-AI/ROSE.git
  ```
- Create Conda Environment and Install Dependencies

  ```shell
  # create new anaconda env
  conda create -n rose python=3.12 -y
  conda activate rose

  # install python dependencies
  pip3 install -r requirements.txt
  ```
- CUDA = 12.4
- PyTorch = 2.6.0
- Torchvision = 0.21.0
- Other required packages listed in `requirements.txt`
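After installation, a quick sanity check against the versions above (a minimal sketch):

```python
# quick sanity check of the installed environment
import torch
import torchvision

print(torch.__version__)          # expect 2.6.0
print(torchvision.__version__)    # expect 0.21.0
print(torch.version.cuda)         # expect 12.4
print(torch.cuda.is_available())  # should be True on a working GPU setup
```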
We use the pretrained Wan2.1-Fun-1.3B-InP as our base model. During training, we train only the WanTransformer3D part and keep the other parts frozen. You can download the ROSE Transformer3D weights from this link.
For local inference, the `weights` directory should be arranged like this:

```
weights
└── transformer
    ├── config.json
    └── diffusion_pytorch_model.safetensors
```
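If the checkpoint is hosted on the Hugging Face Hub, it can be fetched with `huggingface_hub` (a minimal sketch; the repo id `Kunbyte-AI/ROSE` is an assumption, use the link above as the source of truth):

```python
# hypothetical: download the ROSE Transformer3D weights into weights/transformer
from huggingface_hub import hf_hub_download

for name in ["config.json", "diffusion_pytorch_model.safetensors"]:
    hf_hub_download(
        repo_id="Kunbyte-AI/ROSE",      # assumed repo id, see the link above
        filename=f"transformer/{name}",
        local_dir="weights",            # files land in weights/transformer/
    )
```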
You also need to prepare the base model in the `models` directory. You can download the Wan2.1-Fun-1.3B-InP base model from this link.
The `models` directory will be arranged like this:

```
models
└── Wan2.1-Fun-1.3B-InP
    ├── google
    │   └── umt5-xxl
    │       ├── spiece.model
    │       ├── special_tokens_map.json
    │       └── ...
    ├── xlm-roberta-large
    │   ├── sentencepiece.bpe.model
    │   ├── tokenizer_config.json
    │   └── ...
    ├── config.json
    ├── configuration.json
    ├── diffusion_pytorch_model.safetensors
    ├── models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
    ├── models_t5_umt5-xxl-enc-bf16.pth
    └── Wan2.1_VAE.pth
```
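If the base model is mirrored on the Hugging Face Hub, the whole repo can be pulled in one call (a sketch; the repo id `alibaba-pai/Wan2.1-Fun-1.3B-InP` is an assumption, follow the download link above):

```python
# hypothetical: mirror the base model repo into models/Wan2.1-Fun-1.3B-InP
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="alibaba-pai/Wan2.1-Fun-1.3B-InP",  # assumed repo id
    local_dir="models/Wan2.1-Fun-1.3B-InP",
)
```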
For the Gradio demo, we use the pretrained SAM to generate masks. For more information about the Gradio demo, please check out the README under the `hugging_face` folder.
The complete `weights` directory structure for the Gradio demo will be arranged as:

```
weights
├── transformer
│   ├── config.json
│   └── diffusion_pytorch_model.safetensors
├── cutie-base-mega.pth
├── sam_vit_h_4b8939.pth
└── download_sam_ckpt.sh
```
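For reference, generating a mask for a single frame with SAM looks roughly like this (a minimal sketch using the `segment-anything` API and the `sam_vit_h_4b8939.pth` checkpoint above; the frame path and click coordinates are hypothetical, and the demo's actual pipeline lives in the `hugging_face` folder):

```python
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# load the ViT-H SAM checkpoint listed in the weights tree above
sam = sam_model_registry["vit_h"](checkpoint="weights/sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# segment the object around a single positive click (hypothetical frame/point)
frame = cv2.cvtColor(cv2.imread("frame_0000.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(frame)
masks, scores, _ = predictor.predict(
    point_coords=np.array([[360, 240]]),
    point_labels=np.array([1]),  # 1 = foreground click
    multimask_output=False,
)
cv2.imwrite("mask_0000.png", masks[0].astype(np.uint8) * 255)
```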
We provide some examples in the `data/eval` folder. Run the following command to try it out:

```shell
python inference.py
```
```
Usage:
    python inference.py [options]

Options:
    --validation_videos    Path(s) to input videos
    --validation_masks     Path(s) to mask videos
    --validation_prompts   Text prompts (default: [""])
    --output_dir           Output directory
    --video_length         Number of frames per video (must be 16n+1)
    --sample_size          Frame size: height width (default: 480 720)
```
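For example, a full invocation might look like this (the video and mask paths are hypothetical placeholders; `--video_length 49` satisfies the 16n+1 constraint):

```shell
python inference.py \
    --validation_videos data/eval/example/video.mp4 \
    --validation_masks data/eval/example/mask.mp4 \
    --validation_prompts "" \
    --output_dir output \
    --video_length 49 \
    --sample_size 480 720
```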
We also provide an interactive demo for object removal, allowing users to select any object they wish to remove from a video. You can try the demo on Hugging Face or run it locally.
If you find our repo useful for your research, please consider citing our paper:
```bibtex
@article{miao2025rose,
  title={ROSE: Remove Objects with Side Effects in Videos},
  author={Miao, Chenxuan and Feng, Yutong and Zeng, Jianshu and Gao, Zixiang and Liu, Hantang and Yan, Yunfeng and Qi, Donglian and Chen, Xi and Wang, Bin and Zhao, Hengshuang},
  journal={arXiv preprint arXiv:2508.18633},
  year={2025}
}
```
If you have any questions, please feel free to reach out to me at weiyuchoumou526@gmail.com.
This code is based on Wan2.1-Fun-1.3B-Inpaint, and some code is borrowed from ProPainter. Thanks for their awesome work!