VMem
is a plug-and-play memory mechanism for image-set models that enables consistent scene generation.
Existing methods either rely on inpainting with explicit geometry estimation, which suffers from inaccuracies, or use limited context windows in video-based approaches, leading to poor long-term coherence. To overcome these issues, we introduce Surfel Memory of Views (VMem), which anchors past views to surface elements (surfels) they observed. This enables conditioning novel view generation on the most relevant past views rather than just the most recent ones, enhancing long-term scene consistency while reducing computational cost.
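To make the surfel-indexing idea concrete, here is a minimal sketch of the retrieval pattern described above. It is not the VMem implementation: all names (`SurfelViewMemory`, `add_view`, `retrieve`) are illustrative, and the simple radius test stands in for a real visibility check (an actual system would determine which surfels are seen from the query camera, e.g. by rendering them).

```python
import numpy as np

class SurfelViewMemory:
    """Toy surfel-indexed view memory: each surfel remembers which views saw it."""

    def __init__(self):
        self.positions = []  # surfel centers, each a (3,) float array
        self.view_ids = []   # parallel list of sets of view indices

    def add_view(self, view_id, observed_points):
        """Anchor a past view to the surfels (here: raw 3D points) it observed."""
        for p in observed_points:
            self.positions.append(np.asarray(p, dtype=np.float32))
            self.view_ids.append({view_id})

    def retrieve(self, query_position, radius=1.0, k=4):
        """Rank past views by how many surfels near the query camera they observed."""
        q = np.asarray(query_position, dtype=np.float32)
        votes = {}
        for pos, ids in zip(self.positions, self.view_ids):
            if np.linalg.norm(pos - q) < radius:
                for v in ids:
                    votes[v] = votes.get(v, 0) + 1
        return sorted(votes, key=votes.get, reverse=True)[:k]

# Example: the view anchored near the query position is retrieved first.
mem = SurfelViewMemory()
mem.add_view(0, [[0.0, 0.0, 0.5]])
mem.add_view(1, [[5.0, 0.0, 1.0]])
assert mem.retrieve([0.0, 0.0, 0.0]) == [0]
```

Conditioning generation on the views returned by `retrieve`, rather than on the most recent frames, is what lets revisited regions stay consistent without an ever-growing context window.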
conda create -n vmem python=3.10
conda activate vmem
pip install -r requirements.txt
You need to authenticate with Hugging Face to download our model weights. Once set up, our code will download them automatically on the first run. You can authenticate by running:
# This will prompt you to enter your Hugging Face credentials.
huggingface-cli login
Once authenticated, go to our model card here and fill in your information to request access.
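If you prefer to authenticate and fetch the weights from a script, the `huggingface_hub` Python library offers an equivalent flow. This is a sketch: the repo id below is a placeholder, replace it with the id shown on our model card.

```python
from huggingface_hub import login, snapshot_download

# Prompts for a token created at https://huggingface.co/settings/tokens.
login()

# Pre-download the weights into the local Hugging Face cache.
# Placeholder repo id: substitute the one from our model card.
snapshot_download(repo_id="<org>/<vmem-model-repo>")
```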
We provide a demo for you to interact with VMem. Simply run:
python app.py
This work is built on top of CUT3R, DUSt3R, and Stable Virtual Camera. We thank the authors for their great work.
If you find this repository useful, please consider giving it a star ⭐ and citing our work:
@article{li2025vmem,
  title={VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory},
  author={Li, Runjia and Torr, Philip and Vedaldi, Andrea and Jakab, Tomas},
  journal={arXiv preprint arXiv:2506.18903},
  year={2025}
}