
📷 CameraBench: Towards Understanding Camera Motions in Any Video

Demo GIF

SfM and VLM performance on CameraBench: Generative VLMs (evaluated with VQAScore) trail classical SfM/SLAM in pure geometry, yet they outperform discriminative VLMs that rely on CLIPScore/ITMScore and, even better, capture scene-aware semantic cues missed by SfM. After simple supervised fine-tuning (SFT) on ≈1,400 extra annotated clips, our 7B Qwen2.5-VL doubles its AP, outperforming the current best method, MegaSAM.

📰 News

  • [2025/04/26]🔥 We open-sourced our fine-tuned 7B model and the public test set: 1,000+ videos with expert labels & captions.
  • LLMs-eval integration is in progress; stay tuned!
  • 32B & 72B checkpoints are on the way.

🌍 Explore More

🔎 VQA evaluation on VLMs


🤔: Does the camera track the subject from a side view?
🤖: ✅        🙋: ✅

🤔: Does the camera only move down during the video?
🤖: ❌        🙋: ✅

🤔: Does the camera move backward while zooming in?
🤖: ❌        🙋: ✅
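
The three examples above can be scored as plain binary accuracy, treating 🤖 as the model's yes/no answer and 🙋 as the human label. The following is only a minimal sketch under that assumption: `VQAExample` and `vqa_accuracy` are hypothetical names, not part of this repository, and this is simple agreement counting rather than the VQAScore metric mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class VQAExample:
    """One yes/no question about a video (hypothetical container)."""
    question: str
    model_answer: bool  # 🤖 prediction
    human_answer: bool  # 🙋 reference label


def vqa_accuracy(examples):
    """Return the fraction of examples where the model agrees with the human label."""
    if not examples:
        raise ValueError("no examples to score")
    correct = sum(ex.model_answer == ex.human_answer for ex in examples)
    return correct / len(examples)


# The three question/answer pairs shown above.
examples = [
    VQAExample("Does the camera track the subject from a side view?", True, True),
    VQAExample("Does the camera only move down during the video?", False, True),
    VQAExample("Does the camera move backward while zooming in?", False, True),
]

print(f"accuracy: {vqa_accuracy(examples):.2f}")
```

On these three examples the model agrees with the human label once out of three times.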

🚀 Quick Start

Download test videos

python download_test_videos.py --save_dir ./your_target_folder

Get captions & labels (subset)

python download_test_data.py --save_dir ./your_target_folder
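
After both downloads finish, it can help to check what actually landed in the target folder. The sketch below assumes the videos are stored as ordinary files with common video extensions somewhere under `--save_dir`; the extension list and the `list_videos` helper are illustrative assumptions, not something the download scripts document.

```python
from pathlib import Path

# Assumed set of video extensions; adjust to whatever the download produces.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def list_videos(save_dir):
    """Return a sorted list of video file paths found recursively under save_dir."""
    root = Path(save_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} is not a directory")
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS
    )
```

For example, `len(list_videos("./your_target_folder"))` gives a quick count to compare against the expected size of the test set.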

Download fine-tuned model

# Coming soon

✏️ Citation

If you find this repository useful for your research, please cite it using the following BibTeX entry.

@article{lin2025towards,
  title={Towards Understanding Camera Motions in Any Video},
  author={Lin, Zhiqiu and Cen, Siyuan and Jiang, Daniel and Karhade, Jay and Wang, Hewei and Mitra, Chancharik and Ling, Tiffany and Huang, Yuhan and Liu, Sifan and Chen, Mingyu and Zawar, Rushikesh and Bai, Xue and Du, Yilun and Gan, Chuang and Ramanan, Deva},
  journal={arXiv preprint arXiv:2504.15376},
  year={2025},
}
