We present CineTechBench, a pioneering benchmark founded on precise, manual annotation by seasoned cinematography experts across key cinematography dimensions. Our benchmark covers seven essential aspects—shot scale, shot angle, composition, camera movement, lighting, color, and focal length—and includes over 600 annotated movie images and 120 movie clips with clear cinematographic techniques.
- Video extraction script for movie clips
- Camera trajectory similarity calculation script
- Movie image link organization and documentation
- Video Question-answering evaluation script
- Image Question-answering evaluation script
- Description evaluation script
Due to copyright restrictions, we cannot distribute the movie clips and images directly. Instead, we provide instructions for downloading and preprocessing the data in our benchmark. All image links are available in the image_annotation file in our CineTechBench HF Repo.
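As a rough sketch, the image links could be fetched with a short script like the one below. The key names `image_url` and `image_name`, and the assumption that the annotation file is a JSON list of records, are hypothetical; please check the actual file and adjust them before use.

```python
import json
import urllib.request
from pathlib import Path

# Hypothetical field names: the real keys in image_annotation may differ.
URL_KEY = "image_url"
NAME_KEY = "image_name"


def plan_downloads(records, out_dir):
    """Return (url, destination) pairs for records that carry both fields."""
    out_dir = Path(out_dir)
    plan = []
    for record in records:
        url, name = record.get(URL_KEY), record.get(NAME_KEY)
        if url and name:
            # Keep only the final path component so a name cannot escape out_dir.
            plan.append((url, out_dir / Path(name).name))
    return plan


def download_all(json_path, out_dir):
    """Download every planned image, skipping files that already exist."""
    with open(json_path, encoding="utf-8") as f:
        records = json.load(f)  # assumed to be a list of dicts
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for url, dest in plan_downloads(records, out_dir):
        if not dest.exists():
            urllib.request.urlretrieve(url, str(dest))
```

Calling `download_all("/path/to/your/image_annotation.json", "/path/to/your/image_folder")` would then fill the image folder; files already on disk are skipped, so the script can be re-run after an interruption.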
Create the conda environment:
conda create -n ctbench python=3.11 -y
conda activate ctbench
Install PyTorch (e.g., for CUDA 12.4) and transformers
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu124
pip install transformers==4.51.3
Install flash-attn
pip install flash-attn
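To confirm that the environment matches the pinned versions, a small check like the following may help. It is only a sketch: the function name `check_versions` is our own, and it relies on nothing beyond the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Versions pinned in the install commands above; flash-attn is unpinned there.
EXPECTED = {"torch": "2.5.0", "transformers": "4.51.3", "flash-attn": None}


def check_versions(expected=EXPECTED):
    """Map each package to (installed_version_or_None, matches_expected)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        # Drop a local build tag such as "+cu124" before comparing.
        base = installed.split("+")[0]
        report[name] = (installed, wanted is None or base == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "ok" if ok else "check"
        print(f"{name}: {installed or 'not installed'} ({status})")
```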
To estimate camera trajectories from input videos, please prepare a separate conda environment following the instructions in MonST3R.
Image Cinematographic Tech Dimension Question Answering
We provide an example that evaluates Gemini-2.5-Pro on QA over the image dimensions (e.g., shot angle, lighting, and focal length).
python image_qa_gemini_2.5_pro.py --json_path /path/to/your/image_annotation.json --image_path /path/to/your/image_folder
Video Camera Movement Question Answering
We provide an example to evaluate Gemini-2.5-Pro on camera movement QA.
python video_qa_gemini_2.5_pro.py --json_path /path/to/your/video_annotation.json --video_path /path/to/your/video_folder
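Once predictions have been collected, QA results are often summarized as accuracy per dimension. The sketch below illustrates that aggregation only; it is not the scoring code in the evaluation scripts. It assumes each record holds a single reference answer string, and the keys `dimension`, `answer`, and `prediction` are hypothetical.

```python
from collections import defaultdict


def accuracy_by_dimension(records):
    """Return {dimension: fraction of records whose prediction matches the answer}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        dim = rec["dimension"]  # hypothetical key
        total[dim] += 1
        # Compare answers case-insensitively, ignoring surrounding whitespace.
        if rec["prediction"].strip().upper() == rec["answer"].strip().upper():
            correct[dim] += 1
    return {dim: correct[dim] / total[dim] for dim in total}
```

Exact string matching is the simplest choice; free-form model outputs would need an answer-extraction step first.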
CineTech Description
We provide code to evaluate MLLMs on description generation using the metrics from CAPability and MSCOCO; see the instructions for CAPability and COCO.
Video Camera Movement Generation
Before evaluation, first prepare the generated videos and the original film clips, then use MonST3R to estimate their camera trajectories. Arrange the result folder as follows:
- original_clips
  - result for movie clip 1
  - result for movie clip 2
- wani2v_ct
  - result for generated movie clip 1
  - result for generated movie clip 2
After preparing the camera trajectory estimation results, run eval/eval_ct.sh to summarize the results.
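Before running the summary, it can be useful to check that every original clip has a generated counterpart. The helper below is a minimal sketch, not part of the released scripts: the name `pair_results` is our own, and it assumes that matching results use the same subfolder name under `original_clips` and `wani2v_ct`.

```python
from pathlib import Path


def pair_results(root, reference="original_clips", generated="wani2v_ct"):
    """Pair result folders that share a name under both subfolders of root."""
    root = Path(root)
    ref = {p.name: p for p in (root / reference).iterdir() if p.is_dir()}
    gen = {p.name: p for p in (root / generated).iterdir() if p.is_dir()}
    # Folders present on both sides, in a stable order.
    pairs = [(ref[name], gen[name]) for name in sorted(ref.keys() & gen.keys())]
    # Folders present on only one side, which likely indicate a missing run.
    unmatched = sorted(ref.keys() ^ gen.keys())
    return pairs, unmatched
```

A non-empty `unmatched` list points to clips whose trajectory estimate is missing on one side and should be regenerated before evaluation.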
We fully respect the copyright of all films and do not use any clips for commercial purposes. Instead of distributing or hosting video content, we only provide links to publicly available, authorized sources (e.g., official studio or distributor channels). All assets are credited to their original rights holders, and our use of these links falls under fair‐use provisions for non‐commercial, academic research.
We would like to thank the contributors to Wan2.1, FramePack, CamI2V, vLLM, SGLang, LMDeploy, HunyuanVideo, HunyuanVideo-I2V, MovieNet, SkyReels-V2, MonST3R, and CAPability for their open research. We also wish to acknowledge IMDb for its comprehensive movie database and the MOVIECLIPS YouTube channel for its vast collection of high-quality clips, which were instrumental to our work.
If you have any questions, please feel free to email wangxr@bupt.edu.cn.
@misc{wang2025cinetechbenchbenchmarkcinematographictechnique,
title={CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation},
author={Xinran Wang and Songyu Xu and Xiangxuan Shan and Yuxuan Zhang and Muxi Diao and Xueyan Duan and Yanhua Huang and Kongming Liang and Zhanyu Ma},
year={2025},
eprint={2505.15145},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2505.15145},
}
By downloading, accessing, or using this dataset, you acknowledge that you have read, understood, and agree to be bound by all the terms and conditions of this agreement.
The Dataset is released under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) License.
This means you are free to copy and redistribute the material in any medium or format under the following terms:
- ATTRIBUTION: You must give appropriate credit by citing our original research paper, provide a link to the license, and indicate if any changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NON-COMMERCIAL: You may not use the Dataset for commercial purposes. This includes any use where the primary purpose is for commercial advantage or monetary compensation. The Dataset is intended for academic and research use only.
- NO DERIVATIVES: If you remix, transform, or build upon the material, you may not distribute the modified material. You are permitted to share and redistribute the Dataset only in its original, unmodified form.
This Dataset does not host or distribute any copyrighted video or image files. The Dataset consists solely of metadata (such as annotations, descriptions, and question-answer pairs) and publicly available hyperlinks to the original content, which remains on third-party platforms.
We do not claim ownership of any linked media. All rights to the original visual content belong to their respective copyright holders. Users are solely responsible for adhering to the terms of service, copyright policies, and licensing agreements of the source platforms when accessing or using the linked content.
THE DATASET IS PROVIDED "AS IS," WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE DATASET OR THE USE OR OTHER DEALINGS IN THE DATASET.
You are solely responsible for any legal liability arising from your improper use of the Dataset. We reserve the right to terminate your access to the Dataset at any time if you fail to comply with these terms.