Hou In Ivan Tam, Hou In Derek Pun, Austin T. Wang, Angel X. Chang, Manolis Savva
- 2025-11-04: Released scripts for converting Holodeck output into SceneEval-compatible format!
- 2025-10-27: Release v1.1 with a new metric Opening Clearance, support for LayoutVLM and HSM, bug fixes, and more! The environment setup is now simplified and the demo is easier to run! Give it a try!
- 2025-06-27: Codebase release v1.0!
- 2025-06-10: Released SceneEval-500 dataset and v0.9 of the SceneEval codebase!
- Add documentation for the scene state format
- Provide script for downloading and processing Holodeck's assets
- Create guide for extending SceneEval with new methods and metrics
- Replace custom VLM interface with Pydantic AI
First, create and activate the conda environment:
conda activate scene_evalSceneEval requires an VLM to run certain metrics.
Metrics that DO NOT require a VLM are:
- Collision, Navigability, Out of Bounds, and Opening Clearance
Metrics that REQUIRE a VLM are:
- Object Count, Object Attribute, Object-Object Relationship, Object-Architecture Relationship, Object Support, Object Accessibility
The metrics that require a VLM use OpenAI's GPT-4o by default, so you will need an OpenAI API key to run them. (The demo below does not require this.)
Create a .env file in the root directory following the template in .env.example, and add your OpenAI API key:
OPENAI_API_KEY=<your_openai_api_key_here>
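If you want to confirm the key is picked up before launching a full evaluation, here is a minimal check. It assumes `python-dotenv` is available in your environment and that the key is read from the `.env` file in the working directory; SceneEval's own loading code may differ.

```python
# Quick sanity check that OPENAI_API_KEY is readable from .env.
# Assumes python-dotenv is installed; adapt if your setup loads the key differently.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
print("OPENAI_API_KEY found" if os.getenv("OPENAI_API_KEY") else "OPENAI_API_KEY is missing")
```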
Download the SceneEval-500 annotations from this repository's Releases page and place the annotations.csv file in the input directory. The structure should look like:
.
└── input
    ├── annotations.csv
    ├── empty_scene.json
    ├── hotel_room_1k.exr
    ├── human.glb
    └── ...
Dataset Composition:
- SceneEval-100: The first 100 entries (IDs 0-99) are manually created
- SceneEval-500: The full dataset includes 400 additional entries generated semi-automatically using a VLM
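If you only want to work with the manually created subset, a minimal sketch (assuming `pandas` is available and that `annotations.csv` stores one entry per row in ID order) is:

```python
# Split the annotations into SceneEval-100 (manually created) and the
# 400 semi-automatically generated entries. Assumes one row per entry
# in ID order; filter on the ID column instead if your workflow needs it.
import pandas as pd

annotations = pd.read_csv("input/annotations.csv")
scene_eval_100 = annotations.iloc[:100]    # IDs 0-99, manually created
scene_eval_extra = annotations.iloc[100:]  # semi-automatically generated entries
print(len(scene_eval_100), "manual entries,", len(scene_eval_extra), "generated entries")
```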
SceneEval needs 3D assets to recreate scenes from different generation methods. You only need to download the assets for the methods you want to evaluate.
For 3D-FUTURE methods (ATISS, DiffuScene, LayoutGPT, InstructScene)
- Visit the 3D-FUTURE dataset page
- Follow their download instructions
- Place the downloaded assets in `_data/3D-FUTURE-model/`
For Holodeck
Run our automated script to download and preprocess the Objathor assets Holodeck uses:
python scripts/prepare_objathor.py
# On Linux, you may see a `directory not empty` error; this is an issue in the original Objathor download script and can be ignored. Simply enter 'y' and press Enter when prompted.
For LayoutVLM
- Download their preprocessed assets from their repo.
- Unzip and place the contents in `_data/layoutvlm-objathor/`
For HSM
HSM uses assets from the Habitat Synthetic Scenes Dataset (HSSD). Some assets are compressed with KHR_texture_basisu, which is currently not supported by Blender. We provide a script to download and decompress the assets into Blender-compatible GLB files.
Prerequisites:
- Set up to clone repos from HuggingFace using SSH or HTTPS
- Install gltf-transform and ktx command line tools and ensure they are in your PATH (a quick check is sketched after this list):
  - gltf-transform via npm: `npm install -g @gltf-transform/cli`
  - ktx from the KhronosGroup/KTX-Software repository
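A quick, dependency-free way to confirm both tools are discoverable (this only checks your PATH, not the tool versions):

```python
# Check that the gltf-transform and ktx CLIs are on PATH before running
# scripts/prepare_hsm.py.
import shutil

for tool in ("gltf-transform", "ktx"):
    location = shutil.which(tool)
    print(f"{tool}: {location if location else 'NOT FOUND - add it to your PATH'}")
```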
Then run the script:
python scripts/prepare_hsm.py
Your _data/ directory should look like this if you have all assets downloaded:
_data
├── 3D-FUTURE-model
│   ├── model_info.json
│   ├── 0a0f0cf2-3a34-4ba2-b24f-34f361c36b3e
│   │   ├── raw_model.obj
│   │   ├── model.mtl
│   │   ├── texture.png
│   │   └── ...
│   └── ...
├── objathor-assets
│   ├── annotations.json
│   └── 0a0a8274693445a6b533dce7f97f747c
│       ├── 0a0a8274693445a6b533dce7f97f747c.glb
│       └── ...
├── layoutvlm-objathor
│   ├── 0a3dc72fb1bb41439005cac4a3ebb765
│   │   ├── 0a3dc72fb1bb41439005cac4a3ebb765.glb
│   │   ├── data.json
│   │   └── ...
│   └── ...
└── hssd
    ├── fpmodels.csv
    ├── glb
    │   ├── 0
    │   │   ├── 0a0b9f607345b6cee099d681f28e19e1f0a215c8.glb
    │   │   └── ...
    │   └── ...
    └── decomposed
        ├── 00a2b0f3886ccb5ffddac704f8eeec324a5e14c6
        │   ├── 00a2b0f3886ccb5ffddac704f8eeec324a5e14c6_part_1.glb
        │   ├── 00a2b0f3886ccb5ffddac704f8eeec324a5e14c6_part_2.glb
        │   └── ...
        └── ...
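As an optional sanity check before running an evaluation, the sketch below verifies that the asset roots from the tree above are in place. Only the entries for methods you actually plan to evaluate need to exist; the marker files chosen here are simply the ones shown in the tree.

```python
# Verify the expected asset locations from the directory tree above.
# Missing entries are fine if you do not plan to evaluate that method.
from pathlib import Path

expected = {
    "3D-FUTURE (ATISS, DiffuScene, LayoutGPT, InstructScene)": "_data/3D-FUTURE-model/model_info.json",
    "Holodeck (Objathor)": "_data/objathor-assets/annotations.json",
    "LayoutVLM": "_data/layoutvlm-objathor",
    "HSM (HSSD)": "_data/hssd/fpmodels.csv",
}
for method, rel_path in expected.items():
    status = "ok" if Path(rel_path).exists() else "missing"
    print(f"{method}: {rel_path} [{status}]")
```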
Try SceneEval with our provided example scenes. You do not need an OpenAI API key for this demo.
Follow the instructions in the For LayoutVLM section above to download the LayoutVLM assets.
# Copy provided example scenes to input directory
cp -r input_example/* input/
# Run SceneEval
python main.py
This will run the evaluation on five example scenes generated by LayoutVLM using the no_llm_plan evaluation plan, which runs the following metrics: Collision, Navigability, Out of Bounds, and Opening Clearance.
Results will be saved to ./output_eval.
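To see what the demo produced, you can simply list the contents of the output directory (the exact file layout depends on the evaluation plan you ran):

```python
# List everything the demo wrote to ./output_eval.
from pathlib import Path

for path in sorted(Path("output_eval").rglob("*")):
    print(path)
```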
SceneEval is built to be extensible! You can easily add new scene generation methods, evaluation metrics, and assets.
Follow this step-by-step guide to see how to add a new method that uses a new 3D asset source.
Found a bug or want to contribute a new method or metric? We'd love your help! Please open an issue or submit a pull request.
If you find SceneEval helpful in your research, please cite our work:
@article{tam2025sceneeval,
title = {{SceneEval}: Evaluating Semantic Coherence in Text-Conditioned {3D} Indoor Scene Synthesis},
author = {Tam, Hou In Ivan and Pun, Hou In Derek and Wang, Austin T. and Chang, Angel X. and Savva, Manolis},
year = {2025},
eprint = {2503.14756},
archivePrefix = {arXiv}
}
Note: When using SceneEval in your work, we encourage you to specify which version of the code and dataset you are using (e.g., commit hash, release tag, or dataset version) to ensure reproducibility and proper attribution.
This work was funded in part by the Sony Research Award Program, a CIFAR AI Chair, a Canada Research Chair, NSERC Discovery Grants, and enabled by support from the Digital Research Alliance of Canada. We thank Nao Yamato, Yotaro Shimose, and other members on the Sony team for their feedback. We also thank Qirui Wu, Xiaohao Sun, and Han-Hung Lee for helpful discussions.