SpatialGen: Layout-guided 3D Indoor Scene Generation

Image-to-Scene Results	Text-to-Scene Results

TL;DR: Given a 3D semantic layout, SpatialGen can generate a 3D indoor scene conditioned on either a reference image (left) or a textual description (right) using a multi-view, multi-modal diffusion model.

✨ News

[Sep, 2025] We released the SpatialGen paper!
[Aug, 2025] Initial release of SpatialGen-1.0!

📋 Release Plan

Provide inference code of SpatialGen.
Provide training instructions for SpatialGen.
Release SpatialGen dataset.

SpatialGen Models

Model	Download
SpatialGen-1.0	🤗 HuggingFace
FLUX.1-Layout-ControlNet	🤗 HuggingFace
FLUX.1-Wireframe-dev-lora	🤗 HuggingFace

Usage

🔧 Installation

Tested with the following environment:

Python 3.10
PyTorch 2.3.1
CUDA Version 12.1

# clone the repository
git clone https://github.com/manycore-research/SpatialGen.git
cd SpatialGen

python -m venv .venv
source .venv/bin/activate

pip install -r requirements.txt
# Optional: fix the [flux inference bug](https://github.com/vllm-project/vllm/issues/4392)
pip install nvidia-cublas-cu12==12.4.5.8
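After installing, it can help to compare the local environment against the tested versions listed above. The following is a minimal sketch, not part of the SpatialGen repository: the helper names `installed_versions` and `report` are hypothetical, and a mismatch only means the setup is untested, not that it will fail.

```python
import importlib.util
import sys

# Versions listed under "Tested with the following environment" above.
TESTED = {"python": "3.10", "torch": "2.3.1", "cuda": "12.1"}


def installed_versions():
    """Return the installed Python / PyTorch / CUDA versions (None if missing)."""
    found = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch": None,
        "cuda": None,
    }
    if importlib.util.find_spec("torch") is not None:
        import torch

        # torch.__version__ may carry a local suffix such as "2.3.1+cu121".
        found["torch"] = torch.__version__.split("+")[0]
        # torch.version.cuda is None for CPU-only builds.
        found["cuda"] = torch.version.cuda
    return found


def report(found, tested=TESTED):
    """Build one line per component, flagging differences from the tested set."""
    lines = []
    for name, want in tested.items():
        have = found.get(name)
        status = "ok" if have == want else "differs"
        lines.append(f"{name}: installed={have} tested={want} ({status})")
    return lines


if __name__ == "__main__":
    print("\n".join(report(installed_versions())))
```

Run it inside the activated `.venv` so that it inspects the same interpreter and packages used for inference.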

📊 Dataset

We provide SpatialGen-Testset, which contains 48 rooms, each labeled with a 3D layout, and 4.8K rendered images (48 x 100 views, including RGB, normal, depth, and semantic maps) for MVD inference.

Inference

# Single image-to-3D Scene
bash scripts/infer_spatialgen_i2s.sh

# Text-to-image-to-3D Scene
# captions/spatialgen_testset_captions.jsonl provides text prompts in different styles for each room;
# choose a scene_id and prompt pair to run the text-to-scene experiment
bash scripts/infer_spatialgen_t2s.sh
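To choose a scene_id and prompt pair for the text-to-scene run, the captions file can be inspected with a few lines of Python. This is a minimal sketch under a stated assumption: each line of `captions/spatialgen_testset_captions.jsonl` is taken to be a JSON object with `scene_id` and `prompt` fields. Those field names are guesses based on the comment above, not a confirmed schema, and both helper functions are hypothetical.

```python
import json
from pathlib import Path


def load_captions(path):
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def prompts_for_scene(records, scene_id, id_key="scene_id", prompt_key="prompt"):
    """Collect every prompt whose record matches scene_id.

    id_key and prompt_key are ASSUMED field names; adjust them to the
    actual schema of the captions file.
    """
    return [r[prompt_key] for r in records if r.get(id_key) == scene_id]
```

For example, `load_captions("captions/spatialgen_testset_captions.jsonl")` returns the parsed records; printing the keys of the first record shows the real field names before `prompts_for_scene` is called with a chosen scene id.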

License

SpatialGen-1.0 is derived from Stable-Diffusion-v2.1, which is licensed under the CreativeML Open RAIL++-M License. FLUX.1-Layout-ControlNet and FLUX.1-Wireframe-dev-lora are licensed under the FLUX.1-dev Non-Commercial License.

Acknowledgements

We would like to thank the following projects that made this work possible:

DiffSplat | SD 2.1 | TAESD | FLUX | SpatialLM

Citation

@article{SpatialGen,
  title         = {SpatialGen: Layout-guided 3D Indoor Scene Generation},
  author        = {Fang, Chuan and Li, Heng and Liang, Yixu and Zheng, Jia and Mao, Yongsen and Liu, Yuan and Tang, Rui and Zhou, Zihan and Tan, Ping},
  journal       = {arXiv preprint},
  year          = {2025},
  eprint        = {2509.14981},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}

