BOOTPLACE: Bootstrapped Object Placement with Detection Transformers

BOOTPLACE is a paradigm that formulates object placement as a placement-by-detection problem. It first identifies regions of interest suitable for object placement by training a dedicated detection transformer on object-subtracted backgrounds with multi-object supervision. It then semantically associates each target compositing object with the detected regions based on their complementary characteristics. Through bootstrapped training on randomly object-subtracted images, it enforces meaningful placements via extensive paired data augmentation.

Check out our Project Page for more visual demos!

⏩ Updates

03/20/2025

  • Release training code and pretrained models.

06/24/2025

  • Release inference code and data.

📦 Installation

Prerequisites

  • System: The code is currently tested only on Linux.

  • Hardware: An NVIDIA GPU with at least 16GB of memory is necessary. The code has been verified on NVIDIA A6000 GPUs.

  • Software:

    • Conda is recommended for managing dependencies.
    • Python version 3.6 or higher is required.

    Create a new conda environment named BOOTPLACE and install the dependencies:

    conda env create --file=BOOTPLACE.yml
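
    Then activate the environment (named BOOTPLACE, as noted above):

    conda activate BOOTPLACE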
    

    Download the DETR-R50 pretrained model for fine-tuning here and place it at weights/detr-r50-e632da11.pth.
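
    For example, assuming the checkpoint is hosted at the official DETR release URL (the file name above matches it; verify against the repository link if in doubt):

    mkdir -p weights
    wget -P weights https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth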

🤖 Pretrained Models

We provide the following pretrained models:

Model                  Description            #Params   Download
BOOTPLACE_Cityscapes   Multiple supervision   523M      Download

📚 Dataset

We provide a large-scale street-scene vehicle placement dataset curated from Cityscapes (Download). The file structure is:

├── train
    ├── backgrounds:
        ├── imgID.png
        ├── ……
    ├── objects:
        ├── imgID:
            ├── object_name_ID.png
            ├── ……
        ├── ……
    ├── location:
        ├── imgID:
            ├── object_name_ID.txt
            ├── ……
        ├── ……
    ├── annotations.json
├── test
    ├── backgrounds:
        ├── imgID.png
        ├── ……
    ├── backgrounds_single:
        ├── imgID.png
        ├── ……
    ├── objects:
        ├── imgID:
            ├── object_name_ID.png
            ├── ……
        ├── ……
    ├── objects_single:
        ├── imgID:
            ├── object_name_ID.png
            ├── ……
        ├── ……
    ├── location:
        ├── imgID:
            ├── object_name_ID.txt
            ├── ……
        ├── ……
    ├── location_single:
        ├── imgID:
            ├── object_name_ID.txt
            ├── ……
        ├── ……
    ├── annotations.json
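
A quick way to sanity-check the extracted layout (a sketch that assumes the archive is unpacked to data/Cityscapes, the path used by the commands below):

# Count background images and per-image object folders in each split
for split in train test; do
  echo "$split: $(ls data/Cityscapes/$split/backgrounds | wc -l) backgrounds, $(ls data/Cityscapes/$split/objects | wc -l) object folders"
done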

Training

To train a model on Cityscapes:

python -m main \
    --epochs 200 \
    --batch_size 2 \
    --save_freq 10 \
    --set_cost_class 1 \
    --ce_loss_coef 1 \
    --num_queries 120 \
    --eos_coef 0.1 \
    --lr 1e-4 \
    --data_path data/Cityscapes \
    --output_dir results/Cityscapes_ckpt \
    --resume weights/detr-r50-e632da11.pth
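
The --resume flag can also continue an interrupted run from your own checkpoint instead of the DETR weights. A sketch, assuming checkpoints are written to the output directory as checkpoint.pth (as the inference command below suggests):

# Same command as above, but resuming from a previously saved checkpoint
python -m main \
    --epochs 200 \
    --batch_size 2 \
    --save_freq 10 \
    --set_cost_class 1 \
    --ce_loss_coef 1 \
    --num_queries 120 \
    --eos_coef 0.1 \
    --lr 1e-4 \
    --data_path data/Cityscapes \
    --output_dir results/Cityscapes_ckpt \
    --resume results/Cityscapes_ckpt/checkpoint.pth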

Inference

To run inference with a trained checkpoint on the Cityscapes test set:

python test.py \
    --num_queries 120 \
    --data_path data/Cityscapes \
    --pretrained_model 'results/Cityscapes_ckpt/checkpoint.pth' \
    --im_root 'data/Cityscapes/test' \
    --output_dir 'results/Cityscapes_inference'

⚖️ License

This project is licensed under the terms of the MIT license.

📜 Citation

If you find this work helpful, please consider citing our paper:

@inproceedings{zhou2025bootplace,
  title={BOOTPLACE: Bootstrapped Object Placement with Detection Transformers},
  author={Zhou, Hang and Zuo, Xinxin and Ma, Rui and Cheng, Li},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={19294--19303},
  year={2025}
}
