USDNet - the Baseline of Articulate3D: Holistic Understanding of 3D Scenes as Universal Scene Description
This repository contains the official code release for the Articulate3D paper, accepted at ICCV 2025.
📄 Paper: Articulate3D (ICCV 2025)
🏁 Challenge: Track 3 at OpenSUN3D Workshop, ICCV 2025
[Project Webpage] [Paper] [Demo]
Currently released:
- The implementation of USDNet, the baseline of the Challenge.
We adapt the codebase of Mask3D, which provides a highly modularized framework for 3D semantic instance segmentation based on the MinkowskiEngine.
├── USDNet
│ ├── main_instance_segmentation_articulation.py <- the main file
│ ├── conf <- hydra configuration files
│ ├── datasets
│ │ ├── preprocessing <- folder with preprocessing scripts
│ │ │ ├── articulate3d_preprocessing_challenge.py <- preprocessing script for the challenge
│ │ ├── semseg.py <- indoor dataset
│ │ └── utils.py
│ ├── models <- USDNet model based on Mask3D
│ ├── trainer
│ │ ├── __init__.py
│ │ └── trainer.py <- train loop
│ └── utils
├── data
│ ├── processed <- folder for preprocessed datasets
│ └── raw <- folder for raw datasets
├── scripts <- train scripts
├── docs
├── README.md
├── saved <- folder that stores models and logs
└── Dockerfile <- Dockerfile for environment setup with CUDA 12
The main dependencies of the project are the following:
python: 3.10.9
cuda: 11.3
You can set up a conda environment following instructions in Mask3D.
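For reference, a minimal conda sketch is shown below. It assumes the Mask3D dependency set (PyTorch built for CUDA 11.3 plus MinkowskiEngine) and a hypothetical environment name; see the Mask3D README for the exact pinned versions and the full package list.

```bash
# Minimal sketch of a conda setup following Mask3D (environment name is arbitrary).
conda create -n usdnet python=3.10.9 -y
conda activate usdnet
# PyTorch built against CUDA 11.3:
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113
# MinkowskiEngine compiled against the same CUDA toolkit (see the Mask3D README
# for the exact commit and build flags it uses):
pip install git+https://github.com/NVIDIA/MinkowskiEngine --no-deps
```

Alternatively, the Dockerfile in the repository root provides an environment setup for CUDA 12.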
After installing the dependencies, preprocess the dataset. First, put the raw dataset in "./data/raw/articulate3d". Then run the preprocessing bash script; the preprocessed files will be saved in "./data/processed/". For efficiency, the preprocessing code downsamples the point cloud of the ScanNet++ mesh with a voxel size of 0.01 cm. Note that the evaluation in the Articulate3D challenge is based on this voxelized point cloud with the ground-truth annotations.
Note that the split files are in "./datasets/articulate3d" and should be copied to "./data/raw/articulate3d/".
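A hedged sketch of these two steps is shown below. The copy command assumes the split lists are plain .txt files under "./datasets/articulate3d", and the name of the preprocessing bash script is taken from the data tree below (preprocessing_articulate3d.sh); adjust both if your checkout differs.

```bash
# Sketch only: copy the split files into the raw data folder, then run preprocessing.
mkdir -p ./data/raw/articulate3d/splits
cp ./datasets/articulate3d/*.txt ./data/raw/articulate3d/splits/   # assumption: splits are .txt lists
bash preprocessing_articulate3d.sh                                 # writes results to ./data/processed/
```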
The structure should look like this:
├── USDNet
│ ├── data
│ │ ├── raw <- raw data
│ │ │ ├── articulate3d
│ │ │ │ ├── splits <- splits of the training, validation and test sets
│ │ │ │ │ ├── train.txt
│ │ │ │ │ ├── val.txt
│ │ │ │ │ └── test.txt
│ │ │ │ └── scans
│ │ │ │   ├── 0a5c013435
│ │ │ │   │ ├── mesh_aligned_0.05.ply <- mesh file
│ │ │ │   │ ├── 0a5c013435_parts.json <- annotation for movable and interactable part segmentation
│ │ │ │   │ └── 0a5c013435_artic.json <- annotation for articulation parameters of movable parts
│ │ │ │   ├── ...
│ │ ├── processed <- folder with data processed by preprocessing_articulate3d.sh
│ │ │ ├── articulate3d_challenge_mov <- processed data for movable part segmentation and articulation prediction
│ │ │ │ ├── train <- point clouds, colors and normals + annotations for the training set
│ │ │ │ ├── validation <- point clouds, colors and normals + annotations for the validation set
│ │ │ │ ├── test <- point clouds, colors and normals + annotations for the test set
│ │ │ │ ├── train_database.yaml <- database for the train set, used by the dataloader to locate file paths
│ │ │ │ ├── validation_database.yaml <- database for the validation set
│ │ │ │ ├── train_validation_database.yaml <- database for the train+validation set
│ │ │ │ ├── test_database.yaml <- database for the test set
│ │ │ │ ├── expand_dict <- neighboring-point annotations of movable parts, for coarse-to-fine segmentation training
│ │ │ │ └── instance_gt <- ground-truth segmentation annotations in .txt files
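After preprocessing finishes, a quick sanity check is to list the processed folder and compare it against the tree above:

```bash
# Expect train/, validation/, test/, the *_database.yaml files, expand_dict and instance_gt.
ls ./data/processed/articulate3d_challenge_mov/
```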
Movable part segmentation and articulation prediction
Step 1
Download the pretrained model of Mask3D.
Step 2
Check the notes and TODOs in "./scripts/train_mov.sh" and set the correct key and path.
Step 3
Start training for movable part segmentation and articulation parameter prediction:
bash ./scripts/train_mov.sh
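The key and path from Step 2 are set inside the script itself. The snippet below is only a hypothetical illustration of what those TODOs typically amount to (an experiment-logging key and the path to the pretrained Mask3D checkpoint from Step 1); the actual variable names in "./scripts/train_mov.sh" may differ.

```bash
# Hypothetical placeholders for the TODOs in ./scripts/train_mov.sh (names are assumptions):
export WANDB_API_KEY="your-logging-key"           # experiment-tracking key, if the script logs to wandb
PRETRAINED_CKPT="./saved/mask3d_pretrained.ckpt"  # pretrained Mask3D model downloaded in Step 1
```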
Interactable part segmentation
Step 1
Get the trained model from "Movable part segmentation and articulation prediction" and use it for training interactable part segmentation to speed up convergence.
Step 2
Check the notes and TODOs in "./scripts/train_inter.sh" and set the correct key and path.
Step 3
Start training for interactable part segmentation:
bash ./scripts/train_inter.sh
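As in the previous section, the checkpoint path from Step 1 is configured inside the script; the line below is a hypothetical placeholder, and the actual variable name in "./scripts/train_inter.sh" may differ.

```bash
# Hypothetical: point interactable-part training at the movable-part checkpoint from Step 1.
MOV_CHECKPOINT="./saved/train_mov/last.ckpt"   # path is an assumption; use your own checkpoint
```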
Inference
We provide the trained checkpoints for the two tasks here.
Run the inference scripts to evaluate the trained models and to generate the challenge submission:
bash ./scripts/infer_mov.sh
bash ./scripts/infer_inter.sh
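If you use the released checkpoints, point the inference scripts at them first. Since USDNet builds on the hydra-based Mask3D configuration, an override along the lines of the sketch below is plausible; the actual option names used inside "./scripts/infer_mov.sh" and "./scripts/infer_inter.sh" may differ.

```bash
# Sketch with Mask3D-style hydra overrides (option names are assumptions for USDNet):
python main_instance_segmentation_articulation.py \
  general.checkpoint="./saved/usdnet_mov.ckpt" \
  general.train_mode=false
```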
- Release code
- Set up challenge server
- Training code and instructions
- Checkpoints (movable: yes, interactable: not yet)
- Merge the JSON-format data loader into the data preprocessing
- Provide preprocessed data for users' convenience
@article{halacheva2024articulate3d,
title={Holistic Understanding of 3D Scenes as Universal Scene Description},
author={Anna-Maria Halacheva* and Yang Miao* and Jan-Nico Zaech and Xi Wang and Luc Van Gool and Danda Pani Paudel},
year={2024},
journal={arXiv preprint arXiv:2412.01398},
}