Hao Li*, Minghan Qin*†, Zhengyu Zou*, Diqi He, Yongjie Zhang, Dingwen Zhang†, Junwei Han
(* indicates equal contribution, † indicates co-corresponding author)
| Webpage | Full Paper | Video |
| Preprocessed Dataset | BaiduWangpan | GoogleDrive |
| Pre-trained Models | BaiduWangpan | GoogleDrive |
| Datasets |
This repository contains the official authors' implementation associated with the paper "LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding" (arXiv 2024), which can be found here. We also provide the preprocessed datasets and pre-trained models.
We recommend Python 3.10.0 and CUDA Toolkit 12.6 as the base environment.
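Before creating the environment, a quick report of the interpreter and CUDA runtime can save a failed compile later (a minimal sketch, not part of the repo's scripts; `torch` is only available after installing the requirements):

```python
import sys

# Quick environment report (illustrative; not part of the repo's scripts).
print("python:", "%d.%d" % sys.version_info[:2], "(3.10 recommended)")
try:
    import torch  # installed later via requirements.txt
    print("torch CUDA runtime:", torch.version.cuda, "(12.x recommended)")
except ImportError:
    print("torch not installed yet")
```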
# SSH
git clone git@github.com:lifuguan/langsurf.git
cd langsurf
conda create -n langsurf python=3.10
conda activate langsurf
# compile the 3D-GS library
pip install -e submodules/diff-langsurf-rasterization
pip install -e submodules/simple-knn
pip install -e submodules/segment-anything-langsplat
# install other dependencies
pip install -r requirements.txt

In the experiments section of our paper, we primarily utilized two datasets: the LERF-OVS dataset and the ScanNet dataset.
For the LERF-OVS dataset, we expanded its existing annotations; the result can be downloaded via the following link: GoogleDrive.
For the ScanNet dataset, we also provide the corresponding COLMAP data. The full resources can be accessed through this link: GoogleDrive.
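After extracting the archives, a small script can confirm everything landed in the expected layout (a hypothetical helper; the paths come from the directory tree below, and `data` is assumed to be the extraction root):

```python
from pathlib import Path

# Hypothetical sanity check: confirm the download matches the directory tree
# below. "data" is assumed to be the extraction root; adjust if yours differs.
root = Path("data")
expected = [
    "lerf_ovs/label/ramen",
    "lerf_ovs/ramen/images",
    "lerf_ovs/ramen/sparse",
    "scannet/scene0085_00/images",
    "scannet/scene0085_00/sparse/0",
]
missing = [p for p in expected if not (root / p).is_dir()]
print("layout OK" if not missing else f"missing directories: {missing}")
```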
data
|---lerf_ovs
| |---label
| | |--- ramen
| | |--- ...
| |---ramen
| | |---images
| | | |--- ...
| | |---sparse
| | | |--- ...
| |---teatime
| | |--- ...
| |---waldo_kitchen
| | |--- ...
|---scannet
| |---scene0085_00
| | |---gt_iou
| | | |--- ...
| | |---gt_ply
| | | |--- ...
| | |---images
| | | |--- ...
| | |---sparse/0
| | | |--- ...
| |---scene0616_00
| | |--- ...

Here we use the following script to convert the GT labeled point cloud into a class-specified format (for 3D evaluation).
python scripts/scannet_ply_converter.py --input_ply {path to the ply file}
# example
python scripts/scannet_ply_converter.py --input_ply data/scannet/scene0085_00/gt_ply/scene0085_00_vh_clean_2.labels.ply

The bash file contains multiple steps, including image preprocessing, feature inference, and Gaussian training.
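The per-scene commands below can also be wrapped in a loop to queue all scenes in one shot (a convenience sketch; `train_scene.sh` is the script from this repo, and the leading `echo` makes it a dry run):

```shell
# Convenience sketch: queue all scenes (drop the leading echo to really train).
scenes="data/lerf_ovs/waldo_kitchen data/lerf_ovs/ramen data/lerf_ovs/teatime"
scenes="$scenes data/scannet/scene0085_00 data/scannet/scene0114_02"
scenes="$scenes data/scannet/scene0616_00 data/scannet/scene0617_00"
for scene in $scenes; do
  echo bash train_scene.sh "$scene"
done
```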
bash train_scene.sh data/lerf_ovs/waldo_kitchen
bash train_scene.sh data/lerf_ovs/ramen
bash train_scene.sh data/lerf_ovs/teatime
bash train_scene.sh data/scannet/scene0085_00
bash train_scene.sh data/scannet/scene0114_02
bash train_scene.sh data/scannet/scene0616_00
bash train_scene.sh data/scannet/scene0617_00

After that, the data structure should be as follows (here we take scene0085_00 in ScanNet as an example):
data
|---scannet
| |---scene0085_00
| | |---images
| | | |--- ...
| | |---sparse
| | | |--- ...
| | |---hcma_features
| | | |--- ...
| | |---hcma_features_dim3
| | | |--- ...
| | |---output
| | | |---scene0085_00_1
| | | | |---app_model
| | | | |--- ...
| | | | |---point_cloud
| | | | |--- ...
| | | | |---cfg_args
| | | | |---chkpnt40000.pth
| | | |---scene0085_00_2
| | | | |--- ...
| | | |---scene0085_00_3
| | | | |--- ...

If you already have a trained model (or are using our pre-trained model), you can skip the training process and directly render features using the following command.
bash render.sh data/lerf_ovs/waldo_kitchen
bash render.sh data/scannet/scene0085_00

For the LERF-OVS dataset, use evaluate_lerf_ovs.py to evaluate 2D mIoU and 2D localization metrics.
python eval/evaluate_lerf_ovs.py \
--dataset_name waldo_kitchen \
--output_dir eval_result

For the ScanNet dataset, use evaluate_scannet.py to evaluate 2D mIoU and evaluate_scannet_3d.py to produce 3D query point clouds and evaluate the semantic F1-score.
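For reference, both metrics reduce to standard definitions; a minimal self-contained sketch (illustrative only, not the repository's evaluation code, which operates on rendered masks and labeled point clouds):

```python
# Illustrative metric definitions -- not the repository's evaluation code.
def miou(pred, gt, num_classes):
    """Mean IoU over classes that appear in prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

def f1_score(tp, fp, fn):
    """F1 from true-positive / false-positive / false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

print(miou([0, 1, 1], [0, 1, 0], 2))   # 0.5
print(f1_score(1, 1, 1))               # 0.5
```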
python eval/evaluate_scannet.py \
--dataset_name scene0085_00 \
--output_dir eval_result
python eval/evaluate_scannet_3d.py \
--dataset_name scene0085_00 \
--output_dir eval_result

To render 2D segmentation masks and 2D heatmaps, use --generate_mask, as shown in the following script.
python eval/render_full.py --dataset_name teatime --output_dir eval_result --generate_mask

To render the 3D heatmap, use --heatmap_3d, as shown in the following script.
python eval/render_full.py --dataset_name teatime --output_dir eval_result --heatmap_3d

We provide 3D instance segmentation code with the following steps:
- Run `eval/render_full.py` with the argument `--ins_seg`, as shown in the following script. It will automatically generate all the object ply models (i.e. `teatime_ins_cookie_0_0.ply`, ...).
python eval/render_full.py --dataset_name teatime --output_dir eval_result --ins_seg

- Run `render.py` with `--ply_path` pointing to each generated object ply, as shown in the following script.

python render.py -m data/lerf_ovs/teatime/output/teatime_1 \
--include_feature --normalized \
--ply_path eval_result/teatime/point_cloud/teatime_ins_cookie_0_0.ply
python render.py -m data/lerf_ovs/teatime/output/teatime_1 \
--include_feature --normalized \
--ply_path eval_result/teatime/point_cloud/teatime_ins_cookie_0_1.ply
python render.py -m data/lerf_ovs/teatime/output/teatime_1 \
--include_feature --normalized \
--ply_path eval_result/teatime/point_cloud/teatime_ins_cookie_0_2.ply
python render.py -m data/lerf_ovs/teatime/output/teatime_1 \
--include_feature --normalized \
--ply_path eval_result/teatime/point_cloud/teatime_ins_cookie_0_all.ply

For object removal, use --remove_object, as shown in the following script.
python eval/render_full.py --dataset_name teatime --output_dir eval_result --remove_object
python render.py -m data/lerf_ovs/teatime/output/teatime_3 \
--include_feature --normalized \
--ply_path 'eval_result/teatime/point_cloud/teatime_remove_food bag_0.ply'

For object editing, run gs_edit.py, then finetune and render the edited scene:

python gs_edit.py --dataset_name scene0617_00
python train_finetune.py -m data/scannet/scene0617_00/output/scene0617_00_3
python render.py -m data/scannet/scene0617_00/output/scene0617_00_3 \
--include_feature --normalized \
--ply_path data/scannet/scene0617_00/output/scene0617_00_3/finetune/point_cloud/iteration_43000/finetune.ply
For object adding, use add.py with an extracted object ply, then render:

python add.py --dataset_name waldo_kitchen --input_object_ply 'eval_result/teatime/point_cloud/teatime_food bag_0.ply'
python render.py -m data/lerf_ovs/waldo_kitchen/output/waldo_kitchen_3 \
--include_feature --normalized \
--ply_path 'data/lerf_ovs/waldo_kitchen/output/waldo_kitchen_3/point_cloud/iteration_40000/add_teatime_food bag_0.ply'
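Note that several generated ply names contain spaces (e.g. `teatime_food bag_0.ply`), so the paths must stay quoted exactly as in the commands above; a small reminder sketch (the `echo` only prints the command it would run):

```shell
# Paths with spaces must stay quoted, or the scripts receive two arguments.
ply='eval_result/teatime/point_cloud/teatime_food bag_0.ply'
echo python add.py --dataset_name waldo_kitchen --input_object_ply "$ply"
```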