This repo is the official implementation of the extended version of our CVPR 2025 paper "Implicit Correspondence Learning for Image-to-Point Cloud Registration"
by Xinjun Li, Wenfei Yang, Jiacheng Deng, Zhixin Cheng, Xu Zhou, and Tianzhu Zhang.
The extended version primarily includes the following additions:
- We design a coarse-to-fine strategy to refine the image-to-point cloud correspondences and the camera pose, which improves performance at a lower computational cost.
- We conduct more experiments to clarify the effectiveness and limitations of the proposed method.
We will soon release a preprint of the extended paper, where you can find more details.
Please use the following command for installation.
# 1. Create a new environment (recommended)
conda create -n ICLI2P python==3.8
conda activate ICLI2P
# 2. Install vision3d following https://github.com/qinzheng93/vision3d
Since we made some modifications to the vision3d codebase — for example, the original vision3d does not support the nuScenes dataset — we provide the modified version used in our experiments. The code has been tested on Python 3.8, PyTorch 1.13.1, Ubuntu 22.04, GCC 11.3 and CUDA 11.7, but it should work with other configurations.
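For reference, a full setup following these steps looks roughly like the commands below. The torch/torchvision versions match the tested configuration above; the requirements.txt file and the setup.py develop step follow the upstream vision3d layout and are assumptions here, so adapt them if the bundled copy differs.
# Install PyTorch 1.13.1 built for CUDA 11.7 (the tested configuration)
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
# Install the remaining dependencies and build the modified vision3d
# (requirements.txt and "setup.py develop" are assumed from the upstream vision3d repo)
pip install -r requirements.txt
python setup.py develop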
We provide pre-trained weights from BaiduYun (extraction code: 54s4). Please download the latest weights and place them in the appropriate directory:
kitti/ (or nuscenes/)
└── stage_1/ (or stage_2/)
    └── workspace/
        └── vision3d-output/
            └── stage_1/ (or stage_2/)
                └── checkpoints/
Make sure to choose the correct dataset (kitti or nuscenes) and stage (stage_1 or stage_2) accordingly.
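For example, for the KITTI Stage 1 weights (the downloaded file name below is hypothetical):
# Place the downloaded Stage 1 weights for KITTI (the file name "kitti_stage_1.pth" is illustrative)
mkdir -p kitti/stage_1/workspace/vision3d-output/stage_1/checkpoints
mv kitti_stage_1.pth kitti/stage_1/workspace/vision3d-output/stage_1/checkpoints/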
You can download both the prepared KITTI and nuScenes datasets from the link provided by CorrI2P.
Our training process consists of two stages. In stage 1, we train only the GPDM (Geometric Prior-guided Overlapping Region Detection Module) using a classification loss and a frustum-pose loss for 20 epochs. In stage 2, we train the entire network for another 20 epochs, while keeping the parameters of the GPDM frozen.
The code is in `kitti/stage_1`. Use the following command for training.
CUDA_VISIBLE_DEVICES=0 python trainval.py
The code is in `kitti/stage_2`.
Save the checkpoint from Stage 1 to:
kitti/stage_2/workspace/vision3d-output/stage_2/checkpoints/
Note: Make sure to save the checkpoint under a name of the form epoch-xx.pth instead of checkpoint.pth, so that the training in Stage 2 can properly resume from the beginning.
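For example (both file names below are illustrative; use whichever Stage 1 epoch you want to initialize from):
# Copy the chosen Stage 1 checkpoint into the Stage 2 workspace; make sure the target
# name follows the epoch-xx.pth pattern (epoch 20 below is illustrative)
mkdir -p kitti/stage_2/workspace/vision3d-output/stage_2/checkpoints
cp kitti/stage_1/workspace/vision3d-output/stage_1/checkpoints/epoch-20.pth \
   kitti/stage_2/workspace/vision3d-output/stage_2/checkpoints/epoch-20.pth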
Use the following command for training.
CUDA_VISIBLE_DEVICES=0 python trainval.py --resume
The code is in `nuscenes/stage_1`. Use the following command for training.
CUDA_VISIBLE_DEVICES=0 python trainval.py
The code is in `nuscenes/stage_2`.
Save the checkpoint from Stage 1 to:
nuscenes/stage_2/workspace/vision3d-output/stage_2/checkpoints/
Note: Make sure to save the checkpoint under a name of the form epoch-xx.pth instead of checkpoint.pth (as in the KITTI example above), so that the training in Stage 2 can properly resume from the beginning.
Use the following command for training.
CUDA_VISIBLE_DEVICES=0 python trainval.py --resume
To evaluate the results of Stage 1, you can run the following command:
bash eval.sh
To evaluate the results of Stage 2, you can run the following command:
bash eval.sh
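In both cases, eval.sh is assumed to live in the corresponding stage directory alongside trainval.py; for example, for Stage 2 on KITTI:
# Evaluate Stage 2 on KITTI; pick the dataset/stage directory as in the training sections above
cd kitti/stage_2
CUDA_VISIBLE_DEVICES=0 bash eval.sh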
Our code is based on 2D3D-MATR, vision3d and CorrI2P. We thank the authors for their excellent work!
If you find our work useful, please cite:
@inproceedings{li2025implicit,
  title={Implicit Correspondence Learning for Image-to-Point Cloud Registration},
  author={Li, Xinjun and Yang, Wenfei and Deng, Jiacheng and Cheng, Zhixin and Zhou, Xu and Zhang, Tianzhu},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={16922--16931},
  year={2025}
}