Generic Objects as Pose Probes for Few-shot View Synthesis (accepted by IEEE TCSVT 2025)
🚀 Official implementation of PoseProbe using PyTorch.
✨ The full code is coming soon! If you find this repository useful for your research or work, please consider starring it and citing our paper.
Radiance fields, including NeRFs and 3D Gaussians, show great potential for high-fidelity rendering and scene reconstruction, but they require a substantial number of posed images as input. COLMAP is frequently used as a preprocessing step to estimate poses. However, COLMAP needs a large number of feature matches to operate effectively, and it struggles with scenes that have sparse features, large baselines, or few views. We aim to tackle few-view NeRF reconstruction using only 3 to 6 unposed scene images, without relying on COLMAP initialization. Inspired by calibration boards in traditional pose calibration, we propose a novel approach that uses everyday objects, commonly found in both images and real life, as pose probes. Initializing the probe object as a cube, we apply a dual-branch volume rendering optimization (object NeRF and scene NeRF) to constrain the pose optimization and jointly refine the geometry. Poses are initialized incrementally between images with PnP matching, which needs only a few feature matches. In experiments, PoseProbe achieves state-of-the-art performance in pose estimation and novel view synthesis across multiple datasets. We demonstrate its effectiveness particularly in few-view and large-baseline scenes where COLMAP struggles. In ablations, using different objects in a scene yields comparable performance, showing that PoseProbe is robust to the choice of probe object. Our project page is available here.
We leverage generic objects in few-view input images as pose probes. The pose probe is automatically segmented by SAM with prompts and initialized as a cube. The method introduces no extra burden, yet it facilitates pose estimation in feature-sparse scenes.
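To make the cube initialization more concrete, the sketch below evaluates the signed distance from a point to an axis-aligned cube. It is only an illustration: representing the probe as a signed distance function, the function name `cube_sdf`, and the `half_extent` parameter are assumptions made here, not details taken from the released code.

```python
import math


def cube_sdf(point, half_extent=0.5):
    """Signed distance from `point` to an axis-aligned cube centered at the origin.

    Hypothetical helper for illustration only.
    Negative inside the cube, zero on its surface, positive outside.
    """
    # Per-axis offset from the cube faces (negative when inside along that axis).
    q = [abs(c) - half_extent for c in point]
    # Euclidean distance to the surface for points outside the cube.
    outside = math.sqrt(sum(max(v, 0.0) ** 2 for v in q))
    # Negative depth to the nearest face for points inside the cube.
    inside = min(max(q), 0.0)
    return outside + inside


if __name__ == "__main__":
    for p in [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]:
        print(p, cube_sdf(p))
```

A shape like this could serve as a starting geometry that later optimization deforms toward the real object; how the actual method refines the probe is described in the paper.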
- Few-view Reconstruction: Works with only 3 to 6 unposed images.
- No COLMAP Dependency: Completely bypasses COLMAP initialization.
- Everyday Objects as Probes: Utilizes common objects for pose estimation.
- Dual-branch Optimization: Jointly optimizes object and scene geometry.
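For readers unfamiliar with PnP-style pose initialization, the following sketch shows the quantity such methods typically minimize: the reprojection error between known 3D points and their observed 2D pixels under a candidate camera pose. It is not a PnP solver and is not taken from this repository; the function names, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), and the world-to-camera convention `X_c = R X_w + t` are all assumptions for illustration.

```python
import math


def project(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    Hypothetical helper; assumes the world-to-camera convention X_c = R @ X_w + t.
    """
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)


def mean_reprojection_error(points_w, pixels, R, t, fx, fy, cx, cy):
    """Mean pixel distance between observed pixels and projected 3D points."""
    errors = []
    for p, (u, v) in zip(points_w, pixels):
        u_hat, v_hat = project(p, R, t, fx, fy, cx, cy)
        errors.append(math.hypot(u_hat - u, v_hat - v))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Corners of a unit cube, standing in for points on a probe object.
    corners = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    t = [0.0, 0.0, 2.0]
    intrinsics = (100.0, 100.0, 50.0, 50.0)
    observed = [project(c, R, t, *intrinsics) for c in corners]
    # The pose that generated the observations reprojects with zero error.
    print(mean_reprojection_error(corners, observed, R, t, *intrinsics))
```

A pose that explains the correspondences well gives a small error, so a solver searches for the `R` and `t` that minimize it.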
python run.py --config configs/dtu_e2e/scan1.py -p test --render_video
If you find this code useful for your research, please use the following BibTeX entry:
@article{gao2024generic,
title={Generic Objects as Pose Probes for Few-Shot View Synthesis},
author={Gao, Zhirui and Yi, Renjiao and Zhu, Chenyang and Zhuang, Ke and Chen, Wei and Xu, Kai},
journal={arXiv preprint arXiv:2408.16690},
year={2024}
}