English | 简体中文
Ao Liang, Lingdong Kong, Dongyue Lu, Youquan Liu, Jian Fang, Huaici Zhao, Wei Tsang Ooi
This work focuses on the practical yet challenging task of 3D object detection from heterogeneous robot platforms: Vehicle, Drone, and Quadruped. To achieve strong generalization ability, we contribute:
- The first dataset for multi-platform 3D object detection, comprising more than 51,000 LiDAR frames and over 250,000 meticulously annotated 3D bounding boxes.
- A cross-platform 3D domain adaptation framework, effectively transferring capabilities from vehicles to other platforms by integrating geometric and feature-level representations.
- A comprehensive benchmark study of state-of-the-art 3D object detectors on cross-platform scenarios.
If you find this work helpful for your research, please kindly consider citing our paper:
```bibtex
@inproceedings{liang2025pi3det,
  title     = {Perspective-Invariant 3D Object Detection},
  author    = {Ao Liang and Lingdong Kong and Dongyue Lu and Youquan Liu and Jian Fang and Huaici Zhao and Wei Tsang Ooi},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2025},
}
```
- [07/2025] - The Pi3DET dataset has been extended to Track 5: Cross-Platform 3D Object Detection of the RoboSense Challenge at IROS 2025. See the track homepage and GitHub repo for more details.
- [07/2025] - The project page is online. 🚀
- [07/2025] - This work has been accepted to ICCV 2025. See you in Honolulu! 🌸
- Installation
- Data Preparation
- Getting Started
- Model Zoo
- Pi3DET Benchmark
- TODO List
- License
- Acknowledgements
For details related to installation and environment setups, kindly refer to INSTALL.md.
Kindly refer to our HuggingFace Dataset 🤗 page for more details.
To learn more about the usage of this codebase, kindly refer to GET_STARTED.md.
To be updated.
We observe significant cross-platform geometric discrepancies in ego‑motion jitter, point‑cloud elevation distributions, and target pitch‑angle distributions across vehicle, quadruped, and drone platforms, which hinder single‑platform model generalization.
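As an illustration, the per-platform cues mentioned above can be quantified directly from raw frames. The sketch below assumes points arrive as an (N, 3) NumPy array and boxes as (M, 7) rows of (x, y, z, l, w, h, yaw); both layouts are assumptions for this example, not the actual Pi3DET annotation format.

```python
import numpy as np

def geometry_stats(points, boxes):
    """Summarize per-frame geometric cues that differ across platforms.

    points: (N, 3) LiDAR points in the ego frame (assumed layout).
    boxes:  (M, 7) 3D boxes as (x, y, z, l, w, h, yaw); only the
            centers are used here (assumed layout).
    """
    # Point-cloud elevation distribution: a drone mostly sees points far
    # below its sensor, while a vehicle or quadruped sees them near
    # ground level.
    z = points[:, 2]
    elev_mean, elev_std = float(z.mean()), float(z.std())

    # Target pitch angle: elevation angle from the sensor origin to each
    # box center, measured against the horizontal plane (degrees).
    centers = boxes[:, :3]
    horiz = np.linalg.norm(centers[:, :2], axis=1)
    pitch_deg = np.degrees(np.arctan2(centers[:, 2], horiz))

    return {"elev_mean": elev_mean, "elev_std": elev_std,
            "pitch_deg": pitch_deg}
```

Comparing these statistics per platform makes the elevation and pitch-angle shifts between vehicle, drone, and quadruped data concrete.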
Pi3DET‑Net employs a two‑stage adaptation pipeline. The Pre‑Adaptation stage uses random jitter and virtual poses to learn and align global geometric transformations; the Knowledge Adaptation stage leverages geometry‑aware descriptors and KL‑based probabilistic feature alignment to synchronize feature distributions across platforms.
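The probabilistic feature alignment idea can be sketched as a channel-wise Gaussian KL term between source (vehicle) and target platform features. This is a generic stand-in operating on assumed (B, C) feature tensors, not the exact Pi3DET-Net loss:

```python
import torch

def kl_feature_alignment(src_feats, tgt_feats, eps=1e-6):
    """Channel-wise Gaussian KL alignment between platform features.

    src_feats, tgt_feats: (B, C) feature tensors from the source
    (vehicle) and target (drone/quadruped) branches -- an assumed
    interface for this sketch. Each channel is modeled as a 1-D
    Gaussian; penalizing KL(target || source) pulls the target
    feature statistics toward the source distribution.
    """
    mu_s, var_s = src_feats.mean(0), src_feats.var(0) + eps
    mu_t, var_t = tgt_feats.mean(0), tgt_feats.var(0) + eps
    # Closed-form KL between two univariate Gaussians, averaged
    # over channels.
    kl = 0.5 * (torch.log(var_s / var_t)
                + (var_t + (mu_t - mu_s) ** 2) / var_s - 1.0)
    return kl.mean()
```

The term vanishes when both platforms produce identically distributed features and grows with mean or variance mismatch, which is the behavior a distribution-alignment objective needs.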
| Platform | Condition | Sequence | # of Frames | # of Points (M) | # of Vehicles | # of Pedestrians |
|---|---|---|---|---|---|---|
| Vehicle (8) | Daytime (4) | city_hall | 2,982 | 26.61 | 19,489 | 12,199 |
| | | penno_big_loop | 3,151 | 33.29 | 17,240 | 1,886 |
| | | rittenhouse | 3,899 | 49.36 | 11,056 | 12,003 |
| | | ucity_small_loop | 6,746 | 67.49 | 34,049 | 34,346 |
| | Nighttime (4) | city_hall | 2,856 | 26.16 | 12,655 | 5,492 |
| | | penno_big_loop | 3,291 | 38.04 | 8,068 | 106 |
| | | rittenhouse | 4,135 | 52.68 | 11,103 | 14,315 |
| | | ucity_small_loop | 5,133 | 53.32 | 18,251 | 8,639 |
| | Summary (Vehicle) | | 32,193 | 346.95 | 131,911 | 88,986 |
| Drone (7) | Daytime (4) | penno_parking_1 | 1,125 | 8.69 | 6,075 | 115 |
| | | penno_parking_2 | 1,086 | 8.55 | 5,896 | 340 |
| | | penno_plaza | 678 | 5.60 | 721 | 65 |
| | | penno_trees | 1,319 | 11.58 | 657 | 160 |
| | Nighttime (3) | high_beams | 674 | 5.51 | 578 | 211 |
| | | penno_parking_1 | 1,030 | 9.42 | 524 | 151 |
| | | penno_parking_2 | 1,140 | 10.12 | 83 | 230 |
| | Summary (Drone) | | 7,052 | 59.47 | 14,534 | 1,272 |
| Quadruped (10) | Daytime (8) | art_plaza_loop | 1,446 | 14.90 | 0 | 3,579 |
| | | penno_short_loop | 1,176 | 14.68 | 3,532 | 89 |
| | | rocky_steps | 1,535 | 14.42 | 0 | 5,739 |
| | | skatepark_1 | 661 | 12.21 | 0 | 893 |
| | | skatepark_2 | 921 | 8.47 | 0 | 916 |
| | | srt_green_loop | 639 | 9.23 | 1,349 | 285 |
| | | srt_under_bridge_1 | 2,033 | 28.95 | 0 | 1,432 |
| | | srt_under_bridge_2 | 1,813 | 25.85 | 0 | 1,463 |
| | Nighttime (2) | penno_plaza_lights | 755 | 11.25 | 197 | 52 |
| | | penno_short_loop | 1,321 | 16.79 | 904 | 103 |
| | Summary (Quadruped) | | 12,300 | 156.75 | 5,982 | 14,551 |
| All Three Platforms (25) | Summary (All) | | 51,545 | 563.17 | 152,427 | 104,809 |
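The summary rows can be cross-checked in a few lines of Python; the per-sequence frame counts below are transcribed from the table:

```python
# Per-sequence frame counts from the Pi3DET statistics table, grouped
# by platform, used to verify the per-platform and overall summaries.
frames = {
    "Vehicle":   [2982, 3151, 3899, 6746, 2856, 3291, 4135, 5133],
    "Drone":     [1125, 1086, 678, 1319, 674, 1030, 1140],
    "Quadruped": [1446, 1176, 1535, 661, 921, 639, 2033, 1813, 755, 1321],
}
totals = {platform: sum(counts) for platform, counts in frames.items()}
assert totals["Vehicle"] == 32193
assert totals["Drone"] == 7052
assert totals["Quadruped"] == 12300
assert sum(totals.values()) == 51545  # matches Summary (All)
```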
- Initial release. 🚀
- Release the dataset for the RoboSense Challenge 2025.
- Release the code for the RoboSense Challenge 2025.
- Release the whole Pi3DET dataset.
- Release the code for the Pi3DET-Net method.
This work is released under the Apache License, Version 2.0, while some specific implementations in this codebase may be under other licenses. Kindly refer to LICENSE.md for a careful check if you are using our code for commercial purposes.
This work is developed based on the MMDetection3D codebase.
MMDetection3D is an open-source, PyTorch-based toolbox towards the next-generation platform for general 3D perception. It is part of the OpenMMLab project developed by MMLab.
Part of the benchmarked models are from the OpenPCDet and 3DTrans projects.






