English | 简体中文
Ao Liang, Lingdong Kong, Dongyue Lu, Youquan Liu, Jian Fang, Huaici Zhao, Wei Tsang Ooi
This work focuses on the practical yet challenging task of 3D object detection from heterogeneous robot platforms: Vehicle, Drone, and Quadruped. To achieve strong generalization ability, we contribute:
- The first dataset for multi-platform 3D object detection, comprising more than 51,000 LiDAR frames and over 250,000 meticulously annotated 3D bounding boxes.
- A cross-platform 3D domain adaptation framework, effectively transferring capabilities from vehicles to other platforms by integrating geometric and feature-level representations.
- A comprehensive benchmark study of state-of-the-art 3D object detectors on cross-platform scenarios.
If you find this work helpful for your research, please kindly consider citing our paper:
```bibtex
@inproceedings{liang2025pi3det,
  title     = {Perspective-Invariant 3D Object Detection},
  author    = {Ao Liang and Lingdong Kong and Dongyue Lu and Youquan Liu and Jian Fang and Huaici Zhao and Wei Tsang Ooi},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2025},
}
```
- [07/2025] - The Pi3DET dataset has been extended to Track 5: Cross-Platform 3D Object Detection of the RoboSense Challenge at IROS 2025. See the track homepage and GitHub repo for more details.
- [07/2025] - The project page is online. 🚀
- [07/2025] - This work has been accepted to ICCV 2025. See you in Honolulu! 🌸
- Installation
- Data Preparation
- Getting Started
- Model Zoo
- Pi3DET Benchmark
- TODO List
- License
- Acknowledgements
For details related to installation and environment setups, kindly refer to INSTALL.md.
Kindly refer to our HuggingFace Dataset 🤗 page for more details.
To learn more about the usage of this codebase, kindly refer to GET_STARTED.md.
To be updated.
We observe significant cross-platform geometric discrepancies in ego‑motion jitter, point‑cloud elevation distributions, and target pitch‑angle distributions across vehicle, quadruped, and drone platforms, which hinder single‑platform model generalization.
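As an illustration, the per-platform cues mentioned above can be quantified directly from raw frames. The sketch below assumes points arrive as an (N, 3) NumPy array and boxes as (M, 7) rows of (x, y, z, l, w, h, yaw); both layouts are assumptions for this example, not the actual Pi3DET annotation format.

```python
import numpy as np

def geometry_stats(points, boxes):
    """Summarize per-frame geometric cues that differ across platforms.

    points: (N, 3) LiDAR points in the ego frame (assumed layout).
    boxes:  (M, 7) 3D boxes as (x, y, z, l, w, h, yaw); only the
            centers are used here (assumed layout).
    """
    # Point-cloud elevation distribution: a drone mostly sees points far
    # below its sensor, while a vehicle or quadruped sees them near
    # ground level.
    z = points[:, 2]
    elev_mean, elev_std = float(z.mean()), float(z.std())

    # Target pitch angle: elevation angle from the sensor origin to each
    # box center, measured against the horizontal plane (degrees).
    centers = boxes[:, :3]
    horiz = np.linalg.norm(centers[:, :2], axis=1)
    pitch_deg = np.degrees(np.arctan2(centers[:, 2], horiz))

    return {"elev_mean": elev_mean, "elev_std": elev_std,
            "pitch_deg": pitch_deg}
```

Comparing these statistics per platform makes the elevation and pitch-angle shifts between vehicle, drone, and quadruped data concrete.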
Pi3DET‑Net employs a two‑stage adaptation pipeline. The Pre‑Adaptation stage uses random jitter and virtual poses to learn and align global geometric transformations; the Knowledge Adaptation stage leverages geometry‑aware descriptors and KL‑based probabilistic feature alignment to synchronize feature distributions across platforms.
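The probabilistic feature alignment idea can be sketched as a channel-wise Gaussian KL term between source (vehicle) and target platform features. This is a generic stand-in operating on assumed (B, C) feature tensors, not the exact Pi3DET-Net loss:

```python
import torch

def kl_feature_alignment(src_feats, tgt_feats, eps=1e-6):
    """Channel-wise Gaussian KL alignment between platform features.

    src_feats, tgt_feats: (B, C) feature tensors from the source
    (vehicle) and target (drone/quadruped) branches -- an assumed
    interface for this sketch. Each channel is modeled as a 1-D
    Gaussian; penalizing KL(target || source) pulls the target
    feature statistics toward the source distribution.
    """
    mu_s, var_s = src_feats.mean(0), src_feats.var(0) + eps
    mu_t, var_t = tgt_feats.mean(0), tgt_feats.var(0) + eps
    # Closed-form KL between two univariate Gaussians, averaged
    # over channels.
    kl = 0.5 * (torch.log(var_s / var_t)
                + (var_t + (mu_t - mu_s) ** 2) / var_s - 1.0)
    return kl.mean()
```

The term vanishes when both platforms produce identically distributed features and grows with mean or variance mismatch, which is the behavior a distribution-alignment objective needs.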
| Platform | Condition | Sequence | # of Frames | # of Points (M) | # of Vehicles | # of Pedestrians |
|---|---|---|---|---|---|---|
| Vehicle (8) | Daytime (4) | city_hall | 2,982 | 26.61 | 19,489 | 12,199 |
| | | penno_big_loop | 3,151 | 33.29 | 17,240 | 1,886 |
| | | rittenhouse | 3,899 | 49.36 | 11,056 | 12,003 |
| | | ucity_small_loop | 6,746 | 67.49 | 34,049 | 34,346 |
| | Nighttime (4) | city_hall | 2,856 | 26.16 | 12,655 | 5,492 |
| | | penno_big_loop | 3,291 | 38.04 | 8,068 | 106 |
| | | rittenhouse | 4,135 | 52.68 | 11,103 | 14,315 |
| | | ucity_small_loop | 5,133 | 53.32 | 18,251 | 8,639 |
| | Summary (Vehicle) | | 32,193 | 346.95 | 131,911 | 88,986 |
| Drone (7) | Daytime (4) | penno_parking_1 | 1,125 | 8.69 | 6,075 | 115 |
| | | penno_parking_2 | 1,086 | 8.55 | 5,896 | 340 |
| | | penno_plaza | 678 | 5.60 | 721 | 65 |
| | | penno_trees | 1,319 | 11.58 | 657 | 160 |
| | Nighttime (3) | high_beams | 674 | 5.51 | 578 | 211 |
| | | penno_parking_1 | 1,030 | 9.42 | 524 | 151 |
| | | penno_parking_2 | 1,140 | 10.12 | 83 | 230 |
| | Summary (Drone) | | 7,052 | 59.47 | 14,534 | 1,272 |
| Quadruped (10) | Daytime (8) | art_plaza_loop | 1,446 | 14.90 | 0 | 3,579 |
| | | penno_short_loop | 1,176 | 14.68 | 3,532 | 89 |
| | | rocky_steps | 1,535 | 14.42 | 0 | 5,739 |
| | | skatepark_1 | 661 | 12.21 | 0 | 893 |
| | | skatepark_2 | 921 | 8.47 | 0 | 916 |
| | | srt_green_loop | 639 | 9.23 | 1,349 | 285 |
| | | srt_under_bridge_1 | 2,033 | 28.95 | 0 | 1,432 |
| | | srt_under_bridge_2 | 1,813 | 25.85 | 0 | 1,463 |
| | Nighttime (2) | penno_plaza_lights | 755 | 11.25 | 197 | 52 |
| | | penno_short_loop | 1,321 | 16.79 | 904 | 103 |
| | Summary (Quadruped) | | 12,300 | 156.75 | 5,982 | 14,551 |
| All Three Platforms (25) | Summary (All) | | 51,545 | 563.17 | 152,427 | 104,809 |
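The summary rows can be cross-checked in a few lines of Python; the per-sequence frame counts below are transcribed from the table:

```python
# Per-sequence frame counts from the Pi3DET statistics table, grouped
# by platform, used to verify the per-platform and overall summaries.
frames = {
    "Vehicle":   [2982, 3151, 3899, 6746, 2856, 3291, 4135, 5133],
    "Drone":     [1125, 1086, 678, 1319, 674, 1030, 1140],
    "Quadruped": [1446, 1176, 1535, 661, 921, 639, 2033, 1813, 755, 1321],
}
totals = {platform: sum(counts) for platform, counts in frames.items()}
assert totals["Vehicle"] == 32193
assert totals["Drone"] == 7052
assert totals["Quadruped"] == 12300
assert sum(totals.values()) == 51545  # matches Summary (All)
```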
- Initial release. 🚀
- Release the dataset for the RoboSense Challenge 2025.
- Release the code for the RoboSense Challenge 2025.
- Release the whole Pi3DET dataset.
- Release the code for the Pi3DET-Net method.
This work is released under the Apache License, Version 2.0, while some specific implementations in this codebase may be under other licenses. Kindly refer to LICENSE.md for a careful check if you are using our code for commercial purposes.
This work is developed based on the MMDetection3D codebase.
MMDetection3D is an open-source, PyTorch-based toolbox towards the next-generation platform for general 3D perception. It is part of the OpenMMLab project developed by MMLab.
Part of the benchmarked models are from the OpenPCDet and 3DTrans projects.






