Beyond Pillars: Advancing 3D Object Detection with Salient Voxel Enhancement of LiDAR and 4D Radar Fusion
👋 This repository represents the official implementation of the paper titled "Beyond Pillars: Advancing 3D Object Detection with Salient Voxel Enhancement of LiDAR and 4D Radar Fusion".
We present a novel voxel-based framework SVEFusion for LiDAR and 4D radar fusion object detection, utilizing a voxel-level re-weighting mechanism to suppress non-empty background voxels. Experimental results show that SVEFusion outperforms state-of-the-art methods on both the VoD and Astyx HiRes 2019 datasets.
- 2025/4/13: ✨ Code and checkpoint are released! The checkpoint is available via Google Drive.
git clone https://github.com/icdm-adteam/SVEFusion.git
cd SVEFusion
conda create --name svefusion python=3.8
conda activate svefusion
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install spconv-cu118
python setup.py develop
pip install -r requirements.txt
cd mamba
pip install -e .
The preparation for VoD dataset follows L4DR.
Please follow VoD Dataset to download dataset.
The format of how the dataset is provided:
View-of-Delft-Dataset (root)
├── lidar (kitti dataset where velodyne contains the LiDAR point clouds)
...
├── radar (kitti dataset where velodyne contains the 4D radar point clouds)
...
├── radar_3_scans (kitti dataset where velodyne contains the accumulated 4D radar point clouds of 3 scans)
...
├── radar_5_scans (kitti dataset where velodyne contains the accumulated 4D radar point clouds of 5 scans)
...
In order to train the LiDAR and 4D radar fusion model, LiDAR and 4D radar fusion data infos should be generated.
- First, create an additional folder with lidar and 4D radar point clouds in the VoD dataset directory (here we call it rlfusion_5f):
View-of-Delft-Dataset (root)
├── lidar
├── radar
├── radar_3_scans
├── radar_5_scans
├── rlfusion_5f
- Then, refer to the following structure to place the corresponding files in the rlfusion_5f folder:
rlfusion_5f
│── ImageSets
│── training
├── calib (lidar_calib)
├── image_2
├── label_2
├── lidar (lidar velodyne)
├── lidar_calib
├── pose
├── radar (single frame 4D radar velodyne)
├── radar_5f (radar_5_scans velodyne)
├── radar_calib
│── testing
... like training (except label)
Generate the data infos by running the following command:
python -m pcdet.datasets.vod.vod_dataset create_vod_infos tools/cfgs/dataset_configs/Vod_fusion.yaml
Finally, check if your VoD dataset has the following structure:
View-of-Delft-Dataset (root)
├── lidar
├── radar
├── radar_3_scans
├── radar_5_scans
├── rlfusion_5f
│── gt_database
...
│── ImageSets
...
│── training
├── calib (lidar_calib)
├── image_2
├── label_2
├── lidar (lidar velodyne)
├── lidar_calib
├── pose
├── radar (single frame 4D radar velodyne)
├── radar_5f (radar_5_scans velodyne)
├── radar_calib
│── testing
... like training (except label)
│── vod_dbinfos_train.pkl
│── vod_infos_test.pkl
│── vod_infos_train.pkl
│── vod_infos_trainval.pkl
│── vod_infos_val.pkl
Go to the tools folder:
cd tools
- To train with multiple GPUs:
bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file ${CONFIG_FILE}
- To train with a single GPU:
python train.py --cfg_file ${CONFIG_FILE}
For example, train SVEFusion with:
CUDA_VISIBLE_DEVICES=0,1 bash scripts/dist_train.sh 2 --cfg_file cfgs/VoD_models/SVEFusion.yaml
- To test a checkpoint:
python test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --ckpt ${CKPT}
For example, test an SVEFusion checkpoint with:
python test.py --cfg_file cfgs/VoD_models/SVEFusion.yaml --ckpt ../output/VoD_models/SVEFusion/default/ckpt/checkpoint_epoch_80.pth
Thank for the excellent 3D object detection codebases OpenPCDet.
Thank for the excellent spatially sparse convolution library spconv.
Thank for the excellent 4D radar dataset VoD Dataset to download dataset.