🚀 OD³: Optimization-free Dataset Distillation for Object Detection

Salwa K. Al Khatib¹*, Ahmed ElHagry¹*, Shitong Shao²,¹*, Zhiqiang Shen¹

¹MBZUAI ²HKUST (Guangzhou) *Equal contributors

🧠 Abstract

Training large neural networks on large-scale datasets requires substantial computational resources, particularly for dense prediction tasks such as object detection. Although dataset distillation (DD) has been proposed to alleviate these demands by synthesizing compact datasets from larger ones, most existing work focuses solely on image classification, leaving the more complex detection setting largely unexplored. In this paper, we introduce OD³, a novel optimization-free dataset distillation framework specifically designed for object detection. Our approach involves two stages: first, a candidate selection process in which object instances are iteratively placed at suitable locations in the synthesized images, and second, a candidate screening process in which a pre-trained observer model removes low-confidence objects. We evaluate our data synthesis framework on MS COCO and PASCAL VOC, two popular detection datasets, with compression ratios ranging from 0.25% to 5%. Compared to the only prior dataset distillation method for detection and to conventional core set selection methods, OD³ delivers superior accuracy and establishes new state-of-the-art results, surpassing the prior best method by more than 14% on COCO mAP₅₀ at a compression ratio of 1.0%.

⚙️ Installation

The code has been tested with Python 3.9, CUDA 11.3, and PyTorch 1.12.1.

Follow the official guide to set up the OpenMMLab environment.

🎯 Pre-trained Observer

Download the checkpoints for FasterRCNN-R101 and RetinaNet-R101 and place them as follows:

./mmdetection/checkpoints/
├── faster_rcnn_r101_fpn_2x_coco_bbox_mAP-0.398_20200504_210455-1d2dac9c.pth
└── retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth
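Before running distillation, it can help to confirm that both checkpoints are where the layout above expects them. The following is a minimal sketch, not part of the repository: `check_checkpoints` and the `CKPT_DIR` variable are hypothetical names introduced for illustration, and the file names are copied from the tree above.

```shell
# Hypothetical helper (not shipped with the repo): report whether each
# expected observer checkpoint exists in the given directory.
CKPT_DIR="${CKPT_DIR:-./mmdetection/checkpoints}"

check_checkpoints() {
    dir="$1"
    missing=0
    for f in \
        faster_rcnn_r101_fpn_2x_coco_bbox_mAP-0.398_20200504_210455-1d2dac9c.pth \
        retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth
    do
        if [ -f "$dir/$f" ]; then
            echo "ok: $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    # Non-zero status if at least one checkpoint was not found.
    return "$missing"
}

check_checkpoints "$CKPT_DIR" || echo "Download the missing checkpoints into $CKPT_DIR"
```

Run it from the repository root, or set `CKPT_DIR` to point at a different checkpoint directory.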

🗂️ Dataset

COCO: train2017 | val2017
Make sure to set the data_root argument in the config file you use (e.g., mmdetection/configs/dd/data_synthesis/data_synthesis_faster-rcnn_r101_fpn_coco.py) to the path of your downloaded COCO dataset.
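To find the line that needs editing, one option is to search the config for `data_root`. This is only a sketch: `show_data_root` is a hypothetical helper name, and it assumes the config mentions `data_root` literally, as the instruction above suggests.

```shell
# Hypothetical helper: print every line (with its line number) that
# mentions data_root in the given config file.
show_data_root() {
    if [ -f "$1" ]; then
        grep -n 'data_root' "$1" || echo "no data_root entry found in $1"
    else
        echo "config not found: $1"
    fi
}

show_data_root "mmdetection/configs/dd/data_synthesis/data_synthesis_faster-rcnn_r101_fpn_coco.py"
```

Edit the reported line so that it points at your local COCO directory.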

🔬 Distillation

To distill the COCO dataset into a condensed version using OD³, run the following script with these arguments: output_dir (where to save the condensed COCO), original_dir (the path of the downloaded MS COCO), IPD (images per dataset, which sets the compression ratio), and, optionally, model (the observer model). Example arguments: coco-1percent/ data/ms-coco/ 1184 retinanet.

sh scripts/data_synthesis.sh {output_dir} {original_dir} {IPD} {model (optional)}
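As a concrete illustration, the example arguments given above can be filled into the command as follows. The variable names are introduced here only for readability, and the guard around the call is an addition for illustration; the script itself takes the four positional arguments shown in the template.

```shell
# Example values taken from the README text; adjust them to your setup.
OUTPUT_DIR="coco-1percent/"   # where the condensed COCO is written
ORIGINAL_DIR="data/ms-coco/"  # path of the downloaded MS COCO
IPD=1184                      # images per dataset
MODEL="retinanet"             # optional observer model argument

echo "sh scripts/data_synthesis.sh $OUTPUT_DIR $ORIGINAL_DIR $IPD $MODEL"

# Only launch the script when it is actually present (repository root).
if [ -f scripts/data_synthesis.sh ]; then
    sh scripts/data_synthesis.sh "$OUTPUT_DIR" "$ORIGINAL_DIR" "$IPD" "$MODEL"
else
    echo "scripts/data_synthesis.sh not found; run this from the repository root"
fi
```

Omitting the last argument leaves the choice of observer model to the script's default.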

🙏 Acknowledgement

This codebase is built on mmdetection.

📖 Citation

If you find our work useful, please cite it:

@article{alkhatib2024od3,
  title={OD3: Optimization-free Dataset Distillation for Object Detection},
  author={Al Khatib, Salwa K. and ElHagry, Ahmed and Shao, Shitong and Shen, Zhiqiang},
  journal={arXiv preprint arXiv:2506.01942},
  year={2025}
}
