In this work, we propose the food object detection dataset named the Global Gastronomic Culinary Dataset (GGCD). This is a follow-up to our previous work Central Asian Food Scenes Dataset (CAFSD). The dataset is the extension of our CAFSD dataset with the Nutrition5k dataset[1]. The original Nutrition5k contains images taken from an overhead angle for approximately 3,500 dishes and four different side-angle videos for approximately 1,500 dishes. We extracted different frames from the side-angle videos and combined them with the overhead images. We annotated the Nutrition5k dataset with bounding boxes resulting in 12,839 images across 113 classes.
The original CAFSD dataset contains 21,306 images spanning 239 classes. The final combined GFSD dataset contains 34,145 images across 241 food classes. Some visual examples of a few classes are shown for the annotated Nutrition5k and CAFSD datasets in the figure below.
The images on the left illustrate the samples from the CAFSD dataset and on the right-hand side from the Nutrition5k dataset.
The table below shows the number of images per split and instances across three datasets.
Dataset | Number of Instances | Train | Valid | Test |
---|---|---|---|---|
CAFSD | 69,865 | 17,046 | 2,084 | 2,176 |
Nutrition5k | 23,445 | 10,257 | 1,272 | 1,310 |
GGCD | 102,017 | 27,303 | 3,356 | 3,486 |
The statistics of classes grouped into 18 coarse categories are shown in Figure below.
The project directory contains the following files and directories:
coco_to_yolo.py
: Script to convert annotations from COCO format to YOLO format.map_labels_yolo.py
: Script to map and update the label files for the YOLO dataset, when merging multiple datasets with YOLO label format.split_data.py
: Script to split the dataset into training, validation, and test sets.train_rtdetr.py
: Script used to train the RT-DETR (Real-Time Detection Transformer) model.train_yolo.py
: Script used to train the YOLO (You Only Look Once) model.
All dataset files and different models pre-trained on different datasets are available for download.
- Global Gastronomic Culinary Dataset (GGCD): Download GFSD.zip
- Annotated Nutrition5k Dataset: Download Nutrition5k.zip
- YOLOv8n model trained on Nutrition5k: https://issai.nu.edu.kz/wp-content/themes/issai-new/data/models/Nutrition5k/yolov8n.pt
- YOLOv8s model trained on GFSD: https://issai.nu.edu.kz/wp-content/themes/issai-new/data/models/GFSD/yolov8s.pt
- RT-DETR-x model trained on CAFSD: https://issai.nu.edu.kz/wp-content/themes/issai-new/data/models/CAFSD/rtdetr-x.pt
You can download different versions of the YOLOv8 model (n, s, m, l, x) and RT-DETR-x model for each dataset by modifying this link accordingly: https://issai.nu.edu.kz/wp-content/themes/issai-new/data/models/<DATASET>/<MODEL>
Below are the placeholders for the model names and dataset names:
- Replace
<MODEL>
with:yolov8n.pt
,yolov8s.pt
,yolov8m.pt
,yolov8l.pt
,yolov8x.pt
,rtdetr-x.pt
- Replace
<DATASET>
with:CAFSD
,GFSD
,Nutrition5k
- YOLOv8x model trained on Nutrition5k: https://issai.nu.edu.kz/wp-content/themes/issai-new/data/models/Nutrition5k/yolov8x.pt
Model Size | CAFSD (mAP50) | CAFSD (mAP50-95) | Nutrition5k (mAP50) | Nutrition5k (mAP50-95) | GFSD (mAP50) | GFSD (mAP50-95) |
---|---|---|---|---|---|---|
YOLOv8n | 0.57 | 0.487 | 0.775 | 0.673 | 0.615 | 0.529 |
YOLOv8s | 0.612 | 0.529 | 0.781 | 0.688 | 0.668 | 0.584 |
YOLOv8m | 0.652 | 0.576 | 0.787 | 0.711 | 0.698 | 0.621 |
YOLOv8l | 0.659 | 0.586 | 0.802 | 0.724 | 0.712 | 0.635 |
YOLOv8xl | 0.677 | 0.601 | 0.797 | 0.717 | 0.714 | 0.641 |
RT-DETR-x | 0.685 | 0.613 | 0.768 | 0.688 | 0.711 | 0.637 |
[1] Thames, Q., Karpur, A., Norris, W., Xia, F., Panait, L., Weyand, T., & Sim, J. (2021). Nutrition5k: Towards automatic nutritional understanding of generic food. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8903–8911).