This project is an extension of the menu-image-grouping project. The goal is to improve the performance of the classification algorithm by implementing a state-of-the-art model, EfficientNet, in place of VGG-16. EfficientNet-B6 increased the number of distinguishable classes to 89 menus while maintaining precision above 90% on the test set.
EfficientNet-B6_89menus
├── README.md
├── food_cleaned.csv
├── new_random_sample_out.csv
├── requirements.txt
├── best_model
│   ├── B6_89classes-54-1.20.h5
│   ├── B6_89classes_acc.png
│   ├── B6_89classes_loss.png
│   └── B6_eval.png
└── dev
    ├── 00_download_image.ipynb
    ├── 01_crop-and-clean_image.ipynb
    ├── 02_create_data_nclasses.ipynb
    ├── 03_train_model.ipynb
    └── 04_evaluation.ipynb
food_cleaned.csv
-> Table containing information on the cleaned data, including:
aesthetic_score, photo_eid, pic_url, product_id, product_name, res_id, res_name, number_of_object, bbox_ratio, real_x1, real_x2, real_y1, real_y2.

new_random_sample_out.csv
-> Table containing a random sample for testing (not used: data cleaning is needed).

B6_89classes-54-1.20.h5
-> EfficientNet-B6 weights for 89 classes.

B6_89classes_acc.png
-> Plot of training accuracy and validation accuracy.

B6_89classes_loss.png
-> Plot of training loss and validation loss.

B6_eval.png
-> Plot of the evaluation result on the test set.
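A minimal sketch of how the metadata in `food_cleaned.csv` can be loaded and inspected (the column names come from the table above; the specific filtering rules, such as keeping photos with exactly one detected object, are illustrative assumptions, not the project's exact cleaning logic):

```python
import pandas as pd

# Load the cleaned-image metadata (columns as listed above).
df = pd.read_csv("food_cleaned.csv")

# Hypothetical selection step: keep photos with exactly one detected object
# and a reasonably large bounding box relative to the image.
df = df[(df["number_of_object"] == 1) & (df["bbox_ratio"] > 0.1)]

print(df[["photo_eid", "product_name", "real_x1", "real_y1", "real_x2", "real_y2"]].head())
```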
00_download_image.ipynb
-> Download image data from Google Cloud Storage.
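A hedged sketch of the kind of download step this notebook performs, assuming the `google-cloud-storage` client and a hypothetical bucket/prefix (the actual bucket name and layout are not given in this README):

```python
from pathlib import Path
from google.cloud import storage  # pip install google-cloud-storage

# Hypothetical bucket and prefix; replace with the project's actual values.
BUCKET_NAME = "my-image-bucket"
PREFIX = "menu_images/"
OUT_DIR = Path("input_images")
OUT_DIR.mkdir(exist_ok=True)

client = storage.Client()
bucket = client.bucket(BUCKET_NAME)

# Download every blob under the prefix into the local input_images folder.
for blob in client.list_blobs(bucket, prefix=PREFIX):
    target = OUT_DIR / Path(blob.name).name
    blob.download_to_filename(str(target))
```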
01_crop-and-clean_image.ipynb
-> Select, clean, and crop images according to the bounding box. Creates food_cleaned_cropped.csv, which is used in 02_create_data_nclasses.ipynb.
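A minimal sketch of the cropping step, using Pillow and the bounding-box columns from `food_cleaned.csv` (the interpretation of `real_x1/real_y1` as the top-left corner and `real_x2/real_y2` as the bottom-right corner, in pixels, and the image file naming are assumptions):

```python
from pathlib import Path

import pandas as pd
from PIL import Image

df = pd.read_csv("food_cleaned.csv")
out_dir = Path("cropped_images")
out_dir.mkdir(exist_ok=True)

for row in df.itertuples():
    src = Path("input_images") / f"{row.photo_eid}.jpg"   # assumed file naming
    if not src.exists():
        continue
    img = Image.open(src)
    # Crop to the detected bounding box (left, upper, right, lower).
    cropped = img.crop((row.real_x1, row.real_y1, row.real_x2, row.real_y2))
    cropped.save(out_dir / src.name)
```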
02_create_data_nclasses.ipynb
-> Distribute images into train/validation/test folders, preparing the data for model training and evaluation.
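A sketch of the directory layout this step prepares: one subfolder per class under train/validation/test, the layout Keras' directory-based generators expect. The per-class cap of 450 training images is taken from this README; the source layout and split ratios are assumptions:

```python
import random
import shutil
from pathlib import Path

SRC = Path("cropped_images_by_class")   # assumed: one subfolder per menu class
DST = Path("data")
SPLITS = {"train": 0.8, "validation": 0.1, "test": 0.1}   # assumed split ratios
MAX_TRAIN_PER_CLASS = 450                                 # cap mentioned in this README

random.seed(42)
for class_dir in SRC.iterdir():
    if not class_dir.is_dir():
        continue
    images = sorted(class_dir.glob("*.jpg"))
    random.shuffle(images)
    n = len(images)
    n_train = min(int(n * SPLITS["train"]), MAX_TRAIN_PER_CLASS)
    n_val = int(n * SPLITS["validation"])
    buckets = {
        "train": images[:n_train],
        "validation": images[n_train:n_train + n_val],
        "test": images[n_train + n_val:],
    }
    for split, files in buckets.items():
        target = DST / split / class_dir.name
        target.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy(f, target / f.name)
```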
03_train_model.ipynb
-> Train the EfficientNet-B6 model.
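A hedged sketch of a transfer-learning setup for EfficientNet-B6 in `tf.keras` (image size, optimizer, augmentation, and other hyperparameters here are illustrative assumptions; the checkpoint filename pattern only mirrors the provided weight file `B6_89classes-54-1.20.h5` by analogy):

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB6

NUM_CLASSES = 89
IMG_SIZE = 528   # assumed; 528x528 is the native EfficientNet-B6 resolution

# ImageNet-pretrained backbone without the classification head.
base = EfficientNetB6(include_top=False, weights="imagenet",
                      input_shape=(IMG_SIZE, IMG_SIZE, 3), pooling="avg")

model = models.Sequential([
    base,
    layers.Dropout(0.3),                                  # assumed regularisation
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])

train_gen = tf.keras.preprocessing.image.ImageDataGenerator(horizontal_flip=True)
val_gen = tf.keras.preprocessing.image.ImageDataGenerator()

train_data = train_gen.flow_from_directory("data/train", target_size=(IMG_SIZE, IMG_SIZE),
                                           batch_size=8, class_mode="categorical")
val_data = val_gen.flow_from_directory("data/validation", target_size=(IMG_SIZE, IMG_SIZE),
                                       batch_size=8, class_mode="categorical")

# Save the best checkpoint, named by epoch and validation loss.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "B6_89classes-{epoch:02d}-{val_loss:.2f}.h5", save_best_only=True)
model.fit(train_data, validation_data=val_data, epochs=60, callbacks=[checkpoint])
```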
04_evaluation.ipynb
-> Evaluate the results on the test set.
The weights of the model are in best_model/B6_89classes-54-1.20.h5. With a confidence threshold of 0.3, the model can label 78.60% of the food images while sustaining 97.23% precision on the test set (test_photo_eid_89classes.csv).
*There is a trade-off between precision and coverage when changing the threshold.
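A sketch of how threshold-based numbers like the ones above can be computed: only predictions whose top softmax probability exceeds the threshold are accepted, coverage is the fraction of test images that receive a label, and precision is the accuracy among accepted predictions. Loading with `tf.keras.models.load_model` is an assumption; depending on how the model was saved, custom objects may be needed:

```python
import numpy as np
import tensorflow as tf

IMG_SIZE = 528   # must match the training input size
THRESHOLD = 0.3

model = tf.keras.models.load_model("best_model/B6_89classes-54-1.20.h5")

test_gen = tf.keras.preprocessing.image.ImageDataGenerator()
test_data = test_gen.flow_from_directory("data/test", target_size=(IMG_SIZE, IMG_SIZE),
                                         batch_size=8, class_mode="categorical",
                                         shuffle=False)

probs = model.predict(test_data)
pred_class = probs.argmax(axis=1)
confidence = probs.max(axis=1)
true_class = test_data.classes

accepted = confidence >= THRESHOLD
coverage = accepted.mean()                                    # share of images that get a label
precision = (pred_class[accepted] == true_class[accepted]).mean()
print(f"coverage: {coverage:.2%}, precision: {precision:.2%}")
```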
['Pizza' 'Salmon Sashimi' 'honey toast' 'กระเพาะปลา' 'กุ้งอบวุ้นเส้น'
'กุ้งเผา' 'กุ้งแช่น้ำปลา' 'ก๋วยจั๊บ' 'ก๋วยจั๊บญวน' 'ก๋วยเตี๋ยวคั่วไก่'
'ก๋วยเตี๋ยวต้มยำ' 'ก๋วยเตี๋ยวเรือ' 'ขนมจีน' 'ขนมจีบ' 'ขนมปัง'
'ขนมปังปิ้ง' 'ขาหมูเยอรมัน' 'ข้าวขาหมู' 'ข้าวคลุกกะปิ' 'ข้าวซอยไก่'
'ข้าวผัด' 'ข้าวผัดกระเทียม' 'ข้าวมันไก่' 'ข้าวหน้าเนื้อ' 'ข้าวหน้าเป็ด'
'ข้าวหมกไก่' 'ข้าวหมูกรอบ' 'ข้าวหมูแดง' 'ข้าวเหนียวมะม่วง' 'คอหมูย่าง'
'ชาบู' 'ตับหวาน' 'ติ่มซำ' 'ต้มยำ' 'ต้มเลือดหมู' 'ต้มแซ่บกระดูกอ่อน'
'ทอดมันกุ้ง' 'ทอดมันปลากราย' 'ทาโกะยากิ' 'น้ำตกหมู' 'น้ำพริกไข่ปู'
'บะหมี่แห้ง' 'ปลากระพงทอดน้ำปลา' 'ปลากระพงนึ่งมะนาว' 'ปลาหมึกผัดไข่เค็ม'
'ปอเปี๊ยะทอด' 'ปูนิ่มทอดกระเทียม' 'ปูผัดผงกะหรี่' 'ปูม้านึ่ง'
'ผักโขมอบชีส' 'ผัดไทกุ้งสด' 'ยำถั่วพลู' 'ยำปลาดุกฟู' 'ยำวุ้นเส้น'
'ยำสาหร่าย' 'ยำหมูยอ' 'ยำแซลมอน' 'ลาบ' 'สปาเก็ตตี้ขี้เมาทะเล'
'สปาเก็ตตี้คาโบนาร่า' 'สลัด' 'สเต็กหมู' 'ส้มตำ' 'หมูกรอบ' 'หมูมะนาว'
'หมูสะเต๊ะ' 'หมูแดดเดียว' 'หอยนางรม' 'หอยแครงลวก' 'ออส่วน' 'ฮะเก๋า'
'เกี๊ยวซ่า' 'เกี๊ยวทอด' 'เนื้อย่าง' 'เป็ดปักกิ่ง' 'เป็ดพะโล้' 'เป็ดย่าง'
'เย็นตาโฟ' 'แกงคั่วหอยขม' 'แกงส้มชะอมกุ้ง' 'แหนมเนือง' 'ใบเหลียงผัดไข่'
'ไก่ทอด' 'ไก่ย่าง' 'ไข่กระทะ' 'ไข่ตุ๋น' 'ไข่เจียว' 'ไส้อั่ว' 'ไอศกรีม']
- Create a test set which better represents the real data. It should include all classes in the database, labeling classes other than these 89 menus as 'อื่นๆ' (others).
- Clean the training data. There are some label mistakes in the `food_cleaned.csv` file; hence, cleaning it may increase performance (though it might not be worth the time spent).
- Add more images to classes with fewer images. The maximum number of images per class in the training set was set to 450, so ideally the training set would be balanced with 450 images in each class.
- Increase the number of classes. The performance did not drop much when scaling from 62 menus to 89 menus; thus, scaling to a larger number of menus seems feasible.
- Develop a better object detection model. The current object detection can only detect one object, and sometimes a non-target object is detected instead. There are also some errors in the bounding box coordinates.
- EfficientNet: Improving Accuracy and Efficiency through AutoML and Model Scaling
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
- EfficientNet Explained!
- Neural Network that Changes Everything - Computerphile
- How Blurs & Filters Work - Computerphile
- Understand the architecture of CNN
- A Comprehensive Hands-on Guide to Transfer Learning with Real-World Applications in Deep Learning
- How to do Transfer learning with Efficientnet
The input_images, food_cleaned.csv, new_random_sample_out.csv, and model (B6_89classes-54-1.20.h5) files are not provided in this repository, as they are the company's assets.