This project was developed for the Computer Vision (POVa) course at VUT FIT. It creates and trains machine-learning models for polyp segmentation in colonoscopy images.
Ľuboš Martinček, Eva Mičánková, Juraj Dedič
Create a Python environment:

```
python3 -m venv env22
```

Activate the environment:

```
source env22/bin/activate
```

Install the required packages:

```
pip install -r requirements.txt
```
Download the merged dataset:

- The prepared merged dataset is available for download here: Merged_newest.zip
- Unzip the dataset into the `./datasets/merged` folder
Alternatively, merge the datasets yourself using the `dataset_merge.py` script:

- Create a `datasets` folder
- Download the CVC-ClinicDB, Kvasir-SEG and PolypGen2021_MultiCenterData_v3 datasets into this folder, link: Datasets
- Unzip the datasets into this folder
- Run `python ./datasets/dataset_merge.py`
Dataset name | Train images | Validation images | Description |
---|---|---|---|
CVC-ClinicDB | - | 612 | - |
Kvasir-SEG | 880 | 120 | - |
PolypGen2021_MultiCenterData_v3 | 8,037 | - | - |
Dataset split | Num of images |
---|---|
train | 8,917 |
train_augmented_small_random | 26,751 |
val_kvasir | 120 |
val_clinic | 612 |
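All of the scripts below expect a dataset folder that contains `images` and `labels` subdirectories. The following is a minimal sanity-check sketch (not project code); it assumes the `./datasets/merged` location and the split names from the table above, so adjust the root path if your layout differs:

```python
# Minimal sanity check of the merged dataset layout (illustrative only).
# Assumes the ./datasets/merged root and the split names listed above;
# adjust ROOT if you unzipped the dataset elsewhere.
from pathlib import Path

ROOT = Path("./datasets/merged")
SPLITS = ["train", "train_augmented_small_random", "val_kvasir", "val_clinic"]

for split in SPLITS:
    for sub in ("images", "labels"):
        d = ROOT / split / sub
        count = len(list(d.iterdir())) if d.is_dir() else 0
        print(f"{d}: {'OK' if count else 'MISSING'} ({count} files)")
```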
Dataset augmentation (example invocations are shown below):

- Run `python augment.py` with the argument `-data path_to_dataset` (the dataset folder must contain `images` and `labels` directories).
- Run `python augment_small.py` with the argument `-data path_to_dataset`; this creates a smaller, randomized version of the dataset (same `images`/`labels` requirement).
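For example, to augment the merged train split (the path is illustrative; point `-data` at any folder containing `images` and `labels`):

```
python augment.py -data ./datasets/merged/train
python augment_small.py -data ./datasets/merged/train
```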
Augmented train split is available for download here: train_augmented_small_random.zip
Download the pretrained SAM checkpoints.

Run SAM fine-tuning:

```
python3 finetune_sam.py --cfg ./sam/configs/<YAML configuration file>
```
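For example, assuming a configuration file named `sam_vit_b.yaml` exists under `./sam/configs` (a hypothetical file name; substitute whichever YAML file your setup provides):

```
python3 finetune_sam.py --cfg ./sam/configs/sam_vit_b.yaml
```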
Run `python eval.py` with the following arguments (an example invocation is shown below):

- `--arch ["Unet", "SAM"]` (default architecture is `Unet`)
- `--model path_to_saved_model`
- `--data path_to_evaluation_dataset` (must contain `images` and `labels` directories)
- `--batch_size` (default is `1`)
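For example, to evaluate the UNet checkpoint listed below on the Kvasir validation split (paths are illustrative and assume the checkpoint and dataset locations used elsewhere in this README):

```
python eval.py --arch Unet --model ./models/final/'UNet+Resnet34+Standard (unet_segmentation_12-19_08-36).pth' --data ./datasets/merged/val_kvasir --batch_size 8
```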
Checkpoints are available for download here: Checkpoints
Model | Checkpoint | Train Dataset | Test Dataset | F1 | IoU | Description |
---|---|---|---|---|---|---|
UNet + Resnet34 | UNet+Resnet34+Standard (unet_segmentation_12-19_08-36).pth | Standard | Kvasir-SEG | 0.8564 | 0.7873 | 39 epochs |
UNet + Resnet34 | | Standard | CVC-ClinicDB | 0.7856 | 0.7134 | 39 epochs |
UNet + Resnet34 | UNet+Resnet34+Augmented (unet_segmentation_12-21_11-21) | Augmented | Kvasir-SEG | 0.8466 | 0.7729 | 41 epochs |
UNet + Resnet34 | | Augmented | CVC-ClinicDB | 0.8060 | 0.7307 | 41 epochs |
Results of fine-tuning the decoder only:
Model | Checkpoint | Train Dataset | Test Dataset | F1 | IoU | Description |
---|---|---|---|---|---|---|
SAM(md) | SAM_f_merged_newest_meta_7_md_e69_iou0.5655.pth | Standard | Kvasir-SEG | 0.7422 | 0.6481 | 69 epochs |
SAM(md) | | Standard | CVC-ClinicDB | 0.6869 | 0.6014 | |
SAM(md) | SAM_f_sam_merged_newest_meta_7_md_e97_iou0.5541.pth | Standard SAM | Kvasir-SEG | 0.7757 | 0.6896 | 97 epochs |
SAM(md) | | Standard SAM | CVC-ClinicDB | 0.6661 | 0.5859 | |
SAM(md) | SAM_f_merged_newest_A_meta_7_md_e11_iou0.5464.pth | Augmented | Kvasir-SEG | 0.7444 | 0.6530 | 11 epochs |
SAM(md) | | Augmented | CVC-ClinicDB | 0.6720 | 0.5896 | |
SAM(md) | SAM_f_sam_merged_newest_A_meta_7_md_e31_iou0.5566.pth | Augmented SAM | Kvasir-SEG | 0.7687 | 0.6851 | 31 epochs |
SAM(md) | | Augmented SAM | CVC-ClinicDB | 0.6528 | 0.5766 | |
Results of fine-tuning the encoders only:
Model | Checkpoint | Train Dataset | Test Dataset | F1 | IoU | Description |
---|---|---|---|---|---|---|
SAM(ie) | SAM_f_merged_newest_meta_7_ie_e34_iou0.7617.pth | Standard | Kvasir-SEG | 0.8787 | 0.8102 | 34 epochs |
SAM(ie) | | Standard | CVC-ClinicDB | 0.8386 | 0.7685 | |
SAM(ie) | SAM_f_sam_merged_newest_meta_7_ie_e29_iou0.7794.pth | Standard SAM | Kvasir-SEG | 0.8807 | 0.8147 | 29 epochs |
SAM(ie) | | Standard SAM | CVC-ClinicDB | 0.8590 | 0.7896 | |
SAM(ie) | SAM_f_merged_newest_A_meta_7_ie_e6_iou0.7474.pth | Augmented | Kvasir-SEG | 0.8597 | 0.7843 | 6 epochs |
SAM(ie) | | Augmented | CVC-ClinicDB | 0.8431 | 0.7650 | |
SAM(ie) | SAM_f_sam_merged_newest_A_meta_7_ie_e18_iou0.7855.pth | Augmented SAM | Kvasir-SEG | 0.8837 | 0.8166 | 18 epochs |
SAM(ie) | | Augmented SAM | CVC-ClinicDB | 0.8659 | 0.7975 | |
Results of fine-tuning the encoders and the decoder:
Model | Checkpoint | Train Dataset | Test Dataset | F1 | IoU | Description |
---|---|---|---|---|---|---|
SAM(iemd) | SAM_f_merged_newest_meta_7_iemd_e32_iou0.7528.pt | Standard | Kvasir-SEG | 0.8803 | 0.8172 | 32 epochs |
SAM(iemd) | | Standard | CVC-ClinicDB | 0.8206 | 0.7527 | |
SAM(iemd) | SAM_f_sam_merged_newest_meta_7_iemd_e52_iou0.7848.pth | Standard SAM | Kvasir-SEG | 0.8801 | 0.8178 | 52 epochs |
SAM(iemd) | | Standard SAM | CVC-ClinicDB | 0.8541 | 0.7886 | |
SAM(iemd) | SAM_merged_newest_A_meta_7_iemd_e11_iou0.7632.pth | Augmented | Kvasir-SEG | 0.8846 | 0.8196 | 11 epochs |
SAM(iemd) | | Augmented | CVC-ClinicDB | 0.8437 | 0.7736 | |
SAM(iemd) | SAM_f_sam_merged_newest_A_meta_7_iemd_e34_iou0.7976.pth | Augmented SAM | Kvasir-SEG | 0.8831 | 0.8189 | 34 epochs |
SAM(iemd) | | Augmented SAM | CVC-ClinicDB | 0.8714 | 0.8058 | |
Note: the SAM dataset versions do not contain images without polyps.
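The (md), (ie) and (iemd) suffixes follow the section headings above: mask decoder only, encoders only, or encoders and decoder. Below is a minimal sketch of how such selective fine-tuning can be set up with the `segment_anything` package; it is illustrative only (the ViT-B variant, the checkpoint path, and keeping the prompt encoder frozen are assumptions), and the project's actual training logic lives in `finetune_sam.py`:

```python
# Illustrative sketch: select which SAM components receive gradients,
# mirroring the md / ie / iemd variants reported above. Not project code.
from segment_anything import sam_model_registry

# Assumed variant and checkpoint path; use the pretrained checkpoint you downloaded.
sam = sam_model_registry["vit_b"](checkpoint="./models/sam_vit_b_01ec64.pth")

VARIANT = "iemd"  # "md": decoder only, "ie": encoders only, "iemd": encoders + decoder

for p in sam.image_encoder.parameters():
    p.requires_grad = VARIANT in ("ie", "iemd")
for p in sam.mask_decoder.parameters():
    p.requires_grad = VARIANT in ("md", "iemd")
for p in sam.prompt_encoder.parameters():
    p.requires_grad = False  # kept frozen in this sketch; the README does not state how it was handled

n_trainable = sum(p.numel() for p in sam.parameters() if p.requires_grad)
print(f"{VARIANT}: {n_trainable:,} trainable parameters")
```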
To generate outputs, run:

```
python3 ./utils/generate_images.py --sam_model ./models/final/SAM_f_sam_merged_newest_A_meta_7_iemd_e34_iou0.7976.pth --unet_model ./models/final/'UNet+Resnet34+Standard (unet_segmentation_12-19_08-36).pth' --data ./datasets/Merged_newest/val_kvasir --batch_size 8
```
Computational resources were provided by the e-INFRA CZ project (ID:90254), supported by the Ministry of Education, Youth and Sports of the Czech Republic.