PIBAdb Ultralytics: Polyp Detection with Ultralytics Models

1. Introduction

This repository contains several utilities to train, evaluate, and apply object detection models based on Ultralytics (such as YOLO and RT-DETR) on the PIBAdb Cohort, with the goal of detecting polyps in colonoscopy videos.

To use this repository, you must request access to the PIBAdb Cohort through the Biobank of the Instituto de Investigaciones Sanitarias Galicia Sur.

To get started, make sure you have Python 3.10 or higher installed. Then, install the required dependencies by running:

pip install .

Or, to install the exact dependency versions we used:

pip install -r requirements.txt

The following example workflow prepares multiple datasets derived from PIBAdb and uses them to evaluate several Ultralytics models. A base dataset is created using only polyp images and split into training (60%), validation (20%), and test (20%) subsets. Additional versions are generated by adding 2%, 5%, 10%, and 15% of images without polyps to this base dataset.
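
The split and the size of the extended variants can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function names and file names are hypothetical, the split is a plain shuffled split (the actual stratify module may group or stratify images differently), and each percentage is assumed to be relative to the size of the base dataset.

```python
import random


def split_60_20_20(items, seed=0):
    """Shuffle items and split them into train/val/test (60%/20%/20%)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 60 // 100
    n_val = len(shuffled) * 20 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remaining images
    return train, val, test


def not_polyp_count(base_size, percentage):
    """Images without polyps to add (assumption: percentage of the base size)."""
    return round(base_size * percentage / 100)


# Hypothetical file names, for illustration only.
polyp_images = [f"img_{i:04d}.png" for i in range(1000)]
train, val, test = split_60_20_20(polyp_images)
extra = {p: not_polyp_count(len(polyp_images), p) for p in (2, 5, 10, 15)}
```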

2. Data Preparation

The data preparation pipeline consists of the following steps:

Image listing: Generates the files images.tsv (polyp images) and images_not_polyp.tsv (images without polyps) from the PIBAdb database.

python -m piba_ultralytics.data_preparation.list_images --user piba --password pibapass --database pibaexported --output-dir data

Dataset creation: Generates the training, validation, and test subsets to create a dataset in Ultralytics format. Then, several dataset variations are created by incorporating different proportions of images without polyps (2%, 5%, 10%, 15%). In addition, dataset versions without classes are generated for use in the test evaluations. The <PATH TO PIBA MEDIA DIRECTORY> marker should be replaced with the path to the directory that contains the PIBAdb media files (PNG and MP4 files).

python -m piba_ultralytics.data_preparation.stratify --images-path <PATH TO PIBA MEDIA DIRECTORY>

Test video ID extraction: Lists video IDs associated with test split images for the base and extended datasets. The <PIBA USER> and <PIBA PASSWORD> markers should be replaced with your credentials to access the local PIBAdb database.

python -m piba_ultralytics.data_preparation.list_test_videos --user <PIBA USER> --password <PIBA PASSWORD> --input-dir datasets/piba/labels/test --output data/test-video-ids.tsv
python -m piba_ultralytics.data_preparation.list_test_videos --user <PIBA USER> --password <PIBA PASSWORD> --input-dir datasets/piba-not-polyp-2/labels/test --output data/test-video-ids-not-polyp-2.tsv
python -m piba_ultralytics.data_preparation.list_test_videos --user <PIBA USER> --password <PIBA PASSWORD> --input-dir datasets/piba-not-polyp-5/labels/test --output data/test-video-ids-not-polyp-5.tsv
python -m piba_ultralytics.data_preparation.list_test_videos --user <PIBA USER> --password <PIBA PASSWORD> --input-dir datasets/piba-not-polyp-10/labels/test --output data/test-video-ids-not-polyp-10.tsv
python -m piba_ultralytics.data_preparation.list_test_videos --user <PIBA USER> --password <PIBA PASSWORD> --input-dir datasets/piba-not-polyp-15/labels/test --output data/test-video-ids-not-polyp-15.tsv

Video segment metadata extraction: Generates metadata for the test video segments using the video ID files created in the previous step. This process enables comprehensive evaluation both at frame and video level. The <PIBA USER> and <PIBA PASSWORD> markers should be replaced with your credentials to access the local PIBAdb database.

python -m piba_ultralytics.data_preparation.list_test_video_segments --user <PIBA USER> --password <PIBA PASSWORD> --video-ids-file data/test-video-ids.tsv --output-file data/test-video-segments.tsv
python -m piba_ultralytics.data_preparation.list_test_video_segments --user <PIBA USER> --password <PIBA PASSWORD> --video-ids-file data/test-video-ids-not-polyp-2.tsv --output-file data/test-video-segments-not-polyp-2.tsv
python -m piba_ultralytics.data_preparation.list_test_video_segments --user <PIBA USER> --password <PIBA PASSWORD> --video-ids-file data/test-video-ids-not-polyp-5.tsv --output-file data/test-video-segments-not-polyp-5.tsv
python -m piba_ultralytics.data_preparation.list_test_video_segments --user <PIBA USER> --password <PIBA PASSWORD> --video-ids-file data/test-video-ids-not-polyp-10.tsv --output-file data/test-video-segments-not-polyp-10.tsv
python -m piba_ultralytics.data_preparation.list_test_video_segments --user <PIBA USER> --password <PIBA PASSWORD> --video-ids-file data/test-video-ids-not-polyp-15.tsv --output-file data/test-video-segments-not-polyp-15.tsv
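
The last two steps repeat the same command for the base dataset and the four extended datasets, so they can be generated in a loop. The sketch below only builds and prints the command lines, using the module names, flags, and paths shown above; build_commands is a hypothetical helper, not part of the repository. It emits both commands for one dataset before moving to the next, which should be equivalent because each segment listing only reads the video ID file of its own dataset.

```python
import shlex

# Dataset variants used in this README: the base dataset ("") and the four
# extended datasets with images without polyps.
SUFFIXES = ["", "-not-polyp-2", "-not-polyp-5", "-not-polyp-10", "-not-polyp-15"]


def build_commands(user, password):
    """Return the ten command lines (two per dataset variant) as argument lists."""
    commands = []
    for suffix in SUFFIXES:
        ids_file = f"data/test-video-ids{suffix}.tsv"
        commands.append([
            "python", "-m", "piba_ultralytics.data_preparation.list_test_videos",
            "--user", user, "--password", password,
            "--input-dir", f"datasets/piba{suffix}/labels/test",
            "--output", ids_file,
        ])
        commands.append([
            "python", "-m", "piba_ultralytics.data_preparation.list_test_video_segments",
            "--user", user, "--password", password,
            "--video-ids-file", ids_file,
            "--output-file", f"data/test-video-segments{suffix}.tsv",
        ])
    return commands


for command in build_commands("<PIBA USER>", "<PIBA PASSWORD>"):
    print(shlex.join(command))
```

To execute the commands instead of printing them, each argument list can be passed to subprocess.run(command, check=True).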

3. Model Evaluation

The model evaluation process includes several stages:

Training on the base dataset: Multiple models (YOLOv9, YOLOv10, YOLOv11, YOLOv12, and RT-DETR) are trained on the base dataset containing only polyp images.

python -m piba_ultralytics.evaluation.train_models_on_datasets \
  --models-names yolo9t yolo9e yolo10n yolo10x yolo11n yolo11x yolo12n yolo12x rt-detr-l rt-detr-x \
  --models YOLO YOLO YOLO YOLO YOLO YOLO YOLO YOLO RTDETR RTDETR \
  --models-weights yolov9t.pt yolov9e.pt yolov10n.pt yolov10x.pt yolo11n.pt yolo11x.pt yolo12n.pt yolo12x.pt rtdetr-l.pt rtdetr-x.pt \
  --dataset-yaml-paths datasets/piba.yaml \
  --output-path results
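
The three list arguments appear to be matched by position, so each list needs the same number of entries in the same order. The sketch below pairs them to make the correspondence explicit; it assumes positional pairing (suggested by the command above, not confirmed against the script's source) and only uses the values shown in the command.

```python
names = ["yolo9t", "yolo9e", "yolo10n", "yolo10x", "yolo11n",
         "yolo11x", "yolo12n", "yolo12x", "rt-detr-l", "rt-detr-x"]
classes = ["YOLO"] * 8 + ["RTDETR"] * 2
weights = ["yolov9t.pt", "yolov9e.pt", "yolov10n.pt", "yolov10x.pt", "yolo11n.pt",
           "yolo11x.pt", "yolo12n.pt", "yolo12x.pt", "rtdetr-l.pt", "rtdetr-x.pt"]

# A length mismatch would silently misalign models and weights, so check first.
if not (len(names) == len(classes) == len(weights)):
    raise ValueError("--models-names, --models and --models-weights must have equal lengths")

runs = list(zip(names, classes, weights))
for name, model_class, weight_file in runs:
    print(f"{name}: {model_class} initialized from {weight_file}")
```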

Testing the best models: The best weights (best.pt) obtained for each trained model are tested on the validation and test subsets of the base dataset.

python -m piba_ultralytics.evaluation.test_models_on_datasets \
  --models-names yolo9t yolo9e yolo10n yolo10x yolo11n yolo11x yolo12n yolo12x rt-detr-l rt-detr-x \
  --models YOLO YOLO YOLO YOLO YOLO YOLO YOLO YOLO RTDETR RTDETR \
  --models-weights results/piba/train/yolo9t/weights/best.pt \
    results/piba/train/yolo9e/weights/best.pt \
    results/piba/train/yolo10n/weights/best.pt \
    results/piba/train/yolo10x/weights/best.pt \
    results/piba/train/yolo11n/weights/best.pt \
    results/piba/train/yolo11x/weights/best.pt \
    results/piba/train/yolo12n/weights/best.pt \
    results/piba/train/yolo12x/weights/best.pt \
    results/piba/train/rt-detr-l/weights/best.pt \
    results/piba/train/rt-detr-x/weights/best.pt \
  --dataset-yaml-paths datasets/piba.yaml \
  --output-path results \
  --val-summary-path results/bests-val.tsv \
  --test-summary-path results/bests-test.tsv

Evaluation on datasets with images without polyps: The best model (RT-DETR-X) is retrained on datasets that include 2%, 5%, 10%, and 15% of images without polyps.

python -m piba_ultralytics.evaluation.train_models_on_datasets \
  --models-names rt-detr-x \
  --models RTDETR \
  --models-weights rtdetr-x.pt \
  --dataset-yaml-paths datasets/piba-not-polyp-2.yaml datasets/piba-not-polyp-5.yaml datasets/piba-not-polyp-10.yaml datasets/piba-not-polyp-15.yaml \
  --output-path results/rt-detr-x/train

Not polyp testing: The models trained on the extended datasets are tested on their corresponding validation and test subsets to compare performance.

python -m piba_ultralytics.evaluation.test_models_on_datasets \
  --models-names rt-detr-x-not-polyp-0 rt-detr-x-not-polyp-2 rt-detr-x-not-polyp-5 rt-detr-x-not-polyp-10 rt-detr-x-not-polyp-15 \
  --models RTDETR RTDETR RTDETR RTDETR RTDETR \
  --models-weights results/piba/train/rt-detr-x/weights/best.pt \
    results/rt-detr-x/train/piba-not-polyp-2/weights/best.pt \
    results/rt-detr-x/train/piba-not-polyp-5/weights/best.pt \
    results/rt-detr-x/train/piba-not-polyp-10/weights/best.pt \
    results/rt-detr-x/train/piba-not-polyp-15/weights/best.pt \
  --dataset-yaml-paths datasets/piba-no-classes-not-polyp-0.yaml \
    datasets/piba-no-classes-not-polyp-2.yaml \
    datasets/piba-no-classes-not-polyp-5.yaml \
    datasets/piba-no-classes-not-polyp-10.yaml \
    datasets/piba-no-classes-not-polyp-15.yaml \
  --output-path results \
  --val-summary-path results/best-val-not-polyp-rt-detr-x.tsv \
  --test-summary-path results/best-test-not-polyp-rt-detr-x.tsv

Video segment evaluation: Models are applied to colonoscopy video segments. Each model is given its own confidence threshold (--conf-thresholds, listed in the same order as the models), chosen from its F1-score curve.

python -m piba_ultralytics.evaluation.test_models_on_video_segments \
  --models-names rt-detr-x-not-polyp-0 rt-detr-x-not-polyp-2 rt-detr-x-not-polyp-5 rt-detr-x-not-polyp-10 rt-detr-x-not-polyp-15 \
  --models RTDETR RTDETR RTDETR RTDETR RTDETR \
  --models-weights results/piba/train/rt-detr-x/weights/best.pt \
    results/rt-detr-x/train/piba-not-polyp-2/weights/best.pt \
    results/rt-detr-x/train/piba-not-polyp-5/weights/best.pt \
    results/rt-detr-x/train/piba-not-polyp-10/weights/best.pt \
    results/rt-detr-x/train/piba-not-polyp-15/weights/best.pt \
  --conf-thresholds 0.676 0.615 0.576 0.581 0.560 \
  --metadata-path data/test-video-segments-not-polyp-15.tsv \
  --videos-path <PATH TO PIBA MEDIA DIRECTORY> \
  --stats-paths-pattern results/video-segments-MODEL-on-DATA.tsv
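
Choosing a confidence threshold from an F1-score curve amounts to picking the confidence value at which F1 is highest. The sketch below shows that selection under simple assumptions: the curve is given as three parallel lists, the helper names are hypothetical, and the numbers are made up for illustration (they are not PIBAdb results or the thresholds used above).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_confidence(confidences, precisions, recalls):
    """Return (confidence, F1) at the point of the curve where F1 is highest."""
    scores = [f1_score(p, r) for p, r in zip(precisions, recalls)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return confidences[best], scores[best]


# Hypothetical curve values, for illustration only.
confidences = [0.2, 0.4, 0.6, 0.8]
precisions = [0.5, 0.7, 0.9, 1.0]
recalls = [0.9, 0.8, 0.6, 0.2]
threshold, f1 = best_confidence(confidences, precisions, recalls)
print(f"threshold={threshold}, F1={f1:.3f}")
```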

Frame-level statistics: Evaluation results are analyzed at the frame level to assess detection accuracy on individual frames.

python -m piba_ultralytics.evaluation.calculate_video_segment_stats_at_frame_level \
  --detections-files results/video-segments-rt-detr-x-not-polyp-0-on-test-video-segments-not-polyp-15.tsv \
    results/video-segments-rt-detr-x-not-polyp-2-on-test-video-segments-not-polyp-15.tsv \
    results/video-segments-rt-detr-x-not-polyp-5-on-test-video-segments-not-polyp-15.tsv \
    results/video-segments-rt-detr-x-not-polyp-10-on-test-video-segments-not-polyp-15.tsv \
    results/video-segments-rt-detr-x-not-polyp-15-on-test-video-segments-not-polyp-15.tsv \
  --output-file results/video-segment-at-frame-level-stats.tsv
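
As an illustration of what frame-level statistics involve, the sketch below derives confusion counts and the usual metrics from one ground-truth flag and one prediction flag per frame. It is an assumption-laden example: the real script reads the detection TSV files, whose columns are not described here, and may compute a different set of metrics; frame_level_stats is a hypothetical helper.

```python
def frame_level_stats(ground_truth, predicted):
    """Confusion counts and derived metrics from per-frame booleans.

    ground_truth[i]: frame i contains a polyp.
    predicted[i]: the model reported a detection on frame i.
    """
    pairs = list(zip(ground_truth, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    tn = sum(1 for g, p in pairs if not g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "specificity": specificity}


# Hypothetical frames: three with a polyp, three without.
stats = frame_level_stats(
    ground_truth=[True, True, True, False, False, False],
    predicted=[True, True, False, True, False, False],
)
print(stats)
```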

Video-level statistics: Detection results are summarized at the video level to provide a high-level comparison across models.

python -m piba_ultralytics.evaluation.calculate_video_segment_stats_at_video_level \
  --detections-files results/video-segments-rt-detr-x-not-polyp-0-on-test-video-segments-not-polyp-15.tsv \
    results/video-segments-rt-detr-x-not-polyp-2-on-test-video-segments-not-polyp-15.tsv \
    results/video-segments-rt-detr-x-not-polyp-5-on-test-video-segments-not-polyp-15.tsv \
    results/video-segments-rt-detr-x-not-polyp-10-on-test-video-segments-not-polyp-15.tsv \
    results/video-segments-rt-detr-x-not-polyp-15-on-test-video-segments-not-polyp-15.tsv \
  --video-segments-file data/test-video-segments-not-polyp-15.tsv \
  --output-file results/video-segment-at-video-level-stats.tsv
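
One simple way to move from frames to videos is to mark a segment as positive when enough of its frames contain a detection. The sketch below shows that aggregation only as an illustration: the rule, the min_positive_frames parameter, and the input layout are assumptions, and the repository's script may aggregate differently.

```python
from collections import defaultdict


def video_level_predictions(frame_detections, min_positive_frames=1):
    """Map each segment ID to True if enough of its frames have a detection.

    frame_detections: iterable of (segment_id, has_detection) pairs, one per frame.
    """
    positives = defaultdict(int)
    for segment_id, has_detection in frame_detections:
        positives[segment_id] += int(has_detection)
    return {sid: count >= min_positive_frames for sid, count in positives.items()}


# Hypothetical per-frame results for three segments.
frames = [("a", True), ("a", False),
          ("b", False), ("b", False),
          ("c", True), ("c", True)]
print(video_level_predictions(frames))
print(video_level_predictions(frames, min_positive_frames=2))
```

Raising min_positive_frames makes the video-level decision more robust to isolated false detections, at the cost of missing segments where the polyp is detected in only a few frames.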

4. Contact

The PIBAdb Cohort is part of the PolyDeep project and is managed by the PolyDeep Research Consortium. For more information or inquiries, please feel free to contact us at: investigacion.pibadb@iisgaliciasur.es.
