CTLab-ITMO/diff-aug


Diff-Aug: Augmentation method based on diffusion models for object detection and segmentation

Requirements:

  1. Install pip.
  2. Install CUDA Toolkit 12.1.

Installation (Windows):

setup.bat

Installation (Linux):

Linux installation is not available yet.

Usage:

On Windows, you can run the following command to start the augmentation process:

run.bat

Before running the script, edit run.bat to set the augmentation parameters:

  1. data_images_path - path to the folder with input images.
  2. data_masks_path - path to the folder with masks (single-channel images in which object pixels are 255 and background pixels are 0).
  3. output_path - path to the output folder.
  4. number_of_inpainted_images_per_image_required - number of augmented images to generate per input image.
  5. main_canny_weight - weight of the Canny ControlNet for the main model.
  6. main_depth_weight - weight of the depth ControlNet for the main model.
  7. main_soft_edge_weight - weight of the soft edge ControlNet for the main model.
  8. main_usual_ipadapter_weight - weight of the IPAdapter that carries general features of neighboring images, for the main model.
  9. main_plus_ipadapter_weight - weight of the IPAdapter (Plus) that carries features of the input image, for the main model.
  10. main_neg_plus_ipadapter_weight - weight of the IPAdapter (Plus) that carries negative object features of neighboring images, for the main model.
  11. dataset_name - dataset name used for storing CLIP features.
  12. positive_prompt - positive prompt for generation.
  13. negative_prompt - negative prompt for generation.
  14. seed - random seed for generation.
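The mask format in item 2 can be checked before starting a run. The following is a minimal sketch and is not part of the repository: it assumes a mask is already available as a raw 8-bit single-channel pixel buffer (`bytes`), and the function names and the threshold of 127 are illustrative assumptions.

```python
# Hypothetical helpers (not from the Diff-Aug code base) for the mask format
# described above: single-channel, 255 for object pixels, 0 for background.

THRESHOLD = 127  # assumed cut-off for turning gray values into 0 or 255

# 256-entry lookup table: values above THRESHOLD become 255, the rest 0.
_BINARIZE_TABLE = bytes(255 if value > THRESHOLD else 0 for value in range(256))


def is_binary_mask(pixels: bytes) -> bool:
    """Return True if every pixel value is either 0 or 255."""
    return set(pixels) <= {0, 255}


def binarize_mask(pixels: bytes) -> bytes:
    """Map every pixel to 0 or 255 using the lookup table."""
    return pixels.translate(_BINARIZE_TABLE)


raw = bytes([0, 3, 128, 255, 200, 0])  # toy 6-pixel buffer with gray values
fixed = binarize_mask(raw)
print(is_binary_mask(raw), is_binary_mask(fixed))
```

If Pillow is installed, a buffer of this kind can be obtained with `Image.open(path).convert("L").tobytes()`.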

Alternatively, you can run the augmentation process via Python script:

from src.aug_loop import run_augmentation

run_augmentation(
    ...
)
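When working from Python, it may help to gather the parameter names listed for run.bat in one place. The sketch below is only an illustration under stated assumptions: the `AugmentationParams` class is hypothetical, the field types are guesses, and all values are placeholders. The source does not show the arguments of `run_augmentation`, so the sketch does not call it, and whether the function accepts these names as keywords is an assumption to verify against the code.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AugmentationParams:
    """Hypothetical container for the documented parameters (types are assumed)."""

    data_images_path: str
    data_masks_path: str
    output_path: str
    number_of_inpainted_images_per_image_required: int
    main_canny_weight: float
    main_depth_weight: float
    main_soft_edge_weight: float
    main_usual_ipadapter_weight: float
    main_plus_ipadapter_weight: float
    main_neg_plus_ipadapter_weight: float
    dataset_name: str
    positive_prompt: str
    negative_prompt: str
    seed: int

    def __post_init__(self) -> None:
        # At least one augmented image per input image must be requested.
        if self.number_of_inpainted_images_per_image_required < 1:
            raise ValueError(
                "number_of_inpainted_images_per_image_required must be >= 1"
            )


# Placeholder values for illustration only; they are not recommended settings.
params = AugmentationParams(
    data_images_path="data/images",
    data_masks_path="data/masks",
    output_path="output",
    number_of_inpainted_images_per_image_required=1,
    main_canny_weight=1.0,
    main_depth_weight=1.0,
    main_soft_edge_weight=1.0,
    main_usual_ipadapter_weight=1.0,
    main_plus_ipadapter_weight=1.0,
    main_neg_plus_ipadapter_weight=1.0,
    dataset_name="potholes",
    positive_prompt="placeholder positive prompt",
    negative_prompt="placeholder negative prompt",
    seed=0,
)

# Plain dict keyed by the documented parameter names.
kwargs = asdict(params)
print(sorted(kwargs))
```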

Method schema:

[Figures: overall method schema; target objects; features (parts 1 and 2); ControlNets]

Examples:

Generation examples on the Potholes dataset:

image

Generation examples on the Rooftops dataset:

image

Detection and Segmentation Results:

For all experiments, we used a pretrained YOLOv8n model with its default augmentations.

Detection results for the Potholes dataset:

| Data | Precision | Recall | mAP50-95 |
|---|---|---|---|
| without our augmentation | 0.647 ± 0.020 | 0.572 ± 0.010 | 0.304 ± 0.004 |
| Diff-Aug (prev) | 0.666 ± 0.019 | 0.552 ± 0.015 | 0.330 ± 0.003 |
| Diff-Aug | 0.665 ± 0.012 | 0.565 ± 0.018 | 0.330 ± 0.004 |

Segmentation results for the Potholes dataset:

| Data | Precision | Recall | mAP50-95 |
|---|---|---|---|
| without our augmentation | 0.674 ± 0.012 | 0.556 ± 0.014 | 0.282 ± 0.004 |
| Diff-Aug (prev) | 0.666 ± 0.023 | 0.548 ± 0.013 | 0.294 ± 0.003 |
| Diff-Aug | 0.660 ± 0.017 | 0.571 ± 0.021 | 0.297 ± 0.004 |
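The mAP50-95 gain of Diff-Aug over the baseline without our augmentation can be worked out from the two Potholes tables. The short sketch below compares the reported mean values only and ignores the spread given after the ± sign; the variable and function names are illustrative.

```python
# mAP50-95 means copied from the detection and segmentation tables above.
results = {
    "detection": {"baseline": 0.304, "diff_aug": 0.330},
    "segmentation": {"baseline": 0.282, "diff_aug": 0.297},
}


def gains(baseline: float, augmented: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain in percent)."""
    absolute = augmented - baseline
    return absolute, 100.0 * absolute / baseline


summary = {task: gains(v["baseline"], v["diff_aug"]) for task, v in results.items()}
for task, (absolute, relative) in summary.items():
    print(f"{task}: +{absolute:.3f} mAP50-95 ({relative:.1f}% relative)")
# detection: +0.026 mAP50-95 (8.6% relative)
# segmentation: +0.015 mAP50-95 (5.3% relative)
```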

Acknowledgements

This research is financially supported by the Foundation for National Technology Initiative's Projects Support as a part of the roadmap implementation for the development of the high-tech field of Artificial Intelligence for the period up to 2030 (agreement 70-2021-00187).

