Skip to content

In this workshop, the students learn the process of image classification for training text-to-image AI models and build their own small, ‘unflattering’ dataset to experiment with the tools and share their thoughts about the machinery but personal process.

Notifications You must be signed in to change notification settings

soyunparrrk/Unflattering-Dataset

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 

Repository files navigation

Unflattering-Dataset

In this workshop, the students learn the process of image classification for training text-to-image AI models and build their own small, ‘unflattering’ dataset to experiment with the tools and share their thoughts about the machinery but personal process.

Processes behind Dataset Curation & Creation

🡥 (1) Preprocess Data: Crop, Resize, Label.
🡥 (2) Training: How to train the model, what settings are needed
🡥 (3) Inference: How to use your trained model with text-to-image models

(1) Preprocess Data

(1-1) Collect the images for a dataset

  • We will collect a small collection of Images ~20-50 (the more the better)
  • For faces or portraits the number of images can be lower, but for more complex objets, styles, concepts ~50-100 images are needed.
  • A Pre-Trained Model: We will need a LARGE model which has already has a understanding of objects, styles, text/image knowledge. In this workshop, it would be best to use LoRA which is really good for quickly finetuning a model off of your data.
  • Training Images can be JPG or PNG.
  • Mininum size of training images 512x512. The larger the images the better.

(1-2) Batch crop & rename images: Birme.net

  • Settings for Birme: JPG (100%), 512 x 512, Save as Zip


Brime Cropping Michelin Man 🛞

(1-3) Label your images

Open a text editor of your preference and label (classify) each image. You can do it in one file.

** Pay attention to what comes to your mind when you do so. What decisions did you have to make when describing the image? What was easy? What was difficult? And why? **

(1-4) Try automatic data labeling and compare it to yours

Use this colab notebook Dataset maker. This will be used to prepare the images as a dataset for LoRA. It will also create captions for each image, which you can alter manually.

  • Project Name -> Whatever you want, and keep a note of this. Then you can start section (1) Setup. Once (1) is finished, go to section (4), and change the method to 'Photo Caption' - You can black list some words if you want to.

Initial Captioning

  • (4) will generate a caption & text file for every image, which stores the caption. You can alter this manually, by going into the text file, and changing the caption.

(1-5) Curate your tags

  • In section (5), Change 'global_activation_tag' to something unique to your dataset. You can also write things that you want to normalise into the 'remove_tags' field.

Once this has been setup, you can close the colab notebook. Next, you will need to train the model using another colab notebook.

File Structure for Images   Captions

(2) Training

Open and copy this notebook to your drive: Lora Trainer

In setup or section (1) you will need to refer back to the project name you had given in your last colab, and add it to the project name here. The training model we will use is 'AnyLora', if it is not selected already, you can do so by accessing the dropdown menu.

  • Enable flip_aug
  • Change the number of epochs to 20.

You can then run section (1) & Train!

Once the training has finished, you can download the fine-tuned models from the project folder. You should download the highest one/most recent.

(3) Using it for Stable Diffusion

(3-1) Add your model to the right directory on your Google Drive

After downloading the model, you will need to add it to the A111 model/lora directory. Add your model file (.safetensors) to your Google Drive in this path below.

/content/gdrive/MyDrive/sd/stable-diffusion-webui/models/Lora

(3-2) Create images in Stable Diffusion with your own model

Open and copy this notebook to your drive: Fast Stable Diffusion

Set the Model_Version under Model Download/Load to 1.5.

After excecuting all cells, start SDWebUI by clicking the link like this at the 'Start Stable-Diffusion' section: (https://xxx.gradio.live), and click on the LoRA tab. image

You can use this keyword anywhere in the prompt.

Some examples ✨

swimming

00042-396555378

00034-1378921413

About

In this workshop, the students learn the process of image classification for training text-to-image AI models and build their own small, ‘unflattering’ dataset to experiment with the tools and share their thoughts about the machinery but personal process.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published