In this workshop, students learn the process of image classification for training text-to-image AI models and build their own small, ‘unflattering’ dataset to experiment with the tools and share their thoughts on this mechanical yet personal process.
🡥 (1) Preprocess Data: Crop, Resize, Label.
🡥 (2) Training: How to train the model, what settings are needed
🡥 (3) Inference: How to use your trained model with text-to-image models
- Info: Breakdown of LoRA and the A1111 SD Notebook
- We will collect a small set of images, ~20-50 (the more the better).
- For faces or portraits the number of images can be lower, but for more complex objects, styles, or concepts, ~50-100 images are needed.
- A Pre-Trained Model: We will need a LARGE model which already has an understanding of objects, styles, and text/image relationships. In this workshop we will use LoRA (Low-Rank Adaptation), which is well suited for quickly fine-tuning such a model on your own data.
- Training Images can be JPG or PNG.
- Minimum size of training images: 512x512. The larger the images, the better.
(1-2) Batch crop & rename images: Birme.net
- Settings for Birme: JPG (100%), 512 x 512, Save as Zip
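If you prefer to do this step offline instead of using Birme.net, the same crop-and-resize can be sketched in Python with Pillow. This is only an illustration; the folder names are placeholders.

```python
# Minimal offline alternative to Birme: centre-crop every image to a
# square, resize to 512x512, and save as numbered JPGs.
from pathlib import Path
from PIL import Image

def prepare_images(src_dir, dst_dir, size=512):
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for i, path in enumerate(sorted(Path(src_dir).glob("*"))):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        img = Image.open(path).convert("RGB")
        # Centre-crop to the largest possible square, then resize.
        w, h = img.size
        s = min(w, h)
        left, top = (w - s) // 2, (h - s) // 2
        img = img.crop((left, top, left + s, top + s)).resize((size, size))
        img.save(dst / f"{i:03d}.jpg", quality=100)

# prepare_images("raw_images", "dataset")  # example folder names
```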
Open a text editor of your preference and label (classify) each image. You can keep all the labels in one file.
** Pay attention to what comes to your mind when you do so. What decisions did you have to make when describing the image? What was easy? What was difficult? And why? **
Use this Colab notebook: Dataset Maker. It prepares the images as a dataset for LoRA training, and it also creates captions for each image, which you can alter manually.
- Project Name -> whatever you want; keep a note of it. Then run section (1) Setup. Once (1) is finished, go to section (4) and change the method to 'Photo Caption'. You can blacklist some words if you want to.
- Section (4) will generate a caption and a text file for every image; the text file stores the caption. You can alter it manually by opening the text file and changing the caption.
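Since the per-image caption files are plain text, you can also inspect them in bulk rather than one by one. A small sketch (the folder name is a placeholder):

```python
# Read every caption file in the dataset folder into a dict so you can
# review them at a glance before editing.
from pathlib import Path

def read_captions(dataset_dir):
    """Return {caption filename: caption text} for every .txt file."""
    return {txt.name: txt.read_text().strip()
            for txt in sorted(Path(dataset_dir).glob("*.txt"))}

# To edit a caption, simply overwrite its file:
# Path("dataset/001.txt").write_text("my edited caption")
```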
- In section (5), Change 'global_activation_tag' to something unique to your dataset. You can also write things that you want to normalise into the 'remove_tags' field.
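To illustrate what these two fields do (the notebook applies this for you; all names below are examples), the activation tag is prepended to every caption, and any tag listed in remove_tags is stripped so that feature gets absorbed into the activation tag:

```python
# Hypothetical sketch of activation-tag / remove-tags behaviour on the
# comma-separated caption files.
from pathlib import Path

def retag(dataset_dir, activation_tag, remove_tags=()):
    banned = set(remove_tags)
    for txt in Path(dataset_dir).glob("*.txt"):
        tags = [t.strip() for t in txt.read_text().split(",") if t.strip()]
        tags = [t for t in tags if t not in banned]  # normalise these away
        txt.write_text(", ".join([activation_tag] + tags))

# retag("dataset", "myface", remove_tags=["woman", "smile"])  # examples
```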
Once this has been setup, you can close the colab notebook. Next, you will need to train the model using another colab notebook.
Open and copy this notebook to your drive: Lora Trainer
In Setup, or section (1), you will need to refer back to the project name you gave in the previous Colab notebook and enter it as the project name here. The training model we will use is 'AnyLora'; if it is not selected already, you can select it from the dropdown menu.
- Enable flip_aug
- Change the number of epochs to 20.
You can then run section (1) & Train!
Once the training has finished, you can download the fine-tuned models from the project folder. You should download the most recent one (from the highest epoch).
After downloading the model, you will need to add it to the A1111 models/Lora directory. Add your model file (.safetensors) to your Google Drive in the path below.
/content/gdrive/MyDrive/sd/stable-diffusion-webui/models/Lora
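If you prefer to copy the file from within Colab rather than through the Drive web interface, here is a sketch, assuming the Drive is already mounted; the model filename is an example.

```python
# Copy a downloaded .safetensors LoRA into the A1111 Lora folder on the
# mounted Google Drive.
import shutil
from pathlib import Path

A1111_LORA_DIR = "/content/gdrive/MyDrive/sd/stable-diffusion-webui/models/Lora"

def install_lora(model_file, lora_dir=A1111_LORA_DIR):
    """Copy a LoRA file into the WebUI's Lora directory."""
    dst = Path(lora_dir)
    dst.mkdir(parents=True, exist_ok=True)
    shutil.copy(model_file, dst / Path(model_file).name)

# install_lora("my_project-10.safetensors")  # example filename
```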
Open and copy this notebook to your drive: Fast Stable Diffusion
Set the Model_Version under Model Download/Load to 1.5.
After executing all cells, start SD WebUI at the 'Start Stable-Diffusion' section by clicking the Gradio link (it looks like https://xxx.gradio.live), and click on the LoRA tab.
You can use this activation keyword anywhere in the prompt.
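In A1111, a LoRA is loaded with the `<lora:filename:weight>` token that the LoRA tab inserts for you. A sketch of a full prompt, using example names ('my_project' for the .safetensors file, 'myface' for the activation tag chosen during training):

```python
# Build an example A1111 prompt string that applies a trained LoRA.
lora_name, activation_tag, weight = "my_project", "myface", 0.8

# The activation tag can sit anywhere in the prompt; the <lora:...> token
# tells the WebUI which LoRA file to apply and how strongly.
prompt = f"a portrait of {activation_tag}, detailed, <lora:{lora_name}:{weight}>"
print(prompt)
```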
Some examples ✨