This project combines textual inversion concepts from the Stable Diffusion (SD) Concepts Library with a custom loss function that provides semantic guidance during generation.
- Uses specific textual inversion concepts from the SD Concepts Library
- Handles multiple base models (primarily SD v2.0)
- Supports both style and object concepts
- Generates images using each concept with the same base prompt
- Applies a custom "car_loss" function to guide the generation toward including cars
- Maintains consistent seeds for reproducibility
The project uses these specific concepts:
- canna-lily-flowers102 (Seed: 42)
  - Type: Object concept
  - Base Model: Stable Diffusion 2.0
  - Description: Canna lily flower style
- samurai-jack (Seed: 123)
  - Type: Style concept
  - Base Model: Stable Diffusion 2.0
  - Description: Samurai Jack animation style
- babies-poster (Seed: 456)
  - Type: Style concept
  - Base Model: Stable Diffusion 2.0
  - Description: Babies poster art style
- animal-toy (Seed: 789)
  - Type: Object concept
  - Base Model: Stable Diffusion 2.0
  - Description: Animal toy style
- sword-lily-flowers102 (Seed: 999)
  - Type: Object concept
  - Base Model: Stable Diffusion 2.0
  - Description: Sword lily flower style
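The concept list above could be kept in one small table in code so that each concept's seed and type travel together. This is a minimal sketch: the `Concept` class and `CONCEPTS` name are hypothetical and not taken from the script; only the names, seeds, types, and base model come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One textual inversion concept (hypothetical container, not from the script)."""
    name: str   # concept name in the SD Concepts Library
    seed: int   # fixed seed used for this concept
    kind: str   # "style" or "object"
    base_model: str = "Stable Diffusion 2.0"


# Values copied from the concept list in this README.
CONCEPTS = [
    Concept("canna-lily-flowers102", 42, "object"),
    Concept("samurai-jack", 123, "style"),
    Concept("babies-poster", 456, "style"),
    Concept("animal-toy", 789, "object"),
    Concept("sword-lily-flowers102", 999, "object"),
]
```

Keeping the seed next to the concept name makes it harder to pair a concept with the wrong seed when re-running a single generation.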
The script uses different prompt formats based on the concept type:
- For style concepts: `<concept-name> base_prompt`
- For object concepts: `base_prompt with a <concept-name>`
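The two prompt formats can be sketched as a small helper. The function name `build_prompt` and its parameters are illustrative assumptions; the script may build its prompts differently.

```python
def build_prompt(base_prompt: str, concept_name: str, concept_type: str) -> str:
    """Place the concept token according to the concept type.

    Hypothetical helper: style concepts put the token before the base
    prompt, object concepts append it as "with a <token>".
    """
    token = f"<{concept_name}>"
    if concept_type == "style":
        return f"{token} {base_prompt}"
    if concept_type == "object":
        return f"{base_prompt} with a {token}"
    raise ValueError(f"unknown concept type: {concept_type!r}")
```

For example, `build_prompt("a city street", "samurai-jack", "style")` gives `<samurai-jack> a city street`, while the same base prompt with the object concept `animal-toy` gives `a city street with a <animal-toy>`.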
The custom "car_loss" function encourages the generation of images containing cars by:
- Using CLIP (Contrastive Language-Image Pre-training) to score images based on their similarity to the concept "cars"
- Comparing scores between "a photo of a car" and "a photo without cars"
- Guiding the generation process to maximize the presence of cars in the image
- Unlike color-based losses (red, green, blue), this is a semantic concept-based loss
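The arithmetic behind such a contrastive score can be sketched as follows. This is an assumption-laden illustration, not the script's implementation: it takes CLIP embeddings as already computed, uses plain Python floats instead of tensors (a real guidance loss would need differentiable tensors so gradients can reach the latents), and the function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def car_loss_from_embeddings(
    image_emb: Sequence[float],
    car_text_emb: Sequence[float],      # embedding of "a photo of a car"
    no_car_text_emb: Sequence[float],   # embedding of "a photo without cars"
) -> float:
    """Hypothetical loss: lower when the image is closer to the car prompt."""
    car_score = cosine_similarity(image_emb, car_text_emb)
    no_car_score = cosine_similarity(image_emb, no_car_text_emb)
    # Minimizing this pushes the image toward "car" and away from "no car".
    return no_car_score - car_score
```

A negative value means the image embedding sits closer to the car prompt than to the no-car prompt; minimizing the value during sampling is one way to steer generation toward cars.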
# Install requirements
pip install -r requirements.txt
# Run the experiment
python stable_diffusion_styles.py
The script will:
- Download the specified concept embeddings from Hugging Face
- Load appropriate base models for each concept
- Generate normal and car-enhanced versions of the same prompt for each concept
- Save the results in the "outputs" directory
The outputs directory will contain 10 images:
- 5 normal images (one for each concept)
- 5 car-enhanced images (one for each concept)
Each image filename indicates both the concept used and whether it's a normal or car-enhanced version.
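One way to picture the ten output files is to enumerate them. The naming pattern `<concept>_<variant>.png` and the variant labels below are assumptions made for illustration; the script's actual filenames may differ.

```python
from pathlib import Path

# Concept names from this README.
CONCEPT_NAMES = [
    "canna-lily-flowers102",
    "samurai-jack",
    "babies-poster",
    "animal-toy",
    "sword-lily-flowers102",
]

# Hypothetical variant labels: plain generation and car-guided generation.
VARIANTS = ["normal", "car"]


def expected_output_paths(output_dir: str = "outputs") -> list:
    """List one path per (concept, variant) pair under the output directory."""
    return [
        Path(output_dir) / f"{name}_{variant}.png"  # assumed naming pattern
        for name in CONCEPT_NAMES
        for variant in VARIANTS
    ]
```

Five concepts times two variants yields the ten images described above.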
- Based on the Stable Diffusion Deep Dive notebook by fastai
- Concepts from Hugging Face SD Concepts Library
- Individual concepts:
  - canna-lily-flowers102
  - samurai-jack
  - babies-poster
  - animal-toy
  - sword-lily-flowers102