jatingargiitk/StableDiffusion
Stable Diffusion Style Exploration

This project demonstrates textual inversion using concepts from the SD Concepts Library, combined with a custom loss function that provides semantic guidance during generation.

Features

  • Uses specific textual inversion concepts from the SD Concepts Library
  • Loads the appropriate base model for each concept (all five selected concepts use SD v2.0)
  • Supports both style and object concepts
  • Generates images using each concept with the same base prompt
  • Applies a custom "car_loss" function to guide the generation toward including cars
  • Maintains consistent seeds for reproducibility

Selected Concepts

The project uses these specific concepts:

  1. canna-lily-flowers102 (Seed: 42)

    • Type: Object concept
    • Base Model: Stable Diffusion 2.0
    • Description: Canna lily flower style
  2. samurai-jack (Seed: 123)

    • Type: Style concept
    • Base Model: Stable Diffusion 2.0
    • Description: Samurai Jack animation style
  3. babies-poster (Seed: 456)

    • Type: Style concept
    • Base Model: Stable Diffusion 2.0
    • Description: Babies poster art style
  4. animal-toy (Seed: 789)

    • Type: Object concept
    • Base Model: Stable Diffusion 2.0
    • Description: Animal toy style
  5. sword-lily-flowers102 (Seed: 999)

    • Type: Object concept
    • Base Model: Stable Diffusion 2.0
    • Description: Sword lily flower style
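
For reference, the concept list above can be captured as a small configuration table. The dict layout and field names below are illustrative (the actual script may structure this differently), but the names, types, and seeds are exactly those listed above:

```python
# Illustrative configuration mirroring the concept list in this README.
CONCEPTS = [
    {"name": "canna-lily-flowers102", "type": "object", "seed": 42},
    {"name": "samurai-jack",          "type": "style",  "seed": 123},
    {"name": "babies-poster",         "type": "style",  "seed": 456},
    {"name": "animal-toy",            "type": "object", "seed": 789},
    {"name": "sword-lily-flowers102", "type": "object", "seed": 999},
]
```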

Prompt Construction

The script uses different prompt formats based on the concept type:

  • For style concepts: <concept-name> base_prompt
  • For object concepts: base_prompt with a <concept-name>
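
The branching above can be sketched as a small helper; the function name `build_prompt` is hypothetical, but the two output formats are the ones described in this section:

```python
def build_prompt(base_prompt: str, concept_name: str, concept_type: str) -> str:
    """Format a prompt for a textual inversion concept.

    Style concepts prefix the prompt with the concept token;
    object concepts append "with a <concept-name>".
    """
    token = f"<{concept_name}>"
    if concept_type == "style":
        return f"{token} {base_prompt}"
    return f"{base_prompt} with a {token}"
```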

Car Loss Function

The custom "car_loss" function encourages generated images to contain cars by:

  • Using CLIP (Contrastive Language-Image Pre-training) to score each image's similarity to the concept "cars"
  • Comparing scores between "a photo of a car" and "a photo without cars"
  • Guiding the denoising process to maximize the presence of cars in the image

Unlike color-based losses (red, green, blue), this is a semantic, concept-based loss.
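
The scoring idea can be sketched numerically. In the actual script, the two logits would come from a CLIP model's image-text similarity scores and the loss would be backpropagated through the latents during denoising; the function below (name and signature are illustrative) only shows the softmax comparison between the two anchor prompts:

```python
import math

# The two text anchors compared by the loss, as described above.
CAR_PROMPTS = ["a photo of a car", "a photo without cars"]

def car_loss(car_logit: float, no_car_logit: float) -> float:
    """Softmax over the two CLIP similarity scores.

    The loss falls toward 0 as the image's similarity to
    "a photo of a car" rises, and toward 1 as it falls.
    """
    e_car = math.exp(car_logit)
    e_no_car = math.exp(no_car_logit)
    return 1.0 - e_car / (e_car + e_no_car)
```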

Usage

# Install requirements
pip install -r requirements.txt

# Run the experiment
python stable_diffusion_styles.py

The script will:

  1. Download the specified concept embeddings from Hugging Face
  2. Load appropriate base models for each concept
  3. Generate normal and car-enhanced versions of the same prompt for each concept
  4. Save the results in the "outputs" directory

Outputs

The outputs directory will contain 10 images:

  • 5 normal images (one for each concept)
  • 5 car-enhanced images (one for each concept)

Each image filename indicates both the concept used and whether it's a normal or car-enhanced version.
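
A naming scheme like the following would satisfy that; the exact filename pattern and the `output_filename` helper are hypothetical, since the README does not specify them:

```python
def output_filename(concept_name: str, car_enhanced: bool) -> str:
    # Encode both the concept and the variant in the filename,
    # e.g. "outputs/samurai-jack_car.png" vs "outputs/samurai-jack_normal.png".
    variant = "car" if car_enhanced else "normal"
    return f"outputs/{concept_name}_{variant}.png"
```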
