
Ghibli Stable Diffusion Synthesis 🎨

GitHub Stars Badge

huggingface-hub accelerate bitsandbytes torch Pillow numpy transformers torchvision diffusers gradio License: MIT

Introduction

The Ghibli Fine-Tuned Stable Diffusion 2.1 project fine-tunes the Stable Diffusion 2.1 model to generate images in the iconic art style of Studio Ghibli, capturing the vibrant colors, intricate detail, and whimsical charm of Ghibli films. The repository includes Jupyter notebooks for training, an interactive Gradio demo for image generation, and instructions for setup and usage. It is aimed at data scientists, developers, and Ghibli enthusiasts alike.

Key Features

Training Notebooks

For Full Fine-Tuning training

The cornerstone of this project is the Jupyter notebook located at notebooks/ghibli-sd-2.1-base-finetuning.ipynb. This notebook provides a step-by-step guide to fine-tuning the Stable Diffusion 2.1 Base model using the Ghibli dataset, complete with code, explanations, and best practices. It is designed to be accessible to both beginners and experienced practitioners, offering flexibility to replicate the training process or experiment with custom modifications. The notebook is compatible with the following platforms:

Open In Colab Open in SageMaker Open in Deepnote JupyterLab Open in Gradient Open in Binder Open in Kaggle View on GitHub

For LoRA training

For LoRA training, use the Jupyter notebook at notebooks/ghibli-sd-2.1-lora.ipynb. It offers a clear, step-by-step walkthrough for fine-tuning the Stable Diffusion 2.1 model on the Ghibli dataset using LoRA (Low-Rank Adaptation), including code, detailed notes, and practical tips. Crafted for both novices and seasoned users, it supports easy replication of the training process or experimentation with custom tweaks. The notebook is compatible with the following platforms:

Open In Colab Open in SageMaker Open in Deepnote JupyterLab Open in Gradient Open in Binder Open in Kaggle View on GitHub

To get started, open the notebook in your preferred platform and follow the instructions to set up the environment and execute the training process.

Datasets

Each task employs a dedicated dataset hosted on HuggingFace, tailored to support the unique requirements of the training process while reflecting Ghibli’s distinctive artistry:

  • Full Fine-Tuning Task: Utilizes HuggingFace Datasets, a comprehensive collection of high-quality Ghibli-inspired images designed for thorough model fine-tuning, ensuring rich and authentic visual outputs.
  • LoRA Task: Leverages HuggingFace Datasets, a lightweight and optimized dataset crafted for LoRA adaptation, enabling efficient training with reduced computational resources while maintaining the charm of Ghibli’s style.

Base Models

The project uses carefully selected Stable Diffusion models for each task, balancing quality, efficiency, and alignment with Ghibli’s artistic vision:

  • Full Fine-Tuning Task: Built on HuggingFace Model Hub, a powerful base model ideal for extensive fine-tuning, producing detailed and faithful Ghibli-style artwork with high fidelity.

  • LoRA Task: Based on HuggingFace Model Hub, a versatile model optimized for LoRA, offering a streamlined approach for rapid experimentation and efficient generation of Ghibli-inspired visuals.
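Once the checkpoints are downloaded, either model can be loaded with the standard `diffusers` text-to-image pipeline. The sketch below is a minimal illustration, not part of the repo: the default model id `stabilityai/stable-diffusion-2-1-base` is an assumption inferred from the notebook name, and you would point `model_id` at the fine-tuned checkpoint directory produced by `scripts/download_ckpts.py` instead.

```python
def load_pipeline(model_id="stabilityai/stable-diffusion-2-1-base", device="cuda"):
    """Load a Stable Diffusion text-to-image pipeline.

    The default model_id is an assumption based on the notebook name
    (ghibli-sd-2.1-base-finetuning.ipynb); point it at the downloaded
    fine-tuned checkpoint directory to get Ghibli-style output.
    """
    # Imports are deferred so the sketch can be read without the heavy deps.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    )
    return pipe.to(device)
```

Note that the first call downloads several gigabytes of weights; `pipe(prompt).images[0]` then returns a PIL image you can save.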

Demonstration

Ghibli-Stable-Diffusion-Synthesis provides an interactive demo for generating Ghibli-style images:

  • HuggingFace Space: HuggingFace Space Demo

  • Demo GUI:
    Gradio Demo

To run the Gradio app locally (localhost:7860):

python apps/gradio_app.py

Usage Guide

Step 1: Clone the Repository

Clone the project repository and navigate to the project directory:

git clone https://github.com/danhtran2mind/Ghibli-Stable-Diffusion-Synthesis.git
cd Ghibli-Stable-Diffusion-Synthesis

Step 2: Install Dependencies

Install Dependencies using requirements.txt

pip install -r requirements/requirements.txt

Step 3: Configure the Environment

Run the following scripts to set up the project:

Download Model Checkpoints

python scripts/download_ckpts.py

Prepare Dataset (Optional, for Training)

python scripts/download_datasets.py

Refer to the Scripts Documents for the detailed arguments used in these scripts. ⚙️

Training

The Training Notebooks, available at Training Notebooks, offer a comprehensive guide to both the Full Fine-tuning and LoRA training methods.

To use local datasets downloaded from Hugging Face Datasets, replace the --dataset_name value in the specified notebooks as follows:

  • In notebooks/ghibli-sd-2.1-base-finetuning.ipynb, replace --dataset_name="uwunish/ghibli-dataset" with --dataset_name="data/uwunish-ghibli-dataset".
  • In notebooks/ghibli-sd-2.1-lora.ipynb, replace --dataset_name="pulnip/ghibli-dataset" with --dataset_name="data/pulnip-ghibli-dataset".
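The substitution above is mechanical, so it can also be scripted. This sketch rewrites a `--dataset_name=...` argument to point at the local copies, using the paths from the notes above (the helper itself is illustrative, not part of the repo):

```python
# Map each Hub dataset id to the local copy fetched by
# scripts/download_datasets.py (paths taken from the notes above).
DATASET_OVERRIDES = {
    "uwunish/ghibli-dataset": "data/uwunish-ghibli-dataset",  # full fine-tuning
    "pulnip/ghibli-dataset": "data/pulnip-ghibli-dataset",    # LoRA
}

def localize_dataset_arg(arg):
    """Rewrite a --dataset_name=... argument to point at the local copy."""
    prefix = "--dataset_name="
    if arg.startswith(prefix):
        name = arg[len(prefix):]
        return prefix + DATASET_OVERRIDES.get(name, name)
    return arg  # any other argument passes through unchanged

print(localize_dataset_arg("--dataset_name=pulnip/ghibli-dataset"))
# -> --dataset_name=data/pulnip-ghibli-dataset
```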

For more information about Training, you can see Stable Diffusion text-to-image fine-tuning.

Inference

Quick Inference (Bash)

  • To generate an image using the Full Fine-tuning model:
python src/ghibli_stable_diffusion_synthesis/infer.py \
    --method full_finetuning \
    --prompt "donald trump in ghibli style" \
    --height 512 --width 512 \
    --num_inference_steps 50 \
    --guidance_scale 3.5 \
    --seed 42 \
    --output_path "tests/test_data/ghibli_style_output_full_finetuning.png"
  • To run inference with LoRA:
python src/ghibli_stable_diffusion_synthesis/infer.py \
    --method lora \
    --prompt "a beautiful city in Ghibli style" \
    --height 720 --width 1280 \
    --num_inference_steps 100 \
    --guidance_scale 15.5 \
    --seed 42 \
    --lora_scale 0.7 \
    --output_path "tests/test_data/ghibli_style_output_lora.png"
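To script several generations, the CLI calls above can be assembled programmatically. `build_infer_cmd` below is a hypothetical helper, not part of the repo; it only mirrors the flags shown in the two examples and does not validate them against infer.py itself:

```python
def build_infer_cmd(method, prompt, **opts):
    """Build the infer.py argument list for subprocess.run.

    Only flags shown in the README examples (--height, --seed,
    --lora_scale, --output_path, ...) should be passed through opts.
    """
    cmd = [
        "python", "src/ghibli_stable_diffusion_synthesis/infer.py",
        "--method", method,
        "--prompt", prompt,
    ]
    for flag, value in opts.items():
        cmd += [f"--{flag}", str(value)]
    return cmd

cmd = build_infer_cmd(
    "lora", "a beautiful city in Ghibli style",
    seed=42, lora_scale=0.7,
    output_path="tests/test_data/ghibli_style_output_lora.png",
)
# To execute from the repo root: subprocess.run(cmd, check=True)
```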

Inference Arguments

Refer to the Inference Documents for the detailed arguments used in inference. ⚙️

Inference Example

Environment

Contact

For questions or issues, please contact the maintainer via the Issues tab on GitHub.
