Commit 978ad6b

Merge branch 'main' into fixes-issue-11060
2 parents 8e21d99 + 723dbdd commit 978ad6b

42 files changed: 3567 additions and 267 deletions

.github/workflows/pr_style_bot.yml

Lines changed: 0 additions & 34 deletions
@@ -13,39 +13,5 @@ jobs:
     uses: huggingface/huggingface_hub/.github/workflows/style-bot-action.yml@main
     with:
       python_quality_dependencies: "[quality]"
-      pre_commit_script_name: "Download and Compare files from the main branch"
-      pre_commit_script: |
-        echo "Downloading the files from the main branch"
-
-        curl -o main_Makefile https://raw.githubusercontent.com/huggingface/diffusers/main/Makefile
-        curl -o main_setup.py https://raw.githubusercontent.com/huggingface/diffusers/refs/heads/main/setup.py
-        curl -o main_check_doc_toc.py https://raw.githubusercontent.com/huggingface/diffusers/refs/heads/main/utils/check_doc_toc.py
-
-        echo "Compare the files and raise error if needed"
-
-        diff_failed=0
-        if ! diff -q main_Makefile Makefile; then
-          echo "Error: The Makefile has changed. Please ensure it matches the main branch."
-          diff_failed=1
-        fi
-
-        if ! diff -q main_setup.py setup.py; then
-          echo "Error: The setup.py has changed. Please ensure it matches the main branch."
-          diff_failed=1
-        fi
-
-        if ! diff -q main_check_doc_toc.py utils/check_doc_toc.py; then
-          echo "Error: The utils/check_doc_toc.py has changed. Please ensure it matches the main branch."
-          diff_failed=1
-        fi
-
-        if [ $diff_failed -eq 1 ]; then
-          echo "❌ Error happened as we detected changes in the files that should not be changed ❌"
-          exit 1
-        fi
-
-        echo "No changes in the files. Proceeding..."
-        rm -rf main_Makefile main_setup.py main_check_doc_toc.py
-      style_command: "make style && make quality"
     secrets:
       bot_token: ${{ secrets.GITHUB_TOKEN }}

docs/source/en/api/pipelines/deepfloyd_if.md

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.
 
 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>
 
 ## Overview

docs/source/en/api/pipelines/flux.md

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.
 
 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>
 
 Flux is a series of text-to-image generation models based on diffusion transformers. To know more about Flux, check out the original [blog post](https://blackforestlabs.ai/announcing-black-forest-labs/) by the creators of Flux, Black Forest Labs.

docs/source/en/api/pipelines/kolors.md

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.
 
 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>
 
 ![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/kolors/kolors_header_collage.png)

docs/source/en/api/pipelines/ltx_video.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,7 @@
 
 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>
 
 [LTX Video](https://huggingface.co/Lightricks/LTX-Video) is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produces 24 FPS videos at a 768x512 resolution faster than they can be watched. Trained on a large-scale dataset of diverse videos, the model generates high-resolution videos with realistic and varied content. We provide a model for both text-to-video as well as image + text-to-video usecases.
@@ -32,6 +33,7 @@ Available models:
 |:-------------:|:-----------------:|
 | [`LTX Video 0.9.0`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.safetensors) | `torch.bfloat16` |
 | [`LTX Video 0.9.1`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.1.safetensors) | `torch.bfloat16` |
+| [`LTX Video 0.9.5`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.5.safetensors) | `torch.bfloat16` |
 
 Note: The recommended dtype is for the transformer component. The VAE and text encoders can be either `torch.float32`, `torch.bfloat16` or `torch.float16` but the recommended dtype is `torch.bfloat16` as used in the original repository.
 

docs/source/en/api/pipelines/sana.md

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@
 
 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>
 
 [SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers](https://huggingface.co/papers/2410.10629) from NVIDIA and MIT HAN Lab, by Enze Xie, Junsong Chen, Junyu Chen, Han Cai, Haotian Tang, Yujun Lin, Zhekai Zhang, Muyang Li, Ligeng Zhu, Yao Lu, Song Han.

docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.
 
 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>
 
 Stable Diffusion 3 (SD3) was proposed in [Scaling Rectified Flow Transformers for High-Resolution Image Synthesis](https://arxiv.org/pdf/2403.03206.pdf) by Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Muller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, Kyle Lacey, Alex Goodwin, Yannik Marek, and Robin Rombach.

docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_xl.md

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.
 
 <div class="flex flex-wrap space-x-1">
 <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
 </div>
 
 Stable Diffusion XL (SDXL) was proposed in [SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis](https://huggingface.co/papers/2307.01952) by Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach.

docs/source/en/optimization/mps.md

Lines changed: 12 additions & 1 deletion
@@ -12,6 +12,9 @@ specific language governing permissions and limitations under the License.
 
 # Metal Performance Shaders (MPS)
 
+> [!TIP]
+> Pipelines with a <img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22"> badge indicate a model can take advantage of the MPS backend on Apple silicon devices for faster inference. Feel free to open a [Pull Request](https://github.com/huggingface/diffusers/compare) to add this badge to pipelines that are missing it.
+
 🤗 Diffusers is compatible with Apple silicon (M1/M2 chips) using the PyTorch [`mps`](https://pytorch.org/docs/stable/notes/mps.html) device, which uses the Metal framework to leverage the GPU on MacOS devices. You'll need to have:
 
 - macOS computer with Apple silicon (M1/M2) hardware
@@ -37,7 +40,7 @@ image
 
 <Tip warning={true}>
 
-Generating multiple prompts in a batch can [crash](https://github.com/huggingface/diffusers/issues/363) or fail to work reliably. We believe this is related to the [`mps`](https://github.com/pytorch/pytorch/issues/84039) backend in PyTorch. While this is being investigated, you should iterate instead of batching.
+The PyTorch [mps](https://pytorch.org/docs/stable/notes/mps.html) backend does not support NDArray sizes greater than `2**32`. Please open an [Issue](https://github.com/huggingface/diffusers/issues/new/choose) if you encounter this problem so we can investigate.
 
 </Tip>
 
@@ -59,6 +62,10 @@ If you're using **PyTorch 1.13**, you need to "prime" the pipeline with an addit
 
 ## Troubleshoot
 
+This section lists some common issues with using the `mps` backend and how to solve them.
+
+### Attention slicing
+
 M1/M2 performance is very sensitive to memory pressure. When this occurs, the system automatically swaps if it needs to which significantly degrades performance.
 
 To prevent this from happening, we recommend *attention slicing* to reduce memory pressure during inference and prevent swapping. This is especially relevant if your computer has less than 64GB of system RAM, or if you generate images at non-standard resolutions larger than 512×512 pixels. Call the [`~DiffusionPipeline.enable_attention_slicing`] function on your pipeline:
@@ -72,3 +79,7 @@ pipeline.enable_attention_slicing()
 ```
 
 Attention slicing performs the costly attention operation in multiple steps instead of all at once. It usually improves performance by ~20% in computers without universal memory, but we've observed *better performance* in most Apple silicon computers unless you have 64GB of RAM or more.
+
+### Batch inference
+
+Generating multiple prompts in a batch can crash or fail to work reliably. If this is the case, try iterating instead of batching.
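
The "Batch inference" note added to `mps.md` recommends iterating over prompts rather than batching them. A minimal sketch of that workaround follows. `generate_one_by_one` is a hypothetical helper, not part of Diffusers, and it assumes the pipeline is callable with a single prompt and returns an object with an `images` list, as Diffusers image pipelines typically do.

```python
def generate_one_by_one(pipeline, prompts, **kwargs):
    """Run the pipeline once per prompt instead of passing the whole list."""
    images = []
    for prompt in prompts:
        # Each call receives a single prompt, so no multi-prompt batch is built.
        result = pipeline(prompt, **kwargs)
        # Assumption: the result exposes an `images` list.
        images.extend(result.images)
    return images
```

With a pipeline already moved to the `mps` device, this could be called as `generate_one_by_one(pipe, ["first prompt", "second prompt"])`. It trades throughput for reliability.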

examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py

Lines changed: 32 additions & 33 deletions
@@ -71,6 +71,7 @@
     convert_unet_state_dict_to_peft,
     is_wandb_available,
 )
+from diffusers.utils.hub_utils import load_or_create_model_card, populate_model_card
 from diffusers.utils.import_utils import is_xformers_available
 from diffusers.utils.torch_utils import is_compiled_module
 
@@ -101,7 +102,7 @@ def determine_scheduler_type(pretrained_model_name_or_path, revision):
 def save_model_card(
     repo_id: str,
     use_dora: bool,
-    images=None,
+    images: list = None,
     base_model: str = None,
     train_text_encoder=False,
     train_text_encoder_ti=False,
@@ -111,20 +112,17 @@ def save_model_card(
     repo_folder=None,
     vae_path=None,
 ):
-    img_str = "widget:\n"
     lora = "lora" if not use_dora else "dora"
-    for i, image in enumerate(images):
-        image.save(os.path.join(repo_folder, f"image_{i}.png"))
-        img_str += f"""
-        - text: '{validation_prompt if validation_prompt else ' ' }'
-          output:
-            url:
-                "image_{i}.png"
-        """
-    if not images:
-        img_str += f"""
-        - text: '{instance_prompt}'
-        """
+
+    widget_dict = []
+    if images is not None:
+        for i, image in enumerate(images):
+            image.save(os.path.join(repo_folder, f"image_{i}.png"))
+            widget_dict.append(
+                {"text": validation_prompt if validation_prompt else " ", "output": {"url": f"image_{i}.png"}}
+            )
+    else:
+        widget_dict.append({"text": instance_prompt})
     embeddings_filename = f"{repo_folder}_emb"
     instance_prompt_webui = re.sub(r"<s\d+>", "", re.sub(r"<s\d+>", embeddings_filename, instance_prompt, count=1))
     ti_keys = ", ".join(f'"{match}"' for match in re.findall(r"<s\d+>", instance_prompt))
@@ -169,23 +167,7 @@ def save_model_card(
 to trigger concept `{key}` → use `{tokens}` in your prompt \n
 """
 
-    yaml = f"""---
-tags:
-- stable-diffusion-xl
-- stable-diffusion-xl-diffusers
-- diffusers-training
-- text-to-image
-- diffusers
-- {lora}
-- template:sd-lora
-{img_str}
-base_model: {base_model}
-instance_prompt: {instance_prompt}
-license: openrail++
----
-"""
-
-    model_card = f"""
+    model_description = f"""
 # SDXL LoRA DreamBooth - {repo_id}
 
 <Gallery />
@@ -234,8 +216,25 @@ def save_model_card(
 
 {license}
 """
-    with open(os.path.join(repo_folder, "README.md"), "w") as f:
-        f.write(yaml + model_card)
+    model_card = load_or_create_model_card(
+        repo_id_or_path=repo_id,
+        from_training=True,
+        license="openrail++",
+        base_model=base_model,
+        prompt=instance_prompt,
+        model_description=model_description,
+        widget=widget_dict,
+    )
+    tags = [
+        "text-to-image",
+        "stable-diffusion-xl",
+        "stable-diffusion-xl-diffusers",
+        "text-to-image",
+        "diffusers",
+        lora,
+        "template:sd-lora",
+    ]
+    model_card = populate_model_card(model_card, tags=tags)
 
 
 def log_validation(
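
In this file the widget metadata is now assembled as a list of dictionaries instead of a hand-written YAML string. The sketch below isolates that branch logic in a hypothetical helper, `build_widget`. It takes image file names rather than PIL images so that it runs without touching the file system. It mirrors the structure shown in the diff and is not a function in the training script.

```python
def build_widget(image_names, validation_prompt, instance_prompt):
    """Build the model-card widget entries the way the new code path does."""
    widget = []
    if image_names is not None:
        for name in image_names:
            # One entry per validation image; a blank prompt falls back to " ".
            widget.append(
                {"text": validation_prompt if validation_prompt else " ", "output": {"url": name}}
            )
    else:
        # No images at all: show only the instance prompt.
        widget.append({"text": instance_prompt})
    return widget
```

As in the diff, an empty list yields an empty widget, whereas the removed `if not images:` check fell back to the instance prompt for an empty list.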
