:loudspeaker: Don't forget to join the [Discord](https://huggingface.co/join/discord), where you can discuss the material and share what you've made in the `#diffusion-models-class` channel.
## Table of Contents
- [Faster Sampling via Distillation](#faster-sampling-via-distillation)
- [Training Improvements](#training-improvements)
- [More Control for Generation and Editing](#more-control-for-generation-and-editing)
- [Video](#video)
- [Audio](#audio)
- [New Architectures and Approaches - Towards 'Iterative Refinement'](#new-architectures-and-approaches---towards-iterative-refinement)
- [Hands-On Notebooks](#hands-on-notebooks)
- [Where Next?](#where-next)
## Faster Sampling via Distillation
Progressive distillation is a technique for taking an existing diffusion model and using it to train a new version of the model that requires fewer steps for inference. The 'student' model is initialized from the weights of the 'teacher' model. During training, the teacher model performs two sampling steps and the student model tries to match the resulting prediction in a single step. This process can be repeated multiple times, with the previous iteration's student model becoming the teacher for the next stage. The end result is a model that can produce decent samples in far fewer steps (typically 4 or 8) than the original teacher model. The core mechanism is illustrated in this diagram from the [paper that introduced the idea](http://arxiv.org/abs/2202.00512):
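The core training step can be sketched in a few lines. This is a deliberately toy setup (a linear "noise predictor" and an Euler-style update, not the paper's exact parameterization), just to show the teacher-two-steps / student-one-step matching:

```python
# Toy sketch of one progressive-distillation update. The teacher denoises in
# two half-steps; the student is trained to match that result in one step.
import numpy as np

rng = np.random.default_rng(0)
dim = 8
W_teacher = rng.normal(scale=0.1, size=(dim, dim))  # toy linear "noise predictor"
W_student = W_teacher.copy()                        # student initialized from teacher weights

def denoise_step(W, x, dt):
    # One Euler-style step: subtract dt times the model's predicted noise.
    return x - dt * x @ W.T

x_t = rng.normal(size=(16, dim))  # batch of noisy samples
dt = 0.5

# Teacher: two half-steps (this is the target the student must match)
x_mid = denoise_step(W_teacher, x_t, dt / 2)
target = denoise_step(W_teacher, x_mid, dt / 2)

initial_loss = np.mean((denoise_step(W_student, x_t, dt) - target) ** 2)

# Student: one full step, trained by gradient descent on the MSE to the target
lr = 0.1
for _ in range(200):
    pred = denoise_step(W_student, x_t, dt)
    err = pred - target
    grad = -dt * err.T @ x_t / len(x_t)  # gradient of the squared error (up to a constant)
    W_student -= lr * grad

final_loss = np.mean((denoise_step(W_student, x_t, dt) - target) ** 2)
```

In the real method the "model" is a full diffusion U-Net and the loss is computed across many noise levels, but the structure of each update is the same: the student's single step is regressed onto the teacher's two-step output.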
Key references:
- [RAVE2](https://github.com/acids-ircam/RAVE) - a new version of a Variational Auto-Encoder that will be useful for latent diffusion on audio tasks. This is used in the soon-to-be-announced [AudioLDM](https://twitter.com/LiuHaohe/status/1619119637660327936?s=20&t=jMkPWBFuAH19HI9m5Sklmg) model
- [Noise2Music](https://noise2music.github.io/) - a diffusion model trained to produce high-quality 30-second clips of audio based on text descriptions
- [Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models](https://text-to-audio.github.io/) - a diffusion model trained to generate diverse sounds based on text
- [Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion](https://arxiv.org/abs/2301.11757)

## New Architectures and Approaches - Towards 'Iterative Refinement'
## Hands-On Notebooks

TODO link table

We've covered a LOT of different ideas in this unit, many of which deserve much more detailed follow-on lessons in the future. For now, you can explore two of the many topics via the hands-on notebooks we've prepared:

- **DDIM Inversion** shows how a technique called inversion can be used to edit images using existing diffusion models
- **Diffusion for Audio** introduces the idea of spectrograms and shows a minimal example of fine-tuning an audio diffusion model on a specific genre of music
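The idea behind inversion can be sketched numerically. DDIM sampling is deterministic, so the same update rule can be run in reverse to recover the noisy latent that produces a given image; editing then works by tweaking the prompt or latent before re-sampling. The snippet below uses a stand-in noise predictor and made-up noise levels (illustrative only, not the notebook's code):

```python
# Toy illustration of DDIM inversion: run the deterministic DDIM update
# backwards to recover the latent that generated a sample.
import numpy as np

alpha_bar = {"t": 0.5, "s": 0.9}  # cumulative alphas at a noisier (t) and cleaner (s) step

def eps_model(x):
    # Stand-in noise predictor; a real model would be a trained U-Net.
    return np.ones_like(x) * 0.1

def ddim_step(x, a_from, a_to):
    """Deterministic DDIM update from noise level a_from to a_to."""
    eps = eps_model(x)
    x0_pred = (x - np.sqrt(1 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1 - a_to) * eps

x_t = np.random.default_rng(0).normal(size=4)
x_s = ddim_step(x_t, alpha_bar["t"], alpha_bar["s"])      # denoise: t -> s
x_t_rec = ddim_step(x_s, alpha_bar["s"], alpha_bar["t"])  # invert:  s -> t
# With a consistent noise prediction, the round trip recovers the original latent.
```

In practice the noise prediction changes slightly between steps, so inversion is approximate, which is exactly what the DDIM Inversion notebook explores.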
## Where Next?
This is the final unit of this course for now, which means that what comes next is up to you! Remember that you can always ask questions and chat about your projects on the Hugging Face [Discord](https://huggingface.co/join/discord). We look forward to seeing what you create 🤗