Video
SD.Next supports video creation using the top-level Video tab
Support includes both T2V: text-to-video and I2V: image-to-video
Tip
Latest video models use LLMs for prompting and therefore require very long and descriptive prompts
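For example, instead of a short prompt like `a cat on a beach`, a descriptive prompt such as `a fluffy orange tabby cat walking slowly along a wet sandy beach at sunset, gentle waves rolling in, warm golden light, shallow depth of field, slow cinematic camera pan` typically gives much better results (illustrative example only, not tuned for any specific model)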
SD.Next supports the following models out-of-the-box:
- Hunyuan: HunyuanVideo, FastHunyuan, SkyReels | T2V, I2V
- WAN21: 1.3B, 14B | T2V, I2V
- LTXVideo: 0.9.0, 0.9.1, 0.9.5 | T2V, I2V
- CogVideoX: 2B, 5B | T2V, I2V
- Allegro: T2V
- Mochi1: T2V
- Latte1: T2V
Note
All models are auto-downloaded upon first use
Download location is controlled by the system paths -> huggingface folder setting
Tip
Each model may require a specific resolution or specific parameters to produce quality results
This also includes advanced parameters such as Sampler shift, which would normally not need tweaking in standard text-to-image workflows
See the individual model author's notes for recommended parameters
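As an illustration of what Sampler shift changes under the hood, here is a minimal diffusers-style sketch that recreates a flow-match scheduler with a different shift before sampling; the model id and shift value are placeholders, and it assumes the pipeline uses FlowMatchEulerDiscreteScheduler (in SD.Next itself this is just a parameter in the UI):

```python
import torch
from diffusers import LTXPipeline, FlowMatchEulerDiscreteScheduler

# placeholder model id; SD.Next normally downloads and loads models for you
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)

# recreate the scheduler with a different shift; a higher shift spends more
# denoising steps in the high-noise region, which many video models recommend
pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(pipe.scheduler.config, shift=7.0)
```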
Additional video models are available as individually selectable scripts in either text or image interfaces
- Stable Video Diffusion: Base, XT 1.0 and XT 1.1
- VGen
- AnimateDiff
SD.Next includes LoRA support for Hunyuan, LTX, WAN, Mochi, Cog
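SD.Next handles LoRA loading for you; for reference, the underlying diffusers call looks roughly like this sketch (repo id and adapter name are placeholders):

```python
import torch
from diffusers import LTXPipeline

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)

# load a LoRA and set its strength; repo id and adapter name are placeholders
pipe.load_lora_weights("some-user/some-ltx-video-lora", adapter_name="style")
pipe.set_adapters(["style"], adapter_weights=[0.8])
```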
Warning
Any use on GPUs with less than 16GB of VRAM and systems with less than 48GB of RAM is experimental
Enable offloading so model components can be moved in and out of VRAM as needed
Most models support all offloading types: Balanced, Model and Sequential
However, balanced offload may lead to device mismatch (CPU vs CUDA) errors with some models, in which case try the other offloading types
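For reference, Model and Sequential offload correspond roughly to the standard diffusers offload helpers shown below; Balanced offload is SD.Next's own scheme and is enabled in Settings rather than in code (this is a sketch, not SD.Next's implementation):

```python
import torch
from diffusers import LTXPipeline

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)

# Model offload: whole components (transformer, text encoder, VAE) are moved
# to the GPU only while they are needed
pipe.enable_model_cpu_offload()

# Sequential offload: submodules are streamed to the GPU layer by layer;
# slowest option, but with the lowest VRAM usage
# pipe.enable_sequential_cpu_offload()
```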
Enable on-the-fly quantization during load in Settings -> Quantization for additional memory savings
- BnB
- TorchAO
- Optimum-Quanto
You can enable quantization for the Transformer and the Text-Encoder together or separately
- Most T2V and I2V models support on-the-fly quantization of the transformer module
- Most T2V models support quantization of the text-encoder, while I2V models may not due to the inability to quantize image vectors
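As an illustration, on-the-fly BnB quantization of the transformer via plain diffusers looks roughly like this sketch (model id and quantization settings are examples only; in SD.Next this is configured in Settings -> Quantization, not in code):

```python
import torch
from diffusers import LTXPipeline, LTXVideoTransformer3DModel, BitsAndBytesConfig

# quantize the transformer to 4-bit NF4 while it is being loaded
quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = LTXVideoTransformer3DModel.from_pretrained(
    "Lightricks/LTX-Video", subfolder="transformer",
    quantization_config=quant, torch_dtype=torch.bfloat16,
)

# build the pipeline around the quantized transformer; the text encoder can be
# quantized the same way using the BitsAndBytesConfig from transformers
pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video", transformer=transformer, torch_dtype=torch.bfloat16,
)
```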
Instead of using the full VAE that is packaged with the model to decode the final frames, SD.Next also supports using a Tiny VAE or a Remote VAE to decode video
- Tiny VAE: support for Hunyuan, WAN, Mochi
- Remote VAE: support for Hunyuan
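The Remote VAE option offloads frame decoding to a hosted endpoint. A very rough sketch follows, assuming the remote_decode helper available in recent diffusers releases; its exact signature and the endpoint URL here are assumptions/placeholders:

```python
import torch
from diffusers.utils.remote_utils import remote_decode  # assumption: recent diffusers release

# dummy latents standing in for pipeline output produced with output_type="latent"
latents = torch.randn(1, 16, 9, 60, 90, dtype=torch.float16)

# decode on a hosted VAE endpoint instead of locally; the URL is a placeholder
video = remote_decode(endpoint="https://<remote-vae-endpoint>/", tensor=latents)
```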
SD.Next supports two types of optional processing acceleration:
- FasterCache: support for Hunyuan, Mochi, Latte, Allegro, Cog
- PyramidAttentionBroadcast: support for Hunyuan, Mochi, Latte, Allegro, Cog
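For reference, this is roughly how PyramidAttentionBroadcast is enabled through the diffusers cache API that SD.Next builds on (a sketch assuming a recent diffusers version; in SD.Next these accelerations are simple toggles):

```python
import torch
from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")

# skip recomputing spatial attention on a range of timesteps and broadcast
# (reuse) the cached attention output instead
config = PyramidAttentionBroadcastConfig(
    spatial_attention_block_skip_range=2,
    spatial_attention_timestep_skip_range=(100, 800),
    current_timestep_callback=lambda: pipe.current_timestep,
)
pipe.transformer.enable_cache(config)
```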
For all video modules, SD.Next supports adding interpolated frames to video for smoother output
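As a purely illustrative sketch of what frame interpolation does (SD.Next's actual interpolation model is more sophisticated than simple blending), inserting one blended frame between each pair of frames can be written as:

```python
import torch

def interpolate_linear(frames: torch.Tensor) -> torch.Tensor:
    """Insert one linearly blended frame between each pair of frames.

    frames: tensor of shape (T, C, H, W); returns shape (2*T - 1, C, H, W).
    Real interpolators (optical-flow or learned models) produce far better
    motion; this only illustrates the idea of adding in-between frames.
    """
    mids = (frames[:-1] + frames[1:]) / 2                # blend neighbouring frames
    out = torch.empty((frames.shape[0] * 2 - 1, *frames.shape[1:]), dtype=frames.dtype)
    out[0::2] = frames                                   # original frames on even indices
    out[1::2] = mids                                     # blended frames in between
    return out
```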