Video
SD.Next supports video creation using the top-level Video tab
Support includes both T2V: text-to-video and I2V: image-to-video
Tip
Latest video models use LLMs for prompting and therefore require very long and descriptive prompts
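For example, instead of a short prompt like `a cat on a beach`, a descriptive prompt such as `a fluffy orange tabby cat walking slowly along a wet sandy beach at sunset, gentle waves rolling in, warm golden light, shallow depth of field, slow cinematic camera pan` typically gives much better results (illustrative example only, not tuned for any specific model)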
SD.Next supports the following models out-of-the-box:
- Hunyuan: HunyuanVideo, FastHunyuan, SkyReels | T2V, I2V
- WAN21: 1.3B, 14B | T2V, I2V
- LTXVideo: 0.9.0, 0.9.1, 0.9.5 | T2V, I2V
- CogVideoX: 2B, 5B | T2V, I2V
- Allegro: T2V
- Mochi1: T2V
- Latte1: T2V
Note
All models are auto-downloaded upon first use
Download location is controlled by the system paths -> huggingface folder setting
Tip
Each model may require a specific resolution or specific parameters to produce quality results
This also includes advanced parameters such as Sampler shift, which would normally not need tweaking in standard text-to-image workflows
See the individual model author's notes for recommended parameters
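As an illustration of what Sampler shift changes under the hood, here is a minimal diffusers-style sketch that recreates a flow-match scheduler with a different shift before sampling; the model id and shift value are placeholders, and it assumes the pipeline uses FlowMatchEulerDiscreteScheduler (in SD.Next itself this is just a parameter in the UI):

```python
import torch
from diffusers import LTXPipeline, FlowMatchEulerDiscreteScheduler

# placeholder model id; SD.Next normally downloads and loads models for you
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)

# recreate the scheduler with a different shift; a higher shift spends more
# denoising steps in the high-noise region, which many video models recommend
pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(pipe.scheduler.config, shift=7.0)
```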
Additional video models are available as individually selectable scripts in either text or image interfaces
- Stable Video Diffusion: Base, XT 1.0 and XT 1.1
- VGen
- AnimateDiff
SD.Next includes LoRA support for Hunyuan, LTX, WAN, Mochi, Cog
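SD.Next handles LoRA loading for you; for reference, the underlying diffusers call looks roughly like this sketch (repo id and adapter name are placeholders):

```python
import torch
from diffusers import LTXPipeline

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)

# load a LoRA and set its strength; repo id and adapter name are placeholders
pipe.load_lora_weights("some-user/some-ltx-video-lora", adapter_name="style")
pipe.set_adapters(["style"], adapter_weights=[0.8])
```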
Warning
Any use on GPUs with less than 16GB of VRAM and systems with less than 48GB of RAM is experimental
Enable offloading so model components can be moved in and out of VRAM as needed
Most models support all offloading types: Balanced, Model and Sequential
However, balanced offload may lead to device mismatch (CPU vs CUDA) errors with some models, in which case try the other offloading types
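For reference, Model and Sequential offload correspond roughly to the standard diffusers offload helpers shown below; Balanced offload is SD.Next's own scheme and is enabled in Settings rather than in code (this is a sketch, not SD.Next's implementation):

```python
import torch
from diffusers import LTXPipeline

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)

# Model offload: whole components (transformer, text encoder, VAE) are moved
# to the GPU only while they are needed
pipe.enable_model_cpu_offload()

# Sequential offload: submodules are streamed to the GPU layer by layer;
# slowest option, but with the lowest VRAM usage
# pipe.enable_sequential_cpu_offload()
```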
Enable on-the-fly quantization during load in Settings -> Quantization for additional memory savings
- BnB
- TorchAO
- Optimum-Quanto
You can enable quantization for the Transformer and the Text-Encoder together or separately
- Most T2V and I2V models support on-the-fly quantization of the transformer module
- Most T2V models support quantization of the text-encoder, while I2V models may not due to the inability to quantize image vectors
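As an illustration, on-the-fly BnB quantization of the transformer via plain diffusers looks roughly like this sketch (model id and quantization settings are examples only; in SD.Next this is configured in Settings -> Quantization, not in code):

```python
import torch
from diffusers import LTXPipeline, LTXVideoTransformer3DModel, BitsAndBytesConfig

# quantize the transformer to 4-bit NF4 while it is being loaded
quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = LTXVideoTransformer3DModel.from_pretrained(
    "Lightricks/LTX-Video", subfolder="transformer",
    quantization_config=quant, torch_dtype=torch.bfloat16,
)

# build the pipeline around the quantized transformer; the text encoder can be
# quantized the same way using the BitsAndBytesConfig from transformers
pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video", transformer=transformer, torch_dtype=torch.bfloat16,
)
```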
Instead of using the full VAE that is packaged with the model to decode the final frames, SD.Next also supports using a Tiny VAE or a Remote VAE to decode video
- Tiny VAE: support for Hunyuan, WAN, Mochi
- Remote VAE: support for Hunyuan
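The Remote VAE option offloads frame decoding to a hosted endpoint. A very rough sketch follows, assuming the remote_decode helper available in recent diffusers releases; its exact signature and the endpoint URL here are assumptions/placeholders:

```python
import torch
from diffusers.utils.remote_utils import remote_decode  # assumption: recent diffusers release

# dummy latents standing in for pipeline output produced with output_type="latent"
latents = torch.randn(1, 16, 9, 60, 90, dtype=torch.float16)

# decode on a hosted VAE endpoint instead of locally; the URL is a placeholder
video = remote_decode(endpoint="https://<remote-vae-endpoint>/", tensor=latents)
```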
SD.Next supports two types of optional processing acceleration:
- FasterCache: support for Hunyuan, Mochi, Latte, Allegro, Cog
- PyramidAttentionBroadcast: support for Hunyuan, Mochi, Latte, Allegro, Cog
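For reference, this is roughly how PyramidAttentionBroadcast is enabled through the diffusers cache API that SD.Next builds on (a sketch assuming a recent diffusers version; in SD.Next these accelerations are simple toggles):

```python
import torch
from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")

# skip recomputing spatial attention on a range of timesteps and broadcast
# (reuse) the cached attention output instead
config = PyramidAttentionBroadcastConfig(
    spatial_attention_block_skip_range=2,
    spatial_attention_timestep_skip_range=(100, 800),
    current_timestep_callback=lambda: pipe.current_timestep,
)
pipe.transformer.enable_cache(config)
```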
For all video modules, SD.Next supports adding interpolated frames to video for smoother output
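As a purely illustrative sketch of what frame interpolation does (SD.Next's actual interpolation model is more sophisticated than simple blending), inserting one blended frame between each pair of frames can be written as:

```python
import torch

def interpolate_linear(frames: torch.Tensor) -> torch.Tensor:
    """Insert one linearly blended frame between each pair of frames.

    frames: tensor of shape (T, C, H, W); returns shape (2*T - 1, C, H, W).
    Real interpolators (optical-flow or learned models) produce far better
    motion; this only illustrates the idea of adding in-between frames.
    """
    mids = (frames[:-1] + frames[1:]) / 2                # blend neighbouring frames
    out = torch.empty((frames.shape[0] * 2 - 1, *frames.shape[1:]), dtype=frames.dtype)
    out[0::2] = frames                                   # original frames on even indices
    out[1::2] = mids                                     # blended frames in between
    return out
```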