A PyTorch implementation of diffusion models built from scratch, featuring:
- Clean, modular codebase
- DDPM (Denoising Diffusion Probabilistic Models)
- DDIM (Denoising Diffusion Implicit Models)
- (More to come...)
- UNet
- DiT (Diffusion Transformer)
- UViT (U-Vision Transformer)
- MNIST
- CIFAR-10
- (More to come...)
- Diffusion process
- Results on the MNIST dataset
- Results on the CIFAR-10 dataset
- Loss
- Introduce cross-attention to fuse image features with class-label information
- Implement inference code
- Experiment with different model architectures, e.g., UNet, DiT, and UViT
- Experiment with different noise schedulers, e.g., DDPM, DDIM, and DPM-Solver
- Add evaluation metrics such as FID and CLIP score
- Experiment with different datasets, e.g., MNIST, CIFAR-10, and text2image datasets
- Experiment with different training methods, e.g., LoRA
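
For orientation, the DDPM and DDIM schedulers listed above can be illustrated with a minimal, framework-free sketch. It is not this repository's code: the function names (`linear_beta_schedule`, `cumulative_alphas`, `q_sample`, `ddim_timesteps`) are hypothetical, plain Python lists stand in for PyTorch tensors, and the defaults (1000 steps, betas from 1e-4 to 0.02) are assumed from the linear schedule in the DDPM paper.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed defaults)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in closed form.

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise,
    with noise ~ N(0, I). Returns (x_t, noise); the noise is the
    regression target when training a noise-prediction model.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * e for x, e in zip(x0, noise)]
    return x_t, noise


def ddim_timesteps(num_train_steps, num_sample_steps):
    """Evenly strided subsequence of training timesteps, in descending order.

    DDIM-style samplers visit only these steps instead of all T.
    """
    stride = num_train_steps // num_sample_steps
    return [i * stride for i in reversed(range(num_sample_steps))]
```

Calling `q_sample(x0, t, cumulative_alphas(linear_beta_schedule()), random.Random(0))` noises a flattened image `x0` to step `t` in one shot, and `ddim_timesteps(1000, 50)` yields the 50 steps a strided sampler would walk through. A tensor-based version would typically replace the loops with `torch.cumprod` and `torch.randn_like`.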