What I cannot create, I do not understand
― Richard Feynman
By 2021, we had witnessed the unprecedented feat of AI generating high-quality images and reshaping our digital world. We reached this point thanks to a cutting-edge method: the latent diffusion model, which builds on prior research on VAEs and diffusion models. Out of curiosity, this project implements the latent diffusion model from scratch. The VAE used in this project is VQ-VAE, and DDPM is chosen as the diffusion model. The model is trained on the PneumoniaMNIST dataset so that it can generate chest X-ray images from random noise. Generation is conditioned on a label: `normal` or `pneumonia`. Furthermore, to make the generated image more faithful to the label, we can adjust the classifier-free guidance scale.
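The guidance step can be illustrated with a minimal NumPy sketch of the usual classifier-free guidance combination. This is not the repository's code: `guided_noise`, `eps_uncond`, `eps_cond`, and `guidance_scale` are hypothetical names, the arrays stand in for U-Net noise predictions, and the parameterization (a scale of 1 recovers the conditional prediction) is one common convention that this project may or may not follow.

```python
import numpy as np

def guided_noise(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and label-conditioned noise predictions.

    guidance_scale = 0 gives the unconditional prediction,
    guidance_scale = 1 gives the conditional prediction,
    guidance_scale > 1 extrapolates further toward the label.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Stand-in arrays playing the role of two U-Net outputs (assumed shapes).
eps_uncond = np.zeros((1, 4, 4))
eps_cond = np.ones((1, 4, 4))
eps = guided_noise(eps_uncond, eps_cond, guidance_scale=3.0)
```

A larger scale typically pushes samples closer to the requested label at the cost of diversity, which is why it is exposed as a tunable value.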
You may use this notebook to synthesize a medical image (i.e., a chest X-ray) conditioned on a particular label: `normal`, `pneumonia`, or `None` (unconditional progressive generation).
This table presents the VQ-VAE's reconstruction performance on the test set, measured by the VQ-VAE loss and LPIPS.
| Test Metric | Score |
|---|---|
| Loss | 0.0070 |
| LPIPS | 0.2709 |
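The README does not spell out what the "VQ-VAE loss" contains. As a rough sketch, assuming the standard formulation from Neural Discrete Representation Learning (reconstruction term, codebook term, and a commitment term weighted by `beta`), the forward computation looks like the following. The function names, shapes, and `beta=0.25` are assumptions for illustration. NumPy has no stop-gradient, so the codebook and commitment terms are numerically identical here; in training they differ only in which side receives gradients.

```python
import numpy as np

def quantize(z_e, codebook):
    """Map each encoder vector to its nearest codebook entry.

    z_e: (N, D) encoder outputs, codebook: (K, D) embeddings.
    Returns the quantized vectors (N, D) and the chosen indices (N,).
    """
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    idx = dists.argmin(axis=1)
    return codebook[idx], idx

def vq_vae_loss(x, x_recon, z_e, z_q, beta=0.25):
    """Reconstruction + codebook + beta * commitment (forward value only)."""
    recon = np.mean((x - x_recon) ** 2)
    codebook_term = np.mean((z_q - z_e) ** 2)  # trained as ||sg[z_e] - e||^2
    commitment = np.mean((z_e - z_q) ** 2)     # trained as ||z_e - sg[e]||^2
    return recon + codebook_term + beta * commitment

# Toy example with a two-entry codebook (values are made up).
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z_e = np.array([[0.1, -0.1], [0.9, 1.2]])
z_q, idx = quantize(z_e, codebook)
x = np.ones((2, 3))
loss = vq_vae_loss(x, x, z_e, z_q)  # perfect reconstruction, so only VQ terms remain
```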
Loss curves of VQ-VAE on PneumoniaMNIST train and validation sets.
LPIPS curves of VQ-VAE on PneumoniaMNIST train and validation sets.
The image below exhibits the reconstruction quality of VQ-VAE.
Progressive noising and de-noising are applied to the latent image.
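A minimal NumPy sketch of the noising half of this process, using the closed-form DDPM forward step on a latent, is shown here. The linear schedule (1000 steps, betas from 1e-4 to 0.02) follows the DDPM paper and is an assumption about this project; `make_schedule`, `q_sample`, `predict_z0`, and the latent shape are hypothetical. In the real pipeline the noise estimate comes from the U-Net; here the true noise is reused only to show that the step is invertible.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and the cumulative products of (1 - beta)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def q_sample(z0, t, alpha_bars, rng):
    """Noise a clean latent z0 directly to timestep t."""
    noise = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bars[t]) * z0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return zt, noise

def predict_z0(zt, t, eps_pred, alpha_bars):
    """Invert q_sample given an estimate of the added noise."""
    return (zt - np.sqrt(1.0 - alpha_bars[t]) * eps_pred) / np.sqrt(alpha_bars[t])

rng = np.random.default_rng(0)
z0 = rng.standard_normal((4, 7, 7))  # stand-in latent; the shape is assumed
betas, alpha_bars = make_schedule()
zt, noise = q_sample(z0, 500, alpha_bars, rng)
z0_hat = predict_z0(zt, 500, noise, alpha_bars)
```

As `t` grows, `alpha_bars[t]` shrinks toward zero, so the latent approaches pure Gaussian noise; sampling runs the learned reverse of this, one step at a time.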
Loss curves of the modified U-Net on the latent images of the PneumoniaMNIST train and validation sets. The validation curve of the U-Net's EMA model is also shown.
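For readers unfamiliar with the EMA model, the sketch below shows the usual exponential-moving-average update applied to a parameter dictionary after each training step. It is only an illustration: `ema_update`, the dictionary layout, and the `decay` value are assumptions, not the repository's implementation.

```python
import numpy as np

def ema_update(ema_params, params, decay=0.999):
    """Move each EMA parameter a small step toward the current parameter."""
    for name, value in params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value
    return ema_params

# Toy parameters; a small decay makes the effect visible in two steps.
ema = {"w": np.zeros(3)}
params = {"w": np.ones(3)}
ema_update(ema, params, decay=0.9)  # w becomes 0.1
ema_update(ema, params, decay=0.9)  # w becomes 0.19
```

Because the averaged weights change slowly, the EMA copy often gives a smoother validation curve than the raw weights.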
The first and second rows show the generated images and their latent images, respectively.
Implementing Latent Diffusion Model From Scratch with $0
- Medical Image Generation Using Diffusion Model
- EEG Motor Imagery Classification Using CNN, Transformer, and MLP
- Small Molecular Graph Generation for Drug Discovery
If you think this repository is helpful for your research, you may cite it:
@misc{medical-latent-diffusion-model,
title = {Generating Medical Images with the Label-Conditioned Latent Diffusion Model (From Scratch)},
url = {https://github.com/reshalfahsi/medical-latent-diffusion-model},
author = {Resha Dwika Hefni Al-Fahsi},
}
- Diffusion Models
- pytorch-stable-diffusion
- Stable Diffusion Implementation in PyTorch
- High-Resolution Image Synthesis with Latent Diffusion Models
- MedMNIST
- The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
- Perceptual Similarity Metric and Dataset
- Denoising Diffusion Probabilistic Models
- Neural Discrete Representation Learning
- PyTorch Lightning