
inferense/cvqvae


CVQVAE (Conditional Vector-Quantized Variational Autoencoder) for text-to-image synthesis.

PyTorch implementation of a conditional VQ-VAE-2 for generating high-fidelity multi-object images from text captions.

Original paper: Generating Diverse High-Fidelity Images with VQ-VAE-2

This implementation is optimized for the MS-COCO dataset (Captions 2014). It currently supports a hierarchical VQ-VAE and a PixelSNAIL prior.
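At the core of any VQ-VAE, the encoder's continuous outputs are snapped to their nearest entries in a learned codebook, producing the discrete codes that the PixelSNAIL prior is later trained on. A minimal NumPy sketch of that quantization step (the shapes and names here are illustrative, not taken from this repository):

```python
import numpy as np

def quantize(z, codebook):
    """Replace each row of z with its nearest codebook entry (L2 distance).

    z:        (N, D) encoder outputs
    codebook: (K, D) learned embedding vectors
    Returns the quantized vectors and their codebook indices.
    """
    # Pairwise squared distances between encodings and codebook entries: (N, K)
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)          # nearest codebook index per vector
    return codebook[idx], idx

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))      # K=8 codes of dimension D=4
z = rng.normal(size=(5, 4))             # N=5 encoder output vectors
zq, idx = quantize(z, codebook)         # zq: (5, 4), idx: (5,)
```

In the hierarchical (VQ-VAE-2) setting this lookup is applied at both the top and bottom latent levels; during training, gradients are passed through the quantizer with a straight-through estimator.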

The code was ported from a Jupyter notebook.

Credits: the vqvae_prior.py code is adapted from kamenbliznashki.

Prerequisites

  • MS-COCO Captions dataset, downloaded locally
  • PyTorch >= 1.6
  • GPU environment — PixelSNAIL (vqvae_prior.py) is expensive to train, especially on high-resolution images

Usage

  1. Train the VQ-VAE (vqvae.py)
  2. Extract the discrete latent codes from the trained VQ-VAE
  3. Train the PixelSNAIL prior (vqvae_prior.py) on the extracted codes
  4. Sample images conditioned on captions
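The steps above might look like the following; the exact script flags are hypothetical and depend on each script's argument parser, so check the source before running:

```shell
# Hypothetical invocations — adjust flags to match the scripts' actual arguments.
python vqvae.py --data_dir ./coco               # 1. train the hierarchical VQ-VAE
python vqvae.py --extract --checkpoint vqvae.pt # 2. extract discrete latent codes
python vqvae_prior.py --codes codes.pt          # 3. train the PixelSNAIL prior
python vqvae_prior.py --sample --prior prior.pt # 4. sample images from captions
```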
