
VQGAN

This project is for personal learning and experimentation; it contains an implementation of VQGAN.

🔗 The code is primarily adapted from dome272's VQGAN-pytorch repository.

🎨 The dataset used is from This Anime Does Not Exist, which provides high-quality AI-generated anime portraits. These synthetic images are used for training and evaluation purposes in this project.

Requirements

  • Python
  • PyTorch
  • A CUDA-enabled GPU

Results

📊 Below is a comparison between my model's results and those of the original repository. Although I used exactly the same hyperparameters, my results are noticeably worse; I believe the primary reason is the quality of the dataset.

PS: The dataset I downloaded is actually quite noisy—but I really just wanted an anime dataset, haha.

First Stage (Reconstruction):

Epoch 50 (Dome272's VQGAN):

image-1

Epoch 46 (My VQGAN):

image-2

image-3

Second Stage (Generating new Images):

Layout: Original (left) | Reconstruction (middle left) | Completion (middle right) | New image (right)

Epoch 100 (Dome272's VQGAN):

image-4

Epoch 20 (My VQGAN):

image-5

Train VQGAN:

First Stage

```shell
python train_vqgan.py
```
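The first stage trains the VQGAN autoencoder, whose core operation is the codebook lookup: each encoder output vector is replaced by its nearest codebook entry. A minimal NumPy sketch of that quantization step (codebook size and embedding dimension here are illustrative, not the repo's actual hyperparameters):

```python
import numpy as np

def quantize(z, codebook):
    """Map each latent vector in z (N, D) to its nearest codebook entry (K, D).

    Returns the quantized vectors and their codebook indices.
    """
    # Squared Euclidean distance from every latent to every codebook entry
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # shape (N, K)
    idx = d.argmin(axis=1)                                     # nearest entry per latent
    return codebook[idx], idx

# Toy example: latents built as small perturbations of known codebook entries
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 3))
z = codebook[[2, 5, 5, 0]] + 0.01 * rng.normal(size=(4, 3))
zq, idx = quantize(z, codebook)
# idx recovers the entries the latents were built from: [2, 5, 5, 0]
```

In the actual training loop this lookup is non-differentiable, which is why VQGAN uses a straight-through gradient estimator plus codebook and commitment losses; the sketch above only shows the forward lookup.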

Second Stage

```shell
python train_transformer_vqgan.py
```
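The second stage trains a transformer to model sequences of codebook indices autoregressively; new images are then generated by sampling indices one at a time and decoding the resulting grid with the first-stage decoder. A toy sketch of that sampling loop, with a stand-in uniform predictor in place of the repo's transformer:

```python
import numpy as np

def sample_indices(predict_next, seq_len, vocab_size, rng):
    """Autoregressively sample a sequence of codebook indices.

    predict_next(seq) must return a probability vector of length vocab_size
    for the next index given the indices sampled so far.
    """
    seq = []
    for _ in range(seq_len):
        p = predict_next(seq)
        seq.append(int(rng.choice(vocab_size, p=p)))
    return seq

def uniform_model(seq):
    # Stand-in "model": uniform distribution over a codebook of 16 entries
    return np.full(16, 1 / 16)

rng = np.random.default_rng(0)
tokens = sample_indices(uniform_model, seq_len=256, vocab_size=16, rng=rng)
# tokens would then be reshaped to a 16x16 latent grid and fed to the VQGAN decoder
```

Image completion (the "Completion" column above) works the same way, except the sequence is seeded with the indices of the kept image region instead of starting empty.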

Citation

```bibtex
@misc{esser2021taming,
      title={Taming Transformers for High-Resolution Image Synthesis},
      author={Patrick Esser and Robin Rombach and Björn Ommer},
      year={2021},
      eprint={2012.09841},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
