Pix2Pix - Conditional GAN for Segmented to Real Image Translation

In this project, we use a conditional GAN architecture to translate segmented images into realistic street-view images on the Cityscapes dataset.

Results

Here are some examples from the validation set.

[result image]

Discriminator and generator losses on the training and validation sets over 200 epochs.

[loss curve image]

Loss function

For more details see the notebook.

Discriminator

The discriminator consists of a sequence of Convolution-BatchNorm-ReLU blocks. Rather than classifying the whole image as real or fake, it classifies individual patches, and the final output of the discriminator is the average of these patch predictions. This discriminator architecture is known as PatchGAN.
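A minimal sketch of such a PatchGAN discriminator in PyTorch, assuming 3-channel segmented and real images concatenated along the channel axis (layer widths here are illustrative, not necessarily the repo's exact configuration):

```python
import torch
import torch.nn as nn

def block(in_ch, out_ch, stride=2, norm=True):
    """One Convolution-BatchNorm-LeakyReLU block (kernel size 4)."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=stride, padding=1)]
    if norm:
        layers.append(nn.BatchNorm2d(out_ch))
    layers.append(nn.LeakyReLU(0.2))
    return layers

discriminator = nn.Sequential(
    *block(6, 64, norm=False),      # first block: no BatchNorm
    *block(64, 128),
    *block(128, 256),
    *block(256, 512, stride=1),
    nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),  # one logit per patch
)

x = torch.randn(1, 6, 256, 256)   # segmented + real image pair, concatenated
patch_logits = discriminator(x)   # a grid of per-patch logits, e.g. (1, 1, 30, 30)
score = patch_logits.mean()       # final output: average over all patches
```

Each spatial position in `patch_logits` judges one receptive-field patch of the input pair, which is what makes the classification patch-wise rather than image-wise.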

The discriminator loss is divided by 2 to slow down the learning of D relative to G.
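A sketch of that discriminator objective, assuming BCE-with-logits on the patch outputs (the `pred_real` and `pred_fake` tensors below are hypothetical stand-ins for D's outputs on real and generated pairs):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

# Stand-ins for the discriminator's patch logits on real and generated pairs.
pred_real = torch.randn(1, 1, 30, 30)
pred_fake = torch.randn(1, 1, 30, 30)

loss_real = bce(pred_real, torch.ones_like(pred_real))    # real pairs -> label 1
loss_fake = bce(pred_fake, torch.zeros_like(pred_fake))   # fake pairs -> label 0
d_loss = 0.5 * (loss_real + loss_fake)  # halved to slow D's learning vs. G
```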

Generator

In this framework, the generator has a U-Net architecture. Note that all convolution blocks have kernel size 4 and there is no pooling layer. Feeding random noise z to the generator is not effective in this setting, so dropout with probability 0.5 is used instead to add stochasticity. Consequently, dropout is not turned off at inference time.
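The dropout-as-noise idea can be sketched with a single U-Net decoder block (names and channel sizes here are illustrative): keeping the module in training mode at test time leaves dropout active, so repeated forward passes on the same input give different outputs.

```python
import torch
import torch.nn as nn

# One hypothetical U-Net up-sampling block: kernel size 4, no pooling,
# dropout p=0.5 as the source of stochasticity in place of a noise vector z.
up_block = nn.Sequential(
    nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1),
    nn.BatchNorm2d(256),
    nn.Dropout(0.5),
    nn.ReLU(),
)

up_block.train()  # keep dropout active even when generating test images

x = torch.randn(1, 512, 8, 8)
y1 = up_block(x)
y2 = up_block(x)  # differs from y1 because dropout stays on
```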

The effect of the L1 loss on the generator is controlled by the hyperparameter lambda. It is also common to optimize -log(D) instead of log(1 - D), because it provides stronger gradients early in training.
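A sketch of the generator objective under these choices, assuming lambda = 100 as in the original paper (the tensors below are hypothetical stand-ins). Labeling generated patches as "real" in BCE-with-logits is exactly the -log(D) trick:

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
lam = 100.0  # weight of the L1 term, as in the pix2pix paper

pred_fake = torch.randn(1, 1, 30, 30)  # D's patch logits on a generated pair
fake = torch.randn(1, 3, 256, 256)     # generated image (stand-in)
real = torch.randn(1, 3, 256, 256)     # ground-truth image (stand-in)

# Target = 1 for fake patches implements -log(D(G(x))) instead of log(1 - D(G(x))).
gan_loss = bce(pred_fake, torch.ones_like(pred_fake))
g_loss = gan_loss + lam * l1(fake, real)
```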

References

  1. Official PyTorch Pix2Pix Implementation: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/
  2. Original Paper of Pix2Pix: https://arxiv.org/pdf/1611.07004
  3. Kaggle Notebook: https://www.kaggle.com/code/mohammadshafizd/pix2pix-conditional-gan-in-cityscapes
  4. Medium Post: https://medium.com/@mohammadshafizd/pix2pix-in-cityscapes-dataset-e4d743b595b6