Conditional Generative Adversarial Networks (cGANs) are an extension of standard GANs in which the generated image is conditioned on some additional information. This condition lets us control what output the generator produces. Pix2Pix is an image-to-image translation model based on the conditional GAN, where the generated target image is conditioned on a given input image. In this project, the Pix2Pix model has been implemented from scratch in PyTorch. The aim is to generate street-map images from corresponding satellite images using the Pix2Pix model. The implementation follows the original Pix2Pix paper.
The dataset is publicly available on Kaggle and can be downloaded from here [https://www.kaggle.com/datasets/vikramtiwari/pix2pix-dataset]
It consists of satellite images along with their ground-truth street-map images
The dataset is divided into train and validation folders, each containing 1096 images
Example of a satellite image and its corresponding ground-truth street-map image
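A minimal sketch of loading such a paired sample, assuming each dataset file stores the satellite image and the street map side by side in one wide image (the class name `MapDataset` and the normalization to [-1, 1] are illustrative choices, not necessarily what the repo's code does):

```python
import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


class MapDataset(Dataset):
    """Paired dataset sketch: assumes each file holds the satellite image
    (left half) and the street map (right half) side by side."""

    def __init__(self, root_dir):
        self.root_dir = root_dir
        self.files = sorted(os.listdir(root_dir))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        img = np.array(Image.open(os.path.join(self.root_dir, self.files[idx])))
        w = img.shape[1] // 2
        satellite, street_map = img[:, :w, :], img[:, w:, :]
        # channels-first float tensors scaled to [-1, 1] (matches a Tanh output)
        to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1).float() / 127.5 - 1.0
        return to_tensor(satellite), to_tensor(street_map)
```

Splitting on the width midpoint yields one 256x256 satellite/street-map pair per file.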
Here a conditional generative adversarial network is implemented, where the generated images are conditioned on some input; in this case the condition is the input image itself
The figure above shows the conditional GAN architecture. Here y is the condition on which both the generator and the discriminator are conditioned. In this case y is the input image (the satellite image) itself.
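In practice, conditioning on an image is done by channel-wise concatenation: the discriminator sees the condition y stacked with either the real or the generated street map. A tiny sketch (tensor names are illustrative):

```python
import torch

satellite = torch.randn(1, 3, 256, 256)   # condition y
street_map = torch.randn(1, 3, 256, 256)  # real or generated street map
# stacking along the channel dimension gives the discriminator a 6-channel input
disc_input = torch.cat([satellite, street_map], dim=1)
print(disc_input.shape)  # torch.Size([1, 6, 256, 256])
```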
The generator follows a U-Net architecture: its input is an RGB image, and it generates another RGB image of the same shape
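The shape-preserving encoder-decoder idea can be sketched with a deliberately shallow U-Net (the real generator has more down/up blocks; `TinyUNet` and its layer sizes are illustrative only):

```python
import torch
import torch.nn as nn


class TinyUNet(nn.Module):
    """Illustrative, shallower-than-paper U-Net: down blocks halve the spatial
    size, up blocks double it and concatenate the matching skip connection."""

    def __init__(self, ch=3, f=64):
        super().__init__()
        self.d1 = nn.Sequential(nn.Conv2d(ch, f, 4, 2, 1), nn.LeakyReLU(0.2))
        self.d2 = nn.Sequential(nn.Conv2d(f, f * 2, 4, 2, 1), nn.BatchNorm2d(f * 2), nn.LeakyReLU(0.2))
        self.d3 = nn.Sequential(nn.Conv2d(f * 2, f * 4, 4, 2, 1), nn.BatchNorm2d(f * 4), nn.LeakyReLU(0.2))
        self.u1 = nn.Sequential(nn.ConvTranspose2d(f * 4, f * 2, 4, 2, 1), nn.BatchNorm2d(f * 2), nn.ReLU())
        self.u2 = nn.Sequential(nn.ConvTranspose2d(f * 4, f, 4, 2, 1), nn.BatchNorm2d(f), nn.ReLU())
        self.u3 = nn.Sequential(nn.ConvTranspose2d(f * 2, ch, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        s1 = self.d1(x)                         # 256 -> 128
        s2 = self.d2(s1)                        # 128 -> 64
        b = self.d3(s2)                         # 64 -> 32 (bottleneck)
        u = self.u1(b)                          # 32 -> 64
        u = self.u2(torch.cat([u, s2], 1))      # 64 -> 128, skip from d2
        return self.u3(torch.cat([u, s1], 1))   # 128 -> 256, RGB in [-1, 1]
```

The Tanh output keeps generated pixels in [-1, 1], matching inputs normalized to that range.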
The discriminator is a PatchGAN, which outputs a 30x30 matrix where each cell classifies whether the corresponding patch of the image is real or generated
The figure above shows the U-Net architecture, where downsampling is followed by upsampling, with skip connections concatenated in from the corresponding downsampling layers
The figure above shows the PatchGAN architecture, whose output is 30x30
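A sketch of a PatchGAN discriminator that produces the 30x30 grid for a 256x256 input pair (layer widths follow the paper's 70x70 PatchGAN; the class name and BatchNorm choice are assumptions, since some implementations use InstanceNorm):

```python
import torch
import torch.nn as nn


class PatchDiscriminator(nn.Module):
    """PatchGAN sketch: for a 256x256 satellite/street-map pair the output is
    a 30x30 grid of real/fake scores, one per receptive-field patch."""

    def __init__(self, in_ch=6, f=64):  # 6 = satellite + street-map channels
        super().__init__()

        def block(cin, cout, stride, norm=True):
            layers = [nn.Conv2d(cin, cout, 4, stride, 1)]
            if norm:
                layers.append(nn.BatchNorm2d(cout))
            layers.append(nn.LeakyReLU(0.2))
            return layers

        self.net = nn.Sequential(
            *block(in_ch, f, 2, norm=False),  # 256 -> 128
            *block(f, f * 2, 2),              # 128 -> 64
            *block(f * 2, f * 4, 2),          # 64  -> 32
            *block(f * 4, f * 8, 1),          # 32  -> 31
            nn.Conv2d(f * 8, 1, 4, 1, 1),     # 31  -> 30, one score per patch
        )

    def forward(self, x, y):
        # condition and candidate image are concatenated channel-wise
        return self.net(torch.cat([x, y], dim=1))
```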
Along with the normal adversarial loss, training the generator uses an added L1 loss term that measures how close the generated street map is to the actual one
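The combined generator objective can be sketched as follows (the tensors here are random stand-ins; in the actual training code they come from the networks, and `L1_LAMBDA = 100` matches the hyperparameter list below):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # adversarial term on the raw 30x30 patch scores
l1 = nn.L1Loss()
L1_LAMBDA = 100

d_fake = torch.randn(16, 1, 30, 30)       # discriminator scores on generated maps
fake_map = torch.randn(16, 3, 256, 256)   # generator output
real_map = torch.randn(16, 3, 256, 256)   # ground-truth street map

# the generator wants every patch judged real, plus pixel-wise closeness to truth
g_loss = bce(d_fake, torch.ones_like(d_fake)) + L1_LAMBDA * l1(fake_map, real_map)
```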
LEARNING_RATE = 2e-4
BATCH_SIZE = 16
NUM_WORKERS = 2
IMAGE_SIZE = 256
CHANNELS_IMG = 3
L1_LAMBDA = 100
LAMBDA_GP = 10
NUM_EPOCHS = 500
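Using these hyperparameters, one training iteration can be sketched like this (tiny conv stand-ins replace the real networks, random tensors replace the data loader, and Adam with beta1 = 0.5 follows the paper; details in train.py may differ):

```python
import torch
import torch.nn as nn

# stand-in modules so the sketch is self-contained
gen = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())
disc = nn.Conv2d(6, 1, 4, stride=2, padding=1)  # outputs a patch grid
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4, betas=(0.5, 0.999))

x = torch.randn(2, 3, 256, 256)  # satellite batch (random stand-in data)
y = torch.randn(2, 3, 256, 256)  # street-map batch

# discriminator step: real pair scored as real, generated pair as fake
fake = gen(x)
d_real = disc(torch.cat([x, y], 1))
d_fake = disc(torch.cat([x, fake.detach()], 1))
d_loss = (bce(d_real, torch.ones_like(d_real)) +
          bce(d_fake, torch.zeros_like(d_fake))) / 2
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# generator step: fool the discriminator, plus L1 to the ground truth
d_fake = disc(torch.cat([x, fake], 1))
g_loss = bce(d_fake, torch.ones_like(d_fake)) + 100 * l1(fake, y)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Detaching `fake` in the discriminator step keeps that update from backpropagating into the generator.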
1) Download the dataset and place it in the data folder
2) Execute train.py; with the above-mentioned hyperparameters the model will be trained, and evaluation pictures will also be generated after every 100 epochs
Some of the generated results are shown below
Satellite Image | Ground Truth | Generated Image
Pix2Pix paper: https://arxiv.org/abs/1611.07004
YouTube playlist: https://www.youtube.com/playlist?list=PLhhyoLH6IjfwIp8bZnzX8QR30TRcHO8Va