Sketch2Face is a deep learning model that generates realistic human face images from hand-drawn or computer-generated sketches. This implementation uses a conditional GAN architecture based on pix2pix, a standard approach to paired image-to-image translation.
The model was trained on a dataset of paired facial photos and corresponding sketches. The dataset preparation script combines photos and sketches side by side to create training samples.
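As a rough illustration of the side-by-side pairing, the sketch below joins two equally sized images along the width axis and splits them apart again at load time. It is not the project's preparation script: the function names are hypothetical, the sketch-left/photo-right order is an assumption, and nested lists stand in for real image arrays to keep the example dependency-free.

```python
# Images are modeled as lists of rows; each row is a list of pixel values.
# In a real pipeline these would be arrays loaded with an imaging library.

def combine_side_by_side(left, right):
    """Concatenate two images of equal height along the width axis."""
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def split_pair(combined):
    """Split a combined sample back into its left and right halves."""
    half = len(combined[0]) // 2
    left = [row[:half] for row in combined]
    right = [row[half:] for row in combined]
    return left, right


# Assumption: sketch on the left, photo on the right.
sketch = [[0, 0], [255, 255]]
photo = [[10, 20], [30, 40]]
sample = combine_side_by_side(sketch, photo)
```

Storing each pair as one image keeps the sketch and its target photo aligned on disk, so the loader only needs to cut the sample at the midpoint.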
The model uses a conditional GAN architecture with two networks.
Generator:
- Encoder-decoder architecture with skip connections
- 8 downsampling layers and 7 upsampling layers
- Batch normalization and dropout for regularization
Discriminator:
- Classifies patches of the image as real or fake
- 4 downsampling layers
- Batch normalization and leaky ReLU activations
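The asymmetry between 8 downsampling and 7 upsampling layers is easier to see by tracing spatial sizes. The sketch below assumes a 256x256 input, stride-2 layers throughout, and a final stride-2 transposed convolution as the output layer, as in the TensorFlow Pix2Pix tutorial; none of these are stated for this project, and `trace_generator_sizes` is a hypothetical helper.

```python
def trace_generator_sizes(input_size=256, num_down=8, num_up=7):
    """Return the spatial size after each stride-2 layer of the generator."""
    sizes = [input_size]
    for _ in range(num_down):
        sizes.append(sizes[-1] // 2)  # each downsampling layer halves H and W
    for _ in range(num_up):
        sizes.append(sizes[-1] * 2)   # each upsampling layer doubles H and W
    # Assumption: a final stride-2 transposed convolution produces the output image.
    sizes.append(sizes[-1] * 2)
    return sizes


sizes = trace_generator_sizes()
bottleneck = sizes[8]        # size after the last downsampling layer
before_output = sizes[-2]    # size after the last upsampling layer
output_size = sizes[-1]
```

Under these assumptions the encoder reduces the image to a 1x1 bottleneck, the seven upsampling layers bring it back to 128x128, and the output layer supplies the last doubling to 256x256.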
To train the model:
python train.py
Training parameters:
- Learning rate: 2e-4
- Batch size: 1
- L1 lambda: 100
- Optimizer: Adam (beta1=0.5)
- Random jittering for data augmentation
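For paired data, jittering has to apply the same random transform to the sketch and the photo, or the pair stops lining up. The sketch below shows that idea with a shared random crop and horizontal flip. It is an illustration only: the TensorFlow Pix2Pix tutorial first resizes to 286x286 and then crops to 256x256, but whether this project uses those sizes is an assumption, the resize step is omitted here, and `random_jitter` is a hypothetical name.

```python
import random


def random_jitter(sketch, photo, crop_size, rng=random):
    """Apply one shared random crop and horizontal flip to both images."""
    height, width = len(sketch), len(sketch[0])
    top = rng.randint(0, height - crop_size)    # inclusive bounds
    left = rng.randint(0, width - crop_size)
    flip = rng.random() < 0.5

    def transform(image):
        rows = [row[left:left + crop_size] for row in image[top:top + crop_size]]
        return [row[::-1] for row in rows] if flip else rows

    # The same offsets and flip decision are reused for both images.
    return transform(sketch), transform(photo)
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible when debugging.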
Training checkpoints are saved every 5000 steps and can be found in the training_checkpoints directory.
To generate faces from sketches:
python inference.py --input path/to/sketch.jpg --output path/to/output.jpg
The model progressively improves during training. Here are some results at various training stages:
- Initial results (0k steps): Blurry images with basic facial features
- Mid-training (2k steps): Improved details and color accuracy
- Final results (4k steps): Realistic facial features and textures
The final model achieves:
- Generator GAN loss: 3.85
- Generator L1 loss: 0.58
- Discriminator loss: 0.03
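To put the two generator terms in context, the sketch below combines them the way the pix2pix objective does: adversarial loss plus the L1 loss scaled by lambda. It assumes the reported values are the unweighted terms and that this project follows that formulation; `total_generator_loss` is a hypothetical helper.

```python
def total_generator_loss(gan_loss, l1_loss, l1_lambda=100):
    """Combine the adversarial and L1 terms as in the pix2pix objective."""
    return gan_loss + l1_lambda * l1_loss


# Values reported above, with the L1 lambda from the training parameters.
gan_term = 3.85
l1_term = 0.58
total = total_generator_loss(gan_term, l1_term)
weighted_l1 = 100 * l1_term
```

Under these assumptions the weighted L1 term (about 58) is far larger than the adversarial term (3.85), so the reconstruction error accounts for most of the generator's training signal.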
The trained generator weights can be saved in either HDF5 or TensorFlow checkpoint format. With tf.keras in TensorFlow 2.x, the file suffix selects the format:
# Save the model
generator.save_weights('./model_weights.h5') # HDF5 format
generator.save_weights('./model_weights.ckpt') # TensorFlow checkpoint format
# Load the model
generator = Generator()
generator.load_weights('./model_weights.h5')
- Image-to-Image Translation with Conditional Adversarial Networks (Pix2Pix paper)
- TensorFlow Pix2Pix Tutorial
This project is licensed under the MIT License - see the LICENSE file for details.
- The dataset used for training is derived from person-face-sketches
- The implementation is based on the TensorFlow Pix2Pix tutorial