This project implements a Pix2Pix conditional GAN that generates realistic images from edge outlines derived from object masks in the COCO 2017 dataset. The model learns to translate binary edge maps into 256×256 color images of objects such as dogs, using a U-Net-based generator and a PatchGAN discriminator.
- Dataset: COCO 2017 via TensorFlow Datasets (TFDS)
- Input: Edge maps (generated using OpenCV's Canny filter on segmentation masks)
- Output: 256×256 RGB images of objects (e.g., dogs)
- Model: Pix2Pix GAN with U-Net generator + PatchGAN discriminator
- Loss Function: GAN loss + L1 loss (`total = GAN + λ * L1`)
- Frameworks: TensorFlow 2.x, OpenCV, Matplotlib
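To make the loss formula concrete, here is a minimal TensorFlow sketch of the generator and discriminator objectives. It is an illustration, not the code in `ai_project_ares.py`: the function names, the assumption that the discriminator returns raw logits, and the weight `LAMBDA = 100.0` (the value used in the Pix2Pix paper) are not taken from this README.

```python
import tensorflow as tf

LAMBDA = 100.0  # assumed L1 weight; this README does not state the value of λ

# Assumes the discriminator outputs raw logits, one per image patch.
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)


def generator_loss(disc_fake_logits, generated, target, lam=LAMBDA):
    """Return (total, gan, l1) with total = GAN + lam * L1."""
    # The generator wants the discriminator to label its output as real (ones).
    gan_loss = bce(tf.ones_like(disc_fake_logits), disc_fake_logits)
    # L1 keeps the generated image close to the ground-truth photo.
    l1_loss = tf.reduce_mean(tf.abs(target - generated))
    return gan_loss + lam * l1_loss, gan_loss, l1_loss


def discriminator_loss(disc_real_logits, disc_fake_logits):
    """Real pairs should score as ones, generated pairs as zeros."""
    real_loss = bce(tf.ones_like(disc_real_logits), disc_real_logits)
    fake_loss = bce(tf.zeros_like(disc_fake_logits), disc_fake_logits)
    return real_loss + fake_loss
```

Returning the GAN and L1 terms separately makes it easy to log each one on its own, which is useful because the L1 term usually dominates the total when λ is large.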
- 🧩 End-to-end edge-to-photo translation using paired image data
- 🖼️ Real-time image generation visualization
- 📦 Custom training loop with TensorBoard support
- 🧪 Includes data preprocessing, augmentation, batching, and shuffling
- 💾 Checkpoint saving for long training runs
📁 pix2pix-gan-coco/
│
├── ai_project_ares.py # Full training & model definition
├── checkpoints/ # Saved model weights
├── logs/ # TensorBoard logs
├── samples/ # Example predictions
├── README.md # You are here
- Data Preprocessing:
  - Loads COCO 2017 images using TFDS
  - Extracts segmentation masks and applies edge detection (Canny)
  - Resizes and normalizes both edge maps and original images
- Model Architecture:
  - Generator: U-Net-style encoder-decoder with skip connections
  - Discriminator: PatchGAN to classify real/fake (image, outline) pairs
- Training Loop:
  - Uses `tf.GradientTape` for manual backpropagation
  - Logs losses (GAN, L1, Discriminator) with TensorBoard
  - Saves model checkpoints every 5000 steps
  - Generates preview outputs during training
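The training loop described above can be sketched as a single `tf.GradientTape` step. This is a simplified illustration under stated assumptions: the one-layer `generator` and `discriminator` are stand-ins so the example runs quickly (the real U-Net and PatchGAN are much deeper), and `LAMBDA`, the Adam settings, and the name `train_step` are hypothetical rather than taken from `ai_project_ares.py`.

```python
import tensorflow as tf

LAMBDA = 100.0  # assumed L1 weight; not stated in this README
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Tiny stand-in networks so the sketch runs; not the project's architectures.
generator = tf.keras.Sequential([
    tf.keras.layers.Conv2D(3, 3, padding="same", activation="tanh"),
])
discriminator = tf.keras.Sequential([
    tf.keras.layers.Conv2D(1, 4, strides=2, padding="same"),  # one logit per patch
])
gen_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)   # assumed hyperparameters
disc_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)


def train_step(edges, target):
    """Run one generator update and one discriminator update."""
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        fake = generator(edges, training=True)
        # The discriminator scores (outline, image) pairs, joined on channels.
        real_logits = discriminator(tf.concat([edges, target], axis=-1), training=True)
        fake_logits = discriminator(tf.concat([edges, fake], axis=-1), training=True)

        gan_loss = bce(tf.ones_like(fake_logits), fake_logits)
        l1_loss = tf.reduce_mean(tf.abs(target - fake))
        gen_loss = gan_loss + LAMBDA * l1_loss
        disc_loss = (bce(tf.ones_like(real_logits), real_logits)
                     + bce(tf.zeros_like(fake_logits), fake_logits))

    # Each tape differentiates its own loss with respect to its own network.
    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    gen_opt.apply_gradients(zip(gen_grads, generator.trainable_variables))
    disc_opt.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
    return gan_loss, l1_loss, disc_loss
```

The three returned values correspond to the GAN, L1, and discriminator losses that are logged to TensorBoard. In a full loop, a step counter would decide when to write those summaries and when to save a checkpoint (every 5000 steps in this project).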
| Input Edge Map | Ground Truth | Generated Image |
|---|---|---|
| ![]() | ![]() | ![]() |
- Python 3.10+
- TensorFlow 2.14+
- OpenCV
- Matplotlib
- COCO Dataset (via `tensorflow_datasets`)
# Clone the repo
git clone https://github.com/your-username/pix2pix-gan-coco.git
cd pix2pix-gan-coco
# Install dependencies
pip install tensorflow opencv-python matplotlib tensorflow-datasets
# Run training script
python ai_project_ares.py
- Add FID and SSIM metrics for evaluation
- Support for custom datasets (e.g., Stanford Dogs)
- Streamlit or Gradio UI for demo
- Export model for real-time inference
This project is released under the MIT License.
Anurag Pradhan
📧 anuragpradhancb@gmail.com
🔗 LinkedIn • GitHub


