DCGAN for Cat and Dog Image Generation

This project implements a Deep Convolutional Generative Adversarial Network (DCGAN) using PyTorch to generate images of cats and dogs. The architecture is based on best practices from the original DCGAN paper and is designed to train on a folder of labeled images.


📌 Features

  • PyTorch-based DCGAN implementation
  • Clearly defined Generator and Discriminator architectures
  • Customizable training parameters
  • Progress tracking with saved generated images
  • Visualization of training loss and generated results
  • Support for GPU acceleration

πŸ—‚οΈ Dataset Structure

This project expects a dataset structured like this:

/path/to/dataset/
├── cats/
│   ├── cat1.jpg
│   ├── cat2.jpg
│   └── ...
└── dogs/
    ├── dog1.jpg
    ├── dog2.jpg
    └── ...

The dataset is loaded with torchvision.datasets.ImageFolder, so make sure each category (cats, dogs) sits in its own subfolder under the training directory.
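
For illustration, a minimal loading sketch could look like the following; the constant names mirror the Configuration section below, and the exact transforms in the script may differ:

import torch
import torchvision.datasets as dset
import torchvision.transforms as transforms

# Values mirroring the Configuration section (illustrative).
DATA_ROOT = 'path/to/dataset'
IMAGE_SIZE = 64
BATCH_SIZE = 128

# ImageFolder treats each subdirectory (cats/, dogs/) as one class.
dataset = dset.ImageFolder(
    root=DATA_ROOT,
    transform=transforms.Compose([
        transforms.Resize(IMAGE_SIZE),
        transforms.CenterCrop(IMAGE_SIZE),
        transforms.ToTensor(),
        # Scale pixels to [-1, 1] to match a Tanh generator output.
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]),
)

dataloader = torch.utils.data.DataLoader(
    dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2
)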


🔧 Requirements

  • Python 3.7+
  • PyTorch
  • torchvision
  • matplotlib
  • numpy

You can install the required packages using:

pip install torch torchvision matplotlib numpy

βš™οΈ Configuration

You can modify the training parameters at the top of the script:

DATA_ROOT = 'path/to/dataset'
IMAGE_SIZE = 64
Z_DIM = 100
BATCH_SIZE = 128
NUM_EPOCHS = 5
LR_G = 0.0002
LR_D = 0.0002

The output images and plots are saved to the ./generated_images directory.


🚀 Running the Code

To train the model:

python dcgan_dog_cat.py

Make sure to adjust the DATA_ROOT variable to point to your dataset path.

The training will:

  • Display the generator and discriminator architectures
  • Save generated images during training (every 500 iterations)
  • Save loss plots and final generated images
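
As a rough illustration of the loop behind these steps, here is a minimal DCGAN-style training sketch. It assumes the configuration values above are in scope, that dataloader comes from the dataset sketch earlier, and that netG and netD are instances of the Generator and Discriminator described below; the names G_losses and D_losses are illustrative, and the actual script may differ in details:

import os
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.utils as vutils

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
os.makedirs("generated_images", exist_ok=True)

criterion = nn.BCELoss()
optimizerD = optim.Adam(netD.parameters(), lr=LR_D, betas=(0.5, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=LR_G, betas=(0.5, 0.999))
fixed_noise = torch.randn(64, Z_DIM, 1, 1, device=device)  # for progress snapshots

G_losses, D_losses, iters = [], [], 0
for epoch in range(NUM_EPOCHS):
    for real, _ in dataloader:  # class labels from ImageFolder are ignored
        real = real.to(device)
        b_size = real.size(0)

        # Update the discriminator: maximize log(D(x)) + log(1 - D(G(z))).
        netD.zero_grad()
        label = torch.full((b_size,), 1.0, device=device)  # real label = 1
        errD_real = criterion(netD(real).view(-1), label)
        errD_real.backward()

        noise = torch.randn(b_size, Z_DIM, 1, 1, device=device)
        fake = netG(noise)
        label.fill_(0.0)  # fake label = 0
        errD_fake = criterion(netD(fake.detach()).view(-1), label)
        errD_fake.backward()
        optimizerD.step()

        # Update the generator: maximize log(D(G(z))).
        netG.zero_grad()
        label.fill_(1.0)  # the generator wants fakes classified as real
        errG = criterion(netD(fake).view(-1), label)
        errG.backward()
        optimizerG.step()

        G_losses.append(errG.item())
        D_losses.append((errD_real + errD_fake).item())

        # Save a grid of samples from the fixed noise every 500 iterations,
        # using the file-name pattern listed under Output Samples.
        if iters % 500 == 0:
            with torch.no_grad():
                samples = netG(fixed_noise).cpu()
            vutils.save_image(
                samples,
                f"generated_images/generated_image_epoch_{epoch:04d}_iter_{iters:06d}.png",
                normalize=True,
            )
        iters += 1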

📈 Output Samples

  • Loss Curve: loss_plot.png
  • Final Generated Images: final_generated_images.png
  • Intermediate Generated Images: generated_image_epoch_XXXX_iter_XXXXXX.png

These are all saved in the ./generated_images folder.
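
For example, the loss curve could be produced from the per-iteration losses roughly like this (assuming the G_losses and D_losses lists recorded in the training sketch above):

import matplotlib.pyplot as plt

plt.figure(figsize=(10, 5))
plt.title("Generator and Discriminator Loss During Training")
plt.plot(G_losses, label="G")
plt.plot(D_losses, label="D")
plt.xlabel("iterations")
plt.ylabel("loss")
plt.legend()
plt.savefig("generated_images/loss_plot.png")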


🧠 Model Architecture

Generator

  • Based on nn.ConvTranspose2d layers
  • Uses BatchNorm2d and ReLU activations
  • Outputs 64x64 RGB images
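
A minimal sketch of such a generator, following the standard DCGAN layout for 64x64 output; the layer widths (ngf) are illustrative and may differ from the actual script:

import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # Input: (z_dim, 1, 1) latent vector -> (ngf*8, 4, 4)
            nn.ConvTranspose2d(z_dim, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # (ngf*8, 4, 4) -> (ngf*4, 8, 8)
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # (ngf*4, 8, 8) -> (ngf*2, 16, 16)
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # (ngf*2, 16, 16) -> (ngf, 32, 32)
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # (ngf, 32, 32) -> (nc, 64, 64); Tanh maps pixels to [-1, 1]
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)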

Discriminator

  • Based on nn.Conv2d layers
  • Uses BatchNorm2d and LeakyReLU
  • Outputs a scalar probability
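
A matching discriminator sketch in the standard DCGAN layout; again, the layer widths (ndf) are illustrative:

import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, ndf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # (nc, 64, 64) -> (ndf, 32, 32); no BatchNorm on the first layer
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # (ndf, 32, 32) -> (ndf*2, 16, 16)
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # (ndf*2, 16, 16) -> (ndf*4, 8, 8)
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # (ndf*4, 8, 8) -> (ndf*8, 4, 4)
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # (ndf*8, 4, 4) -> scalar probability in [0, 1]
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.main(x)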

📎 Notes

  • Make sure your dataset is large and diverse enough for the GAN to learn useful representations.
  • Increase NUM_EPOCHS for better results; the default of 5 epochs is only enough for a quick test run.
  • Use GPU for faster training (cuda:0 is automatically detected if available).
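
The device selection mentioned above typically comes down to a single line:

import torch

# Use the first GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")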

📄 License

This project is provided under the MIT License.
