This project implements a Deep Convolutional Generative Adversarial Network (DCGAN) using PyTorch to generate images of cats and dogs. The architecture is based on best practices from the original DCGAN paper and is designed to train on a folder of labeled images.
- PyTorch-based DCGAN implementation
- Generator and Discriminator architectures defined clearly
- Customizable training parameters
- Progress tracking with saved generated images
- Visualization of training loss and generated results
- Support for GPU acceleration
This project uses datasets structured like this:
/path/to/dataset/
├── cats/
│   ├── cat1.jpg
│   ├── cat2.jpg
│   └── ...
└── dogs/
    ├── dog1.jpg
    ├── dog2.jpg
    └── ...
The dataset is loaded with torchvision.datasets.ImageFolder, so each category (cats, dogs) must be placed in its own subfolder under the dataset root.
- Python 3.7+
- PyTorch
- torchvision
- matplotlib
- numpy
You can install the required packages using:
pip install torch torchvision matplotlib numpy
You can modify the training parameters at the top of the script:
DATA_ROOT = 'path/to/dataset'
IMAGE_SIZE = 64
Z_DIM = 100
BATCH_SIZE = 128
NUM_EPOCHS = 5
LR_G = 0.0002
LR_D = 0.0002
The output images and plots are saved to the ./generated_images directory.
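The two learning rates above would typically feed one optimizer per network. The sketch below assumes Adam with beta1 = 0.5, the setting suggested in the DCGAN paper. The helper name `make_optimizers` is hypothetical, and the script itself may configure its optimizers differently.

```python
import torch
import torch.nn as nn

LR_G = 0.0002
LR_D = 0.0002
BETAS = (0.5, 0.999)  # assumption: beta1 = 0.5 as suggested in the DCGAN paper


def make_optimizers(net_g: nn.Module, net_d: nn.Module):
    """Hypothetical helper: one Adam optimizer each for generator and discriminator."""
    opt_g = torch.optim.Adam(net_g.parameters(), lr=LR_G, betas=BETAS)
    opt_d = torch.optim.Adam(net_d.parameters(), lr=LR_D, betas=BETAS)
    return opt_g, opt_d
```

Keeping LR_G and LR_D separate makes it easy to slow down whichever network is overpowering the other.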
To train the model:
python dcgan_dog_cat.py
Make sure to adjust the DATA_ROOT variable to point to your dataset path.
The training will:
- Display the generator and discriminator architectures
- Save generated images during training (every 500 iterations)
- Save loss plots and final generated images
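The training loop itself is not shown here. As a sketch of what one iteration of a standard DCGAN update could look like, the function below runs a discriminator step followed by a generator step using binary cross-entropy. The name `train_step` and its signature are assumptions, and the discriminator is assumed to output probabilities in [0, 1].

```python
import torch
import torch.nn as nn


def train_step(net_g, net_d, opt_g, opt_d, real, z_dim, device):
    """Hypothetical single GAN iteration; returns (loss_d, loss_g) as floats."""
    criterion = nn.BCELoss()
    real = real.to(device)
    batch = real.size(0)
    ones = torch.ones(batch, device=device)    # target for "real"
    zeros = torch.zeros(batch, device=device)  # target for "fake"

    # 1) Discriminator: push D(real) toward 1 and D(G(z)) toward 0.
    opt_d.zero_grad()
    noise = torch.randn(batch, z_dim, 1, 1, device=device)
    fake = net_g(noise)
    loss_d = (criterion(net_d(real).view(-1), ones)
              + criterion(net_d(fake.detach()).view(-1), zeros))
    loss_d.backward()
    opt_d.step()

    # 2) Generator: push D(G(z)) toward 1 using the updated discriminator.
    opt_g.zero_grad()
    loss_g = criterion(net_d(fake).view(-1), ones)
    loss_g.backward()
    opt_g.step()

    return loss_d.item(), loss_g.item()
```

The returned losses could be appended to two lists for the loss plot. An iteration counter checked with `iteration % 500 == 0` would be one way to trigger the periodic image saves.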
Output files:
- Loss Curve: loss_plot.png
- Final Generated Images: final_generated_images.png
- Intermediate Generated Images: generated_image_epoch_XXXX_iter_XXXXXX.png
These are all saved in the ./generated_images folder.
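To make the intermediate file naming concrete, here is a small sketch that builds such a path. The helper name `intermediate_image_path` is hypothetical. The zero-padding widths (4 digits for the epoch, 6 for the iteration) are an assumption inferred from the number of X placeholders in the pattern above.

```python
import os

OUTPUT_DIR = "./generated_images"


def intermediate_image_path(epoch, iteration, output_dir=OUTPUT_DIR):
    """Hypothetical helper: path for an intermediate sample image."""
    # Assumption: 4-digit epoch and 6-digit iteration, zero-padded.
    name = f"generated_image_epoch_{epoch:04d}_iter_{iteration:06d}.png"
    return os.path.join(output_dir, name)


# Epoch 3, iteration 1500 yields the file name
# generated_image_epoch_0003_iter_001500.png
example = os.path.basename(intermediate_image_path(3, 1500))
```

A path like this could then be passed to torchvision.utils.save_image along with a batch of generated samples, after creating the directory with os.makedirs(OUTPUT_DIR, exist_ok=True).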
Generator:
- Based on nn.ConvTranspose2d layers
- Uses BatchNorm2d and ReLU activations
- Outputs 64x64 RGB images
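A generator matching the description above could be sketched as follows. The layer widths (controlled by the hypothetical `feat` parameter), the 4x4 kernels, and the final tanh follow the common DCGAN layout and are assumptions. The script's actual definition may differ.

```python
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Sketch of a DCGAN-style generator: (z_dim, 1, 1) noise -> (3, 64, 64) image."""

    def __init__(self, z_dim=100, feat=64, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            # (z_dim, 1, 1) -> (feat*8, 4, 4)
            nn.ConvTranspose2d(z_dim, feat * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(feat * 8),
            nn.ReLU(True),
            # -> (feat*4, 8, 8)
            nn.ConvTranspose2d(feat * 8, feat * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat * 4),
            nn.ReLU(True),
            # -> (feat*2, 16, 16)
            nn.ConvTranspose2d(feat * 4, feat * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat * 2),
            nn.ReLU(True),
            # -> (feat, 32, 32)
            nn.ConvTranspose2d(feat * 2, feat, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat),
            nn.ReLU(True),
            # -> (channels, 64, 64); tanh keeps pixel values in [-1, 1]
            nn.ConvTranspose2d(feat, channels, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)
```

Each stride-2 transposed convolution with kernel 4 and padding 1 doubles the spatial size, which is how a 1x1 noise vector grows to 64x64 in five layers.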
Discriminator:
- Based on nn.Conv2d layers
- Uses BatchNorm2d and LeakyReLU
- Outputs a scalar probability
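A discriminator along the lines described above might look like the sketch below. The channel widths (hypothetical `feat` parameter), the LeakyReLU slope of 0.2, and the final sigmoid follow the usual DCGAN layout and are assumptions rather than the script's confirmed definition.

```python
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    """Sketch of a DCGAN-style discriminator: (3, 64, 64) image -> probability."""

    def __init__(self, feat=64, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            # (channels, 64, 64) -> (feat, 32, 32); no batch norm on the first layer
            nn.Conv2d(channels, feat, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # -> (feat*2, 16, 16)
            nn.Conv2d(feat, feat * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # -> (feat*4, 8, 8)
            nn.Conv2d(feat * 2, feat * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # -> (feat*8, 4, 4)
            nn.Conv2d(feat * 4, feat * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feat * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # -> (1, 1, 1); sigmoid maps the score to a probability
            nn.Conv2d(feat * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Flatten (N, 1, 1, 1) to (N,) so each image gets one probability.
        return self.net(x).view(-1)
```

The output is one value per image, interpreted as the estimated probability that the image is real.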
- Make sure your dataset is large and diverse enough for the GAN to learn useful representations.
- Increase NUM_EPOCHS for better results.
- Use a GPU for faster training (cuda:0 is automatically detected if available).
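The automatic GPU detection mentioned above is commonly written as shown below. The helper name `pick_device` is hypothetical, and the script may inline the same check instead.

```python
import torch


def pick_device():
    """Use the first CUDA GPU when available, otherwise fall back to the CPU."""
    return torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


device = pick_device()
# Tensors (and models, via .to(device)) are then created on the chosen device.
noise = torch.randn(4, 100, 1, 1, device=device)
```

Both networks and every batch need to live on the same device, so the models would be moved with .to(device) once and each batch moved inside the training loop.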
This project is provided under the MIT License.