This repository contains a PyTorch implementation of the famous AlexNet architecture, based on the 2012 research paper *ImageNet Classification with Deep Convolutional Neural Networks*. The implementation was built from scratch without relying on pre-trained models, focusing on understanding the architecture and training pipeline as described in the paper.
This project demonstrates:
- The use of convolutional layers, max-pooling, dropout, and ReLU activation.
- Custom data augmentation techniques inspired by the paper, such as PCA-based color augmentation.
- Training on a custom dataset (insect classification).
- Full PyTorch implementation of AlexNet, following the exact and original architecture.
- Support for binary and multi-class classification.
- Custom data augmentation pipeline including resizing, cropping, flipping, and PCA augmentation.
- Example training on a custom insect dataset.
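The PCA-based color augmentation from the paper can be sketched as follows. This is a minimal sketch, not the repository's actual code: the function name is mine, and the eigendecomposition of the 3x3 RGB covariance matrix is assumed to be precomputed over the training set (the `std=0.1` scale follows the paper).

```python
import torch

def pca_color_augmentation(img, eigvals, eigvecs, std=0.1):
    """Fancy PCA color augmentation (AlexNet paper, Section 4.1).

    img:     float tensor of shape (3, H, W)
    eigvals: eigenvalues of the RGB pixel covariance matrix, shape (3,)
    eigvecs: matching eigenvectors as columns, shape (3, 3)
    """
    alphas = torch.randn(3) * std          # one random scale per principal component
    delta = eigvecs @ (alphas * eigvals)   # RGB shift along the principal components
    return img + delta.view(3, 1, 1)       # same shift applied to every pixel
```

Because the shift is constant per channel, the augmentation changes overall color balance without disturbing spatial structure.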
- The main implementation of AlexNet is in `alexnet_implement-colab.ipynb`. This one was trained on Google Colab's T4 GPU. Some slight changes I had to make from the original paper:
  - Changed the dropout value from `0.5` to `0.7`, because for some reason with dropout set to `0.5` the model was not learning at all and gave stagnant losses throughout each epoch during training. Switching it to `0.7` showed a significant, gradual decrease in losses and improved accuracy from `45.6%` to `79.69%`.
  - Dropped the final softmax layer, because PyTorch's `CrossEntropyLoss` expects raw logits and handles `softmax` internally.
  - Used a custom, smaller dataset to reduce training time and other complexities; the original ImageNet dataset is also too large to handle.
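The changes above can be illustrated with a minimal AlexNet sketch. This is my own hedged reconstruction, not the notebook's exact code: it assumes 227x227 RGB inputs, uses the repository's `0.7` dropout rather than the paper's `0.5`, omits the paper's local response normalization, and ends in raw logits with no softmax.

```python
import torch
import torch.nn as nn

class AlexNet(nn.Module):
    """AlexNet-style network for 227x227 RGB inputs (a sketch, not the repo's code)."""

    def __init__(self, num_classes=10, dropout=0.7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # 55 -> 27
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # 27 -> 13
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # 13 -> 6
        )
        self.classifier = nn.Sequential(
            nn.Dropout(dropout),                            # 0.7 here, 0.5 in the paper
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
            nn.Dropout(dropout),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),                   # raw logits; CrossEntropyLoss applies softmax
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))
```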
- The main challenges I faced can be seen clearly in the `alexnet-implement.ipynb` and `alexnet-implement-trialx` files, where I tried training the network on my local GPU (RTX 3050).
  - For some unknown reason, no matter how much I changed the dropout factor, learning rate, and random seeds, the model was not learning. Even when running the same Colab notebook locally, the results were poor.
- Another challenge was varying accuracies: each time I ran the accuracy block, I got a different accuracy. I ended up fixing this by setting the seed to 42 wherever randomization was performed. (Shout out to Claude for this one.)
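Seeding every RNG involved is what makes repeated accuracy runs reproducible. A minimal sketch (the helper name is mine; the repository seeds with 42 as described above):

```python
import random

import numpy as np
import torch

def set_seed(seed=42):
    """Seed Python, NumPy, and PyTorch RNGs so repeated runs give identical results."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op when CUDA is unavailable

set_seed(42)
```

Note that `DataLoader` shuffling and GPU nondeterminism can still introduce variation; re-seeding immediately before each evaluation block, as done here, handles the common case.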
- One last change from the paper: I used the Adam optimizer instead of stochastic gradient descent, because Adam showed a significant reduction in losses, whereas with SGD the model was again not learning well enough.
Feel free to download and run the code yourself and check whether the issue persists on other local GPUs.
Big shout out to ChatGPT and Claude for helping me understand the paper in depth (annotated notes by me can be found in `AlexNet Annotated - shiven.pdf`).
- Original Paper: *ImageNet Classification with Deep Convolutional Neural Networks* by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton (NIPS 2012).
- Datasets: Custom insect classification dataset.
This project is licensed under the MIT License. See `LICENSE` for details.