A Siamese neural network implementation for calculating similarity between pairs of images using the MNIST dataset. This project is implemented as an interactive Jupyter notebook with comprehensive visualizations and step-by-step explanations.
The goal of this project is to train a neural network that can calculate the similarity between pairs of images. The network distinguishes whether two images belong to the same class or not, using the MNIST dataset as a benchmark.
This implementation uses a Siamese neural network architecture that learns to embed images into a feature space where:
- Similar images (same class) are close together
- Dissimilar images (different classes) are far apart
The network is trained using pairs of images with binary labels indicating whether the images are from the same class (similar=1) or different classes (dissimilar=0). Cosine similarity between embeddings serves as the similarity metric, and the network is trained with binary cross-entropy loss.
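A minimal, self-contained sketch of this idea (not taken from the notebook): the cosine similarity of two L2-normalized embeddings is scored against the pair label with binary cross-entropy on logits. In the actual model the similarity first passes through a learnable projection, described below.

```python
import torch
import torch.nn.functional as F

# Dummy 128-dim embeddings for one image pair; in the real model these come from the network
emb1 = F.normalize(torch.randn(1, 128), dim=1)    # unit-length embedding of image 1
emb2 = F.normalize(torch.randn(1, 128), dim=1)    # unit-length embedding of image 2

cos_sim = F.cosine_similarity(emb1, emb2, dim=1)  # similarity score in [-1, 1]

# Score the similarity (used directly as a logit in this sketch) against the pair label
label = torch.tensor([1.0])                       # 1 = same class, 0 = different class
loss = F.binary_cross_entropy_with_logits(cos_sim, label)
print(f"similarity={cos_sim.item():.3f}  loss={loss.item():.3f}")
```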
- Input: 28×28×1 MNIST images
- Architecture:
  - Two convolutional blocks (Conv2d → BatchNorm → ReLU → MaxPool)
  - Fully connected layers with BatchNorm and ReLU
  - L2 normalization for unit-length embeddings
- Output: 128-dimensional normalized embedding vectors
- Takes two images as input
- Computes embeddings using the shared embedding network
- Calculates cosine similarity between embeddings
- Applies a learnable projection layer for binary classification
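A minimal sketch of what this Siamese wrapper could look like; the class name `SimilarityModel` and the `get_raw_similarity` method follow the usage example further down, but the internals here are an illustrative reconstruction, not the notebook's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimilarityModel(nn.Module):
    """Siamese wrapper: shared embedding network + cosine similarity + learnable projection."""

    def __init__(self, embedding_net: nn.Module):
        super().__init__()
        self.embedding_net = embedding_net             # shared (twin) network
        self.proj = nn.Linear(1, 1)                    # learnable projection to a classification logit

    def get_raw_similarity(self, img1: torch.Tensor, img2: torch.Tensor) -> torch.Tensor:
        z1 = self.embedding_net(img1)                  # (B, 128), L2-normalized
        z2 = self.embedding_net(img2)
        return F.cosine_similarity(z1, z2, dim=1)      # (B,), in [-1, 1]

    def forward(self, img1: torch.Tensor, img2: torch.Tensor) -> torch.Tensor:
        sim = self.get_raw_similarity(img1, img2)
        return self.proj(sim.unsqueeze(1)).squeeze(1)  # logit for BCEWithLogitsLoss
```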
- Transform: Converts images to tensors with normalization (mean=0.1307, std=0.3081)
- Pair Generation: Custom `PairDataset` class creates pairs with 50% positive (same class) and 50% negative (different class) samples (see the sketch after this list)
- Optimizer: Adam (lr=1e-2, weight_decay=1e-5)
- Loss Function: Binary Cross-Entropy with Logits
- Scheduler: ReduceLROnPlateau (patience=2, factor=0.5)
- Batch Size: 256
- Epochs: 30
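The pair generation item above refers to the sketch here: an illustrative reconstruction of how `PairDataset` (the class name used in the notebook) and the normalization transform could be wired together; the sampling logic is an assumption chosen to match the 50/50 description.

```python
import random
import torch
from torch.utils.data import Dataset
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),       # MNIST mean/std
])

class PairDataset(Dataset):
    """Yields (img1, img2, label): ~50% same-class pairs (label=1), ~50% different-class pairs (label=0)."""

    def __init__(self, base):
        self.base = base
        targets = torch.as_tensor(base.targets)
        # Index sample positions by class so partners can be drawn quickly
        self.by_class = {c: torch.where(targets == c)[0].tolist() for c in range(10)}

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        img1, c1 = self.base[idx]
        if random.random() < 0.5:                     # positive pair: same class
            partner = random.choice(self.by_class[c1])
            label = 1.0
        else:                                         # negative pair: different class
            c2 = random.choice([c for c in range(10) if c != c1])
            partner = random.choice(self.by_class[c2])
            label = 0.0
        img2, _ = self.base[partner]
        return img1, img2, torch.tensor(label)

train_pairs = PairDataset(datasets.MNIST("data", train=True, download=True, transform=transform))
```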
The implementation is organized into several key components:
- Data Loading and Preprocessing: MNIST dataset loading with normalization transforms
- PairDataset Class: Custom dataset for generating positive/negative image pairs
- EmbeddingNet Architecture: Convolutional neural network for feature extraction
- SimilarityModel: Siamese network combining embeddings with cosine similarity
- Training Setup: Loss function, optimizer, and evaluation utilities
- Training Loop: Main training process with validation and checkpointing
- Visualization: Results analysis and similarity visualization tools
- `torch` - PyTorch deep learning framework
- `torchvision` - Computer vision datasets and transforms
- `matplotlib` - Plotting and visualization
- `numpy` - Numerical computing
- `pandas` - Data manipulation and analysis
- `jupyter` - Jupyter notebook environment
- Clone the repository:

  ```bash
  git clone https://github.com/HasanAbdelhady/Calculating-Image-Similarity-using-Neural-Networks.git
  ```

- Install dependencies:

  ```bash
  pip install torch torchvision matplotlib numpy pandas jupyter
  ```

- Run the Jupyter notebook:

  ```bash
  jupyter notebook
  ```
Then open the main notebook file and run all cells sequentially. The notebook includes:
- Data loading and preprocessing
- Model architecture definitions
- Training loop with progress monitoring
- Results visualization and analysis
- Siamese Architecture: Shared weights between twin networks for consistent feature extraction
- Cosine Similarity: Uses normalized embeddings for robust similarity computation
- Balanced Dataset: Automatic generation of positive/negative pairs
- Mixed Precision: Optimized training with automatic mixed precision when a GPU is available (see the sketch after this list)
- Visualization: Comprehensive similarity visualization with color-coded results
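The mixed-precision item above can be illustrated with the standard PyTorch AMP pattern (a generic sketch assuming a single-GPU setup, not the notebook's exact code):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
scaler = torch.cuda.amp.GradScaler(enabled=device.type == "cuda")

def train_step(model, criterion, optimizer, img1, img2, labels):
    optimizer.zero_grad(set_to_none=True)
    # Forward pass in reduced precision where safe; disabled automatically on CPU
    with torch.autocast(device_type=device.type, enabled=device.type == "cuda"):
        logits = model(img1.to(device), img2.to(device))
        loss = criterion(logits, labels.to(device))
    scaler.scale(loss).backward()                     # scaled backward avoids fp16 underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```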
```
Input (28×28×1) → Conv2d(1→32) → BatchNorm → ReLU → MaxPool(2×2) →
Conv2d(32→64) → BatchNorm → ReLU → MaxPool(2×2) →
Flatten → Linear(3136→128) → BatchNorm → ReLU → L2_Normalize → Output (128-dim)
```
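A PyTorch module matching this layer diagram might look as follows; the 3×3 kernels with padding 1 are assumptions chosen so the flattened size comes out to 3136, the rest follows the diagram.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Maps a 28×28×1 image to a 128-dimensional L2-normalized embedding."""

    def __init__(self, embedding_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),   # 28×28 → 28×28
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # 28×28 → 14×14
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # 14×14 → 14×14
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # 14×14 → 7×7
        )
        self.head = nn.Sequential(
            nn.Flatten(),                                 # 64 × 7 × 7 = 3136
            nn.Linear(64 * 7 * 7, embedding_dim),
            nn.BatchNorm1d(embedding_dim),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.head(self.features(x))
        return F.normalize(z, p=2, dim=1)                 # unit-length output
```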
- Pair Generation: Creates balanced pairs of similar/dissimilar images
- Forward Pass: Computes embeddings for both images in a pair
- Similarity Calculation: Uses cosine similarity between normalized embeddings
- Loss Computation: Binary cross-entropy loss for classification
- Optimization: Adam optimizer with learning rate scheduling
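Putting the listed hyperparameters and these steps together, one training epoch could look roughly like this; `EmbeddingNet`, `SimilarityModel`, and `train_pairs` are the hypothetical pieces sketched above, and the scheduler is stepped on the training loss here for brevity (the notebook tracks validation loss).

```python
import torch
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimilarityModel(EmbeddingNet()).to(device)

criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=1e-5)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=2)

train_loader = DataLoader(train_pairs, batch_size=256, shuffle=True,
                          num_workers=2, pin_memory=True, persistent_workers=True)

for epoch in range(30):
    model.train()
    running_loss = 0.0
    for img1, img2, labels in train_loader:
        img1, img2, labels = (t.to(device, non_blocking=True) for t in (img1, img2, labels))
        optimizer.zero_grad(set_to_none=True)
        logits = model(img1, img2)            # cosine similarity → projected logit
        loss = criterion(logits, labels)      # binary cross-entropy with logits
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    epoch_loss = running_loss / len(train_loader)
    scheduler.step(epoch_loss)
    print(f"epoch {epoch + 1}: train loss {epoch_loss:.4f}")
```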
The trained model achieves:
- High accuracy in distinguishing between similar and dissimilar image pairs
- Meaningful similarity scores (higher for same-class pairs, lower for different-class pairs)
- Robust performance across different MNIST digit classes
Figure 1: Visualization of the PairDataset showing how similar and dissimilar image pairs are constructed from MNIST digits.
Figure 2: Structure of the paired dataset showing labels where 1 indicates similar pairs (same class) and 0 indicates dissimilar pairs (different classes).
Figure 3: 3D visualization of the EmbeddingNet architecture showing the convolutional blocks and fully connected layers.
Figure 4: Visual representation of the Similarity Model showing how cosine similarity is computed between image embeddings.
Figure 5: Example pairs of images with their predicted similarity scores from the trained model.
Figure 6: Additional examples showing comprehensive similarity comparisons between a test image and various digit classes.
- Similarity Matrix: Shows how a test image compares to all digit classes
- Color Coding: Green for high similarity, red for low similarity
- Ranked Results: Displays digits ranked by similarity score
- Probability Scores: Both raw cosine similarity and sigmoid probability
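A sketch of how such a per-class comparison could be computed; `get_raw_similarity` is the method shown in the usage example below, while the helper name and the one-example-per-class setup are assumptions for illustration.

```python
import torch

@torch.no_grad()
def rank_classes_by_similarity(model, test_img, class_examples):
    """Compare one test image against one example image per digit class.

    class_examples: dict mapping digit -> tensor of shape (1, 1, 28, 28).
    Returns (digit, cosine similarity, sigmoid probability) tuples, best match first.
    """
    model.eval()
    rows = []
    for digit, example in class_examples.items():
        sim = model.get_raw_similarity(test_img, example)    # raw cosine similarity
        prob = torch.sigmoid(model(test_img, example))       # projected probability
        rows.append((digit, sim.item(), prob.item()))
    return sorted(rows, key=lambda r: r[1], reverse=True)
```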
- Memory Optimization: Uses `pin_memory=True` and `non_blocking=True` for faster GPU transfers
- Reproducibility: Includes proper random seeding and deterministic behavior options
- Efficiency: Implements efficient data loading with multiple workers and persistent workers
- Checkpointing: Automatically saves the best model as "best_model.pt" based on validation loss
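The checkpointing behavior amounts to a few lines in the validation phase (a sketch; the surrounding validation loop and the use of a `state_dict` checkpoint are assumptions):

```python
import torch

best_val_loss = float("inf")

def maybe_checkpoint(model, val_loss):
    """Save the weights as 'best_model.pt' whenever validation loss improves."""
    global best_val_loss
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), "best_model.pt")
```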
After running the training cells in the notebook, you can use the model to compute similarities:
```python
# The trained model is available in the notebook context
model.eval()

# Compute similarity between two images
with torch.no_grad():
    similarity = model.get_raw_similarity(img1, img2)
    print(f"Similarity score: {similarity.item():.4f}")

    # Or get the projected similarity logit
    logit = model(img1, img2)
    probability = torch.sigmoid(logit)
    print(f"Similarity probability: {probability.item():.4f}")
```
This project is available for educational and research purposes.
Feel free to submit issues, feature requests, and pull requests to improve the implementation.