FasterRCNN_Embed_Extract

A Python project for extracting embeddings from images with the Faster R-CNN model. It runs each image through the detector and takes the feature embedding of every detected region from the box head, just before the classification layers. These embeddings can be used for downstream tasks such as training custom classifiers.

Overview

This project demonstrates how to:

Load a pre-trained Faster R-CNN ResNet-50 FPN model
Extract feature embeddings from detected objects in images
Process multiple images and save embeddings as NumPy arrays
Generate an Excel summary of the extraction run (bounding-box count and embedding size per image)

Project Structure

FasterRCNN_Embed_Extract/
├── Notebook/
│   └── fasterrcnn-embed-extract.ipynb    # Main Jupyter notebook with implementation
├── Results/                               # Output directory containing extracted embeddings
│   ├── *.npy                             # NumPy files containing embeddings for each image
│   └── data_emebeddings.xlsx             # Summary Excel file with extraction metadata
└── README.md                             # This file

Features

Pre-trained Model: Uses fasterrcnn_resnet50_fpn from torchvision with COCO weights
Embedding Extraction: Extracts per-region features via ROI (Region of Interest) pooling followed by the box head, before the classification layers
Batch Processing: Processes multiple images and saves individual embeddings
Metadata Tracking: Records number of bounding boxes and embedding dimensions for each image
Output Formats: Saves embeddings as .npy files and summary as Excel spreadsheet

How It Works

Image Processing: Converts PIL images to PyTorch tensors
Feature Extraction: Passes images through the backbone network to get feature maps
RPN Processing: Uses Region Proposal Network to generate object proposals
ROI Pooling: Applies Region of Interest pooling (RoIAlign in torchvision's implementation) to extract a fixed-size feature map for each proposal
Embedding Generation: Passes pooled features through the box head to get final embeddings
Output Generation: Saves embeddings and creates metadata summary

Key Functions

get_embeddings(model, image): Extracts embeddings from a single image
process_images(model, image_paths): Processes multiple images and saves results
get_image_paths(image_folder, limit=10): Retrieves image paths from a directory
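
A directory listing helper with the same signature as `get_image_paths` might look like the sketch below. The name `list_image_paths`, the extension set, and the sorted order are assumptions made for illustration; the notebook's own function may behave differently.

```python
from pathlib import Path

# Assumed set of extensions; the README only states that .jpg was tested.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_image_paths(image_folder, limit=10):
    """Return up to `limit` image paths from `image_folder`, sorted by name."""
    folder = Path(image_folder)
    paths = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    return paths[:limit]
```

Sorting makes the selection reproducible between runs, since directory iteration order is not guaranteed.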

Dependencies

PyTorch
torchvision
PIL (Pillow)
NumPy
pandas
openpyxl (for Excel output)

Usage

Open the Jupyter notebook: Notebook/fasterrcnn-embed-extract.ipynb
Update the image_folder path to point to your image directory
Run all cells to process images and extract embeddings
Check the Results/ folder for extracted embeddings and summary data

Output

Embedding Files: Each processed image generates a .npy file containing the extracted embeddings
Summary Excel: Contains metadata including image name, number of bounding boxes, embedding size, and corresponding file names
Console Output: Displays the number of detected objects for each image during processing
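
A rough sketch of how one image's embeddings and its summary row could be written. The helper name `save_embeddings` and the column names are hypothetical; the actual column names in the Excel file are not documented here.

```python
from pathlib import Path

import numpy as np


def save_embeddings(embeddings, image_path, results_dir):
    """Save a (num_boxes, dim) array as <image stem>.npy and return a metadata row."""
    results_dir = Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    out_path = results_dir / f"{Path(image_path).stem}.npy"
    np.save(out_path, embeddings)
    # Hypothetical column names for the summary table
    return {
        "image_name": Path(image_path).name,
        "num_boxes": int(embeddings.shape[0]),
        "embedding_size": int(embeddings.shape[1]),
        "embedding_file": out_path.name,
    }
```

Rows collected this way can be turned into the spreadsheet with `pandas.DataFrame(rows).to_excel(path, index=False)`, which relies on openpyxl for .xlsx output. A saved file is read back with `np.load`.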

Notes

The model automatically downloads pre-trained weights on first run
Supports common image formats (tested with .jpg files)
Embeddings are extracted before the final classification layer
Each bounding box detection generates a separate embedding vector

GuruprasannaRS/FasterRCNN-Embed-Extract
