A Python project for extracting embeddings from images using the Faster R-CNN model. It takes the feature embeddings of detected objects from before the classification layers, so they can be used for downstream tasks such as training custom classifiers.
This project demonstrates how to:
- Load a pre-trained Faster R-CNN ResNet-50 FPN model
- Extract feature embeddings from detected objects in images
- Process multiple images and save embeddings as NumPy arrays
- Generate a summary report of the extraction process
FasterRCNN_Embed_Extract/
├── Notebook/
│ └── fasterrcnn-embed-extract.ipynb # Main Jupyter notebook with implementation
├── Results/ # Output directory containing extracted embeddings
│ ├── *.npy # NumPy files containing embeddings for each image
│ └── data_emebeddings.xlsx # Summary Excel file with extraction metadata
└── README.md # This file
- Pre-trained Model: Uses `fasterrcnn_resnet50_fpn` from torchvision with COCO weights
- Embedding Extraction: Extracts features from the ROI (Region of Interest) pooling layer
- Batch Processing: Processes multiple images and saves individual embeddings
- Metadata Tracking: Records number of bounding boxes and embedding dimensions for each image
- Output Formats: Saves embeddings as `.npy` files and the summary as an Excel spreadsheet
- Image Processing: Converts PIL images to PyTorch tensors
- Feature Extraction: Passes images through the backbone network to get feature maps
- RPN Processing: Uses Region Proposal Network to generate object proposals
- ROI Pooling: Applies Region of Interest pooling to extract region-specific features
- Embedding Generation: Passes pooled features through the box head to get final embeddings
- Output Generation: Saves embeddings and creates metadata summary
- `get_embeddings(model, image)`: Extracts embeddings from a single image
- `process_images(model, image_paths)`: Processes multiple images and saves results
- `get_image_paths(image_folder, limit=10)`: Retrieves image paths from a directory
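
As an illustration of the last helper, here is one way a function with the `get_image_paths(image_folder, limit=10)` signature could be written. The body is an assumption, not the notebook's code. In particular, the set of accepted extensions and the sorted ordering are choices made for this sketch.

```python
from pathlib import Path

# Assumption: which extensions count as images is a choice of this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def get_image_paths(image_folder, limit=10):
    """Return up to `limit` image paths from `image_folder`, sorted by name."""
    folder = Path(image_folder)
    paths = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # limit=None returns every matching path.
    return paths[:limit]
```

Sorting keeps the selection reproducible between runs, which matters when `limit` truncates the list.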
- PyTorch
- torchvision
- PIL (Pillow)
- NumPy
- pandas
- openpyxl (for Excel output)
- Open the Jupyter notebook: `Notebook/fasterrcnn-embed-extract.ipynb`
- Update the `image_folder` path to point to your image directory
- Run all cells to process images and extract embeddings
- Check the `Results/` folder for extracted embeddings and summary data
- Embedding Files: Each processed image generates a `.npy` file containing the extracted embeddings
- Summary Excel: Contains metadata including image name, number of bounding boxes, embedding size, and corresponding file names
- Console Output: Displays the number of detected objects for each image during processing
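
The outputs described above could be produced along these lines. This is a sketch under assumptions: `save_results` and the column names in each row are hypothetical, and the embeddings are assumed to be 2-D arrays of shape `(num_boxes, embedding_size)`.

```python
from pathlib import Path

import numpy as np


def save_results(embeddings_by_image, results_dir):
    """Save one .npy file per image and return summary rows.

    `embeddings_by_image` maps an image file name to a 2-D array of
    shape (num_boxes, embedding_size). Column names are hypothetical.
    """
    results_dir = Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    for image_name, emb in embeddings_by_image.items():
        emb = np.asarray(emb)
        npy_name = f"{Path(image_name).stem}.npy"
        np.save(results_dir / npy_name, emb)
        # Mirrors the console output: number of boxes per image.
        print(f"{image_name}: {emb.shape[0]} boxes")
        rows.append({
            "image_name": image_name,
            "num_boxes": int(emb.shape[0]),
            "embedding_size": int(emb.shape[1]),
            "embedding_file": npy_name,
        })
    return rows
```

The returned rows can be written to a spreadsheet with pandas, for example `pd.DataFrame(rows).to_excel(path, index=False)`, which relies on openpyxl for `.xlsx` files.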
- The model automatically downloads pre-trained weights on first run
- Supports common image formats (tested with .jpg files)
- Embeddings are extracted before the final classification layer
- Each bounding box detection generates a separate embedding vector
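
Because each `.npy` file holds one row per bounding box, a downstream task usually needs the files stacked into a single matrix. The sketch below shows one way to do that. `load_all_embeddings` is a hypothetical helper, and it assumes every file in the folder stores a 2-D array with the same embedding size.

```python
from pathlib import Path

import numpy as np


def load_all_embeddings(results_dir):
    """Stack every .npy file in `results_dir` into one (N, D) matrix.

    Also returns a list of length N naming the source file (without
    extension) of each row, so rows can be traced back to images.
    """
    arrays, sources = [], []
    for path in sorted(Path(results_dir).glob("*.npy")):
        emb = np.load(path)
        arrays.append(emb)
        sources.extend([path.stem] * len(emb))
    if not arrays:
        return np.empty((0, 0)), []
    return np.concatenate(arrays, axis=0), sources
```

The resulting matrix can be passed, together with labels you supply, to whatever classifier you train on top of the embeddings.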