Object Detection on BDD100K Dataset
This repository contains the code for BDD Object Detection Dataset analysis. The code is containerized using Docker to ensure it can run on any machine.
- Docker installed on your machine. You can download it from here.
- IMPORTANT: Make sure the BDD data is stored in a folder named `assignment_data_bdd` in the same directory.
- Clone the repository:

      git clone <repository-url>
      cd <repository-directory>

- Build the Docker image:

      docker build -t bdd-object-detection .

- Run the Docker container:

      docker run -p 8888:8888 -v $(pwd):/app bdd-object-detection

The data flow within this project is structured as follows:
- Dataset: The BDD100K dataset is expected to be located in the `assignment_data_bdd` folder, structured with images and labels.
- Data Loading (`data.py`):
  - The `BDDObjectDetectionDataset` class handles loading and preprocessing the BDD100K dataset.
  - It supports splitting the data into training and validation sets, specified during dataset initialization.
  - Annotations are loaded from JSON files and cached as parquet files for faster loading in subsequent runs.
  - The dataset class provides methods to access images and their corresponding bounding box annotations.
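As an illustration of the annotation loading step, here is a minimal sketch of flattening BDD100K-style JSON labels into one row per bounding box, the kind of table that could then be cached. The field names (`name`, `labels`, `category`, `box2d`) are an assumption based on the public BDD100K detection label format, and `flatten_annotations` is a hypothetical helper rather than the code in `data.py`. The parquet caching step is omitted.

```python
import json


def flatten_annotations(json_text):
    """Turn BDD100K-style JSON labels into one dict per bounding box.

    Assumed layout: a list of frames, each with a "name" and a "labels" list
    whose entries carry a "category" and, for boxes, a "box2d" mapping.
    """
    rows = []
    for frame in json.loads(json_text):
        for label in frame.get("labels") or []:
            box = label.get("box2d")
            if box is None:
                # Skip labels without a 2D box (for example polygon labels).
                continue
            rows.append({
                "image": frame["name"],
                "category": label["category"],
                "x1": box["x1"], "y1": box["y1"],
                "x2": box["x2"], "y2": box["y2"],
            })
    return rows


# Tiny hand-written sample in the assumed format.
sample = json.dumps([
    {"name": "a.jpg", "labels": [
        {"category": "car", "box2d": {"x1": 10, "y1": 20, "x2": 110, "y2": 80}},
        {"category": "lane/single white", "poly2d": []},
    ]},
    {"name": "b.jpg", "labels": []},
])
rows = flatten_annotations(sample)
```

A flat list of rows like this maps directly onto a table, which is what makes a columnar cache convenient on later runs.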
- Data Transformation (`data.py`):
  - The `__getitem__` method retrieves an image and its target (bounding boxes and labels).
  - The `custom_collate_fn` function is used to collate batches of data, converting PIL Images to tensors.
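A custom collate function is usually needed for detection because each image can have a different number of boxes, so targets cannot be stacked into one array. The sketch below shows only that batching idea in plain Python. `collate_detection_batch` and its `to_tensor` argument are hypothetical stand-ins, not the repository's `custom_collate_fn`, and the real PIL-to-tensor conversion is left to the caller.

```python
def collate_detection_batch(batch, to_tensor=lambda image: image):
    """Collate (image, target) pairs without stacking them.

    Images in a detection batch can have different numbers of boxes, so the
    targets are kept as a list instead of being stacked into one array.
    `to_tensor` is a placeholder for the PIL-to-tensor conversion.
    """
    images, targets = zip(*batch)
    return [to_tensor(image) for image in images], list(targets)


# Stand-in data: strings instead of PIL images, plain lists instead of tensors.
batch = [
    ("img0", {"boxes": [[0, 0, 10, 10]], "labels": [1]}),
    ("img1", {"boxes": [[5, 5, 20, 20], [1, 1, 4, 4]], "labels": [2, 3]}),
]
images, targets = collate_detection_batch(batch, to_tensor=str.upper)
```

Returning lists keeps each image paired with its own variable-length target.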
The training process is managed by `train_model.py`. Here's a breakdown of the key steps:
- Data Loading (`train_model.py`):
  - The `train_function` initializes the training and validation datasets using `BDDObjectDetectionDataset`.
  - `DataLoader` is used to create iterable data loaders for the training and validation sets, using `custom_collate_fn` to handle batching.
- Model Definition (`model.py`):
  - The `ObjectDetectionModel` class defines the object detection model as a PyTorch Lightning module.
  - It uses a pre-trained Faster R-CNN model with a ResNet-50 backbone, obtained from `torchvision.models`.
  - The `get_pretrained_model` function configures the model, replacing the classifier with a new one suitable for the BDD100K dataset (10 object classes + background).
  - The backbone's pre-trained weights are frozen during initial training to stabilize training and leverage pre-trained features.
- Training Loop (`train_model.py` and `model.py`):
  - The `train_function` sets up the training and validation data loaders, the model, and the PyTorch Lightning trainer.
  - Loss Function (`losses.py`): The `ObjectDetectionLoss` class calculates the combined loss (classification and regression) for the object detection task. It includes Focal Loss for classification and Smooth L1 Loss for bounding box regression.
  - Optimizer (`model.py`): The Adam optimizer is used with a learning rate of 1e-3.
  - Callbacks (`train_model.py`):
    - `ModelCheckpoint`: Saves the best model based on validation loss. Checkpoints are saved in the `checkpoints/` directory. The filename includes the epoch number and validation loss.
    - `EarlyStopping`: Stops training when the validation loss stops improving, with a patience of 10 epochs.
  - Logging (`train_model.py`): TensorBoard is used for logging metrics during training. Logs are stored in the `lightning_logs` directory.
  - The `training_step` and `validation_step` methods in `ObjectDetectionModel` define the training and validation logic, respectively.
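To make the two loss terms named above concrete, here is a scalar sketch of focal loss and Smooth L1 loss in plain Python. The `alpha`, `gamma`, and `beta` defaults are assumptions (commonly used values), and these functions are illustrative rather than the `ObjectDetectionLoss` implementation in `losses.py`.

```python
import math


def focal_loss(p, is_positive, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability p in (0, 1)."""
    p_t = p if is_positive else 1.0 - p
    alpha_t = alpha if is_positive else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def smooth_l1(error, beta=1.0):
    """Smooth L1 loss for one regression error: quadratic near zero, linear beyond beta."""
    abs_error = abs(error)
    if abs_error < beta:
        return 0.5 * abs_error ** 2 / beta
    return abs_error - 0.5 * beta


# A confident correct prediction contributes far less than an uncertain one.
easy = focal_loss(0.95, is_positive=True)
hard = focal_loss(0.30, is_positive=True)
```

With `gamma=0` and `alpha=1` the focal term reduces to ordinary cross-entropy, which is a quick sanity check for any implementation.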
- Running Training (`train_model.py`): To start the training, execute the `train_model.py` script:

      python bdd_object_detection/train_model.py

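The early-stopping rule used during training (stop once the validation loss has not improved for a number of epochs) can be sketched in a few lines. This is a simplified model of the behaviour, assuming one validation check per epoch and no minimum improvement margin. `early_stopping_epoch` is a hypothetical helper, not part of PyTorch Lightning or this repository.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return the index of the epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# With patience=2, two consecutive non-improving epochs end the run,
# so the later drop to 0.5 is never reached.
stop_at = early_stopping_epoch([1.0, 0.9, 0.95, 0.97, 0.5], patience=2)
```

A larger patience tolerates noisy validation curves at the cost of extra epochs.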
The `result_analysis.ipynb` notebook provides a comprehensive analysis of the model's performance. Here's a breakdown of the key steps:
- Data Loading (`result_analysis.ipynb`):
  - Prediction and ground truth data are loaded from parquet files (`bdd100k_val_cache_predictions.parquet` and `bdd100k_val_cache.parquet`, respectively).
  - These files should contain the bounding box predictions and ground truth annotations for the validation set.
- Metric Calculation (`losses.py` and `result_analysis.ipynb`):
  - The `calculate_metrics` function in `losses.py` calculates object detection metrics such as Precision, Recall, and F1-score based on Intersection over Union (IoU) between predicted and ground truth bounding boxes.
  - The notebook iterates through different score thresholds and IoU thresholds to evaluate the model's performance under various conditions.
  - Metrics are calculated for each category and for all categories combined.
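The IoU-based metrics described above can be illustrated with a small single-image, single-category sketch. It assumes boxes are `(x1, y1, x2, y2)` tuples and uses greedy matching in descending score order, which is one common choice. The `calculate_metrics` function in `losses.py` may match boxes differently, and `precision_recall_f1` here is a hypothetical name.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predictions, ground_truths, iou_threshold=0.5, score_threshold=0.5):
    """predictions: list of (box, score); ground_truths: list of boxes."""
    kept = sorted((p for p in predictions if p[1] >= score_threshold),
                  key=lambda p: p[1], reverse=True)
    matched = set()
    true_positives = 0
    for box, _score in kept:
        # Greedily match to the best still-unmatched ground truth box.
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / len(ground_truths) if ground_truths else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall > 0 else 0.0
    return precision, recall, f1


# One good detection, one spurious detection, one missed ground truth box.
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [((0, 0, 10, 9), 0.9), ((50, 50, 60, 60), 0.8)]
precision, recall, f1 = precision_recall_f1(preds, gts)
```

Each ground truth box can be matched at most once, so duplicate detections of the same object count as false positives.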
- Visualization (`result_analysis.ipynb`):
  - The notebook generates various plots to visualize the model's performance:
    - Precision-Recall curves: Plots of precision vs. recall for different IoU thresholds.
    - F1-score curves: Plots of F1-score vs. score threshold for different IoU thresholds.
    - Bar charts: Bar charts comparing precision, recall, and F1-score for different categories at a fixed IoU threshold.
    - AP (Average Precision) analysis: Analysis of AP, AP50, AP75, and AR (Average Recall) metrics, including plots for individual classes and overall performance.
  - The plots help to identify the optimal score threshold for each category and to understand the model's strengths and weaknesses.
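The data behind precision-recall and F1 curves comes from sweeping the score threshold. The sketch below computes those points without plotting. It assumes detections have already been marked as true or false positives at one fixed IoU threshold, and it uses the convention that precision is 1.0 when no detections pass the threshold. `sweep_score_thresholds` is a hypothetical helper, not a function from the notebook.

```python
def sweep_score_thresholds(detections, num_ground_truths, thresholds):
    """Return (threshold, precision, recall, f1) for each score threshold.

    detections: list of (score, is_true_positive) pairs, already matched
    against ground truth at one fixed IoU threshold.
    """
    curve = []
    for threshold in thresholds:
        kept = [is_tp for score, is_tp in detections if score >= threshold]
        tp = sum(kept)
        # Convention (an assumption): precision is 1.0 when nothing is kept.
        precision = tp / len(kept) if kept else 1.0
        recall = tp / num_ground_truths if num_ground_truths else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        curve.append((threshold, precision, recall, f1))
    return curve


# Illustrative detections only, not results from the model.
detections = [(0.9, True), (0.7, False), (0.6, True), (0.3, False)]
curve = sweep_score_thresholds(detections, num_ground_truths=3, thresholds=[0.25, 0.5, 0.75])
```

Plotting the second and third tuple entries against each other gives a precision-recall curve, and plotting the fourth against the first gives an F1 curve.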
- Max F1 Score Analysis (`result_analysis.ipynb`):
  - The notebook identifies the score threshold at which the F1-score is maximized for each category.
  - This gives a working point for each category where the model performs best, balancing precision and recall.
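Choosing the working point amounts to taking the maximum over a per-category F1 curve. A minimal sketch, assuming each curve is a list of `(score_threshold, f1)` pairs, follows. The helper name and the numbers are made up for illustration and are not results from this model.

```python
def best_operating_point(curve_by_category):
    """Map each category to the (threshold, f1) pair with the highest F1.

    curve_by_category: {category: [(score_threshold, f1), ...]}
    """
    return {
        category: max(points, key=lambda point: point[1])
        for category, points in curve_by_category.items()
    }


# Made-up F1 values, for illustration only.
curves = {
    "car": [(0.3, 0.61), (0.5, 0.70), (0.7, 0.66)],
    "truck": [(0.3, 0.48), (0.5, 0.45), (0.7, 0.31)],
}
working_points = best_operating_point(curves)
```

Different categories can end up with different thresholds, which is why the selection is done per category.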
- Average Precision Analysis (`result_analysis.ipynb`):
  - The notebook analyzes the Average Precision (AP) for each class using pre-computed results from a RetinaNet model.
  - It generates bar charts to visualize the AP for different classes and IoU thresholds.
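For readers unfamiliar with how AP is derived from a ranked list of detections, here is a sketch using all-point interpolation of the precision-recall curve. This is one standard definition and may differ from how the pre-computed results were produced (COCO-style evaluation, for example, samples precision at fixed recall levels). AP50 and AP75 correspond to running the same computation with true positives decided at IoU thresholds of 0.5 and 0.75. `average_precision` is a hypothetical helper.

```python
def average_precision(detections, num_ground_truths):
    """AP from (score, is_true_positive) pairs using all-point interpolation."""
    if num_ground_truths == 0:
        return 0.0
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    # Sentinel points at recall 0 and recall 1 bracket the curve.
    recalls, precisions = [0.0], [0.0]
    tp = 0
    for rank, (_score, is_tp) in enumerate(ordered, start=1):
        tp += bool(is_tp)
        recalls.append(tp / num_ground_truths)
        precisions.append(tp / rank)
    recalls.append(1.0)
    precisions.append(0.0)
    # Make precision non-increasing in recall (the precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas wherever recall increases.
    return sum(
        (recalls[i + 1] - recalls[i]) * precisions[i + 1]
        for i in range(len(recalls) - 1)
    )


# Three ranked detections for a class with two ground truth boxes.
ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_ground_truths=2)
```

Averaging this value over classes gives a mean AP at the chosen IoU threshold.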
- Running Analysis (`result_analysis.ipynb`):
  - To run the analysis, execute the `result_analysis.ipynb` notebook in a Jupyter environment.
- The `Dockerfile` sets up the environment and installs all necessary dependencies specified in `requirements.txt`.
- This `README.md` provides instructions on how to build and run the Docker container, train the model, and analyze the results.