Task:
Training the DINO object detection model on a pedestrian dataset consisting of 200 images collected within the IIT Delhi campus. The dataset, annotated in COCO format, includes both images and corresponding annotations in JSON format.
Dataset Link
To visualize the bounding boxes on the 200 dataset images from the JSON file, I used the script Visualizing_bounding_boxes.ipynb, uploaded in this repo.
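As a rough illustration of what such a visualization involves (a minimal sketch, not the contents of Visualizing_bounding_boxes.ipynb): COCO annotations store each box as [x, y, width, height], so the boxes are first grouped per image and converted to corner coordinates, then drawn. The function names boxes_by_file and draw_boxes are hypothetical, and using Pillow for the drawing is an assumption.

```python
from collections import defaultdict


def boxes_by_file(coco):
    """Map each image file name to its boxes as (x0, y0, x1, y1) corners."""
    id_to_name = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        grouped[id_to_name[ann["image_id"]]].append((x, y, x + w, y + h))
    return dict(grouped)


def draw_boxes(image_path, boxes, out_path):
    """Draw the boxes on one image and save the result (requires Pillow)."""
    # Imported here so that the parsing code above has no extra dependency.
    from PIL import Image, ImageDraw

    image = Image.open(image_path).convert("RGB")
    canvas = ImageDraw.Draw(image)
    for box in boxes:
        canvas.rectangle(box, outline="red", width=2)
    image.save(out_path)
```

To use it, load random_sample_mavi_2_gt.json with json.load, pass the resulting dict to boxes_by_file, and call draw_boxes once per image. Images without annotations do not appear in the returned mapping.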
-
Upload the dataset (see link above) to your Google Drive and add the JSON file random_sample_mavi_2_gt.json to the directory containing the 200 images.
-
Download code_file.ipynb and upload the notebook to Google Colab.
Make sure to use a GPU runtime.
-
Mount Google Drive and clone the GitHub repo (link provided in the notebook).
Install PyTorch, torchvision, and other requirements.
Then compile the CUDA operators. If you get a missing-module error such as 'MultiScaleDeformableAttention', re-execute this cell and then restart the session.
-
Organize the data in COCO format:
COCODIR/
├── train2017/
├── val2017/
└── annotations/
    ├── instances_train2017.json
    └── instances_val2017.json
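One way to produce this layout from a single annotation file and a flat image folder is sketched below. It is only an illustration: the helper names split_coco and write_cocodir are hypothetical, and the 80/20 split and fixed seed are assumptions, since the split actually used here is not stated.

```python
import json
import random
import shutil
from pathlib import Path


def split_coco(coco, val_fraction=0.2, seed=0):
    """Split one COCO dict into (train, val) dicts, image by image."""
    images = list(coco["images"])
    random.Random(seed).shuffle(images)  # seeded so the split is reproducible
    n_val = max(1, int(len(images) * val_fraction))
    parts = {"val": images[:n_val], "train": images[n_val:]}
    result = {}
    for name, imgs in parts.items():
        ids = {img["id"] for img in imgs}
        result[name] = {
            **coco,  # keep categories and any other top-level fields
            "images": imgs,
            "annotations": [a for a in coco["annotations"] if a["image_id"] in ids],
        }
    return result["train"], result["val"]


def write_cocodir(coco, image_dir, cocodir, val_fraction=0.2, seed=0):
    """Create COCODIR/{train2017,val2017,annotations} from a flat image folder."""
    image_dir, cocodir = Path(image_dir), Path(cocodir)
    train, val = split_coco(coco, val_fraction, seed)
    (cocodir / "annotations").mkdir(parents=True, exist_ok=True)
    for split_name, subset in (("train2017", train), ("val2017", val)):
        (cocodir / split_name).mkdir(parents=True, exist_ok=True)
        for img in subset["images"]:
            shutil.copy(image_dir / img["file_name"], cocodir / split_name / img["file_name"])
        with open(cocodir / "annotations" / f"instances_{split_name}.json", "w") as f:
            json.dump(subset, f)
```

Splitting by image rather than by annotation keeps all boxes of one image in the same subset, so no image leaks between training and validation.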
-
Download the DINO model checkpoint (checkpoint0011_4scale.pth) from the link:
Pretrained DINO-4scale Model
-
Run the evaluation script using the pretrained model checkpoint:
bash scripts/DINO_eval.sh /path/to/your/COCODIR /path/to/your/checkpoint
-
Visualize the pretrained model's predictions on the validation set.
(The script is modified to compare ground-truth and predicted images.)
If you run into dependency errors or errors in the repo files, they are typically due to a version mismatch. Try changing the package version or making the changes indicated in the error message.
-
Fine-tune the pretrained model on our custom dataset (12 epochs).
Finetuned Model Weights
!bash /content/DINO/scripts/DINO_train.sh /path/to/your/COCODIR \
  --output_dir logs/DINO/R50-MS4 \
  --config_file /content/DINO/config/DINO/DINO_4scale.py \
  --pretrain_model_path /path/to/your/checkpoint \
  --finetune_ignore label_enc.weight class_embed
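The --finetune_ignore argument names the checkpoint parameters to leave out when the pretrained weights are loaded, so that the class-dependent layers (here the label encoder and the classification heads) start fresh for the new label set. The sketch below illustrates that filtering idea on a toy dictionary. It assumes the keywords are matched as substrings of parameter names, it is not the repository's code, and the function name drop_ignored and the example keys are made up for illustration.

```python
def drop_ignored(state_dict, ignore_keywords):
    """Return a copy of state_dict without keys that contain any ignored keyword."""
    return {
        name: value
        for name, value in state_dict.items()
        if not any(keyword in name for keyword in ignore_keywords)
    }


# Toy stand-in for a checkpoint; real values would be tensors.
toy_checkpoint = {
    "backbone.conv1.weight": "kept",
    "label_enc.weight": "dropped",
    "decoder.class_embed.0.weight": "dropped",
    "decoder.class_embed.0.bias": "dropped",
}
kept = drop_ignored(toy_checkpoint, ["label_enc.weight", "class_embed"])
```

Only the backbone entry survives in kept. In a real run the remaining weights would then be loaded non-strictly, so that the dropped layers keep their fresh initialization.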
-
Run the evaluation script using the fine-tuned model checkpoint:
bash scripts/DINO_eval.sh /path/to/your/COCODIR /path/to/your/checkpoint
-
Visualize predictions from the finetuned model by updating the checkpoint path in the visualization script.
-
Loss graphs during fine-tuning:
You can edit engine.py to store the loss values during training. Due to Colab limitations, I instead recorded the loss values by hand from the training output.
You can use matplotlib to plot the loss graph.
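A minimal plotting sketch follows. It assumes the loss values are available either as hand-recorded (epoch, loss) pairs or as a log file with one JSON object per line containing epoch and train_loss fields. Whether your run writes such a file, and under which key names, is an assumption to check against your output directory. The function names read_losses and plot_losses are hypothetical.

```python
import json


def read_losses(lines, key="train_loss"):
    """Collect (epoch, loss) pairs from JSON-lines log entries that contain `key`."""
    points = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if key in record:
            # Fall back to the running index when no epoch field is present.
            points.append((record.get("epoch", len(points)), record[key]))
    return points


def plot_losses(points, out_path="loss_curve.png"):
    """Plot loss against epoch and save the figure (requires matplotlib)."""
    # Imported here so that the parsing code above has no extra dependency.
    import matplotlib.pyplot as plt

    if not points:
        raise ValueError("no loss values to plot")
    epochs, losses = zip(*points)
    plt.figure()
    plt.plot(epochs, losses, marker="o")
    plt.xlabel("Epoch")
    plt.ylabel("Training loss")
    plt.title("Fine-tuning loss")
    plt.savefig(out_path, bbox_inches="tight")
    plt.close()
```

With hand-recorded values, skip read_losses and pass your own list of (epoch, loss) pairs straight to plot_losses.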