F1 score for instance segmentation #3993
andreaceruti started this conversation in General
First of all, thanks to anyone who takes the time to read this post.
I am using detectron2 for my project, and at the moment I am stuck computing some standard metrics beyond the ones implemented in the COCO API. Since COCOEvaluator is very user-friendly, I use it for evaluation, and I reach a very high AP (0.84 in my best training setting).
Now I am trying to compute precision and recall according to their standard definitions, i.e. precision = TP / (TP + FP) and recall = TP / (TP + FN), but the results are very poor, so I would appreciate feedback from anyone who can dive into this.
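For context, the way I get that AP is the standard detectron2 evaluation pattern, roughly like this (the dataset name and output directory are placeholders, and cfg and model come from my training setup):
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# "my_dataset_val" is a placeholder for the registered validation set
evaluator = COCOEvaluator("my_dataset_val", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "my_dataset_val")

# run the trained model over the whole loader and print the COCO AP metrics
print(inference_on_dataset(model, val_loader, evaluator))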
To do this, for every image I calculate TPs, FPs and FNs in the following way (code example on one image below):
from pycocotools.cocoeval import COCOeval

# coco_gt is the ground-truth COCO object, coco_dt the detections
# loaded with coco_gt.loadRes(...)

# select one image
imgIds = [147]
coco_eval = COCOeval(coco_gt, coco_dt, "segm")
coco_eval.params.imgIds = imgIds
coco_eval.evaluate()

# coco_eval.evalImgs now holds one dict per (category, area range, image);
# here I select the dict that corresponds to 'aRng': [0, 10000000000.0] (area = all)
image_evaluation_dict = coco_eval.evalImgs[0]

# index 0 on the IoU-threshold axis corresponds to IoU = 0.5
iou_threshold_index = 0

# boolean numpy array with one entry per detection; True means the detection
# is ignored by the evaluation (in my case they are all False)
detection_ignore = image_evaluation_dict["dtIgnore"][iou_threshold_index]

# keep only the detections that we cannot ignore
mask = ~detection_ignore

# number of ignored detections (not used below, kept for inspection)
n_ignored = detection_ignore.sum()

# "dtMatches" holds, for every detection, the id of the matched ground-truth
# annotation (0 if unmatched): matched detections are TPs, unmatched are FPs
tp = (image_evaluation_dict["dtMatches"][iou_threshold_index][mask] > 0).sum()
fp = (image_evaluation_dict["dtMatches"][iou_threshold_index][mask] == 0).sum()

# total non-ignored ground-truth instances, i.e. TP + FN
n_gt = len(image_evaluation_dict["gtIds"]) - image_evaluation_dict["gtIgnore"].astype(int).sum()

precision = tp / (tp + fp)
recall = tp / n_gt
f1 = 2 * precision * recall / (precision + recall)
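For completeness, here is a minimal sketch of how I would extend the same counting to the whole validation set to get dataset-level precision, recall and F1 (same coco_gt and coco_dt as above; evalImgs entries are None when an image has no annotations for a category, and I only keep the 'all' area range):
from pycocotools.cocoeval import COCOeval

coco_eval = COCOeval(coco_gt, coco_dt, "segm")
coco_eval.evaluate()

iou_threshold_index = 0  # IoU = 0.5
tp, fp, n_gt = 0, 0, 0
for eval_img in coco_eval.evalImgs:
    # skip empty entries and restrict to the 'all' area range
    if eval_img is None or eval_img["aRng"] != [0, 10000000000.0]:
        continue
    ignore = eval_img["dtIgnore"][iou_threshold_index]
    matches = eval_img["dtMatches"][iou_threshold_index][~ignore]
    tp += (matches > 0).sum()
    fp += (matches == 0).sum()
    n_gt += len(eval_img["gtIds"]) - eval_img["gtIgnore"].astype(int).sum()

precision = tp / (tp + fp)
recall = tp / n_gt
f1 = 2 * precision * recall / (precision + recall)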
With this implementation my precision and recall come out very poor, so I would like to know whether I am making a mistake somewhere.