F1 score for instance segmentation #3993
andreaceruti started this conversation in General
First of all, thanks to anyone who takes the time to read this post.
I am using detectron2 for my project, and at the moment I am stuck computing some standard metrics beyond the ones implemented in the COCO API. Since COCOEvaluator is very user-friendly, I use it for evaluation, and I reach a very high AP (0.84 in my best training setting).
Now I am trying to compute precision and recall according to their standard definitions, i.e. precision = TP / (TP + FP) and recall = TP / (TP + FN), but the results are very poor, so I would appreciate feedback from anyone who can dive into this.
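For context, the way I get that AP is the standard detectron2 evaluation pattern, roughly like this (the dataset name and output directory are placeholders, and cfg and model come from my training setup):
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# "my_dataset_val" is a placeholder for the registered validation set
evaluator = COCOEvaluator("my_dataset_val", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "my_dataset_val")

# run the trained model over the whole loader and print the COCO AP metrics
print(inference_on_dataset(model, val_loader, evaluator))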
To do this, for every image I calculate TPs, FPs and FNs in the following way (code example on one image below):
from pycocotools.cocoeval import COCOeval

# coco_gt is the ground-truth COCO object, coco_dt the detections
# loaded with coco_gt.loadRes(...)

# select one image
imgIds = [147]
coco_eval = COCOeval(coco_gt, coco_dt, "segm")
coco_eval.params.imgIds = imgIds
coco_eval.evaluate()

# coco_eval.evalImgs now holds one dict per (category, area range, image);
# here I select the dict that corresponds to 'aRng': [0, 10000000000.0] (area = all)
image_evaluation_dict = coco_eval.evalImgs[0]

# index 0 on the IoU-threshold axis corresponds to IoU = 0.5
iou_threshold_index = 0

# boolean numpy array with one entry per detection; True means the detection
# is ignored by the evaluation (in my case they are all False)
detection_ignore = image_evaluation_dict["dtIgnore"][iou_threshold_index]

# keep only the detections that we cannot ignore
mask = ~detection_ignore

# number of ignored detections (not used below, kept for inspection)
n_ignored = detection_ignore.sum()

# "dtMatches" holds, for every detection, the id of the matched ground-truth
# annotation (0 if unmatched): matched detections are TPs, unmatched are FPs
tp = (image_evaluation_dict["dtMatches"][iou_threshold_index][mask] > 0).sum()
fp = (image_evaluation_dict["dtMatches"][iou_threshold_index][mask] == 0).sum()

# total non-ignored ground-truth instances, i.e. TP + FN
n_gt = len(image_evaluation_dict["gtIds"]) - image_evaluation_dict["gtIgnore"].astype(int).sum()

precision = tp / (tp + fp)
recall = tp / n_gt
f1 = 2 * precision * recall / (precision + recall)
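For completeness, here is a minimal sketch of how I would extend the same counting to the whole validation set to get dataset-level precision, recall and F1 (same coco_gt and coco_dt as above; evalImgs entries are None when an image has no annotations for a category, and I only keep the 'all' area range):
from pycocotools.cocoeval import COCOeval

coco_eval = COCOeval(coco_gt, coco_dt, "segm")
coco_eval.evaluate()

iou_threshold_index = 0  # IoU = 0.5
tp, fp, n_gt = 0, 0, 0
for eval_img in coco_eval.evalImgs:
    # skip empty entries and restrict to the 'all' area range
    if eval_img is None or eval_img["aRng"] != [0, 10000000000.0]:
        continue
    ignore = eval_img["dtIgnore"][iou_threshold_index]
    matches = eval_img["dtMatches"][iou_threshold_index][~ignore]
    tp += (matches > 0).sum()
    fp += (matches == 0).sum()
    n_gt += len(eval_img["gtIds"]) - eval_img["gtIgnore"].astype(int).sum()

precision = tp / (tp + fp)
recall = tp / n_gt
f1 = 2 * precision * recall / (precision + recall)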
With this implementation my precision and recall come out very poor, so I would like to know whether I am making a mistake somewhere.