The evaluation results of train.py and validate.py are inconsistent #157
Asked by qzwangUSTC in Q&A
Sorry, my earlier description of the problem was a bit messy, so I have reformatted it here. I trained efficientdet-d0 from scratch with the following command:
./distributed_train.sh 4 /mscoco --model efficientdet_d0 -b 11 --amp --lr 0.06 --sync-bn --opt fusedmomentum --warmup-epochs 5 --lr-noise 0.4 0.9 --model-ema --model-ema-decay 0.9999
A fragment of the training log follows:
Train: 234 [2500/2689 ( 93%)] Loss: 0.593521 (0.5254) Time: 0.495s, 88.80/s (0.537s, 81.88/s) LR: 9.842e-03 Data: 0.028 (0.027)
Train: 234 [2550/2689 ( 95%)] Loss: 0.582619 (0.5265) Time: 0.516s, 85.24/s (0.538s, 81.79/s) LR: 9.842e-03 Data: 0.027 (0.027)
Train: 234 [2600/2689 ( 97%)] Loss: 0.556304 (0.5270) Time: 0.586s, 75.13/s (0.538s, 81.84/s) LR: 9.842e-03 Data: 0.030 (0.027)
Train: 234 [2650/2689 ( 99%)] Loss: 0.538032 (0.5272) Time: 0.465s, 94.72/s (0.538s, 81.76/s) LR: 9.842e-03 Data: 0.031 (0.027)
Train: 234 [2688/2689 (100%)] Loss: 0.609689 (0.5278) Time: 1.239s, 12.91/s (0.538s, 29.72/s) LR: 9.842e-03 Data: 0.781 (0.028)
Test (EMA): [ 0/113] Time: 1.555 (1.555) Loss: 0.4949 (0.4949)
Test (EMA): [ 50/113] Time: 0.215 (0.236) Loss: 0.5803 (0.5480)
Test (EMA): [ 100/113] Time: 0.204 (0.223) Loss: 0.5558 (0.5510)
Test (EMA): [ 113/113] Time: 0.425 (0.221) Loss: 0.5754 (0.5531)
Current checkpoints:
('./output/train/20201218-122406-efficientdet_d0/checkpoint-212.pth.tar', 0.3401196940773372)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-210.pth.tar', 0.34007189357458284)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-211.pth.tar', 0.34005645098502846)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-231.pth.tar', 0.3400186921308521)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-209.pth.tar', 0.3400044905741661)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-232.pth.tar', 0.33993297177380966)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-213.pth.tar', 0.339857531277273)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-230.pth.tar', 0.33980180631739004)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-234.pth.tar', 0.3397794123140815)
('./output/train/20201218-122406-efficientdet_d0/checkpoint-208.pth.tar', 0.3397224963303912)
Train: 235 [ 0/2689 ( 0%)] Loss: 0.538377 (0.5384) Time: 1.825s, 24.10/s (1.825s, 24.10/s) LR: 5.600e-03 Data: 1.342 (1.342)
Train: 235 [ 50/2689 ( 2%)] Loss: 0.543684 (0.5410) Time: 0.508s, 86.54/s (0.547s, 80.50/s) LR: 5.600e-03 Data: 0.025 (0.055)
Train: 235 [ 100/2689 ( 4%)] Loss: 0.518846 (0.5336) Time: 0.500s, 88.01/s (0.555s, 79.32/s) LR: 5.600e-03 Data: 0.025 (0.042)
Train: 235 [ 150/2689 ( 6%)] Loss: 0.501451 (0.5256) Time: 0.555s, 79.21/s (0.547s, 80.43/s) LR: 5.600e-03 Data: 0.025 (0.038)
Train: 235 [ 200/2689 ( 7%)] Loss: 0.522107 (0.5249) Time: 0.484s, 90.91/s (0.541s, 81.26/s) LR: 5.600e-03 Data: 0.027 (0.035)
Train: 235 [ 250/2689 ( 9%)] Loss: 0.474666 (0.5165) Time: 0.551s, 79.81/s (0.548s, 80.29/s) LR: 5.600e-03 Data: 0.026 (0.034)
Judging by how your code ranks saved checkpoints, './output/train/20201218-122406-efficientdet_d0/checkpoint-212.pth.tar' should be the best one. But when I evaluate it with validate.py, the result differs from what training reported. I ran validate.py with the following command:
python validate.py /localtion/of/mscoco/ --model efficientdet_d0 -b 10 --checkpoint ./output/train/20201218-122406-efficientdet_d0/checkpoint-212.pth.tar
Loading and preparing results...
DONE (t=3.91s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=106.94s).
Accumulating evaluation results...
DONE (t=18.99s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.330
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.510
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.347
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.128
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.381
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.519
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.285
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.437
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.463
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.196
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.543
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.692
train.py reports 0.34011969 for this checkpoint, but validate.py reports 0.330, even though both load the same checkpoint file. Why is that?
@qzwangUSTC you likely need to tell the validation script to use the EMA weights.
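To illustrate why this matters: the training command uses --model-ema, and the log lines are labelled "Test (EMA)", so the 0.3401 score comes from an exponentially averaged copy of the weights rather than the raw ones. Evaluating the raw weights from the same file can therefore give a different AP. Below is a minimal pure-Python sketch of that effect, not the repository's actual code. Floats stand in for tensors, and the checkpoint keys `state_dict` and `state_dict_ema` as well as the helper names are assumptions made for illustration.

```python
# Hypothetical sketch: plain floats stand in for model tensors.

def ema_update(ema_weights, model_weights, decay=0.9999):
    """One EMA step per weight: ema = decay * ema + (1 - decay) * w."""
    return {
        name: decay * ema_weights[name] + (1.0 - decay) * model_weights[name]
        for name in ema_weights
    }


def select_weights(checkpoint, use_ema):
    """Pick which set of weights to evaluate from a checkpoint dict.

    The key names 'state_dict' and 'state_dict_ema' are assumptions about
    the checkpoint layout, not something confirmed in this thread.
    """
    key = "state_dict_ema" if use_ema else "state_dict"
    if key not in checkpoint:
        raise KeyError(f"checkpoint has no '{key}' entry")
    return checkpoint[key]


# Toy training loop: the raw weight moves every step, the EMA copy lags behind.
raw = {"w": 0.0}
ema = dict(raw)
for step in range(1000):
    raw = {"w": raw["w"] + 0.01}  # stand-in for an optimizer step
    ema = ema_update(ema, raw)

# One file, two different sets of weights.
checkpoint = {"state_dict": raw, "state_dict_ema": ema}

raw_w = select_weights(checkpoint, use_ema=False)["w"]
ema_w = select_weights(checkpoint, use_ema=True)["w"]
print(f"raw weight: {raw_w:.4f}, EMA weight: {ema_w:.4f}")
```

The two printed values differ, which is the same situation as one checkpoint file yielding two AP numbers. To confirm this on the real checkpoint, check `python validate.py --help` for an option that selects the EMA weights and rerun the evaluation with it.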