I'm quite new to MMSegmentation, and it's a great library.
I did some WandB hyper-parameter optimization with various models on the demo Colab notebook and, strangely, reached very similar best val Dice scores for FCN_R101, OCR_HRNet_W48 and DeepLabV3+.
I mostly explored optimizers (SGD, Adam), LR schedules (poly, 1-cycle), learning rate, momentum and weight decay, and I always started from the Cityscapes 80K pre-trained models.
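For reference, a minimal WandB sweep definition covering those knobs could look roughly like this. It is only a sketch: the metric name `val/mDice`, the project name, the parameter ranges and the `train()` stub are illustrative, not my exact setup.

```python
import wandb

def train():
    # Illustrative training entry point: read the sampled hyper-parameters,
    # patch them into the MMSegmentation config, run training, and log the
    # validation Dice back to W&B, e.g. wandb.log({"val/mDice": score}).
    run = wandb.init()
    cfg = run.config  # cfg.optimizer, cfg.lr, cfg.momentum, cfg.weight_decay, cfg.schedule
    ...

sweep_config = {
    "method": "bayes",
    # the metric name must match whatever the training loop actually logs
    "metric": {"name": "val/mDice", "goal": "maximize"},
    "parameters": {
        "optimizer": {"values": ["sgd", "adam"]},
        "schedule": {"values": ["poly", "1-cycle"]},
        "lr": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-1},
        "momentum": {"values": [0.8, 0.85, 0.9, 0.95]},
        "weight_decay": {"values": [1e-6, 1e-5, 1e-4]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="mmseg-stanford-hpo")
wandb.agent(sweep_id, function=train, count=30)
```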
Here are the best val Dice scores each model reached, and the parameters that produced them:
OCR_HRNet_W48:
best Dice: 0.8683
bs: 16
max_iters: 2000
lr: 0.02765
mom: 0.8
optim: sgd
regime: poly
wd: 0.00001
FCN_R101b:
best Dice: 0.8631
bs: 16
max_iters: 2000
lr: 0.01658
mom: 0.8113
optim: sgd
regime: poly
wd: 0.00001
DeepLabV3+_R101:
best Dice: 0.8616
bs: 16
max_iters: 2000
lr: 0.0001
optim: adam
regime: cyclic
wd: 0.000001
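For context, this is roughly how settings like the FCN_R101b ones above get patched into the demo notebook's config. It is a sketch assuming the MMSegmentation 0.x / mmcv-style config API used by the notebook; the config and checkpoint paths are placeholders.

```python
from mmcv import Config

# placeholder path: the FCN config that the demo notebook loads
cfg = Config.fromfile('path/to/fcn_r101b_config.py')

cfg.optimizer = dict(type='SGD', lr=0.01658, momentum=0.8113, weight_decay=1e-5)
cfg.lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
cfg.runner = dict(type='IterBasedRunner', max_iters=2000)
cfg.data.samples_per_gpu = 16  # bs 16

# start from the Cityscapes 80K fine-tuned weights (placeholder path)
cfg.load_from = 'checkpoints/fcn_r101b_cityscapes_80k.pth'
```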
Could this be because of the "small" size of the Stanford Background dataset, or because of the "short" training of 2000 iters?
Or is it because I started from models already fine-tuned on Cityscapes, and would I have obtained quite different scores by training from pre-trained backbones with randomly initialized heads?
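In config terms, the two initialization strategies I'm comparing would be something like this (again a 0.x-style sketch; the paths, the exact pretrained key and the 8-class count for the Stanford Background data are assumptions):

```python
from mmcv import Config

cfg = Config.fromfile('path/to/fcn_r101b_config.py')  # placeholder

# (a) what I did: load the whole model (backbone + decode/auxiliary heads)
#     already fine-tuned on Cityscapes for 80K iters
cfg.load_from = 'checkpoints/fcn_r101b_cityscapes_80k.pth'  # placeholder

# (b) the alternative: ImageNet-pretrained backbone, randomly initialized heads
cfg.load_from = None
cfg.model.pretrained = 'open-mmlab://resnet101_v1c'  # backbone weights only; exact key depends on the backbone variant
cfg.model.decode_head.num_classes = 8       # assuming 8 Stanford Background classes
cfg.model.auxiliary_head.num_classes = 8
```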
Replies: 2 comments
- Hi @Brainkite
- Interesting.