LR Scheduler and Optimizer #11
lukas-blecher started this conversation in Ideas
Replies: 2 comments · 4 replies
- I am currently training as well. I got BLEU = 0.89 once, but then it fluctuated between 0.7 and 0.8x the whole time. I guess it has something to do with the LR, as you mentioned. I am planning to make it faster for deployment, if possible.
- Ideas for speeding up training by choosing optimal optimization algorithms.
As optimizers I have only ever tried Adam and AdamW. They both seem to perform quite well, but I had the feeling that AdamW is a better fit.
Now for the LR scheduler.
It has quite a big effect on training progress. I've mostly used OneCycleLR so far, but the loss either stagnated or even got worse after some time. That's why I continued training after a couple of epochs with a "fresh" OneCycle.
Maybe using a cyclic scheduler from the start would be the way to go, something like CosineAnnealingLR (see the sketch below).
Does anybody have experience with other schedulers/optimizers?
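For reference, a minimal sketch of the cyclic-scheduler idea in PyTorch. The model, learning rate, weight decay, and restart period below are placeholder assumptions rather than values from this project, and CosineAnnealingWarmRestarts is used here as the restarting variant of CosineAnnealingLR:

```python
# Sketch only: AdamW paired with a cyclic cosine schedule.
# The model, lr, weight_decay, and T_0 are placeholder assumptions.
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = nn.Linear(256, 256)  # stand-in for the actual network
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-2)

# Restart the cosine cycle every T_0 epochs instead of relaunching
# training by hand with a "fresh" OneCycle.
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2, eta_min=1e-6)

for epoch in range(30):
    # dummy step standing in for the real training loop over batches
    x = torch.randn(32, 256)
    loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()  # advance the cosine cycle once per epoch
```

The scheduler can also be stepped per batch with a fractional epoch value if the cycles should be counted in steps rather than epochs.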