LR Scheduler and Optimizer #11
lukas-blecher started this conversation in Ideas
Replies: 2 comments · 4 replies
- I am currently training as well. I got BLEU = 0.89 once, but then it fluctuated between 0.7 and 0.8x the whole time. I guess it has something to do with the LR, as you mentioned. I am planning to make it faster for deployment, if possible.
- Ideas for speeding up training by choosing optimal optimization algorithms.
As optimizers I have only ever tried Adam and AdamW. They both seem to perform quite well, but I had the feeling that AdamW is a better fit.
Now for the LR scheduler.
It has quite a big effect on training progress. I've mostly used OneCycleLR so far, but the loss either stagnated or even got worse after some time. That's why I continued training after a couple of epochs with a "fresh" OneCycle.
Maybe using a cyclic scheduler from the start would be the way to go, something like CosineAnnealingLR (see the sketch below).
Does anybody have experience with other schedulers/optimizers?
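For reference, a minimal sketch of the cyclic-scheduler idea in PyTorch. The model, learning rate, weight decay, and restart period below are placeholder assumptions rather than values from this project, and CosineAnnealingWarmRestarts is used here as the restarting variant of CosineAnnealingLR:

```python
# Sketch only: AdamW paired with a cyclic cosine schedule.
# The model, lr, weight_decay, and T_0 are placeholder assumptions.
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = nn.Linear(256, 256)  # stand-in for the actual network
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-2)

# Restart the cosine cycle every T_0 epochs instead of relaunching
# training by hand with a "fresh" OneCycle.
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2, eta_min=1e-6)

for epoch in range(30):
    # dummy step standing in for the real training loop over batches
    x = torch.randn(32, 256)
    loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()  # advance the cosine cycle once per epoch
```

The scheduler can also be stepped per batch with a fractional epoch value if the cycles should be counted in steps rather than epochs.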