
You can print out the optimizer config with print(model.cfg.optim) and then change warmup_steps to 0 or a small number like 5000. The warmup is probably too long. Also check that the scheduler is actually CosineAnnealing: in that case the 0.001 is used directly as the learning rate.
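As a minimal sketch of that check-and-edit (the key names and values here are illustrative, not taken from your actual config, which you should inspect with print(model.cfg.optim)):

```python
# Hypothetical layout of what print(model.cfg.optim) might show for a
# NeMo-style model; the exact keys depend on your own config file.
optim_cfg = {
    "name": "adamw",
    "lr": 0.001,
    "sched": {
        "name": "CosineAnnealing",  # verify this is the scheduler you expect
        "warmup_steps": 50000,      # assumed value - likely too long
        "min_lr": 1e-6,
    },
}

# Shorten the warmup so the LR ramps up sooner (0 disables warmup entirely)
optim_cfg["sched"]["warmup_steps"] = 5000
print(optim_cfg["sched"])
```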

If it's Noam, then optim.lr acts as a multiplier on the schedule, so your actual learning rate is being multiplied by 0.001.
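To see why that matters, here is a sketch of the Noam schedule (formula assumed from the original Transformer recipe; NeMo's exact implementation may differ in details): the configured lr scales the entire curve, so lr=0.001 shrinks every step's effective rate a thousandfold.

```python
def noam_lr(step, d_model=512, warmup_steps=4000, lr=1.0):
    """Noam schedule: linear warmup then inverse-sqrt decay.

    Note that lr is a global multiplier on the whole curve, not the
    peak learning rate itself. step must be >= 1.
    """
    return lr * d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

# Same point on the curve, scaled down by 0.001:
print(noam_lr(1000, lr=1.0))
print(noam_lr(1000, lr=0.001))
```

So with a Noam scheduler, a small optim.lr does not set a small-but-reasonable learning rate; it suppresses the entire schedule.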

Replies: 1 comment

Answer selected by Khimer
Category
Q&A