confusion about learning rate scaling law #71

@ds22058

Your study of learning rates has been very helpful to me, but I still have some questions.

  1. In your learning rate scaling law experiments, do all training runs follow the schedule mentioned before, with the learning rate reduced to 31.6% of its initial value at 80% of training and to 10% of its initial value at 90% of training?

  2. Does the learning rate used in the scaling law refer to the initial learning rate?
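
To make question 1 concrete, here is a minimal sketch of the schedule as I understand it. The function name `step_decay_lr`, measuring progress in optimizer steps, and applying each drop exactly at the 80% and 90% marks are my own assumptions, not something taken from your code.

```python
def step_decay_lr(step: int, total_steps: int, initial_lr: float) -> float:
    """Hypothetical helper: two-stage step decay as described in question 1.

    Assumes progress is measured in optimizer steps and that each drop
    takes effect exactly at the 80% and 90% marks.
    """
    # Integer comparisons avoid floating-point issues at the boundaries.
    if step * 10 < total_steps * 8:
        return initial_lr            # first 80% of training: initial LR
    if step * 10 < total_steps * 9:
        return initial_lr * 0.316    # 80%-90%: 31.6% of initial (about sqrt(0.1))
    return initial_lr * 0.1          # last 10%: 10% of initial


if __name__ == "__main__":
    for s in (0, 799, 800, 899, 900, 999):
        print(s, step_decay_lr(s, total_steps=1000, initial_lr=1e-3))
```

If this matches your setup, then question 2 is asking whether the `initial_lr` argument here is the quantity that the scaling law predicts.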

Looking forward to your reply.
