[Feature Request] Integrate DeMo optimizer #640

@jzthree

Is your feature request related to a problem? Please describe.
This new optimizer (https://github.com/NousResearch/DisTrO) seems ideal for distributed training, since it reduces communication bandwidth by several orders of magnitude. I apologize if I have misunderstood anything; I am new to both projects. I got excited about the potential of combining it with what hivemind can do.

Describe the solution you'd like
An optimizer class implementing DeMo.
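
To make the request more concrete, here is a minimal sketch of the bandwidth-saving idea, assuming the scheme is roughly: each worker accumulates gradients into a local momentum buffer, shares only its few largest-magnitude components with peers, and keeps the remainder locally. This is plain Python for illustration only. It omits the frequency transform described in the linked paper, and every name in it (`extract_top_k`, `local_step`, `average_sparse`) is hypothetical, not part of DeMo's reference code or of hivemind.

```python
from typing import Dict, List, Tuple


def extract_top_k(momentum: List[float], k: int) -> Tuple[Dict[int, float], List[float]]:
    """Split momentum into a sparse top-k part (to transmit) and a residual (kept locally)."""
    # Indices of the k largest-magnitude entries.
    top = sorted(range(len(momentum)), key=lambda i: abs(momentum[i]), reverse=True)[:k]
    sparse = {i: momentum[i] for i in top}
    # Whatever is transmitted is removed from the local buffer.
    residual = [0.0 if i in sparse else m for i, m in enumerate(momentum)]
    return sparse, residual


def local_step(
    momentum: List[float], grad: List[float], beta: float, k: int
) -> Tuple[Dict[int, float], List[float]]:
    """Accumulate the gradient into local momentum, then peel off the part to share."""
    accumulated = [beta * m + g for m, g in zip(momentum, grad)]
    return extract_top_k(accumulated, k)


def average_sparse(updates: List[Dict[int, float]], size: int) -> List[float]:
    """Average the sparse updates received from all workers into a dense vector."""
    dense = [0.0] * size
    for update in updates:
        for i, v in update.items():
            dense[i] += v / len(updates)
    return dense


if __name__ == "__main__":
    # One worker, four parameters, sharing only k=2 components per step.
    sparse, residual = local_step([0.0] * 4, [0.1, -2.0, 0.3, 1.5], beta=0.9, k=2)
    print(sparse, residual)
```

In this sketch only `k` index/value pairs per worker cross the network instead of the full gradient, which is where the bandwidth saving would come from. A real integration would presumably wrap this logic in an optimizer class and exchange the sparse updates through hivemind's averaging machinery; those details are left open here.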

Describe alternatives you've considered
To my knowledge, no alternative currently exists.

Additional context
Paper: https://arxiv.org/abs/2411.19870
Code: https://github.com/bloc97/DeMo
15B training run: https://distro.nousresearch.com/

Metadata

Labels: enhancement, help wanted
