Open
Description
How can I use Muon with a Llama model?
I am running it with Llama on 64 A100 GPUs:
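from transformers import LlamaForCausalLM
# Muon comes from the optimizer/Muon.py module shown in the traceback

model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b")
# Pass every trainable parameter to Muon as a single flat list
grouped_parameters = [
    p for p in model.parameters() if p.requires_grad
]
optimizer = Muon(grouped_parameters)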
But it fails with this error:
[rank3]: File "/xxxxxxxxxxxxxxxxxxxxxxxxxxxxx/optimizer/Muon.py", line 104, in <listcomp>
[rank3]: params = [p for p in group['params'] if self.state[p]['use_muon']]
[rank3]: KeyError: 'use_muon'
When I print the parameters, the ones stored in self.state do not seem to match the ones in group['params'].
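To narrow down the mismatch, a small diagnostic like the one below might help. It is only a sketch: it assumes the optimizer exposes the usual `param_groups` and `state` attributes of a `torch.optim.Optimizer` (which the traceback suggests), and the helper name `diagnose_muon_state` is made up for illustration. The stand-in objects at the bottom are not real tensors; they only let the sketch run without torch.

```python
from types import SimpleNamespace


def diagnose_muon_state(optimizer, flag="use_muon"):
    """Compare optimizer.param_groups against optimizer.state.

    Hypothetical helper: reports parameters that have no state entry or
    whose state entry lacks `flag`, plus state keys that no longer appear
    in any param group.
    """
    untagged = []
    for gi, group in enumerate(optimizer.param_groups):
        for pi, p in enumerate(group["params"]):
            if p not in optimizer.state:
                untagged.append((gi, pi, "no state entry"))
            elif flag not in optimizer.state[p]:
                untagged.append((gi, pi, f"state entry lacks '{flag}'"))

    # Compare by identity so tensor equality semantics never come into play.
    group_ids = {id(p) for g in optimizer.param_groups for p in g["params"]}
    stale = [p for p in optimizer.state if id(p) not in group_ids]
    return {"untagged": untagged, "stale": stale}


# Stand-in "parameters" and optimizer (assumption: illustration only).
a, b, c = object(), object(), object()
fake_optimizer = SimpleNamespace(
    param_groups=[{"params": [a, b]}],
    state={a: {"use_muon": True}, c: {"use_muon": False}},
)
report = diagnose_muon_state(fake_optimizer)
print(report["untagged"])    # parameters that would trigger the KeyError
print(len(report["stale"]))  # state keys not present in any param group
```

With the real optimizer, calling `diagnose_muon_state(optimizer)` just before `optimizer.step()` should list the entries that hit the KeyError. If `stale` is non-empty, one possible explanation (an assumption, not confirmed here) is that the parameter objects were replaced after the optimizer was constructed, for example by a distributed training wrapper.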