Replies: 1 comment
-
If I want the gradients in grad_1 to be the average across all GPUs as well, what would be a reasonable way to change this? (grad_1 refers to the snippet quoted below.)
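One possible approach (a sketch, not an answer from this thread; it assumes the grad_1 dict built in the snippet below and an already-initialized torch.distributed process group, which DDP requires anyway) is to all-reduce each stored gradient manually and divide by the world size:

```python
import torch.distributed as dist

# Average the locally computed gradients over all ranks by hand.
# grad_1 is the dict of detached gradient clones from the snippet below.
world_size = dist.get_world_size()
for name, grad in grad_1.items():
    dist.all_reduce(grad, op=dist.ReduceOp.SUM)  # in-place sum over ranks
    grad_1[name] = grad / world_size             # sum -> mean
```

Summing and then dividing by the world size reproduces the per-GPU mean that DDP computes for grad_0.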
-
```python
loss.backward(retain_graph=True)

# First backward: DDP all-reduces the gradients, so grad_0 is the multi-GPU average.
grad_0 = {}
for name, param in runner.model.module.named_parameters():
    grad_0[name] = param.grad.clone().detach()

runner.optimizer.zero_grad()
loss.backward()

# Second backward: only local gradients accumulate, so grad_1 is not averaged.
grad_1 = {}
for name, param in runner.model.module.named_parameters():
    grad_1[name] = param.grad.clone().detach()
```
The gradients in grad_0 are the average of the gradients on all GPUs, while the gradients in grad_1 are not averaged across GPUs.
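A quick way to check this on a given setup (a sketch, not from the original post; the is_synced helper is hypothetical, and it assumes torch.distributed is initialized and both dicts use the same parameter names) is to all-gather one gradient tensor and compare it across ranks:

```python
import torch
import torch.distributed as dist

def is_synced(grad_dict, name):
    # Hypothetical helper: True if grad_dict[name] is identical on every rank.
    local = grad_dict[name]
    gathered = [torch.empty_like(local) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, local)
    return all(torch.allclose(g, gathered[0]) for g in gathered)

name = next(iter(grad_0))                         # pick any parameter to inspect
print("grad_0 synced:", is_synced(grad_0, name))  # expected: True
print("grad_1 synced:", is_synced(grad_1, name))  # expected: False, per the behavior above
```

Each rank prints its own view, so with N GPUs you will see N copies of the output.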