Open
Description
@KevinMIN95 Why do you use model_without_ddp and discriminator_without_ddp to compute some of the tensors that participate in the loss calculation? I think the gradients of model_without_ddp will not be synchronized and reduced across devices. Could this lead to mistakes in distributed training?
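To make the concern concrete, here is a toy, pure-Python sketch of what I mean by gradients being "synchronized and reduced". It is not PyTorch's DDP implementation and does not reproduce this repository's code: the one-parameter model, the two simulated ranks, and the helper names `local_gradient`, `all_reduce_mean`, and `train_step` are all assumptions made up for illustration. It only shows that replicas stay identical when gradients are averaged across ranks and drift apart when they are not.

```python
# Toy simulation of data-parallel training with two "ranks".
# Hypothetical helpers for illustration only; this is not PyTorch DDP.

def local_gradient(weight, batch):
    """Gradient of the mean squared error for the model y = weight * x."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)


def all_reduce_mean(values):
    """Give every rank the average of all ranks' values."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def train_step(weights, batches, lr, synchronize):
    """Run one SGD step on every rank, optionally averaging gradients first."""
    grads = [local_gradient(w, b) for w, b in zip(weights, batches)]
    if synchronize:
        grads = all_reduce_mean(grads)
    return [w - lr * g for w, g in zip(weights, grads)]


# Each rank sees a different mini-batch of (x, y) pairs.
batches = [[(1.0, 2.0)], [(2.0, 6.0)]]

synced = train_step([0.0, 0.0], batches, lr=0.1, synchronize=True)
unsynced = train_step([0.0, 0.0], batches, lr=0.1, synchronize=False)

print("with all-reduce:   ", synced)    # both ranks hold the same weight
print("without all-reduce:", unsynced)  # the ranks have diverged
```

Whether the unwrapped modules in this repository actually skip the all-reduce is exactly what I am asking about; the sketch only illustrates the failure mode I am worried about.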