How does DDP work under the hood in Lightning? #20917
Unanswered
davidgill97
asked this question in
DDP / multi-GPU / multi-node
Replies: 0 comments
I'm trying to build a pipeline with Hydra + Lightning, but Lightning seems to spawn multiple processes and run the whole script multiple times (much like torchrun does), which causes unwanted side effects. I want to understand how this works under the hood so I can fix the problem. (The problem itself is probably not caused by Lightning or Hydra.)
I know how DDP works in plain PyTorch, so I tried digging into the source code, but that only left me more confused.
Does anyone know how Lightning (or Fabric) handles DDP, especially when the script is not launched with torchrun?
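To make the behaviour I'm describing concrete, here is a minimal sketch of the re-launch pattern as I understand it from torchrun: a launcher re-executes the same script once per extra rank and tells each copy its rank through an environment variable. This is only an illustration, not Lightning's actual implementation. `relaunch_workers`, `build_child_env`, and `is_main_process` are hypothetical helper names, and using `LOCAL_RANK` is an assumption borrowed from torchrun.

```python
import os
import subprocess
import sys

# Assumption: the rank is passed via LOCAL_RANK, as torchrun does.
RANK_VAR = "LOCAL_RANK"


def is_main_process(env=None):
    """Return True when this copy of the script is rank 0 (or the rank is unset)."""
    env = os.environ if env is None else env
    return int(env.get(RANK_VAR, "0")) == 0


def build_child_env(local_rank, world_size, base_env=None):
    """Copy the environment and add the rank information for one worker."""
    env = dict(os.environ if base_env is None else base_env)
    env[RANK_VAR] = str(local_rank)
    env["WORLD_SIZE"] = str(world_size)
    return env


def relaunch_workers(world_size, argv=None):
    """Re-run the script once for each rank other than 0 (hypothetical launcher)."""
    # By default, repeat the exact command line of the current process.
    argv = [sys.executable] + sys.argv if argv is None else argv
    return [
        subprocess.Popen(argv, env=build_child_env(rank, world_size))
        for rank in range(1, world_size)
    ]
```

Under this pattern, everything at the top level of the script runs once per process, so any side effect that should happen only once (creating an output directory, writing a config dump) would need a guard such as `if is_main_process():`.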