Replies: 2 comments 2 replies
-
Just to make sure I understand what you're trying to do, is the script you provided us with for single-GPU? If not, can you share what it looks like when running on a single GPU?
Can you clarify what that means?
1 reply
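For anyone else reading along, here is a toy sketch of what a single-GPU energy-and-force pass usually looks like; the ToyEnergyModel below is a placeholder MLP rather than the actual GNN from this thread, only the autograd pattern matters:

```python
import torch
import torch.nn as nn

# Placeholder energy model: maps per-atom positions to a scalar total energy.
# A real model would do message passing over edge_index instead of an MLP.
class ToyEnergyModel(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, hidden), nn.SiLU(), nn.Linear(hidden, 1))

    def forward(self, pos):
        return self.mlp(pos).sum()   # total energy E(x)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = ToyEnergyModel().to(device).eval()

pos = torch.randn(1000, 3, device=device, requires_grad=True)  # atom positions x
energy = model(pos)

# Force is the negative gradient of the energy w.r.t. positions, so the
# autograd graph must be kept even at inference time (no torch.no_grad()).
forces = -torch.autograd.grad(energy, pos)[0]
print(energy.item(), forces.shape)
```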
-
Was this resolved? I'm seeing a similar problem where the gradients are slightly off compared to a single-GPU baseline.
1 reply
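Not a fix, but in case it helps to pin down "slightly off": an explicit tolerance check between the two runs can separate harmless float32 reordering noise (collectives sum in a different order) from a genuinely detached gradient path. The tensor names below are placeholders:

```python
import torch

def compare_forces(forces_single: torch.Tensor, forces_multi: torch.Tensor) -> None:
    """Report absolute/relative error between two (num_atoms, 3) force tensors."""
    abs_err = (forces_single - forces_multi).abs()
    rel_err = abs_err / forces_single.abs().clamp_min(1e-12)
    print(f"max abs err: {abs_err.max().item():.3e}  "
          f"max rel err: {rel_err.max().item():.3e}")
    # Float32 sums taken in a different order typically agree only to ~1e-5.
    print("allclose:", torch.allclose(forces_single, forces_multi,
                                      rtol=1e-4, atol=1e-6))

# Example with fake data standing in for the single-GPU and multi-GPU runs:
f_ref = torch.randn(100, 3)
compare_forces(f_ref, f_ref + 1e-6 * torch.randn(100, 3))
```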
-
I am a chemistry researcher. We use a GNN to predict molecular energies and forces, and the force has to be obtained as the gradient of the energy, $F = -\,dE/dx$.
I have trained a model, but I want to speed up inference when simulating very many atoms. My idea is to shard the atoms across multiple GPUs during message passing and update x with torch.distributed or PyG's distributed tooling.
Below is my basic code.
I can shard x across the GPUs with DDP(model) and torch.all_gather(x) after MP1 and MP2, and that gives the correct energy. But the gradient is lost, and the gradient is needed to predict the forces, so I have no idea how to continue the inference.
Assume my CUDA memory is sufficient; for now I just want a speedup from splitting the message-passing calculation across multiple GPUs.
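A minimal sketch of one way to keep the gradient, assuming the model is replicated on every rank and the atoms are sharded; the "MP1"/"MP2" stages below are placeholder math, not the real message-passing layers. The point is that torch.distributed.all_gather is not autograd-aware (the gathered tensors are detached), while the collectives in torch.distributed.nn.functional are, so the graph survives the gather and torch.autograd.grad can still return forces:

```python
# Launch with: torchrun --nproc_per_node=<num_gpus> this_script.py
import torch
import torch.distributed as dist
import torch.distributed.nn.functional as dist_f


def main():
    dist.init_process_group("nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)
    device = torch.device("cuda", rank)

    # Same placeholder "weights" on every rank (stands in for the trained GNN).
    torch.manual_seed(0)
    w1 = torch.randn(3, 16, device=device)

    # Each rank owns its own shard of atom positions (random placeholder data).
    torch.manual_seed(1234 + rank)
    pos_local = torch.randn(1000, 3, device=device, requires_grad=True)

    # "MP1" stand-in: local per-atom features (real code: PyG layers over edges).
    h_local = torch.tanh(pos_local @ w1)

    # Autograd-aware all_gather: unlike torch.distributed.all_gather, this keeps
    # the graph alive across ranks, and its backward reduce-scatters gradients
    # back to the rank that produced each shard.
    h_all = torch.cat(dist_f.all_gather(h_local), dim=0)  # (world_size * 1000, 16)

    # "MP2" + readout stand-in: energy contribution of the atoms THIS rank owns,
    # computed from the globally gathered features.
    e_local = (h_local * h_all.mean(dim=0)).sum()

    # Backprop only the local energy; cross-rank terms dE_j/dx_i arrive through
    # the collective's backward, so this equals -d(total E)/d(pos_local).
    forces_local = -torch.autograd.grad(e_local, pos_local)[0]

    # Total energy for reporting (no gradient needed, so plain all_reduce is fine).
    e_total = e_local.detach().clone()
    dist.all_reduce(e_total)

    if rank == 0:
        print("total energy:", e_total.item(), "local forces:", tuple(forces_local.shape))
    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Each rank backpropagates only its own local energy; the backward of the differentiable all_gather accumulates the cross-rank contributions, so the result is the gradient of the total energy with respect to that rank's positions. Whether this maps cleanly onto the real MP1/MP2 depends on the model, so treat it as a starting point rather than a drop-in replacement for DDP(model).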