Replies: 1 comment
-
The same answer as in #54 applies here: probably all GPUs see the whole dataset in each epoch.
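If that is what is happening, each process iterates over the full dataset every epoch, so adding GPUs does not shorten the epoch. A minimal sketch of how the data could instead be sharded per process with `torch.utils.data.DistributedSampler` (assuming a plain PyTorch DDP setup with the process group already initialized; the dataset and batch size below are placeholders, not the trainer's actual code):

```python
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def build_sharded_loader(batch_size=32):
    # Placeholder dataset; replace with the real dataset D.
    dataset = TensorDataset(torch.randn(10_000, 16), torch.randint(0, 2, (10_000,)))

    # DistributedSampler splits the indices across ranks, so each GPU
    # only sees roughly len(dataset) / world_size samples per epoch.
    sampler = DistributedSampler(
        dataset,
        num_replicas=dist.get_world_size(),
        rank=dist.get_rank(),
        shuffle=True,
    )
    # Call sampler.set_epoch(epoch) at the start of each epoch so the
    # shuffling order changes between epochs.
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)

# Without a sampler like this, every process loops over all of D, which
# matches the observation that 2 and 4 GPUs take the same wall-clock time.
```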
-
Something is odd with the trainer. When I train on 2 GPUs on a dataset D, it takes X hours; if I train on 4 GPUs, it still takes X hours, whereas I was expecting it to take X / 2 hours. Does the trainer share the batches between the subprocesses, or does each subprocess take its own batch?
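One way to check which of the two is happening is to log, on every rank, how many batches its dataloader yields per epoch. A rough diagnostic sketch, assuming a PyTorch DDP launch (`train_loader` stands in for whatever loader the trainer actually builds):

```python
import torch.distributed as dist

def log_batches_per_rank(train_loader):
    # If batches are sharded across subprocesses, each rank should report
    # about len(dataset) / (batch_size * world_size) batches.
    # If every rank reports the full len(dataset) / batch_size, each
    # subprocess is consuming the whole dataset, and adding GPUs will not
    # reduce the epoch time.
    rank = dist.get_rank() if dist.is_initialized() else 0
    print(f"rank {rank}: {len(train_loader)} batches per epoch")
```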