Data-parallel size × number of epochs = actual number of epochs? #2961
Unanswered
bobo0810
asked this question in
Community | Q&A
Replies: 1 comment 1 reply
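Hi @bobo0810, if you pass a regular PyTorch dataloader to Colossal, it is automatically converted to use a DistributedSampler. Each GPU processes its own part of the data, and together they complete one epoch over the whole dataset. epoch=3 means 3 passes over the dataset.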
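To make the sharding concrete, here is a minimal pure-Python sketch of how a distributed sampler splits indices across ranks when shuffling is off. It is a simplified model of the behaviour of `torch.utils.data.DistributedSampler` with `shuffle=False` and `drop_last=False`, not Colossal's actual code path; the helper name `shard_indices` is hypothetical.

```python
import math


def shard_indices(dataset_size, world_size, rank):
    """Indices one rank sees in one epoch (simplified model, no shuffling).

    Assumes dataset_size >= world_size so that one round of padding suffices.
    """
    indices = list(range(dataset_size))
    # Pad by repeating the first indices so every rank gets the same count.
    per_rank = math.ceil(dataset_size / world_size)
    total = per_rank * world_size
    indices += indices[: total - dataset_size]
    # Strided split: rank r takes every world_size-th index starting at r.
    return indices[rank:total:world_size]


world_size = 2
shards = [shard_indices(8, world_size, r) for r in range(world_size)]
print(shards)  # [[0, 2, 4, 6], [1, 3, 5, 7]]
```

Under this scheme the two shards are disjoint and together cover the dataset once, so one epoch across both GPUs is one pass over the data.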
1 reply
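The dataloader on each GPU holds the full dataset, with no splitting. So with epoch=3 and gpu=2, using data parallelism only, the model actually goes through the dataset 6 times.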
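The arithmetic behind the two readings in this thread can be sketched as follows. This only counts samples; it does not check which case applies to a given Colossal setup. The function name `dataset_passes` and the dataset size of 1000 are illustrative assumptions, and the dataset size is assumed to be divisible by the number of GPUs.

```python
def dataset_passes(dataset_size, epochs, world_size, sharded):
    """Equivalent number of full passes over the dataset, summed over all GPUs."""
    # Sharded: each GPU sees 1/world_size of the data per epoch.
    # Unsharded: each GPU iterates the whole dataset per epoch.
    per_rank = dataset_size // world_size if sharded else dataset_size
    total_samples = per_rank * world_size * epochs
    return total_samples / dataset_size


# epoch=3, gpu=2, hypothetical dataset of 1000 samples
print(dataset_passes(1000, 3, 2, sharded=True))   # 3.0
print(dataset_passes(1000, 3, 2, sharded=False))  # 6.0
```

One way to tell which case you are in is to print `len(dataloader)` on each rank: if it matches the single-GPU value at the same per-GPU batch size, the data is not being split.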