Replies: 2 comments
- @tjruwase can you help me with this?
- Historically,
Hi, I am new to distributed training and am using Hugging Face to train large models. I see many options for running distributed training. Can you tell me the difference between the following options?

1. `python train.py <ARGS>`
2. `python -m torch.distributed.launch train.py <ARGS>`
3. `deepspeed train.py <ARGS>`

I did not expect option 1 to use distributed training, but it even seems to use some sort of torch distributed training. In that case, what is the difference between option 1 and option 2?
Does DeepSpeed use torch.distributed in the background?
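One way to see the difference in practice is to run a small probe under each of the three commands. The sketch below is not from this thread; it only assumes PyTorch is installed, and the file/function names are illustrative. Launchers such as `torch.distributed.launch` (now `torchrun`) and the `deepspeed` launcher spawn one process per GPU and set rank-related environment variables for each worker, which a plain `python train.py` invocation normally does not.

```python
# probe.py -- minimal sketch (illustrative, not from the thread above).
# Run it under each launcher to see what the environment looks like inside
# the training process. Assumes PyTorch is installed.
import os
import torch.distributed as dist

def report_distributed_state():
    # These variables are conventionally set by torch.distributed.launch /
    # torchrun and by the deepspeed launcher; a plain `python probe.py` run
    # normally leaves them unset.
    for var in ("RANK", "WORLD_SIZE", "LOCAL_RANK", "MASTER_ADDR", "MASTER_PORT"):
        print(f"{var}={os.environ.get(var)}")
    # is_initialized() only becomes True after init_process_group() has been
    # called, which frameworks typically do when launched in distributed mode.
    print("torch.distributed initialized:",
          dist.is_available() and dist.is_initialized())

if __name__ == "__main__":
    report_distributed_state()
```

Running the same file as `python probe.py`, `python -m torch.distributed.launch --nproc_per_node=2 probe.py`, and `deepspeed --num_gpus=2 probe.py` (file name and GPU counts are placeholders) makes the contrast visible: the single-process run prints `None` for every variable, while the two launchers populate them for each worker process.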