A Problem encountered when using data parallel #1540
Unanswered
Aziily
asked this question in Community | Q&A
Hi, I think I need some help with this problem.

I am trying to use two nodes, each with a single GPU, and I have written the corresponding config in config.py. When I launch the distributed training with colossalai run, I get a WARNING.

Judging by the training logs, it is not training one model across both nodes; instead, each process starts training its own copy of the model.

Replies: 1 comment

- Did you set the 'zero' section of the config?
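For context, a hypothetical sketch of what the two pieces might look like for plain data parallelism across 2 nodes × 1 GPU. The config keys (`parallel`, `zero`) and their accepted values vary across ColossalAI versions, so treat everything below as an assumption to check against the official documentation, not a verified recipe; `hostfile` and `train.py` are placeholder names.

```python
# config.py -- hypothetical sketch; key names and accepted values depend
# on the installed ColossalAI version, so verify them against the docs.

# With 2 nodes x 1 GPU, the data-parallel degree is normally inferred
# from the launcher's world size (here, 2), so pipeline and tensor
# parallelism can stay disabled for plain data parallelism.
parallel = dict(
    pipeline=1,
    tensor=dict(size=1, mode=None),
)

# The reply above suggests enabling ZeRO; the exact shape of this
# section is version-specific, so it is left as a stub here.
# zero = dict(...)
```

The training would then be launched on both nodes with something like `colossalai run --nproc_per_node 1 --hostfile ./hostfile train.py --config config.py` (flags assumed; check `colossalai run --help` for the version you have installed). If each process logs its own independent loss curve from step 0, the processes are likely not joining one process group, which is worth ruling out before touching the ZeRO settings.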