Description
rank2: File "/mnt/data/anaconda3/envs/optimizer/lib/python3.13/site-packages/deepspeed/utils/nvtx.py", line 20, in wrapped_fn
rank2: ret_val = func(*args, **kwargs)
rank2: File "/mnt/data/anaconda3/envs/optimizer/lib/python3.13/site-packages/deepspeed/runtime/engine.py", line 2324, in backward
rank2: self._backward_epilogue()
rank2: ~~~~~~~~~~~~~~~~~~~~~~~^^
rank2: File "/mnt/data/anaconda3/envs/optimizer/lib/python3.13/site-packages/deepspeed/runtime/engine.py", line 2260, in _backward_epilogue
rank2: self.allreduce_gradients()
rank2: ~~~~~~~~~~~~~~~~~~~~~~~~^^
rank2: File "/mnt/data/anaconda3/envs/optimizer/lib/python3.13/site-packages/deepspeed/utils/nvtx.py", line 20, in wrapped_fn
rank2: ret_val = func(*args, **kwargs)
rank2: File "/mnt/data/anaconda3/envs/optimizer/lib/python3.13/site-packages/deepspeed/runtime/engine.py", line 2211, in allreduce_gradients
rank2: self.optimizer.overlapping_partition_gradients_reduce_epilogue()
rank2: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
rank2: File "/mnt/data/anaconda3/envs/optimizer/lib/python3.13/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 946, in overlapping_partition_gradients_reduce_epilogue
rank2: self.independent_gradient_partition_epilogue()
rank2: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
rank2: File "/mnt/data/anaconda3/envs/optimizer/lib/python3.13/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 860, in independent_gradient_partition_epilogue
rank2: for accumulated_grad, new_avg_grad in zip(self.all_grad_tensors[i], avg_new):
rank2: ~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
rank2: TypeError: 'NoneType' object is not iterable
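
For anyone reading the traceback: the final frame fails inside `zip(self.all_grad_tensors[i], avg_new)`, which means one of the two arguments was `None` when the loop started. The traceback alone does not say which one. Below is a minimal sketch of that failure mode in plain Python. The function names and bodies are hypothetical stand-ins for illustration, not DeepSpeed's actual code, and the guard is only a debugging aid for identifying the missing operand, not a proposed fix.

```python
# Hypothetical stand-in for the loop shown in the last traceback frame.
# Names loosely mirror the traceback; the bodies are illustrative only.

def accumulate(accumulated, avg_new):
    """Add each new averaged gradient into its accumulated counterpart."""
    # zip() calls iter() on every argument up front, so a None argument
    # raises TypeError before the first loop iteration runs.
    return [a + n for a, n in zip(accumulated, avg_new)]


def accumulate_checked(accumulated, avg_new):
    """Same loop, but fail early with a message naming the None operand."""
    if accumulated is None:
        raise ValueError("accumulated gradients are None")
    if avg_new is None:
        raise ValueError("new averaged gradients are None")
    return accumulate(accumulated, avg_new)


if __name__ == "__main__":
    print(accumulate([1.0, 2.0], [0.5, 0.5]))
    try:
        accumulate([1.0, 2.0], None)
    except TypeError as exc:
        print("TypeError:", exc)
    try:
        accumulate_checked([1.0, 2.0], None)
    except ValueError as exc:
        print("ValueError:", exc)
```

A check like `accumulate_checked` placed before the loop would show whether the accumulated gradients or the newly averaged gradients are missing on the failing rank.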