I encountered an issue while fine-tuning with the officially released code using DeepSpeed. Here is the detailed error message:
File "/lib/python3.11/site-packages/deepspeed/runtime/zero/linear.py", line 57, in forward
output = input.matmul(weight.t())
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16
The matmul operation requires both operands to have the same dtype. In my case, however, one tensor is float (float32) and the other is BFloat16.
I am not sure if this is a bug in the DeepSpeed library or an issue with my usage. I would appreciate any assistance in resolving this issue.
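For reference, here is a minimal sketch that reproduces the same kind of dtype mismatch outside DeepSpeed, assuming only PyTorch on CPU. The tensor shapes and the helper `matmul_with_matching_dtype` are hypothetical and are not part of DeepSpeed or the released code. The cast shown is one possible workaround, not a confirmed fix for this issue.

```python
import torch


def matmul_with_matching_dtype(inp, weight):
    """Hypothetical helper: cast the input to the weight's dtype, then multiply."""
    if inp.dtype != weight.dtype:
        inp = inp.to(weight.dtype)
    return inp.matmul(weight.t())


# Assumed shapes, chosen only for illustration.
inp = torch.randn(2, 4)                           # float32 by default
weight = torch.randn(3, 4, dtype=torch.bfloat16)  # BFloat16 weight

# Multiplying tensors with different dtypes raises a RuntimeError.
try:
    inp.matmul(weight.t())
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(err)

# After casting the input, both operands share a dtype and the call succeeds.
out = matmul_with_matching_dtype(inp, weight)
print(out.dtype, out.shape)
```

If the mismatch comes from float32 inputs reaching a BFloat16 model, casting the inputs (or keeping the model and the DeepSpeed precision settings consistent) may avoid the error, but I have not verified that this applies here.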