Finetune with deepspeed: type mismatch #35

@YeZiyi1998

I encountered an issue while fine-tuning with the officially released code using DeepSpeed. Here is the detailed error message:

File "/lib/python3.11/site-packages/deepspeed/runtime/zero/linear.py", line 57, in forward
output = input.matmul(weight.t())
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16

It appears that the matmul operation expects both input tensors to have the same dtype. In my case, however, one tensor is float (float32) and the other is BFloat16.
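To illustrate the mismatch outside of DeepSpeed, here is a minimal sketch in plain PyTorch. It assumes the input is float32 and the weight is bfloat16, as the error message suggests; the names `x`, `w`, and `matmul_same_dtype` are hypothetical and do not come from DeepSpeed or the released code. Casting the input to the weight's dtype is only one possible workaround, not a confirmed fix for this issue.

```python
import torch

def matmul_same_dtype(inp: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: cast the input to the weight's dtype before matmul."""
    return inp.to(weight.dtype).matmul(weight.t())

# Assumed setup: float32 input, bfloat16 weight (mirrors the error message).
x = torch.randn(2, 4)                        # dtype defaults to torch.float32
w = torch.randn(3, 4, dtype=torch.bfloat16)

mismatch_raised = False
try:
    # Same operation as in the traceback: input.matmul(weight.t())
    x.matmul(w.t())
except RuntimeError as err:
    mismatch_raised = True
    print(f"RuntimeError: {err}")

# With matching dtypes the operation goes through.
out = matmul_same_dtype(x, w)
print(out.dtype, tuple(out.shape))
```

In a real run, the equivalent would be making sure the model inputs and the model weights end up in the same precision, whichever side is changed.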

I am not sure if this is a bug in the DeepSpeed library or an issue with my usage. I would appreciate any assistance in resolving this issue.
