-
Yes, models trained with fp16/bf16 can do inference in fp32 or in the original dtype. There are rare cases where a model trained in bf16 can't do inference in fp16 or vice versa, but that can be fixed with a little finetuning in the required dtype. Some very large models may also have difficulty if they were trained in fp32 and then run inference in fp16, since activations can overflow fp16's narrower dynamic range; bf16, which keeps fp32's exponent range, will work most of the time.
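As a rough sketch of what switching inference dtype looks like in plain PyTorch (the tiny `nn.Sequential` model here is a hypothetical stand-in for a real architecture):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a model trained in fp16/bf16.
model = nn.Sequential(nn.Linear(1024, 1024), nn.GELU(), nn.Linear(1024, 1024))

# Casting the weights is all it takes to switch inference dtype.
model = model.to(torch.bfloat16)    # bf16 inference
# model = model.to(torch.float16)   # fp16 inference (can overflow for some models)
# model = model.to(torch.float32)   # fp32 inference is always safe

model.eval()
with torch.no_grad():
    x = torch.randn(1, 1024, dtype=torch.bfloat16)  # inputs must match the weight dtype
    y = model(x)
print(y.dtype)  # torch.bfloat16
```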
-
I trained a model using bf16, but afterwards I see that the checkpoint is still saved in fp32. Will this model still provide a compute performance improvement at inference, or does it need to be converted back to bf16?
In the documentation and other sources, mixed precision is only mentioned as a training speed boost.
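For what it's worth, mixed-precision training typically keeps an fp32 master copy of the weights, which is why the checkpoint comes out in fp32; to get the bf16 inference benefit you cast the weights back down after loading. A minimal sketch (the checkpoint path and the tiny model are hypothetical placeholders):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real architecture, just for illustration.
def build_model() -> nn.Module:
    return nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

# Simulate the situation: training saved an fp32 checkpoint.
torch.save(build_model().state_dict(), "checkpoint_fp32.pt")

# Load the fp32 checkpoint, then cast everything to bf16 for inference.
state = torch.load("checkpoint_fp32.pt", map_location="cpu")
print(next(iter(state.values())).dtype)  # torch.float32 -- saved in fp32

model = build_model()
model.load_state_dict(state)
model = model.to(torch.bfloat16).eval()  # casts float params and buffers to bf16
print(next(model.parameters()).dtype)    # torch.bfloat16
```

Whether the cast actually buys you speed depends on the hardware: GPUs with native bf16 support (Ampere-class and newer) benefit, while on older hardware it mostly just halves the memory footprint.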