-
Yes, models trained with fp16/bf16 can do inference in fp32 or in the original dtype. There are rare cases where a model trained in bf16 can't do inference in fp16 or vice versa, but that can be fixed with a little finetuning in the required dtype. Some very large models may also have difficulty if they were trained in fp32 and then run inference in fp16, since activations can overflow fp16's narrower dynamic range; bf16, which keeps fp32's exponent range, will work most of the time.
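As a rough sketch of what switching inference dtype looks like in plain PyTorch (the tiny `nn.Sequential` model here is a hypothetical stand-in for a real architecture):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a model trained in fp16/bf16.
model = nn.Sequential(nn.Linear(1024, 1024), nn.GELU(), nn.Linear(1024, 1024))

# Casting the weights is all it takes to switch inference dtype.
model = model.to(torch.bfloat16)    # bf16 inference
# model = model.to(torch.float16)   # fp16 inference (can overflow for some models)
# model = model.to(torch.float32)   # fp32 inference is always safe

model.eval()
with torch.no_grad():
    x = torch.randn(1, 1024, dtype=torch.bfloat16)  # inputs must match the weight dtype
    y = model(x)
print(y.dtype)  # torch.bfloat16
```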
-
I trained a model using bf16, but afterwards I see that the checkpoint is still saved in fp32. Will this model still provide a compute performance improvement at inference, or does it need to be converted back to bf16?
In the documentation and other sources, mixed precision is only mentioned as a training speed boost.
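For what it's worth, mixed-precision training typically keeps an fp32 master copy of the weights, which is why the checkpoint comes out in fp32; to get the bf16 inference benefit you cast the weights back down after loading. A minimal sketch (the checkpoint path and the tiny model are hypothetical placeholders):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real architecture, just for illustration.
def build_model() -> nn.Module:
    return nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

# Simulate the situation: training saved an fp32 checkpoint.
torch.save(build_model().state_dict(), "checkpoint_fp32.pt")

# Load the fp32 checkpoint, then cast everything to bf16 for inference.
state = torch.load("checkpoint_fp32.pt", map_location="cpu")
print(next(iter(state.values())).dtype)  # torch.float32 -- saved in fp32

model = build_model()
model.load_state_dict(state)
model = model.to(torch.bfloat16).eval()  # casts float params and buffers to bf16
print(next(model.parameters()).dtype)    # torch.bfloat16
```

Whether the cast actually buys you speed depends on the hardware: GPUs with native bf16 support (Ampere-class and newer) benefit, while on older hardware it mostly just halves the memory footprint.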