I have a large Conformer model that I export to ONNX in fp16, and I'm interested in quantizing it.
Have you tried this? Maybe with TensorRT or other frameworks? Can you recommend any working recipes for it?
If it worked for you, what kind of speedup did you get, and how much did WER change?
Among existing ASR quantization solutions, I have seen https://github.com/kssteven418/Q-ASR.
So I'm interested in your experience with quantizing Conformer or other ASR models. Could you share it, please?