What is the purpose of weight quantization and activation quantization? #3176
CoinCheung asked this question in Q&A (unanswered, 0 replies)
Hi,

I could not find enough documentation on weight/activation quantization. I can think of two possible purposes. The first is to speed up training with 8-bit or 4-bit computation, for which the weights or activations must be quantized. The second is quantization-aware training (QAT), in which we want the model to adapt during training to the precision it will later be quantized to. Could you tell me which of these is the purpose of weight/activation quantization in DeepSpeed?
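For concreteness, here is a minimal sketch of the difference between the two usages, written in plain PyTorch. This is not DeepSpeed's API; every function name and detail below is illustrative only (a common way these two ideas are implemented, not necessarily how DeepSpeed does it).

```python
# Illustrative sketch only -- not DeepSpeed code. Shows the contrast between
# QAT-style "fake" quantization and actual low-precision computation.
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """QAT-style quantization: snap values to a low-precision grid but keep
    the tensor in fp32, so training adapts to the quantization error."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    q = (x / scale).round().clamp(-qmax - 1, qmax)
    # Straight-through estimator: the forward pass sees quantized values,
    # while gradients flow through as if no rounding had happened.
    return x + (q * scale - x).detach()

def int8_matmul(w: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Speed-oriented quantization: actually compute in int8 and rescale.
    Real kernels would call hardware int8 GEMMs; this only shows the idea."""
    w_scale = w.abs().max().clamp(min=1e-8) / 127
    x_scale = x.abs().max().clamp(min=1e-8) / 127
    w_q = (w / w_scale).round().clamp(-128, 127).to(torch.int8)
    x_q = (x / x_scale).round().clamp(-128, 127).to(torch.int8)
    # Accumulate the integer product in int32, then dequantize the result.
    out = (w_q.to(torch.int32) @ x_q.to(torch.int32)).to(torch.float32)
    return out * (w_scale * x_scale)
```

In the first function the arithmetic stays in fp32 (no speedup; the goal is robustness to quantization), while in the second the matmul itself runs on integer values (the goal is throughput and memory savings).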