The precision issue of re-quantization. #9849
lingyezhixing asked this question in Q&A · Unanswered

For a 14B decoder-only model, is there a difference in the final accuracy of the IQ3_M model between quantizing directly from f16 or bf16 to IQ3_M and re-quantizing from IQ4_XS to IQ3_M? If so, how large is the difference?
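Purely as an illustration of why the two paths can differ (a minimal sketch: the toy round-to-nearest quantizer below is not the actual IQ3_M/IQ4_XS algorithm, and the tensor is random data rather than real model weights), quantizing in two steps tends to accumulate slightly more error than quantizing once from the high-precision weights:

```python
import torch

def fake_quant(x: torch.Tensor, bits: int) -> torch.Tensor:
    """Toy symmetric round-to-nearest quantizer (a stand-in, not a real IQ format)."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max() / qmax
    return torch.round(x / scale).clamp(-qmax, qmax) * scale

torch.manual_seed(0)
w = torch.randn(4096, dtype=torch.float32)  # stand-in for f16/bf16 master weights

direct_3bit = fake_quant(w, bits=3)                       # "f16/bf16 -> 3-bit"
via_4bit = fake_quant(fake_quant(w, bits=4), bits=3)      # "4-bit -> 3-bit" re-quantization

# The two-step path usually shows slightly higher RMSE, because the rounding
# errors of the first quantization are baked in before the second one runs.
print("direct RMSE  :", (w - direct_3bit).pow(2).mean().sqrt().item())
print("via 4-bit RMSE:", (w - via_4bit).pow(2).mean().sqrt().item())
```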
Replies: 1 comment
It depends on the precision the model was trained in. If it was trained in bf16, then converting to fp16 first and then to something lower will result in a higher quality loss, though it will be negligible, so it might not matter.
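As a rough illustration of the point above (a minimal sketch assuming PyTorch; the values are made up, not real weights): bf16 keeps fp32's 8-bit exponent range, while fp16 has only 5 exponent bits, so an fp16 detour can overflow very large values and underflow very small ones, on top of re-rounding the mantissa. Real weights rarely reach those ranges, which is why the extra loss is usually negligible:

```python
import torch

# Hypothetical bf16 weights: a typical value, a very large one, a very small one.
w_bf16 = torch.tensor([0.0123, 70000.0, 1e-20], dtype=torch.bfloat16)

direct = w_bf16.to(torch.float32)                    # bf16 -> fp32: exact widening
via_fp16 = w_bf16.to(torch.float16).to(torch.float32)  # bf16 -> fp16 -> fp32: lossy detour

print(direct)    # all three values survive
print(via_fp16)  # the large value overflows to inf, the tiny one underflows to 0.0
```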