The precision issue of re-quantization. #9849
lingyezhixing asked this question in Q&A · Unanswered

For a 14B decoder-only model, is there a difference in the final accuracy of the IQ3_M model between quantizing directly from f16 or bf16 to IQ3_M and re-quantizing from IQ4_XS to IQ3_M? If so, how large is the difference?
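Purely as an illustration of why the two paths can differ (a minimal sketch: the toy round-to-nearest quantizer below is not the actual IQ3_M/IQ4_XS algorithm, and the tensor is random data rather than real model weights), quantizing in two steps tends to accumulate slightly more error than quantizing once from the high-precision weights:

```python
import torch

def fake_quant(x: torch.Tensor, bits: int) -> torch.Tensor:
    """Toy symmetric round-to-nearest quantizer (a stand-in, not a real IQ format)."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max() / qmax
    return torch.round(x / scale).clamp(-qmax, qmax) * scale

torch.manual_seed(0)
w = torch.randn(4096, dtype=torch.float32)  # stand-in for f16/bf16 master weights

direct_3bit = fake_quant(w, bits=3)                       # "f16/bf16 -> 3-bit"
via_4bit = fake_quant(fake_quant(w, bits=4), bits=3)      # "4-bit -> 3-bit" re-quantization

# The two-step path usually shows slightly higher RMSE, because the rounding
# errors of the first quantization are baked in before the second one runs.
print("direct RMSE  :", (w - direct_3bit).pow(2).mean().sqrt().item())
print("via 4-bit RMSE:", (w - via_4bit).pow(2).mean().sqrt().item())
```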
Replies: 1 comment
It depends on the precision the model was trained in. If it was trained in bf16, then converting to fp16 first and then to something lower will result in a higher quality loss, though it will be negligible, so it might not matter.
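As a rough illustration of the point above (a minimal sketch assuming PyTorch; the values are made up, not real weights): bf16 keeps fp32's 8-bit exponent range, while fp16 has only 5 exponent bits, so an fp16 detour can overflow very large values and underflow very small ones, on top of re-rounding the mantissa. Real weights rarely reach those ranges, which is why the extra loss is usually negligible:

```python
import torch

# Hypothetical bf16 weights: a typical value, a very large one, a very small one.
w_bf16 = torch.tensor([0.0123, 70000.0, 1e-20], dtype=torch.bfloat16)

direct = w_bf16.to(torch.float32)                    # bf16 -> fp32: exact widening
via_fp16 = w_bf16.to(torch.float16).to(torch.float32)  # bf16 -> fp16 -> fp32: lossy detour

print(direct)    # all three values survive
print(via_fp16)  # the large value overflows to inf, the tiny one underflows to 0.0
```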