How to convert an FP16 GGUF model to a 4-bit or 5-bit GGUF model #8108
Answered by cshamis
RakshitAralimatti asked this question in Q&A
I am able to convert models to FP16 GGUF, but I am facing issues converting them further to 4-bit or 5-bit. Can anyone help me achieve this?
Answered by cshamis · Jul 2, 2024
Replies: 2 comments
- The accepted answer's command works for me.
Answer selected by RakshitAralimatti
The accepted answer's command:

% llama-quantize ../models/model_f16.gguf ../models/model_Q4_K_M Q4_K_M
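For context, the command above is the second step of a two-step flow: first convert the original model to an FP16 GGUF file, then quantize that file down to the target bit width. A sketch of the full flow follows; the model directory and output paths are placeholders, and the converter script name (`convert_hf_to_gguf.py`) matches recent llama.cpp releases:

```shell
# Step 1: convert a Hugging Face model directory to an FP16 GGUF file.
# (../models/my-model is a placeholder for your model's path.)
python convert_hf_to_gguf.py ../models/my-model \
    --outtype f16 \
    --outfile ../models/model_f16.gguf

# Step 2: quantize the FP16 GGUF to 4-bit (Q4_K_M)...
llama-quantize ../models/model_f16.gguf ../models/model_Q4_K_M.gguf Q4_K_M

# ...or to 5-bit (Q5_K_M).
llama-quantize ../models/model_f16.gguf ../models/model_Q5_K_M.gguf Q5_K_M
```

Running `llama-quantize` with no arguments prints the full list of supported quantization types, so you can pick other 4- or 5-bit variants (e.g. Q4_K_S, Q5_K_S) the same way.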