```
(base) server@Server:~/llama.cpp$ ./quantize ./models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf ./models/Phi-3-mini-4k-instruct/ggml-model-Q4_K_M.gguf Q4_K_M
main: build = 913 (eb542d3)
main: quantizing './models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf' to './models/Phi-3-mini-4k-instruct/ggml-model--Q4_K_M.gguf' as Q4_K_M
llama.cpp: loading model from ./models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf
llama_model_quantize: failed to quantize: unknown (magic, version) combination: 46554747, 00000003; is this really a GGML file?
main: failed to quantize model from './models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf'
```
I converted Phi-3 to GGUF successfully, but when I tried to quantize the resulting GGUF file to 4 bits, I hit the error above. Does anyone have a suggestion for resolving this quantization issue? Thank you in advance.
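From what I can tell, the magic `46554747` is just the four bytes "GGUF" read as a little-endian 32-bit integer (and `00000003` would be GGUF v3), while the "is this really a GGML file?" wording comes from the old pre-GGUF loader, so my `quantize` from build 913 is probably simply too old to read GGUF files at all. Below is a minimal sketch of what I plan to try, assuming the usual Makefile build; the binary name is a guess, since newer builds ship the tool as `llama-quantize` rather than `quantize`:

```bash
# Update llama.cpp to a GGUF-aware (and Phi-3-aware) build, then rebuild.
cd ~/llama.cpp
git pull
make clean && make

# Re-run the quantization with the rebuilt tool. On current builds the
# binary is ./llama-quantize; on older GGUF-era builds it is ./quantize.
./llama-quantize ./models/Phi-3-mini-4k-instruct/ggml-model-f16.gguf \
                 ./models/Phi-3-mini-4k-instruct/ggml-model-Q4_K_M.gguf Q4_K_M
```

Since Phi-3 support only landed in llama.cpp well after the switch to GGUF, the f16 GGUF was presumably produced by a much newer conversion script than the `quantize` binary was built from, which would explain the format mismatch.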