Issue converting models to new format #1419
wiseman-timelord
started this conversation in General
Replies: 1 comment 1 reply
-
You'll want to use a floating-point (non-quantized) model as input.
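For example, if an f16 GGML file is available (the filename below is only an assumption; substitute whatever your conversion step actually produced), the same invocation should work:
quantize.exe "./models/13B/gpt4-x-vicuna-13B.ggml.f16.bin" "./models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin" q5_0 21
If only the quantized .bin files are on disk, an f16 GGML has to be produced first from the original PyTorch / Hugging Face weights (e.g. with the repo's convert script); quantize cannot take a q4_0/q5_0/q5_1 file as input, which is exactly what the "type q5_0 unsupported for integer quantization" error below is reporting.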
-
quantize.exe "./models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin" "./models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin" q5_0 21
main: build = 531 (553fd4d)
main: quantizing './models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin' to './models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin' as q5_0 using 21 threads
llama.cpp: loading model from ./models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin
llama.cpp: saving model to ./models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin
[ 1/ 363] tok_embeddings.weight - 5120 x 32001, type = q5_0, llama_model_quantize: failed to quantize: type q5_0 unsupported for integer quantization
main: failed to quantize model from './models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin'
The same thing happens with the q5_1 version. I also tried "wizard-vicuna-13B.ggml.q4_0.bin" with the "q4_0" setting, "ggml-vic13b-q5_1.bin" with "q5_1", and "vicuna-13b-free-q4_0.bin" with "q4_0"; it keeps happening.
It creates files 423 KB in size and then gives up. I have 60 GB free on the drive, so it can't be that.