how to dump tensor data with data type GGML_TYPE_Q8_0? #7767
-
Dear GGML community, I'm new to quantization. Can anyone (a community developer or AI expert) explain how to dump the data in a 2D ggml tensor with data type GGML_TYPE_Q8_0? I'm not sure whether my implementation is correct. Thanks so much.
Replies: 2 comments 2 replies
-
-
This is my first time working with quantization, so thank you very much. Could you help confirm whether my implementation is correct? The code is taken directly from the place you pointed out, but I don't really understand what "y[i*qk + j] = x[i].qs[j]*d;" means.
This function in ggml-quants.c shows how to convert the data of a Q8_0 tensor to float: https://github.com/ggerganov/llama.cpp/blob/2b3389677a833cee0880226533a1768b1a9508d2/ggml-quants.c#L1609-L1623
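To make the line "y[i*qk + j] = x[i].qs[j]*d;" more concrete, here is a minimal sketch of the same idea, not the ggml code itself. A Q8_0 row is stored as blocks of 32 values: each block holds one scale d plus 32 signed 8-bit integers qs, and each original value is recovered approximately as qs[j] * d. The index i*qk + j is just the position of element j of block i in the flat float output. The names demo_block_q8_0, demo_dequantize_row_q8_0, demo_dump_row and DEMO_QK are hypothetical, and the scale is kept as a plain float as a simplifying assumption; ggml stores it as a 16-bit half float and converts it with GGML_FP16_TO_FP32.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_QK 32  /* elements per block; same value as QK8_0 in ggml */

/* Simplified stand-in for ggml's block_q8_0.
   Assumption: the scale is a plain float here, whereas ggml stores it
   as a 16-bit half float (ggml_fp16_t). */
typedef struct {
    float  d;            /* per-block scale */
    int8_t qs[DEMO_QK];  /* quantized values */
} demo_block_q8_0;

/* Convert one row of k quantized elements (k / DEMO_QK blocks) to floats. */
void demo_dequantize_row_q8_0(const demo_block_q8_0 *x, float *y, int64_t k) {
    assert(k % DEMO_QK == 0);
    const int64_t nb = k / DEMO_QK;          /* number of blocks in the row */
    for (int64_t i = 0; i < nb; ++i) {
        const float d = x[i].d;              /* scale shared by this block */
        for (int j = 0; j < DEMO_QK; ++j) {
            /* element j of block i lands at flat index i*DEMO_QK + j */
            y[i * DEMO_QK + j] = x[i].qs[j] * d;
        }
    }
}

/* Print k floats, eight per line. */
void demo_dump_row(const float *y, int64_t k) {
    for (int64_t i = 0; i < k; ++i) {
        printf("%8.4f%s", y[i], (i % 8 == 7) ? "\n" : " ");
    }
}
```

For a 2D tensor, the same conversion would be applied once per row, then each float row can be printed with something like demo_dump_row. With a real ggml tensor you would call the library's own dequantize_row_q8_0 on each row instead of this stand-in.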