what data type does GGML quantization cast to on inference? #6274

aviadingo · 2024-03-24T08:50:46Z

Hello, I would like to understand the casting that happens during inference when an input signal passes through a quantized layer.
If I understand correctly, the input x is a float, right?
The layer itself is quantized, let's say to INT8.

Do the layer weights get cast back to floats for the computation, or does the input signal get cast to int instead?
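
To make the two possibilities concrete, here is a minimal sketch in plain Python. It is not taken from GGML; the symmetric int8 scheme with one scale per vector and all the function names (`quantize_int8`, `dot_dequantize_weights`, `dot_quantize_input`) are assumptions made purely for illustration.

```python
# Illustrative only: NOT GGML code. Symmetric int8 quantization with a
# single scale per vector is an assumption made for this sketch.

def quantize_int8(values):
    """Return (int8 codes, scale) such that value is roughly code * scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dot_dequantize_weights(x, w_codes, w_scale):
    """Option A: cast the weights back to float, then do a float dot product."""
    w_float = [c * w_scale for c in w_codes]
    return sum(xi * wi for xi, wi in zip(x, w_float))


def dot_quantize_input(x, w_codes, w_scale):
    """Option B: quantize the input too, accumulate in integers, rescale once."""
    x_codes, x_scale = quantize_int8(x)
    acc = sum(xc * wc for xc, wc in zip(x_codes, w_codes))  # integer accumulator
    return acc * x_scale * w_scale


# Hypothetical data: one row of layer weights and one float input vector.
weights = [0.5, -1.0, 0.25, 0.75]
x = [1.0, 2.0, -3.0, 0.5]

w_codes, w_scale = quantize_int8(weights)
exact = sum(xi * wi for xi, wi in zip(x, weights))
option_a = dot_dequantize_weights(x, w_codes, w_scale)
option_b = dot_quantize_input(x, w_codes, w_scale)

print("float reference:", exact)
print("option A (dequantize weights):", option_a)
print("option B (quantize input):", option_b)
```

Both options approximate the float reference. Option A carries only the weight quantization error, while option B adds the input quantization error in exchange for integer arithmetic in the inner loop.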

I would also love to know where I can find the answer in the code.

Thanks

Replies: 0 comments

Select a reply

Uh oh!

what data type does GGML quantization cast to on inference? #6274

Uh oh!

aviadingo Mar 24, 2024

Replies: 0 comments

aviadingo
Mar 24, 2024