-
According to this paper, per-channel quant works better for the K state. Per-channel quant is a column-major quantization of matrix A. So I wonder whether llama.cpp supports this per-channel quant.
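To make the terminology concrete, here is a minimal sketch of per-channel (column-wise) int8 quantization: each column of A gets its own scale, so an outlier in one channel does not inflate the error of every other channel. `quantize_per_channel` and `dequantize` are hypothetical helpers for illustration, not llama.cpp APIs (llama.cpp's real quant types use small fixed-size blocks with their own scales).

```python
import numpy as np

def quantize_per_channel(A, bits=8):
    """Per-channel (column-major) quantization sketch:
    one scale per column of A."""
    qmax = 2 ** (bits - 1) - 1                   # 127 for int8
    scales = np.abs(A).max(axis=0) / qmax        # one scale per column
    scales = np.where(scales == 0, 1.0, scales)  # guard all-zero columns
    q = np.clip(np.round(A / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    return q.astype(np.float32) * scales

A = np.random.randn(64, 8).astype(np.float32)
q, scales = quantize_per_channel(A)
A_hat = dequantize(q, scales)
print(np.abs(A - A_hat).max())  # per-element error bounded by scale/2
```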
-
I am researching quantization in llama.cpp. It seems all quant types quantize weights along one row of matrix A and one column of matrix B when calculating C = A x B. Does llama.cpp provide an SGEMM for column-major quantization?
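The reason row-of-A / column-of-B quantization is the common layout: in C = A x B, a row of A and a column of B both run along the shared k dimension, so the integer dot product can be rescaled with a single multiply per output element. A simplified sketch under assumed names (`quantize_per_row` is hypothetical; llama.cpp actually scales small blocks within each row, not whole rows):

```python
import numpy as np

def quantize_per_row(M, bits=8):
    """Row-major quantization sketch: one scale per row of M."""
    qmax = 2 ** (bits - 1) - 1
    scales = np.abs(M).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)
    q = np.clip(np.round(M / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales

A = np.random.randn(4, 16).astype(np.float32)
B = np.random.randn(16, 3).astype(np.float32)

qA, sA = quantize_per_row(A)    # one scale per row of A
qB, sB = quantize_per_row(B.T)  # one scale per column of B

# Integer GEMM over the shared k dimension, then rescale each
# output element by the product of the two matching scales.
C_int = qA.astype(np.int32) @ qB.astype(np.int32).T
C_hat = C_int.astype(np.float32) * sA * sB.T
print(np.abs(A @ B - C_hat).max())  # small approximation error
```

A column-major (per-channel) quant of A would instead put a different scale on every term of the dot product, so the kernel could no longer factor the scales out of the inner loop, which is why it needs a different SGEMM path.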