Replies: 2 comments 1 reply
-
GPU info. I have tried to set …
0 replies
-
Oh, I got it. The CPU backend will first quantize …
1 reply
-
I am trying to use the Vulkan backend in my project chatllm.cpp, and I am having trouble with the `mat_mult` operator, where `w` is `Q8_0`, and `input` & `output` are `F32`. The result differs slightly from the CPU result (`w` and `input` are exactly the same).

Dumped data (here, `input` is just a vector):

Plot of point-wise error:
I think this might be caused by a wrong flag or a missing function call in my code. @0cc4m, could you provide some hints?
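
For anyone who wants to reproduce this kind of small mismatch without a GPU, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not code from ggml or chatllm.cpp: it assumes `Q8_0` means blocks of 32 values stored as int8 with one shared half-precision scale, and it assumes one code path quantizes the `F32` input the same way before an integer dot product while the other multiplies the dequantized weights by the unquantized input. All function names are hypothetical.

```python
import random
import struct

BLOCK = 32  # assumed Q8_0 block size for this sketch


def to_f16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_q8_0(values):
    """Quantize floats into (scale, int8 list) blocks of BLOCK elements."""
    blocks = []
    for i in range(0, len(values), BLOCK):
        chunk = values[i:i + BLOCK]
        amax = max(abs(v) for v in chunk)
        d = amax / 127.0                      # one scale per block
        inv = 1.0 / d if d else 0.0
        q = [round(v * inv) for v in chunk]   # integers in [-127, 127]
        blocks.append((to_f16(d), q))         # scale is stored as half precision
    return blocks


def dequantize(blocks):
    """Expand (scale, ints) blocks back into a flat list of floats."""
    return [d * q for d, qs in blocks for q in qs]


def dot_float_input(w_blocks, x):
    """Path A: dequantize the weights, keep the F32 input untouched."""
    w = dequantize(w_blocks)
    return sum(wi * xi for wi, xi in zip(w, x))


def dot_quantized_input(w_blocks, x):
    """Path B: quantize the input as well, then do an integer dot per block."""
    x_blocks = quantize_q8_0(x)
    total = 0.0
    for (dw, qw), (dx, qx) in zip(w_blocks, x_blocks):
        total += dw * dx * sum(a * b for a, b in zip(qw, qx))
    return total


def pointwise_error(rows, cols, seed=0):
    """Return path B minus path A for each row of a random weight matrix."""
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(cols)]
    errors = []
    for _ in range(rows):
        w_blocks = quantize_q8_0([rng.uniform(-1, 1) for _ in range(cols)])
        errors.append(dot_quantized_input(w_blocks, x) - dot_float_input(w_blocks, x))
    return errors


if __name__ == "__main__":
    errs = pointwise_error(rows=8, cols=256)
    print("max abs point-wise error:", max(abs(e) for e in errs))
```

Both paths use exactly the same quantized `w` and the same `input`, so any difference comes only from how the input is handled. If the error between two backends looks like the output of `pointwise_error`, the gap is more likely a difference in arithmetic path than a missing flag or call.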