Rounding in '/blck_size' calculations #4083
AndrewGodfrey started this conversation in General
I was reading `ggml_nbytes` while researching #3644, and noticed: this division rounds down.
I imagine this is usually irrelevant because `ne[0]` tends to be a multiple of `blck_size`. But it feels like it's just a matter of time before this causes problems, if it hasn't already.
For `GGML_TYPE_Q8_0` (`blck_size == 32`) maybe it hasn't. But the code also defines `QK_K == 256`, and that doesn't divide evenly into 3200 or 8640.
A similar calculation is done in some other places too; e.g. `ggml_new_tensor_impl` has

```
ne[0]/ggml_blck_size(type)
```

What I'd expect to see for a block calculation would be `numBlocks * blck_size`, where `numBlocks = (ne[0] + blck_size - 1) / blck_size`.
P.S. In `finetune.cpp` there are plenty of tensors whose `ne[0]` isn't a multiple of 32, but those tend to be F32 or F16 and so don't go through this code path.