Rounding in '/blck_size' calculations #4083
AndrewGodfrey started this conversation in General
I was reading `ggml_nbytes` while researching #3644, and noticed: this division rounds down.
I imagine this is usually irrelevant because `ne[0]` tends to be a multiple of `blck_size`. But it feels like it's just a matter of time before this causes problems, if it hasn't already.
For `GGML_TYPE_Q8_0` (`blck_size == 32`) maybe it hasn't. But the code also defines `QK_K == 256`, and that doesn't divide evenly into 3200 or 8640.
A similar calculation is done in some other places too; e.g. `ggml_new_tensor_impl` has

```
ne[0]/ggml_blck_size(type)
```

What I'd expect to see for a block calculation would be `numBlocks * blck_size`, where `numBlocks = (ne[0] + blck_size - 1) / blck_size`.
P.S. In `finetune.cpp` there are plenty of tensors whose `ne[0]` isn't a multiple of 32, but those tend to be F32 or F16 and so don't go through this code path.