What is the principle of quantization (e.g. Q4_0)? #3546
-
I want to learn the principle behind quantization, but it's a little difficult for me to read the source code, so I would appreciate a formula or description. Thank you!
Replies: 1 comment 1 reply
-
The purpose, of course, is to use less storage space/memory (it can also speed up running the model). The principle is to fit a large range of values into a smaller range while preserving as much accuracy as possible.

Not sure if it'll help you, but I wrote a pretty simple Python implementation of `q8_0` a while back:

#### Mini Q8_0 quantization in Python

```python
import numpy as np

QK8_0 = 32
BLOCK_Q8_0 = np.dtype([('d', '<f2'), ('qs', 'i1', (QK8_0,))])

def quantize_array_q8_0(arr):
    assert arr.size % QK8_0 == 0 and arr.size != 0, f'Bad array size {arr.size}'
    assert arr.dtype == np.float32, f'Bad array type {arr.dtype}'
    n_blocks = arr.size // QK8_0
    blocks = arr.reshape((n_blocks, QK8_0))
    return np.fromiter(map(quantize_block_q8_0, blocks), count=n_blocks, dtype=BLOCK_Q8_0)

def quantize_block_q8_0(blk):
    # d is the block's scale: the largest magnitude in the block maps to 127.
    d = abs(blk).max() / np.float32(127)
    if d == np.float32(0):
        return (np.float16(d), (np.int8(0),) * QK8_0)
    return (np.float16(d), (blk * (np.float32(1) / d)).round())
```

There's a much faster (but possibly harder to understand) version in the repository.
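To see the principle concretely, you can round-trip an array: quantize it, then reconstruct each value as `x ≈ d * qs` and inspect the error. Below is a self-contained sketch of that round trip; it re-declares the same block layout, uses a plain loop instead of `np.fromiter`, and names like `quantize_q8_0`/`dequantize_q8_0` are just for this illustration:

```python
import numpy as np

QK8_0 = 32
BLOCK_Q8_0 = np.dtype([('d', '<f2'), ('qs', 'i1', (QK8_0,))])

def quantize_q8_0(arr):
    # Quantize a float32 array whose size is a multiple of QK8_0.
    blocks = arr.reshape(-1, QK8_0)
    out = np.empty(len(blocks), dtype=BLOCK_Q8_0)
    for i, blk in enumerate(blocks):
        d = np.abs(blk).max() / np.float32(127)
        out[i]['d'] = d                      # scale, stored as float16
        out[i]['qs'] = 0 if d == 0 else (blk / d).round()
    return out

def dequantize_q8_0(blocks):
    # Reconstruct: x ~= d * qs, block by block.
    return (blocks['d'].astype(np.float32)[:, None]
            * blocks['qs'].astype(np.float32)).reshape(-1)
```

With a scale of `max/127`, the rounding error per element is at most about `d/2`, i.e. under half a percent of the block's largest magnitude (plus a tiny extra from storing `d` as float16).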
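Since the question mentions Q4_0 specifically: the idea is the same as Q8_0, just with 16 levels instead of 255 and an implicit offset of 8. The sketch below is based on my reading of ggml's `block_q4_0` and should be treated as an approximation of the real C code; it also skips the step where ggml packs two 4-bit values into each byte:

```python
import numpy as np

QK4_0 = 32  # block size, same as Q8_0

def quantize_block_q4_0(blk):
    # Use the element with the largest magnitude, sign included, so that
    # it lands exactly on quant level 0 (i.e. value -8 after the offset).
    m = blk[np.abs(blk).argmax()]
    d = m / np.float32(-8)
    if d == 0:
        return np.float16(0), np.full(QK4_0, 8, dtype=np.uint8)
    # Scale into [-8, 8], shift by 8.5 and truncate (round-half-up into
    # [0, 16]), then clamp to the 4-bit range [0, 15].
    q = np.clip((blk / d + np.float32(8.5)).astype(np.int8), 0, 15)
    return np.float16(d), q.astype(np.uint8)

def dequantize_block_q4_0(d, q):
    # Undo the offset and rescale: x ~= (q - 8) * d.
    return (q.astype(np.float32) - np.float32(8)) * np.float32(d)
```

With only 16 levels, the per-element error is on the order of `|d|/2`, roughly 1/16 of the block's largest magnitude, which is why Q4_0 is noticeably lossier than Q8_0.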