
Commit eba4c36

fixing bug in GPTQ (#120)
* fixing bug in GPTQ

  Summary: the shape was always padded, even when padding was not needed.

  Test Plan: python test/quantization/test_quant_api.py -k "test_gptq_quantizer_int4wo"

  Reviewers: Subscribers: Tasks: Tags:

* removing extra spaces

  Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
1 parent ec258e0 commit eba4c36

File tree

1 file changed: 4 additions, 1 deletion


torchao/quantization/GPTQ.py

Lines changed: 4 additions & 1 deletion
@@ -950,7 +950,10 @@ def __init__(
         # TODO: this is the gpt-fast version, merge with the main version later
         def make_names_and_values_dict_func(q, qparams):
             k = q.shape[1]
-            new_k = find_multiple(k, 1024)
+            if not _check_linear_int4_k(k, groupsize):
+                new_k = find_multiple(k, 1024)
+            else:
+                new_k = k
             # how much we need to pad the weight
             delta_k = new_k - q.shape[1]
             q = q.to(torch.int32)
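
For readers following along, below is a minimal standalone sketch of the padding logic after this fix. The helper bodies are illustrative assumptions in the gpt-fast / torchao style (find_multiple rounds up to the next multiple; _check_linear_int4_k tests whether k already satisfies the int4 kernel's groupsize and tile constraints), and pad_weight_if_needed is a hypothetical wrapper, not the library's API.

import torch
import torch.nn.functional as F


def find_multiple(n: int, k: int) -> int:
    # Round n up to the nearest multiple of k (gpt-fast style helper).
    if n % k == 0:
        return n
    return n + k - (n % k)


def _check_linear_int4_k(k: int, groupsize: int = 1, inner_k_tiles: int = 1) -> bool:
    # Simplified stand-in: k is usable as-is when it divides evenly into the
    # quantization groupsize and the kernel's inner tile width.
    return k % groupsize == 0 and k % (inner_k_tiles * 16) == 0


def pad_weight_if_needed(q: torch.Tensor, groupsize: int) -> torch.Tensor:
    # Hypothetical helper mirroring the fixed logic: only pad the inner
    # dimension when k does not already satisfy the int4 kernel constraints.
    k = q.shape[1]
    if not _check_linear_int4_k(k, groupsize):
        new_k = find_multiple(k, 1024)
    else:
        new_k = k
    delta_k = new_k - k            # how much we need to pad the weight
    q = q.to(torch.int32)
    return F.pad(q, (0, delta_k))  # pad the last (k) dimension on the right


# With groupsize = 128, k = 768 passes the check and stays unpadded;
# the old unconditional find_multiple(768, 1024) would have padded it to 1024.
q = torch.randint(0, 16, (256, 768))
print(pad_weight_if_needed(q, groupsize=128).shape)  # torch.Size([256, 768])

This is the behavioral difference the commit message describes: before the fix, any k that was not already a multiple of 1024 was padded, even when the int4 kernel could have consumed it directly.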
