Commit ee6bc28

Lyt/blockwise (#1441)
* [Algo] blockwise tuning Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [Algo] code update Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [Algo] sq argument update Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [Algo] log update Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] code update Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] fix bugs Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] log update Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] enable blockwise on Llama models Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [Algo] enable blockwise on Llama models Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] code update Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] format code Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [Algo] fix bug Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [Algo] add ut Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] fix format issue Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [Algo] log update Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] move do_blockwise arg Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] fix bug Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] fix bug Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] fix bug Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* [Algo] fix bug Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
* [Algo] fix bug Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
---------
Signed-off-by: Lu, Yintong <yintong.lu@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 1f236d5 commit ee6bc28

7 files changed: +300 -42 lines changed

neural_compressor/adaptor/onnxrt.py

Lines changed: 8 additions & 1 deletion
@@ -175,7 +175,13 @@ def smooth_quant(
         scales_per_op=True,
         record_max_info=False,
         weight_clip=True,
-        auto_alpha_args={"alpha_min": 0.0, "alpha_max": 1.0, "alpha_step": 0.1, "shared_criterion": "mean"},
+        auto_alpha_args={
+            "alpha_min": 0.0,
+            "alpha_max": 1.0,
+            "alpha_step": 0.1,
+            "shared_criterion": "mean",
+            "do_blockwise": False,
+        },
         default_alpha=0.5,
     ):
         """Get augmented model with smooth quant.
@@ -194,6 +200,7 @@ def smooth_quant(
             weight_clip: Whether to clip weight when calculating scales; by default it is on.
             auto_alpha_args: Hyperparameters used to set the alpha search space in SQ auto-tuning.
                 By default the search space is 0.0-1.0 with step_size 0.1.
+                do_blockwise: Whether to do blockwise auto-tuning.
             default_alpha: A hyperparameter that is used in SQ auto-tuning; by default it is 0.5.
 
         Returns:
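
Reviewer note: the docstring describes the default alpha search space as 0.0-1.0 with a step of 0.1. The sketch below shows one way such a grid could be enumerated from the `auto_alpha_args` keys. `alpha_candidates` is a hypothetical helper, not a function in this commit, and treating both endpoints as inclusive is an assumption.

```python
from decimal import Decimal


def alpha_candidates(auto_alpha_args):
    """Enumerate alpha values from alpha_min to alpha_max.

    Hypothetical helper for illustration only. Assumes both endpoints
    are included in the search space.
    """
    # Decimal avoids float drift such as 0.1 + 0.2 != 0.3.
    lo = Decimal(str(auto_alpha_args["alpha_min"]))
    hi = Decimal(str(auto_alpha_args["alpha_max"]))
    step = Decimal(str(auto_alpha_args["alpha_step"]))
    if step <= 0:
        raise ValueError("alpha_step must be positive")

    values = []
    current = lo
    while current <= hi:
        values.append(float(current))
        current += step
    return values


# Defaults copied from the new signature in this diff.
defaults = {
    "alpha_min": 0.0,
    "alpha_max": 1.0,
    "alpha_step": 0.1,
    "shared_criterion": "mean",
    "do_blockwise": False,
}
grid = alpha_candidates(defaults)
```

With the defaults this yields eleven candidates. The `shared_criterion` and `do_blockwise` keys do not affect the grid itself.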

neural_compressor/adaptor/pytorch.py

Lines changed: 8 additions & 1 deletion
@@ -1737,7 +1737,13 @@ def smooth_quant(
         force_re_smooth=False,
         record_max_info=False,
         weight_clip=True,
-        auto_alpha_args={"alpha_min": 0.0, "alpha_max": 1.0, "alpha_step": 0.1, "shared_criterion": "mean"},
+        auto_alpha_args={
+            "alpha_min": 0.0,
+            "alpha_max": 1.0,
+            "alpha_step": 0.1,
+            "shared_criterion": "mean",
+            "do_blockwise": False,
+        },
         default_alpha=0.5,
     ):
         """Convert the model by smooth quant.
@@ -1756,6 +1762,7 @@ def smooth_quant(
             weight_clip: Whether to clip weight when calculating scales; by default it is on.
             auto_alpha_args: Hyperparameters used to set the alpha search space in SQ auto-tuning.
                 By default the search space is 0.0-1.0 with step_size 0.1.
+                do_blockwise determines whether to do blockwise auto-tuning.
             default_alpha: A hyperparameter that is used in SQ auto-tuning; by default it is 0.5.
 
         Returns:
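
Reviewer note: `do_blockwise` is added as a key inside `auto_alpha_args`, not as a separate parameter, so a caller who still passes the older four-key dictionary replaces the whole default and omits the new key. The hunks shown here do not say how the adaptor handles that case. A minimal sketch of one defensive pattern follows; `resolve_auto_alpha_args` and `DEFAULT_AUTO_ALPHA_ARGS` are hypothetical names, not part of this commit.

```python
# Defaults copied from the new signature in this diff.
DEFAULT_AUTO_ALPHA_ARGS = {
    "alpha_min": 0.0,
    "alpha_max": 1.0,
    "alpha_step": 0.1,
    "shared_criterion": "mean",
    "do_blockwise": False,
}


def resolve_auto_alpha_args(user_args=None):
    """Overlay user-supplied keys on a fresh copy of the defaults.

    Hypothetical helper for illustration only. Copying also keeps
    callers from mutating a dictionary shared across calls.
    """
    resolved = dict(DEFAULT_AUTO_ALPHA_ARGS)
    if user_args:
        resolved.update(user_args)
    return resolved


# A pre-existing call site that predates the do_blockwise key.
legacy = {"alpha_min": 0.3, "alpha_max": 0.7, "alpha_step": 0.05, "shared_criterion": "mean"}
resolved = resolve_auto_alpha_args(legacy)

# Opting in to blockwise tuning while keeping every other default.
blockwise = resolve_auto_alpha_args({"do_blockwise": True})
```

Under this pattern, `resolved["do_blockwise"]` falls back to `False` and the user's alpha range is preserved.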

neural_compressor/adaptor/tensorflow.py

Lines changed: 8 additions & 1 deletion
@@ -1833,7 +1833,13 @@ def smooth_quant(
         scales_per_op=True,
         record_max_info=False,
         weight_clip=True,
-        auto_alpha_args={"alpha_min": 0.0, "alpha_max": 1.0, "alpha_step": 0.1, "shared_criterion": "mean"},
+        auto_alpha_args={
+            "alpha_min": 0.0,
+            "alpha_max": 1.0,
+            "alpha_step": 0.1,
+            "shared_criterion": "mean",
+            "do_blockwise": False,
+        },
         default_alpha=0.5,
     ):
         """Convert the model by smooth quant.
@@ -1852,6 +1858,7 @@ def smooth_quant(
             weight_clip: Whether to clip weight when calculating scales; by default it is on.
             auto_alpha_args: Hyperparameters used to set the alpha search space in SQ auto-tuning.
                 By default the search space is 0.0-1.0 with step_size 0.1.
+                do_blockwise: Whether to do blockwise auto-tuning.
             default_alpha: A hyperparameter that is used in SQ auto-tuning; by default it is 0.5.
 
         Returns:
