@yiliu30
Contributor

@yiliu30 yiliu30 commented Apr 28, 2024

Type of Change

feature or bug fix or documentation or validation or others
API changed or not

Description

Register PT2E static quantization

  • Align the W8A8StaticQuantizer with Quantizer
  • Add an export API
  • Map the StaticQuantConfig to X86InductorQuantizer's config
  • Export support for IPEX will be added in a separate PR

[Image: static_quant_path]

Usage

# User script
model = UserModel()
example_inputs = ...

# Quantization script

# Uncomment to use IPEX's static quantization:
# import intel_extension_for_pytorch

import torch
from torch.export import Dim

from neural_compressor.torch.quantization import get_default_static_config, prepare, convert
from neural_compressor.torch.export import export

# Export the eager model, marking the sequence-length dimension as dynamic
dynamic_shapes = {"input_ids": (None, Dim("seq_len"))}
exported_model = export(model, example_inputs=example_inputs, dynamic_shapes=dynamic_shapes)

# Prepare the exported model for calibration
quant_config = get_default_static_config()
prepare_model = prepare(exported_model, quant_config)

# Calibrate: run_fn is a user-supplied function that runs calibration data through the model
run_fn(prepare_model)

# Convert the calibrated model to its quantized form
converted_model = convert(prepare_model)

# Compile and run inference
opt_model = torch.compile(converted_model)
out = opt_model(*example_inputs)
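
The usage above calls `run_fn` without defining it. As a rough sketch, assuming calibration data is an iterable of positional-argument tuples, a calibration function could look like the following. The helper `make_run_fn` and its `max_batches` parameter are hypothetical names introduced here, not part of the library; with a real PyTorch model the loop would typically run under `torch.no_grad()`.

```python
def make_run_fn(calib_batches, max_batches=None):
    """Build a run_fn(model) that feeds calibration batches through the model.

    calib_batches: iterable of tuples, each unpacked as the model's positional inputs.
    max_batches: optional cap on how many batches are used (hypothetical knob).
    """
    def run_fn(model):
        for i, args in enumerate(calib_batches):
            if max_batches is not None and i >= max_batches:
                break
            # Outputs are discarded; the forward pass exists only so that
            # observers inside the prepared model can record statistics.
            model(*args)
    return run_fn


# Demonstration with a plain callable standing in for the prepared model.
seen = []
run_fn = make_run_fn([(1,), (2,), (3,)], max_batches=2)
run_fn(lambda x: seen.append(x))
print(seen)
```

Closing over the data keeps the `run_fn(prepare_model)` call shape used in the usage section.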

@ftian1 @xin3he @violetch24

How has this PR been tested?

Pre-CI

Dependency Change?

None

yiliu30 added 7 commits April 27, 2024 22:57
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 marked this pull request as ready for review April 29, 2024 05:33
@yiliu30 yiliu30 requested review from ftian1, xin3he and yuwenzho April 29, 2024 05:33
@github-actions

github-actions bot commented Apr 29, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Code Scan Tests workflow
Check ID Status Error details
Code-Scan success
Code-Scan (Bandit Code Scan Bandit) success
Code-Scan (DocStyle Code Scan DocStyle) success
Code-Scan (Pylint Code Scan Pylint) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/__init__.py, neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/export/__init__.py, neural_compressor/torch/export/_export.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/constants.py, neural_compressor/torch/utils/environ.py, neural_compressor/torch/utils/utility.py.

🟢 Model Tests 3x workflow
Check ID Status Error details
Model-Test-3x success
Model-Test-3x (Generate Report GenerateReport) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/__init__.py, neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/export/__init__.py, neural_compressor/torch/export/_export.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/constants.py, neural_compressor/torch/utils/environ.py, neural_compressor/torch/utils/utility.py.

🟢 Unit Tests 3x-PyTorch workflow
Check ID Status Error details
UT-3x-Torch success
UT-3x-Torch (Coverage Compare CollectDatafiles) success
UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch) success
UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/__init__.py, neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/export/__init__.py, neural_compressor/torch/export/_export.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/constants.py, neural_compressor/torch/utils/environ.py, neural_compressor/torch/utils/utility.py, test/3x/torch/algorithms/pt2e_quant/test_pt2e_w8a8.py, test/3x/torch/quantization/test_pt2e_quant.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

yiliu30 added 3 commits April 30, 2024 15:26
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 added PyTorch Related to PyTorch F/W INC3.X WIP labels Apr 30, 2024
yiliu30 and others added 8 commits May 8, 2024 11:32
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 removed the WIP label May 8, 2024
@chensuyue
Contributor

@xin3he please review.

@yiliu30 yiliu30 merged commit 43c3580 into master May 9, 2024
@yiliu30 yiliu30 deleted the pt2e_entry branch May 9, 2024 06:56
Labels

INC3.X PyTorch Related to PyTorch F/W

5 participants