Commit 3e1b9d4

Update readme for v2.3 release (#1258)
Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent f7a3369 commit 3e1b9d4

6 files changed: +43 -29 lines changed

README.md

Lines changed: 9 additions & 9 deletions
@@ -5,7 +5,7 @@ Intel® Neural Compressor
 <h3> An open-source Python library supporting popular model compression techniques on all mainstream deep learning frameworks (TensorFlow, PyTorch, ONNX Runtime, and MXNet)</h3>
 
 [![python](https://img.shields.io/badge/python-3.7%2B-blue)](https://github.com/intel/neural-compressor)
-[![version](https://img.shields.io/badge/release-2.2-green)](https://github.com/intel/neural-compressor/releases)
+[![version](https://img.shields.io/badge/release-2.3-green)](https://github.com/intel/neural-compressor/releases)
 [![license](https://img.shields.io/badge/license-Apache%202-blue)](https://github.com/intel/neural-compressor/blob/master/LICENSE)
 [![coverage](https://img.shields.io/badge/coverage-85%25-green)](https://github.com/intel/neural-compressor)
 [![Downloads](https://static.pepy.tech/personalized-badge/neural-compressor?period=total&units=international_system&left_color=grey&right_color=green&left_text=downloads)](https://pepy.tech/project/neural-compressor)
@@ -21,9 +21,9 @@ In particular, the tool provides the key features, typical examples, and open co
 
 * Support a wide range of Intel hardware such as [Intel Xeon Scalable processor](https://www.intel.com/content/www/us/en/products/details/processors/xeon/scalable.html), [Intel Xeon CPU Max Series](https://www.intel.com/content/www/us/en/products/details/processors/xeon/max-series.html), [Intel Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/flex-series.html), and [Intel Data Center GPU Max Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/max-series.html) with extensive testing; support AMD CPU, ARM CPU, and NVidia GPU through ONNX Runtime with limited testing
 
-* Validate more than 10,000 models such as [Bloom-176B](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/ipex/smooth_quant), [OPT-6.7B](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/ipex/smooth_quant), [Stable Diffusion](/examples/pytorch/nlp/huggingface_models/text-to-image/quantization), [GPT-J](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/fx), [BERT-Large](/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/fx), and [ResNet50](/examples/pytorch/image_recognition/torchvision_models/quantization/ptq/cpu/fx) from popular model hubs such as [Hugging Face](https://huggingface.co/), [Torch Vision](https://pytorch.org/vision/stable/index.html), and [ONNX Model Zoo](https://github.com/onnx/models#models), by leveraging zero-code optimization solution [Neural Coder](/neural_coder#what-do-we-offer) and automatic [accuracy-driven](/docs/source/design.md#workflow) quantization strategies
+* Validate popular LLMs such as LLama2, [LLama](examples/onnxrt/nlp/huggingface_model/text_generation/llama/quantization/ptq_static), [MPT](https://github.com/intel/intel-extension-for-transformers/blob/main/examples/huggingface/pytorch/text-generation/quantization/README.md), [Falcon](https://github.com/intel/intel-extension-for-transformers/blob/main/examples/huggingface/pytorch/language-modeling/quantization/README.md), [GPT-J](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/fx), [Bloom](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/ipex/smooth_quant), [OPT](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/ipex/smooth_quant), and more than 10,000 broad models such as [Stable Diffusion](/examples/pytorch/nlp/huggingface_models/text-to-image/quantization), [BERT-Large](/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/fx), and [ResNet50](/examples/pytorch/image_recognition/torchvision_models/quantization/ptq/cpu/fx) from popular model hubs such as [Hugging Face](https://huggingface.co/), [Torch Vision](https://pytorch.org/vision/stable/index.html), and [ONNX Model Zoo](https://github.com/onnx/models#models), by leveraging zero-code optimization solution [Neural Coder](/neural_coder#what-do-we-offer) and automatic [accuracy-driven](/docs/source/design.md#workflow) quantization strategies
 
-* Collaborate with cloud marketplace such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel), software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html) and [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00), and open AI ecosystem such as [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)
+* Collaborate with cloud marketplace such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel), software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html), [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00) and [Microsoft Olive](https://github.com/microsoft/Olive), and open AI ecosystem such as [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), [ONNX Runtime](https://github.com/microsoft/onnxruntime), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)
 
 ## Installation
 
@@ -120,7 +120,7 @@ q_model = fit(
 <td colspan="2" align="center"><a href="./docs/source/smooth_quant.md">SmoothQuant</td>
 </tr>
 <tr>
-<td colspan="8" align="center"><a href="./docs/source/quantization_weight_only.md">Weight-Only Quantization</td>
+<td colspan="8" align="center"><a href="./docs/source/quantization_weight_only.md">Weight-Only Quantization (INT8/INT4/FP4/NF4) </td>
 </tr>
 </tbody>
 <thead>
@@ -139,10 +139,9 @@ q_model = fit(
 > More documentations can be found at [User Guide](./docs/source/user_guide.md).
 
 ## Selected Publications/Events
+* arXiv: [Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs](https://arxiv.org/abs/2309.05516) (Sep 2023)
 * Post on Social Media: [ONNXCommunityMeetup2023: INT8 Quantization for Large Language Models with Intel Neural Compressor](https://www.youtube.com/watch?v=luYBWA1Q5pQ) (July 2023)
 * Blog by Intel: [Accelerate Llama 2 with Intel AI Hardware and Software Optimizations](https://www.intel.com/content/www/us/en/developer/articles/news/llama2.html) (July 2023)
-* Blog on Medium: [Quantization Accuracy Loss Diagnosis with Neural Insights](https://medium.com/@NeuralCompressor/quantization-accuracy-loss-diagnosis-with-neural-insights-5d73f4ca2601) (Aug 2023)
-* Blog on Medium: [Faster Stable Diffusion Inference with Intel Extension for Transformers](https://medium.com/intel-analytics-software/faster-stable-diffusion-inference-with-intel-extension-for-transformers-on-intel-platforms-7e0f563186b0) (July 2023)
 * NeurIPS'2022: [Fast Distilbert on CPUs](https://arxiv.org/abs/2211.07715) (Oct 2022)
 * NeurIPS'2022: [QuaLA-MiniLM: a Quantized Length Adaptive MiniLM](https://arxiv.org/abs/2210.17114) (Oct 2022)
 
@@ -155,6 +154,7 @@ q_model = fit(
 * [Legal Information](./docs/source/legal_information.md)
 * [Security Policy](SECURITY.md)
 
-## Research Collaborations
-
-Welcome to raise any interesting research ideas on model compression techniques and feel free to reach us ([inc.maintainers@intel.com](mailto:inc.maintainers@intel.com)). Look forward to our collaborations on Intel Neural Compressor!
+## Communication
+- [GitHub Issues](https://github.com/intel/neural-compressor/issues): mainly for bugs report, new feature request, question asking, etc.
+- [Email](mailto:inc.maintainers@intel.com): welcome to raise any interesting research ideas on model compression techniques by email for collaborations.
+- [WeChat group](/docs/source/imgs/wechat_group.jpg): scan the QA code to join the technical discussion.

docs/source/imgs/wechat_group.jpg

28.5 KB

docs/source/installation_guide.md

Lines changed: 12 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -145,21 +145,22 @@ The following prerequisites and requirements must be satisfied for a successful
145145
<tbody>
146146
<tr align="center">
147147
<th>Version</th>
148-
<td class="tg-7zrl"><a href=https://github.com/tensorflow/tensorflow/tree/v2.12.0>2.12.0</a><br>
149-
<a href=https://github.com/tensorflow/tensorflow/tree/v2.11.0>2.11.0</a><br>
150-
<a href=https://github.com/tensorflow/tensorflow/tree/v2.10.1>2.10.1</a><br></td>
151-
<td class="tg-7zrl"><a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.12.0>2.12.0</a><br>
152-
<a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.11.0>2.11.0</a><br>
153-
<a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.10.0>2.10.0</a><br></td>
154-
<td class="tg-7zrl"><a href=https://github.com/intel/intel-extension-for-tensorflow/tree/v1.2.0>1.2.0</a><br>
148+
<td class="tg-7zrl"> <a href=https://github.com/tensorflow/tensorflow/tree/v2.13.0>2.13.0</a><br>
149+
<a href=https://github.com/tensorflow/tensorflow/tree/v2.12.1>2.12.1</a><br>
150+
<a href=https://github.com/tensorflow/tensorflow/tree/v2.11.1>2.11.1</a><br></td>
151+
<td class="tg-7zrl"> <a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.13.0>2.13.0</a><br>
152+
<a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.12.0>2.12.0</a><br>
153+
<a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.11.0>2.11.0</a><br></td>
154+
<td class="tg-7zrl"> <a href=https://github.com/intel/intel-extension-for-tensorflow/tree/v2.13.0.0>v2.13.0.0</a><br>
155+
<a href=https://github.com/intel/intel-extension-for-tensorflow/tree/v1.2.0>1.2.0</a><br>
155156
<a href=https://github.com/intel/intel-extension-for-tensorflow/tree/v1.1.0>1.1.0</a></td>
156-
<td class="tg-7zrl"><a href=https://download.pytorch.org/whl/torch_stable.html>2.0.1+cpu</a><br>
157-
<a href=https://download.pytorch.org/whl/torch_stable.html>1.13.1+cpu</a><br>
158-
<a href=https://download.pytorch.org/whl/torch_stable.html>1.12.1+cpu</a><br></td>
157+
<td class="tg-7zrl"><a href=https://github.com/pytorch/pytorch/tree/v2.0.1>2.0.1+cpu</a><br>
158+
<a href=https://github.com/pytorch/pytorch/tree/v1.13.1>1.13.1+cpu</a><br>
159+
<a href=https://github.com/pytorch/pytorch/tree/v1.12.1>1.12.1+cpu</a><br></td>
159160
<td class="tg-7zrl"><a href=https://github.com/intel/intel-extension-for-pytorch/tree/v2.0.100+cpu>2.0.1+cpu</a><br>
160161
<a href=https://github.com/intel/intel-extension-for-pytorch/tree/v1.13.100+cpu>1.13.1+cpu</a><br>
161162
<a href=https://github.com/intel/intel-extension-for-pytorch/tree/v1.12.100>1.12.1+cpu</a><br></td>
162-
<td class="tg-7zrl"><a href=https://github.com/microsoft/onnxruntime/tree/v1.15.0>1.15.0</a><br>
163+
<td class="tg-7zrl"><a href=https://github.com/microsoft/onnxruntime/tree/v1.15.1>1.15.1</a><br>
163164
<a href=https://github.com/microsoft/onnxruntime/tree/v1.14.1>1.14.1</a><br>
164165
<a href=https://github.com/microsoft/onnxruntime/tree/v1.13.1>1.13.1</a><br></td>
165166
<td class="tg-7zrl"><a href=https://github.com/apache/incubator-mxnet/tree/1.9.1>1.9.1</a><br></td>

docs/source/publication_list.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
1-
Full Publications/Events (74)
1+
Full Publications/Events (75)
22
==========
3-
## 2023 (20)
3+
## 2023 (21)
4+
* arXiv: [Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs](https://arxiv.org/abs/2309.05516) (Sep 2023)
45
* Blog on Medium: [Quantization Accuracy Loss Diagnosis with Neural Insights](https://medium.com/@NeuralCompressor/quantization-accuracy-loss-diagnosis-with-neural-insights-5d73f4ca2601) (Aug 2023)
56
* Blog on Medium: [Faster Stable Diffusion Inference with Intel Extension for Transformers](https://medium.com/intel-analytics-software/faster-stable-diffusion-inference-with-intel-extension-for-transformers-on-intel-platforms-7e0f563186b0) (July 2023)
67
* Post on Social Media: [ONNXCommunityMeetup2023: INT8 Quantization for Large Language Models with Intel Neural Compressor](https://www.youtube.com/watch?v=luYBWA1Q5pQ) (July 2023)

docs/source/quantization.md

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -469,7 +469,7 @@ Intel(R) Neural Compressor support multi-framework: PyTorch, Tensorflow, ONNX Ru
469469
<td align="left">cpu</td>
470470
</tr>
471471
<tr>
472-
<td rowspan="4" align="left">ONNX Runtime</td>
472+
<td rowspan="5" align="left">ONNX Runtime</td>
473473
<td align="left">CPUExecutionProvider</td>
474474
<td align="left">MLAS</td>
475475
<td align="left">"default"</td>
@@ -493,6 +493,12 @@ Intel(R) Neural Compressor support multi-framework: PyTorch, Tensorflow, ONNX Ru
 <td align="left">"onnxrt_dnnl_ep"</td>
 <td align="left">cpu</td>
 </tr>
+<tr>
+<td align="left">DmlExecutionProvider*</td>
+<td align="left">OneDNN</td>
+<td align="left">"onnxrt_dml_ep"</td>
+<td align="left">NA</td>
+</tr>
 <tr>
 <td rowspan="2" align="left">Tensorflow</td>
 <td align="left">Tensorflow</td>
@@ -518,6 +524,7 @@ Intel(R) Neural Compressor support multi-framework: PyTorch, Tensorflow, ONNX Ru
 <br>
 <br>
 
+> Note: DmlExecutionProvider support works as experimental, please expect exceptions.
 
 Examples of configure:
 ```python

third-party-programs.txt

Lines changed: 11 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -402,6 +402,9 @@ terms are listed below.
402402
socket.io
403403
Copyright (c) 2014-2018 Automattic <dev@cloudup.com>
404404

405+
sass
406+
Copyright (c) 2016, Google Inc.
407+
405408

406409
The MIT License (MIT)
407410

@@ -1840,13 +1843,16 @@ Code generated by the Protocol Buffer compiler is owned by the owner
 of the input file used when generating it. This code is not
 standalone and requires a support library to be linked with it. This
 support library is itself covered by the above license.
+
 -------------------------------------------------------------
-7. Hardware-Aware Transformer software
+8. Hardware-Aware Transformer software
+Copyright (c) 2020, Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai,
+Ligeng Zhu, Chuang Gan and Song Han
+All rights reserved.
 
------------- LICENSE For Hardware-Aware Transformer software ---------------
-Copyright (c) 2020, Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai,
-Ligeng Zhu, Chuang Gan and Song Han
-All rights reserved.
+css-select
+Copyright (c) Felix Böhm
+All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
@@ -1893,7 +1899,6 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
-
 ------------------------------------------------------------------
 
 The following third party programs have their own third party program files. These additional
