Commit a0b0707

Docs enhancement (#1032)
Signed-off-by: chensuyue <suyue.chen@intel.com>
1 parent 905eda3 commit a0b0707

File tree

38 files changed: +2387 −117 lines changed


.azure-pipelines/code-scan-neural-insights.yaml

Lines changed: 0 additions & 1 deletion
@@ -11,7 +11,6 @@ pr:
     - neural_insights
     - setup.py
     - .azure-pipelines/code-scan-neural-insights.yml
-    - .azure-pipelines/scripts/codeScan
 
 pool:
   vmImage: "ubuntu-latest"

.azure-pipelines/code-scan-neural-solution.yaml

Lines changed: 0 additions & 1 deletion
@@ -11,7 +11,6 @@ pr:
     - neural_solution
     - setup.py
     - .azure-pipelines/code-scan-neural-solution.yml
-    - .azure-pipelines/scripts/codeScan
 
 pool:
   vmImage: "ubuntu-latest"

.azure-pipelines/scripts/codeScan/pyspelling/inc_dict.txt

Lines changed: 1 addition & 0 deletions
@@ -2490,6 +2490,7 @@ Thalaiyasingam
 Torr
 QOperator
 MixedPrecisionConfig
+mixedprecision
 contrib
 ONNXConfig
 Arial

docker/README.md

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
-## build `Neural Compressor(INC)` Containers:
+## Build Intel Neural Compressor Containers:
 
-### To build the the `Pip` based deployment container:
+### To build the `Pip` based deployment container:
 Please note that `INC_VER` must be set to a valid version published here:
 https://pypi.org/project/neural-compressor/#history
 
@@ -12,7 +12,7 @@ $ IMAGE_TAG=${INC_VER}
 $ docker build --build-arg PYTHON=${PYTHON} --build-arg INC_VER=${INC_VER} -f Dockerfile -t ${IMAGE_NAME}:${IMAGE_TAG} .
 ```
 
-### To build the the `Pip` based development container:
+### To build the `Pip` based development container:
 Please note that `INC_BRANCH` must be a set to a valid branch name otherwise, Docker build fails.
 If `${INC_BRANCH}-devel` does not meet Docker tagging requirements described here:
 https://docs.docker.com/engine/reference/commandline/tag/

docs/source/FX.md

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@ FX
 ====
 1. [Introduction](#introduction)
 2. [FX Mode Support Matrix in Neural Compressor](#fx-mode-support-matrix-in-neural-compressor)
-3. [Get Start](#get-start)
+3. [Get Started](#get-started)
 
    3.1. [Post Training Static Quantization](#post-training-static-quantization)
 
@@ -34,7 +34,7 @@ For detailed description, please refer to [PyTorch FX](https://pytorch.org/docs/
 |Quantization-Aware Training |&#10004; |
 
 
-## Get Start
+## Get Started
 
 **Note:** "backend" field indicates the backend used by the user in configure. And the "default" value means it will quantization model with fx backend for PyTorch model.
 
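The note above describes the config's `backend` field; the sketch below illustrates it with a toy model. It is not part of this commit, and it assumes the Intel Neural Compressor 2.x `PostTrainingQuantConfig`/`fit` API together with the built-in dummy dataset.

```python
# Minimal sketch (assumption, not part of this diff): selecting the FX path
# by leaving the config's "backend" field at "default" for a PyTorch model.
import torch
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.data import DataLoader, Datasets
from neural_compressor.quantization import fit

# Toy PyTorch model used only for illustration.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())

# Dummy calibration data shaped like a small image batch.
dataset = Datasets("pytorch")["dummy"](shape=(1, 3, 224, 224))
calib_dataloader = DataLoader(framework="pytorch", dataset=dataset)

# backend="default" quantizes the PyTorch model through the FX backend.
conf = PostTrainingQuantConfig(backend="default")
q_model = fit(model=model, conf=conf, calib_dataloader=calib_dataloader)
```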

docs/source/adaptor.md

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ Adaptor
 1. [Introduction](#introduction)
 2. [Adaptor Support Matrix](#adaptor-support-matrix)
 3. [Working Flow](#working-flow)
-4. [Get Start with Adaptor API](#get-start-with-adaptor-api)
+4. [Get Started with Adaptor API](#get-start-with-adaptor-api)
 
    4.1 [Query API](#query-api)
 
@@ -33,7 +33,7 @@ tuning strategy and vanilla framework quantization APIs.
 ## Working Flow
 Adaptor only provide framework API for tuning strategy. So we can find complete working flow in [tuning strategy working flow](./tuning_strategies.md).
 
-## Get Start with Adaptor API
+## Get Started with Adaptor API
 
 Neural Compressor supports a new adaptor extension by
 implementing a subclass `Adaptor` class in the neural_compressor.adaptor package
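For orientation, a schematic sketch of such an extension follows. It is not part of this commit: the registration decorator (`adaptor_registry`) and the method names shown (`query_fw_capability`, `quantize`, `evaluate`) are assumptions about the `neural_compressor.adaptor` package and should be verified against the actual `Adaptor` base class.

```python
# Schematic sketch only (assumed API, not part of this diff): a new framework
# adaptor subclasses the Adaptor base class and is registered so the tuning
# strategy can drive it through a common interface.
from neural_compressor.adaptor.adaptor import Adaptor, adaptor_registry


@adaptor_registry  # assumed registration decorator
class MyFrameworkAdaptor(Adaptor):
    """Adaptor for a hypothetical framework; method signatures are assumptions."""

    def __init__(self, framework_specific_info):
        super().__init__(framework_specific_info)

    def query_fw_capability(self, model):
        # Report which ops/dtypes the framework can quantize.
        return {"optypewise": {}, "opwise": {}}

    def quantize(self, tune_cfg, model, dataloader, q_func=None):
        # Apply the tuning config to the model with framework-native APIs.
        return model

    def evaluate(self, model, dataloader, metrics=None):
        # Run inference and return the metric value used by the tuning loop.
        return 0.0
```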

docs/source/dataloader.md

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ DataLoader
 
 2. [Supported Framework Dataloader Matrix](#supported-framework-dataloader-matrix)
 
-3. [Get Start with Dataloader](#get-start-with-dataloader)
+3. [Get Started with Dataloader](#get-start-with-dataloader)
 
    3.1 [Use Intel® Neural Compressor DataLoader API](#use-intel®-neural-compressor-dataloader-api)
 
@@ -37,7 +37,7 @@ Of cause, users can also use frameworks own dataloader in Neural Compressor.
 | PyTorch | &#10004; |
 | ONNX Runtime | &#10004; |
 
-## Get Start with DataLoader
+## Get Started with DataLoader
 
 ### Use Intel® Neural Compressor DataLoader API
 
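A minimal sketch of the built-in DataLoader API referenced above follows; it is not part of this commit and assumes the INC 2.x `neural_compressor.data` interfaces (`Datasets`, `DataLoader`), with a dummy dataset used purely for illustration.

```python
# Minimal sketch (assumption, not part of this diff): build a framework-aware
# dataloader from a built-in dummy dataset and iterate over it.
from neural_compressor.data import DataLoader, Datasets

# Dummy dataset yielding random tensors of the given shape plus a label.
dataset = Datasets("tensorflow")["dummy"](shape=(1, 224, 224, 3))

# The framework argument selects the matching built-in dataloader implementation.
dataloader = DataLoader(framework="tensorflow", dataset=dataset, batch_size=1)

for inputs, label in dataloader:
    print(inputs.shape, label)
    break
```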

docs/source/get_started.md

Lines changed: 0 additions & 12 deletions
@@ -2,10 +2,6 @@
 
 1. [Quick Samples](#quick-samples)
 
-   1.1 [Quantization with Python API](#quantization-with-python-api)
-
-   1.2 [Quantization with JupyterLab Extension](#quantization-with-jupyterlab-extension)
-
 2. [Validated Models](#validated-models)
 
 ## Quick Samples
@@ -35,14 +31,6 @@ q_model = fit(
     eval_dataloader=dataloader)
 ```
 
-### Quantization with [JupyterLab Extension](/neural_coder/extensions/neural_compressor_ext_lab/README.md)
-
-Search for ```jupyter-lab-neural-compressor``` in the Extension Manager in JupyterLab and install with one click:
-
-<a target="_blank" href="/neural_coder/extensions/screenshots/extmanager.png">
-  <img src="/neural_coder/extensions/screenshots/extmanager.png" alt="Extension" width="35%" height="35%">
-</a>
-
 ## Validated Models
 Intel® Neural Compressor validated the quantization for 10K+ models from popular model hubs (e.g., HuggingFace Transformers, Torchvision, TensorFlow Model Hub, ONNX Model Zoo).
 Over 30 pruning, knowledge distillation and model export samples are also available.
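With the JupyterLab extension sample removed, the Python-API quick sample is the only one left on the page. A self-contained variant is sketched below; it is not part of this commit, the imports reflect the assumed INC 2.x API, and the frozen-graph path is a hypothetical placeholder.

```python
# Minimal sketch (assumption, not part of this diff) of the remaining
# "Quantization with Python API" quick sample.
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.data import DataLoader, Datasets
from neural_compressor.quantization import fit

# Dummy calibration/evaluation data for illustration.
dataset = Datasets("tensorflow")["dummy"](shape=(1, 224, 224, 3))
dataloader = DataLoader(framework="tensorflow", dataset=dataset)

q_model = fit(
    model="./path/to/frozen_model.pb",   # hypothetical placeholder path
    conf=PostTrainingQuantConfig(),
    calib_dataloader=dataloader,
    eval_dataloader=dataloader)
```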

docs/source/installation_guide.md

Lines changed: 15 additions & 15 deletions
@@ -30,7 +30,7 @@ You can install Neural Compressor using one of three options: Install single com
 
 The following prerequisites and requirements must be satisfied for a successful installation:
 
-- Python version: 3.7 or 3.8 or 3.9 or 3.10
+- Python version: 3.7 or 3.8 or 3.9 or 3.10 or 3.11
 
 > Notes:
 > - If you get some build issues, please check [frequently asked questions](faq.md) at first.
@@ -82,7 +82,7 @@ The AI Kit is distributed through many common channels, including from Intel's w
 
 The following prerequisites and requirements must be satisfied for a successful installation:
 
-- Python version: 3.7 or 3.8 or 3.9 or 3.10
+- Python version: 3.7 or 3.8 or 3.9 or 3.10 or 3.11
 
 ### Install from Binary
 
@@ -127,7 +127,7 @@ The following prerequisites and requirements must be satisfied for a successful
 ### Validated Software Environment
 
 * OS version: CentOS 8.4, Ubuntu 22.04
-* Python version: 3.7, 3.8, 3.9, 3.10
+* Python version: 3.7, 3.8, 3.9, 3.10, 3.11
 
 <table class="docutils">
 <thead>
@@ -148,20 +148,20 @@ The following prerequisites and requirements must be satisfied for a successful
 <td class="tg-7zrl"><a href=https://github.com/tensorflow/tensorflow/tree/v2.12.0>2.12.0</a><br>
 <a href=https://github.com/tensorflow/tensorflow/tree/v2.11.0>2.11.0</a><br>
 <a href=https://github.com/tensorflow/tensorflow/tree/v2.10.1>2.10.1</a><br></td>
-<td class="tg-7zrl"><a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.11.0>2.11.0</a><br>
-<a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.10.0>2.10.0</a><br>
-<a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.9.1>2.9.1</a><br></td>
-<td class="tg-7zrl"><a href=https://github.com/intel/intel-extension-for-tensorflow/tree/v1.1.0>1.1.0</a><br>
-<a href=https://github.com/intel/intel-extension-for-tensorflow/tree/v1.0.0>1.0.0</a></td>
-<td class="tg-7zrl"><a href=https://download.pytorch.org/whl/torch_stable.html>2.0.0+cpu</a><br>
-<a href=https://download.pytorch.org/whl/torch_stable.html>1.13.0+cpu</a><br>
+<td class="tg-7zrl"><a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.12.0>2.12.0</a><br>
+<a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.11.0>2.11.0</a><br>
+<a href=https://github.com/Intel-tensorflow/tensorflow/tree/v2.10.0>2.10.0</a><br></td>
+<td class="tg-7zrl"><a href=https://github.com/intel/intel-extension-for-tensorflow/tree/v1.2.0>1.2.0</a><br>
+<a href=https://github.com/intel/intel-extension-for-tensorflow/tree/v1.1.0>1.1.0</a></td>
+<td class="tg-7zrl"><a href=https://download.pytorch.org/whl/torch_stable.html>2.0.1+cpu</a><br>
+<a href=https://download.pytorch.org/whl/torch_stable.html>1.13.1+cpu</a><br>
 <a href=https://download.pytorch.org/whl/torch_stable.html>1.12.1+cpu</a><br></td>
-<td class="tg-7zrl"><a href=https://github.com/intel/intel-extension-for-pytorch/tree/v2.0.0+cpu>2.0.0+cpu</a><br>
-<a href=https://github.com/intel/intel-extension-for-pytorch/tree/v1.13.0+cpu>1.13.0+cpu</a><br>
+<td class="tg-7zrl"><a href=https://github.com/intel/intel-extension-for-pytorch/tree/v2.0.100+cpu>2.0.1+cpu</a><br>
+<a href=https://github.com/intel/intel-extension-for-pytorch/tree/v1.13.100+cpu>1.13.1+cpu</a><br>
 <a href=https://github.com/intel/intel-extension-for-pytorch/tree/v1.12.100>1.12.1+cpu</a><br></td>
-<td class="tg-7zrl"><a href=https://github.com/microsoft/onnxruntime/tree/v1.14.1>1.14.1</a><br>
-<a href=https://github.com/microsoft/onnxruntime/tree/v1.13.1>1.13.1</a><br>
-<a href=https://github.com/microsoft/onnxruntime/tree/v1.12.1>1.12.1</a><br></td>
+<td class="tg-7zrl"><a href=https://github.com/microsoft/onnxruntime/tree/v1.15.0>1.15.0</a><br>
+<a href=https://github.com/microsoft/onnxruntime/tree/v1.14.1>1.14.1</a><br>
+<a href=https://github.com/microsoft/onnxruntime/tree/v1.13.1>1.13.1</a><br></td>
 <td class="tg-7zrl"><a href=https://github.com/apache/incubator-mxnet/tree/1.9.1>1.9.1</a><br></td>
 </tr>
 </tbody>

docs/source/metric.md

Lines changed: 3 additions & 3 deletions
@@ -11,7 +11,7 @@ Metrics
 
    2.4. [ONNXRT](#onnxrt)
 
-3. [Get Start with Metric](#get-start-with-metric)
+3. [Get Started with Metric](#get-start-with-metric)
 
    3.1. [Use Intel® Neural Compressor Metric API](#use-intel®-neural-compressor-metric-api)
 
@@ -88,11 +88,11 @@ Neural Compressor supports some built-in metrics that are popularly used in indu
 
 
 
-## Get Start with Metric
+## Get Started with Metric
 
 ### Use Intel® Neural Compressor Metric API
 
-Users can specify an Neural Compressor built-in metric such as shown below:
+Users can specify a Neural Compressor built-in metric such as shown below:
 
 ```python
 from neural_compressor import Metric
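A minimal sketch of the built-in metric usage referenced above follows; it is not part of this commit. The `Metric(name="topk", k=1)` constructor and the `eval_metric` argument of `fit` are assumptions based on the INC 2.x API, and the model path is a hypothetical placeholder.

```python
# Minimal sketch (assumption, not part of this diff): pass a built-in metric
# to the quantization entry point instead of writing an evaluation function.
from neural_compressor import Metric
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.data import DataLoader, Datasets
from neural_compressor.quantization import fit

# Built-in top-1 accuracy metric ("topk" with k=1).
top1 = Metric(name="topk", k=1)

dataset = Datasets("tensorflow")["dummy"](shape=(1, 224, 224, 3))
dataloader = DataLoader(framework="tensorflow", dataset=dataset)

q_model = fit(
    model="./path/to/frozen_model.pb",   # hypothetical placeholder path
    conf=PostTrainingQuantConfig(),
    calib_dataloader=dataloader,
    eval_dataloader=dataloader,
    eval_metric=top1)
```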

docs/source/mixed_precision.md

Lines changed: 117 additions & 45 deletions
@@ -12,30 +12,126 @@ The recent growth of Deep Learning has driven the development of more complex mo
 
 The recently launched 3rd Gen Intel® Xeon® Scalable processor (codenamed Cooper Lake), featuring Intel® Deep Learning Boost, is the first general-purpose x86 CPU to support the bfloat16 format. Specifically, three new bfloat16 instructions are added as a part of the AVX512_BF16 extension within Intel Deep Learning Boost: VCVTNE2PS2BF16, VCVTNEPS2BF16, and VDPBF16PS. The first two instructions allow converting to and from bfloat16 data type, while the last one performs a dot product of bfloat16 pairs. Further details can be found in the [hardware numerics document](https://software.intel.com/content/www/us/en/develop/download/bfloat16-hardware-numerics-definition.html) published by Intel.
 
-<a target="_blank" href="./imgs/data_format.png" text-align:center>
-  <center>
-    <img src="./imgs/data_format.png" alt="Architecture" height=200>
-  </center>
-</a>
+<p align="center" width="100%">
+    <img src="./imgs/data_format.png" alt="Architecture" height=230>
+</p>
 
 ## Mixed Precision Support Matrix
-
-|Framework |BF16 |FP16 |
-|--------------|:-----------:|:-----------:|
-|TensorFlow |&#10004; |:x: |
-|PyTorch |&#10004; |:x: |
-|ONNX Runtime |&#10004; |&#10004; |
-|MXNet |&#10004; |:x: |
-
-> **During quantization, BF16 conversion is default enabled, FP16 can be executed if 'device' of config is 'gpu'. Please refer to this [document](./quantization_mixed_precision.md) for its workflow.**
+<table class="center">
+    <thead>
+        <tr>
+            <th>Framework</th>
+            <th>Backend</th>
+            <th>Backend Library</th>
+            <th>Backend Value</th>
+            <th>Support Device(cpu as default)</th>
+            <th>Support BF16</th>
+            <th>Support FP16</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td rowspan="2" align="left">PyTorch</td>
+            <td align="left">FX</td>
+            <td align="left">FBGEMM</td>
+            <td align="left">"default"</td>
+            <td align="left">cpu</td>
+            <td align="left">&#10004;</td>
+            <td align="left">:x:</td>
+        </tr>
+        <tr>
+            <td align="left">IPEX</td>
+            <td align="left">OneDNN</td>
+            <td align="left">"ipex"</td>
+            <td align="left">cpu</td>
+            <td align="left">&#10004;</td>
+            <td align="left">:x:</td>
+        </tr>
+        <tr>
+            <td rowspan="3" align="left">ONNX Runtime</td>
+            <td align="left">CPUExecutionProvider</td>
+            <td align="left">MLAS</td>
+            <td align="left">"default"</td>
+            <td align="left">cpu</td>
+            <td align="left">:x:</td>
+            <td align="left">:x:</td>
+        </tr>
+        <tr>
+            <td align="left">TensorrtExecutionProvider</td>
+            <td align="left">TensorRT</td>
+            <td align="left">"onnxrt_trt_ep"</td>
+            <td align="left">gpu</td>
+            <td align="left">:x:</td>
+            <td align="left">:x:</td>
+        </tr>
+        <tr>
+            <td align="left">CUDAExecutionProvider</td>
+            <td align="left">CUDA</td>
+            <td align="left">"onnxrt_cuda_ep"</td>
+            <td align="left">gpu</td>
+            <td align="left">&#10004;</td>
+            <td align="left">&#10004;</td>
+        </tr>
+        <tr>
+            <td rowspan="2" align="left">Tensorflow</td>
+            <td align="left">Tensorflow</td>
+            <td align="left">OneDNN</td>
+            <td align="left">"default"</td>
+            <td align="left">cpu</td>
+            <td align="left">&#10004;</td>
+            <td align="left">:x:</td>
+        </tr>
+        <tr>
+            <td align="left">ITEX</td>
+            <td align="left">OneDNN</td>
+            <td align="left">"itex"</td>
+            <td align="left">cpu | gpu</td>
+            <td align="left">&#10004;</td>
+            <td align="left">:x:</td>
+        </tr>
+        <tr>
+            <td align="left">MXNet</td>
+            <td align="left">OneDNN</td>
+            <td align="left">OneDNN</td>
+            <td align="left">"default"</td>
+            <td align="left">cpu</td>
+            <td align="left">&#10004;</td>
+            <td align="left">:x:</td>
+        </tr>
+    </tbody>
+</table>
+
+
+### Hardware and Software requests for **BF16**
+- TensorFlow
+  1. Hardware: CPU supports `avx512_bf16` instruction set.
+  2. Software: intel-tensorflow >= [2.3.0](https://pypi.org/project/intel-tensorflow/2.3.0/).
+- PyTorch
+  1. Hardware: CPU supports `avx512_bf16` instruction set.
+  2. Software: torch >= [1.11.0](https://download.pytorch.org/whl/torch_stable.html).
+- ONNX Runtime
+  1. Hardware: GPU, set 'device' of config to 'gpu' and 'backend' to 'onnxrt_cuda_ep'.
+  2. Software: onnxruntime-gpu.
+
+### Hardware and Software requests for **FP16**
+- ONNX Runtime
+  1. Hardware: GPU, set 'device' of config to 'gpu' and 'backend' to 'onnxrt_cuda_ep'.
+  2. Software: onnxruntime-gpu.
+
+### During quantization mixed precision
+During quantization, if the hardware support BF16, the conversion is default enabled. So you may get an INT8/BF16/FP32 mixed precision model on those hardware. FP16 can be executed if 'device' of config is 'gpu'.
+Please refer to this [document](https://github.com/intel/neural-compressor/blob/master/docs/source/quantization_mixed_precision.md) for its workflow.
+
+### Accuracy-driven mixed precision
+BF16/FP16 conversion may lead to accuracy drop. Intel® Neural Compressor provides an accuracy-driven tuning function to reduce accuracy loss,
+which will fallback converted ops to FP32 automatically to get better accuracy. To enable this function, users only to provide
+`evaluation function` or (`evaluation dataloader` plus `evaluation metric`) for [mixed precision inputs](https://github.com/intel/neural-compressor/blob/master/neural_compressor/mix_precision.py).
+To be noticed, IPEX backend doesn't support accuracy-driven mixed precision.
 
 ## Get Started with Mixed Precision API
 
 To get a bf16/fp16 model, users can use the Mixed Precision API as follows.
 
-
-Supported precisions for mix precision include bf16 and fp16. If users want to get a pure fp16 or bf16 model, they should add another precision into excluded_precisions.
-
 - BF16:
 
 ```python
@@ -60,34 +156,10 @@ conf = MixedPrecisionConfig(
 converted_model = mix_precision.fit(model, conf=conf)
 converted_model.save('./path/to/save/')
 ```
-
-> **BF16/FP16 conversion may lead to accuracy drop. Intel® Neural Compressor provides an accuracy-aware tuning function to reduce accuracy loss, which will fallback converted ops to FP32 automatically to get better accuracy. To enable this function, users only need to provide an evaluation function (or dataloader + metric).**
-
 
 ## Examples
 
-There are some pre-requirements to run mixed precision examples for each framework. If the hardware requirements cannot be met, the program would exit consequently.
-
-- **BF16:**
-
-#### TensorFlow
-
-1. Hardware: CPU supports `avx512_bf16` instruction set.
-2. Software: intel-tensorflow >= [2.3.0](https://pypi.org/project/intel-tensorflow/2.3.0/).
-
-#### PyTorch
-
-1. Hardware: CPU supports `avx512_bf16` instruction set.
-2. Software: torch >= [1.11.0](https://download.pytorch.org/whl/torch_stable.html).
-
-#### ONNX Runtime
-
-1. Hardware: GPU, set 'device' of config to 'gpu' and 'backend' to 'onnxrt_cuda_ep'.
-2. Software: onnxruntime-gpu.
-
-- **FP16:**
-
-#### ONNX Runtime
-
-1. Hardware: GPU, set 'device' of config to 'gpu' and 'backend' to 'onnxrt_cuda_ep'.
-2. Software: onnxruntime-gpu.
+- Quick started with [helloworld example](/examples/helloworld/tf_example3)
+- PyTorch [ResNet18](/examples/pytorch/image_recognition/torchvision_models/mixed_precision/resnet18)
+- IPEX [DistilBERT base](/examples/pytorch/nlp/huggingface_models/question-answering/mixed_precision/ipex)
+- Tensorflow [ResNet50](/examples/tensorflow/image_recognition/tensorflow_models/resnet50_v1/mixed_precision)
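To tie the new support matrix and requirement lists to the Get Started section, a consolidated sketch of the Mixed Precision API follows. It is not part of this commit; the default BF16 behavior and the FP16 argument names (`device`, `backend`, `precisions`) are assumptions drawn from the doc text above.

```python
# Minimal sketch (assumptions noted above, not part of this diff):
# BF16 conversion on CPU, with the FP16/GPU variant shown as a commented config.
import torch
from neural_compressor import mix_precision
from neural_compressor.config import MixedPrecisionConfig

# Toy PyTorch model used only for illustration.
model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU())

# BF16 on CPU: the default config enables bf16 conversion; per the requirements
# above, the host CPU must support the avx512_bf16 instruction set.
conf = MixedPrecisionConfig()
converted_model = mix_precision.fit(model, conf=conf)
converted_model.save("./path/to/save/")

# FP16 targets an ONNX model on GPU; the argument names below are assumptions
# based on the requirement lists above (device 'gpu', backend 'onnxrt_cuda_ep').
# conf = MixedPrecisionConfig(device="gpu", backend="onnxrt_cuda_ep", precisions="fp16")
```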

docs/source/releases_info.md

Lines changed: 2 additions & 0 deletions
@@ -28,3 +28,5 @@ The diagnosis function does not work with ONNX Runtime 1.13.1 for QDQ format qua
 [Neural Compressor v1.7](https://github.com/intel/neural-compressor/tree/v1.7) renames the pip/conda package name from lpot to neural_compressor. To run old examples on latest software, please replace package name for compatibility with `sed -i "s|lpot|neural_compressor|g" your_script.py` .
 
 [Neural Compressor v2.0](https://github.com/intel/neural-compressor/tree/v2.0) renames the `DATASETS` class as `Datasets`, please notice use cases like `from neural_compressor.data import Datasets`. Details please check the [PR](https://github.com/intel/neural-compressor/pull/244/files).
+
+[Neural Compressor v2.2](https://github.com/intel/neural-compressor/tree/v2.2) from this release, binary `neural-compressor-full` is deprecated, we deliver 3 binaries named `neural-compressor`, `neural-solution` and `neural-insights`.
