# MAX78000 Model Training and Synthesis
_July 20, 2021_
The Maxim Integrated AI project comprises five repositories:
### Prerequisites
This software currently supports Ubuntu Linux 20.04 LTS. The server version is sufficient, see https://ubuntu.com/download/server. *Alternatively, Ubuntu Linux can also be used inside the Windows Subsystem for Linux (WSL2) by following
https://docs.nvidia.com/cuda/wsl-user-guide/. However, please note that WSL2 with CUDA is a pre-release and unexpected behavior may occur.*
When going beyond simple models, model training does not work well without CUDA hardware acceleration. The network loader (“izer”) does not require CUDA, and very simple models can also be trained on systems without CUDA.
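Before setting up the training environment, a quick pre-flight check can confirm whether the NVIDIA driver tools are visible at all. This is an illustrative sketch only (the `has_nvidia_smi` helper is hypothetical, not part of the project's tooling); the authoritative check is performed from within the training environment itself, e.g. via PyTorch:

```python
# Rough proxy check: is the NVIDIA system management interface on PATH?
# (Illustrative only; the training environment performs the real CUDA check.)
import shutil

def has_nvidia_smi() -> bool:
    """Return True if `nvidia-smi` can be found on PATH."""
    return shutil.which("nvidia-smi") is not None

print(has_nvidia_smi())
```

A `False` result on a machine that should have a GPU usually points to a missing or broken driver installation rather than a problem with the training software.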
#### Weight Memory
*Note: Depending on context, weights may also be referred to as “kernels” or “masks”. Additionally, weights are also part of a network’s “parameters”.*
For each of the four 16-processor quadrants, weight memory and processors can be visualized as follows. Assuming one input channel processed by processor 0, and 8 output channels, the 8 shaded kernels will be used:
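The kernel count in this example can be sketched with simple arithmetic (illustrative only; `kernels_used` is a hypothetical helper, not part of the network loader):

```python
# Illustrative sketch (not the izer's actual allocator logic): a Conv2d
# layer consumes one kernel per (input channel, output channel) pair in
# weight memory, spread across the processors that own the input channels.
def kernels_used(in_channels: int, out_channels: int) -> int:
    """Number of kernels a convolution consumes in weight memory."""
    return in_channels * out_channels

# The example from the text: one input channel handled by processor 0,
# and 8 output channels, so 8 kernels are used.
print(kernels_used(1, 8))  # -> 8
```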
For classification models, TensorBoard supports the optional `--param-hist` and `--embedding` command line arguments. `--embedding` randomly selects up to 100 data points from the last batch of each verification epoch. These can be viewed in the “projector” tab in TensorBoard.
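The sampling behavior described above can be sketched as follows (an assumption about the mechanics for illustration, not the actual training-script code):

```python
# Sketch: randomly select up to `limit` data points from the last batch,
# as done for the TensorBoard embedding projector.
import random

def select_embedding_points(batch, limit=100):
    """Randomly select up to `limit` items from a batch."""
    k = min(limit, len(batch))
    return random.sample(batch, k)

points = select_embedding_points(list(range(256)))
print(len(points))  # -> 100
```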
`--pr-curves` adds support for displaying precision-recall curves.
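A precision-recall curve sweeps a decision threshold over the predicted scores and records (precision, recall) pairs. A minimal pure-Python sketch of the idea (not TensorBoard's implementation):

```python
# Sketch of what a precision-recall curve represents: for each threshold,
# count true/false positives and false negatives among the predictions.
def pr_curve(scores, labels, thresholds):
    """Return a list of (precision, recall) pairs, one per threshold."""
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        points.append((precision, recall))
    return points

# Three positives and one negative; at threshold 0.5 the positive with
# score 0.2 is missed, giving precision 1.0 and recall 2/3.
print(pr_curve([0.9, 0.8, 0.4, 0.2], [1, 1, 0, 1], [0.5]))
```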
To start the TensorBoard server, use a second terminal window:
```shell
(ai8x-training) $ tensorboard --logdir='./logs'
TensorBoard 2.4.1 at http://127.0.0.1:6006/ (Press CTRL+C to quit)
```
On a shared system, add the `--port 0` command line option.
##### Examples
TensorBoard produces graphs and displays metrics that may help optimize the training process, and can compare the performance of multiple training sessions and their settings. Additionally, TensorBoard can show a graphical representation of the model and its parameters, and help discover labeling errors. For more information, please see the [TensorBoard web site](https://www.tensorflow.org/tensorboard/).
To evaluate the quantized network for MAX78000 (**run from the training project**):
```shell
(ai8x-training) $ scripts/evaluate_mnist.sh
...
--- test ---------------------
10000 samples (256 per mini-batch)
Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)

Test: [   10/   40]    Loss 0.007288    Top1 99.531250    Top5 100.000000
Test: [   20/   40]    Loss 0.010161    Top1 99.414062    Top5 100.000000
Test: [   30/   40]    Loss 0.007681    Top1 99.492188    Top5 100.000000
Test: [   40/   40]    Loss 0.009589    Top1 99.440000    Top5 100.000000
==> Top1: 99.440    Top5: 100.000    Loss: 0.010

==> Confusion:
   [[ 978    0    1    0    0    0    0    0    1    0]
    [   0 1132    1    1    0    0    1    0    0    0]
    [   0    0 1028    0    0    0    0    4    0    0]
    [   0    1    0 1007    0    1    0    1    0    0]
    [   0    0    1    0  977    0    1    0    1    2]
    [   1    0    0    3    0  884    3    0    0    1]
    [   3    0    1    0    1    3  949    0    1    0]
    [   0    2    1    0    0    0    0 1024    0    1]
    [   0    0    2    1    1    1    0    0  968    1]
    [   0    0    0    0    7    1    0    4    0  997]]

Log file for this run: 2021.07.20-123302/2021.07.20-123302.log
```
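The reported Top1 figure can be recovered directly from the confusion matrix: correct predictions sit on the diagonal, so Top1 is the trace divided by the total number of samples. A quick check with the matrix from the log above:

```python
# Top1 accuracy from the confusion matrix in the evaluation log:
# correct predictions are on the diagonal, so Top1 = trace / total.
confusion = [
    [978, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 1132, 1, 1, 0, 0, 1, 0, 0, 0],
    [0, 0, 1028, 0, 0, 0, 0, 4, 0, 0],
    [0, 1, 0, 1007, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 977, 0, 1, 0, 1, 2],
    [1, 0, 0, 3, 0, 884, 3, 0, 0, 1],
    [3, 0, 1, 0, 1, 3, 949, 0, 1, 0],
    [0, 2, 1, 0, 0, 0, 0, 1024, 0, 1],
    [0, 0, 2, 1, 1, 1, 0, 0, 968, 1],
    [0, 0, 0, 0, 7, 1, 0, 4, 0, 997],
]
correct = sum(confusion[i][i] for i in range(len(confusion)))
total = sum(sum(row) for row in confusion)
print(f"Top1: {100 * correct / total:.3f}")  # -> Top1: 99.440
```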
*Note that the “Loss” output is not always directly comparable to the unquantized network, depending on the loss function itself.*
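To see why, consider a cross-entropy-style loss: small shifts in the output values caused by quantization can change the loss noticeably even when the Top-1 prediction is unchanged. A minimal illustrative sketch (the crude rounding below stands in for quantization and is an assumption for illustration, not the project's quantization scheme):

```python
# Illustration: quantizing output logits changes a cross-entropy-style
# loss while leaving the argmax (Top-1 prediction) the same.
import math

def softmax_nll(logits, target):
    """Negative log-likelihood of `target` under a softmax over `logits`."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    return -math.log(exps[target] / sum(exps))

float_logits = [4.2, 0.3, -1.1]
# Crude stand-in for quantization: round to quarter steps.
quant_logits = [round(x * 4) / 4 for x in float_logits]

print(softmax_nll(float_logits, 0), softmax_nll(quant_logits, 0))
# Both predict class 0, but the two loss values differ.
```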
#### Alternative Quantization Approaches
If quantization-aware training is not desired, post-training quantization can be improved using more sophisticated methods. For example, see