README.md: 21 additions & 16 deletions
@@ -1,6 +1,6 @@
# MAX78000 Model Training and Synthesis

-_March 31, 2021_
+_April 8, 2021_

The Maxim Integrated AI project is comprised of four repositories:
@@ -90,9 +90,9 @@ The following software is optional, and can be replaced with other similar softw
### Project Installation

-*The software in this project uses Python 3.8.6 or a later 3.8.x version.*
+*The software in this project uses Python 3.8.9 or a later 3.8.x version.*

-It is not necessary to install Python 3.8.6 system-wide, or to rely on the system-provided Python. To manage Python versions, use `pyenv` (https://github.com/pyenv/pyenv).
+It is not necessary to install Python 3.8.9 system-wide, or to rely on the system-provided Python. To manage Python versions, use `pyenv` (https://github.com/pyenv/pyenv).
If you use zsh as the shell (default on macOS), add these same commands to `~/.zprofile` or `~/.zshrc` in addition to adding them to the bash startup scripts.
-Next, close the Terminal, open a new Terminal and install Python 3.8.6.
+Next, close the Terminal, open a new Terminal and install Python 3.8.9.
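The startup-file additions and install step referred to above can be sketched as follows; this is an illustrative fragment assuming a standard `pyenv` installation (the exact init lines vary by pyenv version and are documented in the pyenv README):

```shell
# Add to ~/.bash_profile (and, when using zsh, also to ~/.zprofile or ~/.zshrc):
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"

# Then, in a new Terminal:
pyenv install 3.8.9
```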
@@ -646,9 +646,10 @@ Because of the fact that a processor has its own dedicated weight memory, this w
For each layer, a set of active processors must be specified. The number of input channels for the layer must be equal to or a multiple of the number of active processors, and the input data for that layer must be located in data memory instances accessible to the selected processors.

-It is possible to specify a relative offset into the data memory instance that applies to all processors. _Example:_ Assuming HWC data format, specifying the offset as 8192 bytes will cause processors 0-3 to read their input from the second half of data memory 0, processors 4-7 will read from the second half of data memory instance 1, etc.
+It is possible to specify a relative offset into the data memory instance that applies to all processors.
+_Example:_ Assuming HWC data format, specifying the offset as 16384 bytes (or 0x4000) will cause processors 0-3 to read their input from the second half of data memory 0, processors 4-7 will read from the second half of data memory instance 1, etc.

-For most simple networks with limited data sizes, it is easiest to ping-pong between the first and second halves of the data memories - specify the data offset as 0 for the first layer, 0x2000 for the second layer, 0 for the third layer, etc. This strategy avoids overlapping inputs and outputs when a given processor is used in two consecutive layers.
+For most simple networks with limited data sizes, it is easiest to ping-pong between the first and second halves of the data memories – specify the data offset as 0 for the first layer, 0x4000 for the second layer, 0 for the third layer, etc. This strategy avoids overlapping inputs and outputs when a given processor is used in two consecutive layers.
Even though it is supported by the accelerator, the Network Generator will not be able to check for inadvertent overwriting of unprocessed input data by newly generated output data when overlapping data or streaming data. Use the `--overlap-data` command line switch to disable these checks, and to allow overlapped data.
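The ping-pong scheme can be illustrated with a fragment of a network description in the Network Generator's YAML format. This is a sketch only: the processor masks and layer operations are hypothetical, not taken from a real network.

```yaml
# Illustrative fragment: alternate layer outputs between the two
# halves (0x0000 / 0x4000) of the 32 KiB data memory instances.
layers:
  - operation: conv2d          # layer 0: reads its input at offset 0
    processors: 0x000000000000000f
    out_offset: 0x4000         # writes its output to the second half
    activate: ReLU
  - operation: conv2d          # layer 1: reads at offset 0x4000
    processors: 0x000000000000000f
    out_offset: 0x0000         # writes back to the first half
    activate: ReLU
```

Because each layer writes to the half it did not read from, a processor that is active in two consecutive layers never overwrites input data it has not yet consumed.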
@@ -823,11 +824,15 @@ The following table describes the most important command line arguments for `tra
|`--exp-load-weights-from`| Load weights from file ||
|*Export*|||
-|`--summary onnx`| Export trained model to ONNX (default name: model.onnx) ||
+|`--summary onnx`| Export trained model to ONNX (default name: model.onnx) — *see description below*||
|`--summary onnx_simplified`| Export trained model to simplified ONNX file (default name: model.onnx) ||
|`--summary-filename`| Change the file name for the exported model |`--summary-filename mnist.onnx`|
|`--save-sample`| Save data[index] from the test set to a NumPy pickle for use as sample data |`--save-sample 10`|

+#### ONNX Model Export
+
+The ONNX model export (via `--summary onnx` or `--summary onnx_simplified`) is primarily intended for visualization of the model. ONNX does not support all of the operators that `ai8x.py` uses, and these operators are therefore removed from the export (see function `onnx_export_prep()` in `ai8x.py`). The ONNX file does contain the trained weights and *may* therefore be usable for inference under certain circumstances. However, it is important to note that the ONNX file **will not** be usable for training (for example, the ONNX `floor` operator has a gradient of zero, which is incompatible with quantization-aware training as implemented in `ai8x.py`).
### Observing GPU Resources
`nvidia-smi` can be used in a different terminal during training to examine the GPU resource usage of the training process. In the following example, the GPU is using 100% of its compute capabilities, but not all of the available memory. In this particular case, the batch size could be increased to use more memory.
@@ -1910,7 +1915,7 @@ Perform minimum accelerator initialization so it can be configured or restarted.
Configure the accelerator for the given network.

`int cnn_load_weights(void);`

-Load the accelerator weights.
+Load the accelerator weights. Note that `cnn_init()` must be called before loading weights after reset or wake from sleep.
`int cnn_verify_weights(void);`
Verify the accelerator weights (used for debug only).