Commit 48f92d9

Author: Robert Muchsel
README: Updating pyenv; overview of functions in main.c (#130)
1 parent 9680f23

File tree: 2 files changed, +116 −56 lines


README.md

Lines changed: 116 additions & 56 deletions
@@ -1,6 +1,6 @@
  # MAX78000 Model Training and Synthesis

- _April 22, 2021_
+ _May 4, 2021_

  The Maxim Integrated AI project is comprised of four repositories:

@@ -301,7 +301,7 @@ By default, the main branch is checked out. This branch has been tested more rig
  Support for TensorFlow / Keras is currently in the `develop-tf` branch.

- ##### Updates
+ #### Updating to the Latest Version

  After additional testing, `develop` is merged into the main branch at regular intervals.

@@ -311,7 +311,7 @@ After a small delay of typically a day, a “Release” tag is created on GitHub
  In addition to code updated in the repository itself, submodules and Python libraries may have been updated as well.

- Major upgrades (such as updating from PyTorch 1.5 to PyTorch 1.7) are best done by removing all installed wheels. This can be achieved most easily by creating a new folder and starting from scratch at [Upstream Code](#Upstream Code). Starting from scratch is also recommended when upgrading the Python version.
+ Major upgrades (such as updating from PyTorch 1.7 to PyTorch 1.8) are best done by removing all installed wheels. This can be achieved most easily by creating a new folder and starting from scratch at [Upstream Code](#Upstream Code). Starting from scratch is also recommended when upgrading the Python version.

  For minor updates, pull the latest code and install the updated wheels:

@@ -322,7 +322,44 @@ For minor updates, pull the latest code and install the updated wheels:
  (ai8x-training) $ pip3 install -U -r requirements.txt # or requirements-cu11.txt with CUDA 11.x
  ```

- Updating Python frequently requires updating `pyenv` first. Should `pyenv install x.y.z`
+ ##### Python Version Updates
+
+ Updating Python may require updating `pyenv` first. Should `pyenv install 3.8.9` fail,
+
+ ```shell
+ $ pyenv install 3.8.9
+ python-build: definition not found: 3.8.9
+ ```
+
+ then `pyenv` must be updated. On macOS, use:
+
+ ```shell
+ $ brew update && brew upgrade pyenv
+ ...
+ $
+ ```
+
+ On Linux, use:
+
+ ```shell
+ $ cd $(pyenv root) && git pull && cd -
+ remote: Enumerating objects: 19021, done.
+ ...
+ $
+ ```
+
+ The update should now succeed:
+
+ ```shell
+ $ pyenv install 3.8.9
+ Downloading Python-3.8.9.tar.xz...
+ -> https://www.python.org/ftp/python/3.8.9/Python-3.8.9.tar.xz
+ Installing Python-3.8.9...
+ ...
+ $ pyenv local 3.8.9
+ ```
+

  #### Synthesis Project

@@ -964,7 +1001,7 @@ There are two supported cases for `view()` or `reshape()`.
  `x = x.view(x.size(0), x.size(1), -1) # 2D to 1D`
  `x = x.view(x.shape[0], x.shape[1], 16, -1) # 1D to 2D`
  *Note: `x.size()` and `x.shape[]` are equivalent.*
- When reshaping data, `in_dim:` must be specified in the model description file.
+ When reshaping data, `in_dim:` must be specified in the model description file.
  2. Conversion from 1D and 2D to Fully Connected (“flattening”): The batch dimension (first dimension) must stay the same, and the other dimensions are combined (i.e., M = C×H×W or M = C×L).
  Example:
  `x = x.view(x.size(0), -1) # Flatten`
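The two reshape cases above can be pictured with a short sketch (illustrative sizes only; NumPy's `reshape` stands in for PyTorch's `view()` here, as both follow the same shape rules):

```python
import numpy as np

# Hypothetical sizes for illustration: batch N=8, C=12, H=W=4.
x = np.zeros((8, 12, 4, 4))                           # (N, C, H, W)

one_d = x.reshape(x.shape[0], x.shape[1], -1)         # 2D to 1D: (8, 12, 16)
two_d = one_d.reshape(x.shape[0], x.shape[1], 4, -1)  # 1D back to 2D: (8, 12, 4, 4)
flat = x.reshape(x.shape[0], -1)                      # flatten: M = C*H*W = 192
```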
@@ -989,6 +1026,8 @@ By default, weights are quantized to 8-bits after 10 epochs as specified in `qat
 
  Quantization-aware training can be <u>disabled</u> by specifying `--qat-policy None`.

+ For more information, please also see [Quantization](#Quantization).
+
  #### Batch Normalization

  Batch normalization after `Conv1d` and `Conv2d` layers is supported using “fusing”. The fusing operation merges the effect of batch normalization layers into the parameters of the preceding convolutional layer. For detailed information about batch normalization fusing/folding, see Section 3.2 of the following paper: https://arxiv.org/pdf/1712.05877.pdf.
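The folding operation itself is compact; here is a minimal NumPy sketch of the Section 3.2 folding (illustrative only, not the repository's `batchnormfuser.py`, with hypothetical parameter names):

```python
import numpy as np

def fuse_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding convolution.
    w: (out_ch, in_ch, kH, kW) conv weights; b and all BatchNorm
    parameters (gamma, beta, mean, var) are per-output-channel vectors."""
    scale = gamma / np.sqrt(var + eps)        # per-channel BN scale
    w_fused = w * scale[:, None, None, None]  # scale each output channel
    b_fused = (b - mean) * scale + beta       # fold mean and shift into bias
    return w_fused, b_fused
```

Running the fused convolution is then mathematically equivalent to the original convolution followed by batch normalization.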
@@ -1085,7 +1124,6 @@ The following table describes the command line arguments for `batchnormfuser.py`
  | `-o`, `--out_path` | Set output checkpoint path for saving fused model | `-o best_without_bn.pth.tar` |
  | `-oa`, `--out_arch` | Set output architecture name (architecture without batchnorm layers) | `-oa ai85simplenet` |

-
  ### Quantization

  There are two main approaches to quantization — quantization-aware training and post-training quantization. The MAX78000/MAX78002 support both approaches.
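The post-training flavor can be sketched in a few lines (a generic symmetric max-scaling illustration, not the project's actual quantization tool):

```python
import numpy as np

def quantize_weights(w, bits=8):
    """Map floating-point weights to signed `bits`-wide integers using
    simple symmetric max scaling (illustrative only)."""
    q_max = 2 ** (bits - 1) - 1
    max_abs = np.abs(w).max()
    scale = max_abs / q_max if max_abs > 0 else 1.0
    w_q = np.clip(np.round(w / scale), -q_max - 1, q_max).astype(np.int64)
    return w_q, scale  # dequantize with w_q * scale
```

Quantization-aware training instead simulates this rounding during training so the network learns to compensate for it.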
@@ -1186,9 +1224,9 @@ The training/verification data is located (by default) in `data/DataSetName`, fo
 
  Train the new network/new dataset. See `scripts/train_mnist.sh` for a command line example.

- #### Netron - Network Visualization
+ #### Netron Network Visualization

- The Netron tool (https://github.com/lutzroeder/Netron) can visualize networks, similar to what is available within Tensorboard. To use Netron, use `train.py` to export the trained network to ONNX, and upload the ONNX file.
+ The [Netron tool](https://github.com/lutzroeder/Netron) can visualize networks, similar to what is available within Tensorboard. To use Netron, use `train.py` to export the trained network to ONNX, and upload the ONNX file.

  ```shell
  (ai8x-training) $ ./train.py --model ai85net5 --dataset MNIST --evaluate --exp-load-weights-from checkpoint.pth.tar --device MAX78000 --summary onnx
@@ -1198,6 +1236,8 @@ The Netron tool (https://github.com/lutzroeder/Netron) can visualize networks, s
 
  ---

+
+
  ## Network Loader (AI8Xize)

  _The `ai8xize` network loader currently depends on PyTorch and Nervana’s Distiller. This requirement will be removed in the future._
@@ -1350,6 +1390,8 @@ To generate an RTL simulation for the same network and sample data in the direct
  (ai8x-synthesize) $ ./ai8xize.py --rtl --verbose --autogen rtlsim --log --test-dir rtlsim --prefix ai85-mnist --checkpoint-file trained/ai85-mnist.pth.tar --config-file networks/mnist-chw-ai85.yaml --device MAX78000
  ```

+
+
  ### Network Loader Configuration Language

  Network descriptions are written in YAML (see https://en.wikipedia.org/wiki/YAML). There are two sections in each file — global statements and a sequence of layer descriptions.
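For orientation, an abbreviated example in the style of `networks/mnist-chw-ai85.yaml` (the keys shown are a hypothetical sketch; consult the repository's network files for the authoritative set and values):

```yaml
arch: ai85net5
dataset: MNIST

layers:
  - out_offset: 0x2000
    processors: 0x0000000000000001
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU
    data_format: CHW
```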
@@ -1826,7 +1868,7 @@ np.save(os.path.join('tests', 'sample_mnist'), a, allow_pickle=False, fix_import
  --------------------------------------------------------
  Logging to TensorBoard - remember to execute the server:
  > tensorboard --logdir='./logs'
-
+
  => loading checkpoint ../ai8x-synthesis/trained/new.pth.tar
  => Checkpoint contents:
  +----------------------+-------------+----------+
@@ -1840,7 +1882,7 @@ np.save(os.path.join('tests', 'sample_mnist'), a, allow_pickle=False, fix_import
  | optimizer_type | type | SGD |
  | state_dict | OrderedDict | |
  +----------------------+-------------+----------+
-
+
  => Checkpoint['extras'] contents:
  +-----------------+--------+-------------------+
  | Key | Type | Value |
@@ -1851,7 +1893,7 @@ np.save(os.path.join('tests', 'sample_mnist'), a, allow_pickle=False, fix_import
  | clipping_scale | float | 0.85 |
  | current_top1 | float | 99.46666666666667 |
  +-----------------+--------+-------------------+
-
+
  Loaded compression schedule from checkpoint (epoch 165)
  => loaded 'state_dict' from checkpoint '../ai8x-synthesis/trained/new.pth.tar'
  Optimizer Type: <class 'torch.optim.sgd.SGD'>
@@ -1867,7 +1909,7 @@ np.save(os.path.join('tests', 'sample_mnist'), a, allow_pickle=False, fix_import
  Test: [ 30/ 40] Loss 51.816276 Top1 99.518229 Top5 99.986979
  Test: [ 40/ 40] Loss 53.596094 Top1 99.500000 Top5 99.990000
  ==> Top1: 99.500 Top5: 99.990 Loss: 53.596
-
+
  ==> Confusion:
  [[ 979 0 0 0 0 0 0 0 1 0]
  [ 0 1132 1 0 0 0 0 2 0 0]
@@ -1896,50 +1938,19 @@ The MAX78000/MAX78002 accelerator can generate an interrupt on completion, and i
 
  To run another inference, ensure all groups are disabled (stopping the state machine, as shown in `cnn_init()`). Next, load the new input data and start processing.

- #### Softmax, and Data Unload in C
-
- `ai8xize.py` can generate a call to a software Softmax function using the command line switch `--softmax`. That function is provided in the `assets/device-all` folder. To use the provided software Softmax on MAX78000/MAX78002, the last layer output should be 32-bit wide (`output_width: 32`).
-
- The software Softmax function is optimized for processing time and it quantizes the input. When the last layer uses weights that are not 8-bits, the software function used will shift the input values first.
-
- ![softmax](docs/softmax.png)
-
-
- #### Generated Files and Upgrading the CNN Model
-
- The generated C code comprises the following files. Some of the files are customized based in the project name, and some are custom for a combination of project name and weight/sample data inputs:
-
- | File name | Source | Project specific? | Model/weights change? |
- | ------------ | -------------------------------- | ----------------- | --------------------- |
- | Makefile | template in assets/embedded-ai87 | Yes | No |
- | cnn.c | generated | Yes | **Yes** |
- | cnn.h | template in assets/device-all | Yes | **Yes** |
- | weights.h | generated | Yes | **Yes** |
- | log.txt | generated | Yes | **Yes** |
- | main.c | generated | Yes | No |
- | sampledata.h | generated | Yes | No |
- | softmax.c | assets/device-all | No | No |
- | model.launch | template in assets/eclipse | Yes | No |
- | .cproject | template in assets/eclipse | Yes | No |
- | .project | template in assets/eclipse | Yes | No |
-
- In order to upgrade an embedded project after retraining the model, point the network generator to a new empty directory and regenerate. Then, copy the four files that will have changed to your original project — `cnn.c`, `cnn.h`, `weights.h`, and `log.txt`. This allows for persistent customization of the I/O code and project (for example, in `main.c` and additional files) while allowing easy model upgrades.

- The generator also adds all files from the `assets/eclipse`, `assets/device-all`, and `assets/embedded-ai87` folders. These files (when starting with `template` in their name) will be automatically customized to include project specific information as shown in the following table:
+ #### Overview of the Functions in main.c

- | Key | Replaced by |
- | --------------------- | ------------------------------------------------------------ |
- | `##__PROJ_NAME__##` | Project name (works on file names as well as the file contents) |
- | `##__ELF_FILE__##` | Output elf (binary) file name |
- | `##__BOARD__##` | Board name (e.g., `EvKit_V1`) |
- | `##__FILE_INSERT__##` | Network statistics and timer |
+ The generated code is split between API code (in `cnn.c`) and data dependent code in `main.c` or `main_riscv.c`. The data dependent code is based on a known-answer test. The `main()` function shows the proper sequence of steps to load and configure the CNN accelerator, run it, unload it, and verify the result.

- ##### Contents of the device-all Folder
+ `void load_input(void);`
+ Load the example input. This function can serve as a template for loading data into the CNN accelerator. Note that the clocks and power to the accelerator must be enabled first. If this is skipped, the device may hang and the [recovery procedure](https://github.com/MaximIntegratedAI/MaximAI_Documentation/tree/master/MAX78000_Feather#how-to-unlock-a-max78000-that-can-no-longer-be-programmed) may have to be used.

- * For MAX78000/MAX78002, the software Softmax is implemented in `softmax.c`.
- * A template for the `cnn.h` header file in `templatecnn.h`. The template is customized during code generation using model statistics and timer, but uses common function signatures for all projects.
+ `int check_output(void);`
+ This function verifies that the known-answer test works correctly in hardware (using the example input). This function is typically not needed in the final application.

+ `int main(void);`
+ This is the main function and can serve as a template for the user application. It shows the correct sequence of operations to initialize, load, run, and unload the CNN accelerator.


  #### Overview of the Generated API Functions
@@ -1988,6 +1999,50 @@ Turn on the boost circuit on `port`.`pin`. This is only needed for very energy i
  Turn off the boost circuit connected to `port`.`pin`.


+ #### Softmax, and Data Unload in C
+
+ `ai8xize.py` can generate a call to a software Softmax function using the command line switch `--softmax`. That function is provided in the `assets/device-all` folder. To use the provided software Softmax on MAX78000/MAX78002, the last layer output should be 32-bit wide (`output_width: 32`).
+
+ The software Softmax function is optimized for processing time and it quantizes the input. When the last layer uses weights that are not 8-bits, the software function used will shift the input values first.
+
+ ![softmax](docs/softmax.png)
+
+
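The scaling behavior described above can be pictured with a floating-point sketch (illustrative only; the shipped `softmax.c` uses fixed-point arithmetic, and the shift width here is an assumption):

```python
import numpy as np

def softmax_with_shift(acc, shift_bits=0):
    """Pre-shift wide accumulator outputs (as done when the last layer's
    weights are not 8 bits wide), then apply a numerically stable softmax."""
    x = np.asarray(acc, dtype=np.float64) / 2.0 ** shift_bits
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()
```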
+ #### Generated Files and Upgrading the CNN Model
+
+ The generated C code comprises the following files. Some of the files are customized based on the project name, and some are custom for a combination of project name and weight/sample data inputs:
+
+ | File name | Source | Project specific? | Model/weights change? |
+ | ------------ | -------------------------------- | ----------------- | --------------------- |
+ | Makefile | template in assets/embedded-ai87 | Yes | No |
+ | cnn.c | generated | Yes | **Yes** |
+ | cnn.h | template in assets/device-all | Yes | **Yes** |
+ | weights.h | generated | Yes | **Yes** |
+ | log.txt | generated | Yes | **Yes** |
+ | main.c | generated | Yes | No |
+ | sampledata.h | generated | Yes | No |
+ | softmax.c | assets/device-all | No | No |
+ | model.launch | template in assets/eclipse | Yes | No |
+ | .cproject | template in assets/eclipse | Yes | No |
+ | .project | template in assets/eclipse | Yes | No |
+
+ In order to upgrade an embedded project after retraining the model, point the network generator to a new empty directory and regenerate. Then, copy the four files that will have changed to your original project — `cnn.c`, `cnn.h`, `weights.h`, and `log.txt`. This allows for persistent customization of the I/O code and project (for example, in `main.c` and additional files) while allowing easy model upgrades.
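The copy step above can be scripted; a small standard-library sketch (the directory paths passed in are hypothetical):

```python
import shutil
from pathlib import Path

# The four files that change with the model; main.c and other customized
# files are deliberately left untouched.
MODEL_FILES = ("cnn.c", "cnn.h", "weights.h", "log.txt")

def copy_regenerated_files(new_dir, project_dir):
    """Copy the regenerated model files into the existing project."""
    for name in MODEL_FILES:
        shutil.copy2(Path(new_dir) / name, Path(project_dir) / name)
```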
+
+ The generator also adds all files from the `assets/eclipse`, `assets/device-all`, and `assets/embedded-ai87` folders. These files (when starting with `template` in their name) will be automatically customized to include project specific information as shown in the following table:
+
+ | Key | Replaced by |
+ | --------------------- | ------------------------------------------------------------ |
+ | `##__PROJ_NAME__##` | Project name (works on file names as well as the file contents) |
+ | `##__ELF_FILE__##` | Output elf (binary) file name |
+ | `##__BOARD__##` | Board name (e.g., `EvKit_V1`) |
+ | `##__FILE_INSERT__##` | Network statistics and timer |
+
+ ##### Contents of the device-all Folder
+
+ * For MAX78000/MAX78002, the software Softmax is implemented in `softmax.c`.
+ * A template for the `cnn.h` header file in `templatecnn.h`. The template is customized during code generation using model statistics and timer, but uses common function signatures for all projects.
+
+

  #### Energy Measurement

@@ -1997,14 +2052,21 @@ When running C code generated with `--energy`, the power display on the EVKit wi
 
  *Note: MAX78000 uses LED1 and LED2 to trigger power measurement via MAX32625 and MAX34417.*

- See https://github.com/MaximIntegratedAI/MaximAI_Documentation/blob/master/MAX78000_Evaluation_Kit/MAX78000%20Power%20Monitor%20and%20Energy%20Benchmarking%20Guide.pdf for more information about benchmarking.
+ See the [benchmarking guide](https://github.com/MaximIntegratedAI/MaximAI_Documentation/blob/master/MAX78000_Evaluation_Kit/MAX78000%20Power%20Monitor%20and%20Energy%20Benchmarking%20Guide.pdf) for more information about benchmarking.
+
+

  ## Further Information

- Additional information about the evaluation kits, and the software development kit (SDK) is available on the web at https://github.com/MaximIntegratedAI/aximAI_Documentation
+ Additional information about the evaluation kits, and the software development kit (SDK) is available on the web at https://github.com/MaximIntegratedAI/MaximAI_Documentation
+
+

  ---

+
+
  ## AHB Memory Addresses

  The following tables show the AHB memory addresses for the MAX78000 accelerator:
@@ -2221,6 +2283,4 @@ Do not try to push any changes into the master branch. Instead, create a fork an
  The following document has more information:
  https://github.com/MaximIntegratedAI/MaximAI_Documentation/blob/master/CONTRIBUTING.md

- ---
-
- o
+ ---

README.pdf

89.4 KB
Binary file not shown.

0 commit comments