Commit 48f92d9

Author: Robert Muchsel
README: Updating pyenv; overview of functions in main.c (#130)
1 parent 9680f23

File tree: 2 files changed, +116 −56 lines


README.md

Lines changed: 116 additions & 56 deletions
@@ -1,6 +1,6 @@
  # MAX78000 Model Training and Synthesis

- _April 22, 2021_
+ _May 4, 2021_

  The Maxim Integrated AI project is comprised of four repositories:

@@ -301,7 +301,7 @@ By default, the main branch is checked out. This branch has been tested more rig
  Support for TensorFlow / Keras is currently in the `develop-tf` branch.

- ##### Updates
+ #### Updating to the Latest Version

  After additional testing, `develop` is merged into the main branch at regular intervals.

@@ -311,7 +311,7 @@ After a small delay of typically a day, a “Release” tag is created on GitHub
  In addition to code updated in the repository itself, submodules and Python libraries may have been updated as well.

- Major upgrades (such as updating from PyTorch 1.5 to PyTorch 1.7) are best done by removing all installed wheels. This can be achieved most easily by creating a new folder and starting from scratch at [Upstream Code](#Upstream Code). Starting from scratch is also recommended when upgrading the Python version.
+ Major upgrades (such as updating from PyTorch 1.7 to PyTorch 1.8) are best done by removing all installed wheels. This can be achieved most easily by creating a new folder and starting from scratch at [Upstream Code](#Upstream Code). Starting from scratch is also recommended when upgrading the Python version.

  For minor updates, pull the latest code and install the updated wheels:

@@ -322,7 +322,44 @@ For minor updates, pull the latest code and install the updated wheels:
  (ai8x-training) $ pip3 install -U -r requirements.txt # or requirements-cu11.txt with CUDA 11.x
  ```

- Updating Python frequently requires updating `pyenv` first. Should `pyenv install x.y.z`
+ ##### Python Version Updates
+
+ Updating Python may require updating `pyenv` first. Should `pyenv install 3.8.9` fail,
+
+ ```shell
+ $ pyenv install 3.8.9
+ python-build: definition not found: 3.8.9
+ ```
+
+ then `pyenv` must be updated. On macOS, use:
+
+ ```shell
+ $ brew update && brew upgrade pyenv
+ ...
+ $
+ ```
+
+ On Linux, use:
+
+ ```shell
+ $ cd $(pyenv root) && git pull && cd -
+ remote: Enumerating objects: 19021, done.
+ ...
+ $
+ ```
+
+ The update should now succeed:
+
+ ```shell
+ $ pyenv install 3.8.9
+ Downloading Python-3.8.9.tar.xz...
+ -> https://www.python.org/ftp/python/3.8.9/Python-3.8.9.tar.xz
+ Installing Python-3.8.9...
+ ...
+ $ pyenv local 3.8.9
+ ```
+

  #### Synthesis Project

@@ -964,7 +1001,7 @@ There are two supported cases for `view()` or `reshape()`.
  `x = x.view(x.size(0), x.size(1), -1) # 2D to 1D`
  `x = x.view(x.shape[0], x.shape[1], 16, -1) # 1D to 2D`
  *Note: `x.size()` and `x.shape[]` are equivalent.*
- When reshaping data, `in_dim:` must be specified in the model description file.
+ When reshaping data, `in_dim:` must be specified in the model description file.
  2. Conversion from 1D and 2D to Fully Connected (“flattening”): The batch dimension (first dimension) must stay the same, and the other dimensions are combined (i.e., M = C×H×W or M = C×L).
  Example:
  `x = x.view(x.size(0), -1) # Flatten`
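The two reshape cases above can be pictured with a short sketch (illustrative sizes only; NumPy's `reshape` stands in for PyTorch's `view()` here, as both follow the same shape rules):

```python
import numpy as np

# Hypothetical sizes for illustration: batch N=8, C=12, H=W=4.
x = np.zeros((8, 12, 4, 4))                           # (N, C, H, W)

one_d = x.reshape(x.shape[0], x.shape[1], -1)         # 2D to 1D: (8, 12, 16)
two_d = one_d.reshape(x.shape[0], x.shape[1], 4, -1)  # 1D back to 2D: (8, 12, 4, 4)
flat = x.reshape(x.shape[0], -1)                      # flatten: M = C*H*W = 192
```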
@@ -989,6 +1026,8 @@ By default, weights are quantized to 8-bits after 10 epochs as specified in `qat
 
  Quantization-aware training can be <u>disabled</u> by specifying `--qat-policy None`.

+ For more information, please also see [Quantization](#Quantization).
+
  #### Batch Normalization

  Batch normalization after `Conv1d` and `Conv2d` layers is supported using “fusing”. The fusing operation merges the effect of batch normalization layers into the parameters of the preceding convolutional layer. For detailed information about batch normalization fusing/folding, see Section 3.2 of the following paper: https://arxiv.org/pdf/1712.05877.pdf.
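The folding operation itself is compact; here is a minimal NumPy sketch of the Section 3.2 folding (illustrative only, not the repository's `batchnormfuser.py`, with hypothetical parameter names):

```python
import numpy as np

def fuse_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding convolution.
    w: (out_ch, in_ch, kH, kW) conv weights; b and all BatchNorm
    parameters (gamma, beta, mean, var) are per-output-channel vectors."""
    scale = gamma / np.sqrt(var + eps)        # per-channel BN scale
    w_fused = w * scale[:, None, None, None]  # scale each output channel
    b_fused = (b - mean) * scale + beta       # fold mean and shift into bias
    return w_fused, b_fused
```

Running the fused convolution is then mathematically equivalent to the original convolution followed by batch normalization.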
@@ -1085,7 +1124,6 @@ The following table describes the command line arguments for `batchnormfuser.py`
  | `-o`, `--out_path` | Set output checkpoint path for saving fused model | `-o best_without_bn.pth.tar` |
  | `-oa`, `--out_arch` | Set output architecture name (architecture without batchnorm layers) | `-oa ai85simplenet` |

-
  ### Quantization

  There are two main approaches to quantization — quantization-aware training and post-training quantization. The MAX78000/MAX78002 support both approaches.
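The post-training flavor can be sketched in a few lines (a generic symmetric max-scaling illustration, not the project's actual quantization tool):

```python
import numpy as np

def quantize_weights(w, bits=8):
    """Map floating-point weights to signed `bits`-wide integers using
    simple symmetric max scaling (illustrative only)."""
    q_max = 2 ** (bits - 1) - 1
    max_abs = np.abs(w).max()
    scale = max_abs / q_max if max_abs > 0 else 1.0
    w_q = np.clip(np.round(w / scale), -q_max - 1, q_max).astype(np.int64)
    return w_q, scale  # dequantize with w_q * scale
```

Quantization-aware training instead simulates this rounding during training so the network learns to compensate for it.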
@@ -1186,9 +1224,9 @@ The training/verification data is located (by default) in `data/DataSetName`, fo
 
  Train the new network/new dataset. See `scripts/train_mnist.sh` for a command line example.

- #### Netron - Network Visualization
+ #### Netron Network Visualization

- The Netron tool (https://github.com/lutzroeder/Netron) can visualize networks, similar to what is available within Tensorboard. To use Netron, use `train.py` to export the trained network to ONNX, and upload the ONNX file.
+ The [Netron tool](https://github.com/lutzroeder/Netron) can visualize networks, similar to what is available within Tensorboard. To use Netron, use `train.py` to export the trained network to ONNX, and upload the ONNX file.

  ```shell
  (ai8x-training) $ ./train.py --model ai85net5 --dataset MNIST --evaluate --exp-load-weights-from checkpoint.pth.tar --device MAX78000 --summary onnx
@@ -1198,6 +1236,8 @@ The Netron tool (https://github.com/lutzroeder/Netron) can visualize networks, s
 
  ---

+
+
  ## Network Loader (AI8Xize)

  _The `ai8xize` network loader currently depends on PyTorch and Nervana’s Distiller. This requirement will be removed in the future._
@@ -1350,6 +1390,8 @@ To generate an RTL simulation for the same network and sample data in the direct
  (ai8x-synthesize) $ ./ai8xize.py --rtl --verbose --autogen rtlsim --log --test-dir rtlsim --prefix ai85-mnist --checkpoint-file trained/ai85-mnist.pth.tar --config-file networks/mnist-chw-ai85.yaml --device MAX78000
  ```

+
+
  ### Network Loader Configuration Language

  Network descriptions are written in YAML (see https://en.wikipedia.org/wiki/YAML). There are two sections in each file — global statements and a sequence of layer descriptions.
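For orientation, an abbreviated example in the style of `networks/mnist-chw-ai85.yaml` (the keys shown are a hypothetical sketch; consult the repository's network files for the authoritative set and values):

```yaml
arch: ai85net5
dataset: MNIST

layers:
  - out_offset: 0x2000
    processors: 0x0000000000000001
    operation: conv2d
    kernel_size: 3x3
    pad: 1
    activate: ReLU
    data_format: CHW
```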
@@ -1826,7 +1868,7 @@ np.save(os.path.join('tests', 'sample_mnist'), a, allow_pickle=False, fix_import
  --------------------------------------------------------
  Logging to TensorBoard - remember to execute the server:
  > tensorboard --logdir='./logs'
-
+
  => loading checkpoint ../ai8x-synthesis/trained/new.pth.tar
  => Checkpoint contents:
  +----------------------+-------------+----------+
@@ -1840,7 +1882,7 @@ np.save(os.path.join('tests', 'sample_mnist'), a, allow_pickle=False, fix_import
  | optimizer_type | type | SGD |
  | state_dict | OrderedDict | |
  +----------------------+-------------+----------+
-
+
  => Checkpoint['extras'] contents:
  +-----------------+--------+-------------------+
  | Key | Type | Value |
@@ -1851,7 +1893,7 @@ np.save(os.path.join('tests', 'sample_mnist'), a, allow_pickle=False, fix_import
  | clipping_scale | float | 0.85 |
  | current_top1 | float | 99.46666666666667 |
  +-----------------+--------+-------------------+
-
+
  Loaded compression schedule from checkpoint (epoch 165)
  => loaded 'state_dict' from checkpoint '../ai8x-synthesis/trained/new.pth.tar'
  Optimizer Type: <class 'torch.optim.sgd.SGD'>
@@ -1867,7 +1909,7 @@ np.save(os.path.join('tests', 'sample_mnist'), a, allow_pickle=False, fix_import
  Test: [ 30/ 40] Loss 51.816276 Top1 99.518229 Top5 99.986979
  Test: [ 40/ 40] Loss 53.596094 Top1 99.500000 Top5 99.990000
  ==> Top1: 99.500 Top5: 99.990 Loss: 53.596
-
+
  ==> Confusion:
  [[ 979 0 0 0 0 0 0 0 1 0]
  [ 0 1132 1 0 0 0 0 2 0 0]
@@ -1896,50 +1938,19 @@ The MAX78000/MAX78002 accelerator can generate an interrupt on completion, and i
 
  To run another inference, ensure all groups are disabled (stopping the state machine, as shown in `cnn_init()`). Next, load the new input data and start processing.

- #### Softmax, and Data Unload in C
-
- `ai8xize.py` can generate a call to a software Softmax function using the command line switch `--softmax`. That function is provided in the `assets/device-all` folder. To use the provided software Softmax on MAX78000/MAX78002, the last layer output should be 32-bit wide (`output_width: 32`).
-
- The software Softmax function is optimized for processing time and it quantizes the input. When the last layer uses weights that are not 8-bits, the software function used will shift the input values first.
-
- ![softmax](docs/softmax.png)
-
-
- #### Generated Files and Upgrading the CNN Model
-
- The generated C code comprises the following files. Some of the files are customized based in the project name, and some are custom for a combination of project name and weight/sample data inputs:
-
- | File name | Source | Project specific? | Model/weights change? |
- | ------------ | -------------------------------- | ----------------- | --------------------- |
- | Makefile | template in assets/embedded-ai87 | Yes | No |
- | cnn.c | generated | Yes | **Yes** |
- | cnn.h | template in assets/device-all | Yes | **Yes** |
- | weights.h | generated | Yes | **Yes** |
- | log.txt | generated | Yes | **Yes** |
- | main.c | generated | Yes | No |
- | sampledata.h | generated | Yes | No |
- | softmax.c | assets/device-all | No | No |
- | model.launch | template in assets/eclipse | Yes | No |
- | .cproject | template in assets/eclipse | Yes | No |
- | .project | template in assets/eclipse | Yes | No |
-
- In order to upgrade an embedded project after retraining the model, point the network generator to a new empty directory and regenerate. Then, copy the four files that will have changed to your original project — `cnn.c`, `cnn.h`, `weights.h`, and `log.txt`. This allows for persistent customization of the I/O code and project (for example, in `main.c` and additional files) while allowing easy model upgrades.

- The generator also adds all files from the `assets/eclipse`, `assets/device-all`, and `assets/embedded-ai87` folders. These files (when starting with `template` in their name) will be automatically customized to include project specific information as shown in the following table:
+ #### Overview of the Functions in main.c

- | Key | Replaced by |
- | --------------------- | ------------------------------------------------------------ |
- | `##__PROJ_NAME__##` | Project name (works on file names as well as the file contents) |
- | `##__ELF_FILE__##` | Output elf (binary) file name |
- | `##__BOARD__##` | Board name (e.g., `EvKit_V1`) |
- | `##__FILE_INSERT__##` | Network statistics and timer |
+ The generated code is split between API code (in `cnn.c`) and data dependent code in `main.c` or `main_riscv.c`. The data dependent code is based on a known-answer test. The `main()` function shows the proper sequence of steps to load and configure the CNN accelerator, run it, unload it, and verify the result.

- ##### Contents of the device-all Folder
+ `void load_input(void);`
+ Load the example input. This function can serve as a template for loading data into the CNN accelerator. Note that the clocks and power to the accelerator must be enabled first. If this is skipped, the device may hang and the [recovery procedure](https://github.com/MaximIntegratedAI/MaximAI_Documentation/tree/master/MAX78000_Feather#how-to-unlock-a-max78000-that-can-no-longer-be-programmed) may have to be used.

- * For MAX78000/MAX78002, the software Softmax is implemented in `softmax.c`.
- * A template for the `cnn.h` header file in `templatecnn.h`. The template is customized during code generation using model statistics and timer, but uses common function signatures for all projects.
+ `int check_output(void);`
+ This function verifies that the known-answer test works correctly in hardware (using the example input). This function is typically not needed in the final application.

+ `int main(void);`
+ This is the main function and can serve as a template for the user application. It shows the correct sequence of operations to initialize, load, run, and unload the CNN accelerator.


  #### Overview of the Generated API Functions
@@ -1988,6 +1999,50 @@ Turn on the boost circuit on `port`.`pin`. This is only needed for very energy i
  Turn off the boost circuit connected to `port`.`pin`.


+ #### Softmax, and Data Unload in C
+
+ `ai8xize.py` can generate a call to a software Softmax function using the command line switch `--softmax`. That function is provided in the `assets/device-all` folder. To use the provided software Softmax on MAX78000/MAX78002, the last layer output should be 32-bit wide (`output_width: 32`).
+
+ The software Softmax function is optimized for processing time and it quantizes the input. When the last layer uses weights that are not 8-bits, the software function used will shift the input values first.
+
+ ![softmax](docs/softmax.png)
+
+
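The scaling behavior described above can be pictured with a floating-point sketch (illustrative only; the shipped `softmax.c` uses fixed-point arithmetic, and the shift width here is an assumption):

```python
import numpy as np

def softmax_with_shift(acc, shift_bits=0):
    """Pre-shift wide accumulator outputs (as done when the last layer's
    weights are not 8 bits wide), then apply a numerically stable softmax."""
    x = np.asarray(acc, dtype=np.float64) / 2.0 ** shift_bits
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()
```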
+ #### Generated Files and Upgrading the CNN Model
+
+ The generated C code comprises the following files. Some of the files are customized based on the project name, and some are custom for a combination of project name and weight/sample data inputs:
+
+ | File name | Source | Project specific? | Model/weights change? |
+ | ------------ | -------------------------------- | ----------------- | --------------------- |
+ | Makefile | template in assets/embedded-ai87 | Yes | No |
+ | cnn.c | generated | Yes | **Yes** |
+ | cnn.h | template in assets/device-all | Yes | **Yes** |
+ | weights.h | generated | Yes | **Yes** |
+ | log.txt | generated | Yes | **Yes** |
+ | main.c | generated | Yes | No |
+ | sampledata.h | generated | Yes | No |
+ | softmax.c | assets/device-all | No | No |
+ | model.launch | template in assets/eclipse | Yes | No |
+ | .cproject | template in assets/eclipse | Yes | No |
+ | .project | template in assets/eclipse | Yes | No |
+
+ In order to upgrade an embedded project after retraining the model, point the network generator to a new empty directory and regenerate. Then, copy the four files that will have changed to your original project — `cnn.c`, `cnn.h`, `weights.h`, and `log.txt`. This allows for persistent customization of the I/O code and project (for example, in `main.c` and additional files) while allowing easy model upgrades.
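The copy step above can be scripted; a small standard-library sketch (the directory paths passed in are hypothetical):

```python
import shutil
from pathlib import Path

# The four files that change with the model; main.c and other customized
# files are deliberately left untouched.
MODEL_FILES = ("cnn.c", "cnn.h", "weights.h", "log.txt")

def copy_regenerated_files(new_dir, project_dir):
    """Copy the regenerated model files into the existing project."""
    for name in MODEL_FILES:
        shutil.copy2(Path(new_dir) / name, Path(project_dir) / name)
```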
+
+ The generator also adds all files from the `assets/eclipse`, `assets/device-all`, and `assets/embedded-ai87` folders. These files (when starting with `template` in their name) will be automatically customized to include project specific information as shown in the following table:
+
+ | Key | Replaced by |
+ | --------------------- | ------------------------------------------------------------ |
+ | `##__PROJ_NAME__##` | Project name (works on file names as well as the file contents) |
+ | `##__ELF_FILE__##` | Output elf (binary) file name |
+ | `##__BOARD__##` | Board name (e.g., `EvKit_V1`) |
+ | `##__FILE_INSERT__##` | Network statistics and timer |
+
+ ##### Contents of the device-all Folder
+
+ * For MAX78000/MAX78002, the software Softmax is implemented in `softmax.c`.
+ * A template for the `cnn.h` header file in `templatecnn.h`. The template is customized during code generation using model statistics and timer, but uses common function signatures for all projects.
+
+

  #### Energy Measurement

@@ -1997,14 +2052,21 @@ When running C code generated with `--energy`, the power display on the EVKit wi
 
  *Note: MAX78000 uses LED1 and LED2 to trigger power measurement via MAX32625 and MAX34417.*

- See https://github.com/MaximIntegratedAI/MaximAI_Documentation/blob/master/MAX78000_Evaluation_Kit/MAX78000%20Power%20Monitor%20and%20Energy%20Benchmarking%20Guide.pdf for more information about benchmarking.
+ See the [benchmarking guide](https://github.com/MaximIntegratedAI/MaximAI_Documentation/blob/master/MAX78000_Evaluation_Kit/MAX78000%20Power%20Monitor%20and%20Energy%20Benchmarking%20Guide.pdf) for more information about benchmarking.
+
+

  ## Further Information

- Additional information about the evaluation kits, and the software development kit (SDK) is available on the web at https://github.com/MaximIntegratedAI/aximAI_Documentation
+ Additional information about the evaluation kits, and the software development kit (SDK) is available on the web at https://github.com/MaximIntegratedAI/MaximAI_Documentation
+
+

  ---

+
+
  ## AHB Memory Addresses

  The following tables show the AHB memory addresses for the MAX78000 accelerator:
@@ -2221,6 +2283,4 @@ Do not try to push any changes into the master branch. Instead, create a fork an
  The following document has more information:
  https://github.com/MaximIntegratedAI/MaximAI_Documentation/blob/master/CONTRIBUTING.md

- ---
-
- o
+ ---

README.pdf

89.4 KB
Binary file not shown.

0 commit comments