Commit b162abc

Author: Robert Muchsel
Replace 1x1 kernels in streaming layers with 3x3 + pad (#143)
1 parent 8a1d12e commit b162abc

5 files changed, +40 -2 lines changed

README.md

Lines changed: 3 additions & 2 deletions
@@ -1,6 +1,6 @@
 # MAX78000 Model Training and Synthesis
 
-_June 21, 2021_
+_June 29, 2021_
 
 The Maxim Integrated AI project is comprised of five repositories:
 
@@ -414,7 +414,7 @@ if [ $? -eq 1 ] ; then
 fi
 ```
 
-The debugger requires OpenOCD. On Windows, an OpenOCD executable is installed with the SDK. On macOS and Linux, the OpenOCD fork from [https://github.com/MaximIntegratedMicros/openocd.git](https://github.com/MaximIntegratedMicros/openocd.git) must be used. An Ubuntu Linux binary is available at https://github.com/MaximIntegratedAI/MAX78000_SDK/blob/master/Tools/OpenOCD/openocd. *Note: A copy of the configuration files and a `run-openocd-maxdap` script are contained in the `hardware` folder of the `ai8x-synthesis` project.*
+The debugger requires OpenOCD. On Windows, an OpenOCD executable is installed with the SDK. On macOS and Linux, the OpenOCD fork from [https://github.com/MaximIntegratedMicros/openocd.git](https://github.com/MaximIntegratedMicros/openocd.git) must be used. An x86_64 Ubuntu Linux binary is available at https://github.com/MaximIntegratedAI/MAX78000_SDK/blob/master/Tools/OpenOCD/openocd. *Note: A copy of the configuration files and a `run-openocd-maxdap` script are contained in the `hardware` folder of the `ai8x-synthesis` project.*
 
 `gen-demos-max78000.sh` will create code that is compatible with the SDK and copy it into the SDK’s Example directories.
 
@@ -807,6 +807,7 @@ The MAX78000 hardware does not support arbitrary network parameters. Specificall
 * Streaming is limited to 8 consecutive layers or fewer, and is limited to four FIFOs (up to 4 input channels in CHW and up to 16 channels in HWC format), see [FIFOs](#FIFOs).
 * For streaming layers, bias values may not be added correctly in all cases.
 * The *final* streaming layer must use padding.
+* Layers that use 1×1 kernels without padding are automatically replaced with equivalent layers that use 3×3 kernels with padding.
 
 * The weight memory supports up to 768 * 64 3×3 Q7 kernels (see [Number Format](#Number-Format)).
 When using 1-, 2- or 4-bit weights, the capacity increases accordingly.
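
The equivalence claimed by the new README bullet can be checked numerically. The following is a minimal sketch, not code from the repository: `conv2d_naive` is a hypothetical helper written only for this illustration (it is not `izer`'s `compute.conv2d`), and the tensor shapes and random inputs are assumptions. It places each 1×1 weight at the center of an otherwise zero 3×3 kernel and compares the padded 3×3 result with the unpadded 1×1 result.

```python
import numpy as np


def conv2d_naive(data, weight, pad):
    """Naive stride-1 2D cross-correlation (hypothetical helper for illustration).

    data:   (in_channels, H, W)
    weight: (out_channels, in_channels, kH, kW)
    pad:    number of zeros added to each side of H and W
    """
    out_ch, _, kh, kw = weight.shape
    padded = np.pad(data, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    out_h = padded.shape[1] - kh + 1
    out_w = padded.shape[2] - kw + 1
    out = np.zeros((out_ch, out_h, out_w), dtype=np.int64)
    for y in range(out_h):
        for x in range(out_w):
            window = padded[:, y:y + kh, x:x + kw]
            # Contract over input channels and both kernel axes
            out[:, y, x] = np.tensordot(weight, window, axes=([1, 2, 3], [0, 1, 2]))
    return out


rng = np.random.default_rng(0)
data = rng.integers(-128, 128, size=(3, 5, 6), dtype=np.int64)
weight11 = rng.integers(-128, 128, size=(4, 3, 1, 1), dtype=np.int64)

# Embed each 1x1 weight at the center of an otherwise zero 3x3 kernel
weight33 = np.zeros((4, 3, 3, 3), dtype=np.int64)
weight33[:, :, 1, 1] = weight11[:, :, 0, 0]

out11 = conv2d_naive(data, weight11, pad=0)
out33 = conv2d_naive(data, weight33, pad=1)
assert out11.shape == out33.shape == (4, 5, 6)
assert np.array_equal(out11, out33)
```

Because the eight outer taps of each 3×3 kernel are zero, the padded border values never contribute to the sum, so the two results match exactly in integer arithmetic.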

README.pdf

1.01 KB
Binary file not shown.

izer/backend/max7800x.py

Lines changed: 13 additions & 0 deletions
@@ -268,10 +268,23 @@ def create_net(self) -> str: # pylint: disable=too-many-locals,too-many-branche
             eprint('Streaming in the first layer requires use of a FIFO.')
         if any(streaming) and start_layer != 0:
             eprint('`--start_layer` must be 0 when using streaming.')
+
         for ll in range(min(tc.dev.MAX_STREAM_LAYERS, layers)):
             if next_sequence[ll] != -1 and next_sequence[ll] != ll + 1 and streaming[ll]:
                 eprint(f'`next_sequence` must be {ll+1} when using streaming in layer {ll}. '
                        f'Currently configured: {next_sequence[ll]}')
+
+            if tc.dev.EMULATE_1X1_STREAMING and streaming[ll] and kernel_size[ll] == [1, 1]:
+                wprint(f'Layer {ll}: Using 3x3 kernels to emulate 1x1 streaming layer')
+                # Create 3x3 weights from 1x1 weights and emulate using 3x3 kernels
+                weight33 = np.zeros((kernel[ll].shape[0], 3, 3), dtype=np.int64)
+                weight33[:, 1, 1] = kernel[ll][:, 0, 0]
+                kernel[ll] = weight33
+                assert padding[ll] == [0, 0]
+                padding[ll] = [1, 1]
+                effective_pad[ll] = [1, 1]
+                kernel_size[ll][0] = kernel_size[ll][1] = 3
+
             if not tc.dev.SUPPORT_STREAM_NONPAD_FINAL and streaming[ll] \
                     and (next_sequence[ll] == -1 or not streaming[next_sequence[ll]]) \
                     and (padding[ll][0] == 0 or padding[ll][1] == 0):
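
The hunk above changes the padding from `[0, 0]` to `[1, 1]` at the same time as it enlarges the kernel, which keeps the layer's output dimensions unchanged. A minimal sketch of that arithmetic, using the standard convolution output-size formula; `conv_output_size` is a hypothetical helper for illustration and does not exist in the repository:

```python
def conv_output_size(in_size, kernel, pad, stride=1, dilation=1):
    """Standard convolution output-size formula for one spatial dimension."""
    return (in_size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1


for in_size in (1, 8, 32):
    # 1x1 kernel without padding preserves the input size ...
    assert conv_output_size(in_size, kernel=1, pad=0) == in_size
    # ... and so does a 3x3 kernel with one pixel of padding
    assert conv_output_size(in_size, kernel=3, pad=1) == in_size

# Without the added padding, a 3x3 kernel would shrink each dimension by 2
assert conv_output_size(8, kernel=3, pad=0) == 6
```

This is also consistent with the `assert padding[ll] == [0, 0]` in the hunk: the rewrite only applies to layers that had no padding to begin with.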

izer/test/test_conv2d_1x1.py

Lines changed: 23 additions & 0 deletions
@@ -61,6 +61,29 @@ def convolve(data, weight, expected):
     print("SUCCESS" if np.array_equal(output, expected) else "*** FAILURE ***")
     assert np.array_equal(output, expected)
 
+    # Create 3x3 weights from 1x1 weights
+    # and emulate using 3x3 kernels
+    shape33 = (weight.shape[0], weight.shape[1], 3, 3)
+    weight33 = np.zeros(shape33, dtype=np.int64)
+    weight33[:, :, 1, 1] = weight[:, :, 0, 0]
+
+    output = compute.conv2d(
+        data,
+        weight33,
+        None,
+        data.shape,
+        expected.shape,
+        kernel_size=[3, 3],
+        stride=[1, 1],
+        pad=[1, 1],
+        dilation=[1, 1],
+        fractional_stride=[1, 1],
+        output_pad=[0, 0],
+        groups=1,
+    )
+    print("PYTORCH OK" if np.array_equal(output, t) else "*** FAILURE ***")
+    assert np.array_equal(output, t)
+
 
 def test_conv2d():
     """Main program to test compute.conv2d."""

izer/tornadocnn.py

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ class Dev:
     REQUIRE_NEW_STREAMING = False
     REQUIRE_FIFO_CPL = True
     EMULATE_ELTWISE_MP = False
+    EMULATE_1X1_STREAMING = True
     USE_PROCESSORS = True
     MODERN_SIM = False
 