Verilog implementation of a fully parameterised (variable feature map and kernel sizes) and pipelined (uses shift registers to help with MAC placement and timing) convolution engine accelerator.
Key Points:
- Each MAC unit has fixed weights, and the feature map is fed in as input to perform the convolution operation
- valid_conv indicates whether the current convolution output is valid: in the current implementation the convolution window wraps around the edge of the feature map, which produces invalid outputs
- Preventing this wrap-around by blocking calculations would stall the pipeline, so each new convolution would either have to be calculated from scratch (increasing time) or require extra memory
- pooling code to be uploaded
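
The wrap-around described above can be illustrated with a small Python behavioural model. This is only a sketch, not the Verilog in this repository: it assumes a square N x N feature map streamed row by row, one pixel per cycle, into a single shift register of depth (K-1)*N + K, with a square K x K kernel. Only `valid_conv` comes from the design; `stream_conv`, `sr` and the other names are hypothetical.

```python
from collections import deque

def stream_conv(fmap, kernel):
    """Return one (result, valid_conv) pair per input pixel (per cycle)."""
    n = len(fmap)      # assumed square N x N feature map
    k = len(kernel)    # assumed square K x K kernel with fixed weights
    depth = (k - 1) * n + k
    # Shift register: sr[0] is the newest pixel, the oldest falls off the end.
    sr = deque([0] * depth, maxlen=depth)
    out = []
    for r in range(n):
        for c in range(n):
            sr.appendleft(fmap[r][c])
            # The MACs tap fixed positions of the shift register every cycle,
            # so the pipeline never stalls.
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * sr[(k - 1 - i) * n + (k - 1 - j)]
            # Near the top and left edges the taps hold pixels from the end of
            # earlier rows (or the initial zeros), so the result is flagged
            # invalid rather than suppressed.
            valid_conv = r >= k - 1 and c >= k - 1
            out.append((acc, valid_conv))
    return out

fmap = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
results = stream_conv(fmap, kernel)
valid_results = [acc for acc, valid in results if valid]
```

In this model every cycle produces a result, but only (N-K+1)^2 of them are flagged valid. The window whose newest pixel is the first pixel of a row mixes that pixel with values from the end of the preceding rows, which is the wrapped, invalid case.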