DSP packing Vivado backend

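[This white paper](https://docs.amd.com/v/u/en-US/wp486-deep-learning-int8) describes DSP packing for int8, where two 8-bit multiplications that share an operand are mapped onto a single DSP slice. I would like to extend it to quantization with fewer bits, so that more multiplications fit into each DSP, increasing the speedup and reducing DSP/LUT utilization even further.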

Things to take into consideration:
- Since the optimization only concerns the HLS code, each layer should carry an attribute on the Python side indicating whether the packing implementation is used
- The `product` function should be extended. Depending on the cases below, the structure of the weight matrix might need to change:
  - weight sharing vs input sharing
  - sequential vs cascaded operation
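
To make the arithmetic behind the packing concrete, here is a minimal Python sketch of the idea, not the HLS implementation. It places several signed operands into separate bit fields of one wide integer, multiplies once by the shared operand, and recovers the individual products. All function names and the `shift` parameter are hypothetical and chosen only for illustration; the 27-bit port width is an assumption based on the DSP48E2 multiplier, and accumulation guard bits are ignored. Whether the packed values are weights (input sharing) or inputs (weight sharing) does not change the arithmetic.

```python
def pack_operands(operands, shift):
    """Place each signed operand in its own field, `shift` bits apart."""
    packed = 0
    for i, value in enumerate(operands):
        packed += value << (i * shift)
    return packed


def unpack_products(product, count, shift):
    """Recover `count` signed products from one wide product.

    Each individual product must fit in a signed `shift`-bit field.
    """
    mask = (1 << shift) - 1
    half = 1 << (shift - 1)
    results = []
    for _ in range(count):
        field = product & mask
        if field >= half:          # reinterpret the field as signed
            field -= 1 << shift
        results.append(field)
        # Removing the signed field first undoes the borrow it caused
        # in the fields above it.
        product = (product - field) >> shift
    return results


def packed_multiply(operands, shared, shift):
    """One wide multiplication standing in for len(operands) narrow ones."""
    wide = pack_operands(operands, shift) * shared
    return unpack_products(wide, len(operands), shift)


def packed_width(count, bits, shift):
    """Bits needed on the wide multiplier port for `count` operands."""
    return (count - 1) * shift + bits


# int8 case: two 8-bit operands, fields 18 bits apart
int8_products = packed_multiply([-128, 127], -128, shift=18)

# Assumed 4-bit case: three operands, fields 8 bits apart (no guard bits)
int4_products = packed_multiply([-8, 7, -3], -8, shift=8)
int4_width = packed_width(3, bits=4, shift=8)  # must stay within 27 bits
```

In this sketch the 4-bit case fits three multiplications into 20 bits of the wide port. A real design would likely need a larger `shift` to leave headroom for accumulation, which reduces how many fields fit.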

There is an existing 8-bit implementation from @violatingcp [[1]](https://github.com/violatingcp/hls4ml/blob/e22fb6d00d465e09dffa0c94c56e39facbf9fa70/hls4ml/templates/vivado/nnet_utils/nnet_dense.h#L191) [[2]](https://github.com/violatingcp/hls4ml/blob/809be00c470897bf4f6475eac960ae3f696f0fca/hls4ml/templates/vivado/nnet_utils/nnet_dense.h#L115)

DSP packing Vivado backend #1096

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

DSP packing Vivado backend #1096

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions