Description
This report describes DSP packing for int8. I would like to extend it to quantization with fewer bits, which should increase the speedup and reduce the DSP/LUT utilization even further.
Things to take into consideration:
- Since the optimization only concerns the HLS code, each layer should carry an attribute on the Python side that indicates whether the packing implementation is used
- The `product` function should be extended. Depending on the cases below, the structure of the weight matrix might need to change:
  - weight sharing vs input sharing
  - sequential vs cascaded operation
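
To make the packing idea concrete for narrower operands, here is a minimal Python sketch of the arithmetic only, not of the HLS code. It packs several signed values into one wide word, multiplies once by a shared operand, and recovers the individual products. The function names (`pack`, `unpack_products`, `packed_multiply`) and the choice of field width `value_bits + shared_bits` are assumptions made for illustration; they are not taken from the existing implementation. The shared operand can be read as either the weight or the input, depending on which of the sharing cases above is used.

```python
def pack(values, shift):
    """Pack signed integers into one word; values[0] sits in the lowest field."""
    packed = 0
    for i, v in enumerate(values):
        packed += v << (i * shift)
    return packed


def unpack_products(product, count, shift):
    """Split a packed product back into `count` signed fields of `shift` bits."""
    results = []
    for _ in range(count):
        # Take the lowest field and reinterpret it as a signed number.
        low = product & ((1 << shift) - 1)
        if low >= 1 << (shift - 1):
            low -= 1 << shift
        results.append(low)
        # Removing the signed field before shifting undoes the borrow that a
        # negative lower product takes from the field above it.
        product = (product - low) >> shift
    return results


def packed_multiply(values, shared, value_bits, shared_bits):
    """Compute [v * shared for v in values] with a single wide multiplication.

    Assumption: a field of value_bits + shared_bits bits is wide enough to
    hold any signed product of the two operand widths.
    """
    shift = value_bits + shared_bits
    product = pack(values, shift) * shared
    return unpack_products(product, len(values), shift)


# Two 4-bit values sharing one 4-bit operand.
print(packed_multiply([3, -5], -7, value_bits=4, shared_bits=4))
```

With narrower operands each field is smaller, so more products can share one multiplication. How many actually fit depends on the port widths of the target multiplier, which this sketch does not model.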
There is an 8-bit implementation from @violatingcp [1] [2].