This repository provides a complete Verilog implementation of the Factored Systolic Array (FSA) and the enhanced Hard-Multiple Carry-Portioned (HCP) FSA architectures.

Fully synthesizable Verilog RTL code for:
1.) Radix-4 SA (baseline)
2.) Radix-8 SA
3.) Factored Radix-8 FSA
4.) Hard-Multiple Carry-Portioned (HCP) FSA
- Configurable word-lengths: 8-bit, 16-bit, and 32-bit
- Testbenches with randomized inputs
- Pipelined multiplier stages with Wallace tree optimization

It implements the architectures described in the following paper: "Factored Systolic Arrays Based on Radix-8 Multiplication for Machine Learning Acceleration"
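
For readers new to radix-8 multiplication, the sketch below is a small Python reference model of textbook radix-8 Booth recoding. It is not code from this repository, and the function names `booth_radix8_digits` and `multiply_radix8` are hypothetical. It only illustrates why radix-8 needs the "hard multiple" 3X (which takes an addition, unlike the shift-only multiples 1X, 2X, and 4X); how the FSA and HCP designs factor or share that computation is described in the paper and is not modeled here.

```python
def booth_radix8_digits(y: int, width: int) -> list[int]:
    """Recode a signed `width`-bit multiplier into radix-8 Booth digits in [-4, 4].

    Digit i is -4*y[3i+2] + 2*y[3i+1] + y[3i] + y[3i-1], with y[-1] = 0.
    """
    mask = (1 << width) - 1
    y &= mask                              # two's-complement bit pattern
    n_digits = (width + 2) // 3            # one digit per 3 multiplier bits
    total = 3 * n_digits
    if (y >> (width - 1)) & 1:             # sign-extend to a multiple of 3 bits
        y |= ((1 << total) - 1) & ~mask
    y_ext = y << 1                         # append the implicit y[-1] = 0
    digits = []
    for i in range(n_digits):
        w = (y_ext >> (3 * i)) & 0xF       # overlapping 4-bit window
        b3, b2, b1, b0 = (w >> 3) & 1, (w >> 2) & 1, (w >> 1) & 1, w & 1
        digits.append(-4 * b3 + 2 * b2 + b1 + b0)
    return digits


def multiply_radix8(x: int, y: int, width: int) -> int:
    """Multiply signed integers by summing radix-8 Booth partial products."""
    hard_3x = x + (x << 1)                 # the hard multiple: needs an adder
    multiples = {0: 0, 1: x, 2: x << 1, 3: hard_3x, 4: x << 2}
    acc = 0
    for i, d in enumerate(booth_radix8_digits(y, width)):
        pp = multiples[abs(d)]             # select |d| * x
        acc += (-pp if d < 0 else pp) << (3 * i)
    return acc
```

For example, `booth_radix8_digits(100, 8)` yields `[-4, -3, 2]` (since -4 + (-3)(8) + 2(64) = 100), and `multiply_radix8(7, -3, 8)` returns `-21`. A model like this could serve as a golden reference when checking randomized testbench results, assuming the RTL uses signed two's-complement operands.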