ZantFoundation/ZantAccelerator


Zant Accelerator

A high-performance, FPGA-based accelerator core developed by the Zant Foundation, with future ASIC tape-out in mind.


Introduction

The Zant Accelerator is a modular, reconfigurable compute engine designed to offload and accelerate compute-intensive kernels on FPGAs, with an ASIC tape-out planned for the future. It provides:

  • A lightweight 16-bit floating-point unit (FPU)
  • A scalable systolic array for matrix and tensor operations
  • A low-latency QSPI/SPI host interface
  • A high-throughput DRAM interface with alignment buffers
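
A software reference model is useful when checking a 16-bit FPU against expected results. The sketch below assumes the IEEE 754 binary16 (half-precision) encoding; this README only says the unit is 16-bit and custom, so the actual format may differ. The function names are hypothetical and not part of this repository.

```python
import math
import struct

def float_to_half_bits(value: float) -> int:
    """Encode a float as a 16-bit pattern (assumed IEEE 754 binary16)."""
    try:
        packed = struct.pack("<e", value)
    except OverflowError:
        # struct refuses values beyond the binary16 range; saturate to infinity.
        packed = struct.pack("<e", math.copysign(math.inf, value))
    return struct.unpack("<H", packed)[0]

def half_bits_to_float(bits: int) -> float:
    """Decode a 16-bit pattern back into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]

def ref_mul(a_bits: int, b_bits: int) -> int:
    """Reference multiply: the product of two binary16 values is exact in
    double precision, so rounding happens only once, on re-encoding."""
    product = half_bits_to_float(a_bits) * half_bits_to_float(b_bits)
    return float_to_half_bits(product)
```

A testbench could compare the bit pattern returned by `ref_mul` with the value read back from the hardware for the same operands.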

This project lives in the Zant Foundation GitHub organization as part of our open-source hardware initiative.


Key Features

  • 16-bit FPU

    • Custom floating-point unit optimized for energy efficiency and throughput
    • Already implemented and under functional test
  • Systolic Array

    • Parameterizable 2D array of processing elements (PEs)
    • Supports matrix-vector and matrix-matrix multiply-accumulate operations
  • Memory & I/O

    • Alignment Buffers: Prevent misalignment and ensure back-to-back burst support
    • Ring Buffer: Low-latency staging between FPU/systolic array and DRAM
    • Quad-SPI Slave Interface: Host access for configuration, micro-code upload, and debug
    • Serializer/Deserializer: 32-bit-wide interface to external DRAM
  • Scalable & Portable

    • Modular RTL in Verilog/SystemVerilog
    • Ready for synthesis on Intel/Altera and Microsemi FPGAs, with a future ASIC target
    • Clean separation of control, datapath, and interface layers
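
To illustrate how a 2D array of PEs can perform a matrix-matrix multiply, here is a minimal cycle-level sketch in Python. It assumes an output-stationary dataflow (each PE keeps one accumulator while operands stream past it) and integer operands for readability; the README does not state which dataflow the PE design uses, and `systolic_matmul` is a hypothetical name.

```python
def systolic_matmul(a, b):
    """Multiply a (n x k) by b (k x m) on a simulated n x m grid of PEs."""
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]    # one accumulator per PE
    a_reg = [[0] * m for _ in range(n)]  # operand registers, moving left to right
    b_reg = [[0] * m for _ in range(n)]  # operand registers, moving top to bottom

    for cycle in range(n + m + k - 2):
        # Sweep from the bottom-right corner so each PE still sees the
        # value its left/upper neighbour latched in the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j == 0:
                    t = cycle - i  # row i enters the array skewed by i cycles
                    a_in = a[i][t] if 0 <= t < k else 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    t = cycle - j  # column j enters the array skewed by j cycles
                    b_in = b[t][j] if 0 <= t < k else 0
                else:
                    b_in = b_reg[i - 1][j]
                acc[i][j] += a_in * b_in
                a_reg[i][j] = a_in
                b_reg[i][j] = b_in
    return acc
```

In this model, PE (i, j) sees the operand pair a[i][t] and b[t][j] at cycle t + i + j, so the whole product completes in n + m + k - 2 cycles.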

Architecture Overview

   +-------------+       +----------------------+     +----------------+
   |  QuadSPI    |<----->| QuadSPI Slave IFace  |<--->|  On-Chip BRAM  |
   |   Host      |       +----------------------+     +----------------+
   +-------------+                      |
                               64-bit   |
                                        v
                              +----------------+
                              | 16-bit FPU     |
                              +----------------+
                                        |
                               64-bit   |
                                        v
                              +----------------+
                              | Ring Buffer    |
                              +----------------+
                                        |
                               64-bit   |
                                        v
                          +------------------------+
                          | Controller & Aligners  |
                          +------------------------+
                                        |
                          +------------------------+
                          |   Systolic PE Array    |
                          +------------------------+
                                        |
                              32-bit    |
                                        v
                              +----------------+
                              | Serializer /   |
                              | DRAM Interface |
                              +----------------+
  • The Controller & Aligners block orchestrates data flow between the buffers, the FPU, and the PE array.
  • The alignment buffers line up incoming and outgoing bursts to and from DRAM for maximum throughput.
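
One way to picture what burst alignment involves: a transfer that starts at an arbitrary address is cut into pieces that never cross a burst boundary. The sketch below is only an illustration; the 64-byte burst size and the name `split_into_bursts` are assumptions and are not taken from the RTL.

```python
def split_into_bursts(addr: int, length: int, burst_bytes: int = 64):
    """Split [addr, addr + length) into pieces that stay within one burst.

    burst_bytes is a hypothetical burst size and must be a power of two.
    """
    if burst_bytes <= 0 or burst_bytes & (burst_bytes - 1):
        raise ValueError("burst_bytes must be a power of two")
    pieces = []
    end = addr + length
    while addr < end:
        # First burst-aligned address strictly above addr.
        boundary = (addr | (burst_bytes - 1)) + 1
        size = min(boundary, end) - addr
        pieces.append((addr, size))
        addr += size
    return pieces
```

For example, a transfer of 0x50 bytes starting at 0x30 becomes a 16-byte head piece followed by one full, aligned 64-byte burst.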

Supported Platforms

  • Intel/Altera MAX 10 — in active test
  • Microsemi PolarFire — in active test
  • Intel Agilex — under evaluation for next-gen prototypes
  • (Future) ASIC tape-out through IHP (July production target)

Current Status

  • 🔄 16-bit FPU completed — functional verification in progress
  • 🚧 Quad-SPI slave interface prototyped
  • 🚧 Ring buffer & alignment buffer RTL in place
  • 🔄 Testing on MAX10 & PolarFire boards
  • 🚧 Systolic array PE design started
  • 🚧 QSPI/SPI command parser & control FSM in development

Getting Started

  1. Clone the repo

    git clone https://github.com/ZantFoundation/ZantAccelerator.git
    cd ZantAccelerator
  2. Install dependencies

  3. Open the Top-Level Project

    • MAX10: quartus/zant_max10.qpf
    • PolarFire: libero/zant_polarfire.gsdproj

Building & Testing

  • Synthesis & Implementation

    # Quartus example
    cd quartus
    quartus_sh --flow compile zant_max10
  • Simulation

    make sim
  • Hardware Validation

    • Load the .sof (MAX10) or the bitstream exported from Libero (PolarFire) onto your dev board
    • Use the Python testbench in sw/tests/ to drive QSPI commands and verify outputs
    • Monitor throughput and latency via UART/logs
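
As a rough idea of how a Python test might frame QSPI commands before sending them to the board, the sketch below packs and unpacks a simple command frame. Everything about the frame is hypothetical: the command parser and register map are still in development, so the opcodes, the 24-bit address, and the 16-bit length field are placeholders and not the project's protocol.

```python
import struct

# Hypothetical opcodes; the real command set is not defined yet.
OP_WRITE = 0x02
OP_READ = 0x03

HEADER = ">BBHH"  # opcode, address bits 23..16, address bits 15..0, payload length
HEADER_SIZE = struct.calcsize(HEADER)

def build_frame(opcode: int, address: int, payload: bytes = b"") -> bytes:
    """Pack opcode, 24-bit address, and payload into one big-endian frame."""
    if not 0 <= address < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for 16-bit length field")
    header = struct.pack(HEADER, opcode, address >> 16, address & 0xFFFF, len(payload))
    return header + payload

def parse_frame(frame: bytes):
    """Inverse of build_frame; returns (opcode, address, payload)."""
    opcode, addr_hi, addr_lo, length = struct.unpack(HEADER, frame[:HEADER_SIZE])
    payload = frame[HEADER_SIZE:HEADER_SIZE + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return opcode, (addr_hi << 16) | addr_lo, payload
```

A round trip through `build_frame` and `parse_frame` gives a quick self-check of the host-side framing code before any hardware is attached.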

Roadmap & Next Steps

  1. Finalize 16-bit FPU tests and integrate power/area optimizations
  2. Complete PE design for the full systolic array (configurable dimensions)
  3. Implement QSPI/SPI command parser with register map
  4. Enhance alignment buffers to support multi-burst DRAM patterns
  5. Evaluate Agilex port for higher clock rates
  6. ASIC tape-out: finalize RTL and constraints, then hand off to IHP (target: July)
  7. Software stack: driver, runtime API, and high-level libraries

Contributing

We welcome contributions from the community!

  1. Fork this repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -m 'Add awesome feature')
  4. Submit a pull request

Please follow our Contributing Guidelines and the Zant Foundation’s Code of Conduct.


License

This project is released under the MIT License. See LICENSE for details.


Contact


