Fast, Accurate, Fetch: Adventures in Superscalar Branch Prediction

This repository contains a 2-way superscalar Fetch Unit implementation with cutting-edge branch predictors, developed as part of UIUC’s ECE 511 coursework. The design maximizes performance through fast and accurate branch prediction, minimizing pipeline flushes and energy waste.

Directory Structure

1. Fetch
    a. Bin         - Compiled binaries
    b. Hdl         - RTL implementation (SystemVerilog)
    c. Hvl         - Testbenches and verification code
    d. Lint        - Linting reports and checks
    e. Pkg         - Shared SV packages
    f. Sim         - Simulation outputs and scripts
    g. Synth       - Synthesis results and constraints

2. L-perceptron    - Long-history perceptron predictor

3. ltage           - L-TAGE branch predictor

4. perceptron      - Standard perceptron predictor

5. Tage            - Tagged Geometric History Length predictor (TAGE)
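The "geometric history length" in TAGE's name means that each tagged table is indexed with a global history length drawn from a geometric series. The sketch below illustrates that idea only. The function name, the table count, and the minimum and maximum lengths are illustrative assumptions, not the values used in this repository's RTL.

```python
def geometric_history_lengths(num_tables, min_len, max_len):
    """Return one history length per tagged table, forming a geometric series.

    All parameter values passed below are hypothetical, chosen for illustration.
    """
    if num_tables == 1:
        return [min_len]
    # Common ratio so that the first table uses min_len and the last uses max_len.
    alpha = (max_len / min_len) ** (1.0 / (num_tables - 1))
    # Round each term to the nearest integer number of history bits.
    return [int(min_len * alpha ** i + 0.5) for i in range(num_tables)]


lengths = geometric_history_lengths(num_tables=4, min_len=5, max_len=40)
print(lengths)
```

Short histories capture recent, local correlations, while the longer ones capture correlations with branches far back in the history.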

Superscalar Fetch Engine

Design Overview

Fetch Stage

The Fetch Stage in a superscalar out-of-order CPU aims to maximize throughput while keeping the misprediction penalty low. It combines two predictors:

  1. Fast Next-Line Predictor (NLP)

    • Small Branch Target Buffer (BTB) + Bi-Modal Table (BIM)
    • Quickly predicts the address of the next fetch block
  2. Accurate Backup Predictor (TAGE)

    • Refines initial guesses using multi-history patterns
    • Branch Checker triggers an immediate redirect on misprediction
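As a rough behavioral illustration of the fast path above, the following sketch pairs a small direct-mapped BTB with a table of 2-bit bimodal counters. It is not the repository's RTL: the class name, table size, indexing scheme, and the 8-byte fetch block (two 4-byte RV32I instructions) are all assumptions made for the example.

```python
class NextLinePredictor:
    """Hypothetical behavioral model of a BTB + bimodal next-line predictor."""

    def __init__(self, entries=16, fetch_bytes=8):
        self.entries = entries
        self.fetch_bytes = fetch_bytes      # assumed: 2 instructions x 4 bytes
        self.btb = {}                       # index -> (tag_pc, target)
        self.bim = [1] * entries            # 2-bit counters, start weakly not-taken

    def _index(self, pc):
        return (pc // self.fetch_bytes) % self.entries

    def predict(self, pc):
        """Return the predicted next fetch PC for the block starting at pc."""
        idx = self._index(pc)
        entry = self.btb.get(idx)
        # Redirect only on a BTB hit whose counter predicts taken.
        if entry is not None and entry[0] == pc and self.bim[idx] >= 2:
            return entry[1]
        return pc + self.fetch_bytes        # fall through to the next block

    def update(self, pc, taken, target):
        """Train the counter and, for taken branches, install the target."""
        idx = self._index(pc)
        if taken:
            self.bim[idx] = min(3, self.bim[idx] + 1)
            self.btb[idx] = (pc, target)
        else:
            self.bim[idx] = max(0, self.bim[idx] - 1)


nlp = NextLinePredictor()
first_guess = nlp.predict(0x100)            # cold: sequential fetch
nlp.update(0x100, taken=True, target=0x200)
second_guess = nlp.predict(0x100)           # trained: predicted target
```

A structure this small can be read within a single cycle, which is why it serves as the first guess while the slower, more accurate predictor works in the background.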

Fetched instructions and metadata are held in:

  • Fetch Queue: buffers instruction packets
  • Fetch Target Queue (FTQ): stores branch addresses and prediction metadata for use at commit (ROB)

On misprediction or backend redirect, the front-end restarts at the corrected PC.
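One way to picture the redirect behavior is as a priority selection of the next fetch PC. The sketch below is an assumption about how such a selection could be ordered (backend redirect first, then a disagreement flagged by the Branch Checker, then the NLP guess). The function and argument names are hypothetical and do not come from the RTL.

```python
def select_next_pc(nlp_next_pc, tage_next_pc=None, backend_redirect_pc=None):
    """Pick the next fetch PC and report whether younger fetches must be squashed.

    Assumed priority: backend redirect > TAGE override > NLP prediction.
    Returns a (next_pc, flush) tuple.
    """
    if backend_redirect_pc is not None:
        # The backend resolved a misprediction: restart at the corrected PC.
        return backend_redirect_pc, True
    if tage_next_pc is not None and tage_next_pc != nlp_next_pc:
        # The slower predictor disagrees with the fast guess: redirect early.
        return tage_next_pc, True
    # No disagreement: keep fetching along the fast prediction.
    return nlp_next_pc, False


agree = select_next_pc(0x108, tage_next_pc=0x108)
override = select_next_pc(0x108, tage_next_pc=0x200)
redirect = select_next_pc(0x108, tage_next_pc=0x200, backend_redirect_pc=0x300)
```

An override by the backup predictor costs only the few fetch cycles spent on the wrong path, far less than a redirect discovered at commit.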


Testbench Environment

The testbench models the remaining pipeline stages (decode, execute, memory, commit), including:

  • Register-file updates
  • Load/store data-memory responses

The two instructions in each fetch packet are executed sequentially, which simplifies the handling of inter-instruction dependencies. At commit, FTQ metadata is checked to detect and recover from any misprediction.
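The commit-time check can be sketched as a small queue model: each entry records the predicted next PC, and the oldest entry is compared with the actual outcome when it commits. This is a simplified illustration. The class name, entry layout, and queue depth are assumptions, and the real FTQ presumably carries more metadata.

```python
from collections import deque


class FetchTargetQueue:
    """Hypothetical FTQ model holding (pc, predicted_next_pc) per fetch packet."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = deque()

    def push(self, pc, predicted_next_pc):
        """Record a prediction at fetch time; returns False when the queue is full."""
        if len(self.entries) >= self.depth:
            return False
        self.entries.append((pc, predicted_next_pc))
        return True

    def commit(self, actual_next_pc):
        """Check the oldest entry; return a redirect PC on misprediction, else None."""
        _pc, predicted = self.entries.popleft()
        if predicted == actual_next_pc:
            return None
        # Everything younger was fetched down the wrong path, so discard it.
        self.entries.clear()
        return actual_next_pc


ftq = FetchTargetQueue()
ftq.push(0x100, 0x108)
ftq.push(0x108, 0x110)
ftq.push(0x110, 0x118)
ok = ftq.commit(0x108)          # prediction matched
redirect_pc = ftq.commit(0x200) # branch actually went to 0x200
```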


Getting Started

Prerequisites

  • SystemVerilog simulator (e.g., Synopsys VCS)
  • GNU Make
  • SRAM generator (e.g., OpenRAM)
  • CBP2016 benchmark traces

Building & Running

1. Generate SRAM macros

cd fetch/sram
make

2. Run Fetch unit simulation (CoreMark)

cd fetch/Sim
make run_vcs_top_tb PROG=../testcode/coremark_rv32i.elf

3. Lint

cd fetch/Lint
make lint

4. Synthesis

cd fetch/Synth
make synth

Standalone Branch Predictors

The following predictors were developed as standalone IPs:

  • TAGE
  • Perceptron
  • L-TAGE
  • L-Perceptron
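For readers unfamiliar with the perceptron approach, here is a minimal behavioral sketch of the general technique: a dot product of signed weights with the global history, trained on a misprediction or when the output magnitude is at or below a threshold. The table size, history length, weight range, and PC hashing are illustrative assumptions and do not reflect the parameters of the IPs in this repository.

```python
class PerceptronPredictor:
    """Hypothetical behavioral model of a global-history perceptron predictor."""

    def __init__(self, num_perceptrons=64, history_len=8, weight_max=127):
        self.n = num_perceptrons
        self.wmax = weight_max
        # Training threshold commonly used for perceptron predictors.
        self.theta = int(1.93 * history_len + 14)
        # One weight vector per table entry; weights[k][0] is the bias weight.
        self.weights = [[0] * (history_len + 1) for _ in range(num_perceptrons)]
        self.history = [-1] * history_len   # +1 = taken, -1 = not taken

    def _row(self, pc):
        return self.weights[(pc >> 2) % self.n]

    def _output(self, pc):
        w = self._row(pc)
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, pc):
        return self._output(pc) >= 0

    def _clamp(self, value):
        return max(-self.wmax, min(self.wmax, value))

    def update(self, pc, taken):
        y = self._output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when confidence is below the threshold.
        if (y >= 0) != taken or abs(y) <= self.theta:
            w = self._row(pc)
            w[0] = self._clamp(w[0] + t)
            for i, xi in enumerate(self.history, start=1):
                w[i] = self._clamp(w[i] + t * xi)
        # Shift the new outcome into the global history.
        self.history = self.history[1:] + [t]


bp = PerceptronPredictor()
correct = 0
for i in range(300):
    taken = (i % 2 == 0)                    # strictly alternating branch
    if i >= 250 and bp.predict(0x400) == taken:
        correct += 1
    bp.update(0x400, taken)
print(correct, "of the last 50 predictions correct")
```

The alternating pattern in the usage loop is linearly separable in the history bits, so the weights settle quickly. Patterns that are not linearly separable are a known weakness of this predictor family.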

Building & Running

Each can be built and benchmarked independently using CBP2016 traces.

The following benchmark traces were used:

  • SHORT_MOBILE-28
  • SHORT_MOBILE-56
  • SHORT_SERVER-11
  • SHORT_SERVER-71
  • LONG_MOBILE-4
  • LONG_MOBILE-5
  • LONG_MOBILE-13
  • LONG_SERVER-1

Example: build & run perceptron predictor

cd perceptron
make run_vcs_top_tb TRACE=/class/ece411/cbp2016/traces/SHORT_MOBILE-28.bt9.trace

Contributors

•	[Avijeet Trivedi](https://github.com/avijeet-trivedi)
•	[Jessica Vaz](https://github.com/jessicavaz16)
•	[Leon Ku](https://github.com/lku-illinois)
