# Fast, Accurate, Fetch: Adventures in Superscalar Branch Prediction

This repository contains a 2-way superscalar Fetch Unit implementation with cutting-edge branch predictors, developed as part of UIUC’s ECE 511 coursework. The design maximizes performance through fast and accurate branch prediction, minimizing pipeline flushes and energy waste.

## Directory Structure

1. fetch
    a. Bin         - Compiled binaries
    b. Hdl         - RTL implementation (SystemVerilog)
    c. Hvl         - Testbenches and verification code
    d. Lint        - Linting reports and checks
    e. Pkg         - Shared SV packages
    f. Sim         - Simulation outputs and scripts
    g. Synth       - Synthesis results and constraints

2. L-perceptron    - Long-history perceptron predictor

3. ltage           - L-TAGE branch predictor

4. perceptron      - Standard perceptron predictor

5. tage            - Tagged Geometric History Length predictor (TAGE)

## Superscalar Fetch Engine

### Design Overview

#### Fetch Stage

The fetch stage of a superscalar out-of-order CPU aims to maximize throughput while keeping the misprediction penalty low. It pairs two predictors:

- Fast Next-Line Predictor (NLP)
  - Small Branch Target Buffer (BTB) + Bi-Modal Table (BIM)
  - Quickly fetches the next sequential block
- Accurate Backup Predictor (TAGE)
  - Refines initial guesses using multi-history patterns
  - Branch Checker triggers an immediate redirect on misprediction
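
As a rough behavioral illustration of the next-line predictor described above, the following Python sketch pairs a small direct-mapped BTB with a table of 2-bit saturating counters. The table size, the index function, the 8-byte fetch width, and all names (`NextLinePredictor`, `predict`, `update`) are assumptions made for illustration; they are not taken from the RTL in this repository.

```python
class NextLinePredictor:
    """Behavioral sketch: direct-mapped BTB plus a bimodal table of 2-bit counters."""

    def __init__(self, entries=64, fetch_bytes=8):
        self.entries = entries          # assumed table size
        self.fetch_bytes = fetch_bytes  # assumed: 2 instructions x 4 bytes per packet
        self.btb = {}                   # index -> (tag, target)
        self.bim = [1] * entries        # 2-bit counters, initialised weakly not-taken

    def _index(self, pc):
        # Assumed index function: drop the byte offset, then wrap to the table size.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """Return (next_pc, predicted_taken) for the fetch starting at pc."""
        idx = self._index(pc)
        entry = self.btb.get(idx)
        hit = entry is not None and entry[0] == pc
        if hit and self.bim[idx] >= 2:
            return entry[1], True
        return pc + self.fetch_bytes, False  # fall through to the next block

    def update(self, pc, taken, target):
        """Train the counter and, for taken branches, record the target."""
        idx = self._index(pc)
        if taken:
            self.btb[idx] = (pc, target)
            self.bim[idx] = min(3, self.bim[idx] + 1)
        else:
            self.bim[idx] = max(0, self.bim[idx] - 1)
```

A single taken update is enough to move a counter from weakly not-taken to weakly taken, so the sketch starts predicting the recorded target after the first taken outcome.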

Fetched instructions and metadata are held in:

- Fetch Queue: buffers fetched instruction packets
- Fetch Target Queue (FTQ): stores branch addresses and prediction metadata for the ROB

On misprediction or backend redirect, the front-end restarts at the corrected PC.
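
The queueing and redirect behavior can be sketched in a few lines of Python. This is a toy model under stated assumptions: the class and field names are hypothetical, each packet is assumed to hold two 4-byte instructions, and a redirect is assumed to discard everything still held in both queues.

```python
from collections import deque


class FrontEnd:
    """Toy front end: a fetch PC, a fetch queue, and an FTQ flushed on redirect."""

    def __init__(self, reset_pc=0):
        self.pc = reset_pc
        self.fetch_queue = deque()  # instruction packets waiting for the back end
        self.ftq = deque()          # per-packet prediction metadata

    def fetch(self, predicted_next_pc, predicted_taken):
        """Record one packet and follow the prediction to the next fetch PC."""
        packet = (self.pc, self.pc + 4)  # assumed: two 4-byte instructions
        self.fetch_queue.append(packet)
        self.ftq.append(
            {"pc": self.pc, "taken": predicted_taken, "next_pc": predicted_next_pc}
        )
        self.pc = predicted_next_pc

    def redirect(self, corrected_pc):
        """Drop all wrong-path state and restart fetching at the corrected PC."""
        self.fetch_queue.clear()
        self.ftq.clear()
        self.pc = corrected_pc
```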

#### Testbench Environment

The testbench models the remaining pipeline stages (decode, execute, memory, commit), including:

- Register-file updates
- Load/store data-memory responses

Each fetch packet (2 instructions) is executed sequentially to simplify inter-instruction dependencies. At commit, FTQ metadata is checked to detect and recover from any misprediction.
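
The commit-time check might look like the following Python sketch. It assumes a hypothetical FTQ entry layout (`pc`, `predicted_next_pc`), 4-byte instructions, and a per-instruction outcome list supplied by the testbench model; none of these names come from the repository.

```python
def commit_packet(ftq_entry, outcomes):
    """Compare the recorded prediction with the executed outcome of one packet.

    outcomes holds one (is_taken_branch, target) pair per instruction, in
    program order. Returns (mispredicted, actual_next_pc).
    """
    # Default: no taken branch, so execution falls through past the packet.
    next_pc = ftq_entry["pc"] + 4 * len(outcomes)
    for taken, target in outcomes:  # instructions are executed one at a time
        if taken:
            next_pc = target        # later slots in the packet are not executed
            break
    mispredicted = next_pc != ftq_entry["predicted_next_pc"]
    return mispredicted, next_pc
```

When `mispredicted` is true, the returned `actual_next_pc` is the address the front end would be redirected to.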

### Getting Started

#### Prerequisites

- SystemVerilog simulator (e.g., Synopsys VCS)
- GNU Make
- SRAM generator (e.g., OpenRAM)
- CBP2016 benchmark traces

#### Building & Running

1. Generate SRAM macros

cd fetch/sram
make

2. Run Fetch unit simulation (CoreMark)

cd fetch/Sim
make run_vcs_top_tb PROG=../testcode/coremark_rv32i.elf

3. Lint

cd fetch/Lint
make lint

4. Synthesis

cd fetch/Synth
make synth

## Standalone Branch Predictors

Standalone IPs developed:

- TAGE
- Perceptron
- L-TAGE
- L-Perceptron
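
For readers unfamiliar with the perceptron approach, here is a minimal behavioral sketch in Python of a textbook perceptron predictor. It is not the repository's implementation: the table size, history length, 8-bit weight saturation, and index function are illustrative assumptions, and the training threshold uses the commonly cited rule of thumb of roughly 1.93 times the history length plus 14.

```python
class PerceptronPredictor:
    """Textbook-style perceptron branch predictor (behavioral sketch only)."""

    def __init__(self, num_perceptrons=256, history_len=16):
        self.n = num_perceptrons
        self.theta = int(1.93 * history_len + 14)  # training threshold (rule of thumb)
        # One weight vector per table entry: a bias weight plus one per history bit.
        self.weights = [[0] * (history_len + 1) for _ in range(num_perceptrons)]
        self.history = [-1] * history_len          # +1 = taken, -1 = not taken

    def _row(self, pc):
        return self.weights[(pc >> 2) % self.n]    # assumed index function

    def _output(self, pc):
        w = self._row(pc)
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, pc):
        """Predict taken when the dot product is non-negative."""
        return self._output(pc) >= 0

    def update(self, pc, taken):
        """Train on a misprediction or a low-confidence output, then shift history."""
        y = self._output(pc)
        t = 1 if taken else -1
        if (y >= 0) != taken or abs(y) <= self.theta:
            w = self._row(pc)
            w[0] = max(-128, min(127, w[0] + t))   # assumed 8-bit saturating weights
            for i, xi in enumerate(self.history, start=1):
                w[i] = max(-128, min(127, w[i] + t * xi))
        self.history = self.history[1:] + [t]
```

Because each weight tracks the correlation between one history bit and the outcome, a strictly alternating branch is learned after a handful of updates, which a plain 2-bit counter cannot do.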

### Building & Running

Each predictor can be built and benchmarked independently using CBP2016 traces.

The following benchmark traces were used:

- SHORT_MOBILE-28
- SHORT_MOBILE-56
- SHORT_SERVER-11
- SHORT_SERVER-71
- LONG_MOBILE-4
- LONG_MOBILE-5
- LONG_MOBILE-13
- LONG_SERVER-1

Example: build & run perceptron predictor

cd perceptron
make run_vcs_top_tb TRACE=/class/ece411/cbp2016/traces/SHORT_MOBILE-27.bt9.trace
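
To repeat a run like this for every trace listed above, a small Python helper can generate the `make` invocations. This is only a sketch: the `make_commands` helper is hypothetical, and it assumes every trace file follows the `<name>.bt9.trace` naming seen in the example and lives in a single directory. It builds the commands without executing them.

```python
from pathlib import PurePosixPath

# Trace names as listed in this README.
TRACES = [
    "SHORT_MOBILE-28", "SHORT_MOBILE-56", "SHORT_SERVER-11", "SHORT_SERVER-71",
    "LONG_MOBILE-4", "LONG_MOBILE-5", "LONG_MOBILE-13", "LONG_SERVER-1",
]


def make_commands(trace_dir, traces=TRACES, target="run_vcs_top_tb"):
    """Build one `make` argument list per trace; nothing is executed here."""
    base = PurePosixPath(trace_dir)
    return [
        ["make", target, f"TRACE={base / (name + '.bt9.trace')}"]  # assumed suffix
        for name in traces
    ]


if __name__ == "__main__":
    for cmd in make_commands("/class/ece411/cbp2016/traces"):
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, cwd="perceptron", check=True)` to launch the simulation from the predictor's directory.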

## Contributors

- [Avijeet Trivedi](https://github.com/avijeet-trivedi)
- [Jessica Vaz](https://github.com/jessicavaz16)
- [Leon Ku](https://github.com/lku-illinois)
