Skip to content

Emulate Silent Data Corruptions #114

@ligurio

Description

@ligurio

SiliFuzz: Fuzzing CPUs by proxy:

When discussing misbehaving CPUs we distinguish between logic bugs and electrical defects. A logic bug is an invalid CPU behavior inherent to a particular CPU microarchitecture or stepping, e.g. [2, 3, 4]. An electrical defect is an invalid CPU behavior that happens only on one or several chips. Often, but not always, an electrical defect affects only one of the physical cores on the chip. A defect may be present in a CPU from day one (if it was missed by the vendor during testing) or it may be a result of physical wear-out (e.g., circuit aging). For the purposes of this paper the distinction between day-one defects and wear-out is not important.

The term “Silent Data Corruption” (SDC) has been coined [5, 6, 7, 8] to represent CPU defects (and bugs) that do not cause an immediate observable failure, but instead silently lead to incorrect computation. This phenomenon has been known for a very long time, and is directly related to concepts like fault-secure operation, data integrity checking, and silent errors. In many production environments SDCs are more dangerous than crashes, because most server software ecosystems are built with tolerance to crashes, but not with tolerance to unobserved data corruption. Not all CPU defects induce SDCs, but many do.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions