Commit 9242718

Fuzzer: Modify functions late (#7313)
Previously we would create a function and then immediately modify it in interesting ways (replace code, move code around, etc.). We also modified any initial functions (in the content we are given to build on top of) at the very start. That was simple, but by modifying later we can get some benefits.

This PR makes us work in a loop:

    while more:
      if random:
        add function
      else
        modify some random function previously added

This lets us unify the modification code in one place (rather than have it separate for initial functions and for created functions). Also, by modifying late, we can end up with calls from early functions to late ones, as we may create A and B before modifying A, and end up mutating something into a call to B. Previously we never had such forward calls.

For such late modding to work, the FunctionCreationContext needs to be a little smarter, since we no longer use a single context for both creation and modification: the modification may now happen later. Specifically, it needs to scan the body to find the next label index that is valid to add, which we did not need before.

This PR also moves some things out of the context destructor, because now we will run it more than once on the same function, and some things, like adding the hang limit checks, only need to happen once.
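The add-or-modify loop described above can be sketched in isolation. This is a minimal illustration, not the fuzzer's code: `Func`, `runLoop`, `visibleAtMod`, the fixed `steps` count (standing in for "while we have random input"), and the `upTo` callback (standing in for the fuzzer's random source) are all hypothetical names. It also ignores the `allowOOB` gating and initial functions that the real change handles.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a module function; not Binaryen's Function.
struct Func {
  std::string name;
  bool modded = false;
  // How many functions existed when this one was modified. If this is larger
  // than the function's own index + 1, a mutation could have introduced a
  // call to a later ("forward") function.
  std::size_t visibleAtMod = 0;
};

constexpr std::size_t RESOLUTION = 10;

// Runs `steps` iterations of the loop. `chance` is in [0, RESOLUTION]:
// 0 means never modify, RESOLUTION means modify whenever possible.
// `upTo(n)` is assumed to return a value in [0, n).
std::vector<Func> runLoop(std::size_t steps,
                          std::size_t chance,
                          const std::function<std::size_t(std::size_t)>& upTo) {
  std::vector<Func> funcs;
  // Indices into funcs that may still be modified (each at most once).
  std::vector<std::size_t> moddable;

  for (std::size_t i = 0; i < steps; i++) {
    if (!moddable.empty() && upTo(RESOLUTION) < chance) {
      // Modify a random, previously added function.
      std::size_t slot = upTo(moddable.size());
      Func& func = funcs[moddable[slot]];
      func.modded = true;
      func.visibleAtMod = funcs.size();

      // Remove it from the list: move the last item into its place, truncate.
      moddable[slot] = moddable.back();
      moddable.pop_back();
    } else {
      // Add a new function, which becomes eligible for later modification.
      Func func;
      func.name = "func" + std::to_string(funcs.size());
      funcs.push_back(func);
      moddable.push_back(funcs.size() - 1);
    }
  }
  return funcs;
}
```

With a scripted `upTo` that first declines to modify and then picks the first function, `func0` is modified only after `func1` exists, which is the situation that makes forward calls possible.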
1 parent 4de55c1 commit 9242718

5 files changed: +227 -173 lines changed

src/tools/fuzzing.h

Lines changed: 8 additions & 5 deletions
@@ -228,10 +228,7 @@ class TranslateToFuzzReader {
     // type => list of locals with that type
     std::unordered_map<Type, std::vector<Index>> typeLocals;
 
-    FunctionCreationContext(TranslateToFuzzReader& parent, Function* func)
-      : parent(parent), func(func) {
-      parent.funcContext = this;
-    }
+    FunctionCreationContext(TranslateToFuzzReader& parent, Function* func);
 
     ~FunctionCreationContext();
 
@@ -343,8 +340,14 @@ class TranslateToFuzzReader {
   Expression* makeImportSleep(Type type);
   Expression* makeMemoryHashLogging();
 
-  // Function creation
+  // Function operations. The main processFunctions() loop will call addFunction
+  // as well as modFunction().
+  void processFunctions();
+  // Add a new function.
   Function* addFunction();
+  // Modify an existing function.
+  void modFunction(Function* func);
+
   void addHangLimitChecks(Function* func);
 
   // Recombination and mutation

src/tools/fuzzing/fuzzing.cpp

Lines changed: 120 additions & 70 deletions
@@ -362,11 +362,7 @@ void TranslateToFuzzReader::build() {
   addImportCallingSupport();
   addImportSleepSupport();
   modifyInitialFunctions();
-  // keep adding functions until we run out of input
-  while (!random.finished()) {
-    auto* func = addFunction();
-    addInvocations(func);
-  }
+  processFunctions();
   if (fuzzParams->HANG_LIMIT > 0) {
     addHangLimitSupport();
   }
@@ -1125,6 +1121,41 @@ void TranslateToFuzzReader::addHashMemorySupport() {
   }
 }
 
+TranslateToFuzzReader::FunctionCreationContext::FunctionCreationContext(
+  TranslateToFuzzReader& parent, Function* func)
+  : parent(parent), func(func) {
+  parent.funcContext = this;
+
+  // Note the types of all locals.
+  computeTypeLocals();
+
+  // Find the right index for labelIndex: we emit names like label$5, so we need
+  // the index to be larger than all currently existing.
+  if (!func->body) {
+    return;
+  }
+
+  struct Finder : public PostWalker<Finder, UnifiedExpressionVisitor<Finder>> {
+    Index maxIndex = 0;
+
+    void visitExpression(Expression* curr) {
+      // Note all scope names, and fix up all uses.
+      BranchUtils::operateOnScopeNameDefs(curr, [&](Name& name) {
+        if (name.is()) {
+          if (name.startsWith("label$")) {
+            auto str = name.toString();
+            str = str.substr(6);
+            Index index = atoi(str.c_str());
+            maxIndex = std::max(maxIndex, index + 1);
+          }
+        }
+      });
+    }
+  } finder;
+  finder.walk(func->body);
+  labelIndex = finder.maxIndex;
+}
+
 TranslateToFuzzReader::FunctionCreationContext::~FunctionCreationContext() {
   // We must ensure non-nullable locals validate. Later down we'll run
   // TypeUpdating::handleNonDefaultableLocals which will make them validate by
@@ -1151,9 +1182,6 @@ TranslateToFuzzReader::FunctionCreationContext::~FunctionCreationContext() {
   // fixup to ensure we validate.
   TypeUpdating::handleNonDefaultableLocals(func, parent.wasm);
 
-  if (parent.fuzzParams->HANG_LIMIT > 0) {
-    parent.addHangLimitChecks(func);
-  }
   assert(breakableStack.empty());
   assert(hangStack.empty());
   parent.funcContext = nullptr;
@@ -1296,14 +1324,78 @@ Expression* TranslateToFuzzReader::makeMemoryHashLogging() {
   return builder.makeCall(logImportNames[Type::i32], {hash}, Type::none);
 }
 
+void TranslateToFuzzReader::processFunctions() {
+  // Functions that are eligible for being modded. We only do so once to each
+  // function, at most, so once we do we remove it from here.
+  std::vector<Function*> moddable;
+
+  // Defined initial functions are moddable.
+  for (auto& func : wasm.functions) {
+    if (!func->imported()) {
+      moddable.push_back(func.get());
+    }
+  }
+
+  // Add invocations, which can help execute the code here even if the function
+  // was not exported (or was exported but with a signature that traps
+  // immediately, like receiving a non-nullable ref, that the fuzzer can't
+  // provide from JS). Note we cannot iterate on wasm.functions because
+  // addInvocations modifies that.
+  for (auto* func : moddable) {
+    addInvocations(func);
+  }
+
+  // We do not want to always mod in the same frequency. Pick a chance to mod a
+  // function. When the chance is maximal we will mod every single function, and
+  // immediately after creating it; when the chance is minimal we will not mod
+  // anything; values in the middle will end up randomly modding some functions,
+  // at random times (random times are useful because we might create function
+  // A, then B, then mod A, and since B has already been created, the modding of
+  // A may lead to calls to B).
+  const int RESOLUTION = 10;
+  auto chance = upTo(RESOLUTION + 1);
+
+  // Keep working while we have random data.
+  while (!random.finished()) {
+    if (!moddable.empty() && upTo(RESOLUTION) < chance) {
+      // Mod an existing function.
+      auto index = upTo(moddable.size());
+      auto* func = moddable[index];
+      modFunction(func);
+
+      // Remove this function from the vector by swapping the last item to its
+      // place, and truncating.
+      moddable[index] = moddable.back();
+      moddable.pop_back();
+    } else {
+      // Add a new function
+      auto* func = addFunction();
+      addInvocations(func);
+
+      // It may be modded later, if we allow out-of-bounds: we emit OOB checks
+      // in the code we just generated, and any changes could break that.
+      if (allowOOB) {
+        moddable.push_back(func);
+      }
+    }
+  }
+
+  // At the very end, add hang limit checks (so no modding can override them).
+  if (fuzzParams->HANG_LIMIT > 0) {
+    for (auto& func : wasm.functions) {
+      if (!func->imported()) {
+        addHangLimitChecks(func.get());
+      }
+    }
+  }
+}
+
 // TODO: return std::unique_ptr<Function>
 Function* TranslateToFuzzReader::addFunction() {
   LOGGING_PERCENT = upToSquared(100);
   auto allocation = std::make_unique<Function>();
   auto* func = allocation.get();
   func->name = Names::getValidFunctionName(wasm, "func");
-  FunctionCreationContext context(*this, func);
-  assert(funcContext->typeLocals.empty());
   Index numParams = upToSquared(fuzzParams->MAX_PARAMS);
   std::vector<Type> params;
   params.reserve(numParams);
@@ -1322,7 +1414,9 @@ Function* TranslateToFuzzReader::addFunction() {
     }
     func->vars.push_back(type);
   }
-  context.computeTypeLocals();
+  // Generate the function creation context after we filled in locals, which it
+  // will scan.
+  FunctionCreationContext context(*this, func);
   // with small chance, make the body unreachable
   auto bodyType = func->getResults();
   if (oneIn(10)) {
@@ -1334,29 +1428,6 @@ Function* TranslateToFuzzReader::addFunction() {
   } else {
     func->body = make(bodyType);
   }
-  // Our OOB checks are already in the code, and if we recombine/mutate we
-  // may end up breaking them. TODO: do them after the fact, like with the
-  // hang limit checks.
-  if (allowOOB) {
-    // Notice the locals and their types again, as more may have been added
-    // during generation of the body. We want to be able to local.get from those
-    // as well.
-    // TODO: We could also add a "localize" phase here to stash even more things
-    // in locals, so that they can be reused. But we would need to be
-    // careful with non-nullable locals (which error if used before being
-    // set, or trap if we make them nullable, both of which are bad).
-    context.computeTypeLocals();
-    // Recombinations create duplicate code patterns.
-    recombine(func);
-    // Mutations add random small changes, which can subtly break duplicate
-    // code patterns.
-    mutate(func);
-    // TODO: liveness operations on gets, with some prob alter a get to one
-    // with more possible sets.
-    // Recombination, mutation, etc. can break validation; fix things up
-    // after.
-    fixAfterChanges(func);
-  }
 
   // Add hang limit checks after all other operations on the function body.
   wasm.addFunction(std::move(allocation));
@@ -1392,6 +1463,20 @@ Function* TranslateToFuzzReader::addFunction() {
   return func;
 }
 
+void TranslateToFuzzReader::modFunction(Function* func) {
+  FunctionCreationContext context(*this, func);
+
+  dropToLog(func);
+  // TODO: if we add OOB checks after creation, then we can do it on
+  // initial contents too, and it may be nice to *not* run these
+  // passes, like we don't run them on new functions. But, we may
+  // still want to run them some of the time, at least, so that we
+  // check variations on initial testcases even at the risk of OOB.
+  recombine(func);
+  mutate(func);
+  fixAfterChanges(func);
+}
+
 void TranslateToFuzzReader::addHangLimitChecks(Function* func) {
   // loop limit
   for (auto* loop : FindAll<Loop>(func->body).list) {
@@ -1822,9 +1907,6 @@ void TranslateToFuzzReader::modifyInitialFunctions() {
   if (wasm.functions.empty()) {
     return;
   }
-  // Pick a chance to fuzz the contents of a function.
-  const int RESOLUTION = 10;
-  auto chance = upTo(RESOLUTION + 1);
   // Do not iterate directly on wasm.functions itself (that is, avoid
   // for (x : wasm.functions)
   // ) as we may add to it as we go through the functions - make() can add new
@@ -1840,43 +1922,11 @@ void TranslateToFuzzReader::modifyInitialFunctions() {
         (func->module == "fuzzing-support" || preserveImportsAndExports)) {
       continue;
     }
-    FunctionCreationContext context(*this, func);
     if (func->imported()) {
+      FunctionCreationContext context(*this, func);
       func->module = func->base = Name();
       func->body = make(func->getResults());
     }
-    // Optionally, fuzz the function contents.
-    if (upTo(RESOLUTION) >= chance) {
-      dropToLog(func);
-      // Notice params as well as any locals generated above.
-      // TODO add some locals? and the rest of addFunction's operations?
-      context.computeTypeLocals();
-      // TODO: if we add OOB checks after creation, then we can do it on
-      // initial contents too, and it may be nice to *not* run these
-      // passes, like we don't run them on new functions. But, we may
-      // still want to run them some of the time, at least, so that we
-      // check variations on initial testcases even at the risk of OOB.
-      recombine(func);
-      mutate(func);
-      fixAfterChanges(func);
-      // TODO: This triad of functions appears in another place as well, and
-      // could be handled by a single function. That function could also
-      // decide to reorder recombine and mutate or even run more cycles of
-      // them.
-    }
-  }
-
-  // Add invocations, which can help execute the code here even if the function
-  // was not exported (or was exported but with a signature that traps
-  // immediately, like receiving a non-nullable ref, that the fuzzer can't
-  // provide from JS). Note we need to use a temp vector for iteration, as
-  // addInvocations modifies wasm.functions.
-  std::vector<Function*> funcs;
-  for (auto& func : wasm.functions) {
-    funcs.push_back(func.get());
-  }
-  for (auto* func : funcs) {
-    addInvocations(func);
   }
 
   // Remove a start function - the fuzzing harness expects code to run only
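The label scan that the new FunctionCreationContext constructor performs can be illustrated without Binaryen. The sketch below is an assumption-laden simplification: `nextLabelIndex` is a hypothetical helper, and it takes a flat list of scope names as plain strings, whereas the real code walks the function body with a `PostWalker` and `BranchUtils::operateOnScopeNameDefs`. The parsing logic (strip the `label$` prefix, take the numeric suffix, keep the maximum plus one) mirrors the diff above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Returns the first index N for which "label$N" is guaranteed not to clash
// with any existing name of that form. Names without the prefix are ignored.
unsigned nextLabelIndex(const std::vector<std::string>& scopeNames) {
  const std::string prefix = "label$";
  unsigned maxIndex = 0;
  for (const auto& name : scopeNames) {
    // Equivalent to a startsWith check on the prefix.
    if (name.compare(0, prefix.size(), prefix) == 0) {
      // Parse the numeric suffix after the prefix, as the diff does with
      // substr(6) and atoi.
      unsigned index =
        static_cast<unsigned>(std::atoi(name.c_str() + prefix.size()));
      maxIndex = std::max(maxIndex, index + 1);
    }
  }
  return maxIndex;
}
```

For example, a body that already defines `label$5` and `label$2` yields 6, so newly generated labels start at `label$6`; a body with no such names yields 0.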
Lines changed: 27 additions & 27 deletions
@@ -1,35 +1,35 @@
 Metrics
 total
- [exports] : 24
- [funcs] : 30
+ [exports] : 77
+ [funcs] : 111
  [globals] : 21
  [imports] : 5
  [memories] : 1
  [memory-data] : 5
- [table-data] : 8
+ [table-data] : 45
  [tables] : 1
  [tags] : 0
- [total] : 3231
- [vars] : 112
- Binary : 266
- Block : 529
- Break : 115
- Call : 122
- CallIndirect : 3
- Const : 529
- Drop : 31
- GlobalGet : 271
- GlobalSet : 190
- If : 181
- Load : 58
- LocalGet : 233
- LocalSet : 177
- Loop : 65
- Nop : 59
- RefFunc : 8
- Return : 31
- Select : 16
- Store : 29
- Switch : 4
- Unary : 220
- Unreachable : 94
+ [total] : 9163
+ [vars] : 296
+ Binary : 620
+ Block : 1503
+ Break : 288
+ Call : 580
+ CallIndirect : 101
+ Const : 1500
+ Drop : 136
+ GlobalGet : 772
+ GlobalSet : 562
+ If : 478
+ Load : 126
+ LocalGet : 630
+ LocalSet : 487
+ Loop : 166
+ Nop : 78
+ RefFunc : 45
+ Return : 87
+ Select : 75
+ Store : 60
+ Switch : 2
+ Unary : 588
+ Unreachable : 279
