-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Description
Background
defer
is a core and incredibly useful component of Zig, which helps avoid bugs in resource management. Similarly, errdefer
helps avoid bugs in situations where a resource should be cleaned up only on error conditions. However, there is a third, lesser-known, piece of syntax: capturing errdefer
. It looks like this:
errdefer |err| {
// `err` is the error being returned; you can do stuff with it!
// Like a normal `defer/`errdefer`, you can't `return` or `try` in this block, so you can't change
// which error is returned; but you can e.g. log it
}
This feature is used incredibly rarely; many experienced Zig users do not even know that it exists. As a data point, in the Zig repository, ignoring test coverage and documentation for it, there are exactly two uses of this language feature:
zig/tools/update_cpu_features.zig
Line 1529 in 8e79fc6
errdefer |err| std.debug.panic("panic: {s}", .{@errorName(err)}); |
Line 298 in 8e79fc6
errdefer |err| oom(err); |
A feature being rarely used is not necessarily in itself a reason to remove that feature. However, errdefer
has a bigger design problem.
The Problem
Here's a question: in errdefer |err|
, what is the type of err
?
The obvious thing would be that it has the type of "every error which can be returned after this point in the function", but this isn't a feasible definition; it brings a good amount of (quite boring) complexity to the language specification in terms of how it interacts with things like inferred error sets, and implementing this would require a type of logic in the compiler which likely has compiler performance implications. So, in reality, there are 3 reasonable choices:
- The type of
err
is the current function's error set. - The type of
err
isanyerror
. - The type of
err
is the error type which is being returned at a givenreturn
site (so theerrdefer
body is reanalyzed at every possible error return with a different type forerr
).
Let's go through these three options. Note that right now, the compiler has inconsistent and unhelpful behavior here, so this is an unsolved problem.
The first option is actually fairly good, with two big caveats:
- In functions with inferred error sets,
errdefer |err| switch (err)
becomes impossible (it would emit a dependency loop because we don't know what cases need to be in theswitch
). - It makes something like
errdefer |err| fatal("error: {}", .{err});
impossible, since the function needs to return an error forerr
to be typed correctly.
The second option means that any switch on the captured err
must have an else
prong, so if you want to switch
on the captured error, this option is strictly worse than using a wrapper function which switch
es on the returned error (since at this point the error type is known and can be exhaustively switched on). However, people are likely to reach for errdefer
anyway out of convenience, and shoot themselves in the foot by losing type safety.
The third option is what we usually do today, but:
- It breaks
switch
on the captured error, because@TypeOf(err)
will usually only contain a subset of all possible errors, so you'll get errors thatswitch
prongs are impossible (because e.g.error.Foo
can't be returned at this particular return site, soerr
iserror{Bar}
instead). - It makes it significantly harder for any compiler implementation to deduplicate
errdefer
s across different error return sites, because they are analyzed differently. (This Zig implementation does not do this deduplication anyway, but it's good for the language spec to make it viable!)
One observation here is that all three of these solutions have big problems with switch
on the captured error -- and this is a construct we want to encourage, not discourage! The main other use case for capturing errdefer
is to log errors. The thing is, this use case is actually not a brilliant use for this construct, because it can lead to code bloat (due to the error logging logic being duplicated at every error return site). Generally, the body of an errdefer
should be incredibly simple; it's essentially just intended for resource cleanup. Capturing errdefer
encourages using the construct in more complex ways, which is usually not a good thing! Also, if you're logging an error, this generally indicates you aren't going to explicitly handle the different error cases higher on the call stack, so it's probably desirable to "collapse" these errors down into one (e.g. turn your set of 5 errors into a blanket error.FooFailed
). This is something errdefer |err|
does not support. As such, there are several big advantages to using a "wrapper function" approach instead, like this:
pub fn foo() error{FooFailed}!void {
fooInner() catch |err| {
std.log.err("error: {s}", .{@errorName(err)});
return error.FooFailed;
};
}
fn fooInner() !void {
// ...
}
So, capturing errdefer
syntax raises a difficult design problem, where all solutions seem unsatisfactory. Furthermore, the common use cases for this feature -- and indeed, the ones it seems to encourage -- tend to emit bloated code and suffer from worse error sets. That leads on to this proposal.
Proposal
Remove errdefer |err|
syntax from Zig. Uh... yeah, that's kinda it.
Migration
The two uses in the compiler can be trivially rewritten with these diffs.
lib/fuzzer.zig
fn traceValue(f: *Fuzzer, x: usize) void {
- errdefer |err| oom(err);
- try f.traced_comparisons.put(gpa, x, {});
+ f.traced_comparisons.put(gpa, x, {}) catch |err| oom(err);
}
tools/update_cpu_features.zig
fn processOneTarget(job: Job) void {
- errdefer |err| std.debug.panic("panic: {s}", .{@errorName(err)});
+ processOneTargetInner(job) catch |err| std.debug.panic("panic: {s}", .{@errorName(err)});
+}
+fn processOneTargetInner(job: Job) !void {
const target = job.target;
These are also basically the two solutions for current uses of this feature: either split your function in two, or perhaps use catch
at the error site(s) if there aren't many. This improves clarity (one less language feature to understand in order to understand your code!), type safety (no issues with switch
exhaustivity, and you can "collapse" error sets where you want as discussed above), and code bloat (no duplication of the error handling path at every possible error return).