Proposal: make by-ref capture semantics consistent #23509

@mlugg

Background

In Zig, several constructs can capture a value "by reference" using the syntax |*x|. The captures which can be by-ref are:

  • The capture of an iterable for operand (i.e. not a range)
  • The payload capture of an if on an optional or error union
  • The payload capture of a while on an optional or error union
  • The payload capture of a switch prong

However, there is an inconsistency in how these by-ref captures work between for operands and other constructs. The inconsistency is made clear by these two examples:

var arr: [5]u32 = undefined;
for (arr) |*x| x.* = 0;

var opt: ?u32 = 123;
if (opt) |*x| x.* = 456;

The second of these snippets compiles; the first does not. For the first snippet to compile, the for operand needs to be &arr.

This difference occurs because in the latter case, Zig effectively inserts an & before the operand when the capture is "by-ref". while and switch captures also follow the latter behavior.
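To make the difference concrete, here is a minimal sketch of both behaviors side by side. It assumes a Zig version in which the implicit-ref behavior described above is still in effect; the helper names `zeroed` and `overwritten` are made up for illustration.

```zig
const std = @import("std");

// Hypothetical helper: `for` needs an explicit `&` for a by-ref capture.
fn zeroed() [5]u32 {
    var arr: [5]u32 = undefined;
    for (&arr) |*x| x.* = 0;
    return arr;
}

// Hypothetical helper: `if` takes the address of `opt` implicitly.
fn overwritten() ?u32 {
    var opt: ?u32 = 123;
    if (opt) |*x| x.* = 456;
    return opt;
}

test "for is explicit, if is implicit" {
    try std.testing.expectEqual([5]u32{ 0, 0, 0, 0, 0 }, zeroed());
    try std.testing.expectEqual(@as(?u32, 456), overwritten());
}
```

Running this with `zig test` should pass today: the write through `x` in `overwritten` lands in `opt` even though no `&` appears at the use site.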

The for behavior has some advantages. Mainly, it is more explicit; you can tell exactly what the code is doing. Implicit references can look a bit weird in a few cases. Consider:

while (it.next()) |*x| {
    // ...
}

What's that snippet doing? You'd be forgiven for thinking that next returns a *?T or similar, but actually, this inserts an implicit &, meaning x is a pointer to a local temporary. If next did return *?T, you'd currently have to write:

while (it.next().*) |*x| {
    // ...
}

Okay, so we're dereferencing the pointer... but we somehow get a pointer out which is derived from it? The logic here does make sense, but it's quite weird.
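As a rough illustration of the temporary case, the sketch below uses a made-up `Countdown` iterator (not part of the proposal) whose `next` returns `?u32` by value. It assumes the current implicit-ref behavior and only reads through the captured pointer.

```zig
const std = @import("std");

// Hypothetical iterator, not from the proposal: counts down to zero.
const Countdown = struct {
    remaining: u32,

    fn next(self: *Countdown) ?u32 {
        if (self.remaining == 0) return null;
        self.remaining -= 1;
        return self.remaining;
    }
};

// Hypothetical helper: sums every value the iterator yields.
fn sumAll(start: u32) u32 {
    var it: Countdown = .{ .remaining = start };
    var sum: u32 = 0;
    // `next` returns `?u32` by value, so `x` points at a temporary
    // holding that result, not at anything stored inside `it`.
    while (it.next()) |*x| {
        sum += x.*;
    }
    return sum;
}

test "by-ref capture on a call result" {
    // Yields 2, 1, 0 before returning null.
    try std.testing.expectEqual(@as(u32, 3), sumAll(3));
}
```

Nothing at the call site signals that `x` refers to a temporary rather than to iterator state, which is the readability concern raised above.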

The implicit-ref behavior also meshes particularly strangely with switch, because any prong having a by-ref capture changes how the operand is evaluated. This means there can be arbitrary distance between the operand and the code which decides how it is evaluated, which is pretty strange.
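A small sketch of that distance, again assuming the current implicit-ref behavior. The union mirrors the one used later in this proposal; the `bump` helper is hypothetical.

```zig
const std = @import("std");

const U = union(enum) {
    foo: u32,
    bar: u64,
};

// Hypothetical helper: overwrites the payload when the `foo` field is active.
fn bump(start: U) U {
    var u = start;
    // The operand is written as a plain value...
    switch (u) {
        .bar => {},
        // ...but this by-ref capture, prongs away from the operand,
        // makes the switch operate on `u` in place.
        .foo => |*x| x.* = 456,
    }
    return u;
}

test "a by-ref prong changes how the operand is evaluated" {
    try std.testing.expectEqual(@as(u32, 456), bump(.{ .foo = 123 }).foo);
}
```

In a switch with many prongs, the `|*x|` that turns the operand into a reference can sit far from the `switch (u)` line it affects.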

Proposal

Remove the implicit-ref behavior of if, while, and switch on their operands. These constructs will now accept pointers to their operand types, and pointers will be required for by-ref captures.

Here's how this looks in practice:

var opt: ?u32 = 123;
if (&opt) |*x| x.* = 456;
// this is also allowed, although redundant, when you capture by value
if (&opt) |x| _ = x;

// and likewise for error unions
var eu: anyerror!u32 = 123;
if (&eu) |*x| {
    x.* = 456;
} else |_| {}
if (&eu) |x| {
    _ = x; // by-val capture even though the operand is a pointer
} else |_| {}

// `while` works exactly like `if`

// `switch` looks like this:
const U = union(enum) {
    foo: u32,
    bar: u64,
};
var u: U = .{ .foo = 123 };
switch (&u) {
    .foo => |*x| x.* = 456,
    .bar => |y| _ = y,
}
// again, it's okay for the captures to be by-val
switch (&u) {
    .foo => |x| _ = x,
    .bar => |y| _ = y,
}

There's one problem here: switch (&u) already has a different meaning! Currently, switch on pointers is allowed, and switches on the address. As such, this proposal also removes switch on pointers as it works today. If desired, the exact same behavior can be achieved by switching on @intFromPtr of a pointer. (I don't think switch on pointers is heavily relied upon in the wild; my source for this is that it's pretty much completely broken on the LLVM backend right now!)
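For reference, switching on an address through @intFromPtr might look like the sketch below. The addresses and the `describe` helper are hypothetical, chosen only for illustration, and the pointer is never dereferenced.

```zig
const std = @import("std");

// Hypothetical helper: the addresses are made up for illustration.
fn describe(p: *const u8) []const u8 {
    return switch (@intFromPtr(p)) {
        0x1000 => "first",
        0x2000 => "second",
        else => "other",
    };
}

test "switch on an address via @intFromPtr" {
    const p: *const u8 = @ptrFromInt(0x2000);
    try std.testing.expectEqualStrings("second", describe(p));
}
```

Because the operand is a plain integer, the prongs need integer values known at compile time, so this fits cases like fixed addresses rather than addresses of ordinary runtime variables.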

Labels: accepted, breaking, proposal