Skip to content

stabilize offset_of_enum #143954

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

fee1-dead
Copy link
Member

@fee1-dead fee1-dead commented Jul 15, 2025

Stabilization report

Closes #120141. See previous stabilization report and subsequent FCP for background

General design

What is the RFC for this feature and what changes have occurred to the user-facing design since the RFC was finalized?

The original RFC for offset_of! has a section on enums in the "Future possibilities" section. The 2024 effort to get this feature stabilized had some discussion around whether an RFC to specify this additional behavior is necessary, as this is an extension upon the existing offset_of feature. This doesn't preclude us from deciding that an RFC ends up being necessary but my personal opinion is that enough discussion has occurred and requiring an additional RFC may end up being a nuisance. But I'm only going to protest a tiny bit if there does end up a requirement for an RFC to pass first.

What behavior are we committing to that has been controversial? Summarize the major arguments pro/con.

There are mainly two, Syntax and Uninhabited variants.

Syntax

There are three kinds of syntax proposed for this issue so far. I am proposing the stabilize the current syntax.

1: Current syntax

Write the base type as first argument, then use the variant in path:

#![feature(offset_of_enum)]
fn main() {
    enum Foo {
        Bar(Bar),
    }
    
    enum Bar {
        Inner(Option<u8>),
    }
    
    println!("{}", std::mem::offset_of!(Foo, Bar.0.Inner.0.Some.0));
}

2: @nikomatsakis's alternative syntax

Pattern matching: See original comment. It wasn't entirely clear how it should be formulated

#![feature(offset_of_enum)]
fn main() {
    enum Foo {
        Bar(Bar),
    }
    
    enum Bar {
        Inner(Option<u8>),
    }
    
    println!("{}", std::mem::offset_of!(Foo @ Foo::Bar(inner)));
    // unclear how nested fields would work, here's a sketch:
    println!("{}", std::mem::offset_of!(Foo @ Foo::Bar(Bar::Inner(Some(inner)))));
}

3: Syntax from the offset_of! RFC

Write the base type and the variant as first argument:

#![feature(offset_of_enum)]
fn main() {
    enum Foo {
        Bar(Bar),
    }
    
    enum Bar {
        Inner(Option<u8>),
    }
    
    // unclear how nested enum variants would work
    println!("{}", std::mem::offset_of!(Foo::Bar, 0));
}

Syntax 3 is the most restrictive because writing the path to the variant as the first argument prevents access to nested fields (because fields can not go through variants). This may be unhelpful.

Syntax 2 is unclear. With pattern matching we can make nested field access work, but here are some downsides:

  1. It's a subset of the pattern matching syntax to make sure it is both infallible and only introduces one binding. We might have to ban some infallible patterns (Or patterns, deref/box patterns, etc), and this might be surprising (or the knowledge may not be easily transferable).
  2. It seems to want to do more than just enum variant access. The syntax will cover structs, but we already have a stable syntax for accessing struct fields. A switch to pattern matching syntax should consider replacing the existing struct field access syntax so as to not have two ways of doing a single thing.

Syntax 1 is the current syntax. It supports going into nested fields.

If we wanted to switch to a different syntax, it should either be consistent with the current one or aims to replace the current one. Syntax 1 is consistent so it should be preferred. Replacing the current one is probably a discussion if one feels the need to later.

Uninhabited variants

@steffahn commented during the last FCP about whether we should allow access to offsets in variants that contain uninhabited types. This is based on the assumption that we may find it helpful to completetly optimize the Bar variant away in the future, so that MyEnum below becomes a zero-sized type:

enum MyEnum {
    Foo,
    Bar(u32, !)
}

If we stabilized this as-is, it may allow us to write a u32 into offset_of!(MyEnum, Bar.0), thus preventing this particular optimization from ever happening.

However, it was mentioned in the tracking issue that repr(Rust) guarantees that "fields [inside a variant] do not overlap"1, "the fields can be ordered such that the offset plus the size of any field is less than or equal to the offset of the next field in the ordering", so it seems very difficult (and perhaps impossible) for us to optimize the case above while still keeping the guarantee (the size of the field 0 in MyEnum::Bar should be 4 bytes)

It should also be noted that this feature alone does not support in-place initialization for enums. It may be used together with RFC 3727 to do so, but stabilizing this feature does not itself enable such a use. More discussion in the next section.

Are there extensions to this feature that remain unstable? How do we know that we are not accidentally committing to those?

offset_of_slice is a separate feature that allows finding the offset of an unsized [T] field (normally located at the tail of a layout) and is not covered by stabilizing this feature.

This feature becomes very powerful when combined with RFC 3727 (unsafe set_discriminant as well as discriminant_of! for variants): it allows in-place initialization of enum. It also enables a form of reflection on repr(Rust) enums whereas previously it was only possible to do on repr(C). More specifically, it allows downstream users to generate metadata about an enum's variants and their fields, which include discriminants and the offsets of each field, which allows for in-place initialization of enums as well as reading the runtime value of an enum.

This would also mean that specific structural dependent code may no longer need their own procedural macros to implement their behavior as they can just read into the fields of a struct/variants of an enum with the metadata generated, i.e. we're making https://facet.rs/ more capable.

Note that stabilizing just offset_of_enum without discriminant_of! still allows someone to store a function pointer for each variant that tests whether the runtime value is of that variant.

Has a Call for Testing period been conducted? If so, what feedback was received?

Does any OSS nightly users use this feature? For instance, a useful indication might be "search <grep.app> for #![feature(FEATURE_NAME)] and had N results".

potential uses of this in the wild: https://github.com/search?q=%2F%28%3F-i%29offset_of%21%5C%28%5B%5E%5Cn%29%5D*%3F%2C+%28%5BA-Z%5D%5Ba-z%5D*%29%2B%28%5C..*%29%3F%5C%29%2F+NOT+windows_sys+NOT+ntapi+NOT+windows+NOT+%22rust-bindgen%22+NOT+ffi&type=code See also feature name search results.

Implementation quality

Summarize the major parts of the implementation and provide links into the code (or to PRs)

An example for async closures: https://rustc-dev-guide.rust-lang.org/coroutine-closures.html.

#114208

Summarize existing test coverage of this feature

Consider what the "edges" of this feature are. We're particularly interested in seeing tests that assure us about exactly what nearby things we're not stabilizing.

Within each test, include a comment at the top describing the purpose of the test and what set of invariants it intends to demonstrate. This is a great help to those reviewing the tests at stabilization time.

  • What does the test coverage landscape for this feature look like?
    • Tests for compiler errors when you use the feature wrongly or make mistakes?
    • Tests for the feature itself:
      • Limits of the feature (so failing compilation)
      • Exercises of edge cases of the feature
      • Tests that checks the feature works as expected (where applicable, //@ run-pass).
    • Are there any intentional gaps in test coverage?

Link to test folders or individual tests (ui/codegen/assembly/run-make tests, etc.).

Copied from previous stabilization report:

What outstanding bugs in the issue tracker involve this feature? Are they stabilization-blocking?

None

What FIXMEs are still in the code for that feature and why is it ok to leave them there?

None

Summarize contributors to the feature by name for recognition and assuredness that people involved in the feature agree with stabilization

@GKFX for implementing the feature; @jamesmunns for the Unsafe Set Discriminant RFC which captures many of the same audiences this feature also has; @thomcc for the original offset_of! RFC; @dingxiangfei2009 for the previous stabilization report

Participants from the previous FCP discussion: @Amanieu, @joshtriplett, @exrok, @nikomatsakis, @steffahn, @RalfJung, @hanna-kruppe, @programmerjake, @traviscross

Which tools need to be adjusted to support this feature. Has this work been done?

Consider rustdoc, clippy, rust-analyzer, rustfmt, rustup, docs.rs.

N/A

Type system and execution rules

What compilation-time checks are done that are needed to prevent undefined behavior?

(Be sure to link to tests demonstrating that these tests are being done.)

offset_of is type-checked with check_expr_offset_of

Does the feature's implementation need checks to prevent UB or is it sound by default and needs opt in in places to perform the dangerous/unsafe operations? If it is not sound by default, what is the rationale?

It is sound because it returns an offset on its own and doesn't perform unsafe operations on its own. We only have to make sure to return the correct offset, which is done through offset_through_subfield

Can users use this feature to introduce undefined behavior, or use this feature to break the abstraction of Rust and expose the underlying assembly-level implementation? (Describe.)

Not really. All UB that use this feature can be achieved without using this feature.

What updates are needed to the reference/specification? (link to PRs when they exist)

None

Common interactions

Does this feature introduce new expressions and can they produce temporaries? What are the lifetimes of those temporaries?

None

What other unstable features may be exposed by this feature?

None

Footnotes

  1. Note that the Reference text does not make it clear whether it applies to single variants or just structs. It appears to apply to variants. It can't apply across variants because fields in different variants will happily overlap.

@rustbot
Copy link
Collaborator

rustbot commented Jul 15, 2025

r? @SparrowLii

rustbot has assigned @SparrowLii.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 15, 2025
@SparrowLii
Copy link
Member

r? @Amanieu

@rustbot rustbot assigned Amanieu and unassigned SparrowLii Jul 15, 2025
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: is this missing test coverage for array indexing cases, e.g.

#![feature(offset_of_enum)]

enum Alpha {
    One([u8; 1]),
}

fn main() {
    println!("{}", std::mem::offset_of!(Alpha, One.0[0]));
    //~^ ERROR array indexing not supported in offset_of
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that error is missing a test yes - should be easy enough to add a line as in your example to one of these files. (I won't open a separate PR for that test as it would merge-conflict with this one.)

#![feature(offset_of_enum)]
use std::mem;

#[repr(u8)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussion (for the cursed code specialists): hm, does this have any interesting interactions w/ #[repr(transparent)], #[repr(u{N}/i{N})], #[repr(packed)] and #[repr(align(N))] I wonder

@Amanieu Amanieu added the I-lang-nominated Nominated for discussion during a lang team meeting. label Jul 15, 2025
@traviscross traviscross added T-lang Relevant to the language team P-lang-drag-1 Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang labels Jul 15, 2025
@traviscross
Copy link
Contributor

Regarding the pattern matching alternative...

cc @Nadrieril @dianne

@tmandry
Copy link
Member

tmandry commented Jul 15, 2025

enum MyEnum {
    Foo,
    Bar(u32, !)
}

If we stabilized this as-is, it may allow us to write a u32 into offset_of!(MyEnum, Bar.0), thus preventing this particular optimization from ever happening.

Can we leave room by making it a compiler error to get the offset of a field in an uninhabited variant?

@traviscross
Copy link
Contributor

traviscross commented Jul 16, 2025

However, it was mentioned in the tracking issue that repr(Rust) guarantees that "fields [inside a variant] do not overlap"1, "the fields can be ordered such that the offset plus the size of any field is less than or equal to the offset of the next field in the ordering", so it seems very difficult (and perhaps impossible) for us to optimize the case above while still keeping the guarantee (the size of the field 0 in MyEnum::Bar should be 4 bytes)

The Reference also says that

the only data layout guarantees made by this representation are those required for soundness.

and that it

does not imply that the fields have distinct addresses...

(It goes on about ZSTs, but it could be read as ambiguous as to whether that's just an example.)

Unless there's some other reason that making MyEnum::Bar zero-sized would violate soundness, I might be inclined to read this as simply a gap in the Reference. We don't specifically guarantee, e.g., that the size of the field matches the size of the field's type. And:

There are no other guarantees of data layout made by this representation.

@workingjubilee
Copy link
Member

workingjubilee commented Jul 16, 2025

@tmandry That would require monomorphization-time checking, wouldn't it? (Not sure that is a major point against, just a drawback.)

@traviscross
Copy link
Contributor

Note that stabilizing just offset_of_enum without discriminant_of! still allows someone to store a function pointer for each variant that tests whether the runtime value is of that variant.

If you could, please provide the example for this.

@traviscross
Copy link
Contributor

traviscross commented Jul 16, 2025

That would require monomorphization-time checking, wouldn't it? (Not sure that is a major point against, just a drawback.)

Yes, I'd expect so.

@traviscross
Copy link
Contributor

traviscross commented Jul 16, 2025

What updates are needed to the reference/specification? (link to PRs when they exist)

None

We do have an "Accessing discriminant" section where we talk about, e.g., std::mem::discriminant. I wonder whether we shouldn't also be talking about accessing field offsets. Something to think about.

cc @ehuss

@fee1-dead
Copy link
Member Author

fee1-dead commented Jul 16, 2025

The Reference also says that

the only data layout guarantees made by this representation are those required for soundness.

and that it

does not imply that the fields have distinct addresses...

(It goes on about ZSTs, but it could be read as ambiguous as to whether that's just an example.)

Unless there's some other reason that making MyEnum::Bar zero-sized would violate soundness, I might be inclined to read this as simply a gap in the Reference. We don't specifically guarantee, e.g., that the size of the field matches the size of the field's type. And:

I don't want to language lawyer here but, I believe

the only data layout guarantees made by this representation are those required for soundness.

is descriptive rather than prescriptive. But if we indeed optimized MyEnum2::Two away,

pub enum MyEnum2 {
    One,
    Two(i32, i32, !)
}

The two i32s would certainly overlap. Notwithstanding our pre-existing guarantee about fields never overlapping, is it sound to allow them anyways? Maybe, since you're supposed to never encounter it during runtime, so why allow reading or writing to the other fields at runtime at all?

Okay, but some people want to do in-place initialization. Given MyStruct is in a routine to get in-place initialized.

pub struct MyStruct {
    a: i32,
    b: i32,
    c: !,
} 

Perhaps the code will want to do some work to write to a and b first since we're writing to MaybeUninit<MyStruct> (note that the ship has already sailed since we allow offset_of on structs) (also, this is an interesting way to spell "this type can never be fully initialized but we can still use parts of it as storage"), then some code that handles c panics or something. This use case carries over to enums.

Can we leave room by making it a compiler error to get the offset of a field in an uninhabited variant?

I'd oppose that. This would make the very common use case super hard to write, e.g.:

#![feature(offset_of_enum)]

use std::mem::MaybeUninit;
use std::mem::offset_of;

pub unsafe fn in_place_init_result<T, E>(
    x: &mut MaybeUninit<Result<T, E>>,
    is_ok: bool,
    init_ok: impl FnOnce(&mut MaybeUninit<T>) -> &mut T,
    init_err: impl FnOnce(&mut MaybeUninit<E>) -> &mut E,
) -> &mut Result<T, E> {
    // pretend we did some discriminant initialization here..
    if is_ok {
        let offset = offset_of!(Result<T, E>, Ok.0);
        let r = unsafe { &mut *(x as *mut _ as *mut MaybeUninit<T>).add(offset) };
        init_ok(r);
    } else {
        let offset = offset_of!(Result<T, E>, Err.0);
        let r = unsafe { &mut *(x as *mut _ as *mut MaybeUninit<E>).add(offset) };
        init_err(r);
    };

    unsafe { x.assume_init_mut() }
}

Why should we prevent some caller such as the following?

    in_place_init_result::<i32, !>(
        &mut MaybeUninit::uninit(),
        true,
        |x| x.write(42),
        |_| unreachable!(),
    );

But to support these usages while raising post-mono errors when offset_of! is used on uninhabited variants would be incredibly tedius for the downstream user. Most of the time they might not even know unless they actually tried using the edge cases because the error is post-mono.

If you could, please provide the example for this.

As a quick sketch:

#![feature(never_type)]
#![feature(offset_of_enum)]

use std::mem::offset_of;

pub struct EnumVariant {
    pub test: fn(*const ()) -> bool,
    pub field_offsets: &'static [usize],
}

pub enum Something {
    A(i32),
    B(i32, usize, u8),
}

pub const VARIANTS: &[EnumVariant] = &[
    EnumVariant {
        test: |ptr| matches!(unsafe { &*(ptr as *const Something) }, Something::A { .. }),
        field_offsets: &[offset_of!(Something, A.0)],
    },
    EnumVariant {
        test: |ptr| matches!(unsafe { &*(ptr as *const Something) }, Something::B { .. }),
        field_offsets: &[
            offset_of!(Something, B.0),
            offset_of!(Something, B.1),
            offset_of!(Something, B.2),
        ],
    },
];

@RalfJung
Copy link
Member

The case of uninhabited variants ties this feature pretty closely to "unsafe set discriminant", so I think we have to consider the larger picture here which is in-place initialization. If we want to allow that for arbitrary enums, we pretty much have to guarantee that all fields of all variants get the space they need, even if there are also uninhabited fields in that same variant. That's the reason why we did this for structs (and that long predates offset_of).

The only alternative I see is to make in-place init for enums a language primitive of sorts so that we can generate code that works even if we change the layout to no longer reserve space for all fields in uninhabited variants.

@dianne
Copy link
Contributor

dianne commented Jul 16, 2025

Regarding the pattern matching alternative...

The pattern-matching-like syntax is interesting, but I think I prefer the current syntax. Either way, it's new syntax to learn for a new feature; making it look like pattern matching could maybe better mirror how fields' addresses are taken, but fundamentally offset_of! is not pattern matching. The field-projection-like syntax better represents what it's doing and doesn't suggest it can do things it can't.

It's a little funny that many enum variants use tuple-like syntax with the expectation they're matched on, so there'll be lots of .0s. I think the .0s look better than the parentheses, ::s, _s, ..s, etc. needed for a pattern-like syntax though.

@jamesmunns
Copy link
Member

For what it's worth for precedence, we can already get offset_of/size_of on stable for unions, which would have the same issues wrt size on eliminatable "variants":

// #![feature(never_type)]
// type Never = !;
// note: results are the same if you use `!` on nightly
type Never = core::convert::Infallible;

union Example {
    small: u32,
    large: (u8, [u8; 128], Never),
}

fn main() {
    println!("{}", core::mem::size_of::<Example>());
    println!("{}", core::mem::offset_of!(Example, large.2))
}

which prints

132
129

This means if people are "hand rolling" enums with unions + integer discriminants, it would probably cause issues if we eliminated uninhabited union variants (wrt total/max size).

@RalfJung
Copy link
Member

Unions and enums are very different. So I disagree about this somehow carrying over from unions to enums.

@Dirbaio
Copy link
Contributor

Dirbaio commented Jul 16, 2025

Maybe we can have offset_of_enum!() instead, which returns Option<usize>? It'd return None if the variant is optimized out due to containing uninhabited types.

@hanna-kruppe
Copy link
Contributor

Returning Option<usize> was discussed on the tracking issue during the last stabilization attempt, I described there what I see as downsides.

@scottmcm
Copy link
Member

If we stabilized this as-is, it may allow us to write a u32 into offset_of!(MyEnum, Bar.0), thus preventing this particular optimization from ever happening.

That's ok; we've already essentially decided that we're not going to ever be able to do that because we've already decided that writing a u32 into a MaybeUninit<(u32, !)> is important and useful and thus size_of::<(u32, !)>() is staying 4, not zero.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-lang-nominated Nominated for discussion during a lang team meeting. P-lang-drag-1 Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tracking Issue for enum access in offset_of