-
Notifications
You must be signed in to change notification settings - Fork 13.5k
stabilize offset_of_enum
#143954
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
stabilize offset_of_enum
#143954
Conversation
r? @SparrowLii rustbot has assigned @SparrowLii. Use |
r? @Amanieu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: is this missing test coverage for array indexing cases, e.g.
#![feature(offset_of_enum)]
enum Alpha {
One([u8; 1]),
}
fn main() {
println!("{}", std::mem::offset_of!(Alpha, One.0[0]));
//~^ ERROR array indexing not supported in offset_of
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that error is missing a test yes - should be easy enough to add a line as in your example to one of these files. (I won't open a separate PR for that test as it would merge-conflict with this one.)
#![feature(offset_of_enum)] | ||
use std::mem; | ||
|
||
#[repr(u8)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussion (for the cursed code specialists): hm, does this have any interesting interactions w/ #[repr(transparent)]
, #[repr(u{N}/i{N})]
, #[repr(packed)]
and #[repr(align(N))]
I wonder
Regarding the pattern matching alternative... |
Can we leave room by making it a compiler error to get the offset of a field in an uninhabited variant? |
The Reference also says that
and that it
(It goes on about ZSTs, but it could be read as ambiguous as to whether that's just an example.) Unless there's some other reason that making
|
@tmandry That would require monomorphization-time checking, wouldn't it? (Not sure that is a major point against, just a drawback.) |
If you could, please provide the example for this. |
Yes, I'd expect so. |
We do have an "Accessing discriminant" section where we talk about, e.g., cc @ehuss |
I don't want to language lawyer here but, I believe
is descriptive rather than prescriptive. But if we indeed optimized pub enum MyEnum2 {
One,
Two(i32, i32, !)
} The two Okay, but some people want to do in-place initialization. Given pub struct MyStruct {
a: i32,
b: i32,
c: !,
} Perhaps the code will want to do some work to write to
I'd oppose that. This would make the very common use case super hard to write, e.g.: #![feature(offset_of_enum)]
use std::mem::MaybeUninit;
use std::mem::offset_of;
pub unsafe fn in_place_init_result<T, E>(
x: &mut MaybeUninit<Result<T, E>>,
is_ok: bool,
init_ok: impl FnOnce(&mut MaybeUninit<T>) -> &mut T,
init_err: impl FnOnce(&mut MaybeUninit<E>) -> &mut E,
) -> &mut Result<T, E> {
// pretend we did some discriminant initialization here..
if is_ok {
let offset = offset_of!(Result<T, E>, Ok.0);
let r = unsafe { &mut *(x as *mut _ as *mut MaybeUninit<T>).add(offset) };
init_ok(r);
} else {
let offset = offset_of!(Result<T, E>, Err.0);
let r = unsafe { &mut *(x as *mut _ as *mut MaybeUninit<E>).add(offset) };
init_err(r);
};
unsafe { x.assume_init_mut() }
} Why should we prevent some caller such as the following? in_place_init_result::<i32, !>(
&mut MaybeUninit::uninit(),
true,
|x| x.write(42),
|_| unreachable!(),
); But to support these usages while raising post-mono errors when
As a quick sketch: #![feature(never_type)]
#![feature(offset_of_enum)]
use std::mem::offset_of;
pub struct EnumVariant {
pub test: fn(*const ()) -> bool,
pub field_offsets: &'static [usize],
}
pub enum Something {
A(i32),
B(i32, usize, u8),
}
pub const VARIANTS: &[EnumVariant] = &[
EnumVariant {
test: |ptr| matches!(unsafe { &*(ptr as *const Something) }, Something::A { .. }),
field_offsets: &[offset_of!(Something, A.0)],
},
EnumVariant {
test: |ptr| matches!(unsafe { &*(ptr as *const Something) }, Something::B { .. }),
field_offsets: &[
offset_of!(Something, B.0),
offset_of!(Something, B.1),
offset_of!(Something, B.2),
],
},
]; |
The case of uninhabited variants ties this feature pretty closely to "unsafe set discriminant", so I think we have to consider the larger picture here which is in-place initialization. If we want to allow that for arbitrary enums, we pretty much have to guarantee that all fields of all variants get the space they need, even if there are also uninhabited fields in that same variant. That's the reason why we did this for structs (and that long predates The only alternative I see is to make in-place init for enums a language primitive of sorts so that we can generate code that works even if we change the layout to no longer reserve space for all fields in uninhabited variants. |
The pattern-matching-like syntax is interesting, but I think I prefer the current syntax. Either way, it's new syntax to learn for a new feature; making it look like pattern matching could maybe better mirror how fields' addresses are taken, but fundamentally It's a little funny that many enum variants use tuple-like syntax with the expectation they're matched on, so there'll be lots of |
For what it's worth for precedence, we can already get offset_of/size_of on stable for unions, which would have the same issues wrt size on eliminatable "variants": // #![feature(never_type)]
// type Never = !;
// note: results are the same if you use `!` on nightly
type Never = core::convert::Infallible;
union Example {
small: u32,
large: (u8, [u8; 128], Never),
}
fn main() {
println!("{}", core::mem::size_of::<Example>());
println!("{}", core::mem::offset_of!(Example, large.2))
}
This means if people are "hand rolling" enums with unions + integer discriminants, it would probably cause issues if we eliminated uninhabited union variants (wrt total/max size). |
Unions and enums are very different. So I disagree about this somehow carrying over from unions to enums. |
Maybe we can have |
Returning |
That's ok; we've already essentially decided that we're not going to ever be able to do that because we've already decided that writing a |
Stabilization report
Closes #120141. See previous stabilization report and subsequent FCP for background
The original RFC for
offset_of!
has a section on enums in the "Future possibilities" section. The 2024 effort to get this feature stabilized had some discussion around whether an RFC to specify this additional behavior is necessary, as this is an extension upon the existingoffset_of
feature. This doesn't preclude us from deciding that an RFC ends up being necessary but my personal opinion is that enough discussion has occurred and requiring an additional RFC may end up being a nuisance. But I'm only going to protest a tiny bit if there does end up a requirement for an RFC to pass first.There are mainly two, Syntax and Uninhabited variants.
Syntax
There are three kinds of syntax proposed for this issue so far. I am proposing the stabilize the current syntax.
1: Current syntax
Write the base type as first argument, then use the variant in path:
2: @nikomatsakis's alternative syntax
Pattern matching: See original comment. It wasn't entirely clear how it should be formulated
3: Syntax from the
offset_of!
RFCWrite the base type and the variant as first argument:
Syntax 3 is the most restrictive because writing the path to the variant as the first argument prevents access to nested fields (because fields can not go through variants). This may be unhelpful.
Syntax 2 is unclear. With pattern matching we can make nested field access work, but here are some downsides:
Syntax 1 is the current syntax. It supports going into nested fields.
If we wanted to switch to a different syntax, it should either be consistent with the current one or aims to replace the current one. Syntax 1 is consistent so it should be preferred. Replacing the current one is probably a discussion if one feels the need to later.
Uninhabited variants
@steffahn commented during the last FCP about whether we should allow access to offsets in variants that contain uninhabited types. This is based on the assumption that we may find it helpful to completetly optimize the
Bar
variant away in the future, so thatMyEnum
below becomes a zero-sized type:If we stabilized this as-is, it may allow us to write a
u32
intooffset_of!(MyEnum, Bar.0)
, thus preventing this particular optimization from ever happening.However, it was mentioned in the tracking issue that
repr(Rust)
guarantees that "fields [inside a variant] do not overlap"1, "the fields can be ordered such that the offset plus the size of any field is less than or equal to the offset of the next field in the ordering", so it seems very difficult (and perhaps impossible) for us to optimize the case above while still keeping the guarantee (the size of the field0
inMyEnum::Bar
should be 4 bytes)It should also be noted that this feature alone does not support in-place initialization for enums. It may be used together with RFC 3727 to do so, but stabilizing this feature does not itself enable such a use. More discussion in the next section.
offset_of_slice
is a separate feature that allows finding the offset of an unsized[T]
field (normally located at the tail of a layout) and is not covered by stabilizing this feature.This feature becomes very powerful when combined with RFC 3727 (unsafe
set_discriminant
as well asdiscriminant_of!
for variants): it allows in-place initialization of enum. It also enables a form of reflection onrepr(Rust)
enums whereas previously it was only possible to do onrepr(C)
. More specifically, it allows downstream users to generate metadata about an enum's variants and their fields, which include discriminants and the offsets of each field, which allows for in-place initialization of enums as well as reading the runtime value of an enum.This would also mean that specific structural dependent code may no longer need their own procedural macros to implement their behavior as they can just read into the fields of a struct/variants of an enum with the metadata generated, i.e. we're making https://facet.rs/ more capable.
Note that stabilizing just
offset_of_enum
withoutdiscriminant_of!
still allows someone to store a function pointer for each variant that tests whether the runtime value is of that variant.potential uses of this in the wild: https://github.com/search?q=%2F%28%3F-i%29offset_of%21%5C%28%5B%5E%5Cn%29%5D*%3F%2C+%28%5BA-Z%5D%5Ba-z%5D*%29%2B%28%5C..*%29%3F%5C%29%2F+NOT+windows_sys+NOT+ntapi+NOT+windows+NOT+%22rust-bindgen%22+NOT+ffi&type=code See also feature name search results.
#114208
Copied from previous stabilization report:
enum
senum
sNone
None
@GKFX for implementing the feature; @jamesmunns for the Unsafe Set Discriminant RFC which captures many of the same audiences this feature also has; @thomcc for the original
offset_of!
RFC; @dingxiangfei2009 for the previous stabilization reportParticipants from the previous FCP discussion: @Amanieu, @joshtriplett, @exrok, @nikomatsakis, @steffahn, @RalfJung, @hanna-kruppe, @programmerjake, @traviscross
N/A
offset_of
is type-checked withcheck_expr_offset_of
It is sound because it returns an offset on its own and doesn't perform unsafe operations on its own. We only have to make sure to return the correct offset, which is done through
offset_through_subfield
Not really. All UB that use this feature can be achieved without using this feature.
None
None
None
Footnotes
Note that the Reference text does not make it clear whether it applies to single variants or just structs. It appears to apply to variants. It can't apply across variants because fields in different variants will happily overlap. ↩