From 3a4f5ac8e22b96de59da62fa8a5a570f1f701ed6 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Wed, 12 Jun 2024 19:13:23 -0400 Subject: [PATCH 01/42] Add `align_attr` RFC --- text/3806-align-attr.md | 377 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 377 insertions(+) create mode 100644 text/3806-align-attr.md diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md new file mode 100644 index 00000000000..8f838455d53 --- /dev/null +++ b/text/3806-align-attr.md @@ -0,0 +1,377 @@ +- Feature Name: `align_attr` +- Start Date: 2025-05-01 +- RFC PR: [rust-lang/rfcs#3806](https://github.com/rust-lang/rfcs/pull/3806) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +Add an `#[align(…)]` attribute to set the minimum alignment of `struct` and `enum` +fields, `static`s, functions, and local variables. + +# Motivation +[motivation]: #motivation + +## Bindings to C and C++ + +[C](https://en.cppreference.com/w/c/language/_Alignas) and [C++](https://en.cppreference.com/w/cpp/language/alignas) +provide an `alignas` modifier to set the alignment of specific struct fields. To +represent such structures in Rust, `bindgen` is sometimes forced to add explicit +padding fields: + +```c +// C code +#include <stdint.h> +struct foo { + uint8_t x; + _Alignas(128) uint8_t y; + uint8_t z; +}; +``` + +```rust +// Rust bindings generated by `bindgen` +#[repr(C, align(128))] +pub struct foo { + pub x: u8, + pub __bindgen_padding_0: [u8; 127usize], + pub y: u8, + pub z: u8, +} +``` + +The `__bindgen_padding_0` field makes the generated bindings more confusing and +less ergonomic. Also, it is unsound: the padding should be using `MaybeUninit`. +And even then, there is no guarantee of ABI compatibility on all potential +platforms. 
+ +## Packing values into fewer cache lines + +When working with large values (lookup tables, for example), it is often +desirable, for optimal performance, to pack them into as few cache lines as +possible. One way of doing this is to force the alignment of the value to be at +least the size of the cache line, or perhaps the greatest common divisor of +the value and cache line sizes. + +The simplest way of accomplishing this in Rust today is to use a wrapper struct +with a `#[repr(align(…))]` attribute: + +```rust +type SomeLargeType = [[u8; 64]; 21]; + +#[repr(align(128))] +struct CacheAligned<T>(T); + +static LOOKUP_TABLE: CacheAligned<SomeLargeType> = CacheAligned(SomeLargeType { + data: todo!(), +}); +``` + +However, this approach has several downsides: + +- It requires defining a separate wrapper type. +- It changes the type of the item, which may not be allowed if it is part of the + crate's public API. +- It may add padding to the value, which might not be necessary or desirable. + +In some cases, it can also improve performance to align function items in the +same way. + +## Required alignment for low-level use cases + +Some low-level use-cases (for example, the [RISC-V `mtvec` +register](https://five-embeddev.com/riscv-priv-isa-manual/Priv-v1.12/machine.html#machine-trap-vector-base-address-register-mtvec)) +require functions or statics to have a certain minimum alignment. + +## Interoperating with systems that have types where size is not a multiple of alignment + +In Rust, a type’s size is always a multiple of its alignment. However, there are +other languages that can interoperate with Rust, where this is not the case +(WGSL, for example). It’s important for Rust to be able to represent such +structures. + +# Explanation +[explanation]: #explanation + +The `align` attribute is a new inert, built-in attribute that can be applied to +ADT fields, `static` items, function items, and local variable declarations. 
The +attribute accepts a single required parameter, which must be a power-of-2 +integer literal from 1 up to 2<sup>29</sup>. (This is the same as +`#[repr(align(…))]`.) + +Multiple `align` attributes may be present on the same item; the highest +alignment among them will be used. The compiler may signal this case with a +warn-by-default lint. + +## On ADT fields + +The `align` attribute may be applied to any field of any `struct`, `enum`, or +`union` that is not `#[repr(transparent)]`. + +```rust +#[repr(C)] +struct Foo { + #[align(8)] + a: u32, +} + +enum Bar { + Variant(#[align(16)] u128), +} + +union Baz { + #[align(16)] + a: u32, +} +``` + +The effect of the attribute is to force the address of the field to have at +least the specified alignment. If the field already has at least that +alignment, due to the required alignment of its type or to a `repr` attribute on +the containing type, the attribute has no effect. + +In contrast to a `repr(align(…))` wrapper struct, an `align` annotation does *not* +necessarily add extra padding to force the field to have a size that is a +multiple of its alignment. (The size of the containing ADT must still be a +multiple of its alignment; that hasn't changed.) + +`align` attributes for fields of a `#[repr(packed(n))]` ADT may not specify an +alignment higher than `n`. + +```rust +#[repr(packed(4))] +struct Sardines { + #[align(2)] // OK + a: u8, + #[align(4)] // OK + b: u16, + #[align(8)] //~ ERROR + c: u32, +} +``` + +`align()` attributes on ADT fields are shown in `rustdoc`-generated documentation. + +## Interaction with `repr(C)` + +`repr(C)` currently has two contradictory meanings: “a simple, linear layout +algorithm that works the same everywhere” and “an ABI matching that of the +target’s standard C compiler”. This RFC does not aim to resolve that conflict; +that is being discussed as part of [RFC +3718](https://github.com/rust-lang/rfcs/pull/3718). 
Henceforth, we will use +`repr(C_for_real)` to denote “match the system C compiler”, and `repr(linear)` +to denote “simple, portable layout algorithm”; but those names are not +normative. + +### `repr(C_for_real)` + +The layout of a `repr(C_for_real)` ADT with `align` attributes on its fields is +identical to that of the corresponding C ADT declared with `alignas` +annotations. For example, the struct below is equivalent to the C `struct foo` +from the motivation section: + +```rust +#[repr(C_for_real)] +pub struct foo { + pub x: u8, + #[align(128)] + pub y: u8, + pub z: u8, +} +``` + +### `repr(linear)` + +In a `repr(linear)` ADT, a field with an `align` attribute has its alignment, as +well as the alignment of the containing ADT, increased to at least what the +attribute specifies. + +For example, the following two structs have the same layout in memory (though +not necessarily the same ABI): + +```rust +#[repr(linear)] +pub struct foo { + pub x: u8, + #[align(128)] + pub y: u8, + pub z: u8, +} +``` + +```rust +#[repr(linear, align(128))] +pub struct foo2 { + pub x: u8, + pub _padding: [MaybeUninit<u8>; 127usize], + pub y: u8, + pub z: u8, +} +``` + +## On `static`s + +Any `static` item (including `static`s inside `extern` blocks) may have an +`align` attribute applied: + +```rust +#[align(32)] +static BAZ: [u32; 12] = [0xDEADBEEF; 12]; + +unsafe extern "C" { + #[align(2)] + safe static BOZZLE: u8; +} +``` + +The effect of the attribute is to force the `static` to be stored with at least +the specified alignment. The attribute does not force padding bytes to be added +after the `static`. For `static`s inside `unsafe extern` blocks, if the `static` +does not meet the specified alignment, the behavior is undefined. (For +misaligned `static` items declared inside old-style `extern` blocks, UB occurs +only if the item is used.) 
+ +The `align` attribute may also be applied to thread-local `static`s created with +the `thread_local!` macro; the attribute affects the alignment of the underlying +value, not that of the outer `std::thread::LocalKey`. + +```rust +thread_local! { + #[align(64)] + static FOO: u8 = 42; +} + +fn main() { + FOO.with(|r| { + let p: *const u8 = r; + assert_eq!(p.align_offset(64), 0); + }); +} +``` + +`align()` attributes on `static`s are shown in `rustdoc`-generated documentation. + +## On function items + +On function items, `#[align(…)]` sets the alignment of the function’s code. This +replaces `#[repr(align(…))]` on function items from `#![feature(fn_align)]`. + +`align` attributes on function items are shown in `rustdoc`-generated documentation. + +## On local variables + +The `align` attribute may also be applied to local variable declarations inside +`let` bindings. The attribute forces the local to have at least the alignment +specified: + +```rust +fn main() { + let (a, #[align(4)] b, #[align(2)] mut c) = (4u8, 2u8, 1u8); + c *= 2; + dbg!(a, b, c); + + if let Some(#[align(4)] x @ 1..) = Some(42u8) { + dbg!(x); + let p: *const u8 = &x; + assert_eq!(p.align_offset(4), 0); + } +} +``` + +`align` attributes may not be applied to function parameters. + +```rust +fn foo(#[align(8)] _a: u32) {} //~ ERROR +``` + +They also may not be applied to `_` bindings. + +```rust +let #[align(4)] _ = true; //~ ERROR +``` + +# Drawbacks +[drawbacks]: #drawbacks + +- This feature adds additional complexity to the language. +- The distinction between `align` and `repr(align)` may be confusing for users. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +Compared to the wrapper type approach, the `align` attribute adds additional +flexibility, because it does not force the insertion of padding. 
If we don't +adopt this feature, `bindgen` will continue to generate suboptimal bindings, and +users will continue to be forced to choose between suboptimal alignment and +additional padding. + +## `#[align(…)]` vs `#[repr(align(…))]` + +One potential alternative would be to use `#[repr(align(…))]` everywhere, +instead of introducing a new attribute. + +Benefits of this alternative: + +- No new attribute polluting the namespace. +- Requesting a certain alignment is spelled the same everywhere. +- `#[repr(…)]` on fields might accept additional options in the future. + +Drawbacks: + +- `#[repr(align(…))]` is a longer and noisier syntax. +- `#[repr(…)]` on non-ADTs, with the possible exception of field definitions, will + probably only ever accept `align(…)` as an argument. It would not be consistent + with the existing `#[repr(…)]` on ADTs. +- `#[align(…)]` *only* aligns, while `#[repr(align(…))]` also pads to a multiple + of the alignment. Having different syntax makes that distinction more clear. + +## `#[align(n)]` vs `#[align = n]` + +`align = n` might be misinterpreted as requesting an alignment of *exactly* `n`, +instead of *at least* `n`. + +# Prior art +[prior-art]: #prior-art + +This proposal is the Rust equivalent of [C +`alignas`](https://en.cppreference.com/w/c/language/_Alignas_). + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +1. What should the syntax be for applying the `align` attribute to `ref`/`ref + mut` bindings? + + - Option A: the attribute goes inside the `ref`/`ref mut`. + +```rust +fn foo(x: &u8) { + let ref #[align(4)] _a = *x; +} +``` + + - Option B: the attribute goes outside the `ref`/`ref mut`. + +```rust +fn foo(x: &u8) { + let #[align(4)] ref _a = *x; +} +``` + +(I believe the simplest option is to forbid this combination entirely for now.) + +2. Does MSVC do something weird with `alignas`? 
In other words, is the concern + about `repr(C)` vs `repr(linear)` purely theoretical at this point, or does + it matter in practice today? + +# Future possibilities +[future-possibilities]: #future-possibilities + +- The `align(…)` and `repr(align(…))` attributes currently accept only integer + literals as parameters. In the future, they could support `const` expressions + as well. +- We could provide additional facilities for controlling the layout of ADTs; for + example, a way to specify exact field offsets or arbitrary padding. +- We could add type-safe APIs for over-aligned pointers; for example, + over-aligned reference types that are subtypes of `&`/`&mut`. +- We could also add similar APIs for over-aligned function pointers. From 4badbc9c639974c2c9c23ee6a4d8a1d5a9f10c73 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 2 May 2025 09:42:19 -0400 Subject: [PATCH 02/42] Add more mixing packed and aligned to future possibilities --- text/3806-align-attr.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 8f838455d53..a250531bf0d 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -375,3 +375,6 @@ fn foo(x: &u8) { - We could add type-safe APIs for over-aligned pointers; for example, over-aligned reference types that are subtypes of `&`/`&mut`. - We could also add similar APIs for over-aligned function pointers. +- We could loosen the restriction that fields of a `packed(n)` struct cannot + specify an alignment greater than `n`. (Apparently, some C compilers allow + something similar.) 
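A note on the `packed(n)` restriction that this patch mentions: the following is a minimal sketch on stable Rust, without the proposed attribute, of why a field of a `packed(n)` struct cannot currently be promised more than `n`-byte alignment. `#[repr(packed(n))]` caps the alignment of the struct itself at `n`, so its base address is only guaranteed to be a multiple of `n`, and no field offset can make up the difference. The field types are adapted from the RFC's `Sardines` example for illustration (`c` is widened to `u64`, and `repr(C)` is added so the offsets are deterministic); `sardines_layout` is a hypothetical helper, not part of the RFC.

```rust
use std::mem::{align_of, offset_of, size_of};

// Stable-Rust analogue of the RFC's `Sardines` example, without `#[align]`.
// `repr(C)` is added here so that field order and offsets are deterministic.
#[repr(C, packed(4))]
#[allow(dead_code)]
struct Sardines {
    a: u8,  // natural alignment 1
    b: u16, // natural alignment 2
    c: u64, // natural alignment 8, capped to 4 by `packed(4)`
}

/// Hypothetical helper: (offset of `b`, offset of `c`, size, alignment).
fn sardines_layout() -> (usize, usize, usize, usize) {
    (
        offset_of!(Sardines, b),
        offset_of!(Sardines, c),
        size_of::<Sardines>(),
        align_of::<Sardines>(),
    )
}

fn main() {
    let (b_off, c_off, size, align) = sardines_layout();
    // The whole struct is only 4-aligned...
    assert_eq!(align, 4);
    // ...so `c` sits on a 4-byte boundary rather than an 8-byte one.
    assert_eq!((b_off, c_off, size), (2, 4, 12));
    println!("b @ {b_off}, c @ {c_off}, size {size}, align {align}");
}
```

Loosening the restriction, as the new bullet suggests, would therefore also require deciding what happens to the alignment of the containing struct.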
From 2b647a812333608dd7d75432de755e2b2c30fef3 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Mon, 5 May 2025 00:23:47 -0400 Subject: [PATCH 03/42] Avoid making commitments for interaction with `extern static` --- text/3806-align-attr.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index a250531bf0d..cb03ad422b1 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -227,9 +227,10 @@ unsafe extern "C" { The effect of the attribute is to force the `static` to be stored with at least the specified alignment. The attribute does not force padding bytes to be added after the `static`. For `static`s inside `unsafe extern` blocks, if the `static` -does not meet the specified alignment, the behavior is undefined. (For -misaligned `static` items declared inside old-style `extern` blocks, UB occurs -only if the item is used.) +does not meet the specified alignment, the behavior is undefined. (This UB is +analogous to the UB that can result if the static item is not a valid value of +its type. The question of whether the UB can occur even if the item is unused, +has the same answer for both cases.) The `align` attribute may also be applied to thread-local `static`s created with the `thread_local!` macro; the attribute affects the alignment of the underlying From 13cc4f2650e503ab33b6972203e2aebfd91d31ab Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Thu, 8 May 2025 00:51:16 -0400 Subject: [PATCH 04/42] Expand comparison with C and C++ Also, justify prohibition on fn params. 
--- text/3806-align-attr.md | 42 ++++++++++++++++++++++++++++++----------- 1 file changed, 31 insertions(+), 11 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index cb03ad422b1..8d545f08d81 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -99,9 +99,9 @@ attribute accepts a single required parameter, which must be a power-of-2 integer literal from 1 up to 2<sup>29</sup>. (This is the same as `#[repr(align(…))]`.) -Multiple `align` attributes may be present on the same item; the highest -alignment among them will be used. The compiler may signal this case with a -warn-by-default lint. +Multiple instances of the `align` attribute may be present on the same item; the +highest alignment among them will be used. The compiler may signal this case +with a warn-by-default lint. ## On ADT fields @@ -135,8 +135,8 @@ necessarily add extra padding to force the field to have a size that is a multiple of its alignment. (The size of the containing ADT must still be a multiple of its alignment; that hasn't changed.) -`align` attributes for fields of a `#[repr(packed(n))]` ADT may not specify an -alignment higher than `n`. +Instances of the `align` attribute for fields of a `#[repr(packed(n))]` ADT may +not specify an alignment higher than `n`. ```rust #[repr(packed(4))] @@ -150,7 +150,7 @@ struct Sardines { } ``` -`align()` attributes on ADT fields are shown in `rustdoc`-generated documentation. +`align` attributes on ADT fields are shown in `rustdoc`-generated documentation. ## Interaction with `repr(C)` @@ -250,14 +250,15 @@ fn main() { } ``` -`align()` attributes on `static`s are shown in `rustdoc`-generated documentation. +`align` attributes on `static`s are shown in `rustdoc`-generated documentation. ## On function items On function items, `#[align(…)]` sets the alignment of the function’s code. This replaces `#[repr(align(…))]` on function items from `#![feature(fn_align)]`. 
-`align` attributes on function items are shown in `rustdoc`-generated documentation. +`align` attributes on function items are shown in `rustdoc`-generated +documentation. ## On local variables @@ -279,7 +280,7 @@ fn main() { } ``` -`align` attributes may not be applied to function parameters. +The `align` attribute may not be applied to function parameters. ```rust fn foo(#[align(8)] _a: u32) {} //~ ERROR @@ -331,11 +332,30 @@ Drawbacks: `align = n` might be misinterpreted as requesting an alignment of *exactly* `n`, instead of *at least* `n`. +## `#[align(…)]` on function parameters + +We could choose to allow this. However, this RFC specifies that it should be +rejected, because users might incorrectly think the attribute affects ABI when +it does not. C and C++ make the same choice. + # Prior art [prior-art]: #prior-art -This proposal is the Rust equivalent of [C -`alignas`](https://en.cppreference.com/w/c/language/_Alignas_). +This proposal is the Rust equivalent of +[C](https://en.cppreference.com/w/c/language/_Alignas_) and +[C++](https://en.cppreference.com/w/cpp/language/alignas) `alignas`. + +There are a few significant semantic differences between those features and this +RFC: + +- `#[align]` additionally allows applying the attribute to function item + declarations, which `alignas` does not permit. +- C++, but not C, allows applying `alignas` to type declarations, like Rust’s + `repr(align)`; this RFC does not permit that usage. +- `alignas(n)` accepts any integer constant expression or type name for `n`; + this RFC accepts only integer literals (for now). +- `alignas(n)` allows `n` to be zero, in which case the specifier is ignored; + this RFC does not permit that usage. 
# Unresolved questions [unresolved-questions]: #unresolved-questions From bbe09ee15765ee854bf7e533dc6b7481ab6160e1 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Thu, 8 May 2025 23:43:35 -0400 Subject: [PATCH 05/42] Minor clarifications --- text/3806-align-attr.md | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 8d545f08d81..00b1bfbce4b 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -43,7 +43,7 @@ pub struct foo { The `__bindgen_padding_0` field makes the generated bindings more confusing and less ergonomic. Also, it is unsound: the padding should be using `MaybeUninit`. And even then, there is no guarantee of ABI compatibility on all potential -platforms. +targets. ## Packing values into fewer cache lines @@ -74,7 +74,7 @@ However, this approach has several downsides: crate's public API. - It may add padding to the value, which might not be necessary or desirable. -In some cases, it can also improve performance to align function items in the +In some cases, it can also improve performance to align a function's code in the same way. ## Required alignment for low-level use cases @@ -130,10 +130,11 @@ least the specified alignment. If the field already has at least that alignment, due to the required alignment of its type or to a `repr` attribute on the containing type, the attribute has no effect. -In contrast to a `repr(align(…))` wrapper struct, an `align` annotation does *not* -necessarily add extra padding to force the field to have a size that is a +In contrast to a `repr(align(…))` wrapper struct, an `align` annotation does +*not* necessarily add extra padding to force the field to have a size that is a multiple of its alignment. (The size of the containing ADT must still be a -multiple of its alignment; that hasn't changed.) +multiple of its alignment, which must in turn be no less than that of the +most-aligned field. That hasn't changed.) 
Instances of the `align` attribute for fields of a `#[repr(packed(n))]` ADT may not specify an alignment higher than `n`. @@ -254,8 +255,10 @@ fn main() { ## On function items -On function items, `#[align(…)]` sets the alignment of the function’s code. This -replaces `#[repr(align(…))]` on function items from `#![feature(fn_align)]`. +On function items, `#[align(…)]` sets the alignment of the function’s code. (It +does not affect the alignment of its function item type, which remains a +1-ZST.) This replaces `#[repr(align(…))]` on function items, from +`#![feature(fn_align)]`. `align` attributes on function items are shown in `rustdoc`-generated documentation. From 81748b661f6a6d3858ca33ac4fe9bcf4e165112a Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Thu, 8 May 2025 23:49:05 -0400 Subject: [PATCH 06/42] Delete double spaces --- text/3806-align-attr.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 00b1bfbce4b..cd7271b76cd 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -256,8 +256,8 @@ fn main() { ## On function items On function items, `#[align(…)]` sets the alignment of the function’s code. (It -does not affect the alignment of its function item type, which remains a -1-ZST.) This replaces `#[repr(align(…))]` on function items, from +does not affect the alignment of its function item type, which remains a 1-ZST.) +This replaces `#[repr(align(…))]` on function items, from `#![feature(fn_align)]`. `align` attributes on function items are shown in `rustdoc`-generated @@ -312,7 +312,7 @@ additional padding. ## `#[align(…)]` vs `#[repr(align(…))]` -One potential alternative would be to use `#[repr(align(…))]` everywhere, +One potential alternative would be to use `#[repr(align(…))]` everywhere, instead of introducing a new attribute. 
Benefits of this alternative: From 51b8069cde7d24610471eac609e6605ab5c1503b Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 9 May 2025 00:23:55 -0400 Subject: [PATCH 07/42] Thumb weirdness, expand discussion of `repr` --- text/3806-align-attr.md | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index cd7271b76cd..92f3ab84432 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -260,6 +260,12 @@ does not affect the alignment of its function item type, which remains a 1-ZST.) This replaces `#[repr(align(…))]` on function items, from `#![feature(fn_align)]`. +The numerical value of a function pointer to function with an `#[align(n)]` +attribute is *not* always guaranteed to be a multiple of `n` on all targets. For +example, on 32-bit ARM, the low bit of the function pointer is set for functions +using the Thumb instruction set, even though the actual code of the function is +always aligned to at least 2 bytes. + `align` attributes on function items are shown in `rustdoc`-generated documentation. @@ -319,14 +325,20 @@ Benefits of this alternative: - No new attribute polluting the namespace. - Requesting a certain alignment is spelled the same everywhere. -- `#[repr(…)]` on fields might accept additional options in the future. +- `#[repr(…)]` on fields might accept additional options in the future, for + specifying layout and padding more precisely. +- `#[repr(…)]` on statics and function items could also in theory take on new + roles in the future. For example, `#[instruction_set(…)]` could become + `#[repr(instruction_set(…))]`, and/or `export_name` could become + `#[repr(export_name(…))]`. Drawbacks: - `#[repr(align(…))]` is a longer and noisier syntax. -- `#[repr(…)]` on non-ADTs, with the possible exception of field definitions, will - probably only ever accept `align(…)` as an argument. 
It would not be consistent - with the existing `#[repr(…)]` on ADTs. +- `#[repr(…)]` on non-ADTs, with the possible exception of field definitions, + will probably only ever accept `align(…)` as an argument, unless we choose to + overturn the precedent of e.g. `#[instruction_set(…)]`. It would certainly not + be consistent with the existing `#[repr(…)]` on ADTs. - `#[align(…)]` *only* aligns, while `#[repr(align(…))]` also pads to a multiple of the alignment. Having different syntax makes that distinction more clear. From 855a766d255d6fa89f50b2c3bc0c85fcc427e591 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 9 May 2025 00:37:25 -0400 Subject: [PATCH 08/42] Rephrase --- text/3806-align-attr.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 92f3ab84432..3b1b93526ec 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -327,18 +327,18 @@ Benefits of this alternative: - Requesting a certain alignment is spelled the same everywhere. - `#[repr(…)]` on fields might accept additional options in the future, for specifying layout and padding more precisely. -- `#[repr(…)]` on statics and function items could also in theory take on new - roles in the future. For example, `#[instruction_set(…)]` could become - `#[repr(instruction_set(…))]`, and/or `export_name` could become - `#[repr(export_name(…))]`. +- `#[repr(…)]` on function items could also accept `instruction_set(…)` as an + argument, replacing the existing attribute of that name. Drawbacks: - `#[repr(align(…))]` is a longer and noisier syntax. -- `#[repr(…)]` on non-ADTs, with the possible exception of field definitions, - will probably only ever accept `align(…)` as an argument, unless we choose to - overturn the precedent of e.g. `#[instruction_set(…)]`. It would certainly not - be consistent with the existing `#[repr(…)]` on ADTs. 
+- `#[repr(…)]` on non-ADTs would never accept the same set of options as on + ADTs. On field definitions, it might accept additional options to precisely + control layout; on function items, it might accept `instruction_set(…)`, if we + were to overturn the precedent of that being a standalone attribute. On + statics and local variables, I doubt it would ever accept anything else at + all. - `#[align(…)]` *only* aligns, while `#[repr(align(…))]` also pads to a multiple of the alignment. Having different syntax makes that distinction more clear. From 1c7a402b54929ae0bee79a7bf4f9db7941116f01 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 9 May 2025 00:41:20 -0400 Subject: [PATCH 09/42] Typo --- text/3806-align-attr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 3b1b93526ec..caa854d2287 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -260,7 +260,7 @@ does not affect the alignment of its function item type, which remains a 1-ZST.) This replaces `#[repr(align(…))]` on function items, from `#![feature(fn_align)]`. -The numerical value of a function pointer to function with an `#[align(n)]` +The numerical value of a function pointer to a function with an `#[align(n)]` attribute is *not* always guaranteed to be a multiple of `n` on all targets. 
For example, on 32-bit ARM, the low bit of the function pointer is set for functions using the Thumb instruction set, even though the actual code of the function is From dee1e70baea49c358576d718c7ee9e622ed8c580 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 16 May 2025 14:41:49 -0400 Subject: [PATCH 10/42] Address `async` and closures --- text/3806-align-attr.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index caa854d2287..542212e867a 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -260,6 +260,9 @@ does not affect the alignment of its function item type, which remains a 1-ZST.) This replaces `#[repr(align(…))]` on function items, from `#![feature(fn_align)]`. +On `async fn`, the attribute controls the alignment of the code of the function +that returns the `Future`. + The numerical value of a function pointer to a function with an `#[align(n)]` attribute is *not* always guaranteed to be a multiple of `n` on all targets. For example, on 32-bit ARM, the low bit of the function pointer is set for functions @@ -414,3 +417,7 @@ fn foo(x: &u8) { - We could loosen the restriction that fields of a `packed(n)` struct cannot specify an alignment greater than `n`. (Apparently, some C compilers allow something similar.) +- Once + [`#![feature(stmt_expr_attributes)]`](https://github.com/rust-lang/rust/issues/15701) + is stable, we could allow applying `#[align(…)]` to closures and async + blocks as well. 
From 4701919691415ac2de70f1870411cc258c4c015e Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 16 May 2025 14:55:00 -0400 Subject: [PATCH 11/42] Elaborate on function parameters prohibition --- text/3806-align-attr.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 542212e867a..647b65042ab 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -356,6 +356,22 @@ We could choose to allow this. However, this RFC specifies that it should be rejected, because users might incorrectly think the attribute affects ABI when it does not. C and C++ make the same choice. +To give an example of what could go wrong, consider the following function: + +```rust +fn example(#[align(1024)] very_large_value: [u64; 8192]) { + // use `very_large_value` by reference +} +``` + +Calling this function this function will most likely involve first passing +`very_large_value` on the stack or by pointer, and then copying the entire array +to a new place on the stack in order to align it. This implicit extra stack copy +is not present for `#[align(…)]`ed locals. Forbidding this, and requiring users +to make the move/copy explicit, avoids the performance footgun. + +We could always lift this limitation in the future. 
+ # Prior art [prior-art]: #prior-art From 59b12309792fcc61b188a6d5d87e4559dc2bbc8b Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sat, 17 May 2025 10:52:01 -0400 Subject: [PATCH 12/42] Fix typo --- text/3806-align-attr.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 647b65042ab..122e0c1bc08 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -364,11 +364,11 @@ fn example(#[align(1024)] very_large_value: [u64; 8192]) { } ``` -Calling this function this function will most likely involve first passing -`very_large_value` on the stack or by pointer, and then copying the entire array -to a new place on the stack in order to align it. This implicit extra stack copy -is not present for `#[align(…)]`ed locals. Forbidding this, and requiring users -to make the move/copy explicit, avoids the performance footgun. +Calling this function will most likely involve first passing `very_large_value` +on the stack or by pointer, and then copying the entire array to a new place on +the stack in order to align it. This implicit extra stack copy is not present +for `#[align(…)]`ed locals. Forbidding this, and requiring users to make the +move/copy explicit, avoids the performance footgun. We could always lift this limitation in the future. 
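As a rough illustration of what "requiring users to make the move/copy explicit" could look like: the sketch below runs on stable Rust, where the proposed `#[align(…)]` on locals is not available, so a `#[repr(align(1024))]` wrapper stands in for it. `Aligned1024` and `copy_into_aligned_local` are hypothetical names, not part of the RFC. The function takes the array by reference and spells out the copy into the over-aligned local in its body, rather than having it happen implicitly at the call boundary.

```rust
/// Hypothetical wrapper standing in for an `#[align(1024)]` local.
#[repr(align(1024))]
struct Aligned1024<T>(T);

/// Takes the large value by reference, then copies it into an over-aligned
/// local explicitly. Returns the local's address modulo 1024, plus a checksum
/// so that the copied data is actually used.
fn copy_into_aligned_local(very_large_value: &[u64; 8192]) -> (usize, u64) {
    // The copy that an aligned parameter would trigger implicitly is written
    // out here, where the reader can see it.
    let aligned = Aligned1024(*very_large_value);
    let addr = &aligned as *const Aligned1024<[u64; 8192]> as usize;
    let sum: u64 = aligned.0.iter().sum();
    (addr % 1024, sum)
}

fn main() {
    let big = [1u64; 8192];
    let (misalignment, sum) = copy_into_aligned_local(&big);
    assert_eq!(misalignment, 0);
    assert_eq!(sum, 8192);
    println!("misalignment = {misalignment}, sum = {sum}");
}
```

Under this RFC, the wrapper would presumably be replaced by an `#[align(1024)]` annotation on the `let` binding, with the same visible copy.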
From 16f3beedf699fa2dc12fbbb8cb7d461785aa4bd7 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sat, 17 May 2025 11:12:24 -0400 Subject: [PATCH 13/42] Justify `async fn` behavior --- text/3806-align-attr.md | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 122e0c1bc08..5e6c80394c0 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -83,6 +83,11 @@ Some low-level use-cases (for example, the [RISC-V `mtvec` register](https://five-embeddev.com/riscv-priv-isa-manual/Priv-v1.12/machine.html#machine-trap-vector-base-address-register-mtvec)) require functions or statics to have a certain minimum alignment. +## Pointer tagging + +Users may want to specify a minimum alignment for various items, in order to +leave the low bits of pointers to such items free to store additional data. + ## Interoperating with systems that have types where size is not a multiple of alignment In Rust, a type’s size is always a multiple of its alignment. However, there are @@ -372,6 +377,30 @@ move/copy explicit, avoids the performance footgun. We could always lift this limitation in the future. +## Interaction with `async fn` + +This RFC specifies that when applied to `async fn`, the `align` attribute should +affect the alignment of the function that returns the future. This breaks +precedent with `#[inline]`, which affects the inlining of the future `poll` +method. + +There is good reason for this difference. In the case of `inline`, controlling +the inlineability of the function that returns the future is almost never what +you want. That function is mostly trivial, and there is little reason to deviate +from the default of inlining it in most cases. Controlling the inlineability of +the `poll` method is far more useful. + +In contrast, there are several potential reasons to want to control the +alignment of an `async fn`. 
For example, this could be used in concert with +function pointer tagging schemes. If users apply `#[align(…)]` to an `async fn` +item believing that it will affect the alignment of the function’s pointer (as +it does with any other function item), but instead it affects that of the `poll` +method, that could even result in UB. Therefore, it makes more sense to choose +the simpler and more consistent rule of having the `#[align(…)]` attribute +affect the alignment of the function that returns the future. + +The current `#![feature(fn_align)]` works this way already. + # Prior art [prior-art]: #prior-art From 522977017a91b09dab7dac58b4b4ccd077b9d09a Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sat, 17 May 2025 15:53:09 -0400 Subject: [PATCH 14/42] Elaborate on `ref`/`ref mut` --- text/3806-align-attr.md | 104 ++++++++++++++++++++++++++++++++++------ 1 file changed, 89 insertions(+), 15 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 5e6c80394c0..baafe4a66ae 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -6,18 +6,19 @@ # Summary [summary]: #summary -Add an `#[align(…)]` attribute to set the minimum alignment of `struct` and `enum` -fields, `static`s, functions, and local variables. +Add an `#[align(…)]` attribute to set the minimum alignment of `struct` and +`enum` fields, `static`s, functions, and local variables. # Motivation [motivation]: #motivation ## Bindings to C and C++ -[C](https://en.cppreference.com/w/c/language/_Alignas) and [C++](https://en.cppreference.com/w/cpp/language/alignas) -provide an `alignas` modifier to set the alignment of specific struct fields. To -represent such structures in Rust, `bindgen` is sometimes forced to add explicit -padding fields: +[C](https://en.cppreference.com/w/c/language/_Alignas) and +[C++](https://en.cppreference.com/w/cpp/language/alignas) provide an `alignas` +modifier to set the alignment of specific struct fields. 
To represent such +structures in Rust, `bindgen` is sometimes forced to add explicit padding +fields: ```c // C code @@ -131,9 +132,9 @@ union Baz { ``` The effect of the attribute is to force the address of the field to have at -least the specified alignment. If the field already has at least that -alignment, due to the required alignment of its type or to a `repr` attribute on -the containing type, the attribute has no effect. +least the specified alignment. If the field already has at least that alignment, +due to the required alignment of its type or to a `repr` attribute on the +containing type, the attribute has no effect. In contrast to a `repr(align(…))` wrapper struct, an `align` annotation does *not* necessarily add extra padding to force the field to have a size that is a @@ -401,6 +402,25 @@ affect the alignment of the function that returns the future. The current `#![feature(fn_align)]` works this way already. +## `#[align(…)] mut local` via `mut #[align(…)] local` + +This RFC proposes that the `#[align(…)]` attribute should come before the `mut` +keyword when declaring an aligned local variable. + +Local variables are declared with [identifier +patterns](https://doc.rust-lang.org/reference/patterns.html#identifier-patterns). +The local is defined by three pieces of information: + +- Its name, which is declared explicity +- Its mutability, which is declared explicity via the presence or absence of the + `mut` keyword +- Its type, which is derived implicitly from the structure and type of the + surrounding pattern. + +In Rust, attributes come before the element they modify. In this case, the `mut` +keyword is an integral part of the local’s declaration; therefore, the attribute +should precede it. + # Prior art [prior-art]: #prior-art @@ -423,8 +443,16 @@ RFC: # Unresolved questions [unresolved-questions]: #unresolved-questions -1. What should the syntax be for applying the `align` attribute to `ref`/`ref - mut` bindings? 
+## MSVC + +Does MSVC do something weird with `alignas`? In other words, is the concern +about `repr(C)` vs `repr(linear)` purely theoretical at this point, or does it +matter in practice today? + +## Interaction with `ref`/`ref mut` + +What should the syntax be for applying the `align` attribute to `ref`/`ref mut` +bindings? - Option A: the attribute goes inside the `ref`/`ref mut`. @@ -442,11 +470,57 @@ fn foo(x: &u8) { } ``` -(I believe the simplest option is to forbid this combination entirely for now.) +I believe the simplest option is to forbid this combination entirely for now, +especially as there is effectively no use-case for it. + +### How I believe we should decide this eventually + +In my view, the resolution of this question hinges on whether `ref`/`ref mut` +are an integral part of the local declaration. My instinct is to say that they +are not. Like the rest of the surrounding pattern, they describe how the initial +value of the local should be extracted from the scrutinee. Like this surrounding +pattern, they implicitly affect the local’s type, but don’t otherwise affect its +properties. At the point of use, you can’t distinguish a local declared with +`ref`/`ref mut` from one declared some other way. + +However, one could argue that `ref`/`ref mut` are part of the same “binding +mode” syntactic element as `mut`, and that therefore, if the `align` attribute +precedes `mut`, it should precede `ref`/`ref mut` also. + +I believe the correct time to resolve this question will be when we decide +on a syntax for combining `ref`/`ref mut` with `mut`. If we choose a syntax that 
If we choose a syntax that +make it clear that these are distinct elements: + +```rust +let ref (mut x) = …; +let ref mut (mut x) = …; +``` + +Then, that would imply that `#[align(…)]` should be applied just before the `mut`: + +```rust +let ref #[align(…)] x = …; +let ref mut #[align(…)] x = …; +let ref #[align(…)] mut x = …; +let ref mut #[align(…)] mut x = …; +``` + +But if we choose a syntax that treats tem as components of a single “binding +mode” element: -2. Does MSVC do something weird with `alignas`? In other words, is the concern - about `repr(C)` vs `repr(linear)` purely theoretical at this point, or does - it matter in practice today? +```rust +let mut ref x = …; +let mut ref mut x = …; +``` + +Then, `#[align]` should always precede that element: + +```rust +let #[align(…)] ref x = …; +let #[align(…)] ref mut x = …; +let #[align(…)] mut ref x = …; +let #[align(…)] mut ref mut x = …; +``` # Future possibilities [future-possibilities]: #future-possibilities From 96350ee7f307aff5b7ce9f563b86c747c22b20d0 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sun, 18 May 2025 14:28:54 -0400 Subject: [PATCH 15/42] Fix tpo --- text/3806-align-attr.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index baafe4a66ae..14a18a662a0 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -382,8 +382,7 @@ We could always lift this limitation in the future. This RFC specifies that when applied to `async fn`, the `align` attribute should affect the alignment of the function that returns the future. This breaks -precedent with `#[inline]`, which affects the alignment of the future `poll` -method. +precedent with `#[inline]`, which affects the future `poll` method. There is good reason for this difference. 
In the case of `inline`, controlling the inlineability of the function that returns the future is almost never what From 0c21fb37d7a9068a1385fad35a6cd89938fed777 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sun, 18 May 2025 15:09:50 -0400 Subject: [PATCH 16/42] Fix example --- text/3806-align-attr.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 14a18a662a0..e2547039be7 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -63,9 +63,7 @@ type SomeLargeType = [[u8; 64]; 21]; #[repr(align(128))] struct CacheAligned<T>(T); -static LOOKUP_TABLE: CacheAligned<SomeLargeType> = CacheAligned(SomeLargeType { - data: todo!(), -}); +static LOOKUP_TABLE: CacheAligned<SomeLargeType> = CacheAligned(todo!()); ``` However, this approach has several downsides: From 959a94ceb2638e2f8cf4940054bf600f16bdea57 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sun, 18 May 2025 15:19:26 -0400 Subject: [PATCH 17/42] Cache line clarifications --- text/3806-align-attr.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index e2547039be7..28ad081089d 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -49,10 +49,10 @@ targets. ## Packing values into fewer cache lines When working with large values (lookup tables, for example), it is often -desirable, for optimal performance, to pack them into as few cache lines as -possible. One way of doing this is to force the alignment of the value to be at -least the size of the cache line, or perhaps the greatest common denominator of -the value and cache line sizes. +desirable, for optimal performance, to ensure they cross over as few cache lines +as possible. One way of doing this is to force the alignment of the value to be +at least the size of the cache line—or, for smaller values, +`next_power_of_two()` of the value size. 
The simplest way of accomplishing this in Rust today is to use a wrapper struct with a `#[repr(align(…))]` attribute: From 75f615a4465f5985a2f404b673fef76afb44a8d8 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sun, 18 May 2025 15:20:28 -0400 Subject: [PATCH 18/42] Minor rephrase --- text/3806-align-attr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 28ad081089d..25f5827adb3 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -71,7 +71,7 @@ However, this approach has several downsides: - It requires defining a separate wrapper type. - It changes the type of the item, which may not be allowed if it is part of the crate's public API. -- It may add padding to the value, which might not be necessary or desirable. +- It may add unnecessary padding to the value, wasting memory. In some cases, it can also improve performance to align a function's code in the same way. From 8b1fa67ad456dc11e28ad5edaa104427bbdb7d59 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Mon, 19 May 2025 11:54:10 -0400 Subject: [PATCH 19/42] Fix typos --- text/3806-align-attr.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 25f5827adb3..4dc5b3e8cb4 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -161,7 +161,7 @@ struct Sardines { `repr(C)` currently has two contradictory meanings: “a simple, linear layout algorithm that works the same everywhere” and “an ABI matching that of the -target’s standard C compiler”. This RFC does not aim to reslove that conflict; +target’s standard C compiler”. This RFC does not aim to resolve that conflict; that is being discussed as part of [RFC 3718](https://github.com/rust-lang/rfcs/pull/3718). 
Henceforth, we will use `repr(C_for_real)` to denote “match the system C compiler”, and `repr(linear)` @@ -311,7 +311,7 @@ let #[align(4)] _ = true; //~ ERROR # Drawbacks [drawbacks]: #drawbacks -- This feature adds additional complexity to the languge. +- This feature adds additional complexity to the language. - The distinction between `align` and `repr(align)` may be confusing for users. # Rationale and alternatives @@ -333,7 +333,7 @@ Benefits of this alternative: - No new attribute polluting the namespace. - Requesting a certain alignment is spelled the same everywhere. - `#[repr(…)]` on fields might accept additional options in the future, for - specifying layout and padding more preciesely. + specifying layout and padding more precisely. - `#[repr(…)]` on function items could also accept `instruction_set(…)` as an argument, replacing the existing attribute of that name. @@ -408,9 +408,9 @@ Local variables are declared with [identifier patterns](https://doc.rust-lang.org/reference/patterns.html#identifier-patterns). The local is defined by three pieces of information: -- Its name, which is declared explicity -- Its mutability, which is declared explicity via the presence or absence of the - `mut` keyword +- Its name, which is declared explicitly +- Its mutability, which is declared explicitly via the presence or absence of + the `mut` keyword - Its type, which is derived implicitly from the structure and type of the surrounding pattern. From 95f397295d2e9c9c24b66a184e0de8a52e2acd5a Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Mon, 19 May 2025 11:57:02 -0400 Subject: [PATCH 20/42] Link to WGSL alignment rules --- text/3806-align-attr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 4dc5b3e8cb4..247bfa4d4c6 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -91,8 +91,8 @@ leave the low bits of pointers to such items free to store additional data. 
In Rust, a type’s size is always a multiple of its alignment. However, there are other languages that can interoperate with Rust, where this is not the case -(WGSL, for example). It’s important for Rust to be able to represent such -structures. +([WGSL](https://www.w3.org/TR/WGSL/#alignment-and-size), for example). It’s +important for Rust to be able to represent such structures. # Explanation [explanation]: #explanation From fefe94518ba19e8e8ad2ce820722a998c7769e0a Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Mon, 19 May 2025 12:11:16 -0400 Subject: [PATCH 21/42] Function parameter prohibition is semantic --- text/3806-align-attr.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 247bfa4d4c6..8ad3eb8f41f 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -296,7 +296,8 @@ fn main() { } ``` -The `align` attribute may not be applied to function parameters. +The `align` attribute may not be applied to function parameters. (This +prohibition is semantic, not syntactic; it is allowed under `#[cfg(false)]`). ```rust fn foo(#[align(8)] _a: u32) {} //~ ERROR From a4787174864d5c118d13b33d55940025e8d38b5c Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Mon, 19 May 2025 12:45:39 -0400 Subject: [PATCH 22/42] Justify not allowing attribute on `_` --- text/3806-align-attr.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 8ad3eb8f41f..19d5f1e627f 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -303,7 +303,8 @@ prohibition is semantic, not syntactic; it is allowed under `#[cfg(false)]`). fn foo(#[align(8)] _a: u32) {} //~ ERROR ``` -They also may not be applied to `_` bindings. +The attribute may only be applied to binding patterns. 
It may not be applied to +any other type of pattern, including wildcard patterns: ```rust let #[align(4)] _ = true; //~ ERROR @@ -419,6 +420,12 @@ In Rust, attributes come before the element they modify. In this case, the `mut` keyword is an integral part of the local’s declaration; therefore, the attribute should precede it. +## `#[align(…)]` on `_` wildcard patterns + +This makes no semantic sense. Just as binding modes can only be applied to +bindings, `#[align(…)]` also can only be applied to bindings. `_` patterns are +not bindings; they are a completely separate element in the grammar. + # Prior art [prior-art]: #prior-art From 32b2a93e59c1883d76dac2be4d1ad1b231f05a30 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Mon, 19 May 2025 14:40:53 -0400 Subject: [PATCH 23/42] More cache clarifications --- text/3806-align-attr.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 19d5f1e627f..2418352121a 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -49,9 +49,10 @@ targets. ## Packing values into fewer cache lines When working with large values (lookup tables, for example), it is often -desirable, for optimal performance, to ensure they cross over as few cache lines -as possible. One way of doing this is to force the alignment of the value to be -at least the size of the cache line—or, for smaller values, +desirable to ensure they cross over as few cache lines as possible. This +mimimizes the amount of data that is brought into cache when the value is +accessed heavily. One way of doing this is to force the alignment of the value +to be at least the size of the cache line—or, for smaller values, `next_power_of_two()` of the value size. 
The simplest way of accomplishing this in Rust today is to use a wrapper struct From f140b23ff2397850be7ceda61a1b693fa8e16224 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Mon, 19 May 2025 16:17:45 -0400 Subject: [PATCH 24/42] Rename a heading --- text/3806-align-attr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 2418352121a..1a808d0a6c6 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -46,7 +46,7 @@ less ergonomic. Also, it is unsound: the padding should be using `MaybeUninit`. And even then, there is no guarantee of ABI compatibility on all potential targets. -## Packing values into fewer cache lines +## Packing a value into fewer cache lines When working with large values (lookup tables, for example), it is often desirable to ensure they cross over as few cache lines as possible. This From 60f23d2d9f68f34d4ea6c7ca336b9f03927b7da9 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Wed, 28 May 2025 11:20:54 -0400 Subject: [PATCH 25/42] Fix typo --- text/3806-align-attr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 1a808d0a6c6..ba722901d34 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -511,7 +511,7 @@ let ref #[align(…)] mut x = …; let ref mut #[align(…)] mut x = …; ``` -But if we choose a syntax that treats tem as components of a single “binding +But if we choose a syntax that treats them as components of a single “binding mode” element: ```rust From 3351aa94fdbd58a40176b8fcd5102febfb6a199f Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sat, 14 Jun 2025 13:31:18 -0400 Subject: [PATCH 26/42] Fix typo --- text/3806-align-attr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index ba722901d34..ee4adccd500 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -50,7 
+50,7 @@ targets. When working with large values (lookup tables, for example), it is often desirable to ensure they cross over as few cache lines as possible. This -mimimizes the amount of data that is brought into cache when the value is +minimizes the amount of data that is brought into cache when the value is accessed heavily. One way of doing this is to force the alignment of the value to be at least the size of the cache line—or, for smaller values, `next_power_of_two()` of the value size. From e10fe79f2377c7da09de28b14afae8da55d9e168 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Sat, 14 Jun 2025 14:13:50 -0400 Subject: [PATCH 27/42] Make sentence less ambiguous --- text/3806-align-attr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index ee4adccd500..f5dcedcefd4 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -110,8 +110,8 @@ with a warn-by-default lint. ## On ADT fields -The `align` attribute may be applied to any field of any `struct`, `enum`, or -`union` that is not `#[repr(transparent)]`. +The `align` attribute may be applied to any field of any +non-`#[repr(transparent)]` `struct`, `enum`, or `union`. ```rust #[repr(C)] From 441e05b3ec5532daa8c51950d3be58ee4274c043 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 13:28:57 -0400 Subject: [PATCH 28/42] Minor rephrase --- text/3806-align-attr.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index f5dcedcefd4..c172ca03862 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -371,11 +371,11 @@ fn example(#[align(1024)] very_large_value: [u64; 8192]) { } ``` -Calling this function will most likely involve first passing `very_large_value` -on the stack or by pointer, and then copying the entire array to a new place on -the stack in order to align it. 
This implicit extra stack copy is not present -for `#[align(…)]`ed locals. Forbidding this, and requiring users to make the -move/copy explicit, avoids the performance footgun. +On typical platforms, calling this function will involve first passing +`very_large_value` on the stack or by pointer, and then copying the entire array +to a new place on the stack in order to align it. This implicit extra stack copy +is not present for `#[align(…)]`ed locals. Forbidding this, and requiring users +to make the move/copy explicit, avoids the performance footgun. We could always lift this limitation in the future. From 86937062bcc193b985585dd29bdf67418545ed19 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 13:46:47 -0400 Subject: [PATCH 29/42] Address target-specific limitations --- text/3806-align-attr.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index c172ca03862..93dac84f828 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -101,8 +101,13 @@ important for Rust to be able to represent such structures. The `align` attribute is a new inert, built-in attribute that can be applied to ADT fields, `static` items, function items, and local variable declarations. The attribute accepts a single required parameter, which must be a power-of-2 -integer literal from 1 up to 2<sup>29</sup>. (This is the same as -`#[repr(align(…))]`.) +positive integer literal. + +The maximum alignment is the same as `#[repr(align(…))]`: 2<sup>29</sup>. That +being said, some targets may not be able to support very high alignments in all +contexts. In such cases, the compiler must impose a lower limit for those +specific contexts on those specific targets. The compiler may choose to emit a +lint warning for high, non-portable alignment specifiers. Multiple instances of the `align` attribute may be present on the same item; the highest alignment among them will be used. 
The compiler may signal this case From 42fb9e22e97870b1e4174c518b4855405568e8b8 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 14:14:13 -0400 Subject: [PATCH 30/42] `align` on trait declarations, semver --- text/3806-align-attr.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 93dac84f828..16afed2e258 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -113,6 +113,11 @@ Multiple instances of the `align` attribute may be present on the same item; the highest alignment among them will be used. The compiler may signal this case with a warn-by-default lint. +`#[align(…)]` on a public item is part of the item’s public API, and it is a +semver-breaking change to lower it. (Except where specified otherwise below, +raising it is not breaking.) In most cases, to avoid making this commitment, one +can use `#[cfg_attr(not(doc), align(…))]`. + ## On ADT fields The `align` attribute may be applied to any field of any @@ -174,6 +179,10 @@ that is being discussed as part of [RFC to denote “simple, portable layout algorithm”; but those names are not normative. +Of course, if a type declaration is using one of these `repr`s to make a public +API commitement as to the exact layout of a type, then any change to field +`#[align(…)]`s may be breaking. + ### `repr(C_for_real)` The layout of a `repr(C_for_real)` ADT with `align` attributes on its fields is @@ -273,6 +282,13 @@ This replaces `#[repr(align(…))]` on function items, from On `async fn`, the attribute controls the alignment of the code of the function that returns the `Future`. +On function items in trait declarations, `#[align(…)]` specifies the minimum +alignment that all implementations of the item must have. `impl` blocks +containing the item must specify an `#[align(…)]` at least as high. The `dyn` +implementation gemerated by the compiler must also provide this alignment. 
Any +change to `#[align(…)]` on a function item in a trait declaration is therefore +semver-breaking. + The numerical value of a function pointer to a function with an `#[align(n)]` attribute is *not* always guaranteed to be a multiple of `n` on all targets. For example, on 32-bit ARM, the low bit of the function pointer is set for functions From 9f5fb37d0bab79a6f64304e13864bfdc6b779748 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 15:33:30 -0400 Subject: [PATCH 31/42] Overaligned fn ptrs future possibility --- text/3806-align-attr.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 16afed2e258..40e58534937 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -559,6 +559,7 @@ let #[align(…)] mut ref mut x = …; example, a way to specify exact field offsets or arbitrary padding. - We could add type-safe APIs for over-aligned pointers; for example, over-aligned reference types that are subtypes of `&`/`&mut`. - We could also introduce a similar facility for function pointers. - We could also add similar APIs for over-aligned function pointers. - We could loosen the restriction that fields of a `packed(n)` struct cannot specify an alignment greater than `n`. (Apparently, some C compilers allow From 553b5ef7667d12c72c66346d90ebe71b5449a72e Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 15:39:00 -0400 Subject: [PATCH 32/42] Fix typo --- text/3806-align-attr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 40e58534937..14a9affb9c4 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -566,5 +566,5 @@ let #[align(…)] mut ref mut x = …; something similar.) - Once [`#![feature(stmt_expr_attributes)]`](https://github.com/rust-lang/rust/issues/15701) - is stable, we could allow applying `#![align(…))]` to closures and async - blocks as well. 
+ is stable, we could allow applying `#[align(…)]` to closures and async blocks + as well. From 73973de1daddccb81c1b745fa1c3016496ad7021 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 16:09:15 -0400 Subject: [PATCH 33/42] Fix typo Co-authored-by: Jubilee --- text/3806-align-attr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 14a9affb9c4..33a47d3f845 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -285,7 +285,7 @@ that returns the `Future`. On function items in trait declarations, `#[align(…)]` specifies the minimum alignment that all implementations of the item must have. `impl` blocks containing the item must specify an `#[align(…)]` at least as high. The `dyn` -implementation gemerated by the compiler must also provide this alignment. Any +implementation generated by the compiler must also provide this alignment. Any change to `#[align(…)]` on a function item in a trait declaration is therefore semver-breaking. From 4688f1149316b2957d85261da64639b61bf2baf5 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 16:10:21 -0400 Subject: [PATCH 34/42] Fix typo --- text/3806-align-attr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 33a47d3f845..08999280b97 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -180,7 +180,7 @@ to denote “simple, portable layout algorithm”; but those names are not normative. Of course, if a type declaration is using one of these `repr`s to make a public -API commitement as to the exact layout of a type, then any change to field +API commitment as to the exact layout of a type, then any change to field `#[align(…)]`s may be breaking. 
### `repr(C_for_real)` From 9efcb250ffcd403e503cdc848d8c66257b1babbb Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 16:27:56 -0400 Subject: [PATCH 35/42] =?UTF-8?q?Don=E2=80=99t=20require=20repeating=20`al?= =?UTF-8?q?ign`=20in=20impls?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- text/3806-align-attr.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 08999280b97..0045489d4e2 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -282,12 +282,14 @@ This replaces `#[repr(align(…))]` on function items, from On `async fn`, the attribute controls the alignment of the code of the function that returns the `Future`. -On function items in trait declarations, `#[align(…)]` specifies the minimum -alignment that all implementations of the item must have. `impl` blocks -containing the item must specify an `#[align(…)]` at least as high. The `dyn` -implementation generated by the compiler must also provide this alignment. Any -change to `#[align(…)]` on a function item in a trait declaration is therefore -semver-breaking. +On function items in trait declarations, `#[align(n)]` (for some alignment `n`) +specifies the minimum alignment that all implementations of the item must have. +The generated implementation for the trait’s `dyn` type will also provide this +alignment. `impl` blocks do not have to repeat the `#[align(n)]` attribute, it +is implicit. (This enables the trait definition to raise `n` without breaking +implementations.) That being said, `impl` blocks are free to specify an +`align(…)` higher than that provided by the trait. (If they specify a *lower* +alignment, it will simply be ignored.) The numerical value of a function pointer to a function with an `#[align(n)]` attribute is *not* always guaranteed to be a multiple of `n` on all targets. 
For From 7c155c8619e79af7bfc511554877c671fa2bd352 Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Fri, 20 Jun 2025 16:40:46 -0400 Subject: [PATCH 36/42] Clarify "function's code" as "entry symbol" --- text/3806-align-attr.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 0045489d4e2..5ffebbba4c4 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -274,9 +274,9 @@ fn main() { ## On function items -On function items, `#[align(…)]` sets the alignment of the function’s code. (It -does not affect the alignment of its function item type, which remains a 1-ZST.) -This replaces `#[repr(align(…))]` on function items, from +On function items, `#[align(…)]` sets the alignment of the function’s entry +symbol. (It does not affect the alignment of its function item type, which +remains a 1-ZST.) This replaces `#[repr(align(…))]` on function items, from `#![feature(fn_align)]`. On `async fn`, the attribute controls the alignment of the code of the function From 45a9d3d456892bfc2b48e7c4b870c92e0bacf38b Mon Sep 17 00:00:00 2001 From: Jules Bertholet Date: Wed, 25 Jun 2025 15:36:40 -0400 Subject: [PATCH 37/42] Link to definition of "inert" --- text/3806-align-attr.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md index 5ffebbba4c4..cc2f67f854a 100644 --- a/text/3806-align-attr.md +++ b/text/3806-align-attr.md @@ -98,10 +98,11 @@ important for Rust to be able to represent such structures. # Explanation [explanation]: #explanation -The `align` attribute is a new inert, built-in attribute that can be applied to -ADT fields, `static` items, function items, and local variable declarations. The -attribute accepts a single required parameter, which must be a power-of-2 -positive integer literal. 
+The `align` attribute is a new
+[inert](https://doc.rust-lang.org/reference/attributes.html#r-attributes.activity),
+built-in attribute that can be applied to ADT fields, `static` items, function
+items, and local variable declarations. The attribute accepts a single required
+parameter, which must be a power-of-2 positive integer literal.

 The maximum alignment is the same as `#[repr(align(…))]`: 2<sup>29</sup>. That
 being said, some targets may not be able to support very high alignments in all

From 893fd65f3effac7f5e51a1fbeae398653faee500 Mon Sep 17 00:00:00 2001
From: Jules Bertholet
Date: Wed, 25 Jun 2025 15:43:18 -0400
Subject: [PATCH 38/42] Clarify rustdoc

---
 text/3806-align-attr.md | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md
index cc2f67f854a..fabea879036 100644
--- a/text/3806-align-attr.md
+++ b/text/3806-align-attr.md
@@ -115,9 +115,9 @@ highest alignment among them will be used. The compiler may signal this case
 with a warn-by-default lint.

 `#[align(…)]` on a public item is part of the item’s public API, and it is a
-semver-breaking change to lower it. (Except where specified otherwise below,
-raising it is not breaking.) In most cases, to avoid making this commitment, one
-can use `#[cfg_attr(not(doc), align(…))]`.
+semver-breaking change to lower it. As such, it is shown in `rustdoc`-generated
+documentation. To avoid making this commitment, one can use
+`#[cfg_attr(not(doc), align(…))]`.

 ## On ADT fields

@@ -167,8 +167,6 @@ struct Sardines {
 }
 ```

-`align` attributes on ADT fields are shown in `rustdoc`-generated documentation.
-
 ## Interaction with `repr(C)`

 `repr(C)` currently has two contradictory meanings: “a simple, linear layout
@@ -271,8 +269,6 @@ fn main() {
 }
 ```

-`align` attributes on `static`s are shown in `rustdoc`-generated documentation.
-
 ## On function items

 On function items, `#[align(…)]` sets the alignment of the function’s entry
@@ -298,9 +294,6 @@ example, on 32-bit ARM, the low bit of the function pointer is set for functions
 using the Thumb instruction set, even though the actual code of the function is
 always aligned to at least 2 bytes.

-`align` attributes on function items are shown in `rustdoc`-generated
-documentation.
-
 ## On local variables

 The `align` attribute may also be applied to local variable declarations inside

From 5827b3948e0bcff0be121ec2b94b344ca286c668 Mon Sep 17 00:00:00 2001
From: Jules Bertholet
Date: Wed, 25 Jun 2025 16:40:14 -0400
Subject: [PATCH 39/42] =?UTF-8?q?Rationale=20for=201-ZST=20`#[align(?=
 =?UTF-8?q?=E2=80=A6)]`=20function=20item=20types?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 text/3806-align-attr.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md
index fabea879036..b1c3ef2ccb9 100644
--- a/text/3806-align-attr.md
+++ b/text/3806-align-attr.md
@@ -444,6 +444,10 @@ This makes no semantic sense. Just as binding modes can only be applied to
 bindings, `#[align(…)]` also can only be applied to bindings. `_` patterns are
 not bindings; they are a completely separate element in the grammar.

+## `#[align(…)]` function item types being 1-ZSTs
+
+Anything else would pessimize memory usage for no reason.
+
 # Prior art
 [prior-art]: #prior-art

@@ -477,7 +481,7 @@ matter in practice today?
 What should the syntax be for applying the `align` attribute to `ref`/`ref mut`
 bindings?

- - Option A: the attribute goes inside the `ref`/`ref mut`.
+- Option A: the attribute goes inside the `ref`/`ref mut`.

 ```rust
 fn foo(x: &u8) {
@@ -485,7 +489,7 @@ fn foo(x: &u8) {
 }
 ```

- - Option B: the attribute goes outside the `ref`/`ref mut`.
+- Option B: the attribute goes outside the `ref`/`ref mut`.

 ```rust
 fn foo(x: &u8) {

From 5da3dcb40afdfe0ad6531427151b32b5ad81d39f Mon Sep 17 00:00:00 2001
From: Jules Bertholet
Date: Wed, 25 Jun 2025 20:28:02 -0400
Subject: [PATCH 40/42] Function pointer tagging future possibility

---
 text/3806-align-attr.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md
index b1c3ef2ccb9..a6275e2cd29 100644
--- a/text/3806-align-attr.md
+++ b/text/3806-align-attr.md
@@ -568,3 +568,6 @@ let #[align(…)] mut ref mut x = …;
   [`#![feature(stmt_expr_attributes)]`](https://github.com/rust-lang/rust/issues/15701)
   is stable, we could allow applying `#[align(…))]` to closures and async blocks
   as well.
+- We could add tools to make it easier to implement function pointer tagging in
+  a way that’s resilient to the ARM Thumb issue (and similar strangeness on
+  hypothetical future targets).

From 42f111b06eaa6f5ef7216c68e893bc544e0495a6 Mon Sep 17 00:00:00 2001
From: Jules Bertholet
Date: Sat, 28 Jun 2025 17:58:37 -0400
Subject: [PATCH 41/42] =?UTF-8?q?`#[align(=E2=80=A6)]`=20is=20compatible?=
 =?UTF-8?q?=20with=20`#[naked]`?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 text/3806-align-attr.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md
index a6275e2cd29..8849f6890a6 100644
--- a/text/3806-align-attr.md
+++ b/text/3806-align-attr.md
@@ -294,6 +294,8 @@ example, on 32-bit ARM, the low bit of the function pointer is set for functions
 using the Thumb instruction set, even though the actual code of the function is
 always aligned to at least 2 bytes.

+`#[align(…)]` is compatible with `#[naked]`.
+
 ## On local variables

 The `align` attribute may also be applied to local variable declarations inside

From e79807a89e393419094fa1c8c79bf34be3d39215 Mon Sep 17 00:00:00 2001
From: Jules Bertholet
Date: Sat, 28 Jun 2025 18:43:18 -0400
Subject: [PATCH 42/42] =?UTF-8?q?`#[align(=E2=80=A6)]`=20on=20functions=20?=
 =?UTF-8?q?in=20`extern`=20blocks?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 text/3806-align-attr.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/text/3806-align-attr.md b/text/3806-align-attr.md
index 8849f6890a6..a3692fd774e 100644
--- a/text/3806-align-attr.md
+++ b/text/3806-align-attr.md
@@ -228,7 +228,7 @@ pub struct foo2 {
 }
 ```

-## On `static`s
+## On `static` items

 Any `static` item (including `static`s inside `extern` blocks) may have an
 `align` attribute applied:
@@ -296,6 +296,11 @@ always aligned to at least 2 bytes.

 `#[align(…)]` is compatible with `#[naked]`.

+`#[align(…)]` may be used on function items inside `extern` blocks. This imposes
+a requirement on the symbol being linked to. The UB that can result if this
+alignment is not satisfied, follows the same rules as the UB that can result
+from an incorrect function signature.
+
 ## On local variables

 The `align` attribute may also be applied to local variable declarations inside