Commit 4f28344

committed
Improve documentation of Place and Operand
1 parent f262ca1 commit 4f28344

1 file changed

+121
-13
lines changed
  • compiler/rustc_middle/src/mir
compiler/rustc_middle/src/mir/mod.rs

Lines changed: 121 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -1775,8 +1775,98 @@ pub struct CopyNonOverlapping<'tcx> {
17751775
///////////////////////////////////////////////////////////////////////////
17761776
// Places
17771777

1778-
/// A path to a value; something that can be evaluated without
1779-
/// changing or disturbing program state.
1778+
/// Places roughly correspond to a "location in memory." Places in MIR are the same mathematical
1779+
/// object as places in Rust. This of course means that what exactly they are is undecided and part
1780+
/// of the Rust memory model. However, they will likely contain at least the following three pieces
1781+
/// of information in some form:
1782+
///
1783+
/// 1. The part of memory that is referred to (see discussion below for details).
1784+
/// 2. The type of the place and an optional variant index. See [`PlaceTy`][tcx::PlaceTy].
1785+
/// 3. The provenance with which the place is being accessed.
1786+
///
1787+
/// We'll give a description below of how the first two of these three properties are computed for a
1788+
/// place. We cannot give a description of the provenance, because that is part of the undecided
1789+
/// aliasing model - we only include it here at all to acknowledge its existence.
1790+
///
1791+
/// For a place that has no projections, i.e. `Place { local, projection: [] }`, the part of memory is
1792+
/// the local's full allocation and the type is the type of the local. For any other place, we
1793+
/// define the values as a function of the parent place, that is the place with its last
1794+
/// [`ProjectionElem`] stripped. The way this is computed of course depends on the kind of that last
1795+
/// projection element:
1796+
///
1797+
/// - [`Downcast`](ProjectionElem::Downcast): This projection sets the place's variant index to the
1798+
/// given one, and makes no other changes. A `Downcast` projection on a place with its variant
1799+
/// index already set is not well-formed.
1800+
/// - [`Field`](ProjectionElem::Field): `Field` projections take their parent place and create a
1801+
/// place referring to one of the fields of the type. The referred to place in memory is where
1802+
/// the layout places the field. The type becomes the type of the field.
1803+
///
1804+
/// These projections are only legal for tuples, ADTs, closures, and generators. If the ADT or
1805+
/// generator has more than one variant, the parent place's variant index must be set, indicating
1806+
/// which variant is being used. If it has just one variant, the variant index may or may not be
1807+
/// included - the single possible variant is inferred if it is not included.
1808+
/// - [`ConstantIndex`](ProjectionElem::ConstantIndex): Computes an offset in units of `T` into the
1809+
/// place as described in the documentation for the `ProjectionElem`. The resulting part of
1810+
/// memory is the location of that element of the array/slice, and the type is `T`. This is only
1811+
/// legal if the parent place has type `[T; N]` or `[T]` (*not* `&[T]`).
1812+
/// - [`Subslice`](ProjectionElem::Subslice): Much like `ConstantIndex`. It is also only legal on
1813+
/// `[T; N]` and `[T]`. However, this yields a `Place` of type `[T]`, and may refer to more than
1814+
/// one element in the parent place.
1815+
/// - [`Index`](ProjectionElem::Index): Like `ConstantIndex`, only legal on `[T; N]` or `[T]`.
1816+
/// However, `Index` additionally takes a local from which the value of the index is computed at
1817+
/// runtime. Computing the value of the index involves interpreting the `Local` as a
1818+
/// `Place { local, projection: [] }`, and then computing its value as if done via
1819+
/// [`Operand::Copy`]. The array/slice is then indexed with the resulting value. The local must
1820+
/// have type `usize`.
1821+
/// - [`Deref`](ProjectionElem::Deref): Derefs are the last type of projection, and the most
1822+
/// complicated. They are only legal on parent places that are references, pointers, or `Box`. A
1823+
/// `Deref` projection begins by creating a value from the parent place, as if by
1824+
/// [`Operand::Copy`]. It then dereferences the resulting pointer, creating a place of the
1825+
/// pointed to type.
1826+
///
1827+
/// **Needs clarification**: What about metadata resulting from dereferencing wide pointers (and
1828+
/// possibly from accessing unsized locals - not sure how those work)? That probably deserves to go
1829+
/// on the list above and be discussed too. It is also probably necessary for making the indexing
1830+
/// stuff less hand-wavy.
1831+
///
1832+
/// **Needs clarification**: When it says "part of memory" what does that mean precisely, and how
1833+
/// does it interact with the metadata?
1834+
///
1835+
/// One possible model that I believe makes sense is that "part of memory" is actually just the
1836+
/// address of the beginning of the referred to range of bytes. For sized types, the size of the
1837+
/// range is then stored in the type, and for unsized types it's stored (possibly indirectly,
1838+
/// through a vtable) in the metadata.
1839+
///
1840+
/// Alternatively, the "part of memory" could be a whole range of bytes. This initially seemed more
1841+
/// natural to me, but seems like it falls apart after a little bit.
1842+
///
1843+
/// More likely though, we should call this detail a part of the Rust memory model and let that deal
1844+
/// with the precise definition of this part of a place. If we feel strongly, I don't think we *have
1845+
/// to* though. MIR places are more flexible than Rust places, and we might be able to make a
1846+
/// decision on the flexible parts without semi-stabilizing the source language. (end NC)
1847+
///
1848+
/// Computing a place may be UB - this is certainly the case with dereferencing, which requires
1849+
/// sufficient provenance, but it may additionally be the case for some of the other field
1850+
/// projections.
1851+
///
1852+
/// It is undecided when this UB kicks in. As best I can tell that is the question being discussed
1853+
/// in [UCG#319]. Summarizing from that thread, I believe the options are:
1854+
///
1855+
/// [UCG#319]: https://github.com/rust-lang/unsafe-code-guidelines/issues/319
1856+
///
1857+
/// 1. Each intermediate place must have provenance for the whole part of memory it refers to. This
1858+
/// is the status quo.
1859+
/// 2. Only for intermediate places where the last projection was *not* a deref. This corresponds to
1860+
/// "Check inbounds on place projection".
1861+
/// 3. Only on place to value conversions, assignments, and referencing operations. This corresponds
1862+
/// to "remove the restrictions from `*` entirely."
1863+
/// 4. On each intermediate place if the place is used for a place to value conversion as part of
1864+
/// an assignment or if it is used for a referencing operation. For a raw pointer
1865+
/// computation, never. This corresponds to "magic?".
1866+
///
1867+
/// Hopefully I am not misrepresenting anyone's opinions - please let me know if I am. Currently,
1868+
/// Rust chooses option 1. This is checked by Miri and taken advantage of by codegen (via `gep
1869+
/// inbounds`). That is possibly subject to change.
17801870
#[derive(Copy, Clone, PartialEq, Eq, Hash, TyEncodable, HashStable)]
17811871
pub struct Place<'tcx> {
17821872
pub local: Local,
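
The type and variant-index rules in the `Place` documentation above can be sketched as a fold over the projection list. The following is a toy model only, not rustc's implementation: `Ty`, `Proj`, `PlaceTy`, `project`, and `place_ty` are hypothetical names invented for illustration, the `Index` projection and provenance are left out, and `Subslice` yields `[T]` because that is what the doc comment states.

```rust
// Toy model of "type + optional variant index" for a place.
// All names here are hypothetical; this is not rustc's API.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Usize,
    Array(Box<Ty>, usize),
    Slice(Box<Ty>),
    Ref(Box<Ty>),
    Tuple(Vec<Ty>),
    /// An enum-like ADT: one list of field types per variant.
    Adt(Vec<Vec<Ty>>),
}

#[derive(Clone, Debug)]
enum Proj {
    Deref,
    Field(usize),
    Downcast(usize),
    ConstantIndex,
    Subslice,
}

#[derive(Clone, Debug, PartialEq)]
struct PlaceTy {
    ty: Ty,
    variant: Option<usize>,
}

/// Computes the child place's type from its parent and the last projection.
fn project(parent: PlaceTy, elem: &Proj) -> Result<PlaceTy, String> {
    match (elem, parent.ty, parent.variant) {
        // A `Downcast` on a place whose variant index is already set is ill-formed.
        (Proj::Downcast(_), _, Some(_)) => Err("variant index already set".to_string()),
        // `Downcast` only sets the variant index and changes nothing else.
        (Proj::Downcast(v), Ty::Adt(variants), None) => {
            if *v < variants.len() {
                Ok(PlaceTy { ty: Ty::Adt(variants), variant: Some(*v) })
            } else {
                Err(format!("no variant {}", v))
            }
        }
        (Proj::Field(f), Ty::Tuple(fields), None) => fields
            .get(*f)
            .cloned()
            .map(|ty| PlaceTy { ty, variant: None })
            .ok_or_else(|| format!("no field {}", f)),
        (Proj::Field(f), Ty::Adt(variants), variant) => {
            // A single-variant ADT may omit the index; otherwise it must be set.
            let v = match (variant, variants.len()) {
                (Some(v), _) => v,
                (None, 1) => 0,
                (None, _) => return Err("multi-variant ADT needs a Downcast first".to_string()),
            };
            variants
                .get(v)
                .and_then(|fields| fields.get(*f))
                .cloned()
                .map(|ty| PlaceTy { ty, variant: None })
                .ok_or_else(|| format!("no field {} in variant {}", f, v))
        }
        // Indexing is legal on `[T; N]` and `[T]` only, and yields `T`.
        (Proj::ConstantIndex, Ty::Array(t, _), None)
        | (Proj::ConstantIndex, Ty::Slice(t), None) => Ok(PlaceTy { ty: *t, variant: None }),
        // Per the doc comment, `Subslice` yields a place of type `[T]`.
        (Proj::Subslice, Ty::Array(t, _), None)
        | (Proj::Subslice, Ty::Slice(t), None) => Ok(PlaceTy { ty: Ty::Slice(t), variant: None }),
        (Proj::Deref, Ty::Ref(t), None) => Ok(PlaceTy { ty: *t, variant: None }),
        (elem, ty, _) => Err(format!("{:?} is not legal on a place of type {:?}", elem, ty)),
    }
}

/// Starts from the local's type and applies each projection in order.
fn place_ty(local_ty: Ty, projection: &[Proj]) -> Result<PlaceTy, String> {
    let mut current = PlaceTy { ty: local_ty, variant: None };
    for elem in projection {
        current = project(current, elem)?;
    }
    Ok(current)
}
```

In this sketch, calling `place_ty` on a two-variant ADT with `[Downcast(1), Field(0)]` succeeds, while `[Field(0)]` alone is rejected because the variant index was never set.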
@@ -2145,24 +2235,42 @@ pub struct SourceScopeLocalData {
21452235
///////////////////////////////////////////////////////////////////////////
21462236
// Operands
21472237

2148-
/// These are values that can appear inside an rvalue. They are intentionally
2149-
/// limited to prevent rvalues from being nested in one another.
2238+
/// An operand in MIR represents a "value" in Rust, the definition of which is undecided and part of
2239+
/// the memory model. One proposal for a definition of values can be found [on UCG][value-def].
2240+
///
2241+
/// [value-def]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/value-domain.md
2242+
///
2243+
/// The most common way to create values is via a place to value conversion. A place to value
2244+
/// conversion is an operation which reads the memory of the place and converts it to a value. This
2245+
/// is a fundamentally *typed* operation. Different types will do different things. These are some
2246+
/// possible examples of what Rust may - but will not necessarily - decide to do on place to value
2247+
/// conversions:
2248+
///
2249+
/// 1. Types with validity constraints cause UB if the validity constraint is not met.
2250+
/// 2. References/pointers may have their provenance change or cause other provenance related
2251+
/// side-effects.
2252+
///
2253+
/// A place to value conversion on a place that has its variant index set is not well-formed.
2254+
/// However, note that this rule only applies to places appearing in MIR bodies. Many functions,
2255+
/// such as [`Place::ty`], still accept such a place. If you write a function for which it might be
2256+
/// ambiguous whether such a thing is accepted, make sure to document your choice clearly.
21502257
#[derive(Clone, PartialEq, TyEncodable, TyDecodable, Hash, HashStable)]
21512258
pub enum Operand<'tcx> {
2152-
/// Copy: The value must be available for use afterwards.
2153-
///
2154-
/// This implies that the type of the place must be `Copy`; this is true
2155-
/// by construction during build, but also checked by the MIR type checker.
2259+
/// Creates a value by performing a place to value conversion at the given place. The type of
2260+
/// the place must be `Copy`.
21562261
Copy(Place<'tcx>),
21572262

2158-
/// Move: The value (including old borrows of it) will not be used again.
2263+
/// Creates a value by performing a place to value conversion for the place, just like the
2264+
/// `Copy` operand.
2265+
///
2266+
/// This *may* additionally overwrite the place with `uninit` bytes, depending on how we decide
2267+
/// in [UCG#188]. You should not emit MIR that may attempt a subsequent second place to value
2268+
/// conversion on this place without first re-initializing it.
21592269
///
2160-
/// Safe for values of all types (modulo future developments towards `?Move`).
2161-
/// Correct usage patterns are enforced by the borrow checker for safe code.
2162-
/// `Copy` may be converted to `Move` to enable "last-use" optimizations.
2270+
/// [UCG#188]: https://github.com/rust-lang/unsafe-code-guidelines/issues/188
21632271
Move(Place<'tcx>),
21642272

2165-
/// Synthesizes a constant value.
2273+
/// Constants are already semantically values, and remain unchanged.
21662274
Constant(Box<Constant<'tcx>>),
21672275
}
21682276

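The difference between the `Copy` and `Move` operands described in the new documentation can be illustrated with a small interpreter sketch. This is a toy model under stated assumptions, not rustc or Miri code: `Slot`, `Frame`, and this simplified `Operand` are hypothetical, places are reduced to bare local indices, and the model assumes the outcome in which `Move` overwrites the place with uninit bytes, which the doc comment says is still undecided in UCG#188.

```rust
// Toy model: each local holds either an initialized i64 or uninit bytes.
// All names are hypothetical and chosen only for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Slot {
    Uninit,
    Init(i64),
}

enum Operand {
    Copy(usize),
    Move(usize),
    Constant(i64),
}

struct Frame {
    locals: Vec<Slot>,
}

impl Frame {
    /// A place to value conversion: reads the local and fails on uninit memory.
    fn read(&self, local: usize) -> Result<i64, String> {
        match self.locals.get(local) {
            Some(Slot::Init(v)) => Ok(*v),
            Some(Slot::Uninit) => Err(format!("read of uninitialized local _{}", local)),
            None => Err(format!("no such local _{}", local)),
        }
    }

    /// Evaluates an operand to a value.
    fn eval(&mut self, op: &Operand) -> Result<i64, String> {
        match *op {
            // Constants are already values.
            Operand::Constant(c) => Ok(c),
            // `Copy` performs the conversion and leaves the place untouched.
            Operand::Copy(local) => self.read(local),
            // `Move` performs the same conversion, then (in this model only)
            // de-initializes the place.
            Operand::Move(local) => {
                let value = self.read(local)?;
                self.locals[local] = Slot::Uninit;
                Ok(value)
            }
        }
    }
}
```

Under this model, evaluating `Copy(0)` twice succeeds both times, whereas any read of local 0 after `Move(0)` fails until the local is re-initialized, which mirrors the advice not to emit a second place to value conversion after a move.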