Commit d6b6744

Move value definitions to appropriate chapters under types.
1 parent b74f458 commit d6b6744

7 files changed

+148
-169
lines changed

src/memory-model.md

Lines changed: 43 additions & 3 deletions
@@ -1,5 +1,45 @@
11
# Memory model
22

3-
Rust does not yet have a defined memory model. Various academics and industry professionals
4-
are working on various proposals, but for now, this is an under-defined place
5-
in the language.
3+
r[memory]
4+
5+
The Memory Model of Rust is incomplete and not fully decided. The following is some of the detail worked out so far.
6+
7+
## Bytes
8+
9+
r[memory.byte]
10+
11+
r[memory.byte.intro]
12+
The most basic unit of memory in Rust is a byte. All values in Rust are computed from 0 or more bytes read from an allocation.
13+
14+
> [!NOTE]
15+
> While bytes in Rust are typically lowered to hardware bytes, they may contain additional values,
16+
> such as being uninitialized, or storing part of a pointer.
17+
18+
r[memory.byte.init]
19+
Each byte may be initialized, and contain a value of type `u8`, as well as an optional pointer fragment. When present, the pointer fragment carries [provenance][type.pointer.provenance] information.
20+
21+
r[memory.byte.uninit]
22+
Each byte may be uninitialized.
23+
24+
> [!NOTE]
25+
> Uninitialized bytes do not have a value and do not have a pointer fragment.
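A minimal sketch of the initialized/uninitialized distinction, assuming the standard library's `MaybeUninit` as the way to hold storage that may be uninitialized. The function name `init_then_read` is hypothetical and used only for illustration.

```rust
use std::mem::MaybeUninit;

/// Starts with an uninitialized byte, initializes it, and only then reads it.
fn init_then_read() -> u8 {
    // At this point the byte has no value and no pointer fragment.
    let mut slot = MaybeUninit::<u8>::uninit();
    // Writing makes the byte initialized with the `u8` value 0x2A.
    slot.write(0x2A);
    // SAFETY: the byte was initialized by the `write` call above.
    unsafe { slot.assume_init() }
}
```

Calling `assume_init` before the `write` would decode an uninitialized byte as a `u8`, which is not a defined value of that type.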
26+
27+
## Value Encoding
28+
29+
r[memory.encoding]
30+
31+
r[memory.encoding.intro]
32+
Each type in Rust has 0 or more values, which can have operations performed on them
33+
34+
> [!NOTE]
35+
> `0u8`, `1337i16`, and `Foo{bar: "baz"}` are all values
36+
37+
r[memory.encoding.op]
38+
Each value of a type can be encoded into a sequence of bytes, and decoded from a sequence of bytes, which has a length equal to the size of the type.
39+
The operation to encode or decode a value is determined by the representation of the type.
40+
41+
> [!NOTE]
42+
> Representation is related to, but is not the same property as, the layout of the type.
43+
44+
r[memory.encoding.decode]
45+
If a value of type `T` is decoded from a sequence of bytes that does not correspond to a defined value, the behavior is undefined. If a value of type `T` is decoded from a sequence of bytes that contain pointer fragments, which are not used to represent the value, the pointer fragments are ignored.

src/types/boolean.md

Lines changed: 4 additions & 3 deletions
@@ -21,9 +21,10 @@ r[type.bool.layout]
2121
An object with the boolean type has a [size and alignment] of 1 each.
2222

2323
r[type.bool.repr]
24-
The value false has the bit pattern `0x00` and the value true has the bit pattern
25-
`0x01`. It is [undefined behavior] for an object with the boolean type to have
26-
any other bit pattern.
24+
A `bool` is represented as a single initialized byte with a value of `0x00` corresponding to `false` and a value of `0x01` corresponding to `true`. This byte does not have a pointer fragment.
25+
26+
> [!NOTE]
27+
> No other representations are valid for `bool`. Undefined behavior occurs when any other byte is read as type `bool`.
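As a rough illustration of the rule above, the following sketch decodes a byte as a `bool` only when it is one of the two valid encodings. The helper names `bool_from_byte` and `bool_to_byte` are hypothetical and not part of the reference text.

```rust
/// Returns the `bool` encoded by `byte`, or `None` for any invalid encoding.
fn bool_from_byte(byte: u8) -> Option<bool> {
    match byte {
        0x00 => Some(false),
        0x01 => Some(true),
        // Reading any other byte as `bool` would be undefined behavior,
        // so a checked decoder has to reject it instead.
        _ => None,
    }
}

/// Encodes a `bool` as its single byte: `false` is 0x00 and `true` is 0x01.
fn bool_to_byte(value: bool) -> u8 {
    value as u8
}
```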
2728
2829
r[type.bool.usage]
2930
The boolean type is the type of many operands in various [expressions]:

src/types/function-pointer.md

Lines changed: 7 additions & 0 deletions
@@ -55,6 +55,13 @@ let bo: Binop = add;
5555
x = bo(5,7);
5656
```
5757

58+
r[type.fn-pointer.value]
59+
A value of a function pointer type consists of a non-null address. A function pointer value is represented the same as an address represented as an unsigned integer type with the same width as the function pointer.
60+
61+
> [!NOTE]
62+
> Whether or not a function pointer value has provenance, and whether or not this provenance is represented as pointer fragments, is not yet decided.
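A small sketch of the address part of a function pointer value, assuming a cast to `usize` is an acceptable way to observe it. The names `add` and `fn_pointer_address` are illustrative only, and the sketch says nothing about provenance.

```rust
/// An ordinary function whose address is taken below.
fn add(x: i32, y: i32) -> i32 {
    x + y
}

/// Returns the address carried by a function pointer as an integer.
fn fn_pointer_address(f: fn(i32, i32) -> i32) -> usize {
    // Function pointers may be cast to integers; the result is never zero
    // because a function pointer value always has a non-null address.
    f as usize
}
```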
63+
64+
5865
## Attributes on function pointer parameters
5966

6067
r[type.fn-pointer.attributes]

src/types/numeric.md

Lines changed: 35 additions & 0 deletions
@@ -29,6 +29,7 @@ Type | Minimum | Maximum
2929
`i128` | -(2<sup>127</sup>) | 2<sup>127</sup>-1
3030

3131

32+
3233
## Floating-point types
3334

3435
r[type.numeric.float]
@@ -65,3 +66,37 @@ r[type.numeric.validity]
6566

6667
For every numeric type, `T`, the bit validity of `T` is equivalent to the bit
6768
validity of `[u8; size_of::<T>()]`. An uninitialized byte is not a valid `u8`.
69+
70+
## Representation
71+
72+
r[type.numeric.repr]
73+
74+
r[type.numeric.repr.integer]
75+
Each value of an integer type is a whole number. For unsigned types, this is a positive integer or `0`. For signed types, this can either be a positive integer, negative integer, or `0`.
76+
77+
r[type.numeric.repr.integer-width]
78+
The range of values an integer type can represent depends on its signedness and its width, in bits. The width of type `uN` or `iN` is `N`. The width of type `usize` or `isize` is the value of the `target_pointer_width` property.
79+
80+
> [!NOTE]
81+
> There are exactly `1<<N` unique values of an integer type of width `N`.
82+
83+
r[type.numeric.repr.unsigned]
84+
A value `i` of an unsigned integer type `U` is represented by a sequence of initialized bytes, where the `m`th successive byte according to the byte order of the platform is `(i >> (m*8)) as u8`, where `m` is between `0` (inclusive) and the size of `U` in bytes (exclusive). None of the bytes produced by encoding an unsigned integer has a pointer fragment.
85+
86+
> [!NOTE]
87+
> The two primary byte orders are `little` endian, where the bytes are ordered from lowest memory address to highest, and `big` endian, where the bytes are ordered from highest memory address to lowest.
88+
> The `cfg` predicate `target_endian` indicates the byte order
89+
90+
> [!WARN]
91+
> On `little` endian, the order of bytes used to decode an integer type is the same as the natural order of a `u8` array - that is, the `m` value corresponds with the `m` index into a same-sized `u8` array. On `big` endian, however, the order is the opposite of this order - that is, the `m` value corresponds with the `size_of::<T>() - 1 - m` index in that array.
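The encoding rule can be sketched for `u32` as follows. This is an illustration under stated assumptions, not the reference's definition: the function name `encode_u32` and its `little_endian` flag are hypothetical, and the big-endian array index is taken to be `size - 1 - m` so that it stays in bounds.

```rust
/// Encodes `i` so that the `m`th byte in the chosen byte order
/// is `(i >> (m * 8)) as u8`.
fn encode_u32(i: u32, little_endian: bool) -> [u8; 4] {
    let mut out = [0u8; 4];
    for m in 0..4 {
        // Truncating cast keeps only the low 8 bits after the shift.
        let byte = (i >> (m * 8)) as u8;
        // Little endian stores byte `m` at index `m`;
        // big endian stores it at the mirrored index.
        let index = if little_endian { m } else { 4 - 1 - m };
        out[index] = byte;
    }
    out
}
```

The results can be compared against the standard `u32::to_le_bytes` and `u32::to_be_bytes` methods.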
92+
93+
r[type.numeric.repr.signed]
94+
A value `i` of a signed integer type with width `N` is represented the same as the corresponding value of the unsigned counterpart type which is congruent modulo `2^N`.
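For `i8` and `u8` (width `N = 8`), the congruence can be sketched like this. The helper names are hypothetical, and the computation goes through `i16` only to make the modulo step explicit.

```rust
/// Returns the `u8` value congruent to `i` modulo 2^8.
fn signed_as_unsigned(i: i8) -> u8 {
    // Widen first so that 256 is representable, then reduce modulo 2^8.
    (i as i16).rem_euclid(256) as u8
}

/// Checks that `i` and its unsigned counterpart share the same bytes.
fn same_bytes(i: i8) -> bool {
    i.to_ne_bytes() == signed_as_unsigned(i).to_ne_bytes()
}
```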
95+
96+
r[type.numeric.repr.float]
97+
A floating-point value is represented the same as a value of the unsigned integer type with the same width given by its [IEEE 754-2019] encoding.
98+
99+
r[type.numeric.repr.float-format]
100+
The [IEEE 754-2019] `binary32` format is used for `f32`, and the `binary64` format is used for `f64`.
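As a quick check of the float rule, the sketch below compares the bytes of an `f32` with the bytes of the `u32` that holds its `binary32` encoding, using the standard `to_bits` and `to_ne_bytes` methods. The function name is hypothetical.

```rust
/// Returns true when `x` has the same bytes as the `u32` holding its
/// IEEE 754 `binary32` bit pattern.
fn float_bytes_match_bits(x: f32) -> bool {
    // `to_bits` yields the encoding as an unsigned integer of the same width.
    x.to_ne_bytes() == x.to_bits().to_ne_bytes()
}
```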
101+
102+
[IEEE 754-2019]: https://ieeexplore.ieee.org/document/8766229

src/types/pointer.md

Lines changed: 52 additions & 10 deletions
@@ -81,19 +81,61 @@ r[type.pointer.smart]
8181

8282
The standard library contains additional 'smart pointer' types beyond references and raw pointers.
8383

84-
## Bit validity
84+
## Pointer values and representation
8585

86-
r[type.pointer.validity]
86+
r[type.pointer.value]
8787

88-
r[type.pointer.validity.pointer-fragment]
89-
Despite pointers and references being similar to `usize`s in the machine code emitted on most platforms,
90-
the semantics of transmuting a reference or pointer type to a non-pointer type is currently undecided.
91-
Thus, it may not be valid to transmute a pointer or reference type, `P`, to a `[u8; size_of::<P>()]`.
88+
r[type.pointer.value.thin]
89+
Each thin pointer consists of an address and an optional [provenance][type.pointer.provenance]. The address refers to which byte the pointer points to. The provenance refers to which bytes the pointer is allowed to access, and the allocation those bytes are within.
90+
91+
> [!NOTE]
92+
> A pointer that does not have a provenance may be called an invalid or dangling pointer.
93+
94+
r[type.pointer.value.thin-repr]
95+
The representation of a value of a thin pointer is a sequence of initialized bytes with `u8` values given by the representation of its address as a value of type `usize`, and pointer fragments corresponding to its provenance, if present.
96+
97+
r[type.pointer.value.thin-ref]
98+
A thin reference to `T` consists of a non-null, well aligned address, and provenance for `size_of::<T>()` bytes starting from that address. The representation of a thin reference to `T` is the same as the pointer with the same address and provenance.
99+
100+
> [!NOTE]
101+
> This is true for both shared and mutable references. There are additional constraints enforced by the aliasing model that are not yet fully decided.
102+
103+
r[type.pointer.value.wide]
104+
A wide pointer or reference consists of a data pointer or reference, and a pointee-specific metadata value.
105+
106+
r[type.pointer.value.wide-reference]
107+
The data pointer of a wide reference has a non-null address, well aligned for `align_of_val(self)`, and with provenance for `size_of_val(self)` bytes.
108+
109+
r[type.pointer.value.wide-representation]
110+
A wide pointer or reference is represented the same as `struct WidePointer<M>{data: *mut (), metadata: M}` where `M` is the pointee metadata type, and the `data` and `metadata` fields are the corresponding parts of the pointer.
111+
112+
> [!NOTE]
113+
> The `WidePointer` struct has no guarantees about layout, and has the default representation.
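For slices, the two parts of a wide reference can be observed with stable standard library methods, as in the sketch below. The function name `slice_parts` is hypothetical, and the sketch assumes only that the metadata of a slice reference is its length; it makes no claim about field order or layout.

```rust
/// Splits a slice reference into its data pointer and its metadata (length).
fn slice_parts(s: &[u8]) -> (*const u8, usize) {
    // `as_ptr` gives the data pointer; `len` gives the pointee metadata.
    (s.as_ptr(), s.len())
}
```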
114+
115+
116+
## Pointer Provenance
117+
118+
r[type.pointer.provenance]
119+
120+
r[type.pointer.provenance.intro]
121+
Pointer Provenance is a term that refers to additional data carried by pointer values in Rust, beyond its address. When stored in memory, Provenance is encoded in the Pointer Fragment part of each byte of the pointer.
122+
123+
r[type.pointer.provenance.allocation]
124+
Whenever a pointer to a particular allocation is produced by using the reference or raw reference operators, or when a pointer is returned from an allocation function, the resulting pointer has provenance that refers to that allocation.
125+
126+
> [!NOTE]
127+
> There is additional information encoded by provenance, but the exact scope of this information is not yet decided.
128+
129+
r[type.pointer.provenance.dangling]
130+
A pointer is dangling if it has no provenance, or if it has provenance to an allocation that has since been deallocated. An access, except for an access of size zero, using a dangling pointer, is undefined behavior.
131+
132+
> [!NOTE]
133+
> Allocations include local and static variables, as well as temporaries. Local Variables and Temporaries are deallocated when they go out of scope.
134+
135+
> [!WARN]
136+
> The above is necessary, but not sufficient, to avoid undefined behavior. The full requirements for pointer access are not yet decided.
137+
> A reference obtained in safe code is guaranteed to be valid for its usable lifetime, unless interfered with by unsafe code.
92138
93-
r[type.pointer.validity.raw]
94-
For thin raw pointers (i.e., for `P = *const T` or `P = *mut T` for `T: Sized`),
95-
the inverse direction (transmuting from an integer or array of integers to `P`) is always valid.
96-
However, the pointer produced via such a transmutation may not be dereferenced (not even if `T` has size zero).
97139

98140
[Interior mutability]: ../interior-mutability.md
99141
[_Lifetime_]: ../trait-bounds.md

src/types/textual.md

Lines changed: 7 additions & 4 deletions
@@ -10,10 +10,13 @@ A value of type `char` is a [Unicode scalar value] (i.e. a code point that is
1010
not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to 0xD7FF
1111
or 0xE000 to 0x10FFFF range.
1212

13-
r[type.text.char-precondition]
14-
It is immediate [undefined behavior] to create a
15-
`char` that falls outside this range. A `[char]` is effectively a UCS-4 / UTF-32
16-
string of length 1.
13+
> [!NOTE]
14+
> It is immediate [undefined behavior] to create a
15+
> `char` that falls outside this range. A `[char]` is effectively a UCS-4 / UTF-32
16+
> string of length 1.
17+
18+
r[type.text.char-repr]
19+
A value of type `char` is represented as the value of type `u32` with value equal to the code point that it represents.
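A brief sketch of the `char` and `u32` relationship, using the standard `as` cast and `char::from_u32`. The wrapper names are hypothetical.

```rust
/// Returns the code point that `c` represents.
fn char_code_point(c: char) -> u32 {
    c as u32
}

/// Returns the `char` for `v`, or `None` when `v` is not a Unicode
/// scalar value (a surrogate, or above 0x10FFFF).
fn char_from_code_point(v: u32) -> Option<char> {
    char::from_u32(v)
}
```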
1720

1821
r[type.text.str-value]
1922
A value of type `str` is represented the same way as `[u8]`, a slice of

src/values.md

Lines changed: 0 additions & 149 deletions
@@ -2,155 +2,6 @@
22

33
r[value]
44

5-
## Bytes
6-
7-
r[value.byte]
8-
9-
r[value.byte.intro]
10-
The most basic unit of memory in Rust is a byte. All values in Rust are computed from 0 or more bytes read from an allocation.
11-
12-
> [!NOTE]
13-
> While bytes in Rust are typically lowered to hardware bytes, they may contain additional values,
14-
> such as being uninitialized, or storing part of a pointer.
15-
16-
r[value.byte.init]
17-
Each byte may be initialized, and contain a value of type `u8`, as well as an optional pointer fragment.
18-
19-
r[value.byte.uninit]
20-
Each byte may be uninitialized.
21-
22-
> [!NOTE]
23-
> Uninitialized bytes do not have a value and do not have a pointer fragment.
24-
25-
## Value Encoding
26-
27-
r[value.encoding]
28-
29-
r[value.encoding.intro]
30-
Each type in Rust has 0 or more values, which can have operations performed on them
31-
32-
> [!NOTE]
33-
> `0u8`, `1337i16`, and `Foo{bar: "baz"}` are all values
34-
35-
r[value.encoding.op]
36-
Each value of a type can be encoded into a sequence of bytes, and decoded from a sequence of bytes, which has a length equal to the size of the type.
37-
The operation to encode or decode a value is determined by the representation of the type.
38-
39-
> [!NOTE]
40-
> Representation is related to, but is not the same property as, the layout of the type.
41-
42-
r[value.encoding.decode]
43-
If a value of type `T` is decoded from a sequence of bytes that does not correspond to a defined value, the behavior is undefined. If a value of type `T` is decoded from a sequence of bytes that contain pointer fragments, which are not used to represent the value, the pointer fragments are ignored.
44-
45-
## Pointer Provenance
46-
47-
r[value.provenance]
48-
49-
r[value.provenance.intro]
50-
Pointer Provenance is a term that refers to additional data carried by pointer values in Rust, beyond its address. When stored in memory, Provenance is encoded in the Pointer Fragment part of each byte of the pointer.
51-
52-
r[value.provenance.allocation]
53-
Whenever a pointer to a particular allocation is produced by using the reference or raw reference operators, or when a pointer is returned from an allocation function, the resulting pointer has provenance that refers to that allocation.
54-
55-
> [!NOTE]
56-
> There is additional information encoded by provenance, but the exact scope of this information is not yet decided.
57-
58-
r[value.provenance.dangling]
59-
A pointer is dangling if it has no provenance, or if it has provenance to an allocation that has since been deallocated. An access, except for an access of size zero, using a dangling pointer, is undefined behavior.
60-
61-
> [!NOTE]
62-
> Allocations include local and static variables, as well as temporaries. Local Variables and Temporaries are deallocated when they go out of scope.
63-
64-
> [!WARN]
65-
> The above is necessary, but not sufficient, to avoid undefined behavior. The full requirements for pointer access are not yet decided.
66-
> A reference obtained in safe code is guaranteed to be valid for its usable lifetime, unless interfered with by unsafe code.
67-
68-
## Primitive Values
69-
70-
r[value.primitive]
71-
72-
r[value.primitive.integer]
73-
Each value of an integer type is a whole number. For unsigned types, this is a positive integer or `0`. For signed types, this can either be a positive integer, negative integer, or `0`.
74-
75-
r[value.primtive.integer-width]
76-
The range of values an integer type can represent depends on its signedness and its width, in bits. The width of type `uN` or `iN` is `N`. The width of type `usize` or `isize` is the value of the `target_pointer_width` property.
77-
78-
r[value.primitive.integer-range]
79-
The range of an unsigned integer type of width `N` is between `0` and `(1<<N) - 1` inclusive. The range of a signed integer type of width `N` is between `-(1<<(N-1))` and `(1<<(N-1)) - 1` inclusive.
80-
81-
> [!NOTE]
82-
> There are exactly `1<<N` unique values of an integer type of width `N`.
83-
84-
r[value.primitive.unsigned-repr]
85-
A value `i` of an unsigned integer type `U` is represented by a sequence of initialized bytes, where the `m`th successive byte according to the byte order of the platform is `(i >> (m*8)) as u8`, where `m` is between `0` (inclusive) and the size of `U` in bytes (exclusive). None of the bytes produced by encoding an unsigned integer has a pointer fragment.
86-
87-
> [!NOTE]
88-
> The two primary byte orders are `little` endian, where the bytes are ordered from lowest memory address to highest, and `big` endian, where the bytes are ordered from highest memory address to lowest.
89-
> The `cfg` predicate `target_endian` indicates the byte order
90-
91-
> [!WARN]
92-
> On `little` endian, the order of bytes used to decode an integer type is the same as the natural order of a `u8` array - that is, the `m` value corresponds with the `m` index into a same-sized `u8` array. On `big` endian, however, the order is the opposite of this order - that is, the `m` value corresponds with the `size_of::<T>() - 1 - m` index in that array.
93-
94-
r[value.primitive.signed-repr]
95-
A value `i` of a signed integer type with width `N` is represented the same as the corresponding value of the unsigned counterpart type which is congruent modulo `2^N`.
96-
97-
r[value.primitive.char]
98-
Each value of type `char` is a Unicode Scalar Value, between `U+0000` and `U+10FFFF` (excluding the surrogate range `U+D800` through `U+DFFF`).
99-
100-
r[value.primitive.char-repr]
101-
The representation of type `char` is the same as the representation of the `u32` corresponding to the Code Point Number encoding by the `char`.
102-
103-
r[value.primitive.bool]
104-
The two values of type `bool` are `true` and `false`. The representation of `true` is an initialized byte with value `0x01`, and the representation of `false` is an initialized byte with value `0x00`. Neither value is represented with a pointer fragment.
105-
106-
r[value.primitive.float]
107-
A floating-point value consists of either a rational number, which is within the range and precision dictated by the type, an infinity, or a NaN value.
108-
109-
r[value.primitive.float-repr]
110-
A floating-point value is represented the same as a value of the unsigned integer type with the same width given by its [IEEE 754-2019] encoding.
111-
112-
r[value.primitive.float-format]
113-
The [IEEE 754-2019] `binary32` format is used for `f32`, and the `binary64` format is used for `f64`.
114-
115-
[IEEE 754-2019]: https://ieeexplore.ieee.org/document/8766229
116-
117-
## Pointer Value
118-
119-
r[value.pointer]
120-
121-
r[value.pointer.thin]
122-
Each thin pointer consists of an address and an optional provenance. The address refers to which byte the pointer points to. The provenance refers to which bytes the pointer is allowed to access, and the allocation those bytes are within.
123-
124-
> [!NOTE]
125-
> A pointer that does not have a provenance may be called an invalid or dangling pointer.
126-
127-
r[value.pointer.thin-repr]
128-
The representation of a value of a thin pointer is a sequence of initialized bytes with `u8` values given by the representation of its address as a value of type `usize`, and pointer fragments corresponding to its provenance, if present.
129-
130-
r[value.pointer.thin-ref]
131-
A thin reference to `T` consists of a non-null, well aligned address, and provenance for `size_of::<T>()` bytes starting from that address. The representation of a thin reference to `T` is the same as the pointer with the same address and provenance.
132-
133-
> [!NOTE]
134-
> This is true for both shared and mutable references. There are additional constraints enforced by the aliasing model.
135-
136-
r[value.pointer.wide]
137-
A wide pointer or reference consists of a data pointer or reference, and a pointee-specific metadata value.
138-
139-
r[value.pointer.wide-reference]
140-
The data pointer of a wide reference has a non-null address, well aligned for `align_of_val(self)`, and with provenance for `size_of_val(self)` bytes.
141-
142-
r[value.pointer.wide-representation]
143-
A wide pointer or reference is represented the same as `struct WidePointer<M>{data: *mut (), metadata: M}` where `M` is the pointee metadata type, and the `data` and `metadata` fields are the corresponding parts of the pointer.
144-
145-
> [!NOTE]
146-
> The `WidePointer` struct has no guarantees about layout, and has the default representation.
147-
148-
r[value.pointer.fn]
149-
A value of a function pointer type consists of a non-null address. A function pointer value is represented the same as an address represented as an unsigned integer type with the same width as the function pointer.
150-
151-
> [!NOTE]
152-
> Whether or not a function pointer value has provenance, and whether or not this provenance is represented as pointer fragments, is not yet decided.
153-
1545
## Aggregate Values
1556

1567
r[value.aggregate]
