Commit c8da0a4

committed
Add Values and Representation chapter
1 parent 8676ab7 commit c8da0a4

File tree

2 files changed

+176
-0
lines changed

src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -99,6 +99,7 @@
99 99
- [Type coercions](type-coercions.md)
100 100
- [Destructors](destructors.md)
101 101
- [Lifetime elision](lifetime-elision.md)
102+
- [Values and Representation](values.md)
102 103

103 104
- [Special types and traits](special-types-and-traits.md)
104 105

src/values.md

Lines changed: 175 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,175 @@
1+
# Values and Representation
2+
3+
r[value]
4+
5+
## Bytes
6+
7+
r[value.byte]
8+
9+
r[value.byte.intro]
10+
The most basic unit of memory in Rust is a byte. All values in Rust are computed from 0 or more bytes read from an allocation.
11+
12+
> [!NOTE]
13+
> While bytes in Rust are typically lowered to hardware bytes, they may carry additional state,
14+
> such as being uninitialized or storing part of a pointer.
15+
16+
r[value.byte.init]
17+
A byte may be initialized, in which case it contains a value of type `u8` and, optionally, a pointer fragment.
18+
19+
r[value.byte.uninit]
20+
Each byte may be uninitialized.
21+
22+
> [!NOTE]
23+
> Uninitialized bytes do not have a value and do not have a pointer fragment.
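
The byte model above can be sketched as a small data type. This is an illustrative model only, not an API of the language or standard library: the names `AbstractByte` and `PointerFragment`, and the fields of the fragment, are hypothetical and chosen for this example.

```rust
/// Hypothetical stand-in for a pointer fragment: which provenance it belongs
/// to, and which byte of the pointer it is. Not a real Rust API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PointerFragment {
    pub provenance_id: u64,
    pub index: u8,
}

/// Hypothetical model of one byte of Rust memory.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AbstractByte {
    /// No value and no pointer fragment.
    Uninit,
    /// A `u8` value plus an optional pointer fragment.
    Init(u8, Option<PointerFragment>),
}

impl AbstractByte {
    /// The `u8` value, if the byte is initialized.
    pub fn value(self) -> Option<u8> {
        match self {
            AbstractByte::Uninit => None,
            AbstractByte::Init(v, _) => Some(v),
        }
    }
}
```

In this sketch an uninitialized byte yields `None`, which mirrors the statement that such a byte has neither a value nor a pointer fragment.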
24+
25+
## Value Encoding
26+
27+
r[value.encoding]
28+
29+
r[value.encoding.intro]
30+
Each type in Rust has 0 or more values, which can have operations performed on them.
31+
32+
> [!NOTE]
33+
> `0u8`, `1337i16`, and `Foo { bar: "baz" }` are all values.
34+
35+
r[value.encoding.op]
36+
Each value of a type can be encoded into, and decoded from, a sequence of bytes whose length is equal to the size of the type.
37+
The operation to encode or decode a value is determined by the representation of the type.
38+
39+
> [!NOTE]
40+
> Representation is related to, but is not the same property as, the layout of the type.
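
As a minimal sketch of encoding and decoding, the following uses `u32` and the standard `to_ne_bytes` and `from_ne_bytes` methods. The function names `encode_u32` and `decode_u32` are hypothetical helpers introduced for this example.

```rust
use std::mem::size_of;

/// Encodes a `u32` into the bytes of its in-memory representation.
/// The array length is exactly the size of the type.
pub fn encode_u32(value: u32) -> [u8; size_of::<u32>()] {
    value.to_ne_bytes()
}

/// Decodes a `u32` from the bytes of its in-memory representation.
pub fn decode_u32(bytes: [u8; size_of::<u32>()]) -> u32 {
    u32::from_ne_bytes(bytes)
}
```

Decoding the result of encoding returns the original value, for example `decode_u32(encode_u32(7)) == 7`.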
41+
42+
r[value.encoding.decode]
43+
If a value of type `T` is decoded from a sequence of bytes that does not correspond to a defined value, the behavior is undefined. If a value of type `T` is decoded from a sequence of bytes that contains pointer fragments that are not used to represent the value, those pointer fragments are ignored.
44+
45+
## Primitive Values
46+
47+
r[value.primitive]
48+
49+
r[value.primitive.integer]
50+
Each value of an integer type is a whole number. For unsigned types, this is a positive integer or `0`. For signed types, this is a positive integer, a negative integer, or `0`.
51+
52+
r[value.primitive.integer-width]
53+
The range of values an integer type can represent depends on its signedness and its width, in bits. The width of type `uN` or `iN` is `N`. The width of type `usize` or `isize` is the value of the `target_pointer_width` property.
54+
55+
r[value.primitive.integer-range]
56+
The range of an unsigned integer type of width `N` is between `0` and `(1<<N) - 1` inclusive. The range of a signed integer type of width `N` is between `-(1<<(N-1))` and `(1<<(N-1)) - 1` inclusive.
57+
58+
> [!NOTE]
59+
> There are exactly `1<<N` unique values of an integer type of width `N`.
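
The range rule can be checked against the standard `MIN` and `MAX` constants. The sketch below assumes widths from 1 to 64 and computes in 128-bit integers so the shifts cannot overflow. The function names are hypothetical. Note the parentheses around each shift: in Rust, `-` binds more tightly than `<<`, so `1 << N - 1` would parse as `1 << (N - 1)`.

```rust
/// Smallest and largest value of an unsigned integer type of width `n` bits.
/// Assumes 1 <= n <= 64.
pub fn unsigned_range(n: u32) -> (u128, u128) {
    (0, (1u128 << n) - 1)
}

/// Smallest and largest value of a signed integer type of width `n` bits.
/// Assumes 1 <= n <= 64.
pub fn signed_range(n: u32) -> (i128, i128) {
    (-(1i128 << (n - 1)), (1i128 << (n - 1)) - 1)
}
```

For `n = 8` these give `(0, 255)` and `(-128, 127)`, matching `u8` and `i8`.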
60+
61+
r[value.primitive.unsigned-repr]
62+
A value `i` of an unsigned integer type `U` is represented by a sequence of initialized bytes, where the `m`th successive byte according to the byte order of the platform is `(i >> (m*8)) as u8`, for each `m` from `0` up to, but not including, the size of `U` in bytes. None of the bytes produced by encoding an unsigned integer has a pointer fragment.
63+
64+
> [!NOTE]
65+
> The two primary byte orders are `little` endian, where successive bytes are stored at increasing memory addresses (least significant byte at the lowest address), and `big` endian, where successive bytes are stored at decreasing memory addresses (most significant byte at the lowest address).
66+
> The `cfg` predicate `target_endian` indicates the byte order.
67+
68+
> [!WARNING]
69+
> On `little` endian, the order of bytes used to encode or decode an integer type is the same as the natural order of a `u8` array: the `m`th byte is at index `m` of a same-sized `u8` array. On `big` endian, the order is reversed: the `m`th byte is at index `size_of::<T>() - 1 - m` of that array.
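
The relationship between the `m`th byte and its array index can be sketched for `u32` as follows. The helper names are hypothetical. `to_ne_bytes` and `cfg!(target_endian = "little")` are standard.

```rust
/// The `m`th byte of `i` in order of significance: `(i >> (m * 8)) as u8`.
pub fn significance_byte(i: u32, m: usize) -> u8 {
    (i >> (m * 8)) as u8
}

/// Index into the in-memory byte array at which byte `m` is stored.
pub fn memory_index(m: usize) -> usize {
    if cfg!(target_endian = "little") {
        m
    } else {
        std::mem::size_of::<u32>() - 1 - m
    }
}

/// Checks that every byte of the native representation of `i` sits at the
/// index predicted by `memory_index`.
pub fn check_byte_order(i: u32) -> bool {
    let bytes = i.to_ne_bytes();
    (0..bytes.len()).all(|m| bytes[memory_index(m)] == significance_byte(i, m))
}
```

Independently of the target, `0x1122_3344u32.to_le_bytes()` is `[0x44, 0x33, 0x22, 0x11]` and `to_be_bytes()` is `[0x11, 0x22, 0x33, 0x44]`.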
70+
71+
r[value.primitive.signed-repr]
72+
A value `i` of a signed integer type of width `N` is represented the same as the value of the corresponding unsigned type that is congruent to `i` modulo `2^N`.
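
As a small check of the signed rule for `N = 8`, the sketch below computes the unsigned counterpart arithmetically and compares the bytes of both values. The function names are hypothetical.

```rust
/// Returns the `u8` value that is congruent to `i` modulo 2^8.
pub fn unsigned_counterpart(i: i8) -> u8 {
    // Widen first so that 2^8 = 256 is representable, then take the
    // non-negative remainder.
    i16::from(i).rem_euclid(256) as u8
}

/// Checks, for every `i8`, that its bytes equal those of its unsigned counterpart.
pub fn signed_matches_unsigned() -> bool {
    (i8::MIN..=i8::MAX).all(|i| i.to_ne_bytes() == unsigned_counterpart(i).to_ne_bytes())
}
```

For instance, `-1i8` maps to `255u8`, and both are stored as the single byte `0xFF`.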
73+
74+
r[value.primitive.char]
75+
Each value of type `char` is a Unicode Scalar Value, between `U+0000` and `U+10FFFF` (excluding the surrogate range `U+D800` through `U+DFFF`).
76+
77+
r[value.primitive.char-repr]
78+
The representation of type `char` is the same as the representation of the `u32` corresponding to the Code Point Number encoded by the `char`.
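
A short sketch of the `char` rules using the standard `u32::from` and `char::from_u32` conversions. The wrapper names `encode_char` and `decode_char` are hypothetical.

```rust
/// Encodes a `char` as the `u32` holding its code point number.
pub fn encode_char(c: char) -> u32 {
    u32::from(c)
}

/// Decodes a `u32` into a `char`, returning `None` when the number is not a
/// Unicode scalar value (a surrogate, or above U+10FFFF).
pub fn decode_char(code: u32) -> Option<char> {
    char::from_u32(code)
}
```

Here `decode_char(0xD800)` and `decode_char(0x11_0000)` both return `None`, since neither number is a value of type `char`.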
79+
80+
r[value.primitive.bool]
81+
The two values of type `bool` are `true` and `false`. The representation of `true` is an initialized byte with value `0x01`, and the representation of `false` is an initialized byte with value `0x00`. Neither value is represented with a pointer fragment.
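
The `bool` encoding can be sketched with a checked decoder. `decode_bool` and `encode_bool` are hypothetical helpers. The checked form returns `None` for the byte values that have no defined `bool` value, rather than decoding them.

```rust
/// Encodes a `bool` as its single-byte representation.
pub fn encode_bool(b: bool) -> u8 {
    b as u8
}

/// Decodes a `bool` from one initialized byte, returning `None` for byte
/// values that do not correspond to a defined `bool` value.
pub fn decode_bool(byte: u8) -> Option<bool> {
    match byte {
        0x00 => Some(false),
        0x01 => Some(true),
        // Decoding any other byte directly as `bool` would be undefined behavior.
        _ => None,
    }
}
```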
82+
83+
## Pointer Values
84+
85+
r[value.pointer]
86+
87+
r[value.pointer.thin]
88+
Each thin pointer consists of an address and an optional provenance. The address refers to which byte the pointer points to. The provenance refers to which bytes the pointer is allowed to access, and the allocation those bytes are within.
89+
90+
> [!NOTE]
91+
> A pointer that does not have a provenance may be called an invalid or dangling pointer.
92+
93+
r[value.pointer.thin-repr]
94+
The representation of a thin pointer value is a sequence of initialized bytes whose `u8` values are given by the representation of its address as a value of type `usize`, and whose pointer fragments correspond to its provenance, if present.
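
The address part of a thin pointer can be inspected as sketched below. The helper name `address_bytes` is hypothetical. The cast to `usize` yields only the address, so the provenance is not visible in the result.

```rust
use std::mem::size_of;

/// Returns the `u8` values of a thin pointer's representation, obtained from
/// its address as a `usize`. Provenance is not part of the returned bytes.
pub fn address_bytes<T>(ptr: *const T) -> [u8; size_of::<usize>()] {
    (ptr as usize).to_ne_bytes()
}
```

Decoding the returned bytes with `usize::from_ne_bytes` gives back the same address as `ptr as usize`.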
95+
96+
r[value.pointer.thin-ref]
97+
A thin reference to `T` consists of a non-null, well aligned address, and provenance for `size_of::<T>()` bytes starting from that address. The representation of a thin reference to `T` is the same as the pointer with the same address and provenance.
98+
99+
> [!NOTE]
100+
> This is true for both shared and mutable references. There are additional constraints enforced by the aliasing model.
101+
102+
r[value.pointer.fat]
103+
A fat pointer or reference consists of a data pointer or reference, and a pointee-specific metadata value.
104+
105+
r[value.pointer.fat-reference]
106+
The data pointer of a fat reference has a non-null address, well aligned for `align_of_val(self)`, and with provenance for `size_of_val(self)` bytes.
107+
108+
r[value.pointer.fat-representation]
109+
A fat pointer or reference is represented the same as `struct FatPointer<M>{data: *mut (), metadata: M}` where `M` is the pointee metadata type, and the `data` and `metadata` fields are the corresponding parts of the pointer.
110+
111+
> [!NOTE]
112+
> The `FatPointer` struct has no guarantees about layout, and has the default representation.
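
For a slice reference, the data pointer and the metadata (the element count) can be separated and recombined with standard library calls. This is a sketch for slices only. The function names are hypothetical, and nothing here relies on the field order or layout of the fat pointer itself.

```rust
/// Splits a slice reference into its data pointer and its metadata (length).
pub fn slice_parts(s: &[u16]) -> (*const u16, usize) {
    (s.as_ptr(), s.len())
}

/// Rebuilds an equivalent slice reference from the two parts.
pub fn rebuild(s: &[u16]) -> &[u16] {
    let (data, len) = slice_parts(s);
    // SAFETY: `data` and `len` come from a live slice that stays borrowed
    // for the lifetime of the returned reference.
    unsafe { std::slice::from_raw_parts(data, len) }
}
```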
113+
114+
r[value.pointer.fn]
115+
A value of a function pointer type consists of a non-null address. A function pointer value is represented the same as an address represented as an unsigned integer type with the same width as the function pointer.
116+
117+
> [!NOTE]
118+
> Whether or not a function pointer value has provenance, and whether or not this provenance is represented as pointer fragments, is not yet decided.
119+
120+
## Aggregate Values
121+
122+
r[value.aggregate]
123+
124+
r[value.aggregate.value-bytes]
125+
A byte `b` in the representation of an aggregate is a value byte if there exists a field of that aggregate such that:
126+
* The field has some type `T`,
127+
* The offset of that field `o` is such that `b` falls at an offset in `o..(o+size_of::<T>())`,
128+
* Either `T` is a primitive type, or the byte at the offset of `b` within the field is a value byte in the representation of `T`.
129+
130+
> [!NOTE]
131+
> A byte in a union is a value byte if it is a value byte in *any* field.
132+
133+
r[value.aggregate.padding]
134+
Every byte in an aggregate which is not a value byte is a padding byte.
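
The value-byte and padding-byte rules can be sketched for a concrete `repr(C)` struct whose fields are both primitive. The type `Sample` and the helper names are hypothetical. Field offsets are measured from a live value rather than assumed.

```rust
use std::mem::size_of;

#[repr(C)]
pub struct Sample {
    pub a: u8,
    pub b: u32,
}

/// Half-open byte ranges `(start, end)` occupied by each field of `Sample`.
pub fn field_ranges() -> [(usize, usize); 2] {
    let s = Sample { a: 0, b: 0 };
    let base = &s as *const Sample as usize;
    let a_off = &s.a as *const u8 as usize - base;
    let b_off = &s.b as *const u32 as usize - base;
    [(a_off, a_off + size_of::<u8>()), (b_off, b_off + size_of::<u32>())]
}

/// A byte is a value byte if it falls inside some field (both fields are primitive).
pub fn is_value_byte(offset: usize) -> bool {
    field_ranges().iter().any(|&(start, end)| (start..end).contains(&offset))
}

/// Every remaining byte of the struct is a padding byte.
pub fn padding_offsets() -> Vec<usize> {
    (0..size_of::<Sample>()).filter(|&o| !is_value_byte(o)).collect()
}
```

On targets where `u32` has 4-byte alignment, the padding bytes of `Sample` are the ones at offsets 1, 2, and 3.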
135+
136+
r[value.aggregate.struct]
137+
A value of a struct type consists of the values of each of its fields.
138+
The representation of such a struct contains the representation of the value of each field at its corresponding offset.
139+
140+
r[value.aggregate.union]
141+
A value of a union type consists of a sequence of bytes, corresponding to each value byte. The value bytes of a union are represented exactly.
142+
143+
> [!NOTE]
144+
> When a union value is constructed, or a field is read from or written to, the value of that field is encoded or decoded appropriately.
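
A minimal sketch of a union whose bytes are written through one field and read through another. The type `Bits` and the function `int_to_bytes` are hypothetical names for this example.

```rust
#[repr(C)]
pub union Bits {
    pub int: u32,
    pub bytes: [u8; 4],
}

/// Encodes `value` through the `int` field, then decodes the same four bytes
/// through the `bytes` field.
pub fn int_to_bytes(value: u32) -> [u8; 4] {
    let u = Bits { int: value };
    // SAFETY: all four bytes were initialized by writing `int`, and every
    // initialized byte pattern is a valid `[u8; 4]`.
    unsafe { u.bytes }
}
```

The result equals `value.to_ne_bytes()`, because the union stores the bytes exactly as the `u32` field encoded them.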
145+
146+
r[value.aggregate.padding-uninit]
147+
When a value of an aggregate is encoded, each padding byte is left as uninit.
148+
149+
> [!NOTE]
150+
> It is valid for padding bytes to hold a value other than uninit when decoded, and these bytes are ignored when decoding an aggregate.
151+
152+
r[value.aggregate.tuple-array]
153+
The fields of a tuple or an array are the elements of that tuple or array.
154+
155+
## Enum Values
156+
157+
r[value.enum]
158+
159+
r[value.enum.intro]
160+
An enum value corresponds to exactly one variant of the enum, and consists of the fields of that variant.
161+
162+
> [!NOTE]
163+
> An enum with no variants therefore has no values.
164+
165+
r[value.enum.variant-padding]
166+
A byte is a padding byte in a variant `V` if the byte is not used for computing the discriminant, and the byte would be a padding byte in a struct consisting of the fields of the variant at the same offsets.
167+
168+
r[value.enum.value-padding]
169+
A byte is a padding byte of an enum if it is a padding byte in each variant of the enum. A byte that is not a padding byte of an enum is a value byte.
170+
171+
r[value.enum.repr]
172+
The representation of a value of an enum type includes the representation of each field of the variant at the appropriate offsets. When encoding a value of an enum type, each byte which is a padding byte in the variant is set to uninit. In the case of a [`repr(C)`][layout.repr.c.adt] or a [primitive-repr][layout.repr.primitive.adt] enum, the discriminant of the variant is represented as though by the appropriate integer type stored at offset 0.
173+
174+
> [!NOTE]
175+
> Most `repr(Rust)` enums will also store a discriminant in the representation of the enum, but the exact placement or type of the discriminant is unspecified, as is the value that represents each variant.
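
For a primitive-repr enum, the discriminant stored at offset 0 can be read back as sketched below. The enum `Shape` and the function `tag_byte` are hypothetical. Only the byte at offset 0 is read, because the remaining byte is a padding byte for the `Dot` variant.

```rust
#[repr(u8)]
pub enum Shape {
    Dot = 1,
    Line(u8) = 4,
}

/// Reads the discriminant byte of a `repr(u8)` enum value.
pub fn tag_byte(shape: &Shape) -> u8 {
    // SAFETY: for a `repr(u8)` enum the discriminant is represented as a
    // `u8` stored at offset 0, so that byte is always initialized.
    unsafe { *(shape as *const Shape as *const u8) }
}
```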
