Commit b1f40cf

Merge pull request #106 from Gankro/2018-cleanups-1
cleanups for Rust 2018
2 parents c11cd6d + 7f019ec commit b1f40cf

5 files changed

+163
-71
lines changed

src/README.md

Lines changed: 7 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -2,20 +2,20 @@
22

33
#### The Dark Arts of Advanced and Unsafe Rust Programming
44

5-
# NOTE: This is a draft document that discusses several unstable aspects of Rust, and may contain serious errors or outdated information.
6-
75
> Instead of the programs I had hoped for, there came only a shuddering blackness
86
and ineffable loneliness; and I saw at last a fearful truth which no one had
97
ever dared to breathe before — the unwhisperable secret of secrets — The fact
108
that this language of stone and stridor is not a sentient perpetuation of Rust
119
as London is of Old London and Paris of Old Paris, but that it is in fact
12-
quite unsafe, its sprawling body imperfectly embalmed and infested with queer
10+
quite `unsafe`, its sprawling body imperfectly embalmed and infested with queer
1311
animate things which have nothing to do with it as it was in compilation.
1412

15-
This book digs into all the awful details that are necessary to understand in
16-
order to write correct Unsafe Rust programs. Due to the nature of this problem,
17-
it may lead to unleashing untold horrors that shatter your psyche into a billion
18-
infinitesimal fragments of despair.
13+
This book digs into all the awful details that you need to understand when
14+
writing Unsafe Rust programs.
15+
16+
> THE KNOWLEDGE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
17+
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF UNLEASHING INDESCRIBABLE HORRORS THAT
18+
SHATTER YOUR PSYCHE AND SET YOUR MIND ADRIFT IN THE UNKNOWABLY INFINITE COSMOS.
1919

2020
Should you wish a long and happy career of writing Rust programs, you should
2121
turn back now and forget you ever saw this book. It is not necessary. However

src/data.md

Lines changed: 14 additions & 0 deletions
@@ -3,3 +3,17 @@
33
Low-level programming cares a lot about data layout. It's a big deal. It also
44
pervasively influences the rest of the language, so we're going to start by
55
digging into how data is represented in Rust.
6+
7+
This chapter is ideally in agreement with, and rendered redundant by,
8+
the [Type Layout section of the Reference][ref-type-layout]. When this
9+
book was first written, the reference was in complete disrepair, and the
10+
Rustonomicon was attempting to serve as a partial replacement for the reference.
11+
This is no longer the case, so this whole chapter can ideally be deleted.
12+
13+
We'll keep this chapter around for a bit longer, but ideally you should be
14+
contributing any new facts or improvements to the Reference instead.
15+
16+
17+
18+
19+
[ref-type-layout]: ../reference/type-layout.html

src/exotic-sizes.md

Lines changed: 85 additions & 27 deletions
@@ -1,69 +1,102 @@
11
# Exotically Sized Types
22

3-
Most of the time, we think in terms of types with a fixed, positive size. This
4-
is not always the case, however.
3+
Most of the time, we expect types to have a statically known and positive size.
4+
This isn't always the case in Rust.
55

66

77

88

99

1010
# Dynamically Sized Types (DSTs)
1111

12-
Rust in fact supports Dynamically Sized Types (DSTs): types without a statically
12+
Rust supports Dynamically Sized Types (DSTs): types without a statically
1313
known size or alignment. On the surface, this is a bit nonsensical: Rust *must*
1414
know the size and alignment of something in order to correctly work with it! In
15-
this regard, DSTs are not normal types. Due to their lack of a statically known
16-
size, these types can only exist behind some kind of pointer. Any pointer to a
17-
DST consequently becomes a *fat* pointer consisting of the pointer and the
15+
this regard, DSTs are not normal types. Because they lack a statically known
16+
size, these types can only exist behind a pointer. Any pointer to a
17+
DST consequently becomes a *wide* pointer consisting of the pointer and the
1818
information that "completes" them (more on this below).
1919

20-
There are two major DSTs exposed by the language: trait objects, and slices.
20+
There are two major DSTs exposed by the language:
21+
22+
* trait objects: `dyn MyTrait`
23+
* slices: `[T]`, `str`, and others
2124

2225
A trait object represents some type that implements the traits it specifies.
2326
The exact original type is *erased* in favor of runtime reflection
2427
with a vtable containing all the information necessary to use the type.
25-
This is the information that completes a trait object: a pointer to its vtable.
28+
The information that completes a trait object pointer is the vtable pointer.
29+
The runtime size of the pointee can be dynamically requested from the vtable.
2630

2731
A slice is simply a view into some contiguous storage -- typically an array or
28-
`Vec`. The information that completes a slice is just the number of elements
29-
it points to.
32+
`Vec`. The information that completes a slice pointer is just the number of elements
33+
it points to. The runtime size of the pointee is just the statically known size
34+
of an element multiplied by the number of elements.
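The two kinds of wide pointer described above can be observed with `std::mem::size_of` and `size_of_val`. This is a minimal sketch; the helper names `slice_bytes` and `erased_bytes` are made up for illustration, and the "two words" size of a wide pointer is what current compilers produce rather than a documented guarantee.

```rust
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

/// Bytes occupied by the slice a `&[u32]` points to:
/// the element size multiplied by the length stored in the pointer.
fn slice_bytes(s: &[u32]) -> usize {
    size_of_val(s)
}

/// Bytes occupied by the erased value behind a trait object.
/// The size is read from the vtable at runtime.
fn erased_bytes(d: &dyn Debug) -> usize {
    size_of_val(d)
}

fn main() {
    // A thin pointer is one word; wide pointers carry an extra word.
    println!("&u32       = {} bytes", size_of::<&u32>());
    println!("&[u32]     = {} bytes", size_of::<&[u32]>());
    println!("&dyn Debug = {} bytes", size_of::<&dyn Debug>());

    let v = vec![1u32, 2, 3];
    println!("slice pointee  = {} bytes", slice_bytes(&v));
    println!("erased pointee = {} bytes", erased_bytes(&7u64));
}
```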
3035

3136
Structs can actually store a single DST directly as their last field, but this
3237
makes them a DST as well:
3338

3439
```rust
3540
// Can't be stored on the stack directly
36-
struct Foo {
41+
struct MySuperSlice {
3742
    info: u32,
3843
    data: [u8],
3944
}
4045
```
4146

47+
Such a type is largely useless without a way to construct it, though. Currently the
48+
only properly supported way to create a custom DST is by making your type generic
49+
and performing an *unsizing coercion*:
50+
51+
```rust
52+
struct MySuperSliceable<T: ?Sized> {
53+
    info: u32,
54+
    data: T
55+
}
56+
57+
fn main() {
58+
    let sized: MySuperSliceable<[u8; 8]> = MySuperSliceable {
59+
        info: 17,
60+
        data: [0; 8],
61+
    };
62+
63+
    let dynamic: &MySuperSliceable<[u8]> = &sized;
64+
65+
    // prints: "17 [0, 0, 0, 0, 0, 0, 0, 0]"
66+
    println!("{} {:?}", dynamic.info, &dynamic.data);
67+
}
68+
```
69+
70+
(Yes, custom DSTs are a largely half-baked feature for now.)
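To see that the coerced reference really is a wide pointer, the following sketch reuses the `MySuperSliceable` type from the diff. The helper `dynamic_size` is hypothetical; it only wraps `std::mem::size_of_val`.

```rust
use std::mem::{size_of, size_of_val};

struct MySuperSliceable<T: ?Sized> {
    info: u32,
    data: T,
}

/// Runtime size of the custom DST, computed from the length
/// carried in the wide reference.
fn dynamic_size(d: &MySuperSliceable<[u8]>) -> usize {
    size_of_val(d)
}

fn main() {
    let sized = MySuperSliceable { info: 17, data: [0u8; 8] };
    // Unsizing coercion: the array length moves from the type into the pointer.
    let dynamic: &MySuperSliceable<[u8]> = &sized;

    println!("info = {}, len = {}", dynamic.info, dynamic.data.len());
    println!("reference size = {} bytes", size_of::<&MySuperSliceable<[u8]>>());
    println!("pointee size   = {} bytes", dynamic_size(dynamic));
}
```

The pointee size reported through the wide reference matches `size_of` of the sized original, since both describe the same value.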
71+
72+
73+
74+
4275

4376
# Zero Sized Types (ZSTs)
4477

45-
Rust actually allows types to be specified that occupy no space:
78+
Rust also allows types to be specified that occupy no space:
4679

4780
```rust
48-
struct Foo; // No fields = no size
81+
struct Nothing; // No fields = no size
4982

5083
// All fields have no size = no size
51-
struct Baz {
52-
    foo: Foo,
84+
struct LotsOfNothing {
85+
    foo: Nothing,
5386
    qux: (), // empty tuple has no size
5487
    baz: [u8; 0], // empty array has no size
5588
}
5689
```
5790

5891
On their own, Zero Sized Types (ZSTs) are, for obvious reasons, pretty useless.
5992
However as with many curious layout choices in Rust, their potential is realized
60-
in a generic context: Rust largely understands that any operation that produces
61-
or stores a ZST can be reduced to a no-op. First off, storing it doesn't even
62-
make sense -- it doesn't occupy any space. Also there's only one value of that
63-
type, so anything that loads it can just produce it from the aether -- which is
93+
in a generic context: Rust largely understands that any operation that produces
94+
or stores a ZST can be reduced to a no-op. First off, storing it doesn't even
95+
make sense -- it doesn't occupy any space. Also there's only one value of that
96+
type, so anything that loads it can just produce it from the aether -- which is
6497
also a no-op since it doesn't occupy any space.
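A quick way to check the "no size" claim is to ask the compiler. The sketch below reuses the `Nothing` and `LotsOfNothing` types from the diff; `count_units` is a made-up helper showing that a collection of ZSTs still tracks its length even though the elements occupy no storage.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Nothing;

#[allow(dead_code)]
struct LotsOfNothing {
    foo: Nothing,
    qux: (),
    baz: [u8; 0],
}

/// Pushes `n` unit values into a `Vec` and reports its length.
/// The elements themselves need no storage; only the count changes.
fn count_units(n: usize) -> usize {
    let mut v = Vec::new();
    for _ in 0..n {
        v.push(());
    }
    v.len()
}

fn main() {
    println!("Nothing:       {} bytes", size_of::<Nothing>());
    println!("LotsOfNothing: {} bytes", size_of::<LotsOfNothing>());
    println!("units stored:  {}", count_units(1_000));
}
```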
6598

66-
One of the most extreme example's of this is Sets and Maps. Given a
99+
One of the most extreme examples of this is Sets and Maps. Given a
67100
`Map<Key, Value>`, it is common to implement a `Set<Key>` as just a thin wrapper
68101
around `Map<Key, UselessJunk>`. In many languages, this would necessitate
69102
allocating space for UselessJunk and doing work to store and load UselessJunk
@@ -78,9 +111,8 @@ support values.
78111

79112
Safe code need not worry about ZSTs, but *unsafe* code must be careful about the
80113
consequence of types with no size. In particular, pointer offsets are no-ops,
81-
and standard allocators (including jemalloc, the one used by default in Rust)
82-
may return `nullptr` when a zero-sized allocation is requested, which is
83-
indistinguishable from out of memory.
114+
and standard allocators may return `null` when a zero-sized allocation is
115+
requested, which is indistinguishable from the out of memory result.
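One common way to sidestep the ambiguity is to never ask the allocator for zero bytes at all. The following is only a sketch of that idea: `alloc_one` and `free_one` are hypothetical helpers, and using `NonNull::dangling()` as the placeholder for zero-sized values is an assumption about how a container might be written, not something this commit prescribes.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem::size_of;
use std::ptr::NonNull;

/// Returns storage for one `T`. For a ZST no allocation is requested,
/// so `None` can only ever mean "out of memory".
fn alloc_one<T>() -> Option<NonNull<T>> {
    if size_of::<T>() == 0 {
        // Well-aligned, non-null placeholder; it must never be freed.
        return Some(NonNull::dangling());
    }
    let layout = Layout::new::<T>();
    // SAFETY: `layout` has a non-zero size, as `alloc` requires.
    let raw = unsafe { alloc(layout) } as *mut T;
    NonNull::new(raw)
}

/// Releases storage obtained from `alloc_one`.
///
/// # Safety
/// `ptr` must come from `alloc_one::<T>` and must not be used afterwards.
unsafe fn free_one<T>(ptr: NonNull<T>) {
    if size_of::<T>() != 0 {
        dealloc(ptr.as_ptr() as *mut u8, Layout::new::<T>());
    }
}

fn main() {
    let unit = alloc_one::<()>().expect("ZSTs never need memory");
    let word = alloc_one::<u64>().expect("out of memory");
    unsafe {
        word.as_ptr().write(5);
        println!("stored {}", word.as_ptr().read());
        free_one(word);
        free_one(unit);
    }
}
```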
84116

85117

86118

@@ -97,7 +129,7 @@ enum Void {} // No variants = EMPTY
97129
```
98130

99131
Empty types are even more marginal than ZSTs. The primary motivating example for
100-
Void types is type-level unreachability. For instance, suppose an API needs to
132+
an empty type is type-level unreachability. For instance, suppose an API needs to
101133
return a Result in general, but a specific case actually is infallible. It's
102134
actually possible to communicate this at the type level by returning a
103135
`Result<T, Void>`. Consumers of the API can confidently unwrap such a Result
@@ -125,9 +157,35 @@ But this trick doesn't work yet.
125157

126158
One final subtle detail about empty types is that raw pointers to them are
127159
actually valid to construct, but dereferencing them is Undefined Behavior
128-
because that doesn't actually make sense. That is, you could model C's `void *`
129-
type with `*const Void`, but this doesn't necessarily gain anything over using
130-
e.g. `*const ()`, which *is* safe to randomly dereference.
160+
because that wouldn't make sense.
161+
162+
We recommend against modelling C's `void*` type with `*const Void`.
163+
A lot of people started doing that but quickly ran into trouble because
164+
Rust doesn't really have any safety guards against trying to instantiate
165+
empty types with unsafe code, and if you do it, it's Undefined Behavior.
166+
This was especially problematic because developers had a habit of converting
167+
raw pointers to references and `&Void` is *also* Undefined Behavior to
168+
construct.
169+
170+
`*const ()` (or equivalent) works reasonably well for `void*`, and can be made
171+
into a reference without any safety problems. It still doesn't prevent you from
172+
trying to read or write values, but at least it compiles to a no-op instead
173+
of UB.
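As an illustration of the recommendation, here is a sketch of a C-style callback that receives its context as `*mut ()`. The `Callback` alias and the functions `bump`, `call_twice`, and `run` are invented for this example; a real FFI boundary would also need `extern "C"` declarations, which are left out to keep the sketch runnable on its own.

```rust
/// Hypothetical callback signature with an opaque, `void*`-like context.
type Callback = fn(ctx: *mut ());

/// Callback that knows the context is really a `u32` counter.
fn bump(ctx: *mut ()) {
    // SAFETY: `run` below always passes a pointer to a live `u32`.
    let counter = unsafe { &mut *(ctx as *mut u32) };
    *counter += 1;
}

/// Stand-in for a library routine that only forwards the opaque pointer.
fn call_twice(cb: Callback, ctx: *mut ()) {
    cb(ctx);
    cb(ctx);
}

fn run() -> u32 {
    let mut counter = 0u32;
    // Erase the pointee type on the way in; `bump` restores it.
    call_twice(bump, &mut counter as *mut u32 as *mut ());
    counter
}

fn main() {
    println!("counter = {}", run());
}
```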
174+
175+
176+
177+
178+
179+
# Extern Types
180+
181+
There is [an accepted RFC][extern-types] to add proper types with an unknown size,
182+
called *extern types*, which would let Rust developers model things like C's `void*`
183+
and other "declared but never defined" types more accurately. However as of
184+
Rust 2018, the feature is stuck in limbo over how `size_of::<MyExternType>()`
185+
should behave.
186+
187+
131188

132189

133190
[dst-issue]: https://github.com/rust-lang/rust/issues/26403
191+
[extern-types]: https://github.com/rust-lang/rfcs/blob/master/text/1861-extern-types.md

src/other-reprs.md

Lines changed: 36 additions & 16 deletions
@@ -15,18 +15,26 @@ or C++. Any type you expect to pass through an FFI boundary should have
1515
necessary to soundly do more elaborate tricks with data layout such as
1616
reinterpreting values as a different type.
1717

18-
However, the interaction with Rust's more exotic data layout features must be
18+
We strongly recommend using [rust-bindgen][] and/or [cbindgen][] to manage your FFI
19+
boundaries for you. The Rust team works closely with those projects to ensure
20+
that they work robustly and are compatible with current and future guarantees
21+
about type layouts and reprs.
22+
23+
The interaction of `repr(C)` with Rust's more exotic data layout features must be
1924
kept in mind. Due to its dual purpose as "for FFI" and "for layout control",
2025
`repr(C)` can be applied to types that will be nonsensical or problematic if
2126
passed through the FFI boundary.
2227

2328
* ZSTs are still zero-sized, even though this is not a standard behavior in
2429
C, and is explicitly contrary to the behavior of an empty type in C++, which
25-
still consumes a byte of space.
30+
says they should still consume a byte of space.
2631

27-
* DST pointers (fat pointers), tuples, and enums with fields are not a concept
32+
* DST pointers (wide pointers) and tuples are not a concept
2833
in C, and as such are never FFI-safe.
2934

35+
* Enums with fields also aren't a concept in C or C++, but a valid bridging
36+
of the types [is defined][really-tagged].
37+
3038
* If `T` is an [FFI-safe non-nullable pointer
3139
type](ffi.html#the-nullable-pointer-optimization),
3240
`Option<T>` is guaranteed to have the same layout and ABI as `T` and is
@@ -36,13 +44,13 @@ still consumes a byte of space.
3644
* Tuple structs are like structs with regards to `repr(C)`, as the only
3745
difference from a struct is that the fields aren’t named.
3846

39-
* This is equivalent to one of `repr(u*)` (see the next section) for enums. The
40-
chosen size is the default enum size for the target platform's C application
41-
binary interface (ABI). Note that enum representation in C is implementation
47+
* `repr(C)` is equivalent to one of `repr(u*)` (see the next section) for
48+
fieldless enums. The chosen size is the default enum size for the target platform's C
49+
application binary interface (ABI). Note that enum representation in C is implementation
4250
defined, so this is really a "best guess". In particular, this may be incorrect
4351
when the C code of interest is compiled with certain flags.
4452

45-
* Field-less enums with `repr(C)` or `repr(u*)` still may not be set to an
53+
* Fieldless enums with `repr(C)` or `repr(u*)` still may not be set to an
4654
integer value without a corresponding variant, even though this is
4755
permitted behavior in C or C++. It is undefined behavior to (unsafely)
4856
construct an instance of an enum that does not match one of its
@@ -58,33 +66,35 @@ be additional zero-sized fields). The effect is that the layout and ABI of the
5866
whole struct is guaranteed to be the same as that one field.
5967

6068
The goal is to make it possible to transmute between the single field and the
61-
struct. An example of that is the [`UnsafeCell`], which can be transmuted into
69+
struct. An example of that is [`UnsafeCell`], which can be transmuted into
6270
the type it wraps.
6371

6472
Also, passing the struct through FFI where the inner field type is expected on
65-
the other side is allowed. In particular, this is necessary for `struct
66-
Foo(f32)` to have the same ABI as `f32`.
73+
the other side is guaranteed to work. In particular, this is necessary for `struct
74+
Foo(f32)` to always have the same ABI as `f32`.
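The layout guarantee can be checked directly. This sketch uses the `Foo(f32)` wrapper mentioned above; the helper names `wrap` and `into_inner` are assumptions made for the example.

```rust
use std::mem::{align_of, size_of, transmute};

#[repr(transparent)]
struct Foo(f32);

/// Reinterprets an `f32` as its transparent wrapper.
fn wrap(x: f32) -> Foo {
    // SAFETY: `repr(transparent)` gives `Foo` the same layout as `f32`.
    unsafe { transmute::<f32, Foo>(x) }
}

/// Reinterprets the wrapper as the field it contains.
fn into_inner(foo: Foo) -> f32 {
    // SAFETY: same layout guarantee as in `wrap`.
    unsafe { transmute::<Foo, f32>(foo) }
}

fn main() {
    println!("size:  {} vs {}", size_of::<Foo>(), size_of::<f32>());
    println!("align: {} vs {}", align_of::<Foo>(), align_of::<f32>());
    println!("round trip: {}", into_inner(wrap(1.5)));
}
```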
6775

6876
More details are in the [RFC][rfc-transparent].
6977

7078

7179

7280
# repr(u*), repr(i*)
7381

74-
These specify the size to make a field-less enum. If the discriminant overflows
82+
These specify the size to make a fieldless enum. If the discriminant overflows
7583
the integer it has to fit in, it will produce a compile-time error. You can
7684
manually ask Rust to allow this by setting the overflowing element to explicitly
7785
be 0. However Rust will not allow you to create an enum where two variants have
7886
the same discriminant.
7987

80-
The term "field-less enum" only means that the enum doesn't have data in any
81-
of its variants. A field-less enum without a `repr(u*)` or `repr(C)` is
88+
The term "fieldless enum" only means that the enum doesn't have data in any
89+
of its variants. A fieldless enum without a `repr(u*)` or `repr(C)` is
8290
still a Rust native type, and does not have a stable ABI representation.
8391
Adding a `repr` causes it to be treated exactly like the specified
8492
integer size for ABI purposes.
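A small sketch of a fieldless enum with an explicit integer repr follows. The `Color` enum and the checked conversion `from_u8` are illustrative names; the checked conversion is shown because constructing a value with no matching variant is undefined behavior, so raw integers should be validated rather than transmuted.

```rust
use std::mem::size_of;

#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Color {
    Red = 1,
    Green = 2,
    Blue = 4,
}

/// Converts a raw integer into a `Color`, rejecting values
/// that do not correspond to any variant.
fn from_u8(raw: u8) -> Option<Color> {
    match raw {
        1 => Some(Color::Red),
        2 => Some(Color::Green),
        4 => Some(Color::Blue),
        _ => None,
    }
}

fn main() {
    println!("size of Color: {} byte(s)", size_of::<Color>());
    println!("Blue as u8: {}", Color::Blue as u8);
    println!("from 2: {:?}", from_u8(2));
    println!("from 3: {:?}", from_u8(3));
}
```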
8593

86-
Any enum with fields is a Rust type with no guaranteed ABI (even if the
87-
only data is `PhantomData` or something else with zero size).
94+
If the enum has fields, the effect is similar to the effect of `repr(C)`
95+
in that there is a defined layout of the type. This makes it possible to
96+
pass the enum to C code, or access the type's raw representation and directly
97+
manipulate its tag and fields. See [the RFC][really-tagged] for details.
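For an enum with fields, the defined layout means the tag can be read straight out of the raw representation. The sketch below assumes a hypothetical `Message` enum with implicitly numbered variants; `raw_tag` is an illustrative helper.

```rust
#[repr(u8)]
#[allow(dead_code)]
enum Message {
    Quit,           // tag 0
    Move(i32, i32), // tag 1
    Write(u64),     // tag 2
}

/// Reads the `u8` tag stored at the start of the enum's representation.
fn raw_tag(m: &Message) -> u8 {
    // SAFETY: with `repr(u8)` the value begins with a `u8` discriminant,
    // so reading one byte from its address is valid.
    unsafe { *(m as *const Message as *const u8) }
}

fn main() {
    println!("Quit  -> {}", raw_tag(&Message::Quit));
    println!("Move  -> {}", raw_tag(&Message::Move(1, 2)));
    println!("Write -> {}", raw_tag(&Message::Write(9)));
}
```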
8898

8999
Adding an explicit `repr` to an enum suppresses the null-pointer
90100
optimization.
@@ -107,13 +117,16 @@ compiler might be able to paper over alignment issues with shifts and masks.
107117
However if you take a reference to a packed field, it's unlikely that the
108118
compiler will be able to emit code to avoid an unaligned load.
109119

110-
**[As of Rust 1.30.0 this still can cause undefined behavior.][ub loads]**
120+
**[As of Rust 2018, this still can cause undefined behavior.][ub loads]**
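One way to stay clear of the problem is to avoid forming a reference to the packed field and to read it through a raw pointer instead. This is a sketch under assumptions: the `Padded` and `Packed` structs and the helper `read_b` are invented for illustration, it relies on `std::ptr::addr_of!`, which is available in newer toolchains than the one this commit targets, and the sizes in the comments assume a target where `u32` is 4-byte aligned.

```rust
use std::mem::size_of;
use std::ptr;

#[repr(C)]
#[allow(dead_code)]
struct Padded {
    a: u8,
    b: u32, // preceded by 3 bytes of padding
}

#[repr(C, packed)]
#[allow(dead_code)]
struct Packed {
    a: u8,
    b: u32, // starts at offset 1, so it is misaligned
}

/// Reads the misaligned field without creating a `&u32` to it.
fn read_b(p: &Packed) -> u32 {
    // SAFETY: the pointer is valid for reads; `read_unaligned`
    // does not require it to be aligned.
    unsafe { ptr::addr_of!(p.b).read_unaligned() }
}

fn main() {
    println!("Padded: {} bytes", size_of::<Padded>());
    println!("Packed: {} bytes", size_of::<Packed>());

    let p = Packed { a: 1, b: 0xDEAD_BEEF };
    println!("b = {:#x}", read_b(&p));
}
```

Copying the field by value (`let b = p.b;`) is the simpler option for `Copy` fields.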
111121

112122
`repr(packed)` is not to be used lightly. Unless you have extreme requirements,
113123
this should not be used.
114124

115125
This repr is a modifier on `repr(C)` and `repr(Rust)`.
116126

127+
128+
129+
117130
# repr(align(n))
118131

119132
`repr(align(n))` (where `n` is a power of two) forces the type to have an
@@ -126,8 +139,15 @@ kinds of concurrent code).
126139
This is a modifier on `repr(C)` and `repr(Rust)`. It is incompatible with
127140
`repr(packed)`.
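The effect of `repr(align(n))` on size and array stride can be sketched as follows. `CacheAligned` and `stride` are hypothetical names, and 64 is only an example alignment.

```rust
use std::mem::{align_of, size_of};

#[repr(align(64))]
#[allow(dead_code)]
struct CacheAligned(u8);

/// Distance in bytes between two neighbouring array elements.
fn stride() -> usize {
    let pair = [CacheAligned(0), CacheAligned(1)];
    let first = &pair[0] as *const CacheAligned as usize;
    let second = &pair[1] as *const CacheAligned as usize;
    second - first
}

fn main() {
    // The size is rounded up to a multiple of the alignment,
    // so each element starts on its own 64-byte boundary.
    println!("align  = {}", align_of::<CacheAligned>());
    println!("size   = {}", size_of::<CacheAligned>());
    println!("stride = {}", stride());
}
```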
128141

142+
143+
144+
145+
129146
[reference]: https://github.com/rust-rfcs/unsafe-code-guidelines/tree/master/reference/src/representation
130147
[drop flags]: drop-flags.html
131148
[ub loads]: https://github.com/rust-lang/rust/issues/27060
132149
[`UnsafeCell`]: ../std/cell/struct.UnsafeCell.html
133150
[rfc-transparent]: https://github.com/rust-lang/rfcs/blob/master/text/1758-repr-transparent.md
151+
[really-tagged]: https://github.com/rust-lang/rfcs/blob/master/text/2195-really-tagged-unions.md
152+
[rust-bindgen]: https://rust-lang-nursery.github.io/rust-bindgen/
153+
[cbindgen]: https://github.com/eqrion/cbindgen
