Commit 9c4e4e1

tweak wording; explain why MaybeUninit; mention allocator non-determinism
1 parent 94918b9 commit 9c4e4e1

1 file changed: +13 -1 lines changed
text/0000-rust-has-provenance.md

Lines changed: 13 additions & 1 deletion
@@ -8,7 +8,7 @@
 
 Pointers (this includes values of reference type) in Rust have **two** components.
 * The pointer's "address" says where in memory the pointer is currently pointing.
-* The pointer's "provenance" says where in memory the pointer is allowed to access when.
+* The pointer's "provenance" says where and when the pointer is allowed to access memory.
 
 (This is disregarding any "metadata" that may come with wide pointers, it only talks about thin pointers / the data part of a wide pointer.)
 
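The two-component view in the hunk above can be illustrated with a small sketch. It is not taken from the RFC; the function name `alias_by_address` is hypothetical. It builds a pointer whose address equals the address of `b` while its provenance still belongs to `a`, using only safe pointer arithmetic (`wrapping_add`) and no dereference of the mismatched pointer.

```rust
/// Returns `(q, pb)`: two raw pointers with the same address.
/// `pb` is derived from `b` and may be used to read `b`.
/// `q` is derived from `a`, so its provenance only covers `a`.
fn alias_by_address(a: &[u8; 2], b: &u8) -> (*const u8, *const u8) {
    let pa = a.as_ptr();
    let pb = b as *const u8;
    // Distance between the two addresses, computed on plain integers.
    let offset = (pb as usize).wrapping_sub(pa as usize);
    // `wrapping_add` changes the address but keeps the provenance of `pa`.
    let q = pa.wrapping_add(offset);
    // Reading through `q` here would be undefined behavior: the address
    // matches `b`, but the provenance does not permit accessing `b`.
    (q, pb)
}
```

Comparing `q as usize` with `pb as usize` yields equal integers, yet only `pb` may be dereferenced. That difference is exactly what the "provenance" component records.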
@@ -134,6 +134,12 @@ if x == y {
 
 [^determined]: Beyond the contents of this RFC, this assumes that integers cannot be uninitialized, which current codegen relies on in the form of `noundef` attributes.
 
+However, as a low-level systems language, Rust still needs some way to store and copy "memory with arbitrary content", including pointers that can have provenance.
+Popular belief says that an array of `u8` is suited for this purpose, but that is not true, because of provenance as stated above.
+In fact, "arbitrary content" may be "uninitialized memory", and `u8` must be initialized, so this is already not true even when disregarding provenance.
+However, `MaybeUninit<u8>` *is* suited for this purpose.
+It already must be able to store and copy uninitialized memory; there is no downside to also letting it store and copy pointers with provenance.
+
 ## Descriptive vs prescriptive provenance
 
 Note that "provenance" is a somewhat unfortunate term.
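The `MaybeUninit<u8>` paragraph added in the hunk above could be illustrated roughly as follows. This is a minimal sketch, not code from the RFC; the helper name `copy_via_maybe_uninit` is hypothetical. It copies a value one byte at a time through `MaybeUninit<u8>`, which is allowed to carry both uninitialized bytes (such as padding) and bytes of a pointer together with their provenance.

```rust
use std::mem::{size_of, MaybeUninit};

/// Copies `*src` byte by byte, viewing memory as `MaybeUninit<u8>`.
/// Using plain `u8` here would be wrong for padding or pointer bytes.
fn copy_via_maybe_uninit<T: Copy>(src: &T) -> T {
    let mut dst = MaybeUninit::<T>::uninit();
    let src_bytes = src as *const T as *const MaybeUninit<u8>;
    let dst_bytes = dst.as_mut_ptr() as *mut MaybeUninit<u8>;
    for i in 0..size_of::<T>() {
        // SAFETY: both pointers are valid for `size_of::<T>()` bytes, and
        // `MaybeUninit<u8>` places no validity requirement on the byte read.
        unsafe { dst_bytes.add(i).write(src_bytes.add(i).read()) };
    }
    // SAFETY: every byte of `dst` was copied from a valid `T`.
    unsafe { dst.assume_init() }
}
```

For example, calling `copy_via_maybe_uninit(&r)` with `r: &i32` yields a reference that can still be dereferenced, because the pointer bytes kept their provenance on the way through the byte-wise copy.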
@@ -242,6 +248,12 @@ Almost all reasonably usable compiler backends use *some form* of provenance log
 (The one exception we are aware of is cranelift, but that is not currently suited as a backend for release builds -- and it is unlikely to ever be suited for release builds unless it starts making use of provenance.)
 There essentially is no known alternative to having provenance in some form.
 
+One often-suggested alternative is to rely on allocator non-determinism:
+unrelated code cannot "guess" the address of a memory allocation that was not "exposed", and therefore we can still optimize accesses to this allocation.
+This actually works for some cases, and can even be made to work [in combination with a finite address space](https://research.ralfj.de/twinsem/twinsem.pdf), albeit the semantics already start looking rather unusual at that point.
+However, all of the examples in the "motivation" section were chosen to *not* be resolved by allocator non-determinism.
+If we want to do these optimizations (and we are already doing some of them today), we need provenance.
+
 There is some possibility for alternative designs around what happens on pointer-to-integer transmutation: (1) they could act like pointer-to-integer casts, or (2) they could be outright UB, or (3) they could strip the provenance from the pointer to yield a valid integer, but the provenance has been irreversibly lost.
 For (1), making it work like a pointer-to-integer cast is problematic since pointer-to-integer casts [are side-effecting operations when considering provenance](https://www.ralfj.de/blog/2022/04/11/provenance-exposed.html), and as such cannot be removed even if their result is unused.
 Making all transmutation sites (which includes every load from memory) possibly side-effecting that way would be a disaster for optimizations (it would prohibit elimination of dead loads), so option (1) seems infeasible.
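The distinction between a pointer-to-integer cast and a pointer-to-integer transmutation, discussed in the last context lines of the hunk above, can be sketched as below. The function names are hypothetical, and the comments assume option (3), where transmutation strips provenance; this is one reading of the text rather than a definitive statement of the semantics. At runtime both operations yield the same integer; they differ only in the abstract machine.

```rust
use std::mem::transmute;

/// Pointer-to-integer cast: returns the address and, per the linked blog
/// post, has the side effect of "exposing" the pointer's provenance.
fn address_via_cast(p: *const u8) -> usize {
    p as usize
}

/// Pointer-to-integer transmutation: same bits, but (assuming option (3))
/// the provenance is stripped without being exposed.
fn address_via_transmute(p: *const u8) -> usize {
    // SAFETY: `*const u8` and `usize` have the same size.
    unsafe { transmute::<*const u8, usize>(p) }
}

/// Round trip through integer casts. Because the first cast exposed the
/// provenance, the resulting pointer may pick it up again.
fn roundtrip_via_cast(p: *const u8) -> *const u8 {
    let addr = p as usize;
    addr as *const u8
}
```

Because `address_via_cast` has a side effect in this model, a compiler could not simply delete the cast when the result is unused. That is why treating every transmutation (and hence every load) as a cast would block dead-load elimination.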
