Commit 472f4e8

nikomatsakis authored and mark-i-m committed
describe region inference and member constraints in some detail
1 parent 4615a9a commit 472f4e8

5 files changed

+466
-68
lines changed

src/SUMMARY.md

Lines changed: 2 additions & 0 deletions
@@ -77,6 +77,8 @@
7777
- [MIR type checker](./borrow_check/type_check.md)
7878
- [Region inference](./borrow_check/region_inference.md)
7979
- [Constraint propagation](./borrow_check/region_inference/constraint_propagation.md)
80+
- [Lifetime parameters](./borrow_check/region_inference/lifetime_parameters.md)
81+
- [Member constraints](./borrow_check/region_inference/member_constraints.md)
8082
- [Placeholders and universes](./borrow_check/region_inference/placeholders_and_universes.md)
8183
- [Closure constraints](./borrow_check/region_inference/closure_constraints.md)
8284
- [Error reporting](./borrow_check/region_inference/error_reporting.md)

src/borrow_check/region_inference.md

Lines changed: 2 additions & 0 deletions
@@ -71,6 +71,8 @@ TODO: write about _how_ these regions are computed.
7171

7272
[`UniversalRegions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/universal_regions/struct.UniversalRegions.html
7373

74+
<a name="region-variables"></a>
75+
7476
## Region variables
7577

7678
The value of a region can be thought of as a **set**. This set contains all

src/borrow_check/region_inference/constraint_propagation.md

Lines changed: 144 additions & 68 deletions
@@ -1,88 +1,121 @@
11
# Constraint propagation
22

3-
The main work of the region inference is **constraint
4-
propagation**. This means processing the set of constraints to compute
5-
the final values for all the region variables.
3+
The main work of the region inference is **constraint propagation**,
4+
which is done in the [`propagate_constraints`] function. There are
5+
three sorts of constraints that are used in NLL, and we'll explain how
6+
`propagate_constraints` works by "layering" those sorts of constraints
7+
on one at a time (each of them is fairly independent from the others):
68

7-
## Kinds of constraints
9+
- liveness constraints (`R live at E`), which arise from liveness;
10+
- outlives constraints (`R1: R2`), which arise from subtyping;
11+
- [member constraints][m_c] (`member R_m of [R_c...]`), which arise from impl Trait.
812

9-
Each kind of constraint is handled somewhat differently by the region inferencer.
13+
[`propagate_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/region_infer/struct.RegionInferenceContext.html#method.propagate_constraints
14+
[m_c]: ./member_constraints.html
15+
16+
In this chapter, we'll explain the "heart" of constraint propagation,
17+
covering both liveness and outlives constraints.
18+
19+
## Notation and high-level concepts
20+
21+
Conceptually, region inference is a "fixed-point" computation. It is
22+
given some set of constraints `{C}` and it computes a set of values
23+
`Values: R -> {E}` that maps each region `R` to a set of elements
24+
`{E}` (see [here][riv] for more notes on region elements):
25+
26+
- Initially, each region is mapped to an empty set, so `Values(R) =
27+
{}` for all regions `R`.
28+
- Next, we process the constraints repeatedly until a fixed-point is reached:
29+
- For each constraint C:
30+
- Update `Values` as needed to satisfy the constraint
1031

11-
### Liveness constraints
32+
[riv]: ../region-inference.html#region-variables
33+
34+
As a simple example, if we have a liveness constraint `R live at E`,
35+
then we can apply `Values(R) = Values(R) union {E}` to make the
36+
constraint be satisfied. Similarly, if we have an outlives constraint
37+
`R1: R2`, we can apply `Values(R1) = Values(R1) union Values(R2)`.
38+
(Member constraints are more complex and we discuss them below.)
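
The naive loop described above can be sketched as follows. This is only an illustration, not the code in rustc: the names `Region`, `Element`, `Constraint`, and `solve` are hypothetical, and regions and elements are modeled as plain integers.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-ins for region variables and region elements.
type Region = u32;
type Element = u32;

/// The two constraint kinds covered in this chapter (hypothetical type).
enum Constraint {
    /// `R live at E`
    LiveAt(Region, Element),
    /// `R1: R2`
    Outlives(Region, Region),
}

/// Naive fixed point: reapply every constraint until `Values` stops changing.
fn solve(constraints: &[Constraint]) -> HashMap<Region, BTreeSet<Element>> {
    let mut values: HashMap<Region, BTreeSet<Element>> = HashMap::new();
    let mut changed = true;
    while changed {
        changed = false;
        for c in constraints {
            match *c {
                // Values(R) = Values(R) union {E}
                Constraint::LiveAt(r, e) => {
                    changed |= values.entry(r).or_default().insert(e);
                }
                // Values(R1) = Values(R1) union Values(R2)
                Constraint::Outlives(r1, r2) => {
                    let from = values.get(&r2).cloned().unwrap_or_default();
                    let into = values.entry(r1).or_default();
                    for e in from {
                        changed |= into.insert(e);
                    }
                }
            }
        }
    }
    values
}
```

Because every step only adds elements to a set, the loop terminates once a full pass adds nothing new.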
39+
40+
In practice, however, we are a bit more clever. Instead of applying
41+
the constraints in a loop, we can analyze the constraints and figure
42+
out the correct order to apply them, so that we only have to apply
43+
each constraint once in order to find the final result.
44+
45+
Similarly, in the implementation, the `Values` set is stored in the
46+
`scc_values` field, but it is indexed not by a *region* but by a
47+
*strongly connected component* (SCC). SCCs are an optimization that
48+
avoids a lot of redundant storage and computation. They are explained
49+
in the section on outlives constraints.
50+
51+
## Liveness constraints
1252

1353
A **liveness constraint** arises when some variable whose type
1454
includes a region R is live at some point P. This simply means that
1555
the value of R must include the point P. Liveness constraints are
1656
computed by the MIR type checker.
1757

18-
We represent them by keeping a (sparse) bitset for each region
19-
variable, which is the field [`liveness_constraints`], of type
20-
[`LivenessValues`]
58+
A liveness constraint `R live at E` is satisfied if `E` is a member of
59+
`Values(R)`. So to "apply" such a constraint to `Values`, we just have
60+
to compute `Values(R) = Values(R) union {E}`.
61+
62+
The liveness values are computed by the type checker and passed to the
63+
region inference upon creation in the `liveness_constraints` argument.
64+
These are not represented as individual constraints like `R live at E`
65+
though; instead, we store a (sparse) bitset per region variable (of
66+
type [`LivenessValues`]). This way we only need a single bit for each
67+
liveness constraint.
2168

2269
[`liveness_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/region_infer/struct.RegionInferenceContext.html#structfield.liveness_constraints
2370
[`LivenessValues`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/region_infer/values/struct.LivenessValues.html
2471

25-
### Outlives constraints
72+
One thing that is worth mentioning: All lifetime parameters are always
73+
considered to be live over the entire function body. This is because
74+
they correspond to some portion of the *caller's* execution, and that
75+
execution clearly includes the time spent in this function, since the
76+
caller is waiting for us to return.
2677

27-
An outlives constraint `'a: 'b` indicates that the value of `'a` must
28-
be a **superset** of the value of `'b`. On creation, we are given a
29-
set of outlives constraints in the form of a
30-
[`ConstraintSet`]. However, to work more efficiently with outlives
31-
constraints, they are [converted into the form of a graph][graph-fn],
32-
where the nodes of the graph are region variables (`'a`, `'b`) and
33-
each constraint `'a: 'b` induces an edge `'a -> 'b`. This conversion
34-
happens in the [`RegionInferenceContext::new`] function that creates
35-
the inference context.
36-
37-
[`ConstraintSet`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/constraints/struct.ConstraintSet.html
38-
[graph-fn]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/constraints/struct.ConstraintSet.html#method.graph
39-
[`RegionInferenceContext::new`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/region_infer/struct.RegionInferenceContext.html#method.new
78+
## Outlives constraints
4079

41-
### Member constraints
80+
An outlives constraint `'a: 'b` indicates that the value of `'a` must
81+
be a **superset** of the value of `'b`. That is, an outlives
82+
constraint `R1: R2` is satisfied if `Values(R1)` is a superset of
83+
`Values(R2)`. So to "apply" such a constraint to `Values`, we just
84+
have to compute `Values(R1) = Values(R1) union Values(R2)`.
4285

43-
A member constraint `'m member of ['c_1..'c_N]` expresses that the
44-
region `'m` must be *equal* to some **choice regions** `'c_i` (for
45-
some `i`). These constraints cannot be expressed by users, but they arise
46-
from `impl Trait` due to its lifetime capture rules. Consinder a function
47-
such as the following:
86+
One observation that follows from this is that if you have `R1: R2`
87+
and `R2: R1`, then `R1 = R2` must be true. Similarly, if you have:
4888

49-
```rust
50-
fn make(a: &'a u32, b: &'b u32) -> impl Trait<'a, 'b> { .. }
5189
```
52-
53-
Here, the true return type (often called the "hidden type") is only
54-
permitted to capture the lifeimes `'a` or `'b`. You can kind of see
55-
this more clearly by desugaring that `impl Trait` return type into its
56-
more explicit form:
57-
58-
```rust
59-
type MakeReturn<'x, 'y> = impl Trait<'x, 'y>;
60-
fn make(a: &'a u32, b: &'b u32) -> MakeReturn<'a, 'b> { .. }
90+
R1: R2
91+
R2: R3
92+
R3: R4
93+
R4: R1
6194
```
6295

63-
Here, the idea is that the hidden type must be some type that could
64-
have been written in place of the `impl Trait<'x, 'y>` -- but clearly
65-
such a type can only reference the regions `'x` or `'y` (or
66-
`'static`!), as those are the only names in scope. This limitation is
67-
then translated into a restriction to only access `'a` or `'b` because
68-
we are returning `MakeReturn<'a, 'b>`, where `'x` and `'y` have been
69-
replaced with `'a` and `'b` respectively.
96+
then `R1 = R2 = R3 = R4` follows. We take advantage of this to make things
97+
much faster, as described shortly.
7098

71-
## SCCs in the outlives constraint graph
99+
In the code, the set of outlives constraints is given to the region
100+
inference context on creation in a parameter of type
101+
[`ConstraintSet`]. The constraint set is basically just a list of `'a:
102+
'b` constraints.
72103

73-
The most common sort of constraint in practice are outlives
74-
constraints like `'a: 'b`. Such a cosntraint means that `'a` is a
75-
superset of `'b`. So what happens if we have two regions `'a` and `'b`
76-
that mutually outlive one another, like so?
104+
### The outlives constraint graph and SCCs
77105

78-
```
79-
'a: 'b
80-
'b: 'a
81-
```
106+
In order to work more efficiently with outlives constraints, they are
107+
[converted into the form of a graph][graph-fn], where the nodes of the
108+
graph are region variables (`'a`, `'b`) and each constraint `'a: 'b`
109+
induces an edge `'a -> 'b`. This conversion happens in the
110+
[`RegionInferenceContext::new`] function that creates the inference
111+
context.
112+
113+
[`ConstraintSet`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/constraints/struct.ConstraintSet.html
114+
[graph-fn]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/constraints/struct.ConstraintSet.html#method.graph
115+
[`RegionInferenceContext::new`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/region_infer/struct.RegionInferenceContext.html#method.new
82116

83-
In this case, we can conclude that `'a` and `'b` must be equal
84-
sets. In fact, it doesn't have to be just two regions. We could create
85-
an extended "chain" of outlives constraints:
117+
When using a graph representation, we can detect regions that must be equal
118+
by looking for cycles. That is, if you have constraints like
86119

87120
```
88121
'a: 'b
@@ -91,11 +124,8 @@ an extended "chain" of outlives constraints:
91124
'd: 'a
92125
```
93126

94-
Here, we know that `'a..'d` are all equal to one another.
95-
96-
As mentioned above, an outlives constraint like `'a: 'b` can be viewed
97-
as an edge in a graph `'a -> 'b`. Cycles in this graph indicate regions
98-
that mutually outlive one another and hence must be equal.
127+
then this will correspond to a cycle in the graph containing the
128+
elements `'a...'d`.
99129

100130
Therefore, one of the first things that we do in propagating region
101131
values is to compute the **strongly connected components** (SCCs) in
@@ -138,9 +168,55 @@ superset of the value of `S1`. One crucial thing is that this graph of
138168
SCCs is always a DAG -- that is, it never has cycles. This is because
139169
all the cycles have been removed to form the SCCs themselves.
140170

141-
## How constraint propagation works
171+
### Applying liveness constraints to SCCs
172+
173+
The liveness constraints that come in from the type-checker are
174+
expressed in terms of regions -- that is, we have a map like
175+
`Liveness: R -> {E}`. But we want our final result to be expressed
176+
in terms of SCCs -- we can integrate these liveness constraints very
177+
easily just by taking the union:
178+
179+
```
180+
for each region R:
181+
let S be the SCC that contains R
182+
Values(S) = Values(S) union Liveness(R)
183+
```
184+
185+
In the region inferencer, this step is done in [`RegionInferenceContext::new`].
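
As a rough Rust rendering of the pseudocode above (a sketch under assumptions, not the actual implementation): regions and SCCs are modeled as indices, and the names `init_scc_values`, `scc_of`, and `liveness` are hypothetical.

```rust
use std::collections::BTreeSet;

/// Folds per-region liveness into per-SCC values.
/// `scc_of[r]` is the index of the SCC containing region `r`, and
/// `liveness[r]` plays the role of `Liveness(R)`.
fn init_scc_values(
    scc_of: &[usize],
    liveness: &[BTreeSet<u32>],
    num_sccs: usize,
) -> Vec<BTreeSet<u32>> {
    let mut values = vec![BTreeSet::new(); num_sccs];
    for (region, &scc) in scc_of.iter().enumerate() {
        // Values(S) = Values(S) union Liveness(R)
        values[scc].extend(liveness[region].iter().copied());
    }
    values
}
```

Regions that share an SCC end up contributing to the same set, which is where the storage savings come from.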
186+
187+
### Applying outlives constraints
188+
189+
Once we have computed the DAG of SCCs, we use that to structure our
190+
entire computation. If we have an edge `S1 -> S2` between two SCCs,
191+
that means that `Values(S1) >= Values(S2)` must hold. So, to compute
192+
the value of `S1`, we first compute the values of each successor `S2`.
193+
Then we simply union all of those values together. To use a
194+
quasi-iterator-like notation:
195+
196+
```
197+
Values(S1) =
198+
s1.successors()
199+
.map(|s2| Values(s2))
200+
.union()
201+
```
202+
203+
In the code, this work starts in the [`propagate_constraints`]
204+
function, which iterates over all the SCCs. For each SCC S1, we
205+
compute its value by first computing the value of its
206+
successors. Since SCCs form a DAG, we don't have to be concerned about
207+
cycles, though we do need to keep a set around to track whether we
208+
have already processed a given SCC or not. For each successor S2, once
209+
we have computed S2's value, we can union those elements into the
210+
value for S1. (Although we have to be careful in this process to
211+
properly handle [higher-ranked
212+
placeholders](./placeholders_and_universes.html).) Note that the value
213+
for S1 already contains the liveness constraints, since they were
214+
added in [`RegionInferenceContext::new`].
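
The traversal can be sketched as a depth-first walk over the SCC DAG. This is a simplified illustration that ignores placeholders; the names `propagate`, `propagate_all`, `successors`, and `done` are hypothetical and do not mirror the real function signatures.

```rust
use std::collections::BTreeSet;

/// Computes the value of SCC `s1` by unioning in each successor's value.
/// `successors[s]` lists the SCCs reachable from `s` by one edge, and
/// `values` is assumed to already hold the liveness elements of each SCC.
fn propagate(
    s1: usize,
    successors: &[Vec<usize>],
    values: &mut [BTreeSet<u32>],
    done: &mut [bool],
) {
    if done[s1] {
        return; // already processed this SCC
    }
    done[s1] = true;
    for &s2 in &successors[s1] {
        // Compute the successor first; safe because the graph has no cycles.
        propagate(s2, successors, values, done);
        let from = values[s2].clone();
        values[s1].extend(from);
    }
}

/// Visits every SCC once, so each edge is applied exactly once.
fn propagate_all(successors: &[Vec<usize>], values: &mut [BTreeSet<u32>]) {
    let mut done = vec![false; values.len()];
    for s in 0..values.len() {
        propagate(s, successors, values, &mut done);
    }
}
```

Marking an SCC as done before visiting its successors is only sound because the graph is a DAG; with cycles, a half-computed value could be read.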
215+
216+
Once that process is done, we now have the "minimal value" for S1,
217+
taking into account all of the liveness and outlives
218+
constraints. However, in order to complete the process, we must also
219+
consider [member constraints][m_c], which are described in [a later
220+
section][m_c].
142221

143-
The main work of constraint propagation is done in the
144-
`propagation_constraints` function.
145222

146-
[`propagate_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/region_infer/struct.RegionInferenceContext.html#method.propagate_constraints
Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
1+
# Universal regions
2+
3+
"Universal regions" is the name that the code uses to refer to "named
4+
lifetimes" -- e.g., lifetime parameters and `'static`. The name
5+
derives from the fact that such lifetimes are "universally quantified"
6+
(i.e., we must make sure the code is true for all values of those
7+
lifetimes). It is worth spending a bit of time discussing how lifetime
8+
parameters are handled during region inference. Consider this example:
9+
10+
```rust
11+
fn foo<'a, 'b>(x: &'a u32, y: &'b u32) -> &'b u32 {
12+
x
13+
}
14+
```
15+
16+
This example is intended not to compile, because we are returning `x`,
17+
which has type `&'a u32`, but our signature promises that we will
18+
return a `&'b u32` value. But how are lifetimes like `'a` and `'b`
19+
integrated into region inference, and how does this error wind up being
20+
detected?
21+
22+
## Universal regions and their relationships to one another
23+
24+
Early on in region inference, one of the first things we do is to
25+
construct a [`UniversalRegions`] struct. This struct tracks the
26+
various universal regions in scope on a particular function. We also
27+
create a [`UniversalRegionRelations`] struct, which tracks their
28+
relationships to one another. So if you have e.g. `where 'a: 'b`, then
29+
the [`UniversalRegionRelations`] struct would track that `'a: 'b` is
30+
known to hold (which could be tested with the [`outlives`] function).
31+
32+
[`UniversalRegions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/universal_regions/struct.UniversalRegions.html
33+
[`UniversalRegionRelations`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/type_check/free_region_relations/struct.UniversalRegionRelations.html
34+
[`outlives`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/type_check/free_region_relations/struct.UniversalRegionRelations.html#method.outlives
35+
36+
## Everything is a region variable
37+
38+
One important aspect of how NLL region inference works is that **all
39+
lifetimes** are represented as numbered variables. This means that the
40+
only variant of [`ty::RegionKind`] that we use is the [`ReVar`]
41+
variant. These region variables are broken into two major categories,
42+
based on their index:
43+
44+
[`ty::RegionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/enum.RegionKind.html
45+
[`ReVar`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/enum.RegionKind.html#variant.ReVar
46+
47+
- 0..N: universal regions -- the ones we are discussing here. In this
48+
case, the code must be correct with respect to any value of those
49+
variables that meets the declared relationships.
50+
- N..M: existential regions -- inference variables where the region
51+
inferencer is tasked with finding *some* suitable value.
52+
53+
In fact, the universal regions can be further subdivided based on
54+
where they were brought into scope (see the [`RegionClassification`]
55+
type). These subdivisions are not important for the topics discussed
56+
here, but become important when we consider [closure constraint
57+
propagation](./closure_constraints.html), so we discuss them there.
58+
59+
[`RegionClassification`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/universal_regions/enum.RegionClassification.html#variant.Local
60+
61+
## Universal lifetimes as the elements of a region's value
62+
63+
As noted previously, the value that we infer for each region is a set
64+
`{E}`. The elements of this set can be points in the control-flow
65+
graph, but they can also be an element `end('a)` corresponding to each
66+
universal lifetime `'a`. If the value for some region `R0` includes
67+
`end('a)`, then this implies that R0 must extend until the end of `'a`
68+
in the caller.
69+
70+
## The "value" of a universal region
71+
72+
During region inference, we compute a value for each universal region
73+
in the same way as we compute values for other regions. This value
74+
represents, effectively, the **lower bound** on that universal region
75+
-- the things that it must outlive. We now describe how we use this
76+
value to check for errors.
77+
78+
## Liveness and universal regions
79+
80+
All universal regions have an initial liveness constraint that
81+
includes the entire function body. This is because lifetime parameters
82+
are defined in the caller and must include the entirety of the
83+
function call that invokes this particular function. In addition, each
84+
universal region `'a` includes itself (that is, `end('a)`) in its
85+
liveness constraint (i.e., `'a` must extend until the end of
86+
itself). In the code, these liveness constraints are set up in
87+
[`init_free_and_bound_regions`].
88+
89+
[`init_free_and_bound_regions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/region_infer/struct.RegionInferenceContext.html#method.init_free_and_bound_regions
90+
91+
## Propagating outlives constraints for universal regions
92+
93+
So, consider the first example of this section:
94+
95+
```rust
96+
fn foo<'a, 'b>(x: &'a u32, y: &'b u32) -> &'b u32 {
97+
x
98+
}
99+
```
100+
101+
Here, returning `x` requires that `&'a u32 <: &'b u32`, which gives
102+
rise to an outlives constraint `'a: 'b`. Combined with our default liveness
103+
constraints we get:
104+
105+
```
106+
'a live at {B, end('a)} // B represents the "function body"
107+
'b live at {B, end('b)}
108+
'a: 'b
109+
```
110+
111+
When we process the `'a: 'b` constraint, therefore, we will add
112+
`end('b)` into the value for `'a`, resulting in a final value of `{B,
113+
end('a), end('b)}`.
114+
115+
## Detecting errors
116+
117+
Once we have finished constraint propagation, we then enforce a
118+
constraint that if some universal region `'a` includes an element
119+
`end('b)`, then `'a: 'b` must be declared in the function's bounds. If
120+
not, as in our example, that is an error. This check is done in the
121+
[`check_universal_regions`] function, which simply iterates over all
122+
universal regions, inspects their final value, and tests against the
123+
declared [`UniversalRegionRelations`].
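
A minimal sketch of that check, under stated assumptions: universal regions are modeled as indices, `Element` is a hypothetical enum, and `declared_outlives` is a hypothetical set assumed to already contain every declared `'a: 'b` pair, including those implied transitively. The function name is reused only for readability and the real signature differs.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of the elements of a region's value.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Element {
    /// A point in the control-flow graph.
    Point(u32),
    /// `end('r)` for the universal region with index `r`.
    End(usize),
}

/// Returns every pair `(a, b)` where the final value of universal region `a`
/// contains `end(b)` but `a: b` is not among the declared relations.
fn check_universal_regions(
    values: &[BTreeSet<Element>],
    declared_outlives: &BTreeSet<(usize, usize)>,
) -> Vec<(usize, usize)> {
    let mut errors = Vec::new();
    for (a, value) in values.iter().enumerate() {
        for elem in value {
            if let Element::End(b) = *elem {
                // Every region trivially contains its own end.
                if a != b && !declared_outlives.contains(&(a, b)) {
                    errors.push((a, b));
                }
            }
        }
    }
    errors
}
```

For the `foo` example, with `'a` as index 0 and `'b` as index 1, the value of `'a` contains `end('b)` and no `'a: 'b` bound is declared, so the sketch reports the pair `(0, 1)`.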
124+
125+
[`check_universal_regions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/nll/region_infer/struct.RegionInferenceContext.html#method.check_universal_regions
