Commit e8f918c

expand some notes about expansion :P
1 parent 0980ebf commit e8f918c

2 files changed

+104
-39
lines changed

src/macro-expansion.md

Lines changed: 81 additions & 39 deletions
@@ -14,45 +14,87 @@ we will look at the specifics of expanding different types of macros.
1414

1515
## Expansion and AST Integration
1616

17-
TODO: expand these notes (har har)...
18-
19-
- Expansion happens over a whole crate at once.
20-
- We run `fully_expand_fragment` on the crate
21-
- If `fully_expand_fragment` is run not on a whole crate, it means that we are performing eager expansion.
22-
- We do this for some built-ins that expect literals (not exposed to users).
23-
- It performs a subset of actions performed by non-eager expansion, so the discussion below focuses on eager expansion.
24-
- Original description here: https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049
25-
- Algorithm: `fully_expand_fragment` works in iterations. We repeat until there are no unresolved macros left.
26-
- Resolve imports in our partially built crate as much as possible.
27-
- (link to name-resolution chapter) names resolved from "closer" scopes (e.g. current block) to further ones (e.g. prelude)
28-
- A resolution fails differently for different scopes, e.g. for a module scope it means no unexpanded macros and no unresolved glob imports in that module.
29-
- Collect as many macro invocations as possible from our partially built crate
30-
(fn-like, attributes, derives) from the crate and add them to the queue.
31-
- Take a macro from the queue, and attempt to resolve it.
32-
- If it's resolved - run its expander function that consumes tokens or AST and produces tokens or AST (depending on the macro kind). (If it's not resolved, then put it back into the queue.)
33-
- At this point, we know everything about the macro itself and can call `set_expn_data` to fill in its properties in the global data -- that is the hygiene data associated with `ExpnId`.
34-
- The macro's expander function returns a piece of AST (or tokens). We need to integrate that piece of AST into the big existing partially built AST.
35-
- If the macro produces tokens (e.g. a proc macro), we will have to parse into an AST, which may produce parse errors.
36-
- During expansion, we create `SyntaxContext`s (heirarchy 2).
37-
- This is essentially where the "token-like mass" becomes a proper set-in-stone AST with side-tables
38-
- These three passes happen one after another on every AST fragment freshly expanded from a macro
39-
- `NodeId`s are assigned by `InvocationCollector`
40-
- also collects new macro calls from this new AST piece and adds them to the queue
41-
- def_paths are created and `DefId`s are assigned to them by `DefCollector`
42-
- `Name`s are put into modules (from the resolver's point of view) by `BuildReducedGraphVisitor`
43-
- After expanding a single macro and integrating its output continue to the next iteration of `fully_expand_fragment`.
44-
- If we make no progress in an iteration, then we have reached a compilation error (e.g. an undefined macro).
45-
46-
- We attempt to recover from failures (unresolved macros or imports) for the sake of diagnostics
47-
- recovery can't cause compilation to suceed. We know that it will fail at this point.
48-
- we expand errors into `ExprKind::Err` or something like that for unresolved macros
49-
- this allows compilation to continue past the first error so that we can report more errors at a time
50-
51-
### Relationship to name resolution
52-
53-
- name resolution is done for macro and import names during expansion and integration into the AST, as discussed above
54-
- For all other names we certainly know whether a name is resolved successfully or not on the first attempt, because no new names can appear, due to hygiene
55-
- They are resolved in a later pass, see `librustc_resolve/late.rs`
17+
First of all, expansion happens at the crate level. Given the raw source code for
18+
a crate, the compiler will produce a massive AST with all macros expanded, all
19+
modules inlined, etc.
20+
21+
The primary entry point for this process is the
22+
[`MacroExpander::fully_expand_fragment`][fef] method. Usually, we run this
23+
method on a whole crate. If it is not run on a full crate, it means we are
24+
doing _eager macro expansion_. Eager expansion means that we expand the
25+
arguments of a macro invocation before the macro invocation itself. This is
26+
implemented only for a few special built-in macros that expect literals (it's
27+
not a generally available feature of Rust). Eager expansion generally performs
28+
a subset of the things that lazy (normal) expansion does, so we will focus on
29+
lazy expansion for the rest of this chapter.
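
As a small user-visible illustration of the difference, the sketch below contrasts a built-in that expands its arguments eagerly (`concat!`) with an ordinary `macro_rules!` macro, which receives a nested invocation as unexpanded tokens. The names `eager_example`, `lazy_example`, and `show_tokens` are hypothetical and exist only for this example; it does not show how the compiler implements eager expansion.

```rust
/// `concat!` expands its arguments eagerly: the nested `stringify!` is
/// expanded first, so `concat!` ends up seeing two string literals.
fn eager_example() -> &'static str {
    concat!("line ", stringify!(42))
}

/// A user-defined macro gets no such treatment. It receives the nested
/// invocation as plain tokens and here simply turns them into a string.
macro_rules! show_tokens {
    ($($t:tt)*) => { stringify!($($t)*) };
}

/// The inner `stringify!(42)` is never expanded on its own; its tokens
/// are passed through to the outer `stringify!` as-is.
fn lazy_example() -> &'static str {
    show_tokens!(stringify!(42))
}
```

Calling `eager_example()` yields a string built from the already-expanded inner macro, while `lazy_example()` yields a string that still spells out the inner invocation.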
30+
31+
At a high level, [`fully_expand_fragment`][fef] works in iterations. We keep a
32+
queue of unresolved macro invocations (that is, macros we haven't found the
33+
definition of yet). We repeatedly try to pick a macro from the queue, resolve
34+
it, expand it, and integrate it back. If we can't make progress in an
35+
iteration, this represents a compile error. Here is the [algorithm][original]:
36+
37+
[fef]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.MacroExpander.html#method.fully_expand_fragment
38+
[original]: https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049
39+
40+
0. Initialize a `queue` of unresolved macros.
41+
1. Repeat until `queue` is empty (or we make no progress, which is an error):
42+
0. [Resolve](./name-resolution.md) imports in our partially built crate as
43+
much as possible.
44+
1. Collect as many macro invocations as possible from our partially built
45+
crate (fn-like, attributes, derives) and add them to the queue.
46+
2. Dequeue the first element, and attempt to resolve it.
47+
3. If it's resolved:
48+
0. Run the macro's expander function that consumes tokens or AST and
49+
produces tokens or AST (depending on the macro kind).
50+
- At this point, we know everything about the macro itself and can
51+
call `set_expn_data` to fill in its properties in the global data
52+
-- that is the hygiene data associated with `ExpnId`. (See [the
53+
"Hygiene" section below][hybelow]).
54+
1. Integrate that piece of AST into the big existing partially built
55+
AST. This is essentially where the "token-like mass" becomes a
56+
proper set-in-stone AST with side-tables. It happens as follows:
57+
- If the macro produces tokens (e.g. a proc macro), we parse into
58+
an AST, which may produce parse errors.
59+
- During expansion, we create `SyntaxContext`s (hierarchy 2). (See
60+
[the "Hygiene" section below][hybelow].)
61+
- These three passes happen one after another on every AST fragment
62+
freshly expanded from a macro:
63+
- [`NodeId`]s are assigned by [`InvocationCollector`]. This
64+
also collects new macro calls from this new AST piece and
65+
adds them to the queue.
66+
- ["Def paths"][defpaths] are created and [`DefId`]s are
67+
assigned to them by [`DefCollector`].
68+
- Names are put into modules (from the resolver's point of
69+
view) by [`BuildReducedGraphVisitor`].
70+
2. After expanding a single macro and integrating its output, continue
71+
to the next iteration of [`fully_expand_fragment`][fef].
72+
4. If it's not resolved:
73+
0. Put the macro back in the queue.
74+
1. Continue to next iteration...
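
The loop above can be sketched as a toy worklist over a made-up item type. Everything here (`Item`, `integrate`, `fully_expand`) is a hypothetical simplification and not rustc's actual data structures: "resolution" is a map lookup, "integration" just appends text in discovery order instead of splicing into a tree, and there is no recursion limit, so a macro that invokes itself would loop forever.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for a piece of a partially built crate.
#[derive(Clone)]
enum Item {
    Text(&'static str),
    Invoke(&'static str),            // a macro invocation, by name
    Define(&'static str, Vec<Item>), // a macro definition: name and body
}

/// Integrate a fragment: record definitions, emit text, and collect any
/// new invocations into the queue.
fn integrate(
    fragment: Vec<Item>,
    defs: &mut HashMap<&'static str, Vec<Item>>,
    queue: &mut VecDeque<&'static str>,
    out: &mut Vec<&'static str>,
) {
    for item in fragment {
        match item {
            Item::Text(t) => out.push(t),
            Item::Invoke(name) => queue.push_back(name),
            Item::Define(name, body) => {
                defs.insert(name, body);
            }
        }
    }
}

/// Returns the expanded text, or the names that could never be resolved.
fn fully_expand(krate: Vec<Item>) -> Result<Vec<&'static str>, Vec<&'static str>> {
    let mut defs = HashMap::new();
    let mut queue = VecDeque::new();
    let mut out = Vec::new();
    integrate(krate, &mut defs, &mut queue, &mut out);

    // Consecutive failed resolutions since the last successful expansion.
    let mut stalled = 0;
    while let Some(name) = queue.pop_front() {
        match defs.get(name).cloned() {
            // Resolved: expand it and integrate the output.
            Some(body) => {
                integrate(body, &mut defs, &mut queue, &mut out);
                stalled = 0;
            }
            // Not resolved: put it back and try again later.
            None => {
                queue.push_back(name);
                stalled += 1;
                // Every queued invocation failed in a row: no progress.
                if stalled == queue.len() {
                    return Err(queue.into_iter().collect());
                }
            }
        }
    }
    Ok(out)
}
```

The interesting case is an invocation whose definition only appears after another macro has been expanded: it fails to resolve on the first attempt, goes back into the queue, and succeeds on a later iteration. If a full pass over the queue resolves nothing, the sketch reports the leftover names as errors.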
75+
76+
[defpaths]: https://rustc-dev-guide.rust-lang.org/hir.html?highlight=def,path#identifiers-in-the-hir
77+
[`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html
78+
[`InvocationCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.InvocationCollector.html
79+
[`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
80+
[`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html
81+
[`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html
82+
[hybelow]: #hygiene-and-heirarchies
83+
84+
If we make no progress in an iteration, then we have reached a compilation
85+
error (e.g. an undefined macro). We attempt to recover from failures
86+
(unresolved macros or imports) for the sake of diagnostics. This allows
87+
compilation to continue past the first error, so that we can report more errors
88+
at a time. Recovery can't cause compilation to succeed. We know that it will
89+
fail at this point. The recovery happens by expanding unresolved macros into
90+
[`ExprKind::Err`][err].
91+
92+
[err]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/enum.ExprKind.html#variant.Err
93+
94+
Notice that name resolution is involved here: we need to resolve imports and
95+
macro names in the above algorithm. However, we don't try to resolve other
96+
names yet. This happens later, as we will see in the [next
97+
chapter](./name-resolution.md).
5698

5799
## Hygiene and Heirarchies
58100

src/name-resolution.md

Lines changed: 23 additions & 0 deletions
@@ -1,5 +1,28 @@
11
# Name resolution
22

3+
In the previous chapters, we saw how the AST is built with all macros expanded.
4+
We saw how doing that requires doing some name resolution to resolve imports
5+
and macro names. In this chapter, we show how this is actually done and more.
6+
7+
In fact, we don't do full name resolution during macro expansion -- we only
8+
resolve imports and macros at that time. This is required to know what to even
9+
expand. Later, after we have the whole AST, we do full name resolution to
10+
resolve all names in the crate. This happens in [`rustc_resolve::late`][late].
11+
Unlike during macro expansion, in this late resolution, we only need to try to
12+
resolve a name once, since no new names can be added. If we fail to resolve a
13+
name now, then it is a compilation error.
14+
15+
Name resolution can be complex. There are a few different namespaces (e.g.
16+
macros, values, types, lifetimes), and names may be valid at different (nested)
17+
scopes. Also, different types of names can fail to be resolved differently, and
18+
failures can happen differently at different scopes. For example, for a module
19+
scope, failure means no unexpanded macros and no unresolved glob imports in
20+
that module. On the other hand, in a function body, failure requires that a
21+
name be absent from the block we are in, all outer scopes, and the global
22+
scope.
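
To make the function-body case concrete, here is a minimal sketch of looking a name up through nested scopes. The `Scopes` type, its fields, and the `u32` ids are assumptions made for illustration; the real resolver also deals with namespaces, hygiene, and imports, none of which are modelled here.

```rust
use std::collections::HashMap;

/// Hypothetical model of the scopes visible at one point in a function body.
struct Scopes<'a> {
    /// Block scopes, outermost first and innermost last.
    blocks: Vec<HashMap<&'a str, u32>>,
    /// Names visible everywhere (standing in for the global scope).
    global: HashMap<&'a str, u32>,
}

impl<'a> Scopes<'a> {
    /// Search from the innermost block outwards, then fall back to the
    /// global scope. `None` means the name is absent from all of them.
    fn resolve(&self, name: &str) -> Option<u32> {
        self.blocks
            .iter()
            .rev()
            .find_map(|scope| scope.get(name).copied())
            .or_else(|| self.global.get(name).copied())
    }
}
```

Because the search starts at the innermost block, a binding there shadows one with the same name further out, and the lookup only fails once the block, every outer scope, and the global scope have all been checked.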
23+
24+
[late]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/late/index.html
25+
326
## Basics
427

528
In our programs we can refer to variables, types, functions, etc, by giving them
