Skip to content

Commit 96922e3

Browse files
authored
Merge pull request #152 from nikomatsakis/reusability-tweaks
improve term macros
2 parents 320a42f + 422e8a6 commit 96922e3

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

53 files changed

+1698
-926
lines changed

book/src/SUMMARY.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,8 +4,8 @@
44
- [`formality_core`: the Formality system](./formality_core.md)
55
- [Defining your lang](./formality_core/lang.md)
66
- [Defining terms with the `term` macro](./formality_core/terms.md)
7-
- [Implementing `Fold` and `Visit` by hand](./formality_core/impl_fold_visit.md)
8-
- [Implementing `Parse` by hand](./formality_core/impl_parse.md)
7+
- [Parsing](./formality_core/parse.md)
8+
- [Customizing debug](./formality_core/debug.md)
99
- [Variables](./formality_core/variables.md)
1010
- [Collections](./formality_core/collections.md)
1111
- [Judgment functions and inference rules](./formality_core/judgment_fn.md)
Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1 +1,25 @@
11
# Collections
2+
3+
When using formality, it's best to use the following collection types:
4+
5+
* for sequences, use the standard `Vec` type
6+
* `formality_core::Set<T>` -- equivalent to `BTreeSet` but shorter. We use a `BTreeSet` because it has deterministic ordering for all operations.
7+
* `formality_core::Map<K, V>` -- equivalent to `BTreeMap` but shorter. We use a `BTreeMap` because it has deterministic ordering for all operations.
8+
9+
## Macros
10+
11+
We also define macros:
12+
13+
* `seq![...]` -- equivalent to `vec![]` but permits flattening with `..` notation, as described below
14+
* `set![...]` -- like `seq!`, but produces a `Set`
15+
16+
In these macros you can "flatten" things that support `IntoIterator`, so `set![..a, ..b]` will effectively perform a set union of `a` and `b`.
17+
18+
## Casting between collections and tuples
19+
20+
It is possible to upcast from variable tuple types to produce collections:
21+
22+
* A `Vec<E1>` can be upcast to a `Vec<E2>` if `E1: Upcast<E2>`.
23+
* A `Set<E1>` can be upcast to a `Set<E2>` if `E1: Upcast<E2>`.
24+
* Tuples of elements (e.g., `(E1, E2)` or `(E1, E2, E3)`) can be **upcast** to sets up to a fixed arity.
25+
* Sets and vectors can be **downcast** to `()` and `(E, C)`, where `()` succeeds only for empty collections, and `(E, C)` extracts the first element `E` and a collection `C` with all remaining elements (note that elements in sets are always ordered, so the first element is well defined there). This is useful when writing judgment rules that operate over sequences and sets.

book/src/formality_core/debug.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
# Customizing the debug
2+
3+
By default, the `#[term]` macro will generate a `Debug` impl that is guided by the `#[grammar]` attributes on your type (see the [parsing](./parse.md) section for more details). But sometimes you want to generate custom logic. You can include a `#[customize(debug)]` declaration to allow that. Most of the type, when you do this, you will also want to [customize parsing](./parse.md#customizing-the-parse), as the `RigidTy` does:
4+
5+
```rust
6+
{{#include ../../../crates/formality-types/src/grammar/ty.rs:RigidTy_decl}}
7+
```
8+
9+
Now you must simply implement `Debug` in the usual way. Here is the `RigidTy` declaration:
10+
11+
12+
```rust
13+
{{#include ../../../crates/formality-types/src/grammar/ty/debug_impls.rs:RigidTy_impl}}
14+
```

book/src/formality_core/impl_fold_visit.md

Lines changed: 0 additions & 22 deletions
This file was deleted.

book/src/formality_core/parse.md

Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
# Parsing
2+
3+
Formality's `#[term]` and `#[grammar]` attributes let you specify the grammar that will be used to parse your structs/enums.
4+
5+
For structs, there is a single grammar, specified as part of the term:
6+
7+
```rust
8+
#[term( /* grammar here */ )]
9+
struct MyStruct { }
10+
```
11+
12+
For enums, the grammar is placed in `#[grammar]` attributes on each variant:
13+
14+
```rust
15+
#[term]
16+
struct MyEnum {
17+
#[grammar( /* grammar here */ )]
18+
Foo(...),
19+
}
20+
```
21+
22+
### Ambiguity and precedence
23+
24+
When parsing an enum there will be multiple possibilities. We will attempt to parse them all. If more than one succeeds, the parser will attempt to resolve the ambiguity. Ambiguity can be resolved in two ways:
25+
26+
* Explicit precedence: By default, every variant has precedence 0, but you can override this by annotating variants with `#[precedence(N)]` (where `N` is some integer). This will override the precedence for that variant. Variants with higher precedences are preferred.
27+
* Reduction prefix: When parsing, we track the list of things we had to parse. If there are two variants at the same precedence level, but one of them had to parse strictly more things than the other and in the same way, we'll prefer the longer one. So for example if one variant parsed a `Ty` and the other parsed a `Ty Ty`, we'd take the `Ty Ty`.
28+
29+
Otherwise, the parser will panic and report ambiguity. The parser panics rather than returning an error because ambiguity doesn't mean that there is no way to parse the given text as the nonterminal -- rather that there are multiple ways. Errors mean that the text does not match the grammar for that nonterminal.
30+
31+
### Symbols
32+
33+
A grammar consists of a series of *symbols*. Each symbol matches some text in the input string. Symbols come in two varieties:
34+
35+
* Most things are *terminals* or *tokens*: this means they just match themselves:
36+
* For example, the `*` in `#[grammar($v0 * $v1)]` is a terminal, and it means to parse a `*` from the input.
37+
* Delimeters are accepted but must be matched, e.g., `( /* tokens */ )` or `[ /* tokens */ ]`.
38+
* Things beginning with `$` are *nonterminals* -- they parse the contents of a field. The grammar for a field is generally determined from its type.
39+
* If fields have names, then `$field` should name the field.
40+
* For position fields (e.g., the T and U in `Mul(Expr, Expr)`), use `$v0`, `$v1`, etc.
41+
* Exception: `$$` is treated as the terminal `'$'`.
42+
* Nonterminals can also accept modes:
43+
* `$field` -- just parse the field's type
44+
* `$*field` -- the field must be a `Vec<T>` -- parse any number of `T` instances. Something like `[ $*field ]` would parse `[f1 f2 f3]`, assuming `f1`, `f2`, and `f3` are valid values for `field`.
45+
* `$,field` -- similar to the above, but uses a comma separated list (with optional trailing comma). So `[ $,field ]` will parse something like `[f1, f2, f3]`.
46+
* `$?field` -- will parse `field` and use `Default::default()` value if not present.
47+
48+
### Greediness
49+
50+
Parsing is generally greedy. So `$*x` and `$,x`, for example, consume as many entries as they can. Typically this works best if `x` begins with some symbol that indicates whether it is present.
51+
52+
### Default grammar
53+
54+
If no grammar is supplied, the default grammar is determined as follows:
55+
56+
* If a `#[cast]` or `#[variable]` annotation is present, then the default grammar is just `$v0`.
57+
* Otherwise, the default grammar is the name of the type (for structs) or variant (for enums), followed by `()`, with the values for the fields in order. So `Mul(Expr, Expr)` would have a default grammar `mul($v0, $v1)`.
58+
59+
### Customizing the parse
60+
61+
If you prefer, you can customize the parse by annotating your term with `#[customize(parse)]`. In the Rust case, for example, the parsing of `RigidTy` is customized ([as is the debug impl](./debug.md)):
62+
63+
```rust
64+
{{#include ../../../crates/formality-types/src/grammar/ty.rs:RigidTy_decl}}
65+
```
66+
67+
You must then supply an impl of `Parse` yourself. Because `Parse` is a trait alias, you actually want to implement `CoreParse<L>` for your language type `L`. Inside the code you will want to instantiate a `Parser` and then invoke `parse_variant` for every variant, finally invoking `finish`.
68+
69+
In the Rust code, the impl for `RigidTy` looks as follows:
70+
71+
72+
```rust
73+
{{#include ../../../crates/formality-types/src/grammar/ty/parse_impls.rs:RigidTy_impl}}
74+
```

book/src/formality_core/terms.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,8 @@ The `#[term]` macro says that these are terms and we should generate all the boi
3535

3636
The `#[term]` also accepts some internal annotations:
3737

38-
* `#[cast]` can be applied on an enum variant with a single argument. It says that we should generate upcast/downcast impls to allow conversion. In this case, between a `Variable` and an `Expr`. This would generate an impl `Variable: Upcast<Expr>` that lets you convert a variable to an expression, and a downcast impl `Expr: Downcast<Variable>` that lets you try to convert from an expression to a variable. Downcasting is *fallible*, which means that the downcast will only succeed if this is an `Expr::Variable`. If this is a `Expr::Add`, the downcast would return `None`.
38+
* `#[cast]` can be applied on an enum variant with a single argument. It says that this variant represents an "is-a" relationship and hence we should generate upcast/downcast impls to allow conversion. In this case, a variable is a kind of expression -- i.e,. wrapping a variable up into an expression doesn't carry any semantic meaning -- so we want variables to be upcastable to expressions (and expressions to be downcast to variables). The `#[cast]` attribute will therefore generate an impl `Variable: Upcast<Expr>` that lets you convert a variable to an expression, and a downcast impl `Expr: Downcast<Variable>` that lets you try to convert from an expression to a variable. Downcasting is *fallible*, which means that the downcast will only succeed if this is an `Expr::Variable`. If this is a `Expr::Add`, the downcast would return `None`.
39+
* There is a special case version of `#[cast]` called `#[variable]`. It indicates that the variant represents a (type) variable -- we are not using it here because this an expression variable. Variables are rather specially in folding/parsing to allow for substitution, binders, etc.
3940
* `#[grammar]` tells the parser and pretty printer how to parse this variant. The `$v0` and `$v1` mean "recursively parse the first and second arguments". This will parse a `Box<Expr>`, which of course is implemented to just parse an `Expr`.
4041

4142
If you are annotating a struct, the `#[term]` just accepts the grammar directly, so `#[term($name)] struct Variable` means "to parse a variable, just parse the name field (a string)".

crates/formality-core/src/cast.rs

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -216,6 +216,16 @@ where
216216
}
217217
}
218218

219+
impl<T: Clone, U, E> UpcastFrom<Result<T, E>> for Result<U, E>
220+
where
221+
T: Upcast<U>,
222+
E: Clone,
223+
{
224+
fn upcast_from(term: Result<T, E>) -> Self {
225+
term.map(|t| t.upcast())
226+
}
227+
}
228+
219229
impl DowncastTo<()> for () {
220230
fn downcast_to(&self) -> Option<()> {
221231
Some(())

crates/formality-core/src/language.rs

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -22,11 +22,17 @@ pub trait Language: 'static + Copy + Ord + Hash + Debug + Default {
2222
+ UpcastFrom<CoreExistentialVar<Self>>
2323
+ UpcastFrom<CoreBoundVar<Self>>;
2424

25-
/// The character (typically `<`) used to open binders.
25+
/// The token (typically `<`) used to open binders.
2626
const BINDING_OPEN: char;
2727

28-
/// The character (typically `>`) used to open binders.
28+
/// The token (typically `>`) used to open binders.
2929
const BINDING_CLOSE: char;
30+
31+
/// Keywords to disallow as identifiers everywhere.
32+
/// It is possible to do positional keywords too, if you want,
33+
/// but it requires custom parsing impls for your types.
34+
/// No fun.
35+
const KEYWORDS: &'static [&'static str];
3036
}
3137

3238
/// For consistency with types like `CoreVariable<L>`, we write `CoreKind<L>` instead of `Kind<L>`.

crates/formality-core/src/lib.rs

Lines changed: 11 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@
77

88
// Re-export things from dependencies to avoid everybody repeating same set
99
// in their Cargo.toml.
10+
pub use anyhow::anyhow;
1011
pub use anyhow::bail;
1112
pub use contracts::requires;
1213
pub use tracing::debug;
@@ -89,6 +90,7 @@ macro_rules! declare_language {
8990
type Parameter = $param:ty;
9091
const BINDING_OPEN = $binding_open:expr;
9192
const BINDING_CLOSE = $binding_close:expr;
93+
const KEYWORDS = [$($kw:expr),* $(,)?];
9294
}
9395
) => {
9496
$(#[$the_lang_m:meta])*
@@ -104,6 +106,7 @@ macro_rules! declare_language {
104106
type Parameter = $param;
105107
const BINDING_OPEN: char = $binding_open;
106108
const BINDING_CLOSE: char = $binding_close;
109+
const KEYWORDS: &'static [&'static str] = &[$($kw),*];
107110
}
108111

109112
$crate::trait_alias! {
@@ -143,7 +146,7 @@ macro_rules! declare_language {
143146

144147
/// Parses `text` as a term with no bindings in scope.
145148
#[track_caller]
146-
pub fn try_term<T>(text: &str) -> anyhow::Result<T>
149+
pub fn try_term<T>(text: &str) -> $crate::Fallible<T>
147150
where
148151
T: Parse,
149152
{
@@ -155,26 +158,12 @@ macro_rules! declare_language {
155158
/// References to the given string will be replaced with the given parameter
156159
/// when parsing types, lifetimes, etc.
157160
#[track_caller]
158-
pub fn term_with<T, B>(bindings: impl IntoIterator<Item = B>, text: &str) -> anyhow::Result<T>
161+
pub fn term_with<T, B>(bindings: impl IntoIterator<Item = B>, text: &str) -> $crate::Fallible<T>
159162
where
160163
T: Parse,
161164
B: $crate::Upcast<(String, $param)>,
162165
{
163-
let scope = $crate::parse::Scope::new(bindings.into_iter().map(|b| b.upcast()));
164-
let (t, remainder) = match T::parse(&scope, text) {
165-
Ok(v) => v,
166-
Err(errors) => {
167-
let mut err = anyhow::anyhow!("failed to parse {text}");
168-
for error in errors {
169-
err = err.context(error.text.to_owned()).context(error.message);
170-
}
171-
return Err(err);
172-
}
173-
};
174-
if !$crate::parse::skip_whitespace(remainder).is_empty() {
175-
anyhow::bail!("extra tokens after parsing {text:?} to {t:?}: {remainder:?}");
176-
}
177-
Ok(t)
166+
$crate::parse::core_term_with::<FormalityLang, T, B>(bindings, text)
178167
}
179168
}
180169
}
@@ -237,12 +226,13 @@ macro_rules! id {
237226

238227
impl CoreParse<crate::FormalityLang> for $n {
239228
fn parse<'t>(
240-
_scope: &parse::Scope<crate::FormalityLang>,
229+
scope: &parse::Scope<crate::FormalityLang>,
241230
text: &'t str,
242231
) -> parse::ParseResult<'t, Self> {
243-
let (string, text) = parse::identifier(text)?;
244-
let n = $n::new(&string);
245-
Ok((n, text))
232+
$crate::parse::Parser::single_variant(scope, text, stringify!($n), |p| {
233+
let string = p.identifier()?;
234+
Ok($n::new(&string))
235+
})
246236
}
247237
}
248238

0 commit comments

Comments
 (0)