Commit eab2309

Switch frontmatter syntax
- Based on the escaping problem, switch away from markdown-style
- Based on my experience, switch to frontmatter-style
- Based on observing a new-to-this-syntax user, allow blank lines
- Allow infostring to be optional but don't remove it
1 parent 2955406 commit eab2309

1 file changed: +56 -73 lines changed

text/3503-frontmatter.md

@@ -10,10 +10,10 @@
 Add a frontmatter syntax to Rust as a way for [cargo to have manifests embedded in source code][RFC 3502]:
 ````rust
 #!/usr/bin/env cargo
-```cargo
+---
 [dependencies]
 clap = { version = "4.2", features = ["derive"] }
-```
+---

 use clap::Parser;

@@ -30,11 +30,6 @@ fn main() {
 }
 ````

-Note that to share these in markdown, a priority use case, extra backticks are needed for the markdown code fence to escape the frontmatter code fence.
-We expect most users will not be familiar enough with the markdown spec to know this, especially for one of our primary target audiences: those new to Rust.
-This can also be frustrating for experienced users as three backticks is an ingrained habbit and it is common to need to go back and edit a post to properly escape the frontmatter.
-However, when weighing out the syntactic needs and the alternatives, we felt this was the least bad option.
-
 # Motivation
 [motivation]: #motivation

@@ -55,14 +50,13 @@ name: My Blog Post
 Hello world!
 ```

-We are carrying this concept over to Rust with a twist: using fence code blocks which
-will be familiar to Rust developers when documenting their code:
+We are carrying this concept over to Rust while merging some lessons from commonmark's fenced code blocks:
 ````rust
 #!/usr/bin/env cargo
-```cargo
+---
 [dependencies]
 clap = { version = "4.2", features = ["derive"] }
-```
+---

 use clap::Parser;

@@ -79,20 +73,19 @@ fn main() {
 }
 ````

-As we work to better understand how tool authors will want to use frontmatter, we are restricting it to just the `cargo` infostring.
-This means users will only be exposed to this within the concept of ["cargo script"][RFC 3502].
+Like with [commonmark code fences](https://spec.commonmark.org/0.30/#info-string),
+an info-string is allowed after the opening `---` for use by the command interpreting the block to identify the contents of the block.

 # Reference-level explanation
 [reference-level-explanation]: #reference-level-explanation

-When parsing Rust code, after stripping the shebang (`#!`), rustc will strip a fenced code block:
-- Must be immediately at the top (after shebang stripping), meaning no blank lines
-- Opens with 3+ backticks and "cargo" followed by a newline
-- As we aren't supporting an arbitrarily nested file format (though may show up in one), we likely don't need the flexibility
-- We are prioritizing on "one right way to do it" to make it easier to learn to write and to read a variety of files.
-- All content is ignored until the same number of backticks is found at the start of a line.
-It is an error to have anything besides spaces and tabs between the backticks and the newline.
-- Unlike commonmark, it is an error to not close the fenced code block seeing to detect problems earlier in the process seeing as the primary content is what comes after the fenced code block
+When parsing Rust code, after stripping the shebang (`#!`), rustc will strip the frontmatter:
+- May include 0+ blank lines (whitespace + newline)
+- Opens with 3+ dashes followed by 0+ whitespace, an optional identifier, 0+ whitespace, and a newline
+- The variable number of dashes is an escaping mechanism in case `---` shows up in the content
+- All content is ignored by `rustc` until the same number of dashes is found at the start of a line.
+The line must end with 0+ whitespace and then a newline.
+- Unlike commonmark, it is an error to not close the frontmatter, so as to detect problems earlier in the process, seeing as the primary content is what comes after the frontmatter

 As cargo will be the first step in the process to parse this,
 the responsibility for high quality error messages will largely fall on cargo.
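
As a reading aid for the reference-level rules added in this hunk, here is a minimal sketch of how a tool might split a source file into infostring, frontmatter content, and the remaining Rust code. It is not rustc's or cargo's implementation. The names `split_frontmatter`, `Split`, and `FrontmatterError` are hypothetical, and the shebang and identifier checks are simplified. Two behaviors are assumptions of the sketch rather than part of the RFC text: it accepts a closing fence at end-of-file without a trailing newline, and it treats a line with the right number of dashes followed by other text as ordinary content.

```rust
/// Pieces of a source file once the frontmatter has been split off.
#[derive(Debug, PartialEq)]
pub struct Split<'a> {
    /// Optional infostring after the opening dashes (e.g. `cargo`).
    pub info: Option<&'a str>,
    /// Raw text between the opening and closing fences.
    pub content: &'a str,
    /// Everything after the closing fence: the Rust code proper.
    pub rest: &'a str,
}

#[derive(Debug, PartialEq)]
pub enum FrontmatterError {
    /// The text after the opening dashes is not a single identifier.
    InvalidInfostring,
    /// No line with the same number of dashes closes the block.
    Unclosed,
}

/// Counts the dashes at the start of a line.
fn leading_dashes(line: &str) -> usize {
    line.bytes().take_while(|b| *b == b'-').count()
}

/// Returns `Ok(None)` when the file has no frontmatter.
pub fn split_frontmatter(source: &str) -> Result<Option<Split<'_>>, FrontmatterError> {
    let mut rest = source;

    // Simplified shebang stripping: drop the first line if it starts with
    // `#!` but not `#![`, which would be an inner attribute.
    if rest.starts_with("#!") && !rest.starts_with("#![") {
        rest = rest.split_once('\n').map_or("", |(_, tail)| tail);
    }

    // Skip 0+ blank lines (whitespace followed by a newline).
    while let Some((line, tail)) = rest.split_once('\n') {
        if !line.trim().is_empty() {
            break;
        }
        rest = tail;
    }

    // Opening fence: 3+ dashes, optional identifier, optional whitespace.
    let (open, body) = rest.split_once('\n').unwrap_or((rest, ""));
    let dashes = leading_dashes(open);
    if dashes < 3 {
        return Ok(None);
    }
    let info = open[dashes..].trim();
    // Simplified identifier check (assumption): letters, digits, underscore.
    if !info.chars().all(|c| c.is_alphanumeric() || c == '_') {
        return Err(FrontmatterError::InvalidInfostring);
    }
    let info = if info.is_empty() { None } else { Some(info) };

    // Scan for a line that starts with exactly the same number of dashes
    // and has only whitespace after them.
    let mut offset = 0;
    let mut remaining = body;
    while !remaining.is_empty() {
        let (line, tail) = remaining.split_once('\n').unwrap_or((remaining, ""));
        if leading_dashes(line) == dashes && line[dashes..].trim().is_empty() {
            return Ok(Some(Split { info, content: &body[..offset], rest: tail }));
        }
        offset += line.len() + 1;
        remaining = tail;
    }
    Err(FrontmatterError::Unclosed)
}
```

Because only a line with exactly the opening number of dashes closes the block, a four-dash fence can carry content that itself contains a `---` line.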
@@ -102,7 +95,6 @@ the responsibility for high quality error messages will largely fall on cargo.

 - A new concept for Rust syntax, adding to overall cognitive load
 - Ecosystem tooling updates to deal with new syntax
-- **When sharing in markdown documents (e.g. GitHub issues), requires people escape markdown code fences with an extra backtick which they are likely not used to doing (or aware even exists)**

 # Rationale and alternatives
 [rationale-and-alternatives]: #rationale-and-alternatives
@@ -112,8 +104,6 @@ we considered starting with only allowing this in the root `mod` (e.g. `main.rs`
 but decided to allow it in any file mostly for ease of implementation.
 Like with Python, this allows any file in a package (with the correct deps and `mod`s) to be executed, allowing easier interacting experiences in verifying behavior.

-As for the hard-coded infostring used by cargo, that is a decision for [RFC 3502].
-
 ## Required vs Optional Shebang

 We could require the shebang to be present for all cargo-scripts.
@@ -126,6 +116,28 @@ However, statically analyzing a shebang is [complicated](https://stackoverflow.c
 and we are wanting to avoid it in the core workflow.
 This isn't to say that tools like rust-analyzer might choose to require it to help their workflow.

+## Blank lines
+
+Originally, the proposal viewed the block as being "part of" the shebang and didn't allow them to be separated by blank lines.
+However, the shebang is optional and users are likely to assume they can use blank lines
+(see https://www.youtube.com/watch?v=S8MLYZv_54w).
+
+This could cause ordering confusion (doc comments vs attributes vs frontmatter).
+
+## Infostring
+
+The main question on infostrings is whether they are tool-defined or rustc-defined.
+At one time, we proposed requiring the infostring and requiring it be `cargo` as a way to defer this decision.
+
+As the design requirements are catered to processing by external tools, as opposed to rustc,
+we are instead reserving this syntax for external tools by making the infostrings tool-defined.
+The Rust toolchain (rustc, clippy, rustdoc, etc) already has access to attributes for user-provided content.
+If they need a more ergonomic way of specifying content, we should solve that more generally for attributes.
+
+With that decision made, the infostring can be optional.
+Can it also be deferred out?
+Possibly, but we are leaving them in for unpredictable exception cases and in case users want to make the syntax explicit for their editor (especially if it's not `cargo`, which more trivial editor implementations will likely assume).
+
 ## Syntax

 [RFC 3502] lays out some design principles, including
@@ -149,7 +161,7 @@ When choosing the syntax, our care-abouts are
 - Leave the door open in case we want to reuse the syntax for embedded lockfiles
 - Leave the door open for single-file `lib`s

-### Fenced Code Block Frontmatter
+### Frontmatter

 This proposed syntax builds off of the precedent of Rust having syntax specialized for an external tool
 (doc-comments for rustdoc).
@@ -168,10 +180,11 @@ This proposal mirrors the location of YAML frontmatter (absolutely first).
 As we learn more of its uses and problems people run into in practice,
 we can evaluate if we want to loosen any of the rules.

-We are intentionally supporting only a subset of commonmark code fences.
-Markdown, like HTML, is meant to always be valid which is different than Rust syntax.
-Differences include:
-- backticks but not tilde's
+Differences with YAML frontmatter include:
+- Variable number of dashes (for escaping)
+- Optional frontmatter
+
+Besides characters, differences with commonmark code fences include:
 - no indenting of the fenced code block
 - open/close must be a matching pair, rather than the close having "the same or more"
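
To make the "variable number of dashes (for escaping)" point concrete, here is a small sketch of how a tool that generates such files might choose a fence. The helper names `fence_for` and `embed` are hypothetical and not part of the RFC or of cargo. The sketch conservatively picks a fence one dash longer than the longest run of dashes that starts any content line, which is more than the matching-pair rule strictly requires.

```rust
/// Picks a fence of at least three dashes that no line of `content`
/// can be mistaken for: one dash longer than the longest run of dashes
/// at the start of any content line.
pub fn fence_for(content: &str) -> String {
    let longest = content
        .lines()
        .map(|line| line.bytes().take_while(|b| *b == b'-').count())
        .max()
        .unwrap_or(0);
    "-".repeat(longest.max(2) + 1)
}

/// Wraps `content` in a frontmatter block with the given infostring
/// (pass an empty string to omit it).
pub fn embed(info: &str, content: &str) -> String {
    let fence = fence_for(content);
    // Make sure the closing fence starts on its own line.
    let newline = if content.is_empty() || content.ends_with('\n') { "" } else { "\n" };
    format!("{fence}{info}\n{content}{newline}{fence}\n")
}
```

For typical manifests the result is the plain three-dash form. The longer fence only appears when the content itself has a line beginning with `---`.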

@@ -187,47 +200,45 @@ Benefits:
 - In the future, this can be leveraged by other build systems or tools

 Downsides:
-- **When sharing in markdown documents (e.g. GitHub issues), requires people escape markdown code fences with an extra backtick which they are likely not used to doing (or aware even exists)**
-- Maintainers seeding GitHub issue templates with 4 backticks can help
 - Familiar syntax in an unfamiliar use may make users feel unsettled, unsure how to proceed (what works and what doesn't).
 - If viewed from the lens of a comment, it isn't a variant of comment syntax like doc-comments

 ### Alternative 1: Vary the opening/closing character

-Instead of backticks, we could do another character, like
-- `-`, making it look like YAML presentation streams, following the pattern of static site generators
-- `+` like [zola's frontmatter](https://www.getzola.org/documentation/getting-started/overview/#markdown-content)
-- `~`, using a lesser known markdown character
+Instead of dashes, we could do another character, like
+- backticks, like in commonmark code fences
+- `~`, using a lesser known markdown code fence character
+- `+` like [zola and hugo's TOML frontmatter](https://www.getzola.org/documentation/getting-started/overview/#markdown-content)
 - `=`
 - Open with `>>>` and close with `<<<`, like with HEREDOC (or invert it)

-In practice:
-```rust
+In practice (with infostrings):
+````rust
 #!/usr/bin/env cargo
----cargo
+```cargo
 [package]
 edition = "2018"
----
+```

 fn main() {
 }
-```
+````
 ```rust
 #!/usr/bin/env cargo
-+++cargo
+~~~cargo
 [package]
 edition = "2018"
-+++
+~~~

 fn main() {
 }
 ```
 ```rust
 #!/usr/bin/env cargo
-~~~cargo
++++cargo
 [package]
 edition = "2018"
-~~~
++++

 fn main() {
 }
@@ -263,19 +274,10 @@ fn main() {
 }
 ```

-Benefits
-- With `-`, it builds on people's familiarity with static site generators
-- People can insert cargo-scripts into markdown (like chat, github issues)
-without being familiar enough with markdown to know how to escape backticks
-and to actually remember how to do it
-
 Downsides
-- With `-`
-- We've extended the frontmatter syntax with an infostring, undoing some of the "familiarity" benefit
-- Potential congantive disonance as those familiar with frontmatter are used to YAML being there
 - With `>>>` it isn't quite like HEREDOC to have less overhead
 - `>>>`, `<<<`, `|||`, `===` at the beginning of lines start to look like merge conflicts which might confuse external tools
-- Doesn't feel very rust-like
+- Backticks have a problem with users knowing how to, and remembering to, escape these blocks when sharing them in markdown. Even knowing the syntax (only because I've implemented a parser for it), I'm at about 50/50 on whether I properly escape them.

 Note:
 - `"` was not considered because that can feel too familiar and users might carry over their expectations for how strings work
@@ -617,28 +619,9 @@ pprint([(k, v["title"]) for k, v in data.items()][:10])
 - Support infostring attributes
 - We need to better understand use cases for how this should be extended, particularly what the syntax should be (see infostring language)
 - A safe starting point could be to say that a space or comma separates attributes and everything after it is defined as part of the "language"
-- Loosen the code-fence syntax, like allowing newlines
 - Add support for a `#[frontmatter(info = "", content = "")]` attribute that this syntax maps to.
 - Since nothing will read this, whether we do it now or in the future will have no effect

-## Optional or additional infostrings
-
-We could support:
-- Treat `cargo` as the default infostring
-- Support more infostring languages
-
-The question comes down to whether
-- rustc owns the definition of the infostring, allowing us to add additional types of metadata (rustfmt config, static analyzer config, etc)
-- This would be similar to our [hard coding of "tool" attributes](https://github.com/rust-lang/rust/issues/44690)
-- the shebang tool owns the definition of the infostring
-
-By us hard coding `cargo` in the infostring in rustc,
-we are intentionally deferring the decision for which path we should go down.
-
-We can add additional infostrings on a case-by-case basis.
-In doing so, we can learn more about the use cases involved which can help us
-get a better picture for which route we should go down.
-
 ## Multiple frontmatters

 At least for cargo's use cases, the only other file that we would consider supporting is `Cargo.lock`
