Commit 6d156b3

Merge pull request rust-lang#669 from rylev/improve-ci-docs
Improve CI docs
2 parents 4ca046c + 5b8dcb0 commit 6d156b3

1 file changed

+62
-8
lines changed

src/infra/docs/rustc-ci.md

Lines changed: 62 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -1,8 +1,17 @@
11
# How the Rust CI works
22

3+
Rust CI ensures that the master branch of rust-lang/rust is always in a valid state.
4+
5+
A developer submitting a pull request to rust-lang/rust experiences the following:
6+
7+
- A small subset of tests and checks are run on each commit to catch common errors.
8+
- When the PR is ready and approved, the "bors" tool enqueues a full CI run.
9+
- For the full run, bors either queues the PR on its own or "rolls it up" with other changes.
10+
- Eventually a CI run containing the changes from the PR is performed and either passes or fails with an error the developer must address.
11+
312
## Which jobs we run
413

5-
The `rust-lang/rust` repository uses GitHub Actions to test [all the other
14+
The `rust-lang/rust` repository uses GitHub Actions to test [all the
615
platforms][platforms] we support. We currently have two kinds of jobs running
716
for each commit we want to merge to master:
817

@@ -12,15 +21,14 @@ for each commit we want to merge to master:
1221
[rustup-toolchain-install-master] tool. The same builds are also used for
1322
actual releases: our release process basically consists of copying those
1423
artifacts from `rust-lang-ci2` to the production endpoint and signing them.
15-
1624
- Non-dist jobs run our full test suite on the platform, and the test suite of
1725
all the tools we ship through rustup. The amount of stuff we test depends on
1826
the platform (for example some tests are run only on Tier 1 platforms), and
1927
some quicker platforms are grouped together on the same builder to avoid
2028
wasting CI resources.
2129

2230
All the builds except those on macOS and Windows are executed inside that
23-
platform’s custom Docker container. This has a lot of advantages for us:
31+
platform’s custom [Docker container]. This has a lot of advantages for us:
2432

2533
- The build environment is consistent regardless of the changes of the
2634
underlying image (switching from the trusty image to xenial was painless for
@@ -32,13 +40,22 @@ platform’s custom Docker container. This has a lot of advantages for us:
3240
- Users can run the same tests in the same environment locally by just running
3341
`src/ci/docker/run.sh image-name`, which is great for debugging failures.
3442

43+
The Docker images prefixed with `dist-` are used for building artifacts, while those without that prefix run tests and checks.
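
The naming convention above can be expressed as a simple prefix check. This is only an illustrative sketch: the function name `job_kind` is hypothetical, and apart from `dist-x86_64-linux` the image names are example values, not a list taken from the repository.

```python
def job_kind(image_name: str) -> str:
    """Classify a CI Docker image by the naming convention described above."""
    # Images whose name starts with "dist-" build artifacts;
    # every other image runs tests and checks.
    return "dist" if image_name.startswith("dist-") else "test"


# Example image names (illustrative only).
images = ["dist-x86_64-linux", "x86_64-gnu", "mingw-check"]
for name in images:
    print(f"{name}: {job_kind(name)}")
```
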
44+
3545
We also run tests for less common architectures (mainly Tier 2 and Tier 3
3646
platforms) in CI. Since those platforms are not x86 we either run
3747
everything inside QEMU or just cross-compile if we don’t want to run the tests
3848
for that platform.
3949

50+
These jobs run on a special pool of builders set up and maintained for us by GitHub.
51+
52+
Almost all build steps shell out to separate scripts. This keeps the CI fairly platform independent (i.e., we are not
53+
overly reliant on GitHub Actions). GitHub Actions is only relied on for bootstrapping the CI process and for orchestrating
54+
the scripts that drive the process.
55+
4056
[platforms]: https://doc.rust-lang.org/nightly/rustc/platform-support.html
4157
[rustup-toolchain-install-master]: https://github.com/kennytm/rustup-toolchain-install-master
58+
[Docker container]: https://github.com/rust-lang/rust/tree/master/src/ci/docker
4259
[dist-x86_64-linux]: https://github.com/rust-lang/rust/blob/master/src/ci/docker/host-x86_64/dist-x86_64-linux/Dockerfile
4360

4461
## Merging PRs serially with bors
@@ -63,25 +80,47 @@ Since the merge commit is based on the latest master and only one can be tested
6380
at the same time, when the results are green master is fast-forwarded to that
6481
merge commit.
6582

83+
The `auto` branch and other branches used by bors live on a fork of rust-lang/rust:
84+
[rust-lang-ci/rust]. This was originally done due to some security limitations in GitHub
85+
Actions. These limitations have been addressed, but we've not yet done the work of removing
86+
the use of the fork.
87+
6688
Unfortunately testing a single PR at a time, combined with our long CI (~3
67-
hours for a full run), means we can’t merge too many PRs in a single day, and a
89+
hours for a full run)[^1], means we can’t merge too many PRs in a single day, and a
6890
single failure greatly impacts our throughput for the day. The maximum number
6991
of PRs we can merge in a day is around 8.
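
The figure of around 8 follows from simple arithmetic. The sketch below assumes that full runs happen back to back, that each takes exactly the stated ~3 hours, and that each run tests a single PR; the function name and the `failed_runs` parameter are hypothetical, introduced only for illustration.

```python
def max_merges_per_day(run_hours: float, failed_runs: int = 0) -> int:
    """Upper bound on serially tested merges in 24 hours.

    Assumes runs are back to back and every run (passing or failing)
    takes `run_hours`. A failed run uses its slot without merging anything.
    """
    hours_left = 24 - failed_runs * run_hours
    return max(0, int(hours_left // run_hours))


print(max_merges_per_day(3))                 # 24 / 3 = 8 merges
print(max_merges_per_day(3, failed_runs=1))  # one failure leaves 7 merges
```

Under these assumptions a single failed run costs one of only eight slots, which is why one failure has such a visible effect on the day's throughput.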
7092

93+
The long CI run times and the need for a large builder pool are largely due to the
94+
fact that full release artifacts are built in the `dist-` builders. This is worth it
95+
because these release artifacts:
96+
97+
- allow perf testing even at a later date
98+
- allow bisection when bugs are discovered later
99+
- ensure release quality since if we're always releasing, we can catch problems early
100+
101+
Bors [runs on ECS](https://github.com/rust-lang/simpleinfra/blob/master/terraform/bors/app.tf) and uses a SQLite database running in a volume as storage.
102+
103+
[^1]: As of January 2023, the bottlenecks are the `dist-x86_64-linux` and `dist-x86_64-linux-alt` runners because of their usage of [BOLT] and [PGO] optimization tooling.
104+
71105
[bors]: https://github.com/bors
72106
[homu]: https://github.com/rust-lang/homu
73107
[homu-queue]: https://bors.rust-lang.org/queue/rust
108+
[rust-lang-ci/rust]: https://github.com/rust-lang-ci/rust
109+
[BOLT]: https://github.com/facebookincubator/BOLT
110+
[PGO]: https://en.wikipedia.org/wiki/Profile-guided_optimization
74111

75112
### Rollups
76113

77114
Some PRs don’t need the full test suite to be executed: trivial changes like
78115
typo fixes or README improvements *shouldn’t* break the build, and testing
79116
every single one of them for 2 to 3 hours is a big waste of time. To solve this
80117
we do a "rollup", a PR where we merge all the trivial PRs so they can be tested
81-
together. Rollups are created manually by a team member who uses their
82-
judgement to decide if a PR is risky or not, and are the best tool we have at
118+
together. Rollups are created manually by a team member using the "create a rollup" button on the [bors queue]. The team member uses their
119+
judgment to decide whether a PR is risky. Rollups are the best tool we have at
83120
the moment to keep the queue in a manageable state.
84121

122+
[bors queue]: https://bors.rust-lang.org/queue/rust
123+
85124
### Try builds
86125

87126
Sometimes we need a working compiler build before approving a PR, usually for
@@ -91,6 +130,8 @@ a separate branch (`try`), and they basically work the same as normal builds,
91130
without the actual merge at the end. Any number of try builds can happen at the
92131
same time, even if there is a normal PR in progress.
93132

133+
You can see the CI configuration for try builds [here](https://github.com/rust-lang/rust/blob/9d46c7a3e69966782e163877151c1f0cea8b630a/src/ci/github-actions/ci.yml#L728-L741).
134+
94135
[perf]: https://perf.rust-lang.org
95136
[crater]: https://github.com/rust-lang/crater
96137

@@ -179,8 +220,8 @@ automatically, posting it on the PR.
179220
The bot is not hardcoded to look for error strings, but was trained with a
180221
bunch of build failures to recognize which lines are common between builds and
181222
which are not. While the generated snippets can be weird sometimes, the bot is
182-
pretty good at identifying the relevant lines even if it’s an error we never
183-
saw before.
223+
pretty good at identifying the relevant lines even if it’s an error we've never
224+
seen before.
184225

185226
[rla]: https://github.com/rust-lang/rust-log-analyzer
186227

@@ -206,5 +247,18 @@ few days before we promote nightly to beta.
206247

207248
More information is available in the [toolstate documentation].
208249

250+
### GitHub Actions Templating
251+
252+
GitHub Actions does not natively support templating, which can make configurations large and difficult to change. We use YAML anchors for templating and a custom tool, [`expand-yaml-anchors`], to expand [the template] into the CI configuration that [GitHub uses][ci config].
253+
254+
This templating language is fairly straightforward:
255+
256+
- `&` indicates a template section
257+
- `*` expands the indicated template in place
258+
- `<<` merges YAML dictionaries
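
As a rough analogy, the three operators above behave like the following operations on Python dictionaries. This is a sketch of the YAML semantics only, not of how `expand-yaml-anchors` is implemented, and the job names, keys, and values are hypothetical.

```python
# `&base-job` names a mapping so it can be reused later (the template).
base_job = {"runs-on": "ubuntu-latest", "timeout-minutes": 600}

# `*base-job` on its own reuses the template unchanged (an alias).
alias_job = base_job

# `<<: *base-job` followed by extra keys merges the template into a new
# mapping; keys written explicitly take precedence over merged ones.
dist_job = {**base_job, "name": "dist-example", "timeout-minutes": 900}

print(dist_job)
```

The template itself is left untouched by the merge, so several jobs can share it while overriding only the keys that differ.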
259+
209260
[rust-toolstate]: https://rust-lang-nursery.github.io/rust-toolstate
210261
[toolstate documentation]: ../toolstate.md
262+
[`expand-yaml-anchors`]: https://github.com/rust-lang/rust/tree/master/src/tools/expand-yaml-anchors
263+
[the template]: https://github.com/rust-lang/rust/blob/736c675d2ab65bcde6554e1b73340c2dbc27c85a/src/ci/github-actions/ci.yml
264+
[ci config]: https://github.com/rust-lang/rust/blob/master/.github/workflows/ci.yml
