Skip to content

How to Create and Review PRs

Jude Nelson edited this page Feb 24, 2023 · 8 revisions

Overview

NOTE(jude): this is a work in progress. Please do not substantially edit until this line is removed. Thanks!

A good PR review sets both the submitter and reviewers up for success. It minimizes the time required by both parties to get the code into an acceptable state, without sacrificing quality.

This document describes some best practices on how to create and review PRs. The target audience is people who have commit access to this repository. This is a living document -- developers can and should document their own experiences here.

Reviewer Expectations

The overall task of a reviewer is to create an acceptance plan for the submitter. This is simply the list of things that the submitter must do in order for the PR to be merged. The acceptance plan should be coherent, cohesive, and complete enough that the reviewer will understand exactly what they need to do to make the PR worthy of merging, without further reviews.

That said, reviewers should strive to complete the review in one round. Ideally, the reviewer provides enough detail to the submitter that the submitter can make all of the requested changes without further supervision.

Reviewers should aim to perform a reviewer in one sitting whenever possible. This enables a reviewer to time-box their review, and ensures that by the time they finish studying the code, they have a full, fresh understanding of what the PR does. This, in turn, sets them up for success when writing up the acceptance plan. It also enables reviewers to mark time for it on their calendars.

Code reviews should be timely. A PR review should begin no more than 2 business days after the PR is submitted. The develop and next branches in particular often change quickly, so letting a PR languish only creates more merge work for the submitter.

If a review cannot be begun within 2 business days, then the reviewers should tell the submitter when they can begin. This gives the reviewer the opportunity to keep working on the PR (if needed) or even withdraw and resubmit it.

Submitter Expectations

Everyone is busy all the time with a host of different tasks. Consequently, a PR's size and scope should be constrained so that a review can be written for it no more than 2 hours. This time block starts when the reviewer opens the diff, and ends when the reviewer hits the "submit review" button. If it takes more than 2 hours, then the PR should be broken into multiple PRs unless the reviewers agree to spend more time on it. A PR can be rejected if the reviewers believe they will need longer than this.

The size and scale of a PR depend on the reviewers' abilities to process the change. Different reviewers and submitters have different levels of familiarity with the codebase. Moreover, everyone has a different schedule -- sometimes, some people are more busy than others. A successful PR submitter takes the reviewers' familiarity and availability into account when crafting the PR, even going to far as to ask in advance if a particular person could be available for review.

Submission Guidelines

A PR submission's text should answer the following questions for reviewers:

  • What problem is being solved by this PR?
  • What does the solution do to address them?
  • Why is this the best solution? What alternatives were considered, and why are they worse?
  • What do reviewers need to be familiar with in order to provide useful feedback?
  • What issue(s) are addressed by this PR?

In addition, the PR submission should answer the prompts of the Github template we use for PRs.

The code itself should adhere to the following guidelines, which both submitters and reviewers should check:

Documentation

  • Each file must have a copyright statement.
  • Any new non-test modules should have module-level documentation explaining what the module does, and how it fits into the blockchain as a whole.
  • Any new files must have some top-of-file documentation that describes what the contained code does, and how it fits into the overall module.
  • Each public function, struct, and trait should have a Rustdoc comment block describing the API contract it offers. This goes for private structs and traits as well.
  • Each non-trivial private function should likewise have a Rustdoc comment block. Trivial ones that are self-explanatory, like getters and setters, do not need documentation.
  • Each struct member must have a Rustdoc comment string indicating what it does, and how it is used. This can be as little as a one-liner, as long as the relevant information is communicated.

Factoring

  • Public struct and trait definitions go into the mod.rs file. Private traits and structs can go anywhere.

  • Each non-mod.rs file represents at most one subsystem. It may include multiple struct implementations and trait implementations. The filename should succinctly identify the subsystem.

  • Directories represent collections of related but distinct subsystems.

  • To the greatest extent possible, business logic and I/O should be separated. A common pattern used in the codebase is to place the business logic into an "inner" function that does not do I/O, and handle I/O reads and writes in an "outer" function. The "outer" function only does the needful I/O and passes the data into the "inner" function. The "inner" function is often private, whereas the "outer" function is often public.

Refactoring

  • Any PR that does a large-scale refactoring (i.e. spanning multiple large subsystems) must be in its own PR. Refactoring often adds line noise that obscures the new functional changes that the PR proposes. Small-scale refactorings are permitted to ship with functional changes.

  • Refactoring PRs can generally be bigger, because they are easier to review. However, large refactorings that affect functional behavior of the system should be discussed first before carried out. This is because it is imperative that they do not stay open for very long (to keep the submitter's maintenance burden low), but nevertheless reviewing them must still take at most 2 hours. Discussing them first front-loads part of the review process.

Databases

  • If at all possible, the database schema should be preserved. Exceptions can be made on a case-by-case basis. The reason for this is that it's a big ask for people to re-sync nodes from genesis when they upgrade to a new point release.

  • Any changes to a database schema must also ship with schema migration logic, as well as test coverage for it.

  • The submitter must verify that any new database queries are indexed. Table scans are not permitted if they can be avoided (and they almost always can be).

  • Database changes cannot be consensus-critical unless part of a hard fork (see below).

Consensus-Critical Code

A consensus-critical change is a change that affects how the Stacks blockchain processes blocks, microblocks, or transactions, such that a node with the patch will produce a different state root hash than a node without the patch. If this is even possible, then the PR is automatically treated as a consensus-critical change and must ship as part of a hard fork.

  • All changes to consensus-critical code must be opened against next. It is never acceptable to open them against develop or master.

  • All consensus-critical changes must be gated on the Stacks epoch. They may only take effect once the system enters a specific epoch (and this must be documented).

A non-exhaustive list of examples of consensus-critical changes include:

  • Changing block, microblock, or transaction wire formats
  • Changing the data that gets stored to a MARF key/value pair in the Clarity or Stacks chainstate MARFs
  • Changing the order in which data gets stored in the above
  • Adding, changing, or removing Clarity functions
  • Changing the cost of a Clarity function
  • Adding new kinds of transactions, or enabling certain transaction data field values that were previously forbidden.

Testing

  • Tests should focus on the business logic with mocked data. To the greatest extent possible, each error path should be tested in addition to the success path. A submitter should expect to spend most of their test-writing time focusing on error paths; getting the success path to work is often much easier than the error paths.

  • Tests should verify that the I/O code paths work, but do so in a way that does not "clobber" other tests or prevent other tests from running in parallel (if it can be avoided).

  • If randomness is needed, tests should use a seeded random number generator if possible. This ensures that they will reliably pass in CI.

  • When testing a consensus-critical code path, the test coverage should verify that the new behavior is only possible within the epoch(s) in which the behavior is slated to activate. Above all else, backwards-compatibility is a hard requirement.

Clone this wiki locally