-
Notifications
You must be signed in to change notification settings - Fork 7.6k
doc: show how to use to git to submit smaller and faster PRs #83839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is great—it really is—and I will provide feedback on the suggested additions that I don't see any reason not to include. To put it differently, I find it unfair to tell contributors something along the lines of "RTFM wrt how to 'properly' contribute" when their natural tendency will always be to use common sense and what they observe as being some kind of norm. "Do as we say, not as you see...". Am I making sense? Thoughts? |
I totally get what you mean and I think what you call "double standards" and "unfair", others would call it... "trust"? Contributors with a good track record tend to get less scrutiny whereas brand new people rarely get any slack. I don't think this is intrinsically wrong because the risk level is actually different and trusting old timers reduces the total review workload which is always, always a problem[*]. But there is a very fine balance and it can definitely go too far sometimes. [*] code reviews are like everything else in QA: everyone wants quality but no one wants to pay for it.
Me neither - I don't think there is any easy way to solve this besides maintainers trying to keep an eye on their own biases and helping each other. This is not an exact science. If anything at all, I hope this PR only helps mitigate such "unfairness" by clarifying a guideline that should apply to all? BTW there is a common misconception that newcomers cannot review code. It is true that new reviewers cannot be as thorough but they can still be very useful:
This is a bit of a special case because... who cares if vendors break ONLY their own products? Small exaggeration to get the point across, it's getting late. You see what I mean.
Now that is 100% wrong and should be flagged - and probably reverted temporarily!
What I'm trying to reinforce in this PR is: smaller PRs are also in the interest of the submitter themselves. Even experienced people who managed to fly quickly "under the radar" don't want to end up wasting time post merge dealing with regressions, drama and reverts. Do they? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems useful. My immediate thought goes to all the giant PRs I've pushed over the years, but in general subsystem-/tree-wide changes aren't really the subject of general docs. For sure we can all agree this is the kind of PR we'd prefer to see reviewed and merged.
36dce5b
to
0c1fe17
Compare
The documentation had for a long time a section that specifically recommends to submit "smaller PRs" for review: https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#defining-smaller-prs Yet submitters keep submitting large PRs on a regular basis, sometimes very large ones. See a couple of very recent examples below. (Reminder: submitting a giant, draft PR for pure _testing_ purposes and NOT for review is a perfectly fine) The "natural" explanation is that submitters optimistically and wrongly believe that dumping a large amount of code at once onto reviewers will be dealt with faster than in smaller chunks. This is most likely a contributing factor but most people should quickly learn from bad experience. Yet we keep seeing large PRs on a regular basis. So there must be other factors too. Based on personal but fairly extensive git support experience, another top reason is likely git usability and some lack of git knowledge (neither the first nor the last time git usability would have a significant impact) To help with this, add to the existing git guide the simple command that lets split a large submission in several, smaller PRs. This can only help demystify and promote smaller PRs while barely growing the size of the documentation. While at it, also add a couple missing benefits of smaller PRs. Recent examples of large PRs: - In the controversial and giant PR zephyrproject-rtos#77368 (comment) the exhausted submitter wrote: > Every time any one person requests a rebase that changes the PR, > unless there's consensus, there's no mechanism (manual/project process > or built into GitHub) to know/prevent a different person from rejecting > the new changes. That PR had 21 commits (18 in the final version), 82 files changed and 400 (!) comments. The sheer size massively increased the likelihood of the problem described. Re-submitting it in smaller chunks would obviously have mitigated that problem. Yet that PR was never split and stayed huge...? - In this second example, a large PR was submitted with different authors. A disagreement emerged about squashing across different authors: zephyrproject-rtos#78795 (comment) If this PR had been split in smaller chunks, then the squashing discussion might have been avoided entirely. Whether squashing is good or bad in this particular case is irrelevant (and already discussed at great in length in zephyrproject-rtos#83117). What matters here is: the submitter lost that chance by submitting a larger PR with different authors. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
0c1fe17
to
02f4742
Compare
v2: added new item: "Better test coverage AFTER merge". @andyross 's approval lost sorry (I know it's a pet peeve of yours...) |
I'd argue that we shouldn't be allowing really large PRs by experienced/trusted contributors either. Everything enumerated @marc-hb about small PRs still applies regardless of the contributor. As a community, we should strive to be more consistent in enforcement of policies and recommended practices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how verbose do we need to get? This just repeats what was already in the docs:
- https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#defining-smaller-prs
- https://docs.zephyrproject.org/latest/contribute/guidelines.html#contribution-workflow
Over the last year, with 10k PRs, the mediam commit number in PRs was one. I do not thing this is a general problem we have, and those PRs that have multiple commits, probably they have this for a good reason.
Some of the points here probably can go into one of the existing spots I quickly found, I do not think we need to expand on this across the tree repeating ourselves and duplicating the message (which is a common theme in our docs).
and getting code merged before the whole work is complete. This faster | ||
approach is known as "Stacked Diffs". | ||
|
||
- Less wasted work if reviewers or maintainers reject the direction of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this considered a "benefit"? If the direction is rejected, and part of it were already merged, that is not considered a good thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not new, I merely moved it. I could remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I would remove it honestly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @nashif for the feedback.
how verbose do we need to get? This just repeats what was already in the docs:
https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#defining-smaller-prs
https://docs.zephyrproject.org/latest/contribute/guidelines.html#contribution-workflow
These are the two pages touched by this PR so you got me confused... I really tried to add only new points but maybe I could combine some and make things shorter. Can you please be more specific and point at what looks redundant?
I'm 99% sure the "Stacked Diffs" git command is new - and also new to a lot of git users in my "git support" experience. It's IMHO the most important addition in this PR and why it's the title.
Some of the points here probably can go into one of the existing spots I quickly found, ...
Did you mean the two references above or something else?
Over the last year, with 10k PRs, the mediam commit number in PRs was one.
I do not thing this is a general problem we have, and those PRs that have multiple commits, probably they have this for a good reason.
The "Smaller PRs" section does not set any hard or even soft limit on the number of commits per PRs (that would be one "authoritative" way to be much shorter!), I think it barely mentions commits at all. Instead of metrics, it lists the "non-exact science" benefits of smaller PRs and CI so submitters and reviewers are better armed to go and find the sweet spot in each particular case.
That section does in practice focuses on "pathological" cases like brand new features and major refactors because they're the ones costing a disproportionate amount of time (for both reviewers and submitters). That's common for many guidelines: in general you don't need them.
I do not think we need to expand on this across the tree repeating ourselves and duplicating the message (which is a common theme in our docs).
I think I hate copy/paste/diverge more than anyone else but there's often a way to be less verbose and I'd love specific suggestions.
and getting code merged before the whole work is complete. This faster | ||
approach is known as "Stacked Diffs". | ||
|
||
- Less wasted work if reviewers or maintainers reject the direction of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not new, I merely moved it. I could remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I appreciate and like the idea, but I think the principle of diverging between the local and remote branches' may be confusing to users. Feel free to convince me otherwise though.
Disagreements, nitpicks and other delays in one place do not hold back | ||
everything else. | ||
|
||
- Even when smaller PRs are dependent of each other, you can start reviews |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Even when smaller PRs are dependent of each other, you can start reviews | |
- Even when smaller PRs are dependent of each other (and thus require duplicating commits), you can start reviews |
- Contentious and long discussions have a more restricted "splash damage" | ||
because they cannot drown unrelated topics that have successfully escaped | ||
in separate Pull Requests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rephrase this so that it is simpler to understand, perhaps:
- The risk of the PR being derailed or side tracked by unrelated topics becomes significantly smaller.
and getting code merged before the whole work is complete. This faster | ||
approach is known as "Stacked Diffs". | ||
|
||
- Less wasted work if reviewers or maintainers reject the direction of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I would remove it honestly.
opposite effect is generally achieved. Submitting several, smaller | ||
pull requests does *not* require creating and managing local git | ||
branches:: | ||
|
||
git push origin big_work~15:refs/heads/big_work_part1 | ||
|
||
In this example, the 15 most recent commits in the ``big_work`` local | ||
branch are *omitted* from the ``big_work_part1`` remote branch and pull | ||
request. This "Stacked Diffs" approach lets you submit part 1 and get | ||
reviews started long before the rest is ready. When smaller parts are | ||
independent of each other, rotating them with ``git rebase -i`` (see below) | ||
lets you submit them concurrently which is even faster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would get rid of this. It is in my opinion way too complex for many users to grasp easily. Instead, you could tell people how they can use rebase --onto
or reset+
cherry-pick` to rebase a PR that has had part of its commits merged in a previous one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the very least please don't use refs/heads/
, I don't think many people actually use those in practice.
This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time. |
This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time. |
The documentation had for a long time a section that specifically recommends to submit "smaller PRs" for review:
https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#defining-smaller-prs
Yet submitters keep submitting large PRs on a regular basis, sometimes very large ones. See a couple of very recent examples below.
(Reminder: submitting a giant, draft PR for pure testing purposes and NOT for review is a perfectly fine)
The "natural" explanation is that submitters optimistically and wrongly believe that dumping a large amount of code at once onto reviewers will be dealt with faster than in smaller chunks. This is most likely a contributing factor but most people should quickly learn from bad experience. Yet we keep seeing large PRs on a regular basis. So there must be other factors too.
Based on personal but fairly extensive git support experience, another top reason is likely git usability and some lack of git knowledge (neither the first nor the last time git usability would have a significant impact)
To help with this, add to the existing git guide the simple command that lets split a large submission in several, smaller PRs. This can only help demystify and promote smaller PRs while barely growing the size of the documentation.
While at it, also add a couple missing benefits of smaller PRs.
Recent examples of large PRs:
That PR had 21 commits (18 in the final version), 82 files changed and 400 (!) comments. The sheer size massively increased the likelihood of the problem described. Re-submitting it in smaller chunks would obviously have mitigated that problem. Yet that PR was never split and stayed huge...?
Introduce Bouffalo Lab SoC's #78795 (comment) If this PR had been split in smaller chunks, then the squashing discussion might have been avoided entirely. Whether squashing is good or bad in this particular case is irrelevant (and already discussed at great in length in doc: contribution guidelines: Clarify and extend #83117). What matters here is: the submitter lost that chance by submitting a larger PR with different authors.