Hacker News

What are your thoughts on the "ship, show, ask" workflow? [1]

In that workflow, small stuff is simply pushed, which allows PRs to be more single focused and more atomic. Perhaps your only objection is direct pushes to master? I am really curious if that workflow otherwise addresses all of the downsides you stated while still allowing for all PRs to be uniformly rebase-squash merged.

[1] https://martinfowler.com/articles/ship-show-ask.html




I wasn't aware of this particular workflow, but I've heard of similar ones before. The late Pieter Hintjens, of ZeroMQ fame, advocated for a strategy called "optimistic merging"[1], which essentially abandons the standard code review process in favor of merging changes ASAP, and fixing any issues as they arise later.

I'm not a fan of this. It allows contributors to abandon any established coding standards, while placing more burden on maintainers to fix issues later. This in practice never happens, so the quality of the codebase degrades over time. Not to mention that it allows malicious actors to merge changes much more easily.

As for "ship, show, ask" specifically, I have similar reservations. I think all changes should go through a review process, even if someone just glances at the changes. It not only gives the opportunity to leave feedback, but also serves as communication so that everyone is aware of the proposed changes. Also, making the decision of whether to choose "ship", "show" or "ask" might only work for senior and well disciplined developers. I think that in most teams you have a mixture of experience, so you'd inevitably end up in situations where a change should've been "show" or "ask", but was "ship", and vice versa. I don't think teams will ever align on a single strategy to make a correct decision, since it's based on subjective opinion to begin with. Always following the same workflow gets rid of these uncertainties.

[1]: http://hintjens.com/blog:106


After having done a few thousand CRs, I've come to believe that CR is of mixed value: not always good, nor all good. Cutting the low-value CRs and doing CR only when it is useful IMO helps maximize productivity.

First, I assume a few constraints on "ship, show, ask." The first is that all changes get at least one, if not multiple, self-reviews. The second is that the ability to ship is not granted outright: until someone has demonstrated they can ship reasonable, working code, they are not given write permissions. They must "ask."

Next, shipping is not the default for everything: only the uninteresting changes are shipped outright. This enhances the communication value of PRs. If a PR is sent, it is important or warrants extra review. That removes the noise of everything being a PR, and removes the noise within PRs where minor changes are smuggled in for the sake of efficiency. It also removes a lot of nitpicking. When asked to CR, we often feel we need to say something, and if our only feedback is minor things, then that is the feedback; 'perfect' becomes the enemy of good enough. The time savings are huge. Do CR when it is useful, and only then.

Last, post merge reviews can still be done by the maintainers to ensure things are going smoothly. Hence, post merge review is an option for both ship and show. This means the coding lifecycle is blocked in CR only when the CR is valuable.

That CR is blocking is, I feel, a very important drawback. Typos linger because time is finite, and only those who do not appreciate that will try to fix everything; the team then gets bogged down spending too much time on small things. But leaving the small things is not ideal either. If the overhead to fix small things is tiny, a fix takes 5 minutes; otherwise it requires another person and possibly waiting until the next day.

For example, a PR preceded by renaming a few files is powerful. That is a game changer compared to a hard-to-review CR, staged PRs, or simply never doing the renaming because it ruins the diff.

Blanket CR can also ossify a project. Most projects I've seen were written quickly and in large chunks. If they were reviewed at all, it was a lot of 2000-line, zero-comment "LGTM" reviews. Then when careful CR is finally done (often by the "scaling" team), suddenly the original choices become gospel. The CR debates the absolute best naming and the original intent, spending magnitudes more time than was put into the code in the first place. Meanwhile the code was written quickly, slightly worse than good enough, and now good enough is not even good enough: the CR demands a far higher quality bar than the original code met when it was written. Fixing a 50k-line codebase at 500 lines per day is a way to _not_ fix that codebase. The original bad code is therefore fossilized into the project. Time and effort are not infinite.

That turned into a bit of a rant. I wanted to demonstrate that:

(A) there are guard rails on SSA so it is not just a free-for-all.

(B) selectively requesting CR when it is valuable amplifies the value of CR.

(C) removing the noise of everything being a PR makes PRs stronger as communication tools.

(D) post merge review of all code can still meet the goal of reviewing everything.

I'll tie this back to git absorb now. git absorb rebases, which arguably invalidates any CR that was already done. Ship-show-ask allows for more explicit trust: someone runs their git absorb, redoes the self-review, and then merges. Further, the workflow helps keep PRs from accumulating as many commits.

Though I do think the discussion of workflows and CR is really fascinating. Having been essentially the one senior, or one of two, on most projects I've worked on in the last decade, I'm certainly (overly) eager to discuss workflows in detail.


Thanks for your perspective. It's interesting to hear a different take on this.

I do think there's no right or wrong answer with these things. Everyone will have different opinions about their ideal workflows, especially senior engineers. The important thing is to find a good balance that everyone on the team is satisfied with, but there will always be concessions, as in any relationship.

What I have found, though, is that the more uncertainties a workflow has, the higher the chances of misunderstanding, or of someone not following it in the way someone else thinks they should. With the way "ship, show, ask" is described, and what you mention here, there are many constraints, and decisions to be made about what constitutes an "interesting" change, or what is worthy of being shipped without a CR and what isn't. If the team is mostly senior, and there's a lot of trust involved, this workflow might work, and it might reduce the friction of blocking CRs. But there are a lot of variables involved.

OTOH, by always requiring CRs and approvals for shipping, there are far fewer decisions to be made. The workflow is simple, fixed and known to everyone. Sure, sometimes CRs step into nitpicky territory, but this is usually solved by having some conventions, such as making clear what is a nitpick, what is a minor/major suggestion, and what is a blocker. This way nitpicks and minor suggestions can be safely ignored (as long as they're acknowledged in some way), and they're not blockers for merging.

In practice I've found that while this workflow does take considerable time and effort, in the vast majority of cases it leads to higher quality code. Even addressing nitpicks and typo fixes is an improvement in most cases. These can be easily automated by suggestions in GitHub, which can be batched and committed directly from the web UI, so it doesn't take much effort at all.

> Last, post merge reviews can still be done by the maintainers to ensure things are going smoothly.

Sure, but is this diligently done? And how well does it scale if the project has many contributors, but few maintainers? Placing the burden on maintainers to ensure certain quality standards is unrealistic. Moreover, it assumes that maintainers have the final say in these matters, instead of their changes also being worthy of discussion, which happens during a CR. I think that the responsibility for ensuring things are running smoothly should be shared by every member of the team, including external contributors.

> Most projects I've seen are written quickly and in large chunks. If they are reviewed at all, it is a lot of 2000 line "LGTM", zero comment reviews.

Those are examples of not doing CRs correctly, not a testament that CRs are not worth the time and effort. You can just as well run into this issue with the SSA workflow. As with any issue with team dynamics, this should be resolved by communicating and aligning on core values and practices the team finds important.

> The CR debates the absolute best naming, original intent, spending magnitudes more time than was originally put into the code in the first place.

Yes, this is to be expected. It takes much more effort to read and understand code than it does to write it. It takes additional time and effort to leave a written comprehensible review, and communicate all of this asynchronously. This scales linearly for every member in the team, and there's no way around it. But the question is whether deciding to invest the time and effort in a well established CR process ultimately results in higher quality software being shipped. In my experience, it does, and it's worth it.

That said, CRs are not the only, or even a foolproof, method of quality assurance. As mentioned in the SSA article, pair programming is another method, but it has its own drawbacks and limitations. All of these things help, and are best used in combination. They can be uncomfortable, tedious and seem pointless at times, but deciding not to do them because of this is doing a disservice to the final product.


> what constitutes an "interesting" change, or what is worthy to be shipped without a CR

FWIW, I have not seen much trouble in the ambiguity. Things like a "move method" refactor, a typo fix, additional test cases, or a comment clarification are just shipped. If the update makes an important clarification on a comment that the team should know about, "show" is interesting. If the code fixes a bug that would be mentioned in a stand-up, "show" is good. "Ask" is good for a large refactor where there is room for error, when something does not seem clean, when it merits any type of double check or second opinion, and especially when it is a first foray into a different part of the codebase.

> Sure, but is (post merge reviews) diligently done?

The beauty I would say is that it does not have to be diligently done. Feature, not a bug.

If the lead maintainers are super busy, it is okay. Perhaps they will have time later and can do things in bulk, or even not at all. The process is no worse and does not stop when the majority of lead maintainers are on vacation.

Interesting changes still generate email notifications (through PRs). This highlights what a person should be looking at.

I think this attitude gives more trust to the team and contributors. It is not the responsibility of just the maintainers and seniors to (pre or post) review everything. No blame game of: "how did this bug get through, and more importantly, how did it get through review??"

> And how well does it scale if the project has many contributors, but few maintainers

Good question. Offhand I would say no worse. Contributors would all be required to ask, which is no different than 'review everything.' The difference would be for maintainers, they would not be required to always have other maintainers do CR in every case. A second potential difference, contributors might be promoted more quickly to have write permission. That in turn helps scale the efforts of the maintainers.

I think this shines at the other extreme, when there are very few maintainers. OSS projects with week-long turnaround times will have trouble onboarding contributors to become regular maintainers. In industry, I think trusting developers relatively quickly is healthy.

The mentality to be looking to give people write permission sooner rather than later IMO is good.

> Those are examples of not doing CRs correctly, not a testament that CRs are not worth the time and effort.

My apologies for not conveying the example clearly. The examples I'm thinking of usually had no review at all. These are older codebases written long before CR. Or startup-quality codebases where a few founders jammed out tons of code. Or code that was put out during a crunch, almost always successive crunches, where the constant pressure to ship was everything. Or some greenfield project that was jammed out and then handed over for maintenance and enhancement. Just large codebases that were never great. I'm also implying there is often a different standard for originally authored code compared to changing existing code.

My point is there is a quality bar that is suddenly higher, a lot higher, than the existing code.

> But the question is whether deciding to invest the time and effort in a well established CR process ultimately results in higher quality software being shipped. In my experience, it does, and it's worth it.

I agree, though not sure if actually always worth it. Projects are rarely labelled a failure, but many are. That can manifest as simply too much time going by without enough being shipped.

Though, good CR culture is indeed very valuable. I think it is more an exception than the rule.

I think that complements SSA at the same time. A strong CR culture with review-everything might be hard to distinguish from SSA. In that sense, "ask" could be thought of as "second opinion." When a team gets crunched, there is more flexibility. Or, if the team is immature w.r.t. CR, or if there is a significant lack of reviewers, SSA removes barriers.

The idea is to capture 95% of the benefit of CR with 50% the cost.

> Sure, sometimes CRs step into nitpicky territory

Pointing out typos IMO is useful. Though, I've learned to personally delete any CR feedback I write that starts with "nitpick"; I no longer believe it to be all that helpful or productive. It is a signal that a style discussion is due. It is important, I think, to respect that a CR is a blocking thing. Keeping someone blocked for extra time over nitpicks misses the bigger picture of what is important, IMO.

----

I very much appreciate the dialog and perspective! Thank you for your considered feedback here. I largely agree with your points, particularly the ones I did not respond to.

I suffer a bit too from the lesson that things taking too long is its own failure mode. Success is not at all guaranteed, even if quality, solid code is shipped. At the same time, solid, quality code is required to scale, and is required for most success. That is to say that projects do fail very much in spite of quality engineering.


Why is the preferred goal to “rebase-squash” (redundant) merge? The implied onus here seems to be for the other party to move towards that strategy. But why this practice should be the goal does not seem to be mentioned.

Step one isn’t to find some way to accept squash-merge. Step one is to find some reason for squash-merge.


I think you are coming at this from quite the different angle. To semi quote what I was responding to:

> always squashing PRs is not a good strategy [because of the following reason]

Hence my question: if workflow X mostly solves those reasons, then always squashing is potentially a reasonable strategy.

I think you are asking, why do that at all?

The answer is IMO quite involved, nuanced, and not at all one-size-fits-all.

FWIW: (1) all commits to master become atomic if a passing CI is required before rebase squash-merge.

(2) rebase creates a linear history. Non-linear history is very difficult to bisect, and can get really complicated. I don't see many positives from that complexity.

(3) generally PRs are functional sized pieces of work. (That is ambiguous.) If it helps, PRs are more boulders than pebbles. Having boulders in the history is more useful than every small pebble. Net benefit is the history then becomes a smattering of well marked minor changes and a set of functional changes.

Side note, I like anything that helps the workflow of "touch clean code only, touch only the code you need to, clean the code before you touch it" ("clean" defined here as not a pile of illogical, obtusely written crap riddled with latent bugs). In other words, make the code simple and obvious first, then modify the simple and obvious code. I give little credit to those who spend days studying code to make a one-line "masterful" change. That does not scale: everyone has to do that, and there are thousands of lines. All things in their own context though...

(4) utility and communication for commits. IMO ideally a PR is initiated with a single commit that is written "for the history". This commit description is both descriptive for the reviewers, the git history, and is the complete text for the PR. This creates the most value IMO. I view PRs as ephemeral; any extra text written for the sake of the PR is overhead. Then, any following commits are pushed to communicate to the reviewers. The audience is no longer a future maintainer, nor someone looking at a PR to get a handle on the in-flight changes, but someone who has looked at the diff. Thus, commit comments like "address TOCTOU concern", "fix typo", "add null check", "add test case", "address feedback" all then make sense. On squash-merge, those comments are discarded and the merge is just the "commit written for the history". Otherwise, without squash, those commits have a much broader audience and become long-lived. I believe this typically creates a much more useful history, eliminates a lot of noise, clarifies the audience of each commit message, and finally removes any overhead where a detailed description is written in a PR but then never captured in the history (wasteful, and it puts context in a location that is ephemeral and a step removed).
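The flow described in points (1) through (4) can be sketched in a throwaway repo. The branch names, messages, and file are invented for illustration; the point is that the noisy review-round commits never reach master.

```shell
set -e
cd "$(mktemp -d)"
git init -q -b main .
git config user.email dev@example.com
git config user.name dev
echo base > f && git add f && git commit -qm 'base'
git checkout -qb feature
echo step1 >> f && git commit -qam 'wip: first cut'       # reviewer-facing commit
echo step2 >> f && git commit -qam 'fix typo'             # reviewer-facing commit
git checkout -q main
git merge --squash -q feature        # stage the whole branch as one change
git commit -qm 'feature: the one commit written for the history'
git log --oneline                    # linear: base + one squashed commit
```

In practice one would also rebase the feature branch onto master and require CI to pass before the squash-merge, which is what makes each commit on master an independently green, atomic unit.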


> I think you are coming at this from quite the different angle. To semi quote what I was responding to:

> > always squashing PRs is not a good strategy [because of the following reason]

I am at least coming from the same position that the OP is. Because I agree with this:

> > IMO always squashing PRs is not a good strategy. Sometimes you do want to preserve the change history, particularly if the PR does more than a single atomic change, which in practice is very common. There shouldn't be a static merge type preference at all, and this should be chosen on a case-by-case basis.

> Hence my question: if workflow X mostly solves those reasons, then always squashing is potentially a reasonable strategy.

If OP has no problems with their current strategy (where they may squash but they are free not to) then you aren’t solving any problems. For them.

They don’t have a problem that they need to solve. You are the one who wants to fit this squash-square into their circle.

What onus do they have to read some Everything Is A Pattern dot com link in order to come to terms with a policy that they don’t need?

> I think you are asking, why do that at all?

> The answer is IMO quite involved, nuanced, and not at all one sized fits all.

Unlike squashing.

> FWIW: (1) all commits to master become atomic if a passing CI is required before rebase squash-merge.

I don’t know what “atomic” means here. But `git log --first-parent` is equivalent to a fully squashed history. Which means that `git log --first-parent` is atomic if the squashed alternative reality is.
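To make the `--first-parent` point concrete, here is a throwaway-repo sketch (names and messages invented): a topic branch merged with `--no-ff` keeps its individual commits, yet the first-parent view of the main branch reads like a squashed history.

```shell
set -e
cd "$(mktemp -d)"
git init -q -b main .
git config user.email dev@example.com
git config user.name dev
echo 0 > f && git add f && git commit -qm 'base'
git checkout -qb topic
echo 1 >> f && git commit -qam 'topic: step 1'
echo 2 >> f && git commit -qam 'topic: step 2'
git checkout -q main
git merge -q --no-ff -m 'merge topic' topic
git log --oneline --first-parent     # collapsed view: merge topic, base
git rev-list --count HEAD            # full history still has all 4 commits
```

The same idea extends to bisecting: `git bisect start --first-parent` (available since git 2.29) walks only the merge commits, so a merge-based history bisects like a linear one.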

> (2) rebase creates a linear history.

`git bisect start --first-parent`.

There seems to be a pattern here.

> Non linear history is very difficult to bisect, and can really be complicated. I don't see much positives from that complexity.

Non-linear history can involve everything from twenty-head octopus merges to eight different root commits to backmerging twelve times before a feature branch is merged. But does saying no to the straitjacket of always-squash mean that you have to support every DAG under the sun? No.

Most sane everyday histories are limited to branching off the main branch, rebasing until it is done and then merging it back. One merge. Granted some people don’t like rebase or don’t know how to use it so they make back merges.

... And if those people make completely useless histories on that branch? Then they can squash of course. Because we’re arguing against the corporate always-squash policy. Not saying that you should never do it.

> (3) generally PRs are functional sized pieces of work. (That is ambiguous.) If it helps, PRs are more boulders than pebbles. Having boulders in the history is more useful than every small pebble. Net benefit is the history then becomes a smattering of well marked minor changes and a set of functional changes.

People can make however many commits are appropriate for the scope of the PR. It can be twenty or one.

See how the squash-always camp has to make the world around their “merge” strategy pristine (according to them) in order for it to work? “Just have perfect-sized PRs.” “Just have perfect-sized issues.”

> Side note, I like anything that helps the workflow of "touch clean code only, touch only the code you need to, clean the code before you touch it" ("clean" defined here as not a pile of illogical, obtusely written crap riddled with latent bugs). In other words, make the code simple and obvious first, then modify the simple and obvious code. I give little credit to those who spend days studying code to make a one-line "masterful" change. That does not scale: everyone has to do that, and there are thousands of lines. All things in their own context though...

Okay.

> (4) utility and communication for commits. IMO ideally a PR is initiated with a single commit that is written "for the history". This commit description is both descriptive for the reviewers, the git history, and is the complete text for the PR. This creates the most value IMO.

Okay. You still have not gotten to the explanation of why always-squash-merge is necessary for PRs. (We have covered the previous stuff. `--first-parent`)

We could talk about the ideal world. In an ideal world we could choose better tools that are good for this very granular PR structure. But in the real world we often end up with GitHub or some lookalike. Which is not good for granular PRs, dependent PRs, “stacked PRs” or whatever the trendy name is for the people who fetishize PRs over everything else.

In the real world we still have git(1) to be flexible around the limitations of these team-imposed tools.

And even with better tools I like being able to use several commits for one PR. Because sometimes you end up needing five preparation commits in order to solve the issue you were setting out to do. And that is often a hell of a lot more “agile” than going back to the issue tracker and making five issues in preparation for that final commit, then making six pull requests that depend on each other.

> I view PRs as ephemeral, any extra text written for the sake of PR is overhead. Then, any following commits are pushed to communicate to the reviewers. The audience is no longer a future maintainer, nor someone looking at a PR to get a handle of the in flight changes, but is someone that has looked at the diff.

Okay.

I think that a PR can evolve to encompass more “atomic” changes than what was initially planned. That has happened to me: the reviewer and I find out that it makes sense to make some more changes. Not “fix typos” on that same PR, but changes that deserve to be commits in their own right.

There is also room for “fix typos” commits that you can then rebase away once you are done. With Git you have that flexibility.
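That selective flexibility can be shown in a throwaway repo (names and messages invented): fold only the “fix typos” commit into its predecessor before merging, while the substantive commit keeps its own identity. The simplest mechanism is a soft reset plus amend, which is what an interactive rebase with `fixup` would do for the last commit.

```shell
set -e
cd "$(mktemp -d)"
git init -q -b main .
git config user.email dev@example.com
git config user.name dev
echo base > f && git add f && git commit -qm 'base'
git checkout -qb topic
echo helper >> f && git commit -qam 'refactor: extract helper'   # keep this one
echo typofix >> f && git commit -qam 'fix typos'                 # noise: fold it away
# meld the noise commit into its predecessor, keeping the content
git reset --soft HEAD~1
git commit -q --amend --no-edit
git log --oneline                    # base + one clean refactor commit
```

For noise buried deeper in the branch, `git rebase -i main` with `fixup` lines achieves the same result commit by commit.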

> Thus, commit comments like, "address TOCTOU concern", "fix typo", "add null check", "add test case", "address feedback" all then make sense. On squash-merge, those comments are discarded and the merge is just the "commit written for the history". Otherwise, without squash - those commits have a much broader audience and become long lived.

Do we really have to tediously go through the same thing every time the squash-always camp tries to argue for their one-true-policy? Look at what OP said:

> > Sometimes you do want to preserve the change history, particularly if the PR does more than a single atomic change, which in practice is very common.

Which is what I just went through.

Yes. Rebase that typo fix which you introduced. That’s noise.

Probably don’t rebase that commit which made a refactor which should have been done three years ago.

> I believe this typically creates a much more useful history, eliminates a lot of noise, clarifies the audience of each commit message, and finally removes any overhead where a detailed description is written in a PR but then never captured in the history (wasteful, and it puts context in a location that is ephemeral and a step removed).

What OP and I have argued for allows you to do all that. And in addition to that, it is also much more flexible.



