Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: Maiao, Stacked Diffs for GitHub (github.com/adevinta)
52 points by joaoqalves on Oct 8, 2022 | hide | past | favorite | 23 comments



> Identify which of the files should be modified first.

I'm curious how many software engineers work this way — i.e. you need to make a big change, and you can determine a sequential order in which to make changes?

My working style is mostly jumping back-and-forth across all the relevant files and tests as I go, realizing new things that I need to change, and iterating until it all works end-to-end.

Am I the outlier, or is Maiao's expected workflow the outlier?


What happens is you work somewhere that has stacked diffs and suddenly you learn how to shape your diffs to make them easy to review. Thinking of how folks will review your code in chunks while writing it makes it cleaner. Having small but easy to read diffs makes reviews faster and helps junior devs learn how to review.

Sometimes this doesn’t happen in which case you end up need to split your commit at the end. This is where git utterly fails. You end up needing git split and git absorb to make this productive.

Git split let’s you select which chunks in a commit should belong to it and then splits that into a commit and then you do it again and again until you have lots of commits. You’ll still need to probably test each one but the majority of the work is done

Git absorb takes changes on the top of your stack and magically finds which commit in your stack the each chunk should belong to and amends it to the right commit

You also need git branchless https://github.com/arxanas/git-branchless as it lets you move up and down the stack without needing to remember so much git arcana.


You don't need any of those things.

Git gui, which is part of the vanilla git distribution, already allows you to take a hunk out of an existing commit.

git absorb does sound like a nice convenience though.


I work that way. I generally constrain my changes to a single package, module or layer of the application at a time. The feature won't work end-to-end until after several PRs are merged. Sometimes it can take 15+ PRs, but it's a lot easier for reviewers if the changes are small and limited in scope and functionality.


You’re comfortable trading your time (increased number of diffs) for increased level of sanity checking from reviewers.

In my experience, that means days of increased effort - async reviews, more explanation needed for people who are unfamiliar with the code, an additional coordination layer of explaining where we are and where we’re going, increased probability of bike shedding, and waiting for your turn to get into CI.

Not saying you shouldn’t make that trade-off: if you feel you should, you probably should. But we need to be clear about the trade offs, to better understand when that makes sense.


Everyone at my work works that way. If the order isn't obvious or it's not easy to write in isolation, you just split it into 2 reviews/patches after the fact. if you send a big change for review, the reviewer will tell you to split it. usually very easy to split by directory


Some high-level description is what this actually does would be useful.

The main problem with "stacked diffs" or "dependent pull reviews" on GitHub is that they don't work with forks/clones.

That is: you can push a branch to your personal clone and then submit a PR against the main repository's development branch. That works fine.

You can push two stacked branches into the main repository and set up dependent pull requests (where pr1 targets the main development branch and pr2 targets pr1). That sort of works -- lots of things could be improved about the UI of that workflow, but it's at least possible to do.

You can't reasonably push two stacked branches to your personal clone and then set up dependent pull requests against the main development branch: pr2 that targets pr1 will simply not show up in the main repository.

Given the fact that the original use case of Git is to work with "stacked diffs", it's absolutely mind-blowing that GitHub still has no property sort for it.


Relevant link:

* Stacked diffs vs Pull Requests [1]

1 - https://jg.gg/2018/09/29/stacked-diffs-versus-pull-requests/


So the difference is... local workflow? Whether you branch locally or not as you do work? Am I missing something? You could do the same thing working on branches and rebasing as fixes on master come in. I just don't get the big deal here


The big deals are:

1. Getting rid of manual rebases (productivity)

2. Being able to create dependant PRs and make your peer's lives easier.

Screenshot here [1].

1 - https://imgur.com/a/QJGTBsj

EDIT: Formatting


This seems like a great tool. I’m personally hooked on the https://graphite.dev CLI and would just like to mention it for comparison.


Or better: do not assign a task requiring 500 line commit. Reviewing stream of though consisting of 10 commits is not very productive as well.

Everybody wins if github/gitlab starts to show diffs between `push -f` for a feature branch. Like in gerrit.


Oh, come on. 500 lines isn't even big.


I am pretty sure gitlab can show diffs after a force push.

It says in the history that you pushed a new version of the commit, and allows you to browse to any of the versions of the commit.


How does HN deal with refactorings that litterally touch 90% of the code, but just 3 lines?

Huge PR to refactor, and fatigue sets in. But we cannot PR chunks because the build cannot fail.


Why can't you create PR chunks without the build failing?

Usually, I strive for "vertical slices". That means that, sometimes, a PR just contains "useless" code in isolation, but that makes sense as I keep merging them (stacked diffs). My flow is usually:

1. Does this code have tests? What's the minimum chunk I need to write tests for it?

2. Write tests and the smallest refactor I can think of (it may be to create one function, p.e.)

3. Commit

4. Return to 1. until the full refactor is done

5. Throw 3-10 stacked diffs that can be merged with the build passing.

EDIT: formatting


What does it mean for a refactor to touch 90% of the code, but just 3 lines?

There's only 3.333 lines in total??


I assume the feature itself is just 3 lines, but all the changes necessary for those 3 lines to be possible to add impact the code significantly because e.g. there’s bits of data which need to be threaded through the entire thing and previously were not.


> But we cannot PR chunks because the build cannot fail

Could you code review & approve on a branch without those build restrictions, and then have an additional review of a merge commit to main?

The "has to build" requirement has been circumvented at all previous employers with some similar workflow, including on perforce, cvs, email, mercurial, and git.


Split into two commits: the first commit does a mechanical refactoring but contains no (intended) functional changes.

The second commit contains the 3 lines of functional change.


Commit description should contain the command to generate the change. Your review the command plus spot check actual changes and CI signals.


I use spr and git fixup for this workflow and it's really helped me breakup my changes better, managing multiple branches and stacking changes that way is a huge pain, but this workflow has been huge for me!

SPR: https://github.com/getcord/spr

Git fixup: https://github.com/keis/git-fixup


Check out graphite. Been LOVING it. Team is awesome and product is evolving fast.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: