GitHub merge queue is generally available (github.blog)
192 points by ingve on July 13, 2023 | 122 comments



Does anyone else find this article unreadable? It sounds more like a marketing piece than an explanation of what merge queue is.


Merge queues are, as the name implies, queues for pull requests/merges. They're kinda useless if your commit traffic is low (e.g. <10 per day), but become necessary once it grows roughly past your daily CI time budget (which can happen on large monorepos).

As a very simple example, if your CI takes 10 minutes, your CI time budget is 6 merges per hour.

This is because if you merge two things in parallel without validating CI for the combined changes, your main branch could end up in a broken state.

Merge queues run CI for groups of PRs. If the group passes, all the PRs in the group land simultaneously. If it does not, the group is discarded while other group permutations are still running in parallel.

This way you can run more "sequential" CI validation runs than your CI time budget allows.

In our monorepo, we get a volume of 200-300 commits per day with a CI SLO of 20 mins.

Without a queue, our best-case scenario would be getting capped at ~72 commits per day before seeing regressions on main despite fully green CI (in real life, you'd see regressions a lot earlier though, because throughput of PRs is spiky in nature).
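
Rough back-of-the-envelope for the numbers above (the batch size is a hypothetical setting, not anything from the article):

  # Strictly serial "validate then merge": the day only fits so many CI runs.
  CI_MINUTES = 20                        # our CI SLO
  serial_cap = 24 * 60 // CI_MINUTES     # 72 merges/day, the cap mentioned above

  # A merge queue validates batches of PRs in one speculative CI run, so the
  # effective cap scales roughly with batch size (assuming most batches pass).
  BATCH_SIZE = 5                         # hypothetical queue configuration
  batched_cap = serial_cap * BATCH_SIZE  # ~360 merges/day
  print(serial_cap, batched_cap)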


> Merge queues run CI for groups of PRs. If the group passes, all the PRs in the group land simultaneously. If it does not, the group is discarded while other group permutations are still running in parallel.

That is a way of handling even higher volumes than GitHub is talking about, at the cost of a system that is a bit harder to think about. From the article:

With GitHub’s merge queue, a temporary branch is created that contains: the latest changes from the base branch, the changes from other pull requests already in the queue, and the changes from your pull request. CI then starts, with the expectation that all required status checks must pass before the branch (and the pull requests it represents) are merged.


The core principle is the same. How permutations are selected, of course, affects the performance and usability of the system.

Uber's[0] implementation, for example, does some more sophisticated speculation than just picking up whatever is sitting on the queue at the time.

Queues come with quirks: small PRs can get "blocked" behind a giant monorepo-wide codemod, for example. Naturally, one needs to consider the ROI of implementing techniques against aberrant cases vs their overall impact.

[0] https://www.uber.com/blog/research/keeping-master-green-at-s...


GitHub's merge queue does support merging multiple PRs in a single merge operation. It's the "Maximum pull requests to merge" setting.


It's awful.

I scrolled down to the how does it work section where the first sentence is:

> Merge queue is designed for high-performance teams where multiple users regularly commit to a single branch

Half of the how does it work section is buzzwordy fluff.


They don't need to, since the link posted here is literally the press release announcement. For the inner details one should look at the documentation at https://docs.github.com/en/repositories/configuring-branches... . It, for example, has a detailed example of how it handles a pull request failing ahead in the queue: https://docs.github.com/en/repositories/configuring-branches...


Collaborative coding is powerful. But to be at your team’s most optimized state, you need automated branch management that enables multiple developers to commit code on a daily basis, without frustration. This can happen if your team’s branch is busy with many team members onboarding the same commit onramp. This can be frustrating for your team, but, more importantly, it gets in the way of shipping velocity. We don’t want that journey for you!

This is why we built merge queue. We’ve reduced the tension between branch stability and velocity. Merge queue takes care of making sure your pull request is compatible with other changes ahead of it and alerting you if something goes wrong. The result: your team can focus on the good stuff—write, submit, and commit. No tool sprawls here. This flow is still in the same place with the enablement of a modified merge button because GitHub remains your one-stop-shop for an integrated, enterprise-ready platform with the industry’s best collaboration tools.


You didn't copy all the emojis! :P


I think this website filters them.


Unbearable corporate buzzword soup. Yikes.


Banal stuff from ten-year-old devops continuous delivery material. It's a good feature; maybe you're just unfamiliar with some of the theory basics?


The feature may be good, but no theory on earth can make me read stuff like that without getting sick.


I read it yesterday and couldn't figure out exactly what they were talking about and upon re-reading, yeah, it's bad marketing copy.

The problem is probably whoever wrote the blog post (who is likely not even the named author, depending on how their marketing team does things) tried to add a lot of high-level stuff to make it make sense to them without really needing to understand the details, and then dolled it up with a bunch of useless vapid quotes from customers and what not, because that is what marketing people think matters. Maybe it does make sense to have mealy-mouthed corporate speak for the overall product, since some executive is probably deciding whether to use GitHub as a whole and they might care if a big company uses it. I don't know that it makes much sense for specific features like this, especially in a fairly technical product like GitHub.


It's totally unreadable. There's 5% meat in it almost at the very end, the rest is about selling the feature.

The problem is that if you have multiple branches going into the same (mono)-repo, then they might all pass a localized CI-check, but fail if they are all merged. This is because the branches have an interaction between them. It can lead to a stall in commits and because everything hinges on the repo, work is going to stall as well.

So you serialize the branches, and impose an order on them: [x_1, x_2, x_3, ...]. Now, when running CI on one of these, x_j say, you do so in a temporary branch containing every branch x_i with i < j. This will avoid a stall up to branch x_j, if you started to merge the branches in order. If CI fails on branch x_j, you remove it from the list (queue) of branches to be merged and continue.
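
A minimal sketch of that loop, where merge_temp and run_ci stand in for whatever your tooling actually does:

  # Queue of branches [x_1, x_2, x_3, ...] to be landed in order.
  def process_queue(queue, main, merge_temp, run_ci):
      passed = []                          # branches assumed to land ahead
      for branch in list(queue):
          # Temporary branch = main + every earlier branch still in the queue.
          candidate = merge_temp(main, passed + [branch])
          if run_ci(candidate):
              passed.append(branch)        # keeps its place in the queue
          else:
              queue.remove(branch)         # dropped; later branches are built
                                           # without it, avoiding the stall
      return passed                        # merged into main in this order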


I feel like since the Microsoft acquisition almost all their communications at all levels have gone from detailed info about features to fluff marketing pieces.

Beyond that, their API docs prior to the acquisition were some of the best in the industry, readable and concise. Now they are just a complicated mess.


> I feel like since the Microsoft acquisition almost all their communications at all levels have gone from detailed info about features to fluff marketing pieces.

Comms teams are really terrible in this regard. They insist on a singular 'voice', which means that every article is going to go through their review and get rewritten to their standard - that standard may involve removing technical content and instead making it more layman/marketing friendly.

It's an incredible mistake that I see made everywhere after companies hit a certain size. It then becomes up to engineers to build their own engineering blog with less oversight and then guard it from the comms teams, which most engineers aren't interested in doing.


For GitHub the layman is a programmer, no? So why remove technical info?


The layman is, to a comms team, a manager, CISO, or some other mystery person I really couldn't explain to you. Yes, it's ridiculous and incorrect but that's my point.


> Does anyone else find this article unreadable? It sounds more like a marketing piece than an explanation of what merge queue is.

Yes. The first useful line in the article is

"With GitHub’s merge queue, a temporary branch is created that contains"

and to reach there you have to skip fluff paragraphs halfway down the article.


I also have a hard time understanding what it really is.

What I think it is: instead of you trying to merge into the main branch, you try to merge into a branch where all pull requests before you are already merged in.

That way any pull request before you can't cause any merge conflicts, because they are already taken into account.

At least that's what I deduce from all the marketing fluff. Maybe I'm completely wrong.


Yes, but I find it hardly different from most articles today. Whether it's a blog post or a news story, an article will likely be a mediocre exercise in creative writing (with the intent to persuade) or a bunch of marketing waffle.


It's only readable if you already know what merge queues are. Or by reading their actual docs [1].

Merge queues address the problem of how to (1) merge in a lot of changes (2) while guaranteeing no breaking/conflicting changes are merged.

[1] https://docs.github.com/en/repositories/configuring-branches...


https://graphite.dev/blog/what-is-a-merge-queue

This explanation is actually a lot better


Yeah, I read the whole thing, sounds interesting, came here to see if someone could actually explain what it is.


It automates the post-approval coordination stages of a PR for maintainers.

Let's say you're an open source maintainer with 3 pending Pull Requests to merge: [1, 2, 3], each of which is based off `main`, has passed CI, and has been approved.

If you merge all 3 at the same time, there is a chance to break the build: Your CI is testing `main <- 2`, but you're merging `main <- 1 <- 2`. A common example would be when (1) is a user-supplied change, and (2) is a dependency/localisation change, which don't cause merge conflicts but they do break the build/tests.

To do this safely, you need to re-run CI on (2) after merging (1), which is currently a manual process: you need to know that (2) is next to be merged, then rebase/pull + rerun CI for (2).

(There used to be a manual step of 'merge once CI is passed' here; GitHub has recently improved this workflow to allow automation.)

Merge queues fully automate the safe approach: it merges (1), runs CI on (2) which fails, then runs CI on (3), which passes and gets merged.


What happens if someone wants to merge when the queue is already running CI? Does it interrupt CI and start over, or does it run CI to the end and then kick CI off again with every new merge added to the queue since the last CI kickoff? Or does it merge on a successful CI and put together a new queue with those new waiting merges right after?


Thanks! Github should hire you to write these posts.


You know when you submit a PR at the same time as someone else, they both pass CI, and then both merge without any merge conflicts, but then it turns out there were semantic conflicts and you accidentally broke `master`?

This fixes that. It removes the race condition that exists because of the gap between testing a branch and merging it.

The solution is very simple - have a queue of PRs and automatically test & merge them one at a time.

There are some optimisations you can do to speed things up a bit, e.g. testing a bundle of PRs all at once, but that's the gist of it.

It is basically essential on any repo that has a high rate of PRs. I'm surprised so many people here haven't heard of it.

Gitlab has the same feature but they annoyingly called it something worse - merge trains, and it's only in Gitlab Premium.


to me it's the ability to test your merge on a virtual main branch


Yes, this set off all my 'gpt warning bells'. Anyone know the latest on automatic gpt detectors? Feels like it should be easy but last I checked they had a lot of false positives.


> Anyone know the latest on automatic gpt detectors?

There are many out there but I don't know about the "latest". GPT-4 itself says it has only a 10% chance of having been generated by an LLM.

These detectors are really unreliable. I've fed them content that I generated from GPT-4 and they never detect it as AI-generated.

I pity the students whose teachers will use them to detect plagiarism.


The problem is that you need to actually do training to detect AI text, and no one wants to spend money on that. The actual implementation is very easy:

1. get a corpus of real text

2. generate a corpus of AI text

3. train a model until it can tell the difference

The problem is step 2 is semi-expensive and step 3 is really expensive, so everyone is trying to shortcut the process, and of course it doesn't work.


I found the images and animations at the bottom extremely descriptive. I've been needing a feature like this, and it's immediately intuitive for me.


Yes. But look at the bottom. There's an image with the PR review screen. There's one change:

* Normally, the big green button says "Merge pull request"

* Now, the big green button says "Merge when ready"

In a large project with lots of activity, a stampede of people pressing "Merge" at the same time will cause trouble. "Merge when ready" is supposed to solve this.

It seems to mean:

> "GH, please merge this, but take it slow. Re-run the tests a few extra times to be sure."


Here's in-depth details on how it works. [1] Basically, each PR gets put in its own branch with the main branch + all the PRs ahead of it merged in. After tests pass, they are merged in order.

[1] https://docs.github.com/en/repositories/configuring-branches...


Aha, so GitHub merge queue = GitLab merge trains (or at least very similar).


Yes, that's pretty much what it is. Both are replicas of bors, and implementations of the same idea: https://graydon.livejournal.com/186550.html


Bors is also very similar to the Zuul CI system used for OpenStack. It has the equivalent of a merge queue (with additional support for cross-repository dependencies): https://zuul-ci.org/docs/zuul/latest/gating.html You can then have pull requests from different repositories all serialized in the same queue, ensuring you don't break tests in any of the repositories participating.


Also continuous integration best practices advance one funeral at a time, it seems.


So does each new PR start new tests that will supersede the previous PR’s tests? If one PR’s tests fail, does it block all PRs behind it in the queue?

I’ve read docs several times and never found them very clear about the details.


Each PR on the queue is tested with whatever commits it would have were it merged to the target branch in queue order. So if the target branch already has commit A and commits B and C are in queue, commit D will be tested on its own temporary branch with commits A B C and D. If the tests for C fail, C is removed from the queue, and D is retested with just commits A B and D (because that's what would be on the target branch by the time it merges).
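
As a toy illustration of which commits end up in each temporary branch (same letters as above):

  # Target branch already contains A; the queue holds B, C, D in order.
  target = ["A"]
  queue = ["B", "C", "D"]

  # Each entry is tested with the target plus everything ahead of it in queue order.
  candidates = {pr: target + queue[: i + 1] for i, pr in enumerate(queue)}
  # {'B': ['A', 'B'], 'C': ['A', 'B', 'C'], 'D': ['A', 'B', 'C', 'D']}

  # If C's tests fail, C is removed and D is retested with what the target
  # branch will actually contain by the time D merges.
  queue.remove("C")
  candidates = {pr: target + queue[: i + 1] for i, pr in enumerate(queue)}
  # {'B': ['A', 'B'], 'D': ['A', 'B', 'D']}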


OK, thank you.


It's an announcement article that I think sells it pretty well. It's not product documentation.


Yeah, the embedded video helps a bit.


It's completely embarrassing, whatever marketing person wrote it needs to be got rid of.

> The result: your team can focus on the good stuff—write, submit, and commit. No tool sprawls here.

The good stuff? Tool sprawls? Is this written for teenagers?

> Merge queue is designed for high-performance teams where multiple users regularly commit to a single branch.

I think you meant "highly active". High performance means something else. But I can kind of see it emerging from your awful sales person brain.


You can be critical without being unnecessarily harsh.


Not everything that is harsh is unnecessary.


In what world is ranting about PR copy and saying someone should lose their job over quite literally doing what they were asked necessary?

Who, exactly, is it necessary for? The original commenter getting their rocks off on insulting someone else’s job? Others coming in and laughing at someone insulting someone else? Critically necessary.

It’s not like the original article author is going to come in, see this comment, and reflect deeply on themselves and their work.


I admit I was harboring a small hope that someone from GitHub / Microsoft might see the criticism here (not just mine) and that it might help reduce the frequency with which that sort of sales person tries to communicate with their market of software engineers. It was a bit unpleasant to suggest someone should lose their job. While they were presumably asked to write the piece, they were not asked to write it so tastelessly.


> While they were presumably asked to write the piece, they were not asked to write it so tastelessly.

So you sit next to them and therefore know what their assignment was and how well they executed on it?


If the marketing person that wrote this was told to write this, I don't see why he should be got rid of.


For doing such a bad job I guess. Marketing doesn't have to be written like the audience are teenagers. But yes it was a bit unpleasant to say that.


Or maybe "high-contention"


We've been using this for a few months to manage pull request merges for our monorepo. It's greatly improved the merge process. It makes trunk-based releases faster and more reliable. Kudos to the team for adding this feature.


Same here, the experience has been generally favorable.


If you do squash merges, how does the queue handle the creation of the final commit message? We already have a problem with trash-filled message bodies with GitHub, and I don't want this to make it worse.


We turned on squash only merges and use the pull request description as the commit message.

GitHub creates a temporary merge branch with the commit where you can run tests before the merge. So in the end it’s just one commit.


same ... it was pretty flaky early on (got stuck, double commits, non-configurable merge-group size) ... but it's pretty good now


> Any team that is part of a managed organization with public repositories and GitHub Enterprise Cloud users will be able to enable this feature on their respective repository and start streamlining their team’s pull requests immediately.

I'm a bit sad that the merge queue is not available for personal accounts. I was hoping to have it replace Bors, which has been deprecated since May 1st, 2023.

https://bors.tech/newsletter/2023/05/01/tmib-76/


Yes. GitHub reserving features for organizations has kind of kludged up the whole service. Draft PRs for instance really should be available to everyone. I suspect they’re trying to do it as some sort of value add, but none of the features they are reserving are worth paying for on their own. It’s just an annoyance.


You don't need to pay to have an organization, I just created a personal organization to hold my non-experimental source repos.


Draft PRs work on all repositories, even non-org ones, though? I used them like 2 hours ago. Or are there more features for Orgs?

I do agree MQ being only for Orgs is really really annoying.


> Any team that is part of a managed organization with public repositories and GitHub Enterprise Cloud users will be able to enable this feature on their respective repository and start streamlining their team’s pull requests immediately. If this describes you, then let the commit parties commence.

That doesn’t sound very “Generally Available” to me, I really wish they’d stop this confusing feature differentiation. Just give everyone everything.

Why can't I use merge queue on my personal repos? It'd be very useful for merging a bunch of dependabot PRs.


Because this is the kind of feature large orgs need which makes it a perfect upsell.


But no large org is using the free tier anyway?


Small-ish orgs with finite money can use the free tier if they don’t see a point.


Fair question.

Though FWIW, you have to have a fairly large scale for merge queues to be relevant in practice.


I would love to use this at $daywork, but it has the unfortunate limitation that the merge group needs to pass the same checks as the branch itself, making it doubly expensive (esp. with something like Chromatic that is susceptible to flakiness and might need manual approval).

I would love to be able to set different sets of checks that need to pass to add a PR to a merge group, and to the merge group itself to be merged, so we can better manage speed and cost.


I was surprised by this weird limitation as well. However I found it's easy to work around -- I made my CI checks provide the skipped status when they run on the PR, and then provide the success/failure status when they run on the merge-queue branch.

In a GitHub Action, this was very easy:

  # Trigger on both plain PRs and the merge queue's temporary branch.
  on:
    pull_request:
    merge_group:
  jobs:
    build:
      name: Whatever...
      # Only does real work on the merge queue; on a PR the job is skipped,
      # which still counts as satisfying the required status check.
      if: github.event_name == 'merge_group'
Then the task "Whatever" can be added as a required status check, and "skipped" is good enough to allow it to function on the PR side while it actually executes on the merge queue.


Where do you see that limitation?

The docs say there's a separate event for merge_group which I assume means you can configure different checks.

https://docs.github.com/en/repositories/configuring-branches...


The branch protection rules that you apply are used both to determine whether you can add a PR to the merge queue and whether the merged commit passes the merge queue checks and is OK. This isn't documented, but is based upon experience using the beta.


If the events triggering the jobs are different, you can handle that there, e.g. immediately succeed on a PR and actually validate on the queue, or the reverse.


If you can cache heavily enough, the cost of a no-change re-run is effectively zero.

Actually getting to that point can be quite hard. But it's definitely possible. But also, if there's any change then you want to re-run anything that depends on the change. Limiting the rebuild to only the things that have changed also takes effort. But it's really worth it for the benefits you reap, both in CI and in development -- if your edit-compile-test cycle is long enough that an extra CI run is annoying, it's long enough to be detrimental in everyday development.
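
A minimal sketch of the content-hash caching idea, assuming you can enumerate a step's inputs deterministically (the cache location and helper names are made up):

  import hashlib, json, os, subprocess

  CACHE_DIR = ".ci-cache"   # hypothetical local store; in practice shared/remote

  def input_hash(files, command):
      # Key the result on the exact inputs: the command plus every file's content.
      h = hashlib.sha256(command.encode())
      for path in sorted(files):
          with open(path, "rb") as f:
              h.update(f.read())
      return h.hexdigest()

  def run_cached(files, command):
      os.makedirs(CACHE_DIR, exist_ok=True)
      key = os.path.join(CACHE_DIR, input_hash(files, command))
      if os.path.exists(key):              # no-change re-run: effectively free
          with open(key) as f:
              return json.load(f)
      result = subprocess.run(command, shell=True).returncode
      with open(key, "w") as f:
          json.dump(result, f)
      return result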


Another option to work around this could be to skip/noop checks based on the branch name prefix.


I hate the style of this post; too much fluff and so little information about the feature itself. Was this written by ChatGPT?

Also, I don't think emojis belong in articles like this one. Maybe I'm getting old, but I don't like it.


> Was this written by ChatGPT

Don't be ridiculous... Copilot did it.


I'm surprised that a new merge tool in 2023 is still only about lexical code conflicts (such as competing line changes). I'd expect a tool for semantic code conflicts, e.g., developer A's PR extends the customer data structure with a new field (like upsellPotential) that customers shouldn't see, while developer B's PR adds the customer data structure to the JSON that's consumed by client-side JavaScript in the customer portal.


That's what your test suite is for.


This is neat. I recall an old thread on HN where we were discussing different strategies for achieving this.

That said, I can't imagine being in a position to need this. Right now we do 3-4 merges a day and it's really comfortable. If we got to 10 merges a day, I'd start asking why we are changing so much damn code in the same space/time. I know there are really good use cases for this tooling though.


I worked in a place with 50-70 programmers, and we had a couple dozen rebases during the day (I don't know why anyone would want to use merge). This was a project in its initial stage, so a lot of code needed to be written rather than revised; also, after some bad experience with trying to manage the project in multiple repositories, everything was brought under one roof. So, all teams were working with the same repository.

In this situation, we usually had a significant chance of waiting for another commit to go through. We also had a system that's similar to what Github offers to manage queued rebases. This was amplified by even the sanity test being about 30 minutes long and the requirement that each commit be tested (so, if you try to rebase a branch with 10 commits, you'd have to wait 5 hours for it to go through). However, such cases would be very rare and anyone who needed to rebase such a branch would typically send an email asking for a particular day to get their changes in. To my memory, there were two instances of something like this happening.

From what I could gather from the article and the comments, the system worked differently though. First of all, ours could try to rebase in parallel, so that if one rebase fails the test, another set of changes could be incorporated immediately. However, no attempt was made to see if the change set being submitted agrees with the changes in the pipeline. I think we might have tried that once, but later ruled this to be both confusing and in most cases catching very few issues anyways (it is in general rare that two developers work on the same exact files). The confusion part comes from detecting conflicts with the branch that doesn't get rebased due to the failed test, and then the developer who's notified about a conflict is left wondering about the real reason, because, usually, by the time they get to investigate it, the failed branch had been already changed by the branch owner.

I also worked in larger companies, but usually they try to split the repository along administrative boundaries, and the number of people pushing to the same repository at the same time is not so big. They pay for it by having to invest a lot into release management, version synchronization across multiple teams, much more integration testing, worse understanding of the product by individual developers / teams, and, in general, lower quality of the product. But, off-the-shelf VCSs don't allow for sharing large repositories easily, and, of course, the problem this merge queue is trying to address would grow more severe with the size.


One of the huge advantages of this sort of workflow is it avoids devs rebasing then pushing without having run the test suite on the rebased PR.

It’s most important if the test suite is pretty slow though (e.g. minutes) as that makes it more likely you’ll get push races.


What's the point if your pr is removed from the merge queue automatically when there are conflicts introduced? I thought that was the problem it was trying to solve.


The problem it solves is that the vast number of PRs on a busy team don't conflict, both at the source level and the unit test/CI level, so you can increase the throughput of PRs by automating the merging process and merging multiple PRs at once.


It helps prevent incompatible PRs from shipping together. If two people merge incompatible changes at the same time (because both of them have a big green button), you might end up with a broken application. The way Merge Queue does its checks allows you to detect when these kinds of changes are being attempted and return one of the PRs to its author.


> I thought that was the problem it was trying to solve.

Of course not since it’s not a problem it can solve.

The problem it tries to solve is having to rebase / merge by hand, racing for CI, and the risk of going "fuck it" and merging a change which does not conflict but is semantically incompatible with a previous change.

Most of the time your colleagues’ changes and yours don’t interfere, but on repos with lots of traffic losing the CI race and having to rebase, wait for CI again, rinse and repeat, gets old quick, when the test suite takes more than a few seconds.


If anyone from GitHub is reading, any idea if the double commit bug the merge queue had has been fixed?

https://github.com/orgs/community/discussions/36568

(This was an issue for us during the beta)


There are comments upthread which mention hitting this issue and say that things got smoother and they’re happy now.


Seems to be only available for public repos or enterprise customers. At least I can't see the setting in my company's private repo.


> Pull request merge queues are available in any public repository owned by an organization, or in private repositories owned by organizations using GitHub Enterprise Cloud.

Source: https://docs.github.com/en/repositories/configuring-branches...


Note from https://github.com/orgs/community/discussions/46757 that the "@dependabot merge" command does not work - we struggled with this for a while when testing a few months back, before deciding to pause adopting this (merge queues are particularly useful for dependabot PRs, where you likely have a stack of them that you want to approve and forget).


This kind of synthetic branching seems to be useful in many places. I wonder who works on this.


It is only available for enterprise (which is unreasonably expensive: $21 vs $4 per month per seat). We asked our devops to develop something similar using our existing CI infra; it took them a few days.


What did they use? A GitHub Action that runs on PRs and creates other PRs with merged changes?


I looked into that. Basically a GitHub Action that goes over all of the PRs that are marked as "automerge" and updates them to master. It is not a perfect solution, but it works perfectly well for our monorepo.
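
For the curious, a rough sketch of what such a bot could look like against GitHub's REST API -- the repo, label name, and token handling here are placeholders, not the actual implementation described above:

  import os, requests

  REPO = "my-org/my-monorepo"          # placeholder
  API = f"https://api.github.com/repos/{REPO}"
  HEADERS = {
      "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
      "Accept": "application/vnd.github+json",
  }

  # Find open PRs labeled "automerge" and update each one onto the latest master.
  prs = requests.get(f"{API}/pulls", params={"state": "open"}, headers=HEADERS).json()
  for pr in prs:
      if any(label["name"] == "automerge" for label in pr["labels"]):
          requests.put(f"{API}/pulls/{pr['number']}/update-branch", headers=HEADERS)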


I read the whole article and still have no idea how it works.


Merge: PR with tests passing on this branch gets merged.

Merge when ready: PR with tests passing on a temporary branch (which includes all the PRs already queued up) gets merged when everything already queued up is merged.

It prevents merging in a situation where branch A passes tests, branch B passes tests, but main+A+B would break.


This looks like the merge method used by Gerrit which I've missed. Good to know that it's now available on GH!


How impactful is this if we don't require up-to-date on PR branches?


Is this in/will this come to Azure DevOps any time soon? Any rumors?


Azure Repos feels like it's not getting a lot of updates anymore. They still only support ssh-rsa host keys for SSH Git, which seems like it would be something to fix after several years of OpenSSH having deprecated it. (But merge trains would really help too).


I feel like the writing is on the wall that Azure Repos is supposed to finally get Dependabot, Security Token Scanning, and more security auditing tools 2-3 years after GitHub added all of those features and the brand name for that is "GitHub Advanced Security for Azure DevOps". That brand name says a lot (and not just because it is so long).

I really wish Microsoft would just pull the trigger on Old Yeller at this point, it's almost worse watching it suffer so much.


Drawback for Salesforce is that it uses additional scratch orgs.


Squash and rebase. Linear commit histories.


I don't understand how this is a relevant alternative? We are doing squash in our team, but merge queue seems to solve another problem. Could you explain, I'm really curious?


It’s irrelevant, merge queues are orthogonal to the actual merge method.


Not feasible if CI time > time diff between two merges (e.g. CI time is 15m but there's a PR merged every 5 min).


No problem unless it’s common for say 1 in 100 PRs to conflict at any point in their changes. If (say) one in 3 have conflicts, then it’s an obvious problem. So number of PRs alone shouldn’t be the limiting factor.


Nah, it's very much a problem then as well: while not integrating a broken change is easy, fixing it once it's there can take a while as you try to disentangle what's what under pressure, because at best other colleagues are hampered in their ability to work, or worse, use that as an opportunity to just merge their change in a hurry, compounding the issue. If your integration branch is broken for half a day every week it gets old quick.


Irrelevant.

Regardless of the merge strategy (merge vs rebase), you will face the same problem with high-volume development relative to CI times.

Merge vs rebase just affects how it appears in Git, not anything about breakage guarantees.


Merge queues/merge trains solve a problem that only exists if you require all PRs/MRs to be based on the tip of the target branch. And there's little reason to do that unless your codebase is a total mess.


I don't understand this, how could you be confident your changes won't break something if they are not based on the tip of a branch? What if other conflicting changes have been merged in the mean time?


If conflicting changes have been merged in, GitLab/Hub will tell you that your branch has conflicts and prompt you to update/rebase on the target branch. That doesn't happen simply because your branch is out-of-date with the target.


There are two types of conflicts: literally changing the same lines of code (which you're talking about) and conflicts of business logic between features/modules (which I'm talking about). What if your PR depends on a database table that was dropped in the most recent commit? If your branch is based on an old commit, your tests might pass, because that table has not been deleted yet from your POV. But there will be a failure when you merge.


No, merge queue logic is still useful even if you use literal merges, or rebases. We used GitLab at my last job and had a "merge train" bot that would do more-or-less the exact same thing as GitHub does here, it would just literally use 'git merge' instead of 'git rebase'. No big difference in that regard.

In any case, many teams and groups prefer reducing the number of merges to zero, if at all possible, and rebasing all final commits on top of trunk, to make the history shorter and more manageable. It isn't really that unusual at all these days. I've been working this way for years. But merge queues/merge trains are more a matter of team size/velocity/commit rate than it is anything to do with the merge algorithm, ultimately.


You've misread my comment. Whether you rebase on the target, or merge in from the target, the effect is the same. My point is that you only need to do that if you encounter git conflicts that must be solved manually.


It's not about git conflicts, it's about the behavior of the resulting repository and merging changes safely. You can absolutely have two changes A and B that have no conflicts to resolve textually, but can result collectively in a bad repository state (i.e. build is broken, tests fail) when both A+B are applied.

For example, say A is a patch that renames the function foo() to foobar(), and then B is a patch that calls the function foo() at a completely new callsite. The original repository is green, and both A and B are individually green too. You merge A into main, then merge B without rebasing or re-merging main, and the build is broken. Each of these changes passed the tests in isolation, but together they will result in a repository that is broken. This has nothing to do with whether the codebase is a mess or not; it's just a simple rename of a function. B and A are simply mutually exclusive, and must be ordered concretely between each other.
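
A toy version of that scenario, with made-up file and function names:

  # util.py on main, before either patch:
  def foo():
      return 42

  # Patch A renames foo() to foobar() and fixes all callers that exist on main:
  def foobar():
      return 42

  # Patch B, written against the old main, adds a brand-new caller:
  def new_feature():
      return foo()       # green on B's branch, where foo() still exists

  # Merge A, then merge B without rebasing: no textual conflict, but main now
  # calls foo(), which no longer exists, so the build/tests on main break.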

To fix this case manually, you have to merge A to main, then rebase B onto main (or merge main into B), which would then result in "function foo() not found", or your tests failing or build exploding. Now that the CI has caught it, you can change B to use the new function foobar() and re-attempt a merge again. Except you also have patches C-through-Z written by 5 other developers that might also conflict. This kind of example is everywhere in a large codebase; imagine that you're renaming files, changing parameter types to a function, reworking test output from debug statements, etc. And now imagine every CI run is 15 minutes long. If anyone merges in that 15 minutes before you, you have to start all over again. This also applies to all 5 developers of all other 20+ patches, too.

The long and short is that there are a lot of cases where, to be safe, you need to just rebase your change on top of the latest tip first, before you can be 100% sure the build passes. Codebase cleanliness has nothing to do with it; that's just too pessimistic.

The Merge Queue solves it a different way. Just queue up A and queue up B to be merged in series. Actually, the order doesn't matter at all. Let's say B is up first, then A, then a new patch C that is totally unrelated. The build passes with B applied, because it worked originally so it gets merged. Next up is A. A now fails, because even though it applied the patch successfully, there's a new call to the old function named foo(). So it gets kicked out. Now C is up immediately after. It succeeds, so it gets merged. At this point, the author of A is now responsible for rebasing their change and fixing the build. At no point did the author of either B or C have to be responsible for re-merging or rebasing their changes on top of main, as they triggered the happy case.

The best way I can describe merge queue versus manually rebasing is this: the merge queue is optimistic locking, while manual rebasing is pessimistic locking. The time-to-merge a change is the latency. In an optimistic lock strategy, you always try to do the thing, but just detect if it fails and safely abort. The pessimistic case requires strict serialization of the operations to ensure no conflicts, but it needlessly holds up many concurrent writers. It has nothing to do with "messiness" of the data structures, to use an optimistic lock; you might just have a really writer-heavy system on your hands! If we keep putting it in latency/locking terms, this results in a much better "p90 time-to-merge latency", in other words.

In a large team of developers, working on a big codebase, the "optimistic locking" approach of the merge queue is very effective at getting PRs merged faster, and has very few downsides.


@aselop, We encountered this problem as well. That's why we created Merge Graphs. Check out this blog: https://trunk.io/blog?post=trunk-merge


> For example, say A is a patch that renames the function foo() to foobar()

Why are you allowing commits to land that break interfaces?

If this is a public function it should go through a deprecation cycle.

If this is a private function then it shouldn't be accessed from outside its module/class/whatever in the first place. If it is, then you've got the "messy codebase" I referred to in my original comment.


They can both be private methods in a single module, the example still applies fine? You can for example rename a private function in a single rust crate, which is used by other functions within the crate, which is not exported.

I'm not sure if you're being intentionally obtuse here, it's a very simple scenario that has nothing to do with public interfaces or deprecation cycles.


> They can both be private methods in a single module, the example still applies fine?

It does, but you're now looking at an edge case that rarely arises in practice in a well-maintained codebase, and certainly not one to design your entire branching/merging/CI strategy around.

If your codebase is a spaghettified nonsense then the problem arises much more often, and so I can understand the use of merge trains/whatnot.


My situation: small team with around 10 simultaneous projects at any one time, all depending on a framework that we use for tons of different things. Everyone is constantly modifying the same parts of the code because that’s where the features should be added. It’s way cleaner than when the projects were separated and didn’t share code.

In this case, we often have conflicts in private modules. (And everything is private since we don’t provide any libraries to anyone.)



