For RHEL you are stuck on one kernel for an entire release. Every fix has to be backported from upstream, and the further from upstream you get the harder it is to do that work.
Btrfs has to be rebased _every_ release. It moves too fast, and there is so much work being done that you can't just cherry-pick individual fixes. This makes it a huge pain in the ass.
Then you have RHEL's "if we ship it we support it" mantra. Every release you have something that is more Frankenstein-y than it was before, and you run more of a risk of shit going horribly wrong. That's a huge liability for an engineering team that has 0 upstream btrfs contributors.
The local file system group is made up entirely of xfs developers. Nobody has done serious btrfs work at Red Hat since I left (with the brief exception of Zach Brown for a little while).
SUSE uses it as their default and has a lot of in-house expertise. We use it in a variety of ways inside Facebook. It's getting faster and more stable, admittedly more slowly than I'd like, but we are getting there. This announcement from Red Hat is purely a reflection of Red Hat's engineering expertise and the way they ship kernels, not an indictment of Btrfs itself.
I'm pretty sure that, had RH wanted to, they could have hired or assigned engineers to maintain the btrfs code, take care of patches from upstream, etc. So why didn't that happen? I wonder what your opinion on that is.
I see a bunch of possibilities (not necessarily independent ones):
1) Politics. Perhaps RH wants to kill btrfs for some reason?
I see this as rather unlikely, as RH does not have a competing solution (unlike in the Jigsaw controversy, where they have incentives to kill it in favor of the JBoss module system).
2) Inability to hire enough engineers familiar with btrfs, or assign existing engineers.
Perhaps the number of engineers would be too high, increasing costs. Especially if not only to maintain the RHEL kernels, but to contribute to btrfs and move it forward.
Or maybe there's a pushback from the current filesystems team, where most people are xfs developers?
3) Incompatible development models.
If each release requires a rebase, perhaps supporting btrfs would require too much work / too many engineers, increasing costs? I wonder what SUSE and others are doing differently, apart from having in-house btrfs developers.
4) Lack of trust that btrfs will get mature enough for RHEL soon.
It may work for certain deployments, but for RHEL customers that may not be sufficient. That probably requires a filesystem performing well for a wider range of workloads.
5) Lack of interest from paying RHEL customers.
Many of our customers have RHEL systems (or CentOS / Scientific Linux), and I don't remember a single one of them using btrfs or planning to do so. We only deal with database servers, which is a very narrow segment of the market, and a fairly conservative one when it comes to filesystems.
But overall, if customers are not interested in a feature, it's merely a pragmatic business decision not to spend money on it.
6) Better alternatives available.
I'm not aware of one, although "ZFS on Linux" is getting much better.
So I tend to see this as a pragmatic business decision, based on customer interest in btrfs on RHEL vs. costs of supporting it.
Now as to
> "Why Red Hat does not have engineers to support btrfs?"
You have to understand how most kernel teams work across all companies. Kernel engineers work on what they want to work on, and companies hire the people working on the thing the company cares about to make sure they get their changes in.
This means that the engineers have 95% of the power. Sure you can tell your kernel developer to go work on something else, but if they don't want to do that they'll just go to a different company that will let them work on what they care about.
This gives Red Hat 2 options. One is they hire existing Btrfs developers to come help do the work. That's unlikely to happen unless they get one of the new contributors, as all of the seasoned developers are not likely to move. The second is to develop the talent in-house. But again we're back at that "it's hard to tell kernel engineers what to do" problem. If nobody wants to work on it then there's not going to be anybody that will do it.
And then there's the fact that Red Hat really does rely on the community to do the bulk of the heavy lifting for a lot of areas. BPF is a great example of this, cgroups is another good example.
Btrfs isn't ready for Red Hat's customer base, nobody who works on Btrfs will deny that fact. Does it make sense for Red Hat to pay a bunch of people to make things go faster when the community is doing the work at no cost to Red Hat?
A release under a compatible license would likely see a ZFS kernel module appear in EPEL immediately; Red Hat would likely replace XFS with ZFS as the default in RHEL 8 were this legally possible.
Oracle supports BtrFS in their Linux clone of RHEL. It certainly appears that Red Hat is swallowing a "poison pill" to increase Oracle's support costs (and I'm surprised that they have not swallowed more).
With these new added costs, Oracle might find it cheaper to simply support the code for the whole ecosystem (CentOS and Scientific Linux included). Given the adversarial relationship that has developed between the two protagonists, an enforceable legal agreement would likely be Red Hat's precondition.
Otherwise, BtrFS has been mortally wounded.
Oracle owns a lot of patents and I suspect both ZFS and BtrFS rely on some.
FWIW I haven't said anything about Oracle & btrfs ...
>> "Why Red Hat does not have engineers to support btrfs?"
> You have to understand how most kernel teams work across all companies. Kernel engineers work on what they want to work on, and companies hire the people working on the thing the company cares about to make sure they get their changes in.
> This means that the engineers have 95% of the power. Sure you can tell your kernel developer to go work on something else, but if they don't want to do that they'll just go to a different company that will let them work on what they care about.
> This gives Red Hat 2 options. One is they hire existing Btrfs developers to come help do the work. That's unlikely to happen unless they get one of the new contributors, as all of the seasoned developers are not likely to move. The second is to develop the talent in-house. But again we're back at that "it's hard to tell kernel engineers what to do" problem. If nobody wants to work on it then there's not going to be anybody that will do it.
Sure, I understand many developers have their favorite area of development, and move to companies that will allow them to work on it. But surely some developers are willing to switch fields and start working on new challenges, and then there are new developers, of course. So it's not like the number of btrfs developers can't grow. It may take time to build the team, but they had several years to do that. Yet it didn't happen.
> And then there's the fact that Red Hat really does rely on the community to do the bulk of the heavy lifting for a lot of areas. BPF is a great example of this, cgroups is another good example.
I tend to see deprecation as the last stage before removal of a feature. If that's the case, I don't see how the community doing the heavy lifting makes any difference for btrfs in RHEL.
Or are you suggesting they may add it back once it gets ready for them? That's possible, but the truth is that if btrfs is missing in RHEL (and derived distributions), that's a lot of users lost.
I don't know what the development statistics are, but if the majority of btrfs developers work for Facebook (for example), I suppose they are working on improving the areas important to Facebook. Some of that will overlap with the use cases of RHEL users, some of it will be specific. So it likely means a slower pace of improvements relevant to RHEL users.
> Btrfs isn't ready for Red Hat's customer base, nobody who works on Btrfs will deny that fact. Does it make sense for Red Hat to pay a bunch of people to make things go faster when the community is doing the work at no cost to Red Hat?
The question is, how could it get ready for Red Hat's customer base, when there are no RH engineers working on it? Also, I assume the in-house developers are not there only to work on btrfs improvements, but also to investigate issues reported by customers. That's something you can't offload to the community.
I still think RH simply made a business decision, along these lines:
1) btrfs possibly matters to X% of our paying customers, and some of them might leave if we deprecate it, costing us $Y.
2) An in-house team of btrfs developers who would work on it and provide support to customers would cost $Z.
If $Y < $Z, deprecate btrfs.
This move by Red Hat must be seen as a provocation of Oracle, to force either greater cooperation and compliance in producing a stable BtrFS for RHEL, or the release of ZFS under a compatible license. Red Hat has put an end to BtrFS for now, and Oracle will have to go to greater lengths to use it in their clone. Customers also will not want it if it does not run equally well between RHEL and Oracle Linux.
It is obvious that Oracle will have to assume higher costs and a larger support burden if they want BtrFS in RHEL. Red Hat is certainly justified in bringing Oracle to heel.
Oracle recently committed preliminary dedup support for XFS, so they must be intimately aware of the technical and legal issues behind Red Hat's move.
> This move by Red Hat must be seen as a provocation of Oracle
I doubt it.
"... initially designed at Oracle Corporation for use in Linux."
 - https://linux.oracle.com/
More likely they'll support it only on their Linux.
Assuming that RHEL v8 strips BtrFS, Oracle's RHCK will have to add support back in, and thus no longer be "compatible." Without that support, some filesystems will fail to mount at boot. In-place upgrades from v7 to v8 will be problematic.
Oracle has worked very hard to maintain "compatibility" with Red Hat, even going so far as to accept MariaDB over MySQL. Their reaction to the latest "poison pill" will be interesting.
An in-place upgrade from v7 to v8 could easily get hosed.
It's a clear indicator that Red Hat doesn't want to, or can't, support btrfs.
Which is a reflection of both btrfs AND Red Hat: the effort required to maintain it, the lack of usage among RHEL paying customers, and the immaturity / fast pace of development of the filesystem.
As a result, it is not at all surprising that Red Hat ends up functioning somewhat like a baseball farm team for companies like Facebook, Google, etc., who are willing to pay more and have more liberal travel policies than Red Hat. If someone can become a strong open source contributor while working at Red Hat, they can probably get a pay raise by going somewhere else.
There is a trade-off: companies that pay you much more also tend to expect that you will add a corresponding amount of value to the company's bottom line. So you might have slightly more control over what you choose to work on at Red Hat.
I imagine they also pay significantly less than the other companies (e.g. Facebook) that want to hire Btrfs devs can afford to.
Fragmentation is also an issue, and xfs_fsr should be run at regular intervals to "defrag" an XFS file system. (I assume that) BtrFS handles this more intelligently.
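For example, a periodic defrag pass could be scheduled with a cron entry like the following sketch (the schedule and the explicit time limit are my assumptions, not anything recommended in this thread):

```
# Hypothetical /etc/cron.d entry: every Sunday at 03:00, let xfs_fsr spend up
# to 7200 seconds (its default limit, made explicit here with -t) reorganizing
# all mounted XFS filesystems it finds listed in /etc/mtab.
0 3 * * 0  root  /usr/sbin/xfs_fsr -t 7200
```

Run with no filesystem arguments, xfs_fsr works through every mounted XFS filesystem in turn and remembers where it left off, so repeated short runs eventually cover everything.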
I'd love to see XFS get some or all of these features.
Stop backporting fixes. You're forking the codebase.
Ship exactly what upstream provides.
Teach upstream projects how to do better release engineering if they're abandoning major releases too early or breaking API/ABI in a minor release.
edit: also stop incorrectly backporting security fixes and creating new CVEs. Seriously. Stop it.
Not all upstreams are interested in doing release engineering. There are non-zero costs to doing it. It can eat up time that could be spent on bug fixes and features, or even make it too costly to change direction if a certain approach to implementation is proving more difficult than it should be.
Look at the Linux kernel. The only reason there is a stable kernel series is because Greg K-H decided it was important enough. He was unable to convince any other developers to go along with it, and eventually the decision was "if you want to support it, then you can do it."
Do you consider the stable kernel series a fork of the codebase? Should everyone be running the newest kernel every release despite the plenty of regressions that appear?
Kernel developers are not interested in making every change in such a slow and controlled manner as to avoid any regressions. And it works for them. They get a lot of stuff done, and come back and fix the regressions later.
If you don't care that much about development velocity, it's really easy to make something that is super stable.
If you only care about making things work for a very narrow use case (to support the back end of a particular company's web servers, or just to support a single embedded device), life also gets much easier.
If you want to "move fast and break things", that's also viable.
Finally, if you have unlimited amounts of head count, life also becomes simpler.
Different parts of the Linux ecosystem have different weights on all of these issues. Some environments care about stability, but they really don't care about advanced features, at least if stability/security might be threatened. Others are interested in adding new features into the kernel because that's how they add differentiators against their competitors. Still others care about making a kernel that only works on a particular ARM SOC, and to hell if the kernel even builds for any other architecture. And Red Hat does not have infinite amounts of cash, so they have to prioritize what they support.
So a statement such as "Teach upstream projects how to do better release engineering" is positively Trumpian in its naivete. Who do you think is going to staff all of this release engineering effort? Who is going to pay for it? Upstream projects consist of some number of hobbyists, and some number of engineers from companies that have their own agendas. Some of those engineers might only care about making things better for Qualcomm SOCs, and to hell with everyone else. Others might be primarily interested in how Linux works on IBM mainframes. If there are no tradeoffs, then people might not mind work that doesn't hurt their interests but helps someone else. They might even contribute a bit to helping others, in the hopes that they will help their use case. That's the whole basis of the open source methodology.
But at the same time you can't assume that someone will spend vast amounts of release engineering effort if it doesn't benefit them or their company. Things just don't work that way. And an API/ABI that must be stable might get in the way of adding some new feature which is critically important to some startup which is funding another kernel engineer.
There is a reason why the Linux ecosystem is the way it is. Saying "stop it" is about as intelligent as saying that someone who is working two 25 hour part-time jobs should be given a "choice" about her healthcare plan, when none of the "choices" are affordable.
Upstreams don't have the resources to do proper release engineering, they're busy working on new features. The fact that SUSE and Red Hat spawned from a requirement for release engineering that upstreams were not able to provide should show that it takes a lot more work than you might think.
Also, can we please all agree as a community that writing patches and forking codebases is literally the whole point of free software? If nobody should ever fork a codebase, then why do we even have freedoms #1 and #2? The trend of free software projects taking an anti-backport stance is getting ridiculous. If you don't want us to backport stuff, stop forcing us to do your release engineering for you.
Honestly, what I think we need is containers that actually overlay on the host system and only include whatever specialised stuff they need on top of the host. That way updates to the host do propagate into containers -- and for bonus points the container metadata can still be understood by the host.
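A rough sketch of that idea: an overlayfs mount whose lower layer is the host's own tree, so host updates show through, with the container's specialised additions kept in a writable upper layer. All paths here are hypothetical:

```
# Hypothetical /etc/fstab entry: the container root at /var/lib/ctr/root sees
# the host filesystem (lowerdir=/) read-through, while anything the container
# installs or modifies lands only in its own upper layer.
overlay  /var/lib/ctr/root  overlay  lowerdir=/,upperdir=/var/lib/ctr/upper,workdir=/var/lib/ctr/work  0 0
```

With this layout, the container image shrinks to just the upper layer, and a security update applied to the host is immediately visible inside the container on its next start.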
Again and again we see that without any financial incentive, developers are loath to put any effort into backwards compatibility and interface stability.
At the same time they all want people to be running their latest and shiniest.
So in the end, what will happen is that each "app" will bundle the world, or at least as much as they feel they need to.
"edit: also stop incorrectly backporting security fixes and creating new CVEs. Seriously. Stop it."
Can you give some examples of cases where Red Hat introduced bugs in their backported patches? I follow RHEL CVEs relatively closely (because some of my packages are derived from their packages), and I can't think of an example of that happening. Debian has done so, but very rarely, that I can recall. (And, Ubuntu, too, since they just copy Debian for huge swaths of the OS.)
The number of times the APIs change from under your feet is astounding. Even just keeping up with Red Hat, we spend around 4-6 times the engineering time on the driver compared to what we do with the Windows version of the driver; tracking upstream gave us almost an order of magnitude more work. (And keep in mind that /only/ supporting the most recent upstream kernel is rarely an option: several versions need to be supported concurrently.)
For many customers of Red Hat, that mantra is the very reason they use RHEL in the first place.
Sadly I feel that more and more upstreams want it both ways: to be able to push their latest and shiniest, while ignoring any need for interface stability etc.
Frankly I suspect the end result of the likes of Flatpak will be that upstreams push whole distros' worth of bundled libs, just so they don't have to consider interface stability as they pound out their shinies in their best "move fast and break things" manner.
From the perspective of legacy systems, Red Hat's approach is more comfortable.