
Why does btrfs have those issues compared to other filesystems? I'd love to use btrfs too. Note that I deeply respect people who can write such complicated code, which I couldn't.

Would Rust solve the non-speed issues? Rust-in-kernel discussion from August: https://lwn.net/Articles/829858/



I sincerely hope you are joking, but I realize this mindset is quite common these days, so let me reiterate: Rust does not magically solve problems for you. Btrfs has a lot of issues, and some of them may well be of the "not possible in Rust" sort, but I'm quite sure most of them are not, and there is nothing Rust can do about those. Rewriting a 13-year-old and very complex thing in another language is a massive effort and a big opportunity to introduce more, possibly worse, bugs along the way.

EDIT: also see panpanna's comment


It's a recurring story: every time someone sells a magical solution to all problems and puts a name on it, people blindly believe it.

These are tools. A screwdriver is not the right tool when you need a power drill, and vice versa. Yes, sometimes you can use either and stick with the tool you're accustomed to.

Anyway, C and C++ and their toolchains are constantly improving, like other languages'. The modern memory sanitizers in GCC and LLVM are awesome.


Any rewrite must be carefully evaluated, true. And it may not be advisable. I'm trying to understand the issues.

Are they not solvable because the kernel does not give the same guarantees it gives to userspace, because the kernel's C interfaces have to be wrapped in unsafe, or because of other reasons (architecture, data model, kernel constraints, ...)?


I think the issue is that you haven’t understood the issue, and just proposed using Rust. Don’t get me wrong, I’m a huge fan of Rust, but seeing people blindly suggest it as a panacea gets frustrating for people. The reality is that when people talk about Rust being fast, it’s because a decently optimised Rust program might be comparable to an equivalent C program. C is generally not the problem when it comes to performance issues in the kernel, which means rewriting that thing in Rust wouldn’t magically make it faster.

The issue will probably be some sort of pathological case in an algorithm being used, or perhaps from a poorly chosen algorithm. The point being, it’s not clear yet, and to solve that requires understanding the problem, not effecting a needless rewrite in a new language.


> C is generally not the problem when it comes to performance issues in the kernel, which means rewriting that thing in Rust wouldn’t magically make it faster.

Agreed.

> The point being, it’s not clear yet, and to solve that requires understanding the problem, not effecting a needless rewrite in a new language.

Which is why the first question is why btrfs has issues others do not have. Some mention its CoW architecture, system design, its many features, and the fact that it does not limit usage to 90% the way ZFS does, all of which increase complexity. Others mention the usual kernel issues.

Others mention that they had no issues to begin with and that its mixed reputation is unwarranted. I'm not clear who is right, but they are data points.

Thank you for your input cautioning against rewrites to avoid needless work; I appreciate it.


Btrfs is really only comparable to ZFS, which is also CoW, feature rich and written in C, but has been considered stable for a great many years.


> Why does btrfs have those issues compared to other filesystems?

As someone that has built infrastructure on BtrFS for years, the scary stories are mostly just hot air and the stability of other filesystems is really not significantly better.

Bugs like this happen; this is why Linus releases several release candidates for every kernel. This one got through because 10 was a rather massive kernel and there were several regressions, including one that caused a new release just hours after the supposedly final one. Distros wait a bit longer before shipping an updated kernel, so none of these hit actual users.

As far as I know there is no reason to abstain from using Btrfs. When Fedora talked about not using it, the reason they gave was that they had no in-house expertise.


> As someone that has built infrastructure on BtrFS for years, the scary stories are mostly just hot air and the stability of other filesystems is really not significantly better.

I've lost 2 root filesystems to btrfs, on a laptop with only 1 drive (read: not even using RAID). Have you considered that you're just lucky?


In my experience, the stories are real. Our entire company was offline for a day when our central storage server quit accepting writes despite having over 50% free space. That's when I learned the hard way about the data/metadata split (something I was aware of but wasn't exactly top of mind) and BTRFS balance. You can certainly say it was my fault for not reading ALL the documentation before using BTRFS, and I'd find it hard to disagree, but no other filesystem would have had this problem.

I can't speak to whether there are other foot-guns waiting around or how common problems like this are, because we migrated back to FreeBSD and ZFS shortly after that experience. I do know they have since updated BTRFS to make that scenario less likely (but still not impossible).


This is exactly my experience as well.

Btrfs is the C++ of file systems: it’s powerful and works for a great many people. But the tooling is intimidating to newcomers, and unless you know exactly what you’re doing, it’s a ticking time bomb due to the plethora of foot-guns and hidden traps.

This is why some people claim to have success with it while a great many other people, rightfully, claim it’s not yet ready for prime time.

ZFS, on the other hand, has not only protected me against failing hardware, but it also has sane defaults and easy-to-use tooling, thus protecting me against my own stupidity.


I don't think it's your fault in any way.

We expect filesystems to work robustly. We do not expect them to fail after an arbitrary time interval merely by being used. Even terrible filesystems like FAT don't do that. They might get fragmented and slow, but they don't just stop. I find it incredible that this is often minimised by people; it's a complete show-stopper irrespective of the other problems Btrfs has.

I made exactly the same migration you did. ZFS has been solid, and it does exactly what it says on the tin.


Does Fedora have in-house expertise now that they've decided to make Btrfs the default?


Some Facebook btrfs devs agreed to help out. FB runs Fedora/CentOS + btrfs root on most of its systems.


Thank you for posting your experience!


My guess is this (compared to ZFS): with a CoW file system like btrfs you have the problem that you need new file system space to delete something. This is problematic if the file system is full and you want to be able to write to it again by deleting something. ZFS solved this by simply saying you can only fill a file system to 90% usage. At some point they even lowered this limit (during an upgrade), and I ran into the situation that I couldn't write to a ZFS file system because the threshold had been lowered.

Btrfs tries to use the space fully and takes on all the associated complexity. Additionally, because the data/metadata ratio is not fixed, one can get into situations where the file system is full and there is no more metadata space. For every action it needs to carefully check whether there is enough space to actually perform it, even when the file system is nearly full. Improvements in this area caused this regression.

And no, Rust wouldn't help here. How often do you get a kernel oops from a use-after-free or a data race? Those are the problems Rust would help with, not a space-accounting regression like this one.
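
As a contrived userspace illustration (my own toy code, nothing to do with btrfs): the snippet below is the kind of bug the Rust compiler refuses to build, where C would happily produce a use-after-free.

    fn main() {
        let v = vec![1, 2, 3];
        let first = &v[0]; // borrow a reference into the vector
        drop(v);           // compile error: error[E0505] cannot move out of `v`
                           // because it is borrowed
        println!("{}", first);
    }

A wrong free-space calculation, on the other hand, type-checks just fine.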


Interesting decision they made; I wonder whether they would decide differently now after seeing all the complexity.

So Rust does not decrease the complexity; it only removes the kinds of errors the compiler can detect, which excludes logic errors and speed regressions.
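
For example (a toy snippet of my own, unrelated to the btrfs code), this compiles and runs without any complaint even though the check is simply backwards:

    // Hypothetical helper, purely for illustration: the comparison is
    // inverted, and no borrow checker will ever notice that.
    fn has_enough_space(free_bytes: u64, needed_bytes: u64) -> bool {
        free_bytes <= needed_bytes // BUG: should be `>=`
    }

    fn main() {
        // Claims there is enough space with 10 bytes free and a megabyte needed.
        println!("{}", has_enough_space(10, 1_000_000)); // prints "true"
    }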

Thank you for your input.


> I wonder whether they would decide differently now after seeing all the complexity.

Decide what? CoW is a fundamental part of how btrfs works, and Rust didn't exist for the majority of Linux's life. (Although if you're into that, look at Redox.)


Sorry, I was unclear :) Maybe they would limit storage to 90% now, seeing that not doing so increases complexity. Maybe I misunderstood the prior point, though.

Thanks for mentioning Redox!


Please understand that Rust's memory and thread safety mainly applies to "normal" applications.

In the kernel, you can run a privileged cache or MMU instruction, or write to some magical memory location, and all of a sudden the "normal" rules don't apply anymore.
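
To make that concrete, here is a rough sketch (with a made-up register address, written as ordinary userspace Rust rather than real kernel code): the moment you touch a hardware register you are in unsafe territory, and the compiler can no longer vouch for anything.

    use std::ptr;

    // Made-up MMIO address, purely for illustration.
    const DEVICE_CTRL_REG: usize = 0xFFFF_F000;

    fn poke_device() {
        // Raw writes to hardware addresses cannot be checked by the compiler,
        // so Rust forces them into an `unsafe` block; its guarantees stop at
        // this boundary. (On a machine without such a device this write would
        // simply fault.)
        unsafe {
            ptr::write_volatile(DEVICE_CTRL_REG as *mut u32, 0x1);
        }
    }

    fn main() {
        // Referenced but not called, since on ordinary hardware the write
        // above would crash; the point is only where `unsafe` has to sit.
        let _ = poke_device;
    }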

(But I think there are other parts of Rust that are nice to have in the kernel, or in any complex software.)


I thought the Rust compiler catches issues that you wouldn't immediately see with pure C, which is why I had the idea.

I didn't know this relies on guarantees which are not available inside the kernel. I only knew that all existing interfaces may have to be treated as unsafe because they are in C, though. Rust does not seem as useful there, then.

Thank you for your input.


It's not so much that Rust the language requires them; it's that other, non-Rust parts can quite easily stomp all over Rust's guarantees without there ever being a way of knowing it happened. So Rust alone won't solve many problems, but it would let you say "this code can't do these things itself", which is still a useful distinction. It also doesn't let you deal with misbehaving hardware that changes memory underneath you in ways it said won't happen. Hardware sucks.
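
A small userspace example of what that "itself" means, calling the C library's memset through FFI (just a sketch of the boundary, not a suggestion to do this):

    use std::ffi::c_int;

    // The C library's memset, reached through FFI. Whatever happens on the
    // other side of this declaration is invisible to the Rust compiler.
    extern "C" {
        fn memset(dest: *mut u8, byte: c_int, len: usize) -> *mut u8;
    }

    fn main() {
        let mut buf = [0u8; 8];
        // The call has to sit in an `unsafe` block: if the length were wrong,
        // the C side would stomp over neighbouring memory and no borrow
        // checker would notice. The safe code around it, by contrast, can be
        // audited as "cannot do that by itself".
        unsafe {
            memset(buf.as_mut_ptr(), 0xAB, buf.len());
        }
        println!("{:02x?}", buf);
    }

DMA-capable hardware is the same idea, except there is no unsafe block left in the source to grep for.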


Do other parts stomp often? :) But true, that can happen, especially on non-ECC systems.

I didn't think about the hardware issues, hmm. I can't see how to handle that when the compiler's guarantees get invalidated by hardware. Are checks still needed, like in C? (Assuming there are checks which don't get compiled out...)


When the hardware can't make the guarantees, software really can't do anything about it. There aren't really any checks you can do. Modern hardware is gaining the ability to prevent some of these issues with IOMMU units, but operating system support is still hit or miss for most hardware, and it won't prevent everything, just devices stomping on each other with DMA. That's basically how the Thunderbolt attacks have worked, and how the mitigations for them work.


> Why does btrfs have those issues compared to other filesystems?

Mostly because it has lots of features and, as a consequence, is pretty large and complex. It's closer to ZFS than to ext2.

Btrfs also suffers from its initial bad reputation, which is difficult to overcome.


ZFS itself seems very stable and doesn't seem to suffer from these issues, though.


Anecdotally, I've encountered issues with both ZFS and BTRFS at about the same rate. A public example of an apparent ZFS performance issue is https://github.com/openzfs/zfs/issues/9375. Both are much more quirky than simpler filesystems like ext4. The data integrity verification from checksumming makes it worth it, though.

The ZFS vs. BTRFS choice, I think, depends more on whether you need specific features like offline deduplication or L2ARC / SLOG cache devices. And which one you're more familiar with (can troubleshoot better).


ZFS is also extremely well designed at a system level... which is not the impression I get from BtrFS. (Disclaimer: I have not bothered looking at BtrFS for years because ZFS has handled everything I've thrown at it very admirably. Including complicated setups with RAID-Z, etc.)

Granted, there are some limitations to the design, but it doesn't affect my use cases, so whatever...


Good theory, thank you.


> Why does btrfs have those issues compared to other filesystems?

Why? There are several reasons, but if you go right back to the beginning, there's a single reason which caused all the other problems: they started coding before they had finished the design.

All of the other problems are fallout from that. Changing the design and the implementation to fix bugs after the initial implementation was done. Introducing more bugs in the process. And leaving unresolved design flaws after freezing the on-disc format.

When you look at ZFS as a comparison, the design was done and validated before they started implementing it. Unsurprisingly, it worked as designed once the implementation was done. Up-front design work is necessary for engineering complex systems; it really goes without saying.

This isn't even unique to Btrfs, but filesystems are one thing you can't hack around with without coming to grief; you have to get them right the first time when their sole purpose is to store and retrieve data reliably. Many open source projects are riddled with problems because their developers were more interested in bashing out code than in stopping and thinking beforehand. The same goes for a lot of closed source projects, for that matter.

In the case of Btrfs, which was aiming from the start to be a "better ZFS", they didn't even take the time to fully understand some of the design choices and compromises made in ZFS, and they ended up making choices which had terrible implications. Examples: using B-trees rather than Merkle hashes, which is at the root of many of its performance problems; not having immutable snapshots, which again has performance as well as safety implications, and is rooted in not having pool transaction numbers and deadlists; and not separating datasets/subvols from the directory hierarchy, which presents logistical and administrative challenges, while ZFS datasets can freely inherit metadata from their parents and their mount locations are a separate property. ZFS isn't perfect of course, and there are improvements and new features that could be made, but what is there is well designed, well thought out, and a joy to work with.


Can you tell me how such an evaluation of a design is done? Is it normally some kind of formal verification or analysis, or rather experimentation to figure out its properties?

Thank you for your input!


I wasn't involved so can't personally provide details of how this was done at Sun. Most of my knowledge comes from listening to talks and reading books on ZFS.

For the work I'm involved in relating to safety-critical systems, we use the V-model for concepts, requirements, design and implementation, with extensive validation and verification activities at each level. Tools are used to manage all of the requirements, design details and implementation details, and to link them together in a manner which aims to enforce self-consistency at all levels. When done correctly, this means that the person writing the code does not need to be particularly creative at that stage: the structure is completely detailed by the formal design. It does require significant up-front effort to carefully consider and nail down the design to this level of detail, but it avoids the need to continually revise and adapt an incomplete or bad design in a never-ending implementation phase.

This approach is definitely not for everyone, and there are many things one can criticise about it. But if you are willing to bear the financial and time costs of doing that detailed design work up front, the cost of implementation will be much lower and the product quality will be much higher. There is a lot to be said for not madly mashing keys and churning out code without thinking about the big picture, and Btrfs is a case study in what not to do.


The V-model is interesting. I'm a student and kinda new to the different development models.

How do you decide whether such meticulous design is necessary? In hindsight Btrfs may have benefited from it, but how do you decide when to do it and when not to in the future?

I would also be interested to know what tools are used for this. The ones I looked at seemed quite dated.. :-)

Thank you for answering! This is very interesting to learn about


This is just my own personal take on things; I'd definitely recommend reading up on the differences between Waterfall, Agile and the V-model (and Spiral model). Note that you'll see it said that the V-model is based upon Waterfall, which is somewhat true, but it's not necessarily incompatible with Agile. You can combine the two and go all the way down and back up the "V" in sprints or "product increments", but you do need the resources to do all the revalidation and reverification at all levels each time, and this can be costly (this is effectively what the Spiral model is).

In terms of deciding if meticulous up-front design is necessary (again my own take), it depends upon the consequences of failure in the requirements, specifications, design and/or implementation. A random webapp doesn't really have much in the way of consequences other than a bit of annoyance and inconvenience. A safety-critical system can physically harm one or multiple people. Examples: car braking systems, insulin pumps, medical diagnostics, medical instruments, elevator safety controls, avionics etc. It also depends upon how feasible it is to upgrade in the field. A webapp can be updated and reloaded trivially. An embedded application in a hardware device is not trivial to upgrade, especially when it's safety-critical and has to be revalidated for the specific hardware revision.

For filesystems the safety aspect will relate to maintaining the integrity of the data you have entrusted to its care. Computer software and operating systems can have all sorts of silly bugs, but filesystem data integrity is one place where safety is sacrosanct. We set a high bar in our expectation for filesystems, not unreasonably, and after suffering from multiple dataloss incidents with Btrfs, it's clear their work did not meet our expectations. We're not even going into the performance problems here, just the data integrity aspects.

I can't say anything about the tools I use in my company. There are specialist proprietary tools available to help with some of the requirements and specifications management. I will say this: the tools themselves aren't really that important, they are just aids for convenience. The regulatory bodies don't care what tools you use. The important part is the process, of having detailed review at every level before proceeding to the next, and the same again when it comes to validation and verification activities.

Often open source projects limit themselves to some level of unit testing and integration testing, which is fine. But the coverage and quality of that testing may leave some room for improvement. It's clear that Btrfs didn't really test the failure and recovery codepaths properly during its development. Where was the individual unit testing and integration test case coverage for each failure scenario? Where the V-model goes above and beyond this is in the testing of the basic requirements and high-level concepts themselves. You've got to check that the fundamental premises the software implementation is based upon are sound and consistent.
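
To illustrate what I mean by exercising the failure codepaths deliberately, here is a toy sketch in Rust (my own example, nothing to do with the actual btrfs or ZFS test suites): inject the fault on purpose, then assert that the caller's state is still consistent afterwards.

    use std::io;

    // Minimal storage abstraction for the sketch.
    trait Store {
        fn write_block(&mut self, data: &[u8]) -> io::Result<()>;
    }

    // A store that always reports "no space left", standing in for the
    // hard-to-reproduce out-of-space condition.
    struct FullStore;

    impl Store for FullStore {
        fn write_block(&mut self, _data: &[u8]) -> io::Result<()> {
            Err(io::Error::new(io::ErrorKind::Other, "no space left on device"))
        }
    }

    // Code under test: it must not count a block whose write failed.
    struct Journal<S: Store> {
        store: S,
        committed: usize,
    }

    impl<S: Store> Journal<S> {
        fn append(&mut self, data: &[u8]) -> io::Result<()> {
            self.store.write_block(data)?; // bail out before touching state
            self.committed += 1;
            Ok(())
        }
    }

    #[test]
    fn full_device_leaves_state_consistent() {
        let mut journal = Journal { store: FullStore, committed: 0 };
        assert!(journal.append(b"hello").is_err());
        // The failure path must not pretend the block was committed.
        assert_eq!(journal.committed, 0);
    }

Drop that into a library crate and run cargo test. Multiply it by every failure scenario in the requirements and you get the kind of coverage I am talking about, before even reaching the requirements-level validation the V-model adds on top.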



