Actually, XFS is now default on RHEL and Suse :) But honestly, over the last yea...

laumars · on July 23, 2015

I've had the opposite impression of ZFS - it's saved my data on a number of occasions when ext* would have failed completely. I personally think Sun's engineers did an amazing job with ZFS. Plus while the file system is hugely complex, management of it is remarkably straightforward - which leads to less user / sysadmin error when running it.

Sadly though, I do agree with your judgement of Btrfs. I trialled it for about 6 months and found it to be conceptually similar to ZFS but very much it's practical polar opposite.

However, like with any software decision, the right file system choice depends as much on the platform's intended purpose and the administrator, as it does on age and features. For file servers, ZFS is an excellent choice; but for low footprint appliances, you'd be better off with ext4 or xfs. And desktops is just a question of personal preference.

buster · on July 23, 2015

I'm not an expert (but, so most people are no experts in this area), but i have a feeling that ZFS needs huge amounts of memory (compared to ext). It does nifty stuff, for sure. But do i need all of that? Hardly.

For example, i am wondering why i would want to put compression into the file system. Or deduplication. Or the possibility to put some data on SSD and other on HDD. If i have a server and user data, it should be up to the application to do this stuff. That should be more efficient, because the app actually knows how to handle the data correctly and efficiently. It would also be largely independent of the file storage.

I've seen some cases where we had a storage system, pretending to do fancy stuff and fail on it. And debugging such things is a nightmare (talking about storage vendors here, though). But for example, a few years ago we had major performance problems because a ZFS mount was at about 90% of space. It was not easy to pinpoint that. After the fact it's clear, and nowadays googling it would probably give enough hints. But in the end I would very much like that my filesystem does not just slow down depending on how much filled it is. Or how much memory i have.

edit: Also, just to clarify. I think that Sun had one of the best engineers of the whole industry. Everything they did has been great. Honestly, i have huge respect for them and also for ZFS. I still think that ZFS is great, but in the end, i am wondering if it is just too much. For example, nowadays you have a lot of those stateless redundant S3 compatible storage backends. Or use Cassandra, etc. Those already copy your data multiple times. Even if they run on ZFS, you don't gain much. If you run ext4 and it actually loses data, the software cares about that. That's just one case, and of course it depends on your requirements. Just saying, those cases are increasing where the software already cares for keeping the important data safe.

laumars · on July 23, 2015

> I'm not an expert (but, so most people are no experts in this area), but i have a feeling that ZFS needs huge amounts of memory (compared to ext). It does nifty stuff, for sure. But do i need all of that? Hardly.

ZFS needs more memory than ext4, but reports for just how much memory ZFS needs is grossly over estimated. At least for desktop usage - file servers are a different matter and thus that's where those figures come from.

To use a practical example, I've ran ZFS + 5 virtual machines on 4GB RAM and not had any issues what-so-ever.

> For example, i am wondering why i would want to put compression into the file system.

A better question would be, why wouldn't you? It happens transparently and causes next to no additional overhead. But you can have that disabled in ZFS (or any other file system) if you really want to.

> Or deduplication.

Deduplication is disabled on ZFS by default. It's actually a pretty nichely used feature despite it's wider reporting.

> Or the possibility to put some data on SSD and other on HDD. If i have a server and user data, it should be up to the application to do this stuff. That should be more efficient, because the app actually knows how to handle the data correctly and efficiently. It would also be largely independent of the file storage.

I don't get your point here. ZFS doesn't behave any differently to ext in that regard. Unless you're talking about SSD cache disks, in which case that's something you have to explicitly set up.

> I've seen some cases where we had a storage system, pretending to do fancy stuff and fail on it. And debugging such things is a nightmare (talking about storage vendors here, though). But for example, a few years ago we had major performance problems because a ZFS mount was at about 90% of space. It was not easy to pinpoint that. After the fact it's clear, and nowadays googling it would probably give enough hints. But in the end I would very much like that my filesystem does not just slow down depending on how much filled it is. Or how much memory i have.

ZFS doesn't slow down if the storage pools are full; the problem you described there sounds more like fragmentation, and that affects all file systems. Also the performance of all file systems is also memory driven (and obviously storage access times). OS's cache files in RAM (this is why some memory reporting tools say Windows or Linux is using GB's RAM even when there's little or no open applications - because they don't exclude cached memory from used memory). This happens with ext4, ZFS, NTFS, xfs and even FAT32. Granted ZFS has a slightly different caching model to the Linux kernel's, but file caching is something that is free memory driven and applies to every file system. This is why file servers are usually speced with lots of RAM - even when running non-ZFS storage pools.

I appreciate that you said none of us are experts on file systems, but it sounds to me like you've based your judgement on a number of anecdotal assumptions; in that the problems you raised are either not part of ZFS's default config, or are limitations present in any and all file systems out there but you just happened upon them in ZFS.

> i am wondering if it is just too much. For example, nowadays you have a lot of those stateless redundant S3 compatible storage backends. Or use Cassandra, etc. Those already copy your data multiple times. Even if they run on ZFS, you don't gain much.

While that's true, you are now comparing Apples to oranges. But in any case, it's not best practice to be running a high performance database on top of ZFS (nor any of CoW file system). So in those instances ext4 or xfs would definitely be a better choice.

FYI, I also wouldn't recommend ZFS for small capacity / portable storage devices nor many real time appliances. But if file storage is your primary concern, then ZFS definitely has a number of advantages over ext4 and xfs which aren't over-complicated nor surplus toys (eg snapshots, CoW journalling, online scrubbing, checksums, datasets, etc).

yason · on July 23, 2015

> why the hell we need those ultracomplicated > can-do-everything filesystems?!

We don't but some people do. Or think they do, but that's the same. It's not like I'm losing something because I can always use the simpler filesystems myself. I went with ReiserFS for over a decade: never really had to think about my file systems which is what I want, after all. Once you find a good file system, you stop thinking about file systems.

For me, this means two things.

I want my file system to be safe and transactional: either my change gets in or it doesn't, but I don't want to find my file system in some degraded in-between state. Ever. I'm willing to pay for that with cpu time or i/o speed, that's like the first 90% of my requirements.

The second criteria is that it's generally lean and doesn't do anything stupid algorithmically. It should support big files, provide relatively fast directory lookups (so that 'find' will run fast), and have some decent way of packing files onto the disk that doesn't defragment the allocations too badly, and ideally do some book-keeping during idle i/o so that I don't really have to run a defragmenter, ever. But these are kind of secondary requirements that aren't worth anything unless the file system keeps my files uncorrupted and accessible first and foremost.

nickpsecurity · on July 23, 2015

Very reasonable requirements. Same here. I just want it to store files, retrieve files, have decent performance, and never screw up in a way that prevents recovery. I'd hope the baseline Is that so much to ask in 2015? ;)

StillBored · on July 24, 2015

As someone who has lost data on _EVERY_ single linux filesystem listed in this thread, I can say what I want out of a filesystem is code that hasn't changed for years. Once the "filesystem experts" move on to the latest code base, then I start to feel confident about the stability of the ones they left behind. As other said, what I want first out of a fileystem is "boring". I would much rather be restricted to small volumes/files/slow lookup times/etc, than discover a sectors worth of data missing in the middle of my file because the power was lost at the wrong moment 6 months ago.

nickpsecurity · on July 25, 2015

Making data smaller, slower, etc. doesn't solve the problem. Good design and implementation are what it takes. Wirth's work shows simplifying the interfaces, implementation, and so on can certainly help. Personally, I think the best approach is simple, object-based storage at the lower layer with complicated functionality executing at a higher layer through an interface to it. Further, for reliability, several copies on different disks with regular integrity checks to detect and mitigate issues that build up over time. There are more complex, clustered filesystems that do a a lot more than that to protect data. They can be built similarly.

The trick is making the data change, problem detection, and recovery mechanisms simple. Then, each feature is a function that leverages that in a way that's easier to analyze. The features themselves can be implemented in a way that makes their own analysis easier. So on and so forth. Standard practice in rigorous, software engineering. Not quite applied to filesystems...

ctb_mg · on July 24, 2015

What filesystems would you say provide you with high stability confidence?

vidarh · on July 23, 2015

> Just yesterday i had btrfs telling me that my disk is full (when in fact it is half full with nearly 400GB free!).

This is a major ongoing issue with Docker on Btrfs in particular. E.g. CoreOS switched to Overlayfs + Ext4 by default to reduce that problem.

nickpsecurity · on July 23, 2015

I'm with you on that. Obviously lol. There's two paths to getting the complex functionality without the problems: filesystem over object model; application layer over filesystem model.

In the 80's-90's, many systems aiming for better robustness or management realized filesystems were too complex. So, they instead implemented storage as objects that were written to disks. Many aspects important for security or reliability were done here on this simple representation. The filesystem was a less privileged component that translated the complexities of file access into procedure calls on the simpler, object storage. Apps that didn't need files per se might also call the object storage directly. Some designs put the object storage management directly on the disk with on-disk, cheap CPU. Supported integrated crypto, defrag, etc. NASD's [1] and IBM System/38 [2] are sophisticated examples in this category.

The other model was building complex filsystems on simpler ones. The clustered filesystems in supercomputing and distributed stores in cloud + academia are good examples of this. The underlying filesystem can be something simple, proven, and highly performing. Then, the more complex one is layered over one or more nodes to provide more while mostly focusing on high-level concerns. Google File System [3] and Sector [4] are good examples.

So, we can have the benefits of simple storage and complex filesystems with few of their problems. That there's many deployed in production should reduce skepticism that it sounds too good to be true. Now, we just need more efforts in these two categories to make things even better. Nice as ZFS and BTFS look, I'd rather they had just improved XFS in directions of these categories instead. Duplicated effort would've led to innovation instead on top of any innovations they produced.

[1] http://www.pdl.cmu.edu/PDL-FTP/NASD/CMU-CS-97-185.pdf

[2] http://homes.cs.washington.edu/~levy/capabook/index.html

[3] https://en.wikipedia.org/wiki/Google_File_System

[4] http://sector.sourceforge.net/

XorNot · on July 23, 2015

I use ZFS on every machine I directly control and could never consider going back.

Definitely lost way too many files to XFS zeroing things out on a crash.

nickpsecurity · on July 23, 2015

The things I read on it were nice. I'll give it that.