How ZFS continues to be better than Btrfs. (rudd-o.com)
122 points by res0nat0r on Aug 8, 2012 | 102 comments

What a non-article. ZFS is more mature than an immature and unstable filesystem? Really?

The article is also full of irrelevant details and downright faulty statements.

ZFS is more stable. Yeah, if it wasn't that'd be really bad. Keep in mind that when ZFS was new it wasn't particularly stable either, despite the arguments put forward - a file system must mature with time.

ZFS isn't for the typical enthusiast/home-user. Nothing makes that more apparent than the fact that ZFS doesn't have a fsck tool. Why doesn't ZFS need a fsck tool? Because every time you'd need it, you just restore your backups from tape instead (Should ZFS have a fsck tool? http://www.osnews.com/story/22423/Should_ZFS_Have_a_fsck_Too... )

> You can yank the power cord of your machine -- you will never lose a single byte of anything committed to the disk. ZFS is always consistent on disk, and never trusts any faulty data (that might have become damaged because of hardware issues).

Nope. Never mind that ZFS assumes that hard drives work to spec, which consumer drives do not (keep in mind that ZFS is designed to be able to run on consumer drives). Meaning that you cannot yank the power cord of your machine (if you care about your data).

ZFS truly is awesome, it makes all other easily available linux filesystems look really, really, really poor. Which truly sucks, if you use linux (honestly, I haven't checked the linux alternatives for ZFS but I have a really hard time believing that anyone should trust such an untested solution, just use a stable linux filesystem and wait, patiently, for btrfs).

So, along comes btrfs that mimics the feature-set of ZFS but brings along many advantages (much thanks to it being developed much later). And btrfs has a different target group. Btrfs is much better adapted for home use (it even has a fsck-tool!).

But btrfs isn't stable (and it shouldn't be trusted in the near future)... And I just can't fathom how you can write such a comparison between btrfs and ZFS today; it just doesn't make sense.

When btrfs is stable and mature things should get interesting.

Have you actually used ZFS?

* ZFS doesn't include fsck because it doesn't make sense when your FS is guaranteed to be consistent. fsck is for when writes go bad because the window is there for that to happen.

Your claim is like saying an electric car doesn't make sense because the consumer would be confused that it doesn't have a gas tank.

You don't back up ZFS filesystems to tape. I don't know of anyone doing that. It's a weird statement. If you want worst-case-scenario backups, you replicate.

Of course, with auto-snapshots set up, and the basic inability of ZFS to become corrupted (IME, never had an issue with v28), you wouldn't be more or less likely to do that as a consumer than you would be to buy an external HDD for your Mac and use TimeMachine.

The only reason it isn't for the typical user is that the typical user doesn't run Solaris or FreeBSD.

* "Never mind that ZFS assumes that hard drives work to spec"

I don't know where you got that. It's actually just the opposite. It checksums everything and includes several facilities to catch early corruption and address it before you ever lose data.
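To make the "checksums everything" point concrete, here is a minimal sketch of the general idea: the checksum lives in the pointer to a block rather than next to the block itself, so a block that comes back damaged cannot vouch for itself. Everything here (`BlockPointer`, `write_block`, `read_block`, the dict standing in for a disk, and SHA-256 as the hash) is a hypothetical illustration, not ZFS's actual code or on-disk format.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # SHA-256 is only an illustrative stand-in for the filesystem's checksum.
    return hashlib.sha256(data).digest()


class ChecksumError(IOError):
    """Raised when a block does not match the checksum held by its parent."""


class BlockPointer:
    """Hypothetical pointer: where the block lives and what it must hash to."""

    def __init__(self, address: int, expected: bytes):
        self.address = address
        self.expected = expected


def write_block(disk: dict, address: int, data: bytes) -> BlockPointer:
    # The checksum is returned to the caller (the parent), not stored with the data.
    disk[address] = data
    return BlockPointer(address, checksum(data))


def read_block(disk: dict, ptr: BlockPointer) -> bytes:
    # Verify on every read; never hand back data that fails verification.
    data = disk[ptr.address]
    if checksum(data) != ptr.expected:
        raise ChecksumError(f"block {ptr.address} failed verification")
    return data
```

With this layout, silent corruption on the "disk" shows up as an error on read instead of being returned as if it were good data.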

On top of that most ZFS arrays I've worked with are standard 1TB to 2TB SATA systems. Nothing particularly special about the drives. Early on that was one of the big selling points of ZFS. That it was built for commodity hardware.

I personally have pulled the power plug on 24 to 48 disk arrays, literally hundreds of times while under heavy load and have never once lost data.

Not that I haven't run into other issues a few years ago, but data loss due to sudden power loss wasn't ever one of them.

It's almost comically easy to lose data (again, IME) on an ext3/4 system under the same scenarios. That's comparing zero ZFS losses to maybe a dozen ext losses over the past 3 years.

> Have you actually used ZFS?

In the lab, yes. Oracle killed my plans of adopting it.

> I don't know where you got that. It's actually just the opposite. It checksums everything and includes several facilities to catch early corruption and address it before you ever lose data.

Yes, that is the idea. But it relies, as the article mentions, on ZFS using atomic writes and barriers. So how can you guarantee that? ZFS does it by writing data to the drive and making sure that the data has actually been written. How can ZFS be sure of that? It asks the drive. The problem is that consumer drives often lie and report that data has been written when it is still sitting only in the cache. And that's a great recipe for data corruption.

ZFS checksums are beyond awesome, but they don't combat this.
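A toy simulation of the failure mode, for illustration only. The `Drive` class, its `honors_flush` switch and the `commit` helper are all made up for this sketch; real drives and real flush commands are obviously more involved. The point is only that the caller gets the same "success" answer from both drives.

```python
class Drive:
    """Hypothetical drive with a volatile write cache in front of stable storage."""

    def __init__(self, honors_flush: bool):
        self.honors_flush = honors_flush
        self.cache = {}    # lost on power failure
        self.platter = {}  # survives power failure

    def write(self, addr: int, data: bytes) -> None:
        self.cache[addr] = data

    def flush(self) -> bool:
        # An honest drive moves cached data to stable storage before answering.
        if self.honors_flush:
            self.platter.update(self.cache)
            self.cache.clear()
        # A lying drive skips the work but reports success all the same.
        return True

    def power_loss(self) -> None:
        self.cache.clear()

    def read(self, addr: int):
        return self.cache.get(addr, self.platter.get(addr))


def commit(drive: Drive, addr: int, data: bytes) -> bool:
    # The filesystem's view: write, ask for a flush, trust the answer.
    drive.write(addr, data)
    return drive.flush()
```

After `commit` followed by `power_loss`, the honest drive still returns the data and the lying one returns nothing, even though both told the filesystem the write was safe.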

Jeff Bonwick explains it better than I could:

> As you note, some disks flat-out lie: you issue the synchronize-cache command, they say "got it, boss", yet the data is still not on stable storage. Why do they do this? Because "it performs better". Well, duh -- you can make stuff really fast if it doesn't have to be correct.

> Before I explain how ZFS can fix this, I need to get something off my chest: people who knowingly make such disks should be in federal prison. It is fraud to win benchmarks this way.


> I personally have pulled the power plug on 24 to 48 disk arrays, literally hundreds of times while under heavy load and have never once lost data.

I hope you either don't use consumer drives or really do your research.

The same problem has bitten people running ZFS in a virtual machine, such as under VMware ESX, where they couldn't trust the drive to behave as it was told, and corruption followed. Which kind of sucks when you don't have any means of repairing the damage.

Also, the replies I've gotten seem to imply that I have something against ZFS. I love ZFS but I see more potential in btrfs - but btrfs is many years away from even being worth considering an opponent to ZFS, so that is kind of moot. All I want is a decent copy-on-write Linux file system with cheap snapshots (and preferably good encryption support).

Recent versions of ZFS have solved the problem of drives ignoring the synchronize cache command.

Now ZFS always keeps around the latest 3 transaction groups, regardless of whether any data in any one of those transaction groups has already been freed or updated.

So if the last transaction group gets corrupted during an abrupt power down, it can always go back to the latest consistent one.
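Roughly, the fallback described above could look like the sketch below. It is an assumption-laden toy: `Pool`, `KEEP`, `commit` and `latest_consistent` are invented names, the retained count of 3 is taken from the comment above, and a SHA-256 over a byte string stands in for verifying a whole tree.

```python
import hashlib
from collections import deque

KEEP = 3  # number of recent transaction groups retained, per the comment above


def digest(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


class Pool:
    """Hypothetical pool that remembers the roots of its last KEEP transaction groups."""

    def __init__(self):
        # deque(maxlen=KEEP) silently drops the oldest root when a new one arrives.
        self.roots = deque(maxlen=KEEP)

    def commit(self, txg: int, payload: bytes) -> None:
        self.roots.append({"txg": txg, "payload": payload, "checksum": digest(payload)})

    def latest_consistent(self):
        # Walk from newest to oldest and take the first root that still verifies.
        for root in sorted(self.roots, key=lambda r: r["txg"], reverse=True):
            if digest(root["payload"]) == root["checksum"]:
                return root
        return None
```

If the newest root was torn by a power cut, `latest_consistent` falls back to the one before it, at the cost of losing whatever was only in the newest group.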

Does btrfs solve any of these issues? And if so, how?

By spending effort and providing tools for data recovery and file system repair (since btrfs is neither stable nor mature these leave much to be desired but the effort is clear and it is quite a contrast compared to ZFS).

The ZFS stance on this is that nothing can go wrong. If something does go wrong, which by the way is impossible, you'd better have your backups ready because you are on your own.

A good fact to keep in mind is that ZFS is meant to run on adequate hardware: ECC RAM (ZFS is more prone to corruption from random memory errors than most other filesystems; it assumes your RAM is reliable), drives that don't lie about writes, adequate CPU headroom. It was designed to go BIG - and when you go big, these things are a given. If you skimp on any of these, it doesn't look as good - but small-scale wasn't its target market. (And I'm not saying nobody should use ZFS on a small scale... but it was designed with some specific assumptions that are perfectly fair given its target market.)

If you are running a system at a scale where ZFS makes sense and not making backups of critical data, your operations process is fundamentally broken anyway. ZFS doesn't change the need for backups one bit (and it brings to the table some novel ways of doing offsite snapshots and whatnot, to boot).

> What a non-article. ZFS is more mature than an immature and unstable filesystem? Really?

Keep in mind that, with one exception, ZFS is pretty much rock-solid from day one.

Also, you should really read: http://www.c0t0d0s0.org/archives/6071-No,-ZFS-really-doesnt-need-a-fsck.html

> Keep in mind that, with one exception, ZFS is pretty much rock-solid from day one.

Citation needed.

And that article doesn't really address any of the problems that the osnews article brings up, it just tries to sidestep the issue by more or less saying that the same fsck tool that ext uses won't work on ZFS. Great.

ZFS was released in 2005, and appeared in Solaris 10 in 2006. Close enough?

More importantly, what's the problem with scrubbing instead of fsck?

> ZFS was released in 2005, and appeared in Solaris 10 in 2006. Close enough?

That was followed by people having issues with data corruption... ZFS is great, but it does break from time to time (whether it is better than other file systems is subjective when you consider the options for restoring it). A low failure rate is, for many, weak comfort when data is irrecoverable.

> More importantly, what's the problem with scrubbing instead of fsck?

Scrubbing repairs data, not broken file systems. Scrubbing is of course great but it isn't the answer for everything.

> ZFS is great, but it does break from time to time

Ah, it seems our definitions vary.

ZFS has had problems, but all production filesystems have. From a strict standpoint, what the original author wrote about bugs is wrong, but from a practical standpoint I don't think it matters. ZFS has been as solid as any other production filesystem from the beginning -- which is to say that any production filesystem might one day rear up and bite you. It's just the nature of our profession.

That's just my opinion though. :)

> Scrubbing repairs data, not broken file systems.

That depends on what's broken. ZFS has ditto blocks (multiple copies of the same block) going up the tree; roughly speaking, the higher up you are, the more there are. The tree and metadata have more protection than the data itself.

If, say, all the ditto blocks and all the mirrors are corrupt, reconstructing that is going to be hard. I think a case needs to be made that an fsck tool would do a better job repairing than scrubbing, which is non-obvious to me.
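As a rough illustration of why redundant copies plus a checksum let a read (or a scrub) repair things on its own, and why it gets hard once every copy is bad, here is a small sketch. `write_ditto` and `read_ditto` are hypothetical helpers, a Python list stands in for copies placed in different locations, and SHA-256 is just a placeholder hash.

```python
import hashlib


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def write_ditto(data: bytes, copies: int):
    # Store the same block several times; the caller (the parent) keeps the checksum.
    return [data for _ in range(copies)], digest(data)


def read_ditto(copies: list, expected: bytes):
    # Find any copy that still matches the checksum recorded by the parent.
    good = next((c for c in copies if digest(c) == expected), None)
    if good is None:
        # Every copy is damaged: nothing left to reconstruct from.
        return None
    # Self-heal: overwrite the damaged copies with the verified one.
    for i, c in enumerate(copies):
        if digest(c) != expected:
            copies[i] = good
    return good
```

One bad copy out of three is repaired transparently; with all three bad, the function can only report failure, which is the case where a separate repair tool would have to prove it can do better.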

> Citation needed

Maybe you should provide some citations yourself before asking others for citations.

"Keep in mind that when ZFS was new it wasn't particularly stable either"

"ZFS isn't for the typical enthusiast/home-user"

"Nope. Never mind that ZFS assumes that hard drives work to spec, which consumer drives do not"


> Keep in mind that, with one exception, ZFS is pretty much rock-solid from day one.

When we used it in production about 5 years ago it was not so rock-solid. We had to bring in Sun-Support and went through some nasty downtimes for lengthy resilvering.

As an enthusiast/home-user I use ZFS, and it has saved my data in situations where no Linux filesystem I've tried would. XFS doesn't have an fsck either; more to the point, nor does btrfs, so you can't really give that as a reason why btrfs is more suitable for home users.

> XFS doesn't have an fsck either

Oops, actually back in 1995 when XFS was made available, it didn't have any fsck because it supposedly didn't need it. However, they finally made "xfs_repair" available one year later, which, while not called "fsck" proper, is still the same thing.

BTW one thing that xfs_repair did for me was getting back most of the data from an xlv array with a failed drive. Isn't it awesome?

Did I somehow portray that I didn't like ZFS in my post, that it wasn't mature? Or that I feel that btrfs is suitable for home users as it is today?

You said "ZFS isn't for the typical enthusiast/home-user". I don't think that's true (or rather, while there are problems with using ZFS as such a user, the problems with all the alternatives are worse); I would advise such users to use ZFS.

> So, along comes btrfs that mimics the feature-set of ZFS but brings along many advantages (much thanks to it being developed much later).

Just like SystemTap mimics the feature set of DTrace and brings along many advantages due to its later development: http://dtrace.org/blogs/ahl/2007/08/02/dtrace-knockoffs/

Did you even read the article?

Sorry but I can't come up with any response other than: did you even read my post?

It actually is a decent read: Should ZFS have a fsck tool? http://www.osnews.com/story/22423/Should_ZFS_Have_a_fsck_Too...

I definitely agree that ZFS is fantastic, the best thing since sliced bread and better than anything else ever to exist.

However, I can only run btrfs on Ubuntu natively. Therefore, to me, btrfs is better than ZFS.

Exactly. ZFS was deliberately released with a license designed to keep it out of the kernel. It's unavailable to the Linux community, so people use what they have and work on replacements. The controversy isn't technical, it's legal. And it's not resolvable by feature lists and fanboi flaming.

ZFS is great, I'm sure. But btrfs is great too and rapidly improving. I don't think the Linux world has much to worry about.

> Exactly. ZFS was deliberately released with a license designed to keep it out of the kernel.

Right, it's a big conspiracy.

No, this is mostly inarguable, and multiple Sun employees attested to it at the time (I remember a bunch of discussions on freenode specifically, but I'm sure someone can find a web link for a better cite). The fear with the OpenSolaris release among the executives was that any code released would instantly be imported into Linux and destroy the perceived competitive advantage Sun had in the enterprise space. So the CDDL was crafted to be almost entirely identical to the LGPLv2 from a practical perspective (the only major distinction being whether "files" are considered independently for protection vs. the work as a whole) yet still be GPL-incompatible.

Obviously this also insulated Solaris from any of the drivers in the Linux tree, so 4.5 years later it still doesn't run on anything but custom designed hardware and a handful of vanilla x86 server configurations.

The CDDL was a great harm to the broader community. Like I said I don't see that there's much argument to be made there.

(edit rather than response to avoid prolonging a discussion: obviously Bryan was there and I wasn't, but it should be pointed out that not everyone who was there agrees with that take, nor are our points exclusive. The Wikipedia page on the CDDL has a reference ([6]) which jibes with my understanding at the time: http://en.wikipedia.org/wiki/Common_Development_and_Distribu... )

It's frustrating that this idea persists. Yes, the GPL was explicitly rejected, but no, it wasn't because of fear of Linux compatibility. It is true that we refused to dual-license (we didn't want to create a license-based fork), but the reason the GPLv2 was rejected is actually very simple: the strong copy-left left way too much ambiguity for our IHV partners. In particular, we wanted to allow proprietary, closed-source drivers to be shipped for OpenSolaris without a Linux-esque "taint" of the system. We also wanted distros to be created that had entirely proprietary components -- including the binaries that constituted elements of the system that we could not ourselves open source due to third party restrictions.

So yes, we rejected GPLv2 -- but it was not because we were afraid of becoming an organ donor to Linux, but rather because it would have overly restricted the freedoms of our community. In this regard, we were forward looking: it is now broadly accepted that the GPLv2 is an anti-collaborative license[1], explaining its acute decline for new work.

[1] http://dtrace.org/blogs/bmc/2012/08/01/post-revolutionary-op...

>it is now broadly accepted that the GPLv2 is an anti-collaborative license[1],

Broadly accepted where? You certainly draw grand conclusions from head-bobbing at a talk you made.

It's the licence used for, I dare say, the largest collaborative open source project in the world: Linux. GPL is the most widely used open source licence, used in tons of collaborative projects. Off the top of my head: gcc, git, mercurial, qemu, ffmpeg, x264, blender, gimp, inkscape, mplayer, emacs, etc. are all examples of other collaboratively developed GPL-licenced projects.

And how exactly would it be 'anti-collaborative'? If anything it's a great licence for 'collaborative development' as all participants are legally bound by the licence to release their changes in source form when they distribute.

> GPLv2 is an anti-collaborative license

I see quite the contrary.

Any license that allows you to release your derived work as proprietary and closed-source decreases the amount of collaboration, because now nobody else can collaborate on your proprietary fork. One of the reasons for, say, IBM not to give part of AIX to FreeBSD is that they fear, justifiably, that HP may take their contribution and incorporate it into HP-UX, giving it an advantage over IBM's proprietary product. If IBM incorporates some übercool part of AIX into Linux, HP cannot use that to benefit HP-UX. Fear of becoming an organ donor is lessened because the receiver can't run away with your liver.

Do you recall why other popular licenses at the time (apache-2.0, MIT, BSD(new), etc) were rejected?

None of those are copyleft.

Ha. Good point.

If those were the concerns, it could have been dual licensed.

Or used an Apache-style license.

If you are the author you are free to license your work under as many licenses as you want, and not bound by the GPL at all - you inherently have that right.

The rest - agreed.

> The CDDL was a great harm to the broader community

In the same way as GPL did, no doubt. FreeBSD is chock full of CDDL tech, but can't use GPL. Illumos can use *BSD code too, but not GPL.

This is back to the BSD vs GPL debate. That's just, like, your opinion, man.

Sun drops some really nice tech out in the open, which is the only reason Illumos can exist (and is now firing on all cylinders). FreeBSD nabs many of the good bits to good effect, as does OSX. And all some people do is complain. Bizarre.

Obviously, yes, this is a long-standing flame war. But your analogy is wrong. FreeBSD can use GPL code, legally. They just can't release the combined work under anything but the GPL, so they choose not to. The CDDL is itself a copyleft license; it's simply not possible to deliver a license to code that derives from both CDDL and GPL components.

Obviously it was better for Sun to release OpenSolaris under the CDDL than not. But it was a very poor choice, and again in my opinion a great harm to the community.

And frankly I don't see the Linux people complaining about anything. The linked article was written by a ZFS proponent...

> FreeBSD can use GPL code, legally. They just can't release the combined work under anything but the GPL, so they choose not to.

I think the distinction is academic. In practice FreeBSD can use CDDL code, but not GPL.

> And frankly I don't see the Linux people complaining about anything.

Every time some piece of Sun tech comes up on a nerd site or aggregator, a licensing flamewar erupts. No matter how cool or useful that piece of tech is, some twit beats the CDDL vs GPL horse once again. Typically this argument then dominates the discussion. Sometimes it's the only comment. Every time.

It's the ultimate manifestation of self-entitled bike shedding, because the doers have more interesting things to do. What I find repulsive is that the doers have made cool blue Ferraris for free, and we have giftzwergs complaining that they aren't cool red Ferraris, which is demotivating for the people creating all this. How would you feel if you created something neat, gave it away for free, and were thanked with a wall of people complaining about your choice of OSS license?

I apologize if I come off unfriendly. It's just that I'm interested in the actual tech, and this noise is really growing old.

What you're saying doesn't make any sense. The linked article wants Linux people to use ZFS instead of btrfs. How on earth is it "self-entitled bike shedding" to point out the clearly correct reason for ZFS not being in Linux?

I'm sorry you're interested in "actual tech". But in the real world stuff like the legal ability to use software sometimes gets in the way of our geeky aspirations.

This may be one of the few discussions where the licensing discussion is apropos, so consider my prior rant as something broader.

But it doesn't help when "great harm" hyperbole gets thrown around. The world has benefited from the gift, unless anyone is daft enough to argue that Illumos' existence, FreeBSD's incorporation of ZFS and DTrace, and OSX's incorporation of DTrace, are a bad thing. Hell, DTrace is even being used for PS3 game production.

> FreeBSD can use GPL code, legally. They just can't release the combined work under anything but the GPL, so they choose not to.

Linux can do the same just as easily; they "just" have to relicense the kernel under BSD.

Please stop trolling.

No, no, no. This is flatly wrong (so I'll let you guess who's the one trolling). BSD doesn't need to "relicense" anything as the license to the existing code already permits combining with the GPL; a putative FreeBSD kernel with a GPL driver (i.e. a combined work) would need to be distributed under the GPL, but the rest of the code would remain unencumbered. None of the existing copyright holders would need to take any action at all, because the distribution would be within the bounds of the license they already granted.

There is simply no equivalence with the GPL vs. CDDL. Neither license permits redistribution at all when combined with the other. The only way to do this would be to, as you say, "relicense" the kernel by getting every copyright holder (there are tens of thousands by now) to redistribute their code under the CDDL.

> Obviously this also insulated Solaris from any of the drivers in the Linux tree, so 4.5 years later it still doesn't run on anything but custom designed hardware and a handful of vanilla x86 server configurations.

I used to work on this stuff. I used ThinkPads and Dell laptops bought from the store, random white boxes I had lying around, and a particular Dell workstation. Custom designed hardware? Oh, and btw, my ThinkPad had a wireless card that was supported by Solaris. Never gave it much thought, except when I switched jobs and installed Linux, which at the time didn't have the particular driver...

It does not require custom hardware, it runs on any server.

Pfft. What thinkpad did you have with an unsupported wireless card? I find that hard to believe in any case, but especially hard to believe since you had the thinkpad for more than a week and therefore it was not something brand new. I've never had any linux support issues with my multiple thinkpads.

> 4.5 years later it still doesn't run on anything but custom designed hardware and a handful of vanilla x86 server configurations

Oddly enough, I have a laptop that runs OpenSolaris better than it does Ubuntu. 12.04 wouldn't even find the ethernet controller...

> Right, it's a big conspiracy.

It's not a conspiracy, it was deliberately released with a license designed to keep it out of the Linux kernel.

For what it's worth, I've had very good luck with zfs-on-linux on my 12.04 system. Multiple pools, largest is 20TB. Zero problems after a year of use. Switched from mdadm/xfs due to major data loss from corruption, and I don't plan on looking back.


That's very helpful, thank you. I'll give it a shot, although I remember this being a few versions behind the BSD version?

I have a NAS that uses mdadm now (it did save me when one disk went completely bust), but RAIDZ sounds even better.

Wikipedia has a decent comparison here: http://en.wikipedia.org/wiki/ZFS#Comparisons. It looks like both the BSD and native Linux ports are on version 28.

I've been using zfs on linux recently. http://zfsonlinux.org/ I'm not using it for the file system though.

It was painless to install and so far has been reliable. I'm not yet trusting it for primary storage though.

As good a reason as any to switch to FreeBSD. ZFS works great.

You can easily compile ZFS support into your kernel.

Regardless of how good ZFS is, if it can't be shipped native on Linux nobody will care. Much. Though it is a neat file system. And I would love to see it in the Kernel.

Unfortunately it's now controlled by Oracle, so you're probably lucky that you can use it without hiring a 'storage consultant' for $100k+ a year.

Your first statement is unfortunately true, ZFS won't get much adoption in Linux if it's not in the kernel.

However, your second statement is blatantly false. Much of ZFS development happens in the open in the illumos and FreeBSD communities. Joyent and Nexenta base their business on ZFS and pay people to work on the open source illumos.

(Disclaimer, I used to work for Nexenta).

Those of us with data that we want to secure will care, and will run ZFS. Maybe not on Linux but on top of FreeBSD or OpenIndiana (in my case).

Sorry, but Linux isn't the end-all, be-all open source operating system, and is the last thing I will put into a production system where I care about my data and/or stability.

I installed an OpenIndiana ZFS system for my home just a few days ago. And I don't feel alone; now my colleagues will probably also seriously consider it, because I can't stop singing its praises.

Best install I have done for some time.

meh the fan-articles are so hard to read. "fact!" "unlikely!" "using these facts!" "maybe its bad but at least zfs has it!"

well dude, here's a fact: http://lwn.net/Articles/506244/ have you even used btrfs lately? because that collides with some of your "facts"

so sure, zfs has advantages over btrfs and is arguably better right now, but the latter is still experimental. the main reason for btrfs is the proper licensing for the Linux kernel, anyway - else we'd all be hacking on zfs since day one and there wouldn't even have been a need for btrfs.

Also btrfs has been receiving substantial upgrades every kernel revision. Already it's worlds better than it was in the 2.6.x series of kernels, and they are finally getting around to tuning everything for performance. The truth is some of those "facts" are true, but most of them are pointless.

Jeff Bonwick [1], the lead of ZFS development [2], seems to have disappeared after leaving Oracle in late 2010. I wonder what he is doing nowadays.

[1] https://en.wikipedia.org/wiki/Jeff_Bonwick

[2] https://en.wikipedia.org/wiki/Zfs

Though Jeff is indisputably the father of ZFS, the ZFS team was (and is) much larger than Jeff. Many of the ZFS team (including its co-inventor, Matt Ahrens) are now at Delphix [1], and continue to be very active in the ZFS community[2] via illumos, its repository of record[3].

[1] http://www.youtube.com/watch?v=-zRN7XLCRhc#t=46m45s

[2] http://blog.delphix.com/matt/2011/11/01/zfs-10-year-annivers...

[3] http://blog.delphix.com/matt/2012/07/11/performance-of-zfs-d...

Bryan, what's your opinion re: the stability of the LLNL port of ZFS on Linux? The 0.6.0 release is apparently just around the corner and several high-profile people in the community, Russell Coker for one [1], have started using this on production systems. I'd love to make the move to ZFS as well but am still afraid of data loss. I started using XFS on Linux in 2002 and my gut feeling is that ZFS on Linux is now at or around a similar point of maturity. If you could share your opinion I'd be really grateful. Thanks so much.

[1] http://etbe.coker.com.au/2012/07/31/zfs-debian-wheezy/

I haven't used the LLNL port myself, but I would be shocked if it were not rock-solid. First, ZFS (unlike, say, DTrace) has reasonably limited dependencies on broader system implementation; you don't have to port other subsystems to get ZFS working. Second, even where it does have external dependencies, it can operate remarkably well when they're not functioning: because of its indirect checksums, ZFS can operate correctly (or at least, non-fatally) in the presence of nearly byzantine behavior from the I/O subsystem. Third, of the ZFS issues I've seen and helped debug over the years (and my data has been on ZFS as long as just about anyone's), none have manifested themselves as data corruption or (in the absence of physical failure that exceeded the redundancy of the pool) data loss. Finally, of these issues over the years, virtually all were fixed inside of ZFS itself -- there was no platform specificity to either the problem or the fix. (The exceptions being platform-level I/O issues that resulted in pathological performance -- but it's hard to call those ZFS issues.)

tl;dr: absent glaring port issues, ZFS on Linux is or should be at maturity of ZFS itself -- which is to say, very mature.

Bryan - if only a few enthusiast users can use Solaris derivatives on their hardware, do you not think ZFS is only relevant for the Enterprise folks? Are any ex-Sun/Oracle people putting in effort toward more x86 hardware compatibility?

To me, ZFS and Dtrace are of no consequence if I can't use them on my own hardware. And I am sure there are many more people like me.

Virtually all of the work that we do in the illumos community is on x86 -- and not merely keeping it functional, but actively advancing the state of the art. For example, we at Joyent ported KVM to illumos last year[1], and since that time we have deployed many thousands of virtual (x86) machines into production on (x86) hardware running SmartOS, our illumos derivative.[2]

[1] http://lwn.net/Articles/459754/

[2] http://smartos.org/

I keep tabs on those, but I was talking more about real desktop/laptop/workstation-class x86 hardware, not VMs.

I have tried and given up running Solaris derivatives on bare metal, even for running just the command line. The hardware/peripheral support just doesn't seem to be there.

There is a simple way to guarantee that an Illumos distro works, and that's to use Intel for everything: chip, motherboard, nics, you name it.

Other setups can work, but you need to be very aware of this: http://illumos.org/hcl/

Doesn't FreeBSD include ZFS and Dtrace, and run quite well on x86 hardware?

>> To me, ZFS and Dtrace are of no consequence

>> there are many more people like me

Recently I was wondering how there can be so many tech startups around just now. Nothing under the sun is new, so how can they make a living?

I've realised this attitude (which there is absolutely nothing wrong with) is the reason. The established players leave gaps for their own reasons and the little guy comes along and seizes the opportunity.

Skipping over cheaply accessible best in class technology, at a philosophical level at least, seems akin to pointing a 12 gauge at your toes.

I run OpenIndiana on a variety of different x86 machines, both AMD and Intel, without issues. So far compatibility hasn't been an issue that I've run into, and the operating system and everything around it is rock solid.

Much better than the Linux machines that were replaced.

ZFS has a lot of cool features. Granted. But the question is whether or not you need those features, and are they worth the cost that they incur. Especially in the days of VM's, memory is often at a huge premium --- consider how much you have to pay for extra memory. This is not surprising, because if you try to pack a couple of dozen VM's on a single server, memory very often ends up being the critically short resource (servers have only so many DIMM slots, and high density memory modules are expensive).

ZFS might be great if you're running a file server where you can pack many gigabytes of memory into a server and not use it for anything else. But if you are trying to run a cloud or VM server, where you are trying to pack lots of jobs on a single machine, the cost/benefit ratio of a copy-on-write file system (whether it is btrfs or ZFS) may not be worth it.

Sure, give up deduplication, snapshots, raidz, self healing and more for a fuzzy cost/benefit ratio that is based on no data.

Take a look at SmartOS which is based on Illumos. Has KVM and is used extensively by Joyent themselves.

Sorry, but having your underlying storage be rock solid is absolutely crucial for VMs.

ZFS already has great support on Solaris and FreeBSD. If you're running Linux, check out http://zfsonlinux.org/ - native kernel support that avoids the GPL incompatibility of Sun's ZFS implementation (CDDL license).

They don't avoid it; it's still CDDL, and it's still incompatible. This means that it can't be shipped within the kernel itself, so you have to build a module (which is marked as tainted) for your particular kernel. OpenAFS has this same issue BTW.


I think it might be more sane to run GNU user space on Solaris (e.g. Nexenta, OpenIndiana) if you want ZFS that badly.

There is also SmartOS as well.

As an outsider to the vitriol between ZFS and the Linux filesystem community, why is there so much aggression here? People don't seem to be screaming up and down about NTFS. If you are operating at a scale where this management of data matters, aren't you doing your own analysis and proving?

When ZFS was released, Sun made some strong statements about Linux filesystems, particularly regarding data integrity and ease of management. As a Linux sysadmin with a fairly large amount of data, I thought those were completely warranted, but unfortunately some people took them as attacks rather than considering the technical merits. Toss in yet another licensing flamewar, and the more vocal parts of the Linux community wasted a few years pretending the gap would go away if they ignored it.

This is the same reason why Linux doesn't have a viable DTrace competitor: the “what's wrong with kdb?” attitude started with Linus and it took way too long for the near-unanimous sysadmin voice to be heard.

Claiming RAIDZ as an advantage is... weird. At least on a recent Solaris 10, it has absolutely terrible read/write performance.

It depends on your use case. In the world of preservation, ZFS is a godsend. If you're trying to run a database, copy-on-write is going to defeat you. Not that it isn't possible, but it's surely something you're going to have to work against.

Fortunately, there are types of vdevs other than RAIDZ, and using SSDs for the ZIL can improve things as well.


RAIDZ is about cheaper redundancy than mirroring; that's its optimization constraint, and by altering the ratio of data drives to parity drives you can adjust to your risk taste. Since all drives need to be written and read for every read and write, the array will only be as fast as its slowest disk. But for large files, you still get the benefit of parallelism in drive bandwidth (rather than losing out from longer seeks). I see hefty throughput in video files on my home NAS, for example: over 400 MB/sec locally; in practice it's limited by the gigabit Ethernet on my network.
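To put rough numbers on the trade-off described above, here is a minimal sketch comparing raidz parity levels with two-way mirrors, plus the raw ceiling of a gigabit link. The function names are hypothetical, and the six 2 TB drives are only an illustrative configuration. The figures ignore metadata, padding and reserved space, so a real pool will report less usable space.

```python
def raidz_layout(drives, parity, drive_tb):
    """Approximate figures for one raidz vdev.

    Assumption: ignores metadata, padding and reserved space.
    """
    if parity not in (1, 2, 3):
        raise ValueError("raidz offers 1, 2 or 3 parity drives")
    if drives <= parity:
        raise ValueError("need more drives than parity drives")
    data_drives = drives - parity
    return {
        "usable_tb": data_drives * drive_tb,
        "efficiency": data_drives / drives,
        "failures_tolerated": parity,
    }


def mirror_layout(drives, drive_tb):
    """Same figures for a pool of two-way mirrors, for comparison."""
    if drives < 2 or drives % 2:
        raise ValueError("two-way mirrors need an even number of drives")
    return {
        "usable_tb": (drives // 2) * drive_tb,
        "efficiency": 0.5,
        # Guaranteed figure: a second failure in the same mirror loses the pool.
        "failures_tolerated": 1,
    }


def gigabit_ceiling_mb_per_s(link_bits_per_s=1_000_000_000):
    """Raw line rate in MB/s (10^6 bytes), before any protocol overhead."""
    return link_bits_per_s / 8 / 1_000_000


if __name__ == "__main__":
    for parity in (1, 2, 3):
        print("raidz%d" % parity, raidz_layout(6, parity, 2.0))
    print("mirrors", mirror_layout(6, 2.0))
    print("gigabit ceiling MB/s:", gigabit_ceiling_mb_per_s())
```

With these simplifications, each extra parity drive trades one drive of capacity for one more tolerated failure, and a gigabit link tops out at 125 MB/s before overhead, well under the local throughput quoted above.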

most points are valid (but known).

... but some are strange. why should a filesystem implement a cifs export? ok, zfs does it. but why should it?

It's not about who implements an SMB server (it's a separate layered filesystem, smbfs, on Solaris, not ZFS itself) but about the convenience of doing only:

  # zfs set sharesmb=on export/home

and configuring access just by setting (with chmod) the Windows compatible ACLs on the ZFS filesystem itself. Authentication can also work as expected in a Windows shop. Doing all this in Samba is a PITA.

Has anyone tried to use Samba4's Active Directory implementation with Illumos ZFS SMB support (ie, without Samba's SMB implementation) ?

No, why would they? It's not needed, a Solaris server can join a Windows Active Directory domain.

I think zdw is asking if Samba can be the Active Directory controller while using the ZFS SMB stack.

Exactly - if it worked, it would be a solid Windows server replacement for storage focused deployments.

win compatible acls are incompatible with posix acls. that's hard to argue away.

ZFS has NFSv4-style ACLs, which are modeled on NT ACLs, not POSIX ACLs.

The file system doesn't implement it, the commands simply let you turn it on or off. CIFS is built into the kernel much like NFS on Solaris/OpenIndiana.

but why should it care about it? it's a whole different stack...

HAMMER2 ( http://apollo.backplane.com/DFlyMisc/hammer2.txt ) is far more interesting to me than sitting around waiting for someone to care enough about Btrfs to finish polishing it until it's ready. The world is ready for a post-ZFS file system, it's just a matter of who delivers first and with an acceptable license. Btrfs hasn't at all capitalized on the significant interest surrounding it, which is a shame.

Of course it's better. It's a filesystem that has had years to mature, while Btrfs is _still_ in development. There isn't even a comparison.

btrfs cannot show you how much disk space is being referred to by each subvolume, as its used and free space tracking is only per pool. The only way to see how much space a particular subvolume is taking is to use du, which gets slower the bigger the subvolume gets.
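As a rough illustration of why a du-style measurement slows down as a subvolume grows: the tool has to visit and stat every entry, so the work scales with the number of files rather than being a single counter lookup. The sketch below is a simplified stand-in, not du's actual implementation. The name `walk_usage` is hypothetical, `st_blocks` is Unix-only, and unlike du it does not deduplicate hard links.

```python
import os


def walk_usage(root):
    """Return (allocated_bytes, entries_visited) for everything under root.

    Every entry costs one lstat() call, so run time grows with the
    number of files and directories in the tree.
    """
    total = os.lstat(root).st_blocks * 512  # st_blocks is in 512-byte units
    visited = 1
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            st = os.lstat(os.path.join(dirpath, name))
            total += st.st_blocks * 512
            visited += 1
    return total, visited


if __name__ == "__main__":
    used, count = walk_usage(".")
    print("%d bytes allocated across %d entries" % (used, count))
```

Per-dataset accounting of the kind ZFS keeps avoids this walk, because the totals are maintained as data is written.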

This is no longer true. As of Linux 3.6, btrfs has the ability to apply quotas to subvolumes.

A lot of these seem like btrfs user-interface problems. Is the actual filesystem incapable of doing all of these?

To steal terminology from git, I'd say that btrfs has a lot of the same plumbing as ZFS, but hardly any of the porcelain. I think the lack of striped raid is the killer for a lot of people, in spite of the hate RAID5 seems to have gained lately.

I went to a half-day Suse brunch event 2 months ago and got myself an eval copy of SLES v11sp2, just to check out btrfs. I tried to be mindful of the "it's not ZFS" attitude, and gave it a lot of leeway. Without going into the maturity of the code base (I have no idea how stable it is), I found the capabilities and commands a bit cumbersome and counter-intuitive, and the FS as a whole just plain lacking. The article summed it up much more succinctly than I ever could.

I've been using ZFS on my primary workstation on a daily basis since the early beta days of FreeBSD 8.0 (Jun '09, I believe), and it has been a pure joy to use. ZFS v28 (the last released version from Sun/OpenSolaris, I think) found on FreeBSD 9.x is rock-solid and has reasonable performance (as compared to the speed and memory problems seen in earlier releases).

I've put the filesystem through its paces, doing things that are rightfully called foolhardy, and I have never lost data. I've painted myself into a corner a few times (I curse the fact that you cannot remove hardware from the pool, as with Linux or AIX LVM), but never have I lost data when redundancy was in use. I have had bad sectors crop up on a single-disk zfs pool (single copy of data, should have known better), and a "scrub" let me know exactly which files had uncorrectable problems. Heck, I routinely flip the switch on my PSU when I'm too lazy to walk around my desk to do a proper shutdown, and it comes right back at next boot, no lengthy fsck, no data loss. I usually scrub monthly to check for errors, and otherwise I don't do much maintenance.

Right now, I have six 2TB Seagate "green" drives in a raidz2 array (RAID6-like, with 2 parity disks). One disk has been removed due to errors, so I still have a spare disk to tide me over until I can afford a replacement. It runs very well.

I wish native ZFS crypto was available. Right now, each physical device is fully-encrypted with FreeBSD's "geli" (customer data, tax records, etc.), so ZFS doesn't even really have full domain over the physical devices, which makes it even more impressive that I've never lost data.

I will say that enabling compression and/or dedup will cause performance issues. I've got an 8-core AMD processor (XF-1850 -- hardware AES support, yay!) w/ 16GB of DDR3, and I get interactive desktop lag when heavy I/O is hitting compressed and/or de-duped data. I don't know if this is a FreeBSD scheduling issue, or inherent with ZFS (any Solaris users want to comment?). Since it was mostly just to test (not like I need the space savings), I just do without.

In any case, I hope Linux's ZFS native kernel port stabilizes. btrfs just isn't there yet.

I did lose data from a ZFS raidz pool once. The cause turned out to be a slowly failing power supply. (An expensive, high-end OCZ unit, no less, with a 5-year warranty. OCZ replaced it without argument. Had it been a lesser PSU I might have suspected it sooner... oh well.)

Of course no filesystem could have survived that. But I still could really have used a salvager (fsck). I actually recovered a couple of critical files by dd-ing blocks off the disks.

FYI, I had a big array of green disks that ended up going out with ZFS, luckily not all at once. I read online that Seagate said they were not designed for use in RAID so they didn't cover it under warranty. You should replace those two disks with the Seagate Black series.

Yeah, it's in the plan. I will probably phase out disks as they fail with WD blue or blacks.

I suspect a lot of people who nitpick ZFS do not have much practical experience using ZFS. My home entertainment center has been running ZFS since the OpenSolaris era, then switched to FreeBSD, and so far not a _single_ loss of data. Btrfs may be a fine fs, but I don't see any pressing need at my end to switch even if Btrfs is ready. Linux is absolutely a wonderful OS, but FreeBSD meets my requirements. Frankly, my server runs a minimal set of services and I don't run applications from there. Even though Linux has many more packages than FreeBSD, I don't need those. Thank you very much.

Does any of this matter with ZFS's license issues?

Isn't it absurd that someone would consider ignoring all of those points about technical details and user experience of free software because of license issues with other "free" software?

As a member of the illumos community, I long ago turned the question around: given ZFS, DTrace, Zones, and the rest, all available in a free (as in speech) operating system, why would I use a different system that prohibits incorporating all this valuable free software?

> Isn't it absurd that someone would consider ignoring all of those points about technical details and user experience of free software because of license issues with other "free" software?

It depends on perspective. I think it's ridiculous that someone would even put out a whole distro that's free and useful, and I am really grateful for that.

> why would I use a different system that prohibits incorporating all this valuable free software?

In my case it's simple: I don't have the time or energy to put out a distro that conforms to everything that I want.
