
OpenZFS launch - mahrens
http://www.open-zfs.org/wiki/Announcement
======
cs702
I suspect an unmentioned goal of this project is eventually to make the
installation of OpenZFS on Linux (and other *ix operating systems) quick and
simple, legally sidestepping Oracle's license restrictions.[1]

Unsurprisingly, the list of supporting companies[2] does not include Oracle,
which surely isn't happy about this project.

--

[1] The source code upon which OpenZFS is based was provided by Oracle (Sun)
under the CDDL license, which prevents OpenZFS from being distributed in
binary form as part of the Linux kernel.

[2] [http://www.open-zfs.org/wiki/Companies](http://www.open-zfs.org/wiki/Companies)

~~~
jdboyd
Installing ZFS on Linux is quite simple:

    sudo apt-get install linux-headers-`uname -r` linux-headers-generic build-essential
    sudo apt-add-repository ppa:zfs-native/stable
    sudo apt-get update
    sudo apt-get install ubuntu-zfs

The goal of OpenZFS is to foster cooperation between the various groups using
ZFS.

~~~
nailer
That takes care of post install.

- How do I install onto ZFS?

- How do I maintain this? Since ZFS is outside the mainline kernel, how do I
know whoever compiles the ZFS packages will keep track of (distribution)'s
kernel package?

~~~
jdboyd
Good questions. Booting Linux from ZFS is a bit more of an advanced project.
Here is the guide for Ubuntu: [https://github.com/zfsonlinux/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem](https://github.com/zfsonlinux/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem)

The ZFS packages are not fully pre-compiled. They use the DKMS system, where
the kernel-specific parts are recompiled when the kernel changes. This greatly
reduces the maintenance work and is a system that has been in use by other
out-of-tree kernel modules for a decade. It's not perfect, but it largely
reduces the distribution-specific maintenance burden.

Of course, when a new kernel version comes out, ZFS may have to make
adjustments to support it and ZFS may lag a little bit. However, currently
there are significant resources being dedicated to ZFS on Linux development
and packaging and for the last 18 months they've kept up pretty well.

It isn't perfect. I wish that ZFS could be in the Linux source tree. While I
knew ZFS on Linux was available, I wasn't ready to try it until last year.
However, third-party file systems do have a long tradition (AFS, vxfs, and
numerous SAN file systems, for instance), and ZFS on Linux seems to be doing
very well.

------
fsckin
I built a 16TB raidz home office server this weekend using Ubuntu and ZFS on
Linux[0]. It worked great out of the box. I was even able to import a pool
created on another server without any problem. Of course, your mileage may
vary.

I was using FreeNAS previously (mainly for the ZFS support to keep my data
safe and not spend a bunch on raid controllers) and kept getting bogged down
by feeling the need to grok jails. I think jails are terrific in theory, but a
pain to work with if you're not intimately familiar with them. Maybe it's just
the way it works on FreeNAS, but newly created jails (by default on FreeNAS)
were getting new virtual IP addresses, which really threw me for a loop. Add
to that the frustration of trying to get all the permissions correct just to
make a few different services work together, and it started to get really
painful.

The drop-dead simplicity of setting up exactly what I had previously on a
fresh Ubuntu box with native ZFS port _really_ warmed my cockles.

[0] [http://zfsonlinux.org/](http://zfsonlinux.org/)

~~~
mikevm
What's the advantage of using ZFS RAIDZ over mdadm? I thought that mdadm was
more flexible in growing your RAID array.

~~~
IgorPartola
I have been doing lots of research on this recently and here is the main thing
that makes ZFS win every time:

When you have a RAID of any kind you need to periodically scrub it, meaning
compare data on each drive byte by byte to all other drives (let's assume we
are talking just about mirroring). So if you have two drives in an mdadm array
and the scrubbing process finds that a block differs from drive A to drive B,
and neither drive reports an error, then the scrubber simply takes the block
from the highest numbered drive and makes that the correct data, copying it to
the other drive. What's worse is that even if you use 3 or more drives, Linux
software RAID does the same thing, despite having more info available. On the
other hand, ZFS does the scrubbing by checksums, so it knows which drive has
the correct copy of the block.
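
To make the difference concrete, here's a minimal, purely illustrative Python sketch (the function names and the use of SHA-256 are my own; real scrub logic lives in the kernel, and ZFS has its own checksum algorithms):

```python
import hashlib

def scrub_mirror_naive(block_a: bytes, block_b: bytes) -> bytes:
    # mdadm-mirror-style scrub: no checksums, so when the two copies
    # disagree and neither drive reports an I/O error, one copy is
    # picked by a fixed rule and overwrites the other -- possibly
    # propagating the corrupt copy.
    return block_a

def scrub_mirror_checksummed(block_a: bytes, block_b: bytes,
                             expected_sum: bytes) -> bytes:
    # ZFS-style scrub: the checksum of each block is stored separately
    # (in the parent block), so the scrubber knows which copy is
    # intact and can repair the other one from it.
    if hashlib.sha256(block_a).digest() == expected_sum:
        return block_a
    if hashlib.sha256(block_b).digest() == expected_sum:
        return block_b
    raise IOError("both copies fail the checksum: unrecoverable")

good = b"important data"
bad = b"important dXta"  # one flipped byte, no drive error reported
expected_sum = hashlib.sha256(good).digest()

# The checksummed scrub recovers the good copy no matter which drive
# holds it; the naive scrub keeps drive A's copy regardless.
assert scrub_mirror_checksummed(good, bad, expected_sum) == good
assert scrub_mirror_checksummed(bad, good, expected_sum) == good
assert scrub_mirror_naive(bad, good) == bad  # wrong copy wins
```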

How often does this happen? According to what I have been reading, without ECC
RAM and without ZFS, your machines get roughly one corrupt bit per day. In
other words, that could be a few corrupt files per week.

My conclusion is that as I am building my NAS, I want ECC RAM and ZFS for
things I cannot easily replicate.

~~~
babas
Just to make it clear: RAID-5/6 mdadm arrays do the right thing when
repairing/checking/scrubbing data. They write the correct data if one of the
drives has a corrupted block.

[https://raid.wiki.kernel.org/index.php/RAID_Administration](https://raid.wiki.kernel.org/index.php/RAID_Administration)

> How often does this happen? According to what I have been reading, without
> ECC RAM and without ZFS, your machines get roughly one corrupt bit per day.
> In other words, that could be a few corrupt files per week.

This is complete nonsense without more data to back it up.

~~~
GalacticDomin8r
> Just to make it clear. raid-5/6 mdadm arrays does the right thing when
> repairing/checking/scrubbing data.

This is inherent to RAID-5/6. It doesn't really have anything to do with
mdadm, other than that mdadm implements RAID-5/6. And now you probably have a
write hole.

------
r00fus
What about patent encumbrance? Has the squabble between Sun (now Oracle) and
NetApp been completely resolved? According to this[1], it was a quiet
draw-down, followed by NetApp suing some hardware manufacturers selling ZFS.

Is the patent issue laid to rest?

[1]
[http://en.swpat.org/wiki/NetApp's_filesystem_patents](http://en.swpat.org/wiki/NetApp's_filesystem_patents)

~~~
BrainInAJar
The CDDL has patent shielding: Oracle is on the hook for those patents, not
users of Sun's (now Oracle's) CDDL-licensed code.

------
emillon
I have trouble understanding the position of OpenZFS.

My understanding is that Oracle's ZFS, which cannot be integrated into
mainline Linux, can still be distributed as a separate project
(zfsonlinux.org).

On the other hand, Linux kernel developers have started btrfs, which is
inspired by, but incompatible with, ZFS.

So, what is this project? I can only imagine that it is either a clean-room
reimplementation of ZFS, or a fork from before the license changed (but I
think it was CDDL from the start).

A more interesting question would be: who should use or develop this? IMHO
this will never be on par with "the" ZFS, so btrfs is where everyone's energy
should go.

(Also, by reimplementing a ZFS product you're supporting them in a way.)

~~~
epistasis
You should read the link, it answers many of your questions here. For example,
it's not a reimplementation of ZFS, it's an organization to coordinate between
the many different groups that are actively using ZFS in their products.

Parts of btrfs may be inspired by ZFS, but btrfs doesn't even _aspire_ to some
of ZFS's niceties like raidz3. And if you're ever looking at using 4TB disks,
a third parity disk should be a requirement.

I'm a fan of btrfs, but I'm a much much bigger fan of ZFS. ZFS will almost
certainly be at the base of our next storage buildout, and btrfs probably will
not. The only thing keeping btrfs relevant is that the GPL and CDDL interact
poorly. But when there's a great, well tested, and higher tech code base, why
should people abandon it? ZFS is used by many many people in production, btrfs
by very few, and even if btrfs hits all its development roadmap it won't be
the equal of ZFS.

~~~
emillon
I read the link, and I disagree that it answers these questions. It is
presented as "the truly open source successor to the ZFS project", which is
why I understood it was a fork.

I agree that btrfs is not ready for production (and I'm not ready to hand my
precious bytes to it yet), and will probably never be as feature rich as ZFS.
But the licensing issue will always exist, and Linux needs a modern file
system -- btrfs.

That said, a lot of people/businesses use ZFS on Linux in production, so it's
nice that there is a central place where they can find documentation about
it.

~~~
epistasis
My apologies, on reading again, my first sentence comes across far snarkier
than I meant it to be! I thought that the announcement was clear, but reading
again, I can see some ambiguities.

I do agree that Linux probably could use something better than ext4. But
Linux also needs something like ZFS.

If Linux's license makes it too difficult to run ZFS, then I can run FreeBSD,
Illumos, OpenIndiana, or whatever other open source OS I want in order to get
ZFS. But I can't replace ZFS with btrfs, and it doesn't look like btrfs wants
to be able to replace ZFS.

------
oscargrouch
You know why this is huge? It's the first piece of kernel code that is, and
can be, shared by different open-source operating systems in their kernel
codebases!

I hope more technologies adopt this model. While a democratic open-source
ecosystem is the rule in userland these days, in kernel land it is not, and
that can force us to use one OS instead of another because of some feature
only that OS has, even if we'd rather install the other one.

For instance, I love Linux, but I also love the BSDs and want them to grow as
much as Linux did. If the good things created in one OS could also be used in
another, via a proper port to that kernel, we wouldn't be pushed to accept one
OS over another and be stuck with it!

I hope this movement makes its way to other kernel-land technologies. Great
news!

~~~
simcop2387
Not quite the first, but likely to end up one of the most widespread. The DRM
graphics drivers are actually older than ZFS and have been MIT-licensed from
the beginning (they started from older MIT-licensed X11 projects). There are
also possibly some others from the BSD projects (network drivers and such)
that are even older than those.

~~~
oscargrouch
Yeah, but in the cases you cited, the interested party takes the source and
implements it in its own codebase itself.

E.g., Microsoft using the BSD network layer on Windows, and FreeBSD porting
Linux's DRM to its own kernel.

This is different: it's aggregated as a product that targets several OSes,
with different implementations for each kernel, but sharing only one core
codebase. This tends to be much more stable and cheaper for everybody, less
bug-prone, etc.

I hadn't heard of anything like it.

------
patrickg_zill
ZFS is really a good thing, even if you never use it.

Why?

It raised the bar on filesystems and other filesystems have innovated in
response.

~~~
VLM
I would tentatively disagree with "raising the bar" and describe it using my
favorite MySQL/Postgres analogy: they have fundamentally different
philosophies but do about the same thing.

For example, if you want to do something software-RAID-ish, ZFS has the
philosophy that it should be done at the filesystem layer, not as a virtual
device like every other Linux filesystem, ever. It's not a new feature to be
able to do RAID, but it's new to embed RAID into the filesystem layer itself.
Linux-style virtual RAID devices don't care if you build a FAT32 on top of
/dev/md0.

There are other examples of the same philosophy in ZFS. For example everywhere
else in linux if you want some manner of "volume manager" you simply use LVM.
ZFS has its own interesting little volume manager. Which relates to snapshots.

It's exactly the same with encryption. Every other implementation on linux
uses a loop device and your choice of algo. ZFS shoves all that inside the
filesystem.

Another philosophical decision: every other Linux filesystem doesn't scrub,
but only fscks metadata, so logically ZFS implements the exact opposite.

Although ZFS supporters are technically telling the truth when they run around
saying only ZFS can provide software RAID, or only ZFS has a volume manager,
and ext2/3/4 does not, it's not relevant. I've had LVM and software RAID and
all that for many years on existing Linux stuff.

One of the few true features ZFS provides is allowing ridiculously big
filesystems. Which is cool.

It is mostly a philosophical difference between modularity and monolithic
design, with pretty much everything else being modular, and ZFS being
extremely monolithic.

In that way I don't think ZFS has prodded any innovation at all in any other
filesystems, other than maybe btrfs, which I haven't been following because my
data is too valuable to experiment upon and filesystems aren't my thing. I
don't see the iso9660 FS driver adding native volume management, snapshotting,
software RAID, and encryption any time soon.

~~~
ansible
_ZFS shoves all that inside the filesystem. [...] I've had LVM and software
RAID and all that for many years on existing Linux stuff._

I suggest you study the design of ZFS and also NetApp's WAFL to understand why
they are combining what have traditionally been separate layers into one
larger system.

The short version is that this cross-cutting of layers enables substantial
optimizations and new features that aren't possible when everything is kept to
strict interfaces.

~~~
VLM
But your conclusion is basically what I wrote, that one design is monolithic
and one design is modular.

Yes in theory you could probably come up with a weird pathological scenario
where a monolithic design is slower and a modular design is faster. But that
usually doesn't happen.

Usually, turning a modular design into a monolithic design for a tiny
performance gain turns into an epic disaster/mistake. Maybe the whole ZFS
thing will be an exception. Probably not.

~~~
ansible
_Yes in theory you could probably come up with a weird pathological scenario
where a monolithic design is slower and a modular design is faster. But that
usually doesn't happen. [...] Usually, turning a modular design into a
monolithic design for a tiny performance gain turns into an epic
disaster/mistake._

Well, think about this. Suppose you're running RAID-1 with two drives, and
you've got some filesystem (maybe ext4, but that doesn't matter) running on
top of that. You create one huge file, and then a little while later you
delete it. And right after that, one of the disks dies, and you replace it.

In this case, your RAID layer doesn't know that most of the data written to
the original drive is junk, and that the only really important bits are some
inodes and directory entries consuming a few MB near the end of the disk. It
has to re-mirror the entire drive from the original to the replacement before
they are in-sync again and you are fully protected. Even with modern drives,
that leaves a large window of time that you're not protected.

If, on the other hand, your RAID layer has a thicker interface to the
filesystem than just a dumb block store, it can just mirror the little bit of
metadata, and within seconds you're in sync again and fully protected.
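
A back-of-the-envelope Python sketch of that exposure window, using hypothetical figures (a 4 TB mirror member, 150 MB/s sustained rebuild throughput, ~2 GB of live metadata; none of these numbers come from the thread):

```python
TB = 10**12  # decimal units, as drive vendors use
GB = 10**9
MB = 10**6

def resync_seconds(bytes_to_copy: float, throughput_bps: float) -> float:
    # Time the array spends degraded (unprotected) while re-syncing.
    return bytes_to_copy / throughput_bps

# Dumb block layer: must copy the entire disk, live data or not.
full_remirror = resync_seconds(4 * TB, 150 * MB)

# Filesystem-aware resilver: only the live blocks need copying.
aware_resilver = resync_seconds(2 * GB, 150 * MB)

print(f"whole-disk re-mirror: {full_remirror / 3600:.1f} hours unprotected")
print(f"metadata-aware resilver: {aware_resilver:.0f} seconds")
```

The ratio is just (bytes actually live) / (disk capacity), which is why a mostly-empty disk is the worst case for a block-layer rebuild and the best case for a filesystem-aware one.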

That's just one example. There are many more. Go read the stories about people
complaining about RAID-5 and RAID-6 performance.

------
dkl
How does this and ZFS on Linux compare? How are they related? As a casual
bystander, this is all very confusing.

~~~
jdboyd
ZFS on Linux is OpenZFS ported to Linux. As various OpenZFS users (FreeBSD,
Nexenta, etc) add features to OpenZFS the code will be merged upstream then
adopted downstream by the other OpenZFS users (including ZFS on Linux).

------
crb
Still CDDL, so still cannot be included in the main Linux kernel tree?

~~~
mahrens
The legal issue is a matter of some debate (see below). In practice, I don't
think the Benevolent Dictator wants ZFS in the main Linux kernel tree at the
moment.

[http://zfsonlinux.org/faq.html#WhatAboutTheLicensingIssue](http://zfsonlinux.org/faq.html#WhatAboutTheLicensingIssue)

~~~
rgbrenner
There's no debate. The CDDL has never been compatible with the GPL... and
CDDL'd code will never be a part of the kernel.

[http://www.gnu.org/licenses/license-list.html#CDDL](http://www.gnu.org/licenses/license-list.html#CDDL)

[http://www.groklaw.net/articlebasic.php?story=20041205023636...](http://www.groklaw.net/articlebasic.php?story=20041205023636236)

[http://www.groklaw.net/article.php?story=20050205022937327](http://www.groklaw.net/article.php?story=20050205022937327)

In this link, Linus talks about loadable-module licensing. He gives the
example of AFS, which he says he did not think counted as a derived work (and
therefore did not need to be licensed under the GPL); it's very similar to the
ZFS case.

[https://lkml.org/lkml/2003/12/3/228](https://lkml.org/lkml/2003/12/3/228)

~~~
mahrens
You don't see Linus's viewpoint as contrary to the one on gnu.org? It seems
like Linus is saying that the GPL allows AFS to be used with Linux (because it
is not a "derived work"). How is ZFS different from AFS in this respect?

~~~
rgbrenner
No, I don't. The GNU link says the CDDL is not compatible with the GPL. What
Linus is saying is: if a kernel loadable module relies on kernel internals,
then it may count as a derived work and must be licensed under a
GPL-compatible license. If it does not -- for example, it is a filesystem that
was ported to Linux (like AFS, and ZFS fits here too, IMO) -- then it may not
be a derived work, and can be licensed under a non-GPL-compatible license
(like the CDDL).

Obviously, including AFS or ZFS in the kernel source WOULD DEFINITELY create a
derived work, and would require AFS/ZFS to be licensed under the GPL.

So Linus is only explaining how a kernel loadable module could be licensed
under a GPL-incompatible license.

~~~
Dylan16807
I don't understand how putting the code into the same repository/makefile
changes whether it's derivative in terms of copyright.

~~~
rgbrenner
It's really more than simply adding it to a repository. I can store two text
files that have nothing to do with each other on my HD or in a repo or
wherever, and I would not be creating a derived work.

But if I have file A (kernel source) and file B (zfs code), and I compile A+B
into a binary (the kernel image), then I have a single work that has been
derived from A and B.

When it's suggested that ZFS be added to the kernel repo, what's really being
said is that a single work (kernel+zfs) should be created.

In contrast, ZFS is currently a kernel loadable module. We have the kernel
binary, and the module binary.. two separate works. (What Linus was clarifying
was how integrated the module could be with the kernel and it still be
considered a separate work.)

~~~
Dylan16807
So what if it was inside the official kernel repo but defaulted to building as
a module, and anyone distributing binaries left it on the default? Would that
actually satisfy the licenses or are there clauses that would get in the way?

~~~
rgbrenner
Good question. I think this is getting too specific for me to comment on.
IANAL.

~~~
_delirium
From my mostly second-hand knowledge, in practice even lawyers specializing in
IP wouldn't have a solid answer for how to make this distinction, without
looking at a specific case in detail. If it came up in a trial, the two sides
would make a version of the arguments presented here: one would emphasize that
the sources have now been "added to the kernel tree", a unified project
managed with close integration etc. etc., while the other side would argue
they were merely placed alongside the kernel sources in a version control
system, like collecting short stories in an anthology.

~~~
Dylan16807
Let me ask a slightly different question then. Does the GPLv2 _ever_ try to
control anything that is not derived from the GPL source code? Some of the
FSF's saber-rattling seems to imply that either the answer is yes, or they're
being misleading, or they have a completely ridiculous definition of
derivation.

~~~
rgbrenner
Yes: if you link to a GPL'd library, your program must also be GPL'd. That is
why the LGPL was created.

That's the only example that immediately jumps to mind.

Edit: rephrased for clarity

~~~
Dylan16807
Don't they claim that the linking rule works via derivation? As far as I
understand it, the FSF would tell you that anything linked is always
derivative. But that doesn't mean it's true. If you could prove that a
particular instance of linking to a library was not derivative, would they
still claim your program had to be GPL?

As I understand it, the LGPL exists to 1. provide legal certainty and 2. allow
some amount of external derivation _if necessary_.

And it's easy to create an artificial dynamic-linking case where there is
provably no derivation, using multiple libraries with the same API.

~~~
rgbrenner
_Don't they claim that the linking rule works via derivation?_

Yes. It was the closest thing I could think of where the two pieces of
software are fairly separated. I mean, you could have a huge proprietary
program, a developer calls gsl_pow_int() from the GNU Scientific Library, and
the entire program must be licensed under the GPL.

I think that's about as close as you're going to get.

If you're looking for a case where the FSF said a piece of software had to be
licensed under the GPL even though it was NOT a derivative work, I don't think
you'll find it. The reason it must be a derivative is copyright law; the GPL
can't unilaterally change that.

~~~
Dylan16807
The GPL can't extend copyright law, but it can refuse to let you distribute.

It's possible to make a license that would say "can't be distributed with
other software that does X".

But good, I'm glad there's nothing like that in the licenses here that I
missed. Just the normal derivation-based questions.

~~~
rgbrenner
Yes, you are right; I didn't think of that case. They cover this question in
the FSF GPL FAQ: [http://www.gnu.org/licenses/gpl-faq.html#MereAggregation](http://www.gnu.org/licenses/gpl-faq.html#MereAggregation)

And they do have a requirement if you do so: _The only condition is that you
cannot release the aggregate under a license that prohibits users from
exercising rights that each program's individual license would grant them._

------
acd
ZFS is the best thing that has happened in storage! Big thanks to the
developers! I happily use Western Digital 25AV 2.5" disks and will probably
move to the WD Red series. Total remote backup time with ZFS incremental
snapshots is negligible.

------
shmerl
Are they going to reconcile the incompatibilities with the GPL by adding an
additional license? Since OpenZFS is not associated with Oracle, who owns it
(as in, who could give it a new / additional license)? It's a really silly
situation, since according to the authors they never intended to use the
license to exclude Linux. But that's what happened.

~~~
vertex-four
Unfortunately, Sun (now Oracle) provided the original source code under the
CDDL, so re-licensing it would require Oracle's approval, which is unlikely to
happen.

ZFS's license is compatible with those of FreeBSD and Illumos, which are very
stable operating systems. Given that ZFS is most likely to be used for a SAN
or a NAS box, you can quite easily use FreeBSD for those boxes and Linux for
your application servers if you choose.

~~~
shmerl
So essentially Oracle still owns the original rights to it? How does it work
with derivatives? Let's say OpenZFS over the years moves far away from
Oracle's ZFS. Will it still be indirectly controlled by them, as in not being
allowed to relicense it?

~~~
lambada
It's a complicated area. Generally the only safe way to sneak out from under
an existing license would be a black-box rewrite, done by people who hadn't
looked at the source of the original version. Otherwise the original author
could claim that it's a derivative work, and thus falls under the terms of the
original license.

The CDDL in particular specifies that any modifications (changes, additions or
deletions to the source code or their files) are also under the CDDL.

See
[http://web.archive.org/web/20090305064954/http://www.sun.com...](http://web.archive.org/web/20090305064954/http://www.sun.com/cddl/cddl.html),
sections 3.2 and 3.4, along with their definition of "Modification".

However, even a black-box rewrite could still fall foul of any patents granted
to the original creators.

~~~
shmerl
_> However, even a black-box rewrite could still fall foul of any patents
granted to the original creators._

Well, OpenZFS is already vulnerable to that. So if Oracle decides to sabotage
it, it easily can.

------
contingencies
I dabbled in ZFS on FreeBSD + OpenSolaris 3 years back. It was nice and all,
but it hasn't been worth the overhead of running another OS to get its
features since. I'm therefore glad to see some unity in the ZFS community to
create more trust around its use in Linux, and proud to see my beloved Gentoo
in the list of standard-bearers! Bring on the unrivalled pragmatism. Quack
quack.

Observation: Gentoo packages still point to
[http://zfsonlinux.org/](http://zfsonlinux.org/), not to
[http://open-zfs.org/](http://open-zfs.org/). Is this the same code? I suppose
so.

Further observation: as packaged by Gentoo, the kernel code looks like it can
only be compiled as a module. Generally I disable LKMs on production systems.
Grumble.

~~~
dmpk2k
_it hasn't been worth the overhead of running another OS to get its features
since._

That's a rather cavalier attitude towards the integrity of your data. :/

I can replace operating systems. It's a lot more difficult to replace lost or
silently-corrupted data. That makes data integrity one of my prime concerns,
which it should be for any production systems.

~~~
contingencies
_That's a rather cavalier attitude towards the integrity of your data ...
data integrity one of my prime concerns, which it should be for any production
systems._

Different projects have different resources and requirements. There are far
more ways than just ZFS to provide for data redundancy and integrity
(TMTOWTDI).

(Edit: why the downvote? Geez.)

~~~
dmpk2k
Downvote wasn't me. I agree with the sentiment regarding requirements.

However, I disagree with the bit regarding data redundancy and integrity. You
_can_ do it other ways, but that doesn't make it a good idea; it's a bit like
Greenspun's Tenth Rule, but for data. ZFS, or something like it (and there
isn't anything else like it), is the foundation of any modern setup where data
is important.

"Because it's not on Linux" is a terrible reason. If your data is important,
then you'll need to look elsewhere than Linux for the servers where the data
sleeps. The importance of the data _requires_ it.

If data is relatively unimportant, then you're right. There are few domains
where that's true nowadays though.

~~~
contingencies
_I disagree with the bit regarding data redundancy and integrity._

You are welcome to disagree but I'd like to see some reasoning.

 _You can do it other ways, but that doesn't make it a good idea; it's a bit
like Greenspun's Tenth Rule, but for data._

Had to go searching for that rule, which seems to be Lisp-snobbery which is
clearly somewhat justified in theory but almost irrelevant in practice. Right
tool for the job, and all that. It's such a broken metaphor for storage
consistency or availability that I'm not going to comment further.

 _ZFS, or something like it (and there isn't anything else like it), is the
foundation of any modern setup_

Do you honestly view ZFS as the be-all and end-all of data storage? That would
be ... sad. Other filesystems can offer snapshots and high availability, as
can other elements within a storage system. For example, in Linux, DRBD is a
block device driver that provides even more powerful availability guarantees
than any conventional (~single-host-homed) filesystem. Likewise, LVM2 has
provided block-layer snapshots for ages. Similarly, Linux is unsurprisingly
the most vibrant platform for cluster filesystems. Then there are also other
great general-purpose tools such as RAID, signatures/checksums, and such.

 _If your data is important, then you'll need to look elsewhere than Linux
for the servers where the data sleeps._

That's just ridiculous. I guess you're going to tell me most of the world's
data lives on ZFS? Google uses ZFS? Facebook uses ZFS? Yahoo uses ZFS? Let's
be realistic here: you're absolutely and demonstrably wrong, and have provided
no compelling argument.

~~~
dmpk2k
_which seems to be Lisp-snobbery which is clearly somewhat justified in theory
but almost irrelevant in practice_

I agree, but you're missing the forest for the trees here. Please accept my
arguments in good faith.

 _Do you honestly view ZFS as the be-all and end-all of data storage?_

For local storage? Right now? Yes, it's the best we have.

 _provided no compelling argument_

How many filesystems have Merkle trees? You need something like them to avoid
phantom reads, phantom writes, and silent corruption.
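
A toy sketch of that Merkle-tree property in Python (two data blocks, one pointer block, SHA-256 standing in for ZFS's actual checksums; all names here are mine):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# Each block pointer stores its child's checksum; the root stores the
# checksum of the pointer block, so every read can be validated
# against the chain above it.
blocks = [b"block 0 contents", b"block 1 contents"]
pointer_block = h(blocks[0]) + h(blocks[1])
root = h(pointer_block)

def read_block(i, blocks, pointer_block, root):
    if h(pointer_block) != root:
        raise IOError("pointer block corrupted")
    expected = pointer_block[i * 32:(i + 1) * 32]  # SHA-256 digests are 32 bytes
    if h(blocks[i]) != expected:
        raise IOError(f"block {i}: checksum mismatch (silent corruption)")
    return blocks[i]

# A normal read validates cleanly.
assert read_block(0, blocks, pointer_block, root) == b"block 0 contents"

# Phantom write: the filesystem updated block 0 (new pointer block,
# new root), but the disk silently dropped the data write. A plain
# filesystem would return the stale bytes; the checksum chain flags it.
new_blocks = [b"NEW block 0 contents", blocks[1]]
new_pointer = h(new_blocks[0]) + h(new_blocks[1])
new_root = h(new_pointer)
try:
    read_block(0, blocks, new_pointer, new_root)  # stale on-disk data
except IOError as err:
    print("caught:", err)
```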

How many filesystems have duplicate metadata blocks, duplicate [what's
analogous to] the superblock several times, and can duplicate data a user-
specified number of times? And then check their validity using the Merkle tree
property above to validate reads?

How many filesystems offer free and instant snapshots? As many as you want?
Those things are wonderful for databases.

How many filesystems offer software RAID? Hardware RAID is a dodgy idea,
because it's a complex binary blob in firmware you have no insight into when
something goes wrong (speaking from bitter experience, things go wrong).
Furthermore some hardware RAID suffers from a write hole.

How many filesystems are transactional? And allow you to roll back if a
transaction becomes unfixably corrupted? How many can replicate? How many use
SSDs efficiently? How many have been in heavy industrial use for years?

ZFS has _all_ that (not some of it, that's the point), and more. There's
nothing else like it. btrfs probably will be one day as well, but not yet.

So, no, it's not ridiculous. I've been down this trail of tears before, and
ZFS has made life so much better. At least I don't need to dread a number in
my database silently flipping a digit anymore -- if that scenario doesn't give
you the hives, then I really don't know what to say.

~~~
contingencies
Many things can corrupt your data ... outside of the filesystem. You seem
unswervingly fixated on ZFS for some reason. This is simply wrong. If anyone
here is missing the forest for the trees, it's you.

~~~
dmpk2k
I make an argument in good faith and get a nonsensical passive-aggressive
blow-off in return. You should be ashamed.

~~~
contingencies
I fully recognize ZFS's great feature set; it's just a tool, though, and only
represents one potential solution, appropriate for certain requirements,
within one layer of a storage subsystem. If paranoid levels of data integrity
are an end-to-end requirement, ZFS isn't a magic bullet.

------
magg
The site is lacking the Mac OS X version, called Zevo:
[http://getgreenbytes.com/solutions/zevo/](http://getgreenbytes.com/solutions/zevo/).
It was developed by Don Brady, formerly a senior software engineer at Apple.

~~~
adsr
And it's at zpool version 28 already.

------
matt_heimer
Not GPL-compatible, and patent-encumbered, from my understanding. Is there any
way they can expect large-scale deployments or Linux integration?

~~~
mahrens
ZFS on Linux large-scale deployments: 55 petabytes at LLNL.

[http://www.slideshare.net/MatthewAhrens/open-zfs-linuxcon](http://www.slideshare.net/MatthewAhrens/open-zfs-linuxcon)

See slide 3; the last slide has a little more detail.

ZFS on illumos: Nexenta claims 1.5 Exabytes under management (across multiple
deployments)

[http://billroth.ulitzer.com/node/2461630](http://billroth.ulitzer.com/node/2461630)

------
VikingCoder
Can I please, please, please have a Windows port?

At work, we use Windows boxes, and I would love to use ZFS on them.

~~~
astrodust
The operating systems that are getting support are mostly or at least partly
open-source, so good luck with that.

~~~
Demiurge
That's probably less of an obstacle than the high-level functionality Windows
expects to live in the filesystem layer, which Unix doesn't have (or puts in
the VFS). From what I've read, this is why Microsoft itself has a hard time
replacing NTFS.

~~~
VikingCoder
There's having ZFS running as a native Windows file system...

...and then there's running ZFS on my Windows systems, which I must use a
special API to access.

The second one isn't ideal, but it would be awesome to have.

------
phryk
FORK YEAH, this is what I've been waiting for the last 2 or so years. <3

------
tete
I am curious. What are you using ZFS for?

~~~
contingencies
[https://en.wikipedia.org/wiki/Comparison_of_filesystems#Feat...](https://en.wikipedia.org/wiki/Comparison_of_filesystems#Features)

~~~
tete
I know. I mostly ask because a lot of people these days keep nearly every
kind of data in databases, and ZFS usually isn't the best fit for that
(depending on the exact use case, of course). It's by far one of the most
exciting file systems for storing "raw" data, but if you put another layer on
top, namely your database system, it can (again, depending on the exact use
case) become an unbearable performance overhead.
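For what it's worth, the usual mitigation is to tune the dataset to the
database rather than run it on defaults. A sketch, using a hypothetical pool
named `tank` and standard ZFS dataset properties:

```shell
# Hypothetical dataset for a MySQL/InnoDB data directory.
# Match recordsize to the database page size (InnoDB defaults to 16K)
# so ZFS doesn't read/modify/write full 128K records per page update.
zfs create -o recordsize=16K tank/mysql

# The database already caches its own pages; caching only metadata in
# the ARC avoids double-buffering the same data.
zfs set primarycache=metadata tank/mysql

# Without a dedicated log device, favor throughput over sync latency.
zfs set logbias=throughput tank/mysql
```

This doesn't make ZFS the fastest option for a database, but it closes much
of the gap while keeping checksumming and snapshots.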

------
cmccabe
I have a lot of respect for those working on this project, but realistically,
if you use Linux, using an out-of-tree filesystem is just asking for pain--
lots of it. I would never use this on a production system. You know how
painful out-of-tree video drivers are? Yeah. Imagine that, only now with the
potential for data loss and divergent on-disk formats. And if it's your root
fs, you can forget about booting if there's a problem.

Sure ZFS has a great reputation, but a lot of that came from how well-
integrated it was into Solaris and how much QA was done on it. Neither of
those things were ever true (or are going to be true in the future) for the
various ZFS-on-Linux projects (yes, there are multiple.)

The comments about btrfs are about 5 years out of date. SuSE has already
shipped btrfs in their "stable" 11.1 distribution, and Red Hat is going to do
so in RHEL7. Give it a chance.

~~~
cbsmith
> The comments about btrfs are about 5 years out of date.

Not really. I've tried it, and it still has pain points I'd not like to have
in my filesystem. It's like ZFS almost a decade ago (and I'm not talking about
features)... although ZFS on Linux vs. btrfs on Linux... right now I'd still
go with btrfs.

> Neither of those things were ever true (or are going to be true in the
> future) for the various ZFS-on-Linux projects (yes, there are multiple.)

I believe there is a shift with regard to this, as demonstrated by the Gentoo
project's integration of ZFS.

~~~
unhammer
> Not really. I've tried it, and it still has pain points I'd not like to have
> in my filesystem. It's like ZFS almost a decade ago (and I'm not talking
> about features)... although ZFS on Linux vs. btrfs on Linux... right now I'd
> still go with btrfs.

Care to elaborate? I've never tried ZFS, but been very happy with btrfs for my
smalltime personal usage, I'm wondering why people find it so painful in
comparison.

~~~
cbsmith
Well, there's just the general lackluster performance:
[http://www.phoronix.com/scan.php?page=article&item=linux_311...](http://www.phoronix.com/scan.php?page=article&item=linux_311_filesystems&num=3)

I've also seen particularly bad pain points when doing things like using it
with an NFS server.

~~~
cmccabe
ZFS also has "general lackluster performance" in areas like using memory (it
requires tons of it). It's inherent in the design of a copy-on-write
filesystem.

According to Ted Unangst: "ZFS wants a lot of memory. A lot lot lot of
memory. So much memory, the kernel address space has trouble wrapping its arms
around ZFS. I haven't studied it extensively, but the hack of pushing some of
the cache off into higher memory and accessing it through a small window may
even work." See [http://www.tedunangst.com/flak/post/ZFS-on-
OpenBSD](http://www.tedunangst.com/flak/post/ZFS-on-OpenBSD)

Different filesystems are good for different things. If you want a filesystem
that has subvolumes, copy-on-write snapshots, built-in RAID, transactions,
space-efficient packing of small files, batch deduplication, checksums on data
and metadata, and so forth, you have to pay a price. Just the same way as
running Apache with all the bells and whistles is not going to be as fast as
nginx.
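On the memory point specifically: ZFS's appetite comes mostly from the ARC
(its adaptive replacement cache), and on ZFS on Linux that cache can be
capped via the `zfs_arc_max` module parameter. A sketch (the 4 GiB figure is
just an example value):

```shell
# Cap the ARC at 4 GiB (value in bytes), applied at module load time:
echo "options zfs zfs_arc_max=4294967296" | sudo tee /etc/modprobe.d/zfs.conf

# Or adjust the running system without reloading the module:
echo 4294967296 | sudo tee /sys/module/zfs/parameters/zfs_arc_max
```

So the memory cost is real, but it's a tunable cache rather than a fixed
overhead.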

~~~
cbsmith
> ZFS also has "general lackluster performance" in areas like using memory (it
> requires tons of it). It's inherent in the design of a copy-on-write
> filesystem.

Those benchmarks aren't about CPU or memory consumption. These days a good
filesystem probably _should_ trade memory and CPU for increased performance.
Those benchmarks are about throughput/latency.

> Just the same way as running Apache with all the bells and whistles is not
> going to be as fast as nginx.

...except ZFS generally performs very well as compared to other filesystems.
When it first came out it had all kinds of ugly corner cases where it
performed poorly, but it seems to do great these days.

