
ZFS v0.8.2 - turrini
https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.8.2
======
nilsb
I contributed a few patches to ZFS on Linux about 8 years ago - at a time when
it was still very much in its infancy and panic'd when you looked at it in the
wrong way.

It's incredible how far they've come. We're using ZFS on Linux on about 120
servers at work and it's rock solid. Snapshots are a life saver in our day-to-
day ops.

~~~
pletnes
Would you know how it compares to btrfs? I found btrfs easy to set up but with
hard-to-debug rough edges.

~~~
braindeath
zfs is really a study in cli ergonomics. It has a quite idiosyncratic CLI,
which one might assume is a bad thing. But the CLI is so well thought out that
learning this new "language" is quite intuitive.

btrfs uses a more traditional command/subcommand/args setup, which by itself
is not a problem, but the actual command and subcommand structure is, to be
charitable, a dumpster fire.

Try managing large groups of snapshots without custom tooling. The whole thing
makes me sad really.
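To illustrate the ergonomics point: pruning old snapshots with stock ZFS
tooling is a one-liner (the dataset name `tank/data` is a hypothetical
example, and this assumes GNU `head`/`xargs` on Linux):

```shell
# Keep the newest 30 snapshots of a dataset, destroy the rest.
# zfs list sorts by creation time ascending, so the newest 30 are
# the last 30 lines, which `head -n -30` drops from the destroy list.
zfs list -H -t snapshot -o name -s creation -r tank/data \
  | head -n -30 \
  | xargs -r -n 1 zfs destroy
```

With btrfs you typically end up scripting around `btrfs subvolume list`
output instead, which is where the custom tooling comes in.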

btrfs has a lot going for it at the architecture and disk-format level (its
dedup is actually theoretically useful, while the zfs design was flawed from
the start), but the implementation has just never gotten there.

~~~
rkagerer
Dumb question as I haven't used the feature: Why is dedupe flawed? Is it
because it requires "enormous" amounts of RAM? Does it eventually slow down
writes?

~~~
rincebrain
Dedup on ZFS is problematic because ZFS, in exchange for some of its core
useful features, promises that block locations on disk are immutable.

So the only place you can do dedup is inline, as the data is being written the
first time, not after-the-fact.

In addition, this requires you keep a huge indirection table (the DDT, or
dedup table) that needs to be read for all writes, so either that has to be
kept in memory or on fast storage, or you've just turned every write into one
or more random reads, plus writes.

This also means that even if you turn off dedup after turning it on, the
performance implications remain until the DDT no longer contains any blocks
(e.g. you rewrote all the data after turning dedup off).
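You can estimate the cost before committing to it: `zdb -S` simulates dedup
on an existing pool, and `zpool status -D` shows the live DDT statistics
(the pool name `tank` is a hypothetical example; a DDT entry is commonly
estimated at roughly 320 bytes of RAM per unique block):

```shell
# Simulate dedup: prints a would-be DDT histogram and projected dedup
# ratio without actually enabling dedup on the pool
zdb -S tank

# On a pool that already has dedup enabled, show the actual DDT stats,
# including how many entries are in core vs. on disk
zpool status -D tank
```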

There are feature proposals to make the performance of dedup less
pathological, but nobody's taken up implementing them so far. (Someone even
did a proof of concept implementation of one of them, and it still hasn't been
finished and integrated.)

------
jordanbeiber
Imo ZFS is just awesome, as are many things that sprang from Sun/Solaris.

I first came in contact with it through Solaris, and looking back at that
stack with ZFS, zones & software-defined networking, it just seems like such
a contemporary fit. Recently I've looked a bit at SmartOS and Joyent's
offering, which kind of takes it the way I _thought_ it would go, back in the
day, pre-Oracle.

Been using ZoL for years now, both for private filers but also professionally
as of late for things like physical docker/container hosts.

Just rock solid!

~~~
trollied
Agree.

Just a shame that Solaris wasn't open-sourced sooner; if it had gained more
traction we'd be in a different world now.

I think Linux kind of crept up on them, and in the meantime Oracle got its
grubby mitts on Sun, then embraced Linux itself.

Zones are /amazing/.

~~~
pjmlp
Oracle only got its grubby mitts on Sun because no one else saw any value in
making a counter-offer, including Google, after the curveball it threw Sun
over Java licensing.

~~~
Lutzb
IBM actually offered $7b, but was outbid by Oracle back then.

~~~
pjmlp
Yes they did, but nothing prevented them from making a counter-offer, which
wasn't seen as worthwhile.

------
kbumsik
Maybe change the title? It's not ZFS v0.8.2, it's ZFS on Linux v0.8.2.

~~~
organsnyder
FreeBSD is migrating to ZoL: [https://papers.freebsd.org/2019/bsdcan/jude-the_future_of_openzfs_and_freebsd/](https://papers.freebsd.org/2019/bsdcan/jude-the_future_of_openzfs_and_freebsd/)

~~~
kbumsik
I didn't know that, but this is much bigger news to me. A long time ago I was
always told that the biggest reason to use FreeBSD was ZFS. Things have
changed a lot since then.

~~~
vermaden
On FreeBSD 11.x and 12.x (and older releases) the default remains the old
(not ZFS on Linux) ZFS implementation, which has lots of FreeBSD additions,
like TRIM support, that Linux lacked.

FreeBSD forked the ZFS on Linux GitHub repo and now it's called ZFS on
FreeBSD. You can try it already by installing these two ports:

- sysutils/zol

- sysutils/openzfs-kmod
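Assuming a ports tree checked out under /usr/ports, installing both is the
usual ports routine (a sketch, not from the release notes):

```shell
# Build and install the ZFS-on-FreeBSD userland from ports
cd /usr/ports/sysutils/zol && make install clean

# Build and install the matching kernel module
cd /usr/ports/sysutils/openzfs-kmod && make install clean
```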

Regards, vermaden

~~~
JackMcMack
Damn, I could've used this a few weeks ago. I moved my NAS from bare Ubuntu +
ZoL to FreeNAS, and got stuck on a read-only pool because of unsupported
features such as userobj_accounting (1). I had to zfs send/receive to external
drives and back again to work around this, which was annoying, to say the
least.

[1] [http://open-zfs.org/wiki/Feature_Flags](http://open-zfs.org/wiki/Feature_Flags)
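For the record, the send/receive round trip described above looks roughly
like this (pool and dataset names `tank`/`external` are hypothetical):

```shell
# Snapshot the whole pool recursively and stream it to a pool on
# external drives, preserving properties and child datasets (-R)
zfs snapshot -r tank@migrate
zfs send -R tank@migrate | zfs receive -F external/tank

# After recreating the pool with compatible feature flags,
# stream everything back
zfs send -R external/tank@migrate | zfs receive -F tank
```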

~~~
danudey
This is kind of standard fare though. I did a `zpool upgrade` on my root pool
once and broke GRUB's ability to boot my system, since there were features
enabled that it didn't understand (ignoring them wouldn't have hurt anything,
but GRUB wants to play it safe, obviously).
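One way to avoid this, sketched here with a hypothetical pool name `rpool`:
since `zpool upgrade` enables every supported feature at once, you can
instead inspect the flags and enable only the ones your bootloader
understands.

```shell
# List feature flags and their state (disabled / enabled / active)
zpool get all rpool | grep 'feature@'

# Enable a single feature by name instead of a blanket upgrade
# (userobj_accounting is just an example flag)
zpool set feature@userobj_accounting=enabled rpool
```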

------
rkwasny
Anyone from Canonical here? When can we get official backports for LTS
versions of Ubuntu? We can't really upgrade servers to 19.10.

~~~
pizza234
You don't strictly need a backport (which I think will never happen, as the
ZFS packages are locked to v0.7 on 18.04) - there's a PPA for it:
[https://launchpad.net/~jonathonf/+archive/ubuntu/zfs](https://launchpad.net/~jonathonf/+archive/ubuntu/zfs).
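Installing from that PPA is the usual routine (the package names below are
the standard Ubuntu ones, included as a sketch):

```shell
# Add the PPA and refresh the package index
sudo add-apt-repository ppa:jonathonf/zfs
sudo apt update

# Install the DKMS kernel module and the userland tools
sudo apt install zfs-dkms zfsutils-linux
```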

~~~
danudey
Note that this PPA doesn't have 0.8.2 yet!

------
atonse
Always admired ZFS, ever since it came out. The talks by the creators were so
enlightening.

The license for this is CDDL? Wasn't that the license for the original ZFS,
which was the whole reason people had to port it? (apart from kernel sys
calls, etc)

~~~
beatgammit
They didn't rewrite ZFS for Linux support, they did the syscall stuff in such
a way that it's reasonably compatible license-wise. I don't know the
specifics, I just know it's a forked codebase that improves compatibility with
Linux.

FreeBSD moved to this new codebase because it is getting a lot of attention
and better support.

------
nabla9
ZFS 0.8 has native encryption.

Maybe v0.8.2 is close to the point where 0.8 is stable enough to upgrade?
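For reference, creating an encrypted dataset in 0.8 looks like this (the
pool/dataset name `tank/secure` is a made-up example):

```shell
# Create a dataset encrypted with AES-256-GCM, with the key derived
# from an interactively prompted passphrase
zfs create -o encryption=aes-256-gcm -o keyformat=passphrase tank/secure

# After a reboot or pool import, the key must be loaded before mounting
zfs load-key tank/secure && zfs mount tank/secure
```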

~~~
pizza234
I can't say about the specific version, but ZFS has a history of great
stability, with very few isolated cases of data loss.

0.8 also has trim support :-)
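Trim can be run on demand or left on continuously (pool name `tank` is
hypothetical):

```shell
# One-shot TRIM of all vdevs in the pool, then watch its progress
zpool trim tank
zpool status -t tank

# Or have the pool trim freed space continuously
zpool set autotrim=on tank
```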

~~~
paulmd
ZFS on BSD has that track record. ZFS on Linux has already had a couple of
incidents in its relatively short history.

The incident where updating your Ubuntu kernel resulted in data loss probably
could not have occurred under the BSD model due to close coupling between the
kernel and the filesystem.

~~~
rincebrain
To be fair to ZoL, at least the nastiest data loss bug I can think of
(hole_birth) was cross-platform, so FBSD, ZoL, illumos, and everyone else in
the OpenZFS family was equally affected.

~~~
paulmd
I was thinking of this one.

[https://news.ycombinator.com/item?id=16797919](https://news.ycombinator.com/item?id=16797919)

The whole downstream/upstream model with different parts of the OS kernel /
filesystem / userland moving out of sync with each other through multiple
levels of backporting is not a positive one for reliability. FreeBSD is
FreeBSD and the buck stops there. Responsibility is too diffuse in the Linux
model to get the kind of reliability that FreeBSD has enjoyed.

I realize that I'm being a bit of a fuddy-duddy, but it really didn't take
long at all for the ZoL team to start making dumb mistakes that cause data
loss. Hopefully it's been a learning experience and won't happen again.

I myself have a dataset that I can't send from a FreeBSD system to my Ubuntu
16.04/ZoL backup server. It worked at one point, as of about a year ago it no
longer does. When the send is finished, the client machine just spins forever.
I've tried everything short of formatting and reinstalling the backup server.
I tried incremental sends, I tried re-sending the whole thing, I tried killing
the pool, updating everything and re-creating the pool, I tried messing with
microcode in case it was a wayward regression from Spectre, etc etc. ZoL just
won't receive that dataset anymore.

Best of luck to the ZoL guys but I guess FreeBSD is working well enough as a
data storage layer for me. Too much weird shit going on with ZoL.

~~~
rincebrain
Ah yes, the zap shrinking bug, that was silly, but as the one I mentioned
illustrates, not all of them (or even necessarily most of them, few though
there have been) originate with ZoL.

Have you tried reporting the bug you're having with send/recv? I imagine
people would care about that kind of reproducible failure.

Re: FBSD versus ZoL, go with what works for you. If something is working well
enough, I'm not going to advocate changing it. (This tends to be an unpopular
opinion among other people though.)

------
chousuke
Seems like there are no RHEL 8 packages yet, but I think this release fixes
the build issues, so they'll probably happen soon.

~~~
vermaden
ZFS has been available since 2005 ...

Btrfs has been in development since 2007 ...

... and Red Hat decides to reimplement this kind of pooled-storage filesystem
again with XFS on LVM, now called Stratis ... not very bright.

~~~
chousuke
I think Red Hat ultimately decided btrfs is not going to go anywhere, and ZFS
is unfortunately a non-starter for licensing reasons, so from their
perspective it really doesn't matter how good it is.

I suppose the benefit of using XFS and LVM is that they're both very well
understood and widely used technologies. Sometimes incremental evolution is
all you need.

I'm following the development with interest, because I think the approach has
merits. Dave Chinner's talk about teaching XFS how to do snapshots and
subvolumes was particularly interesting. The gist of it, as far as I
understand, is that XFS can pretty much do those things already, but it needs
better integration for performance and usability.

~~~
cmurf
Red Hat doesn't have developers to support Btrfs, and it takes a lot of
support if you're going to depend on constant backporting of fixes and
features. All the major work on Btrfs happens upstream, and RHEL kernels just
aren't anywhere close to that. So to support this in an enterprise context
with support plans, you'd need many knowledgeable Btrfs developers and
support staff.

Obviously it is going somewhere, upstream is very active, and SUSE uses it by
default everywhere both enterprise and openSUSE. And they have the developers
to support it. Facebook uses Btrfs quite a lot in production both servers and
desktops.

~~~
danieldk
_Red Hat doesn't have developers to support Btrfs, and it takes a lot of
support if you're going to depend on constant backporting of fixes and
features._

Given how many kernel developers and filesystem developers they have, it would
not be hard for them to hire btrfs developers or ask some of their kernel
developers with a background in file systems to specialize in btrfs. I think
the reasons are different: either they have a lot of XFS developers on staff
fighting a transition to btrfs tooth and nail, or they simply believe that
btrfs cannot be made good enough to support _their_ enterprise customers
within a reasonable time frame.

~~~
chousuke
I suppose requiring a transition might be the problem in the first place. XFS
and LVM are things that are in use _now_. Though they might not be the most
performant or the most convenient at this time, it makes for a strong strategy
if you can bring in new features without requiring users to switch over to
something completely new.

------
aidenn0
Still no block-pointer rewrite? :P

~~~
nialv7
No, but it's becoming less relevant now. Certain things thought to require bp
rewrite (e.g. vdev removal) are now implemented without it.

~~~
aidenn0
I'd like to see background dedupe. Dedupe is not useful in many real-world
situations because of its ARC space requirements, but a COW filesystem ought
to be able to find duplicated blocks and rewrite pointers at some point after
the data is written; after all, snapshots are a very specific type of
deduplication, so most of the machinery to handle it is already there.

I have a data store (nightly build archive) that gets almost 3x reduction in
space used with dedupe, but the performance hit is just too big for us to use
it.

