
Proposal to move CoreOS off of btrfs - jsnell
https://groups.google.com/forum/m/#!topic/coreos-dev/NDEOXchAbuU
======
HorizonXP
Thank goodness. btrfs has bitten me way too often on CoreOS. I eventually
succumbed to wiping the /var/lib/docker mounted directory on every reboot to
minimize the occurrence of the low disk space issues, but nevertheless, issues
still exist. Even today, I still have some cases where docker will complain
about not being able to create the container/file, due to some inane error.
I've had to reboot machines many times just to try to get it to cooperate.
Furthermore, that solution of wiping the partition means that my machines need
to redownload their docker images on every reboot, which means they come up
more slowly than absolutely necessary.

~~~
23david
This is ridiculous. CoreOS is managing their own distro, so why not just
package in ZFS and get it over with?

It's been how many years now waiting for the next great ZFS competitor? If
nobody is able to improve on ZFS, how about we just all jump on the bandwagon
and move on with our lives?

If nothing else, having more people using ZFS may inspire someone to actually
improve on it. BTRFS has such a limited feature set in comparison to ZFS that
with the advent of ZFS on Linux, it's really hard to understand the
community's continued backing of BTRFS as the Linux community's CoW filesystem
competitor to ZFS. Spend a few weeks using ZFS on Linux on a machine with a
few SSDs and a few HDDs, or just enable it on a Linux laptop with an SSD,
and it's hard to go back to anything else.

~~~
tacticus
Because Sun decided to sabotage it when they released it, and Oracle haven't
stepped back yet.

~~~
throwaway90446
Sabotage? How, exactly? I have heard of no major corruption or performance
bugs in ZFS, what am I missing?

~~~
techdragon
As others have said below: the CDDL (or 'cuddle' license, as some choose to
pronounce it) is incompatible with the GPL, which means two things.

1 - We can't just take the code and add it to the kernel.

2 - The patent and other legal grants in the licence won't apply if we try to
reverse engineer a compatible GPLv2 equivalent for the kernel to use, which
would be a HUGE risk for the few companies that would probably consider the
cost of reverse engineering worth it.

And before you say it can be done without company support, I have two things
to say. First, how's that going for the ReiserFS 4 fans who wanted to keep
improving that? Second, in order to continue the work on ZFS effectively, the
existing ZFS developers have 'ganged up' and now try to work on everything
together, so that the OpenZFS project can be a single source of truth for the
ZFS source code. A reverse-engineered version would be very unlikely to
benefit from this, and would therefore require _even more_ developer time just
keeping up with the improvements 'upstream' in the 'original' ZFS codebase.

~~~
23david

      In the case of the kernel, this prevents us from distributing ZFS as part of 
      the kernel binary. However, there is nothing in either license that prevents 
      distributing it in the form of a binary module or in the form of source code.
    

Source:
[http://zfsonlinux.org/faq.html#WhatAboutTheLicensingIssue](http://zfsonlinux.org/faq.html#WhatAboutTheLicensingIssue)

    
    
      ZFS cannot be added to Linux directly because the CDDL is incompatible with the GPL. 
      ZFS can, however, be distributed as a DKMS package separate from the main kernel package.
    

Source: [https://wiki.ubuntu.com/ZFS](https://wiki.ubuntu.com/ZFS)

So why not just distribute the DKMS package with a distro? Easy enough. Will
it ruin the user experience somehow?

------
discardorama
So... I just installed Ubuntu 14.10 on my (new) home PC, and chose BTRFS
because I figured it was stable by now. Should I be worried? I do plan on
playing with docker.

~~~
rogerbinns
I've been using btrfs on several machines for about 3 years. It is not stable:
[https://btrfs.wiki.kernel.org/index.php/Main_Page#Stability_...](https://btrfs.wiki.kernel.org/index.php/Main_Page#Stability_status)

When it goes wrong (about every 4 months for me), you will end up in a
nightmare. Generally attempts to fix issues will make things worse, various
tools and pages contradict each other, and the devs are only interested in the
latest kernel version. Much of this is deliberate - the code is written to not
sweep things under the rug, which means you can hit problems and not recover.
Use backups and make sure you can restore.

The reason why I keep using it is that it avoids the silent corruption you can
get with ext4. A scrub can verify every byte of data is unaltered, and recover
it if you are using anything other than the single profile. Compression, volume
management, cheap snapshots etc. also make working with it nice. Until things
go wrong.

The single biggest "going wrong" is running out of space. Copy on write
filesystems by their nature leave existing content alone and write new
information in the spare space, eventually doing a garbage collect of obsolete
data. When you are out of space, that gets rather difficult, with bizarre
symptoms and tricky recovery.
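
To illustrate why a full copy-on-write filesystem is so awkward, here is a toy sketch in Python. It is not how btrfs is implemented; `ToyCowStore` and its one-block-per-file model are invented purely for illustration. The point it shows is that rewriting existing data still needs a free block, so a full store cannot even accept an overwrite until obsolete blocks are reclaimed.

```python
class ToyCowStore:
    """Toy copy-on-write block store (hypothetical, for illustration only)."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.live = {}          # file name -> block id holding its current data
        self.obsolete = set()   # superseded blocks awaiting garbage collection
        self.next_id = 0

    def used(self):
        return len(self.live) + len(self.obsolete)

    def write(self, name):
        # A CoW write never overwrites in place: it always needs a fresh block.
        if self.used() >= self.total_blocks:
            raise OSError("no space left: even an overwrite needs a free block")
        if name in self.live:
            self.obsolete.add(self.live[name])
        self.live[name] = self.next_id
        self.next_id += 1

    def collect_garbage(self):
        # Reclaim every superseded block and report how many were freed.
        freed = len(self.obsolete)
        self.obsolete.clear()
        return freed


store = ToyCowStore(total_blocks=4)
for name in ("a", "b", "c"):
    store.write(name)
store.write("a")            # rewrite: the old block of "a" becomes obsolete
try:
    store.write("b")        # store is full, so rewriting existing data fails
    overwrite_failed = False
except OSError:
    overwrite_failed = True
freed = store.collect_garbage()
store.write("b")            # succeeds once the obsolete block is reclaimed
```

In a real filesystem the garbage collection step itself may need to write new metadata, which is presumably part of why recovery gets tricky once space has run out.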

~~~
discardorama
Thanks! Luckily, I hadn't done much more than just installing Ubuntu, so now
I'm reinstalling it, this time with EXT4 :)

~~~
rogerbinns
Just be aware that other filesystems like ext4 do not checksum their data.
They have no way of telling if corruption has happened, nor are the
diagnostics useful if you do somehow figure out that a block is problematic.
This has happened to me several times over the decades as hard drives have
lost the plot, or due to bugs. Backing up a corrupted file gets you nonsense
in the backup. (A standard btrfs demo is deliberately corrupting the
filesystem and then showing recovery. You can't do that with ext4 since it has
no idea if what is there is correct in the first place.)

You can use tools like LVM and md as a layer underneath ext4 to provide some
resiliency, but there is a learning curve and two sets of tools to work with.
Changing around disk/partition sizes isn't much fun with them.
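
The checksum-and-recover demo mentioned above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `checksum` and `scrub` are hypothetical helpers, `zlib.crc32` stands in for whatever checksum the filesystem really uses, and a two-element list stands in for a mirrored profile.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum for the sketch; not the filesystem's actual algorithm.
    return zlib.crc32(block)


def scrub(copies, expected_crc):
    """Return (good_data, repaired_indexes); raise if every copy is damaged."""
    good = next((c for c in copies if checksum(c) == expected_crc), None)
    if good is None:
        raise ValueError("all copies fail their checksum; restore from backup")
    repaired = []
    for i, copy in enumerate(copies):
        if checksum(copy) != expected_crc:
            copies[i] = good      # rewrite the bad copy from the good one
            repaired.append(i)
    return good, repaired


original = b"important data"
stored_crc = checksum(original)       # recorded when the block was written
mirror = [original, original]
mirror[0] = b"importent data"         # simulate a corrupted byte on one device
data, repaired = scrub(mirror, stored_crc)
```

Without the stored checksum there is nothing to compare against, which is the ext4 situation described above: both copies look equally plausible. With only one copy, the scrub can detect the damage but has nothing to repair it from.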

------
rickycook
For everyone complaining about the metadata rebalancing, btrfs 3.18 (CoreOS is
currently using 3.17) added auto rebalancing:

[https://btrfs.wiki.kernel.org/index.php/Balance_Filters](https://btrfs.wiki.kernel.org/index.php/Balance_Filters)

[https://btrfs.wiki.kernel.org/index.php?title=Main_Page#News](https://btrfs.wiki.kernel.org/index.php?title=Main_Page#News)

------
mrmondo
First they came for your SELinux... and you did nothing. Then they came for
your Docker... and you did nothing. Now they've come for your filesystem...
Madness!

Just as BTRFS is finally stable and fast, CoreOS decide yet again to jump ship.

We've been using BTRFS in production for over a year now (and heavily with
Docker) and haven't suffered any problems at all(1). In fact, we actively
simulate failures to practice / test repair processes, all of which have been
successful.

To me, while diversity is great, CoreOS is going the path of Ubuntu with its
NIH syndrome.

(1) _With the exception of a slow docker push/pull bug that's about to be
patched:_
[https://github.com/docker/docker/pull/9720](https://github.com/docker/docker/pull/9720)

~~~
akerl_
Can you clarify how this is an example of NIH? They're switching the default
from btrfs to ext4, not writing their own filesystem.

Also, it's necessary if they want to switch to overlayfs as the Docker
backend, because btrfs lacks whiteout support which causes problems with
overlayfs.

Switching to overlayfs for Docker makes a lot of sense, given that you get the
dedupe benefits of AUFS with lower overhead than devicemapper/btrfs and it's
now part of the mainline kernel tree.
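
For anyone unsure what a whiteout is, here is a toy Python model of a union lookup. It is a sketch, not overlayfs itself: the `WHITEOUT` marker, `lookup`, and `delete` are hypothetical names, and dictionaries stand in for the upper and lower directories. Real overlayfs records whiteouts as special files in the upper directory, which is why the upper filesystem has to support them.

```python
WHITEOUT = object()  # hypothetical marker meaning "deleted in the upper layer"


def lookup(path, upper, lower):
    """Resolve a path across a writable upper layer and a read-only lower one."""
    if path in upper:
        entry = upper[path]
        return None if entry is WHITEOUT else entry
    return lower.get(path)


def delete(path, upper, lower):
    # The lower layer is read-only, so hiding a file it provides requires
    # recording a whiteout in the upper layer instead of removing anything.
    if path in lower:
        upper[path] = WHITEOUT
    else:
        upper.pop(path, None)


lower = {"/etc/motd": "from image", "/bin/sh": "shell"}   # shared image layer
upper = {}                                                 # per-container layer
upper["/etc/motd"] = "edited in container"                 # copy-up on write
delete("/bin/sh", upper, lower)

motd = lookup("/etc/motd", upper, lower)
shell = lookup("/bin/sh", upper, lower)
```

The lower layer is never modified, so many containers can share it, and that sharing is where the dedupe benefit comes from.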

~~~
mrmondo
OverlayFS is still in its infancy, and you're then still relying on an
underlying filesystem. Also, unless I'm mistaken, OverlayFS doesn't provide
compression or checksumming?

------
visural
Am I right in saying that the complaints with btrfs in CoreOS are specifically
around its use in conjunction with Docker?

(Interested as I'm thinking about building a homebrew NAS/general purpose
server w/ btrfs. There's a lot of outdated info on btrfs, but I was getting the
impression that it's now a pretty stable and usable filesystem.)

~~~
lstamour
I can say that ZFS has worked great for me on BSD-based home servers. I haven't
used ZFS with Linux yet. It's possible to do so; it's just unpopular, partly
for licensing reasons. I suspect what type of RAID you do may have greater
consequences than what file system you pick, particularly if your distro is
already designed for serving files on the file system you choose. Oh, and
working out all the AFP/Samba bits is fun, because there's always something
that surprises you.

~~~
bashinator
I got severely burned by ZFS on Linux running in AWS. Heavy NFS load (ZFS
NFS, not Linux kernel NFS) caused a kernel panic, pretty reproducibly. This
was on Ubuntu 12.04 with the official ZoL PPA sources, so YMMV.

~~~
tensor
I've had ZFS on Linux freeze up on me every few months too. Not a
production system, fortunately.

~~~
23david
One thing you might want to check is whether you're setting a limit in the
driver for the amount of memory ZFS uses for caching. By default it'll use a
LOT of memory, so I usually just set it to a max of 2 GB and don't see any
issues.
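
A minimal sketch of working out that limit, assuming the cap is applied through the `zfs_arc_max` module parameter, which takes a value in bytes. The helper name `arc_max_option` is made up for this example, and the file the line goes into (commonly something like `/etc/modprobe.d/zfs.conf`) is an assumption that may differ per distro.

```python
def arc_max_option(gib: int) -> str:
    """Build a modprobe options line capping the ZFS ARC at `gib` GiB."""
    if gib <= 0:
        raise ValueError("cache limit must be positive")
    # zfs_arc_max is expressed in bytes, so convert from GiB.
    return f"options zfs zfs_arc_max={gib * 1024 ** 3}"


line = arc_max_option(2)
print(line)
```

The option only takes effect when the module is loaded, so expect to reload the module or reboot after changing it.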

------
gojomo
If so many people are so regularly bitten by these BTRFS issues – they're
easy-to-reproduce and painful – shouldn't they be relatively easy to
prioritize and fix?

~~~
mrmondo
I'd like to see a real-world, reproducible BTRFS bug on 3.18 that's a serious
problem.

------
finid
If Btrfs is as bad as this proposal makes it look, why did the folks at
openSUSE make it the default file system for the root partition on openSUSE
13.2?

~~~
mrmondo
Yeah, I haven't experienced any BTRFS problems. It could just be people stuck
on RHEL running kernel 2.6 / 3.2.

------
doublerebel
ZFS is coming for Docker. This is a natural move. Now that enterprise relies
on the cloud they need enterprise-reliable storage -- and ZFS is proven.
Looking forward to the ZFS-hosted cloud future.

~~~
23david
Funny thing is that if ZFS happens in Docker it'll be only because the
community demanded it over the opposition of the core Docker maintainers.
AFAICT, AUFS is being dropped due to massive opposition from Red Hat.
(Apparently AUFS code is too horrible to merge/support???) Despite its spotty
track record, BTRFS is the only current recommended/supported CoW file system,
and OverlayFS looks to be the Docker-blessed next-gen AUFS alternative.

Based on past experiences, I don't see Docker-blessed ZFS support coming
anytime soon, but I hope I'm wrong. Maybe someone over at Joyent or Oracle can
grease the wheels here? :-)

------
preillyme
The out-of-space / metadata balancing problem has bitten me more times than I
care to count. It's essentially a fact of life that I have to blow away
/var/lib/docker and all its subvolumes every few weeks on any given machine,
to clear an out-of-space problem.

~~~
Jonanin
Copying a sentence from the first reply on the list?

