
Don't use BTRFS for OLTP - mercurial
http://blog.pgaddict.com/posts/friends-dont-let-friends-use-btrfs-for-oltp
======
simoncion
Oh. He's using Linux 4.0? That's an _old_ kernel if you're running btrfs. I
hope that he re-tests with 4.2 [0] or 4.3. I _also_ hope that he posts his
benchmark settings so I can run it on my systems at home. :) I'm _very_
curious about the ENOSPC errors.

Also, the linux-btrfs mailing list thread about the rant is somewhat worth
reading, if you're into this sort of thing:
[http://thread.gmane.org/gmane.comp.file-systems.btrfs/48248](http://thread.gmane.org/gmane.comp.file-systems.btrfs/48248)

[0] There's a mv/rm deadlock that I can trigger from time to time that was
only fixed in 4.2 and later. (His description of the test stall doesn't
indicate that this deadlock is the cause of his problem, mind.) Happily, this
deadlock only blocks operations on the file being operated on, rather than the
whole FS. Also, a reboot clears up the issue. (An umount, then mount might
also clear up the issue, but btrfs is my rootfs, so I can't test that. :P)

~~~
semi-extrinsic
The author addresses this issue very eloquently: if your file system requires a
Linux kernel newer than 4.0 to get decent performance, it is very hard to
argue that it is "mature" and "production-ready".

~~~
simoncion
> The author addresses this issue very eloquently...

That would be one of the reasons why _I_ never made the claim that btrfs is
ready for general use in all situations. :) I mean, in my footnote I mention
that I'm running into a _deadlock_ triggered by perfectly ordinary filesystem
operations.

------
notacoward
I'm pretty sure that I was at a conference where Chris Mason explicitly
commented on the irony of working at a database company (Oracle) but designing
a file system that's fundamentally bad for database workloads. I think it was
a Linux Foundation event, maybe an EUS in NYC. In any case, it's pretty well
known that COW file systems are not a good fit for this kind of workload. A
more interesting question is whether this level of (in)stability and
(un)predictability is acceptable for _any_ workload other than scratch storage
(which doesn't benefit from snapshots and CRCs very much). My takeaway from
this article is basically that F2FS is worth another look.

~~~
pgaddict
While I'm obviously interested in benchmarks and performance (I'm author of
the blog post referenced here), I'm perfectly OK with sacrificing some of the
performance in exchange for advanced features provided by the filesystem.

For example built-in snapshotting, additional data integrity guarantees thanks
to checksums (e.g. resiliency to torn pages) etc. Because you either can't get
that with the traditional filesystems or it'll come at a cost (e.g. LVM adds
complexity and has impact on performance).
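As a concrete sketch of those features on btrfs (paths are hypothetical, and this assumes the data directory lives on its own subvolume):

```shell
# Atomic, read-only snapshot of a PostgreSQL data directory that sits
# on its own btrfs subvolume -- no LVM layer needed.
btrfs subvolume snapshot -r /var/lib/postgres/data /var/lib/postgres/snap-20151030

# Block checksums mean silent corruption is detected on read; a scrub
# verifies every block on the volume in the background.
btrfs scrub start /var/lib/postgres
```

Both commands are standard btrfs-progs; on ext4/XFS you'd need LVM (or storage-level snapshots) plus application-level checksums to get the equivalent.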

What I'm not quite OK with is getting very unstable performance - with OLTP
workloads you really want smooth behavior, not the jitter or random issues you
get with BTRFS. Especially when the other COW filesystems like ZFS perform so
much more sensibly.

I don't think comparing F2FS and BTRFS is entirely fair, though. Those are
filesystems with very different goals, F2FS is mostly designed to work with
single SSD devices (so no RAID-like stuff like BTRFS) and lacks many of the
advanced features (you can't even do snapshots).

Also, it was not my intention to say that BTRFS is somehow conceptually wrong
and unusable for database workloads. But the current state is not really
something I'd recommend for OLTP in production - that's what the rant is
essentially about.

~~~
notacoward
I also take a dim view of that performance instability, with any workload. In
fact, it's one of the points I address on the slides I just sent in for a
mini-tutorial on storage performance (for LISA'15 in case anyone wants to see
it). All I'm saying is that, even under the best of circumstances, I would
consider any COW file system a dubious choice for OLTP.

------
pella
wiki.archlinux.org / PostgreSQL + Btrfs

 _"Warning: If the database resides on a Btrfs file system, you should
consider disabling Copy-on-Write for the directory before creating any
database."_

[https://wiki.archlinux.org/index.php/PostgreSQL](https://wiki.archlinux.org/index.php/PostgreSQL)
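For reference, the usual way to follow that advice (the path is an example; note that the No_COW flag only applies to files created after it is set, so this has to happen before initdb):

```shell
# Disable copy-on-write for the (still empty) database directory.
# The flag is inherited only by files created afterwards, so set it
# before initdb populates the directory.
mkdir -p /var/lib/postgres/data
chattr +C /var/lib/postgres/data

# Verify: the directory's attribute listing should include 'C'.
lsattr -d /var/lib/postgres/data
```

The trade-off, as noted elsewhere in this thread, is that nodatacow also disables btrfs checksums and transparent compression for those files.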

~~~
simoncion
Fun fact: I have a multi-TB, largely-write-only Postgres 9.4 database on a
force-compress multi-device btrfs volume. Sadly, you _must_ enable CoW to use
transparent compression.

The performance is... not the best, and not _all_ of that can be blamed on
either my shitty choice of indexes _long_ ago, or my decision to use firewire
to attach the devices.

However, btrfs hasn't eaten any of my data, compression has given me ~2x the
space to work with, and btrfs handled the sudden and unexpected loss of a
device like a champ.
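For anyone wanting to try the same setup, forced compression is a mount option (device and mount point here are hypothetical; lzo and zlib were the available algorithms on kernels of this era):

```shell
# Mount with forced transparent compression. Unlike plain 'compress',
# compress-force compresses everything, even data btrfs's heuristic
# would judge incompressible -- and it requires CoW to stay enabled.
mount -o compress-force=lzo /dev/sdb1 /var/lib/postgres

# Or persistently via /etc/fstab:
# /dev/sdb1  /var/lib/postgres  btrfs  compress-force=lzo  0  0
```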

------
pella
Another benchmark from the same author (4 months ago):

Tomas Vondra: "PostgreSQL on EXT4, XFS, BTRFS and ZFS"

[http://www.slideshare.net/fuzzycz/postgresql-on-ext4-xfs-btrfs-and-zfs](http://www.slideshare.net/fuzzycz/postgresql-on-ext4-xfs-btrfs-and-zfs)

~~~
pgaddict
That's the first version of the benchmark; the new one has some minor
differences (newer kernel, a bit more sensible PostgreSQL configuration).

While the end results are mostly the same, the PostgreSQL changes (a large
increase of checkpoint_segments) significantly improved the COW case.
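For context, the kind of change meant here, in postgresql.conf (pre-9.5 syntax; the exact values used in the benchmark aren't stated, so these are illustrative):

```
checkpoint_segments = 64            # default was 3 -- far too low for OLTP
checkpoint_completion_target = 0.9  # spread checkpoint writes over time
```

Fewer, more spread-out checkpoints mean fewer write storms, which matters even more on a CoW filesystem where every rewritten page lands in a new location.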

