
Linux Storage, Filesystem, and Memory-Management Summit - l2dy
https://lwn.net/Articles/lsfmm2019/
======
corbet
If you appreciate this kind of reporting, please consider subscribing to LWN.
Subscriber support is the only thing that allows us to do this kind of work.

------
philips
Favorite quote from the articles so far about BPF:

Gregg started with a demonstration tool that he had just written: its
immediate manifestation was the creation of a high-pitched tone that varied
in frequency as he walked around the lectern. It was, it turns out, a BPF-
based tool that extracts the signal strength of the laptop's WiFi connection
from the kernel and creates a noise in response. As he interfered with that
signal with his body, the strength (and thus the pitch of the tone) varied. By
tethering the laptop to his phone, he used the tool to measure how close he
was to the laptop. It may not be the most practical tool, but it did
demonstrate how BPF can be used to do unexpected things.
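The quote doesn't say how Gregg's tool was implemented beyond "BPF-based". Purely as an illustration of the idea (not his actual code), the same effect can be sketched in plain Python by reading the kernel's per-interface wireless statistics from `/proc/net/wireless` and mapping link quality onto a tone frequency; the interface name and frequency range here are made-up examples:

```python
# Hypothetical sketch of the demo's idea: poll the kernel's wireless stats
# and turn signal strength into a pitch. The real tool used BPF; this just
# parses /proc/net/wireless, which exposes similar numbers.

def parse_wireless(text, iface="wlan0"):
    """Return the link-quality value for `iface` from /proc/net/wireless output."""
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == iface + ":":
            return float(parts[2].rstrip("."))  # third column is link quality
    return None

def quality_to_hz(quality, lo=200.0, hi=2000.0, max_quality=70.0):
    """Map link quality (0..max_quality) onto an audible frequency range."""
    frac = max(0.0, min(quality / max_quality, 1.0))
    return lo + frac * (hi - lo)

# Sample /proc/net/wireless contents (format as documented in the kernel):
sample = """\
Inter-| sta-|   Quality        |   Discarded packets
 face | tus | link level noise |  nwid  crypt   frag
 wlan0: 0000   54.  -56.  -256        0      0      0
"""

q = parse_wireless(sample)
print(q, quality_to_hz(q))
```

A real version would re-read the file in a loop and feed the frequency to an audio library; walking between the laptop and the access point would then change the pitch, as in the demo.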

~~~
bionsystem
Brendan Gregg is also the guy who shouted at a rack in a datacenter to prove
that some hard drives were performance-sensitive to vibrations. The video
is on YouTube and it's hilarious.

------
gumby
I was surprised to see discussion of NFS. NFS certainly was a big deal "back
in the day" but it had its own quirks and headaches. I haven't seen NFS in 20
years, but that could simply be because of the particular worlds I live in.

Is it still widely used and I just happen never to see it because of the
environments in which I work?

Or is it only used for a small number of sites (or certain applications) but
they happen to be extremely important ones?

~~~
happythought
VMware supports NFS as a storage backend so it ends up being used in a lot of
storage arrays in Enterprise shops.

Gitlab.com was also using it as of last year, but I’m not sure if that’s still
the case.

~~~
muxator
Yes, it has now been replaced by Gitaly:
[https://about.gitlab.com/2018/09/12/the-road-to-gitaly-1-0/](https://about.gitlab.com/2018/09/12/the-road-to-gitaly-1-0/)

~~~
happythought
From the drawing, it looks like gitaly decouples git storage from the workers,
but still uses NFS on the backend.

------
topspin
Btrfs isn't even on the agenda.

~~~
foobiekr
Is btrfs as a project even healthy? A startup I worked at looked hard at btrfs
since the COW model and better integrity verification were both very, very
useful from our POV. But btrfs in practice was simply unsuitable: not
only were there basic reliability issues (corrupted filesystems) and really
bad corner-case behavior (full filesystems in particular), we noted that it was
not actually fully endian-neutral, at least at the time, which caused
filesystems shared between x86 and a BE ISA to appear to work, and then to be
horribly corrupted when taken from a BE embedded system and mounted on x86.

Given the volume of btrfs negative experiences that are going to need to be
overcome (for example, many of the posts in [1]), maybe people have just given
up? If ZoL licensing wasn't a problem, would anyone even be interested?

[1] [https://news.ycombinator.com/item?id=15087754](https://news.ycombinator.com/item?id=15087754)

~~~
LeoPanthera
I routinely hear about how btrfs is unreliable, unstable, and corrupts your
data.

But it's also been the default filesystem for SUSE Linux and Synology NAS
products for a long time now, and they don't seem to be having any problems.

I don't know what to believe.

~~~
Mister_Snuggles
Anecdotally, I've found btrfs unreliable. Here is a comment I made a week ago:

\----------------------------

Personally, BTRFS feels like it has a ways to go before it's ready for
prime time.

I've had two major and one minor BTRFS-related issues that have scared me away
from it.

1) One of my computers got its BTRFS filesystem into a state where it would
hang when trying to mount read/write. What I suspect is that there was some
filesystem thing happening in the background when I rebooted the machine. I
rebooted via the GUI and there was no sign that something was happening in the
background, so this was really a normal thing that a user would do. No amount
of fixing was able to get it back, but I was able to boot from the
installation media, mount it read-only, and copy the data elsewhere.

2) Virtually all of the Linux servers at work would randomly hang for between
minutes and hours. This was eventually traced to a BTRFS-scrub process that
the OS vendor schedules to run weekly. The length and impact of the hang
seemed to be based on how much write activity happens - servers where all the
heavy activity happens on NFS mounts saw no impact, but servers that write a
lot of logs to the local filesystem would get severely crippled on a weekly
basis. We've moved a bunch of our write-heavy filesystems to non-BTRFS options
as a result of this.

3) This is a more minor issue, but still speaks to my experience. I had a VM
that was basically a remote desktop for me to use. Generally speaking it would
hang hard after a few days of uptime with no actual usage. When I reinstalled
it on a non-BTRFS (sorry, can't remember which filesystem I used) filesystem
it was rock solid. I have no proof that this had anything to do with BTRFS.

All of these were things that happened around a year ago, they may not be a
true representation of the current state of BTRFS. But they've burned me
badly, so now any use of BTRFS will be evaluated very carefully.

In contrast, I've been running ZFS on a couple of FreeBSD servers, with fairly
write-heavy loads, and have had no issues that were filesystem-related. Even
under severe space and memory constraints ZFS has been rock solid.

\----------------------------

The first problem is directly attributable to BTRFS. There is no way a
filesystem should get corrupted by a simple user-initiated reboot, regardless
of what the system is doing in the background.

The second problem is a combination of BTRFS and the distribution. The
distribution added a weekly job which did a BTRFS scrub (IIRC), under certain
workloads that would completely hang machines for minutes to hours. The time
this ran seems to be based on when the OS was installed, so as luck would have
it these brought production systems down during business hours.
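For anyone hitting the same stalls, the standard btrfs-progs tooling offers a couple of levers. The commands below are real `btrfs scrub` subcommands and flags; the systemd override path and unit name are examples only, since the vendor's actual job name isn't given above:

```shell
# See whether a scrub is running right now, and stop it if it is:
btrfs scrub status /
btrfs scrub cancel /

# Run the scrub in the idle I/O scheduling class (-c 3) so foreground
# writes get priority; -B keeps the command in the foreground until done:
btrfs scrub start -B -c 3 /

# On a systemd-based distro, a timer override can pin the weekly job to a
# quiet window instead of "whenever the OS was installed", e.g. in
# /etc/systemd/system/btrfs-scrub.timer.d/override.conf (unit name is
# hypothetical; check your vendor's actual timer):
#   [Timer]
#   OnCalendar=
#   OnCalendar=Sun *-*-* 03:00:00
```

Whether the idle I/O class actually prevents the multi-minute hangs depends on the workload; it limits scrub's I/O priority, not the locking behavior inside the filesystem.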

The third problem is something I have no idea about. It could be BTRFS, it
could be something completely different, I honestly have no idea.

~~~
bscphil
>One of my computers got its BTRFS filesystem into a state where it would hang
when trying to mount read/write. What I suspect is that there was some
filesystem thing happening in the background when I rebooted the machine.

>The first problem is directly attributable to BTRFS. There is no way a
filesystem should get corrupted by a simple user-initiated reboot, regardless
of what the system is doing in the background.

It doesn't seem like you have any direct evidence that this has something to
do with rebooting. In fact, Btrfs is much less susceptible to these sorts of
problems than previous file systems. That's because writes in Btrfs are
atomic, so a file is never "partially" written to disk. Either it's written,
or it isn't. You can't get disk corruption from write failures.
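The atomicity described here is the same "write the new version, then switch over" idea that applications use in userspace. As an analogy only (this is not how btrfs is implemented internally, just the classic write-to-temp-then-rename pattern), a crash at any point leaves either the old contents or the new, never a mix:

```python
# Application-level analogy for copy-on-write atomicity: build the new
# version off to the side, then switch to it in one atomic step.
import os
import tempfile

def atomic_write(path, data):
    """Replace `path` with `data`; a crash leaves old or new, never a mix."""
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # new data is durable before the switch
        os.replace(tmp, path)      # the atomic "pointer swap"
    except BaseException:
        os.unlink(tmp)
        raise

atomic_write("demo.txt", b"new contents")
print(open("demo.txt", "rb").read())
```

In btrfs the equivalent switch is updating tree pointers to the newly written blocks, which is why a torn in-flight write shouldn't corrupt the filesystem the way it could on an overwrite-in-place design.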

What I suspect might have happened is that you rebooted the system in the
middle of a bootloader or kernel update. I don't know if you
tried mounting the partition r/w from another system or not, but assuming you
didn't, it's probably more likely that something broke on your system that
prevented it from remounting the partition during the boot process.

~~~
Mister_Snuggles
I know that updates were not installing when I rebooted. It may have been
checking for updates, but it definitely wasn't installing them.

To recover, I booted from an installer image on a USB drive and tried to mount
the partition R/W and it hung. Since the installer is a known-good
environment, this rules out breaking my system the way you describe. This was
a corrupt BTRFS filesystem.

One possibility is that because I was using a rolling-release distribution, I
may have gotten a version of BTRFS with a bug in it or minor changes in the
BTRFS code as the system got updated eventually got the filesystem into a
state that rendered it unable to mount R/W. Neither scenario inspires
confidence.

------
htfy96
> System observability with BPF

Initially I thought this was about the recent Spectre vulnerability variants
related to BPF. Then I found it is actually discussed in "BPF: what's good,
what's coming, and what's needed"
([https://lwn.net/Articles/787856/](https://lwn.net/Articles/787856/))

------
navidr
Are videos of this conference available, or are they going to be?

~~~
corbet
No video recording was done, so no.

