
Ext4: fix data corruption caused by unwritten and delayed extents - alrs
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg886512.html
======
tux3
This is probably not as serious as the title implies, and apparently not
related to the recent reports of EXT4 corruptions. Quoting [0] Theodore Ts'o:

>So it's pretty hard to hit this bug by accident

>It requires the combination of (a) writing to a portion of a file that was
not previously allocated using buffered I/O, (b) an fallocate of a region of
the file which is a superset of region written in (a) before it has chance to
be written to disk, (c) waiting for the file data in (a) to be written out to
disk (either via fsync or via the writeback daemons), and then (d) before the
extent status cache gets pushed out of memory, another random write to a
portion of the file covered by (a) -- in which case that specific portion of
(a) could be replaced by all zeros.

[0]
[http://thread.gmane.org/gmane.linux.kernel/1956583](http://thread.gmane.org/gmane.linux.kernel/1956583)
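
For anyone who wants to see the sequence (a)-(d) from the quote as actual system calls, here is a minimal sketch. It is not a reliable reproducer: whether the bug triggers depends on the kernel version, writeback timing, and the state of the extent status cache, and the final read here is served from the page cache, so it would not show on-disk corruption. The function name `run_sequence`, the `BLOCK` size, and all offsets are illustrative assumptions, and `os.posix_fallocate` stands in for the fallocate call mentioned in the quote.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size, for illustration only


def run_sequence(path):
    """Replay steps (a)-(d) on `path` and return (expected, actual) bytes."""
    first = b"A" * BLOCK
    second = b"B" * 512
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # (a) buffered write into a region with no blocks allocated yet
        os.pwrite(fd, first, BLOCK)
        # (b) fallocate a superset of that region before writeback happens
        os.posix_fallocate(fd, 0, 4 * BLOCK)
        # (c) force the data from (a) out to disk
        os.fsync(fd)
        # (d) overwrite a portion of the region written in (a)
        os.pwrite(fd, second, BLOCK + 1024)
        os.fsync(fd)
        # Read back the block written in (a), now partly overwritten by (d)
        actual = os.pread(fd, BLOCK, BLOCK)
    finally:
        os.close(fd)
    expected = first[:1024] + second + first[1024 + 512:]
    return expected, actual


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        expected, actual = run_sequence(os.path.join(d, "testfile"))
        print("intact" if expected == actual else "mismatch")
```

To check what actually reached the disk, you would have to get the file's pages out of the page cache before reading it back; that part is left out here.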

~~~
digi_owl
Reminds me of corruption issues that came from HDD controllers reshuffling
writes so that the logs and the actual data got desynced. Only really noticed
if you got a power failure and hit the drive with a fsck afterwards. The fix
was to toggle on write barriers.

------
istvan__
All of my production systems have used XFS for the last 5-10 years. I never
liked the idea that we have 5 filesystems on Linux and you have to know which
is better for your use case, while other Unix-like systems offer you maybe 2
options, with one that is supported and recommended (like ZFS on Solaris). I
think ext4 does not offer much over ext3 for most users, and given its history
(lots of data loss scenarios) I would not want to risk using it in production.
On top of that, the recent move by CentOS to XFS as the default FS just makes
me think I made the right decision 10 years ago to use XFS for almost
everything I do.

Ted Ts'o on ext4:

> P.S. It's bugs like these which is why I'm always amused by people
> who think that just because a file system is safely being used by
> their developers, that it's safe to throw production workloads on
> them.

~~~
kasabali
XFS doesn't have an impressive track record when it comes to data loss,
either. Come on people, every filesystem with a large number of users has had
its screw-ups.

~~~
istvan__
Well, I am not sure about that. As anecdotal evidence, most systems engineers
I know use XFS as the default on their systems. This might be a coincidence,
but I think it is a sign that XFS has had fewer "screw-ups" than ext3/4. A
representative survey would help to prove it.

~~~
AceJohnny2
Well, if we're going by anecdotal evidence, I've had a big loss when an XFS-
on-RAID system had a power failure about 4 years ago, and recently I know
someone who's lost hundreds of gigs to XFS as well, with no power loss that
he's aware of.

So there.

------
ComputerGuru
I don't know how or why ext4 took off the way it did. ext2/3fs' dominance was
somewhat understandable, it was the dark ages of filesystem architecture (akin
to the dark ages of cryptography), and no one used math or science to
definitively say something was better than another.

But in the age of ZFS, filesystems like XFS and JFS have proven themselves
worthy to remain in the ring as non-versioning, simple filesystems. Their
resilience (now, despite any known issues in the past) is beyond reproach, and
their speed and performance (comparing apples to apples, not journaled to
non-journaled) are where they should be.

Why does _any_ OS/distro use extfs by default for new installations? I can
understand ancient machines that upgrade from ext3 to ext4 to avoid migrating
data to a new filesystem/mountpoint, but for new installations?

~~~
JoshTriplett
Two notable reasons I know of.

First, people _like_ boring in filesystems, and ext4 was seen as a natural and
less risky successor to ext3, which most people ran. And I think that's a
reasonable perception.

Second, ext4 went out of its way to solve some of the system-level issues that
tended to cause problems, before it went into production. For instance, the
ext4 developers introduced hacks to deal with software that handles write
atomicity incorrectly, making it much less likely that you'd end up with a
zero-length file if you cut power or crashed at the wrong time. XFS and JFS
had similar issues that actually hit users, breaking invalid-but-widespread
assumptions about filesystem semantics. ext4 at first considered having
similar semantics, but instead worked around how many programs actually wrote
files, which made it safer in practice.

(Also, "in the age of ZFS"? ZFS is by no means the dominant filesystem, or
even an obvious contender for that title; it has a few popular features, but
it's hamstrung by its choice of license.)

~~~
kbenson
Also, you could upgrade ext3 to ext4 in-place. That was a powerful feature for
a distro that had ext3 as the default, or for companies that had a lot of data
on ext3 volumes.

------
yuvadam
Similar thread on Arch BBS
[https://bbs.archlinux.org/viewtopic.php?id=197400](https://bbs.archlinux.org/viewtopic.php?id=197400)

------
DangerousPie
I am running Debian 3.2.65 with ext4. How worried do I need to be about this?

I assume/hope that if this was a common occurrence it would have been found
much earlier...

Update: Having reread the report, it appears that this may only be a problem
in the new 4.0 kernel?

Update2: Nope, not just 4.0 kernels (see below).

~~~
kasabali
> Update: Having reread the report, it appears that this may only be a problem
> in the new 4.0 kernel?

No, the reporter uses a 4.0 kernel, which is why the bug report lists that
version. I've looked at the commit message for the fix, but it doesn't mention
any regression, so this bug may have existed for a long time.

------
feld
"At this point, there is no way to get rid of the delayed extents, because
there are no delayed buffers to write out. So when a we write into said
unwritten extent we will convert it to written, but it still remains delayed."

Ok, so I think what's happening here is

    
    
      1) Write data
      2) Busy filesystem or something makes the write of the extent delayed
      3) An update to that data attempts to get written
      4) The delayed extent gets replaced with new data, but then marked as
         written here instead of when it hits the platters
      5) Now data loss has happened because there are no delayed buffers
    

I might be wrong with this breakdown. It's hard to tell. It seems that the
management of delayed extents is not as robust as it should be.

[http://www.spinics.net/lists/linux-ext4/msg47782.html](http://www.spinics.net/lists/linux-ext4/msg47782.html)

edit: this type of problem wouldn't affect a COW filesystem because when you
overwrite existing data you aren't actually overwriting it, you're writing new
data and updating the map of where to find the entire file contents.
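
To make the copy-on-write idea concrete, here is a toy in-memory model, not how any real filesystem implements it. The class `CowFile` and its methods are hypothetical names made up for this sketch. An overwrite allocates a new block and then switches the file's block map over in one step, so the old blocks and the old map stay intact until the switch.

```python
class CowFile:
    """Toy copy-on-write file: stored blocks are never modified in place."""

    def __init__(self):
        self.store = {}      # block id -> bytes, written once
        self.block_map = []  # file block index -> block id
        self._next_id = 0

    def _alloc(self, data):
        # Always write data to a fresh block id.
        block_id = self._next_id
        self._next_id += 1
        self.store[block_id] = data
        return block_id

    def append(self, data):
        # Build a new map rather than mutating the current one.
        self.block_map = self.block_map + [self._alloc(data)]

    def overwrite(self, index, data):
        # Write the new data elsewhere, then switch to a new map.
        new_map = list(self.block_map)
        new_map[index] = self._alloc(data)
        old_map = self.block_map
        self.block_map = new_map
        return old_map  # the previous version is still fully readable

    def read(self, block_map=None):
        current = self.block_map if block_map is None else block_map
        return b"".join(self.store[i] for i in current)


f = CowFile()
f.append(b"aaaa")
f.append(b"bbbb")
old = f.overwrite(0, b"cccc")
print(f.read())     # contents after the overwrite
print(f.read(old))  # contents as seen through the old map
```

If a crash interrupted `overwrite` before the map switch, a reader would still see the old, consistent contents, which is the property the comment above is pointing at.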

------
cjbprime
As far as I can see, the "critical" label here is coming from a Debian user
submitting a Debian bug report, not from the Debian kernel team or upstream
kernel maintainers.

I think HN submissions should have some standards for stories that will create
the appearance of "there is a critical problem with software X that lots of
people use, and everyone should panic and upgrade" announcements. Having the
authors of software X involved in the announcement seems like it would be a
good place to start.

(I'm not saying that Josh is mistaken about having been hit by this bug, or
that he was involved in the decision to submit this to HN.)

~~~
dang
> HN submissions should have some standards for stories that will create the
> appearance

HN has such a standard: the guidelines ask for titles not to be misleading or
linkbait. (Not an opinion on the current bug.)

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

~~~
cjbprime
Ah, okay. I think the title is misleading and should be changed.

~~~
dang
Is it (and the url) ok now?

------
tlb
Better explanation and patch:
[https://www.mail-archive.com/linux-kernel@vger.kernel.org/ms...](https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg886512.html)

~~~
dang
Ok, we changed the url to that from
[https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=785672](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=785672).
Thanks.

