Saying that a checksum is weak in a cryptographic sense would mean that it is po...

ansible · on Sept 12, 2014

This is not what I meant when I said that CRC was weak. Instead, I meant that it is trivial for ordinary glitches to corrupt data in ways that the checksums fail to detect. Things byte swaps and non-adjacent double bit flips are all that is necessary. The link I provided elaborates on this ...

Since we're talking about data corruption that has managed to not be detected by the hard drive's own firmware (a read error not detected as a read error), I now wonder what sorts of data errors might also slip by the 32-bit CRC used by btrfs.

It would be hilarious (in a bad way) if the same algorithm was used at both levels. So that the same byte swap (for example) would exactly bypass both the drive firmware and the check done by btrfs.

If the algorithms are different though, then I'd expect the chances are reduced that a particular error can slip by both.

db48x · on Sept 15, 2014

It's pretty scary; data on ordinary filesystems just evaporates over time. Undetected bit-flips every few terabytes from the drive itself, flaky cables and controllers, bit flips in your non-ecc ram, ghost writes, misaligned reads, firmware bugs, cheap port multipliers, etc, etc.

I've had a zfs filesystem for a few years now and twice it's detected and corrected an error that would have been silent data corruption in a lesser filesystem.

colin_mccabe · on Sept 26, 2014

Serious servers use ECC RAM so bit-flips in memory are not an issue. CRC32 is also pretty robust against single-bit errors. If you think about it, you can't really make any guarantees for non-ECC systems. Flip the wrong bit, and the kernel will delete all your data. There is an if statement somewhere waiting to make this happen, and if the program code can randomly change, eventually it will.

Typical disk corruption patterns are a whole sector of zeroes being read, or reading data that belonged somewhere else. If you want to prove that ZFS's checksum is better, you need to prove that it's better against the patterns of corruption that actually occur.

colin_mccabe · on Sept 26, 2014

As for patent grants, you are thinking of the GPLv3. The GPLv2 does not provide any patent grant.

This is incorrect. See http://en.swpat.org/wiki/GPLv2_and_patents