The recent FreeBSD 14 release apparently failed to build for one of the platforms because a file "somehow ended up being full of NUL bytes" [0]. I wonder if that's due to this bug? (Could be just a coincidence, of course.)
OpenZFS 2.1.4 was included in FreeBSD 13.1, according to the release notes.
freebsd-update fetch install is part of the normal upgrade process
If the man page and literally every other piece of documentation about upgrading is correct: no, no it's not. If the intent of freebsd-update were only to allow migration from the latest patch level, it would fail instead of corrupting its state, no?
The only caveat provided by the freebsd-update man page is the admonishment to read the release notes for special instructions, yet the release notes completely fail to mention this footgun.
Or is your assertion that intuition should take the place of documentation?
When I upgraded, I was sure it told me to be on the latest patch level before using this. I'll check where I read that; I don't think it was in the handbook, but elsewhere on the site.
> find potentially corrupted files (with some risk of false positives) by searching for zero-byte blocks
I believe that's not the best method, as it would require carefully defining how many null-byte blocks (and of what size) count as a sign of corruption.
Another reason is that the frequency of corruption was shown to vary during tests depending on the IO load: the observed zero-byte blocks may have been caused by the artificial scenario used to intentionally reproduce the bug. Natural occurrences of this bug may have a different pattern.
You should use ctime+checksums from old backups instead.
I have 18 months of backups, and could go back earlier if needed but accessing and processing each backup will take time. I don't want to do that multiple times.
Anything you can mount, using any filesystem that keeps this metadata, should be usable as an input: anything from NTFS to ZFS snapshots. It may even be possible to use ZIP files (which keep date and time) to feed a metadata SQLite database.
Comparing the metadata DB to the actual ZFS filesystem would give you a list of suspicious files, and the most recent backup you could use to restore them.
Alternatively, it could be possible to deduplicate all the backups while gathering the metadata and then keep local copies of all the files, but that adds complexity and storage requirements.
If you are in a rush, write a script like that, but given that the null-byte detection approach now seems flawed, you may have to rewrite it as we learn more details.
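As a rough sketch of what that ingest step could look like (a flat tab-separated manifest instead of SQLite for brevity; the mount point and file names are placeholders, and GNU stat/sha256sum plus bash are assumed):

# Hypothetical sketch: record ctime + sha256 + path for every file in one mounted backup.
# Needs bash (read -d '') and GNU coreutils (stat -c, sha256sum).
BACKUP_MOUNT=/mnt/backup-2022-06   # placeholder: one mounted backup (NTFS, ZFS snapshot, ...)
MANIFEST=manifest-2022-06.tsv      # placeholder output name
find "$BACKUP_MOUNT" -type f -print0 |
  while IFS= read -r -d '' f; do
    printf '%s\t%s\t%s\n' \
      "$(stat -c %Z "$f")" \
      "$(sha256sum "$f" | cut -d' ' -f1)" \
      "${f#"$BACKUP_MOUNT"/}"
  done > "$MANIFEST"

One manifest per backup is enough to later compare ctime/checksum pairs against the live pool or against a newer backup.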
Also, there's no real fix for this bug yet that doesn't introduce other problems. While we're still learning the ins and outs of this >18-year-old bug, I recommend sticking with a 2.1 version of ZFS with zfs_dmu_offset_next_sync=0, and betting on the low probability of corruption that allowed this bug to persist for such a long time.
Of course, keep your cold backups (don't delete them!), but they can't accumulate corruption if you don't access them.
It's a very bad bug, so it is important to note there's an apparently-effective mitigation[1], which doesn't make it impossible to hit, but reduces the risk of it in the real world by a lot.
# as root, or equivalent:
echo 0 > /sys/module/zfs/parameters/zfs_dmu_offset_next_sync
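# to verify (should print 0 after the change):
cat /sys/module/zfs/parameters/zfs_dmu_offset_next_sync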
Before making this change, I was able to easily reproduce this bug on all of my ZFS filesystems (using the reproducer.sh[2] script from the bug thread). After making this change, I could no longer reproduce the bug at all.
It seems that the bug is relatively rare (although still very bad if it happens) in most real-world scenarios; one user doing a heuristic scan to look for corrupted files found 0.00027% of their files (7 out of ~2,500,000) were likely affected[3].
The mitigation above (disabling zfs_dmu_offset_next_sync) seems to reduce those odds of the bug happening significantly. Almost everybody reports they can no longer reproduce the bug after changing it.
Note that changing the setting doesn't persist across reboots, so you have to automate that somehow (e.g. editing /etc/modprobe.d/zfs.conf or whatever the normal way to control the setting is on your system; the GitHub thread has info for how to do it on Mac and Windows).
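On Linux, a persistent version of the change above could look like this (file name per the usual modprobe.d convention; other platforms have their own mechanism, as the thread describes):

# /etc/modprobe.d/zfs.conf -- applied when the zfs module is loaded
options zfs zfs_dmu_offset_next_sync=0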
What makes this a spectacularly bad bug is that it not only corrupts data, but that it may be impossible to know whether your existing files were hit by it at some point during its long existence. There is active discussion in the bug thread about what kind of heuristics can be used to find "this file was probably corrupted by the bug", but there's no way (so far, and probably ever) to tell for sure (short of having some known-good copy of the file elsewhere to compare it to).
Which makes the above mitigation all the more important while we wait for the fix!
(>_<) Oh man, I knew about [0] when I posted (which is why I said it just reduces the chance of hitting the bug (by a lot)). But after spending all Saturday JST on it, I went to bed before [1] was posted.
Skimming through #6958 though, it seems like it's the lesser of evils, compared to #15526... I think? It's less obvious (to me) what the impact of #6958 is. Is it silent undetectable corruption of your precious data potentially over years, or more likely to cause a crash or runtime error?
I have to read more. But since the zfs_dmu_offset_next_sync setting was disabled by default until recently, I still suspect (but yeah, don't know for sure) that disabling it is the safest thing we can currently do on unmodified ZFS systems.
> It seems that the bug is relatively rare (although still very bad if it happens) in most real-world scenarios; one user doing a heuristic scan to look for corrupted files found 0.00027% of their files (7 out of ~2,500,000) were likely affected.
I'm running the following script to detect corruption.[0] The two files I've found so far seem like false positives.
Yeah, I am running a similar script I got from the GitHub bug thread; so far I have not found any suspected-corrupt files at all, except in the files generated by the reproducer.sh script, e.g.:
Possible corruption in /mnt/slow/reproducer_146495_720
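For reference, the checks in those scripts boil down to something like the sketch below (this is not the actual script from the thread; the 128 KiB block size and the scan root are assumptions, and sparse or legitimately zero-filled files will show up as false positives):

# Flag non-empty files whose first 128 KiB (default recordsize) is all NUL bytes.
find /mnt/pool -type f -size +0c -print0 |
  while IFS= read -r -d '' f; do
    nonzero=$(head -c 131072 "$f" | tr -d '\000' | wc -c)
    if [ "$nonzero" -eq 0 ]; then
      echo "Possible corruption in $f"
    fi
  done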
UPDATE: There is now a very good, simple explanation of the bug, and of how it became much more likely to happen recently even though it has existed for a very long time:
Thanks, I had seen the title with the cloning issue, and I even commented, but I thought it didn't apply to me as new features are never deployed until at least a few months have passed and other people have confirmed they work well.
I was only worried about the zfs send | zfs receive bug corrupting both pools.
I had been too lazy to check the details, so I had missed that the bug could be triggered even when you DON'T use block cloning at all but just copy files, and even on old versions like 2.1.4: block cloning only increases the probability of hitting it.
I only caught this issue through Reddit.
Now, thanks to your link, after going through all the GitHub comments I'm rereading all the comments from the HN thread from 2 days ago to decide how to analyze several months' worth of backups, some of them not on ZFS, but all of them sourced from ZFS and therefore now suspect for silent corruption.
Hopefully, this warning will stay long enough in the toplist for affected users to deploy the simple countermeasures, or at least preserve their backups until we know how to identify the affected files reliably enough (through the number of contiguous zeroes, repeating patterns inside the file, etc.).
It's "silent corruption" so a scrub would not detect it. This is the worst possible scenario which is why this is getting a lot of publicity in contrast to bugs in the past.
What's worse is that the bug has been in production for 18 months (since 2.1.4), so if you've ever put a file on a ZFS filesystem in the last year or so, you may be affected.
The only true way of knowing whether you are a victim of this is to look at every single file and compare it against a known-good copy that has never been on a ZFS filesystem.
Yes, it's extremely bad, and the title of the original submission from 2 days ago may not have caught your attention.
It should have been "If your version of ZFS is less than 18 months old, you may have silent data corruption".
I editorialized the git title as little as I could while still describing the essential problem.
> I have a filesystem that is not ZFS, but it is a backup of a ZFS filesystem. Am I screwed?
If it's a backup of a ZFS filesystem created with a version of OpenZFS released in the last 18 months, maybe.
That's because even if it's a 100% perfect backup, you can't know whether the backup contains files that were silently corrupted while they were on the original ZFS filesystem (unless you have copies of the files from before they were on ZFS).
The bug is deemed "unlikely" from 2.1.4 up to version 2.2, which introduced block cloning and increased the probability of the bug showing up, but you won't know whether 0% or 0.1% (or any other proportion) of your files are affected until after you compare checksums.
I'd suggest to wait until more is known. If this bug flew under the radar for so long, it should be rare.
> And within the last two weeks, I destroyed many of my oldest snapshots.
The goal of this repost is to save some trouble for people who might also delete old snapshots without understanding how they can be impacted by this bug, as the original title was very unclear.
Last time I checked, the discussion about how to best detect affected files was still going on.
IIRC, having a null-byte prefix may be caused by the special scripts used to artificially and forcefully trigger the bug. In naturally corrupted files, the position of null bytes may be more random than that.
New tests reveal the IO load may matter a lot: if you kept your IO low, you may have as few as 0.01% corrupted files.
All this brings me back to my original idea: the best solution may be using metadata from old backups (ctime, checksum) and looking for discontinuities (same ctime, different checksum) like an IDS would, as checking for null bytes would require too much calibration.
This method will require gathering this metadata from old backups to figure out which file got corrupted when, and which backup is the best one to restore it from.
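Staying with the backup-manifest idea (a rough sketch; the manifest file names and the tab-separated ctime/checksum/path layout are assumptions matching the ingest sketch further up), the comparison step could look like:

# Print paths whose ctime is identical in two manifests but whose checksum differs
# -- the "same ctime, different checksum" discontinuity described above.
export LC_ALL=C            # keep sort and join ordering consistent
TAB=$(printf '\t')
sort -t "$TAB" -k3 manifest-old.tsv > old.sorted
sort -t "$TAB" -k3 manifest-new.tsv > new.sorted
join -t "$TAB" -1 3 -2 3 old.sorted new.sorted |
  awk -F "$TAB" '$2 == $4 && $3 != $5 { print $1 }'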
Yeah, right now it's actually not known whether the roots of this bug date back even to the Sun days. And because Oracle ZFS is not open source, we can't know whether this or other bugs are lurking there.
It seems to require certain extremely specific situations to trigger it, like using copy_file_range (the new feature that exposed it), or writing a file and then punching a larger-than-recordsize hole in that file in the same transaction group.
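(For anyone unfamiliar with the term, "punching a hole" means deallocating a range inside an existing file; on Linux that's done with fallocate. This is an illustration of the operation only, with a placeholder file name, not a reproducer for the bug:)

# punch a 256 KiB hole at the start of somefile, i.e. larger than the
# default 128 KiB recordsize; --punch-hole implies --keep-size
fallocate --punch-hole --offset 0 --length 262144 somefile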
Another contributor who is watching this more closely informed me that the issue appears to predate Oracle’s acquisition of Sun. While this is bad, it at least suggests that this bug is very rare.
The code has never been formally verified, so there was always a possibility of such a bug existing. Without formal verification, it is possible that more such bugs will be found. I should add that there is no formally verified production ready storage stack, so not using ZFS would not eliminate the risk of hitting bugs like this. :/
Formal verification seems more appropriate for finished software not undergoing development or feature changes. It's the last step before software is set permanently in stone, unchanging, forever.
It's worth noting that copy_file_range is used by a lot of things. Most programming languages' "copy_file" functions use copy_file_range, everything from Rust to Emacs Lisp! The only language I can think of that doesn't use copy_file_range when copying files is Python.
On Gentoo, the portage package manager is written in Python but has some "native extensions"; one of these extensions copies files with copy_file_range, which, from my understanding, is used when merging images to the root filesystem.
Also, the GNU coreutils "cp" command uses it by default in recent releases; I'm not sure which release specifically introduced this change.
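A quick way to check whether your cp (or any other tool) actually issues the syscall, with placeholder file names:

# trace only copy_file_range; any output line means cp used it
strace -f -e trace=copy_file_range cp srcfile dstfile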
There are other things required to trigger the bug that are a lot less common though.
> It's worth noting that copy_file_range is used by a lot of things.
Yes, but the trigger feature, block cloning, only landed in the latest 2.2 release. If you immediately hopped on 2.2 and used a system with lots of copy_file_range and FICLONE use, then yes, you may have a problem (like, as you note, on Gentoo, where this problem surfaced).
Most people were just hopping on the bandwagon. My distro ships 2.1.5, so I have a 6-month wait until this feature lands; I was just building copy_file_range support into my ZFS apps right before news of this bug hit.[0]
> There are other things required to trigger the bug that are a lot less common though.
Exactly. My guess is the incidence of this will be exceedingly rare for the common user/small NAS user/etc. I've run a corruption detector[1], and what I've found mostly indicates false positives. Fingers crossed, but, so far, no actual positive matches on a system with probably a little less than 1 million files.
[0] https://www.daemonology.net/blog/2023-11-21-late-breaking-Fr...