Could you please explain what you mean by a separate repair tool kit?

lproven · on Dec 7, 2023

I think he means an equivalent for `fsck`.

db48x · on Dec 7, 2023

`zpool scrub` is equivalent to `fsck`.

It just has a different name, and it can fix more kinds of corruption (including corrupted data, which `fsck` for other file systems can never do), and it can do so online. Of course zfs can fix those same errors online while doing ordinary reads, so really all `zpool scrub` does is read everything.

`zpool scrub` is better than `fsck`.

rincebrain · on Dec 7, 2023

It is not.

`zpool scrub` walks every block in the pool and, implicitly, verifies checksums and does other things along the way. That's it.

It's not doing any sort of logic bug repairs or cleanup or anything else.

It's also not checking that you can, say, decrypt things, since that would mean you needed the keys to scrub.
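As a rough illustration of that description, here is a toy model of a scrub pass. It is not ZFS code: `Block`, `checksum`, and `scrub` are hypothetical names, SHA-256 stands in for whatever checksum the pool uses, and redundancy is modeled as a plain list of copies. It only shows the shape of the idea: walk every block, compare each copy against the stored checksum, and rewrite a bad copy from a good one when one exists.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


def checksum(data: bytes) -> str:
    """Stand-in checksum (assumption: SHA-256; real pools may use others)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Block:
    expected: str        # checksum recorded by whatever points at this block
    copies: List[bytes]  # redundant copies, e.g. the two sides of a mirror


def scrub(blocks: List[Block]) -> Dict[str, int]:
    """Walk every block, verify each copy, and heal bad copies from a good one."""
    stats = {"repaired": 0, "unrecoverable": 0}
    for block in blocks:
        # Find any copy whose contents still match the recorded checksum.
        good = next((c for c in block.copies if checksum(c) == block.expected), None)
        if good is None:
            # Every copy is damaged: nothing to repair from.
            stats["unrecoverable"] += 1
            continue
        for i, copy in enumerate(block.copies):
            if checksum(copy) != block.expected:
                block.copies[i] = good
                stats["repaired"] += 1
    return stats
```

Note what the sketch never does: it has no notion of whether the blocks describe a sensible filesystem. A block whose contents are logically wrong but whose checksum matches passes untouched.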

db48x · on Dec 8, 2023

It really is better. The key difference to understand is that in ext4 any form of metadata corruption which can be automatically fixed requires you to take the filesystem offline (to unmount it), and then run fsck. Meanwhile with zfs, any form of metadata corruption which can automatically be fixed is simply fixed right on the spot, transparently.

In truth fsck is a wart, a kludge, a bag on the side of the file system design.

rincebrain · on Dec 8, 2023

I am passingly acquainted with ZFS.

The key thing to realize is that you can't, actually, automatically fix every problem. Sometimes you have found a logic problem which results in an impossible outcome, and you need someone to manually clean it up.

In a world without flaws, it would be great to never need that. But the thing about theory and practice is that in theory, they never differ, but in practice...

db48x · on Dec 8, 2023

I know that. But fsck only fixes the ones that it can fix automatically. With anything else, you are completely on your own. Get a hex editor and go to town. With zfs, if there is some kind of problem that cannot be automatically fixed then at least you have one more tool available: zdb. It’s a _debugger_ for zfs filesystems. It will show you everything, more than you ever wanted to know. It is way better for fixing problems than any hex editor.

rincebrain · on Dec 8, 2023

zdb is read-only. It's not fixing anything, just telling you what's going on.

db48x · on Dec 8, 2023

Don’t be an idiot. You can fix more with zdb and a hex editor than you can with the hex editor alone.

rincebrain · on Dec 8, 2023

That seems rather rude.

The discussion was about the need for tools to make it easier to handle cases where you couldn't automatically handle repairing them, and your statement was that zdb is very useful, which is true, but it doesn't fix anything.

fsck for various filesystems has a bunch of common cases like "this is an orphaned file, should I save it or mark it free?", and something similar would indeed be useful for a number of failure cases in ZFS which require more explicit instruction on what to do about it because you can't easily automatically resolve it.

`zpool scrub` is very useful, but ZFS could still benefit from automated tooling to handle some common failure modes, not just let you write bespoke tooling every time.
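For readers unfamiliar with the orphaned-file case mentioned above, here is a minimal sketch of how such a check can work in principle. It is a toy model, not how any particular `fsck` is implemented: inodes are plain integers, `directories` is a hypothetical mapping from a directory's inode to the inodes it lists, and `find_orphans` is an illustrative name. An orphan is simply an inode that is marked allocated but cannot be reached from the root directory.

```python
from typing import Dict, List, Set


def find_orphans(
    allocated_inodes: Set[int],
    directories: Dict[int, List[int]],
    root: int,
) -> Set[int]:
    """Return allocated inodes that no directory entry leads to."""
    reachable: Set[int] = set()
    stack = [root]
    while stack:
        ino = stack.pop()
        if ino in reachable:
            continue  # already visited (guards against cycles)
        reachable.add(ino)
        # Only directories have entries; plain files contribute nothing.
        stack.extend(directories.get(ino, []))
    return allocated_inodes - reachable
```

The tool can find such inodes mechanically, but deciding whether to keep or free each one is exactly the part that needs a human, or a policy.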

db48x · on Dec 8, 2023

> fsck for various filesystems has a bunch of common cases like "this is an orphaned file, should I save it or mark it free?"

This is an example of an automated fix that zfs just handles transparently, without needing to prompt the user. Or it would, if zfs could even have orphaned files, which it cannot.

I don’t know why you think this is such a win for fsck, which doesn’t even bother to give you any idea what the file was. It doesn’t try to show you the contents, and it probably doesn’t know what the file was called, or why it was deleted. Or even if it _was_ deleted; a file could be orphaned merely because the data was written but the write to the directory entry got lost. The user has nothing to go on and just guesses, or says `y` for everything. Useless.

> `zpool scrub` is very useful, but ZFS could still benefit from automated tooling to handle some common failure modes

This is precisely and exactly what scrubbing does! All failure modes that can be automatically fixed, whether they are common or not, are transparently fixed without even needing to unmount anything.

zdb is there for the really rare cases where there is so much damage to the filesystem that zfs cannot even mount it safely. Other filesystems don’t have anything like it.

rincebrain · on Dec 8, 2023

> Or it would, if zfs could even have orphaned files, which it cannot.

If you use the zfs_recover parameter, then in a couple of cases, it will just mark space as permanently unallocatable because it can't figure out what owns it due to some errors, and you decided that was a better outcome than whatever error it was encountering. (That's what the "leaked" zpool property counts.)

Conceivably, you could write something to carefully figure out what, if anything, owns it, and allow it to be freed, or grab the contents of that region and drop it somewhere and then throw it out after the same safeguards, but by definition if you triggered that handling, something has gone wrong and you don't have a better automated intervention.

I wasn't arguing that ZFS has a case requiring orphaned files, but my point was that that was an example of "I don't know what to do with this, but I know enough to realize I can't decide what to do about this or just throw it out, so here, you do it."

Or if you, for example, wanted to do a destructive rewind on a pool because of some horrible edge case, then you might want some automated way to extract everything you're about to throw out, and that might require more work than just `zdb -R` if the pool won't import in the first place.

One example of something that would be useful to be able to do, particularly offline: when pools have some issue like spacemap corruption, you could conceivably walk the non-spacemap metadata to re-synthesize the spacemaps from whole cloth and write them out again. (And it's not an all-or-nothing thing: a number of versions many years ago had very minor errors in how they computed spacemaps, which just mean zdb whines at you if you ask it to verify them, even when the pool is offline, but don't interfere with running the pool otherwise.)

Could you convince the kernel code to do that for you on import if it's blocking import? Probably, though you'd probably want some out of band communication method to force it sometimes because I would bet there's any number of ways it might go awry and get too far into the woods of thinking it's "fine" before tripping some assertion.

Is it going to be faster to iterate on it from userland, particularly if you can simulate whether the import works from something like zdb with those conceptual spacemaps written out somewhere, versus rebooting on kernel panic? Absolutely.

But that's an example of a case where you might have something so disgruntled that it could be nice to reconstruct it outside of the normal import flow.
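The spacemap reconstruction idea can be sketched in a few lines, under heavy assumptions. This is a toy model, not ZFS's on-disk format: the pool is a range of integer allocation units, `BlockPtr` is a hypothetical stand-in for a block pointer with an offset, a length, and child pointers, and the function names are illustrative. The free map is rebuilt purely from what the metadata tree references, and comparing it with the recorded map shows space that is marked allocated but owned by nothing.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class BlockPtr:
    offset: int   # first allocation unit this pointer covers
    length: int   # number of units covered
    children: List["BlockPtr"] = field(default_factory=list)


def walk_allocated(root: BlockPtr) -> Set[int]:
    """Collect every unit referenced anywhere in the metadata tree."""
    allocated: Set[int] = set()
    stack = [root]
    while stack:
        bp = stack.pop()
        allocated.update(range(bp.offset, bp.offset + bp.length))
        stack.extend(bp.children)
    return allocated


def rebuild_free_map(root: BlockPtr, pool_size: int) -> Set[int]:
    """Anything not referenced by the tree is free."""
    return set(range(pool_size)) - walk_allocated(root)


def leaked_units(recorded_free: Set[int], rebuilt_free: Set[int]) -> Set[int]:
    """Units the recorded map calls allocated but nothing actually references."""
    return rebuilt_free - recorded_free
```

Doing this in userland against a copy of the maps is the appeal: a wrong answer costs a rerun, not a kernel panic.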

> All failure modes that can be automatically fixed, whether they are common or not, are transparently fixed without even needing to unmount anything.

`zpool scrub` doesn't fix anything except checksum errors. That's it. Any other class of flaw, it doesn't handle. I don't know why you think it does anything else, but I promise you, it absolutely does not.