
>One thing I didn't understand conceptually at first is that ZFS is standalone and not dependent on the host OS.

This is true of every mature/production file system. You can do the same with mdadm, ext4, xfs, btrfs, etc. The only constraint is versioning, and it's a one-way street: you can't necessarily go from something new to something old, but the other way round is fine.
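For example, moving an mdadm + ext4 array to a new box usually amounts to something like this (device names are placeholders; details vary by distro):

  # on the new machine: scan the attached disks and assemble any arrays found on them
  sudo mdadm --assemble --scan
  # then mount the filesystem living on the assembled array
  sudo mount /dev/md0 /mnt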




ZFS stores mount points and even NFS shares in its metadata[1], so more than most others.
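For instance (pool/dataset names made up, and the exact sharenfs syntax varies a bit by platform):

  # the mountpoint travels with the dataset, not with /etc/fstab
  sudo zfs set mountpoint=/srv/media tank/media
  # same for the NFS export; it is (re)shared whenever the pool is imported
  sudo zfs set sharenfs='rw=@192.168.1.0/24' tank/media
  sudo zfs get mountpoint,sharenfs tank/media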

[1]: https://openzfs.github.io/openzfs-docs/man/8/zfs-set.8.html?...


My one complaint about ZFS is that I'd repeatedly googled for "what's the procedure if your motherboard dies and you need to migrate your disks to a new machine?", since that's super-easy with single non-ZFS disks but I was worried about how ZFS mirrored pools would handle it, especially since the setup was so fiddly and (compared to other filesystems I've used) highly non-standard (with good reason, I'm sure).

And yet, this thread right here has more and better info than my searches ever turned up, which were mostly reddit and stackoverflow posts and such that somehow managed never to answer the question or had bad answers.

The real complaint is that I found that to be true for almost everything with ZFS. You can read the manual and eventually figure out which sequence of commands you need, but "I want to do this thing that has to be extremely common; what's the usual procedure, considering that ZFS operations are often multi-stage and things can go very badly if you mess them up?" is weirdly hard to find reliable, accurate, and complete info on with a search.

The result was that I was, and am, afraid to touch ZFS now that I have it working, and I dread having to track down info because it's always a pain. But I also don't really want to become a ZFS wizard by deeply reading all the docs just so I can do some extremely basic things (mirror, expand pools with new drives, replace bad mirrored disks, move the disks to a new machine if this one breaks... that's about it beyond "create the fs and mount it") with it on one machine at home.

The initial setup reminded me of Git, in a bad way. "You want to do this thing that almost every single person using this needs to do? Run these eight commands, zero of which look like they do the thing you want, in exactly this order".

I'm happy with ZFS but dread needing to modify its config.


As someone who is a total ZFS fan, I think the `zfs` and `zpool` commands are some of the best CLI commands ever made. Just immaculate. So this comment was a head scratcher for me.

> I also don't really want to become a ZFS wizard

Admittedly, ZFS on Linux may require some additional work simply because it's not an upstream filesystem, but, once you're over that hump, ZFS feels like it lowers the mental burden of what to do with my filesystems?

I think the issue may be ZFS has some inherent new complexity that certain other filesystems don't have? But I'm not sure we can expect a paradigm shifting filesystem to work exactly like we've been used to, especially when it was originally developed on a different platform? It kinda sounds like you weren't used to a filesystem that does all these things? And may not have wanted any additional complexity?

And, I'd say, that happens to everyone? For example, I wanted to port an app I wrote for ZFS to btrfs[0]. At the time, it felt like such an unholy pain. With some distance, I see it was just a different way of doing things. Very few of the btrfs decisions I had intimate experience with do I now look back on and call "just goofy!" It's more -- that's not the choice I would have made, in light of ZFS, etc., but it's not an absurd choice?

> "what's the procedure if your motherboard dies and you need to migrate your disks to a new machine?"

If your setup is anything like mine, I'm pretty certain you can just boot the root pool? Linux will take care of the rest? The reason you may not find an answer is because the answer is pretty similar to other filesystems?
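For a data pool (as opposed to the root pool), the whole "procedure" is basically export and import; something like this, with the pool name made up:

  # on the old machine, if it still boots:
  sudo zpool export tank

  # on the new machine: list whatever pools the attached disks contain, then import
  sudo zpool import
  sudo zpool import tank
  # if the old box died and the pool was never cleanly exported, force it:
  sudo zpool import -f tank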

If you have problems, rescue via a live CD[1]. Rescuing a ZFS root pool that won't boot is no-joke sysadmin work (redirect all the zpool mounts, mount --bind all the other junk, create a chroot env, do more magic...). For people, perhaps like you, who don't want the hassle, maybe it is easier elsewhere? But -- good luck!
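Roughly, from the live environment (the dataset names are just placeholders from my own setup; yours will differ):

  # import the root pool under an alternate root without mounting anything yet
  sudo zpool import -N -R /mnt rpool
  sudo zfs mount rpool/ROOT/ubuntu_xyz   # your actual root dataset name goes here
  sudo zfs mount -a

  # bind-mount the usual junk and chroot in
  for d in dev proc sys; do sudo mount --bind /$d /mnt/$d; done
  sudo chroot /mnt /bin/bash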

[0]: https://github.com/kimono-koans/httm

[1]: https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubu...


IDK. I'm an ex-longtime-Gentoo user and have been known to do some moderately-wizardy things with Linux, and git for that matter, and in the server space I have seen some shit, but I managed to accidentally erase my personal-file-server zfs disks a couple times while setting them up. I've since expanded the mirrored pool once and consider it a miracle I didn't wipe out the whole thing, edge of my seat the whole time.


> I'm an ex-longtime-Gentoo user

Here for this. Already delighted and amused. ;)

> edge of my seat the whole time.

In the future, you may want to try creating a sandbox for yourself to try things? I did all my testing of my app re: btrfs with zvols similarly:

  sudo zfs create -V 1G rpool/test1
  sudo zfs create -V 1G rpool/test2
  sudo zpool create testpool mirror /dev/zvol/rpool/test1 /dev/zvol/rpool/test2
  sudo zpool set autoexpand=on testpool
  sudo zfs create -V 2G rpool/test3
  sudo zfs create -V 2G rpool/test4
  sudo zpool replace testpool /dev/zvol/rpool/test1 /dev/zvol/rpool/test3
  sudo zpool replace testpool /dev/zvol/rpool/test2 /dev/zvol/rpool/test4
  ...
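When you're done poking at it, the sandbox goes away without touching anything real:

  # watch the resilver finish, then tear it all down
  sudo zpool status testpool
  sudo zpool destroy testpool
  sudo zfs destroy rpool/test1
  sudo zfs destroy rpool/test2
  sudo zfs destroy rpool/test3
  sudo zfs destroy rpool/test4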


> > I'm an ex-longtime-Gentoo user

> Here for this. Already delighted and amused. ;)

Haha... yeah, I didn't intend that as a brag or badge of honor or anything—more like a badge of idiocy—but you don't play Human Install Script and a-package-upgrade-broke-my-whole-system troubleshooter for several years without learning how things fit together and getting pretty comfortable with system config a level or two below what a lot of Linux users ever dig into. Just meant I'm a little past "complete newbie" so that's not the trouble. :-)

> In the future, you may want to try creating a sandbox for yourself to try things? I did all my testing of my app re: btrfs with zvols similarly:

Really good advice, thanks. I was aware it had substantial capabilities to work in this manner, but using it this way hadn't occurred to me. Gotta get over being stuck in "filesystems operate on disks, or partitions on disks, recorded in such a way that any tools and filesystem, not just a particular one, can understand and work with them" mode. I mean, I'm comfortable enough with files as virtual disks, but having one specific FS's tools, rather than a set of general tools, transparently manage those for me too seems... spooky and data-lossy. Which I know it isn't, but it makes the hair on my neck stand up anyway. Maybe my "lock-in" warning sensors are tuned too sensitive.

Now to figure out how to run those commands as a user that doesn't have the ability to destroy any of the real pools... ideally without having to make a whole VM for it, or set up ZFS on a second machine, and—initial search results suggest this may be a problem, for the specific case of wanting unprivileged users to run zfs create without granting them too much access—on Linux, not FreeBSD :-/
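From what I can tell so far, ZFS delegation (`zfs allow`) gets partway there; something like this, with the user and dataset names made up (and my understanding is that actually mounting still needs root on Linux):

  # create a scratch dataset and delegate a few permissions on it to one user
  sudo zfs create rpool/scratch
  sudo zfs allow -u someuser create,destroy,mount,snapshot rpool/scratch
  # show what's been delegated
  sudo zfs allow rpool/scratch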


Yeah, I know that feeling. I had it too at first, dreading making changes because I was afraid I'd mess something up.

And to be fair, I think ZFS could be better in this regard. Some commands can put your pool into a very sub-optimal state, and ZFS doesn't warn about this when you enter those commands. Heck even the destroy pool command doesn't flinch if by chance nothing is mounted (which it may well be after recovery on a new system).

I found it helped to watch some of the videos from the OpenZFS conferences that explain the history of ZFS and how the architecture works, like the OpenZFS basics[1] one.

But I agree that the documentation[2] could have a lot more introductory material, to help those who aren't familiar with it.

That said, I echo the suggestion to try it out using file vdevs. For larger changes I do spin up a VM just to make sure. For example, it's possible to mess up replacing a disk by adding the new disk as a new single vdev rather than replacing the failing one, so if I feel unsure about it I take 15 minutes in a VM and write down the steps.
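A throwaway pool backed by sparse files is enough to rehearse a disk replacement; a rough sketch, with made-up paths and sizes:

  # create two sparse 1G files and mirror them in a scratch pool
  truncate -s 1G /tmp/vdev1.img /tmp/vdev2.img
  sudo zpool create scratch mirror /tmp/vdev1.img /tmp/vdev2.img

  # practice the replacement against a third file, then check the result
  truncate -s 1G /tmp/vdev3.img
  sudo zpool replace scratch /tmp/vdev1.img /tmp/vdev3.img
  sudo zpool status scratch

  # throw it all away afterwards
  sudo zpool destroy scratch
  rm /tmp/vdev*.img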

Again, this is something I feel they could improve. Adding a single-disk vdev to a mirrored or raid'ed pool should come with a warning requiring confirmation.

On the bright side, I've been running my pool since 2009, and have never lost data despite a few disk failures and countless unexpected power outages without a UPS. And I just run it on consumer hardware without ECC, because that's what I got. Been up to 8 disks, now down to 6, and will soon go down to 4 once the new disks arrive. Send/receive ensures the data is just as it ever was on the new configuration.
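The send/receive part is pleasantly boring; roughly this, with made-up pool and dataset names:

  # snapshot everything, then stream it recursively to the new pool
  sudo zfs snapshot -r oldpool/data@migrate
  sudo zfs send -R oldpool/data@migrate | sudo zfs receive -F newpool/data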

[1]: https://www.youtube.com/watch?v=MsY-BafQgj4

[2]: https://openzfs.github.io/openzfs-docs/


I think mdadm+LVM+LUKS+XFS is a far worse experience for you. Currently the ZFS CLI is one of the best in its class for features.
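To make that concrete, here's roughly what the two stacks look like for a mirrored, encrypted filesystem (device names, sizes and pool names are placeholders):

  # the traditional stack: four layers, four different tools
  sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
  sudo cryptsetup luksFormat /dev/md0
  sudo cryptsetup open /dev/md0 crypt0
  sudo pvcreate /dev/mapper/crypt0
  sudo vgcreate vg0 /dev/mapper/crypt0
  sudo lvcreate -n data -L 100G vg0
  sudo mkfs.xfs /dev/vg0/data

  # roughly the same result with ZFS: mirroring, native encryption, datasets
  sudo zpool create -O encryption=on -O keyformat=passphrase tank mirror /dev/sda /dev/sdb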


Right, what was not clear to me was that RAID configuration and snapshots are also part of the file system. Usually that's done through hardware cards or software RAID where the configuration sits outside the file system (Intel VROC, vSAN, etc.), or at least that was my (wrong) conceptual model. I used to make multiple copies of the USB stick of FreeNAS 8 back in the day because I didn't want the USB drive to fail and then not know how to recover the zpool. Messing around with the file system directly cleared everything up.


That's more common for Linux storage. Whether you're using zfs, btrfs, or lvm, all the configuration required to read it is stored in the header somewhere rather than in a detached configuration.


It has to be stored out of band, otherwise you'd have a chicken-and-egg problem: what's the shape of the array, when the info is in a file stored inside the array?


It's not stored in a file inside the array, of course, but there are two types of out-of-band: next to the data, or completely disconnected. For example, you only need to mount a single btrfs partition: its header already contains the information about the other copies, and the whole raid setup will be mounted as necessary. It doesn't matter if you move the drives to a new system: mount one of them and it works.
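You can even look at that next-to-the-data metadata directly; for example (device name is a placeholder):

  # dump the ZFS label stored on a member device; it describes the whole pool layout
  sudo zdb -l /dev/sda1

  # btrfs likewise records every member device in each superblock
  sudo btrfs filesystem show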

On the other hand, if you move drives from a hardware raid and put in new drives, some (all?) controllers will read the raid config from memory and offer to build the same raid on the new drives. That's completely out of band. Depending on the controller, even changing the order the disks are plugged in can give you weird results.



