The data on a hard disk is the only part of a computer that isn't replaceable. Everything that improves data integrity improves computer usage as a whole, and improves user confidence in the machine. This is something OSes should have done years ago. All any user ever does on a machine is manipulate and work with their personal data. Absolute data integrity should be paramount to any computing technology.
Alas, it is not, and we all have (hopefully) several backups. Again: Anything that improves data integrity is improving computing as a whole. I want this.
So, ultimately, this is a hardware play, then. That makes sense, because most of the features of ZFS don't add any value to a Mac with a single internal drive and at most one or two USB drives, which is how almost all Macs are used.
* Data reliability (checksumming, etc.)
* Lots and lots of undo
  * Including undoing entire OS upgrades
* Cheap and quick partitioning
* Easy disk replacements
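To make those concrete, here's a minimal command-line sketch of each feature, assuming a pool named tank and hypothetical device names:

```sh
# Data reliability: checksumming is on by default; a scrub verifies every block
zpool create tank mirror /dev/disk1 /dev/disk2
zpool scrub tank

# Lots of undo: snapshots are cheap and instant, and you can roll back to one
zfs snapshot tank/home@before-upgrade
zfs rollback tank/home@before-upgrade

# Cheap and quick partitioning: datasets instead of fixed-size partitions
zfs create tank/projects
zfs set quota=50G tank/projects

# Easy disk replacement: resilver onto a new device while the pool stays live
zpool replace tank /dev/disk1 /dev/disk3
```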
I've played a bit with ZFS myself and really enjoyed it. It's like the sum of everything we've learned about filesystems and storage, which we previously shoehorned into various historical filesystems, only this time it's a clean and simple implementation.
You could even do Time Machine without additional drives... I mean, why should my snapshot system and my backup system be one and the same? How about letting me use Time Machine locally, and if I plug in an additional drive, Time Machine asks me if I want to back up to it as well?
Not a game-changer exactly, but a nice-ish upgrade potentially.
The biggest upgrade would honestly just be quicker/continuous replication that doesn't drag the rest of your system down.
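For what it's worth, that split maps directly onto ZFS primitives. A sketch, assuming pools named tank (internal) and backup (the external drive), with made-up snapshot names:

```sh
# Snapshot locally on whatever schedule, no external drive needed
zfs snapshot tank/home@monday
zfs snapshot tank/home@tuesday

# When the external drive shows up, seed it once with a full stream...
zfs send tank/home@monday | zfs recv backup/home

# ...then replicate only the blocks that changed between snapshots
zfs send -i tank/home@monday tank/home@tuesday | zfs recv backup/home
```

Incremental sends only walk the changed blocks, which is why replication like this doesn't drag the rest of the system down the way a file-level crawl does.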
Unless something changes between now and RTM, Lion will provide that feature out of the box.
And that's about the only feature that a typical Mac user would benefit from with ZFS, without some kind of "prosumer" or "enterprise-y" external storage device.
There's also the reliability aspect. I haven't had HFS+ go bad on my own systems, but I've seen it happen several times on servers. It's on about the same level as ext3fs in that arena, which is pretty bad in my book.
I get the "less is more" argument and I agree generally. ZFS is so far beyond a traditional FS though.
It also gives retail locations extra options to make users happy quicker: if a customer comes in angry about a logic board problem that's the "third strike" for that computer, the support staff can simply ask whether the customer has Time Machine backups, and send them away with a new-in-box computer without having to worry about moving drives around and migrating files in-store.
Apple's vertical enough that keeping Time Machine simple saves them employee hours around the world every day.
(This might still change before they release Lion, though.)
In commercial offerings built around these features, you typically source an SSD that is either write-biased (logzilla) or read-biased, depending on what you are trying to do.
"Logzilla" is the nickname for a dedicated device backing the ZIL (ZFS Intent Log). Generally these are SLC flash SSDs, mirrored, as losing the ZIL on a ZFS pool can lead to "interesting" recovery situations.
L2ARC is basically an extension of main memory, used to cache data from the drives. If you lose the L2ARC, there aren't any serious consequences, so it's usually implemented with less expensive MLC flash SSDs.
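In zpool terms that looks something like this (hypothetical device names; note the log vdev is mirrored and the cache vdev deliberately isn't):

```sh
# Mirrored SLC devices for the intent log, because losing the ZIL hurts
zpool add tank log mirror /dev/ssd0 /dev/ssd1

# Cheaper MLC devices as L2ARC; cache loss is harmless, so no mirror needed
zpool add tank cache /dev/ssd2 /dev/ssd3
```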
On a related note, Seagate sells a hybrid SSD+rust drive called the Momentus XT, which uses its 4GB of flash in a similar manner to the L2ARC.
This is why Sun Storage products (and Nexenta, or homegrown clusters) put the ZIL SSDs in the JBODs and not the storage heads.
You could of course create a pool that used a disk as the ZIL, but again, there'd be no point, since the mirror has to be kept consistent or it's worthless.
Since the cache (L2ARC) is just cache, and you can lose it at any time without data loss, it lives in the heads.
Nit-picking: to be clear, the ZIL can (and should) be mirrored; you just want to do it with SSDs consistently. Mixing devices is possible, but you'll be limited to the performance of the slowest device. You can also create storage clusters with multiple heads, which again is a good idea IMO, but the ZIL is a critical component of the FS. If you lose it, you lose the whole pool (as a rule, I think, though there might be clever hacks around that under special circumstances if you're lucky). So if you're going to multi-home your zpool, make sure the ZILs are in the shared-storage region; it's no good having redundant controllers if your ZIL is on the dead one.
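(One such hack, for what it's worth: on sufficiently recent pool versions, which is an assumption about your setup, you can force-import a pool whose dedicated log device has died, at the cost of whatever transactions only existed in the lost log:)

```sh
# Import despite a missing/failed log device; anything that was only
# in the lost ZIL is gone, but the rest of the pool comes back
zpool import -m tank
```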
Dedupe is really a useless feature when you do the math. Unless you're running a small amount of storage with a very, VERY large number of basically identical volumes (think Amazon), you'll spend more on memory than you did on disk, and once you get to that point you'd almost certainly have been better off spending that money on SSDs and using any extra memory for primary cache.
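To put rough numbers on "do the math", using the commonly cited figure of roughly 320 bytes of RAM per DDT entry and a 64K average block size (both assumptions about your workload):

```sh
# 1 TB of unique data / 64 KB per block   = ~16.8M blocks
# 16.8M blocks * ~320 bytes per DDT entry = ~5.4 GB of RAM
#
# Call it ~5 GB of RAM per TB of deduped data, before the ARC
# caches a single byte of actual file data.
```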
ZFS has two tables. The first is the block pointer table: your hashes of blocks, kept in RAM so you know where on disk to find a particular block. It's also referenced during writes to ensure the COW operations are transactional.
The DDT (de-dupe table) is similar, but points to de-duped blocks. You need enough RAM to keep this in memory at all times; otherwise, for every write you'll have to scan through the entire table on disk, find matching blocks, then modify it. You'll also have to scan through it on every read to see if the block you're looking for is de-duped.
I've probably mixed up the details. I haven't read the source, just used it. But that's my (maybe flawed) understanding of how it works.
Bottom line though, you DO need enough RAM to keep the DDT in memory. If you don't, you'll see SEVERE thrashing, and SSD or no SSD, your disk performance will drop from thousands of IOPS to tens of IOPS or lower.
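If you want to check before you commit, a sketch assuming a pool named tank: zdb can simulate dedupe against existing data so you can size the table (and thus the RAM) before flipping the switch, and it can dump the live table afterwards:

```sh
# Simulate dedupe on existing data: prints a DDT histogram you can
# use to estimate table size before enabling anything
zdb -S tank

# Once dedupe is actually on, inspect the live table
zfs set dedup=on tank
zdb -DD tank
```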
This exact thing happened to us with Sun's 7310, loaded with the Sun specified SSDs. You need to be very sure you have the hardware for de-dupe, and even then the payoff is so small...
The only real use-case I can think of is if you're running a VPS host (lots of nearly identical volumes).
So yes it does, but unintentionally.